  • Python Notes, Part 5: Python Data Analysis Basics: Reading and Writing Files

      1. Reading and writing files (basics)

      savetxt, loadtxt

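    import numpy as np

    # Write a 2x2 identity matrix to a text file
    i2 = np.eye(2)
    print(i2)
    np.savetxt(r"C:\Users\Thomas\Desktop\eye.txt", i2)

    # Read the closing price (column 6) and volume (column 7) from the CSV file
    c, v = np.loadtxt(r"C:\Users\Thomas\Desktop\data.csv", delimiter=',', usecols=(6, 7), unpack=True)
    print(c, v)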
    #[336.1  339.32 345.03 344.32 343.44 346.5  351.88 355.2  358.16 354.54
    # 356.85 359.18 359.9  363.13 358.3  350.56 338.61 342.62 342.88 348.16
    # 353.21 349.31 352.12 359.56 360.   355.36 355.76 352.47 346.67 351.99] [21144800. 13473000. 15236800.  9242600. 14064100. 11494200. 17322100.
    # 13608500. 17240800. 33162400. 13127500. 11086200. 10149000. 17184100.
    # 18949000. 29144500. 31162200. 23994700. 17853500. 13572000. 14395400.
    # 16290300. 21521000. 17885200. 16188000. 19504300. 12718000. 16192700.
    # 18138800. 16824200.]
    

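      delimiter: the field separator; CSV files normally use a comma.

      usecols=(6,7): reads the seventh and eighth fields (zero-based indices 6 and 7), which are the stock's closing price and trading volume.

      unpack=True: splits the selected columns into separate arrays, so the closing prices are assigned to c and the volumes to v.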
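
To make the three parameters concrete without relying on a file on disk, the following is a minimal sketch. It writes the identity matrix to a temporary file and reads it back, then parses two CSV rows from an in-memory string. The rows are illustrative: only the last two fields (close and volume) are taken from the output shown above, and the other fields are made-up placeholders.

```python
import io
import os
import tempfile

import numpy as np

# Round trip: save a 2x2 identity matrix to a temporary file and load it back.
i2 = np.eye(2)
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "eye.txt")
    np.savetxt(path, i2)
    restored = np.loadtxt(path)

# Illustrative CSV text with 8 comma-separated fields per row.
# Fields 0-5 are placeholders; fields 6 and 7 are close and volume.
csv_text = io.StringIO(
    "AAPL,28-01-2011,0,344.0,345.0,333.0,336.1,21144800\n"
    "AAPL,31-01-2011,0,335.0,340.0,334.0,339.32,13473000\n"
)

# usecols picks fields 6 and 7; unpack=True returns one array per column.
c, v = np.loadtxt(csv_text, delimiter=",", usecols=(6, 7), unpack=True)
print(restored)
print(c, v)
```

Without unpack=True, loadtxt would return a single 2-D array with one row per line of the file instead of one array per column.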

      2. Volume-weighted average price: average

         VWAP

    import numpy as np
    
    # Volume-weighted average price (VWAP): closing prices weighted by volume
    c,v = np.loadtxt(r"C:\Users\Thomas\Desktop\data.csv",delimiter=',',usecols=(6,7),unpack=True)
    vwap = np.average(c,weights=v)
    print("VWAP = ", vwap)
    #VWAP =  350.5895493532009
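
As a quick check of what np.average does with weights, the sketch below compares it with the explicit formula sum(c * v) / sum(v). The three closes and volumes are the first three values from the output printed in section 1; using only three values is an assumption made to keep the example short.

```python
import numpy as np

# First three closing prices and volumes from the sample output.
c = np.array([336.1, 339.32, 345.03])
v = np.array([21144800.0, 13473000.0, 15236800.0])

# Weighted average as computed by NumPy.
vwap = np.average(c, weights=v)

# The same quantity written out by hand.
manual = np.sum(c * v) / np.sum(v)

print("VWAP   =", vwap)
print("manual =", manual)
```

Days with higher volume pull the VWAP toward their closing price, which is why it can differ from the plain mean.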

      

      3. Arithmetic mean: mean

    import numpy as np
    
    # Arithmetic mean of the closing prices
    c,v = np.loadtxt(r"C:\Users\Thomas\Desktop\data.csv",delimiter=',',usecols=(6,7),unpack=True)
    mean = np.mean(c)
    print("mean = ", mean)
    #mean =  351.0376666666667

      4. Time-weighted average price:

      TWAP

    import numpy as np
    
    # Time-weighted average price (TWAP): closing prices weighted by their time index
    c,v = np.loadtxt(r"C:\Users\Thomas\Desktop\data.csv",delimiter=',',usecols=(6,7),unpack=True)
    t = np.arange(len(c))
    twap = np.average(c,weights=t)
    print("twap = ", twap)
    #twap =  352.4283218390804
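
One detail of this weighting is easy to miss: np.arange(len(c)) starts at 0, so the first price receives zero weight. The sketch below shows this on the first three closes from the sample output. The variable names manual, c_changed and twap_changed are introduced here only for illustration.

```python
import numpy as np

c = np.array([336.1, 339.32, 345.03])
t = np.arange(len(c))          # weights 0, 1, 2: later prices count more

twap = np.average(c, weights=t)
manual = np.sum(t * c) / np.sum(t)

# Changing the first price has no effect, because its weight is 0.
c_changed = c.copy()
c_changed[0] = 0.0
twap_changed = np.average(c_changed, weights=t)

print("twap         =", twap)
print("manual       =", manual)
print("twap_changed =", twap_changed)
```

If the first observation should count as well, weights such as np.arange(1, len(c) + 1) would be one possible alternative; that is a design choice, not something the original notes prescribe.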

     

      5. Maximum, minimum, and range

      max, min, ptp:

    import numpy as np
    
    # Maximum, minimum, and range (peak-to-peak) of the daily high and low prices
    h,l = np.loadtxt(r"C:\Users\Thomas\Desktop\data.csv",delimiter=',',usecols=(4,5),unpack=True)
    highest = np.max(h)
    lowest = np.min(l)
    spread_highest = np.ptp(h)
    spread_lowest = np.ptp(l)
    
    print("highest = ", highest)
    print("lowest = ", lowest)
    print("spread_highest = ", spread_highest)
    print("spread_lowest = ", spread_lowest)
    #highest =  364.9
    #lowest =  333.53
    #spread_highest =  24.859999999999957
    #spread_lowest =  26.970000000000027
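
The name ptp stands for "peak to peak": it is simply the maximum minus the minimum. A minimal sketch with made-up high prices (the array h below is hypothetical, not taken from data.csv):

```python
import numpy as np

# Hypothetical daily highs, for illustration only.
h = np.array([344.4, 340.04, 345.65, 364.9])

spread = np.ptp(h)                  # peak-to-peak range
manual = np.max(h) - np.min(h)      # the same value computed by hand

print("spread =", spread)
print("manual =", manual)
```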

     

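      6. Median: median

      Sorting function: msort (removed in NumPy 2.0; np.sort(a, axis=0) is the equivalent)

      Variance: var

      Standard deviation: std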

    import numpy as np
    
    # Median
    c = np.loadtxt(r"C:\Users\Thomas\Desktop\data.csv",delimiter=',',usecols=(6,),unpack=True)
    print("median = ",np.median(c))
    
    # Sorting (np.msort was removed in NumPy 2.0; np.sort gives the same result for a 1-D array)
    print("sorted_close = ",np.sort(c))
    
    # Variance
    print("var = ",np.var(c))
    
    # Standard deviation
    print("std = ",np.std(c))
    
    #median =  352.055
    #sorted_close =  [336.1  338.61 339.32 342.62 342.88 343.44 344.32 345.03 346.5  346.67
    # 348.16 349.31 350.56 351.88 351.99 352.12 352.47 353.21 354.54 355.2
    # 355.36 355.76 356.85 358.16 358.3  359.18 359.56 359.9  360.   363.13]
    #var =  50.126517888888884
    #std =  7.080008325481608
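
The relationships between these functions can be verified by hand. The sketch below uses the first four closes from the sample output (an even count, so the median is the mean of the two middle values) and assumes the NumPy defaults, under which var and std are the population variance and standard deviation (ddof=0).

```python
import numpy as np

c = np.array([336.1, 339.32, 345.03, 344.32])

# Median of an even-length array: average of the two middle sorted values.
s = np.sort(c)
n = len(s)
median_manual = (s[n // 2 - 1] + s[n // 2]) / 2

# Population variance: mean squared deviation from the mean.
var_manual = np.mean((c - np.mean(c)) ** 2)

# Standard deviation: square root of the variance.
std_manual = np.sqrt(var_manual)

print("median =", np.median(c), median_manual)
print("var    =", np.var(c), var_manual)
print("std    =", np.std(c), std_manual)
```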
    

      7. Difference function: diff

      Conditional selection function: where

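    import numpy as np
    
    # Difference between consecutive closing prices
    c = np.loadtxt(r"C:\Users\Thomas\Desktop\data.csv",delimiter=',',usecols=(6,),unpack=True)
    print("diff = ",np.diff(c))
    
    # Conditional selection: indices of the elements that satisfy the condition
    print("price > 0",np.where(c > 0))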
    
    #diff =  [  3.22   5.71  -0.71  -0.88   3.06   5.38   3.32   2.96  -3.62   2.31
    #   2.33   0.72   3.23  -4.83  -7.74 -11.95   4.01   0.26   5.28   5.05
    #  -3.9    2.81   7.44   0.44  -4.64   0.4   -3.29  -5.8    5.32]
    #price > 0 (array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
    #       17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29], dtype=int64),)
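
Since every price is positive, the where call above returns every index. A slightly more telling use, sketched below on the first five closes from the sample output, is to combine the two functions and find the days on which the price rose. The names up_days and signs are introduced here for illustration.

```python
import numpy as np

c = np.array([336.1, 339.32, 345.03, 344.32, 343.44])

# diff returns c[i + 1] - c[i], so the result is one element shorter than c.
d = np.diff(c)

# One-argument where: a tuple of index arrays where the condition holds.
up_days = np.where(d > 0)[0]

# Three-argument where: pick 1 where the price rose, -1 elsewhere.
signs = np.where(d > 0, 1, -1)

print("diff    =", d)
print("up_days =", up_days)
print("signs   =", signs)
```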
    

      

      8. Date analysis:

    import numpy as np
    from datetime import datetime
    
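    def datestr2num(s):
       # Convert a "dd-mm-yyyy" date to a weekday number (0 = Monday ... 6 = Sunday).
       # Older NumPy passes bytes to converters; NumPy 2.0+ passes str.
       if isinstance(s, bytes):
          s = s.decode('ascii')
       return datetime.strptime(s, "%d-%m-%Y").date().weekday()
    
    dates, close=np.loadtxt(r"C:\Users\Thomas\Desktop\data.csv", delimiter=',', usecols=(1,6), converters={1: datestr2num}, unpack=True)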
    print("dates = ",dates)
    
    #dates =  [4. 0. 1. 2. 3. 4. 0. 1. 2. 3. 4. 0. 1. 2. 3. 4. 1. 2. 3. 4. 0. 1. 2. 3.
    # 4. 0. 1. 2. 3. 4.]

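       Note: loadtxt may pass s to the converter as bytes, in which case it must be decoded as ASCII before it can be parsed as a date.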
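
The converter mechanism can be tried without the CSV file. The sketch below parses a two-column, in-memory table. The three closes come from the sample output, while pairing them with these particular dates is an assumption made for illustration. The helper name weekday_from_date is hypothetical, and it accepts both bytes and str so that it works whichever type loadtxt passes.

```python
import io
from datetime import datetime

import numpy as np


def weekday_from_date(s):
    """Map a 'dd-mm-yyyy' string (or bytes) to 0 = Monday ... 6 = Sunday."""
    if isinstance(s, bytes):
        s = s.decode("ascii")
    return datetime.strptime(s, "%d-%m-%Y").date().weekday()


# Illustrative rows: date, closing price.
data = io.StringIO(
    "28-01-2011,336.1\n"
    "31-01-2011,339.32\n"
    "01-02-2011,345.03\n"
)

# Column 0 goes through the converter; column 1 is parsed as a float.
days, close = np.loadtxt(
    data, delimiter=",", converters={0: weekday_from_date}, unpack=True
)
print("days  =", days)
print("close =", close)
```

The weekday numbers come back as floats because loadtxt uses a float dtype by default, which matches the 4. 0. 1. style of the output above.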

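      9. The summarize function: applied along a given axis (dimension) number

      apply_along_axis: this function calls another function that we supply and applies it to each element of the array, that is, to each 1-D slice along the chosen axis. Our array currently has 3 elements, one for each of the 3 weeks in the sample data; the index values in each element correspond to one day in the sample data.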
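
The original notes do not include the code for this step, so the following is only a sketch of how such a weekly summary could look. The price arrays, the 3 x 5 layout of weeks_indices, and the body of summarize are all assumptions chosen to make the example self-contained; they are not the author's data or implementation.

```python
import numpy as np

# Synthetic prices for 15 trading days (3 weeks of 5 days), illustration only.
close = np.arange(15, dtype=float) + 100.0
open_ = close - 1.0
high = close + 2.0
low = close - 3.0

# One row per week; each row holds the day indices belonging to that week.
weeks_indices = np.arange(15).reshape(3, 5)


def summarize(idx, o, h, l, c):
    """Return (open on first day, weekly high, weekly low, close on last day)."""
    return np.array([o[idx[0]], h[idx].max(), l[idx].min(), c[idx[-1]]])


# axis=1 hands each row of weeks_indices to summarize, one week at a time;
# the extra positional arguments are forwarded unchanged.
week_summary = np.apply_along_axis(summarize, 1, weeks_indices, open_, high, low, close)
print(week_summary)
```

Each row of week_summary then describes one week, in the order first-day open, high, low, last-day close.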

  • Original source: https://www.cnblogs.com/noah0532/p/11273611.html