zoukankan      html  css  js  c++  java
  • Pandas快速入门(深度学习入门2)

    源地址为:http://pandas.pydata.org/pandas-docs/stable/10min.html#min

    Pandas(Python Data Analysis Library) 是基于NumPy 的一种工具,该工具是为了解决数据分析任务而创建的。Pandas 纳入了大量库和一些标准的数据模型,提供了高效地操作大型数据集所需的工具。pandas提供了大量能使我们快速便捷地处理数据的函数和方法。你很快就会发现,它是使Python成为强大而高效的数据分析环境的重要因素之一。

    Pandas的数据结构:

    Series:一维数组,与Numpy中的一维array类似。二者与Python基本的数据结构List也很相近,其区别是:List中的元素可以是不同的数据类型,而Array和Series中则只允许存储相同的数据类型,这样可以更有效的使用内存,提高运算效率。
     
    Time- Series:以时间为索引的Series。
     
    DataFrame:二维的表格型数据结构。很多功能与R中的data.frame类似。可以将DataFrame理解为Series的容器。以下的内容主要以DataFrame为主。
     
    Panel :三维的数组,可以理解为DataFrame的容器。 
     

    Pandas一般导入文件为:

    In [69]:
    import pandas as pd
    import numpy  as np
    import matplotlib.pyplot as plt
    
     

    对象建立

    创建Series,在pandas中,Series是代labels的数组。 首先用list创建数据,pandas会自动创建labels

    In [70]:
    s = pd.Series([1,3,5,np.nan,6,8])
    s
    
    Out[70]:
    0    1.0
    1    3.0
    2    5.0
    3    NaN
    4    6.0
    5    8.0
    dtype: float64
     

    通过传递numpy数组,使用datetime索引和标记列来创建DataFrame:

    In [71]:
    dates = pd.date_range('20130101', periods=6)
    dates
    
    Out[71]:
    DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04',
                   '2013-01-05', '2013-01-06'],
                  dtype='datetime64[ns]', freq='D')
    In [72]:
    df = pd.DataFrame(np.random.randn(6,4), index=dates, columns=list('ABCD'))
    df
    
    Out[72]:
     ABCD
    2013-01-01 -0.234526 -0.758501 -0.609560 -0.541163
    2013-01-02 -1.377782 -0.801207 2.163044 0.481614
    2013-01-03 0.345764 0.031672 0.660530 0.240518
    2013-01-04 -0.896905 -1.614034 -0.539502 -1.512732
    2013-01-05 0.119948 0.932712 -0.375521 -1.901977
    2013-01-06 1.153657 1.556955 -1.172664 -0.097324
     

    通过传递可以转换为类似Series的对象的Dict来创建DataFrame。

    In [73]:
    df2 = pd.DataFrame({ 'A' : 1.,
                         'B' : pd.Timestamp('20130102'),
                         'C' : pd.Series(1,index=list(range(4)),dtype='float32'),
                         'D' : np.array([3] * 4,dtype='int32'),
                         'E' : pd.Categorical(["test","train","test","train"]),
                         'F' : 'foo' })
    df2
    
    Out[73]:
     ABCDEF
    0 1.0 2013-01-02 1.0 3 test foo
    1 1.0 2013-01-02 1.0 3 train foo
    2 1.0 2013-01-02 1.0 3 test foo
    3 1.0 2013-01-02 1.0 3 train foo
     

    显示其类型:

    In [74]:
    df2.dtypes
    
    Out[74]:
    A           float64
    B    datetime64[ns]
    C           float32
    D             int32
    E          category
    F            object
    dtype: object
     

    查看数据

    查看开始或结尾几行数据

    In [75]:
    df.head(2)
    
    Out[75]:
     ABCD
    2013-01-01 -0.234526 -0.758501 -0.609560 -0.541163
    2013-01-02 -1.377782 -0.801207 2.163044 0.481614
    In [76]:
    df.tail(3)
    
    Out[76]:
     ABCD
    2013-01-04 -0.896905 -1.614034 -0.539502 -1.512732
    2013-01-05 0.119948 0.932712 -0.375521 -1.901977
    2013-01-06 1.153657 1.556955 -1.172664 -0.097324
     

    查看索引、列名称、numpy数据

    In [77]:
    df.index
    
    Out[77]:
    DatetimeIndex(['2013-01-01', '2013-01-02', '2013-01-03', '2013-01-04',
                   '2013-01-05', '2013-01-06'],
                  dtype='datetime64[ns]', freq='D')
    In [78]:
    df.columns
    
    Out[78]:
    Index(['A', 'B', 'C', 'D'], dtype='object')
    In [79]:
    df.values
    
    Out[79]:
    array([[-0.23452598, -0.75850137, -0.60955962, -0.54116297],
           [-1.37778166, -0.80120685,  2.16304408,  0.48161429],
           [ 0.34576422,  0.03167245,  0.66053017,  0.24051834],
           [-0.89690453, -1.61403402, -0.53950167, -1.5127315 ],
           [ 0.11994836,  0.93271156, -0.37552097, -1.90197732],
           [ 1.1536568 ,  1.55695533, -1.17266403, -0.097324  ]])
     

    描述显示数据的快速统计摘要

    In [80]:
    df.describe()
    
    Out[80]:
     ABCD
    count 6.000000 6.000000 6.000000 6.000000
    mean -0.148307 -0.108734 0.021055 -0.555177
    std 0.904502 1.187138 1.207575 0.964228
    min -1.377782 -1.614034 -1.172664 -1.901977
    25% -0.731310 -0.790530 -0.592045 -1.269839
    50% -0.057289 -0.363414 -0.457511 -0.319243
    75% 0.289310 0.707452 0.401517 0.156058
    max 1.153657 1.556955 2.163044 0.481614
     

    数据转置:

    In [81]:
    df.T
    
    Out[81]:
     2013-01-01 00:00:002013-01-02 00:00:002013-01-03 00:00:002013-01-04 00:00:002013-01-05 00:00:002013-01-06 00:00:00
    A -0.234526 -1.377782 0.345764 -0.896905 0.119948 1.153657
    B -0.758501 -0.801207 0.031672 -1.614034 0.932712 1.556955
    C -0.609560 2.163044 0.660530 -0.539502 -0.375521 -1.172664
    D -0.541163 0.481614 0.240518 -1.512732 -1.901977 -0.097324
     

    数据按轴排序

    In [82]:
    df.sort_index(axis=1, ascending=False)
    
    Out[82]:
     DCBA
    2013-01-01 -0.541163 -0.609560 -0.758501 -0.234526
    2013-01-02 0.481614 2.163044 -0.801207 -1.377782
    2013-01-03 0.240518 0.660530 0.031672 0.345764
    2013-01-04 -1.512732 -0.539502 -1.614034 -0.896905
    2013-01-05 -1.901977 -0.375521 0.932712 0.119948
    2013-01-06 -0.097324 -1.172664 1.556955 1.153657
     

    按值排序

    In [83]:
    df.sort_values(by='B')
    
    Out[83]:
     ABCD
    2013-01-04 -0.896905 -1.614034 -0.539502 -1.512732
    2013-01-02 -1.377782 -0.801207 2.163044 0.481614
    2013-01-01 -0.234526 -0.758501 -0.609560 -0.541163
    2013-01-03 0.345764 0.031672 0.660530 0.240518
    2013-01-05 0.119948 0.932712 -0.375521 -1.901977
    2013-01-06 1.153657 1.556955 -1.172664 -0.097324
     

    选择

    注意:对于标准的Python/Numpy方法,选择与设置元素是直观并且易于交互的。但,对于生成代码,建议采用Pandas方法, .at, .iat, .loc, .iloc and .ix.

     

    获取数据

    选择标记为‘A’的一列

    In [84]:
    df['A']
    
    Out[84]:
    2013-01-01   -0.234526
    2013-01-02   -1.377782
    2013-01-03    0.345764
    2013-01-04   -0.896905
    2013-01-05    0.119948
    2013-01-06    1.153657
    Freq: D, Name: A, dtype: float64
     

    通过[]切分行:

    In [85]:
    df[0:3]
    
    Out[85]:
     ABCD
    2013-01-01 -0.234526 -0.758501 -0.609560 -0.541163
    2013-01-02 -1.377782 -0.801207 2.163044 0.481614
    2013-01-03 0.345764 0.031672 0.660530 0.240518
    In [86]:
    df['20130102':'20130104']
    
    Out[86]:
     ABCD
    2013-01-02 -1.377782 -0.801207 2.163044 0.481614
    2013-01-03 0.345764 0.031672 0.660530 0.240518
    2013-01-04 -0.896905 -1.614034 -0.539502 -1.512732
     

    通过label选择数据

    通过label获取对应数据

    In [87]:
    df.loc[dates[0]]
    
    Out[87]:
    A   -0.234526
    B   -0.758501
    C   -0.609560
    D   -0.541163
    Name: 2013-01-01 00:00:00, dtype: float64
     

    通过列label选择多列数据

    In [88]:
    df.loc[:,['A','B']]
    
    Out[88]:
     AB
    2013-01-01 -0.234526 -0.758501
    2013-01-02 -1.377782 -0.801207
    2013-01-03 0.345764 0.031672
    2013-01-04 -0.896905 -1.614034
    2013-01-05 0.119948 0.932712
    2013-01-06 1.153657 1.556955
     

    通过行label与列label选择数据

    In [89]:
    df.loc['20130102':'20130104',['A','B']]
    
    Out[89]:
     AB
    2013-01-02 -1.377782 -0.801207
    2013-01-03 0.345764 0.031672
    2013-01-04 -0.896905 -1.614034
     

    降低返回对象的维度

    In [90]:
    df.loc['20130102', ['A','B']]
    
    Out[90]:
    A   -1.377782
    B   -0.801207
    Name: 2013-01-02 00:00:00, dtype: float64
     

    获得标量值

    In [91]:
    df.loc[dates[0], 'A']
    
    Out[91]:
    -0.23452598284858567
     

    快速获得标量(同前一个方法)

    In [92]:
    df.at[dates[0], 'A']
    
    Out[92]:
    -0.23452598284858567
     

    通过位置选择数据

    通过传递整数位置选择数据

    In [93]:
    df.iloc[3]
    
    Out[93]:
    A   -0.896905
    B   -1.614034
    C   -0.539502
    D   -1.512732
    Name: 2013-01-04 00:00:00, dtype: float64
     

    通过行与列的位置选择数据,类似于python与numpy

    In [94]:
    df.iloc[3:5, 0:2]
    
    Out[94]:
     AB
    2013-01-04 -0.896905 -1.614034
    2013-01-05 0.119948 0.932712
     

    通过list来选择特定行或列的元素

    In [95]:
    df.iloc[[0, 2, 4], [0, 2]]
    
    Out[95]:
     AC
    2013-01-01 -0.234526 -0.609560
    2013-01-03 0.345764 0.660530
    2013-01-05 0.119948 -0.375521
     

    通过list选择行

    In [96]:
    df.iloc[1:3,:]
    
    Out[96]:
     ABCD
    2013-01-02 -1.377782 -0.801207 2.163044 0.481614
    2013-01-03 0.345764 0.031672 0.660530 0.240518
     

    通过list选择列

    In [97]:
    df.iloc[:,1:3]
    
    Out[97]:
     BC
    2013-01-01 -0.758501 -0.609560
    2013-01-02 -0.801207 2.163044
    2013-01-03 0.031672 0.660530
    2013-01-04 -1.614034 -0.539502
    2013-01-05 0.932712 -0.375521
    2013-01-06 1.556955 -1.172664
     

    通过行列元素选择特定的元素

    In [98]:
    df.iloc[1, 1]
    
    Out[98]:
    -0.80120684660851138
     

    快速访问某一个元素(同上一个方法)

    In [99]:
    df.iat[1,1]
    
    Out[99]:
    -0.80120684660851138
     

    布尔型(条件)索引

    用单列的值来选择元素

    In [100]:
    df[df.A > 0]
    
    Out[100]:
     ABCD
    2013-01-03 0.345764 0.031672 0.660530 0.240518
    2013-01-05 0.119948 0.932712 -0.375521 -1.901977
    2013-01-06 1.153657 1.556955 -1.172664 -0.097324
     

    选择全部数据满足条件的数据

    In [101]:
    df[df > 0]
    
    Out[101]:
     ABCD
    2013-01-01 NaN NaN NaN NaN
    2013-01-02 NaN NaN 2.163044 0.481614
    2013-01-03 0.345764 0.031672 0.660530 0.240518
    2013-01-04 NaN NaN NaN NaN
    2013-01-05 0.119948 0.932712 NaN NaN
    2013-01-06 1.153657 1.556955 NaN NaN
     

    通过isin()方法过滤数据

    In [102]:
    df2 = df.copy()
    df2['E'] = ['one', 'two', 'three', 'four', 'five', 'six']
    df2
    
    Out[102]:
     ABCDE
    2013-01-01 -0.234526 -0.758501 -0.609560 -0.541163 one
    2013-01-02 -1.377782 -0.801207 2.163044 0.481614 two
    2013-01-03 0.345764 0.031672 0.660530 0.240518 three
    2013-01-04 -0.896905 -1.614034 -0.539502 -1.512732 four
    2013-01-05 0.119948 0.932712 -0.375521 -1.901977 five
    2013-01-06 1.153657 1.556955 -1.172664 -0.097324 six
    In [103]:
    df2[df2['E'].isin(['two','four'])]
    
    Out[103]:
     ABCDE
    2013-01-02 -1.377782 -0.801207 2.163044 0.481614 two
    2013-01-04 -0.896905 -1.614034 -0.539502 -1.512732 four
     

    设置数据

     

    设置一个新列会根据索引自动对齐数据

    In [104]:
    s1 = pd.Series([1,2,3,4,5,6], index = pd.date_range('20130102', periods = 6))
    s1
    
    Out[104]:
    2013-01-02    1
    2013-01-03    2
    2013-01-04    3
    2013-01-05    4
    2013-01-06    5
    2013-01-07    6
    Freq: D, dtype: int64
    In [105]:
    df['F'] = s1
    df
    
    Out[105]:
     ABCDF
    2013-01-01 -0.234526 -0.758501 -0.609560 -0.541163 NaN
    2013-01-02 -1.377782 -0.801207 2.163044 0.481614 1.0
    2013-01-03 0.345764 0.031672 0.660530 0.240518 2.0
    2013-01-04 -0.896905 -1.614034 -0.539502 -1.512732 3.0
    2013-01-05 0.119948 0.932712 -0.375521 -1.901977 4.0
    2013-01-06 1.153657 1.556955 -1.172664 -0.097324 5.0
     

    通过label设置数据

    In [106]:
    df.at[dates[0],'A'] = 0
    df
    
    Out[106]:
     ABCDF
    2013-01-01 0.000000 -0.758501 -0.609560 -0.541163 NaN
    2013-01-02 -1.377782 -0.801207 2.163044 0.481614 1.0
    2013-01-03 0.345764 0.031672 0.660530 0.240518 2.0
    2013-01-04 -0.896905 -1.614034 -0.539502 -1.512732 3.0
    2013-01-05 0.119948 0.932712 -0.375521 -1.901977 4.0
    2013-01-06 1.153657 1.556955 -1.172664 -0.097324 5.0
     

    通过位置设置数据

    In [107]:
    df.iat[0, 1] = 0
    df
    
    Out[107]:
     ABCDF
    2013-01-01 0.000000 0.000000 -0.609560 -0.541163 NaN
    2013-01-02 -1.377782 -0.801207 2.163044 0.481614 1.0
    2013-01-03 0.345764 0.031672 0.660530 0.240518 2.0
    2013-01-04 -0.896905 -1.614034 -0.539502 -1.512732 3.0
    2013-01-05 0.119948 0.932712 -0.375521 -1.901977 4.0
    2013-01-06 1.153657 1.556955 -1.172664 -0.097324 5.0
     

    通过numpy设置整列数据

    In [109]:
    df.loc[:,'D'] = np.array([5] * len(df))
    df
    
    Out[109]:
     ABCDF
    2013-01-01 0.000000 0.000000 -0.609560 5 NaN
    2013-01-02 -1.377782 -0.801207 2.163044 5 1.0
    2013-01-03 0.345764 0.031672 0.660530 5 2.0
    2013-01-04 -0.896905 -1.614034 -0.539502 5 3.0
    2013-01-05 0.119948 0.932712 -0.375521 5 4.0
    2013-01-06 1.153657 1.556955 -1.172664 5 5.0
     

    通过where操作设置数据

    In [112]:
    df2 = df.copy()
    df2[df2 > 0] = -df2
    df2
    
    Out[112]:
     ABCDF
    2013-01-01 0.000000 0.000000 -0.609560 -5 NaN
    2013-01-02 -1.377782 -0.801207 -2.163044 -5 -1.0
    2013-01-03 -0.345764 -0.031672 -0.660530 -5 -2.0
    2013-01-04 -0.896905 -1.614034 -0.539502 -5 -3.0
    2013-01-05 -0.119948 -0.932712 -0.375521 -5 -4.0
    2013-01-06 -1.153657 -1.556955 -1.172664 -5 -5.0
     

    缺失数据

     

    Pandas基本采用np.nan表示缺失数据,缺失数据默认不会参加运算。

     

    重新索引允许您更改/添加/删除指定行或列上的索引。 这将返回数据的副本。

    In [113]:
    df1 = df.reindex(index=dates[0:4], columns=list(df.columns) + ['E'])
    df1.loc[dates[0]:dates[1],'E'] = 1
    df1
    
    Out[113]:
     ABCDFE
    2013-01-01 0.000000 0.000000 -0.609560 5 NaN 1.0
    2013-01-02 -1.377782 -0.801207 2.163044 5 1.0 1.0
    2013-01-03 0.345764 0.031672 0.660530 5 2.0 NaN
    2013-01-04 -0.896905 -1.614034 -0.539502 5 3.0 NaN
     

    舍去任何包含缺失数据的行

    In [114]:
    df1.dropna(how='any')
    
    Out[114]:
     ABCDFE
    2013-01-02 -1.377782 -0.801207 2.163044 5 1.0 1.0
     

    填充缺失数据

    In [116]:
    df1.fillna(value=5)
    
    Out[116]:
     ABCDFE
    2013-01-01 0.000000 0.000000 -0.609560 5 5.0 1.0
    2013-01-02 -1.377782 -0.801207 2.163044 5 1.0 1.0
    2013-01-03 0.345764 0.031672 0.660530 5 2.0 5.0
    2013-01-04 -0.896905 -1.614034 -0.539502 5 3.0 5.0
     

    返回布尔数据,如果为nan返回true,否则返回false

    In [117]:
    pd.isnull(df1)
    
    Out[117]:
     ABCDFE
    2013-01-01 False False False False True False
    2013-01-02 False False False False False False
    2013-01-03 False False False False False True
    2013-01-04 False False False False False True
     

    数据操作

     

    概率统计操作

     

    这些操作一般不包括缺失数据

     

    列方向均值操作:

    In [118]:
    df.mean()
    
    Out[118]:
    A   -0.109219
    B    0.017683
    C    0.021055
    D    5.000000
    F    3.000000
    dtype: float64
     

    行方向均值

    In [119]:
    df.mean(1)
    
    Out[119]:
    2013-01-01    1.097610
    2013-01-02    1.196811
    2013-01-03    1.607593
    2013-01-04    0.989912
    2013-01-05    1.935428
    2013-01-06    2.307590
    Freq: D, dtype: float64
     

    操作具有不同维度和需要对齐的对象。 此外,Pandas会沿着指定的维度自动广播。

    In [120]:
    s = pd.Series([1,3,5,np.nan,6,8], index=dates).shift(2)
    s
    
    Out[120]:
    2013-01-01    NaN
    2013-01-02    NaN
    2013-01-03    1.0
    2013-01-04    3.0
    2013-01-05    5.0
    2013-01-06    NaN
    Freq: D, dtype: float64
    In [121]:
    df.sub(s, axis='index')
    
    Out[121]:
     ABCDF
    2013-01-01 NaN NaN NaN NaN NaN
    2013-01-02 NaN NaN NaN NaN NaN
    2013-01-03 -0.654236 -0.968328 -0.339470 4.0 1.0
    2013-01-04 -3.896905 -4.614034 -3.539502 2.0 0.0
    2013-01-05 -4.880052 -4.067288 -5.375521 0.0 -1.0
    2013-01-06 NaN NaN NaN NaN NaN
     

    Apply

     

    应用方程到数据

    In [122]:
    df.apply(np.cumsum)
    
    Out[122]:
     ABCDF
    2013-01-01 0.000000 0.000000 -0.609560 5 NaN
    2013-01-02 -1.377782 -0.801207 1.553484 10 1.0
    2013-01-03 -1.032017 -0.769534 2.214015 15 3.0
    2013-01-04 -1.928922 -2.383568 1.674513 20 6.0
    2013-01-05 -1.808974 -1.450857 1.298992 25 10.0
    2013-01-06 -0.655317 0.106098 0.126328 30 15.0
    In [123]:
    df.apply(lambda x: x.max() - x.min())
    
    Out[123]:
    A    2.531438
    B    3.170989
    C    3.335708
    D    0.000000
    F    4.000000
    dtype: float64
     

    直方图

    In [125]:
    s = pd.Series(np.random.randint(0, 7, size=10))
    s
    
    Out[125]:
    0    5
    1    4
    2    0
    3    6
    4    1
    5    0
    6    4
    7    1
    8    1
    9    4
    dtype: int32
    In [126]:
    s.value_counts()
    
    Out[126]:
    4    3
    1    3
    0    2
    6    1
    5    1
    dtype: int64
     

    字符串方法

     

    Series在string中提供配置了一组字符串处理方法,可以方便地对数组的每个元素进行操作。

    In [128]:
    s = pd.Series(['A', 'B', 'C', 'Aaba', 'Baca', np.nan, 'CABA', 'dog', 'cat'])
    s
    
    Out[128]:
    0       A
    1       B
    2       C
    3    Aaba
    4    Baca
    5     NaN
    6    CABA
    7     dog
    8     cat
    dtype: object
    In [129]:
    s.str.lower()
    
    Out[129]:
    0       a
    1       b
    2       c
    3    aaba
    4    baca
    5     NaN
    6    caba
    7     dog
    8     cat
    dtype: object
     

    合并数据

     

    concat操作

     

    Pandas提供一系列合并数据的操作,例如:Series, DataFrame, and Panel操作

    In [133]:
    df = pd.DataFrame(np.random.randn(10, 4))
    df
    
    Out[133]:
     0123
    0 -0.382355 -1.457945 -1.058879 1.106354
    1 1.120769 -0.559993 -2.319053 1.404413
    2 -0.382069 1.362170 0.289857 1.009210
    3 -1.280824 1.158049 -0.356128 0.023975
    4 0.797376 0.384624 0.152443 -1.392915
    5 -1.886364 -0.276635 -1.159430 2.355427
    6 -1.851654 0.781897 1.054916 0.323811
    7 0.708013 1.648491 -0.031482 -1.245125
    8 0.502539 -1.013665 -1.653738 0.748612
    9 -0.773031 -1.085371 -1.070529 -0.190790
    In [135]:
    pieces = [df[:3], df[3:7], df[7:]]
    pd.concat(pieces)
    
    Out[135]:
     0123
    0 -0.382355 -1.457945 -1.058879 1.106354
    1 1.120769 -0.559993 -2.319053 1.404413
    2 -0.382069 1.362170 0.289857 1.009210
    3 -1.280824 1.158049 -0.356128 0.023975
    4 0.797376 0.384624 0.152443 -1.392915
    5 -1.886364 -0.276635 -1.159430 2.355427
    6 -1.851654 0.781897 1.054916 0.323811
    7 0.708013 1.648491 -0.031482 -1.245125
    8 0.502539 -1.013665 -1.653738 0.748612
    9 -0.773031 -1.085371 -1.070529 -0.190790
     

    join操作

    In [137]:
    left = pd.DataFrame({'key': ['foo', 'foo'], 'lval': [1, 2]})
    right = pd.DataFrame({'key': ['foo', 'foo'], 'rval': [4, 5]})
    left
    
    Out[137]:
     keylval
    0 foo 1
    1 foo 2
    In [138]:
    right
    
    Out[138]:
     keyrval
    0 foo 4
    1 foo 5
    In [139]:
    pd.merge(left, right, on='key')
    
    Out[139]:
     keylvalrval
    0 foo 1 4
    1 foo 1 5
    2 foo 2 4
    3 foo 2 5
     

    另一种方式

    In [141]:
    left= pd.DataFrame({'key': ['foo', 'bar'], 'lval': [1, 2]})
    right = pd.DataFrame({'key': ['foo', 'bar'], 'rval': [4, 5]})
    left
    
    Out[141]:
     keylval
    0 foo 1
    1 bar 2
    In [142]:
    right
    
    Out[142]:
     keyrval
    0 foo 4
    1 bar 5
    In [143]:
    pd.merge(left, right, on='key')
    
    Out[143]:
     keylvalrval
    0 foo 1 4
    1 bar 2 5
     

    Append操作

    In [145]:
    df = pd.DataFrame(np.random.randn(8, 4), columns=['A','B','C','D'])
    df
    
    Out[145]:
     ABCD
    0 -0.789698 0.807736 -0.535839 1.633274
    1 1.122605 0.550968 -0.337497 -0.287827
    2 0.231271 0.273175 0.891461 -0.531991
    3 0.902182 0.802540 -0.319774 0.131826
    4 -1.999558 -0.114146 0.449653 0.174982
    5 1.688851 0.789424 -0.194151 2.017002
    6 0.738474 -0.682955 -1.420737 -0.978726
    7 -0.234849 0.682447 0.753980 0.327666
    In [146]:
    s = df.iloc[3]
    df.append(s, ignore_index=True)
    
    Out[146]:
     ABCD
    0 -0.789698 0.807736 -0.535839 1.633274
    1 1.122605 0.550968 -0.337497 -0.287827
    2 0.231271 0.273175 0.891461 -0.531991
    3 0.902182 0.802540 -0.319774 0.131826
    4 -1.999558 -0.114146 0.449653 0.174982
    5 1.688851 0.789424 -0.194151 2.017002
    6 0.738474 -0.682955 -1.420737 -0.978726
    7 -0.234849 0.682447 0.753980 0.327666
    8 0.902182 0.802540 -0.319774 0.131826
     

    Group操作

     

    group by操作可以分为如下操作:1.基于某些规则分组数据;

    In [147]:
    df = pd.DataFrame({'A' : ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
                       'B' : ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
                       'C' : np.random.randn(8),
                       'D' : np.random.randn(8)})
    df
    
    Out[147]:
     ABCD
    0 foo one 0.645354 0.110824
    1 bar one 1.090199 -1.103622
    2 foo two 0.492488 0.918727
    3 bar three -1.938242 0.122475
    4 foo two -0.416253 -1.056679
    5 bar two -0.616998 0.482990
    6 foo one 0.311077 0.641402
    7 foo three -1.400911 -0.126283
     

    2.对每组数据应用方程处理,例如:sum

    In [148]:
    df.groupby('A').sum()
    
    Out[148]:
     CD
    A  
    bar -1.465042 -0.498157
    foo -0.368245 0.487990
     

    3.组合数据结果

    In [149]:
    df.groupby(['A','B']).sum()
    
    Out[149]:
      CD
    AB  
    barone 1.090199 -1.103622
    three -1.938242 0.122475
    two -0.616998 0.482990
    fooone 0.956431 0.752225
    three -1.400911 -0.126283
    two 0.076235 -0.137952
     

    reshape

     

    stack操作

    In [151]:
     tuples = list(zip(*[['bar', 'bar', 'baz', 'baz', 'foo', 'foo', 'qux', 'qux'],
                         ['one', 'two', 'one', 'two', 'one', 'two', 'one', 'two']]))
    index = pd.MultiIndex.from_tuples(tuples, names=['first', 'second'])
    df = pd.DataFrame(np.random.randn(8, 2), index=index, columns=['A', 'B'])
    df2 = df[:4]
    df2
    
    Out[151]:
      AB
    firstsecond  
    barone 0.482226 -1.085646
    two 0.805270 1.785375
    bazone -0.860945 -1.018662
    two 0.921586 -0.216946
     

    stack()方法压缩DataFrame列中数据

    In [152]:
    stacked = df2.stack()
    stacked
    
    Out[152]:
    first  second   
    bar    one     A    0.482226
                   B   -1.085646
           two     A    0.805270
                   B    1.785375
    baz    one     A   -0.860945
                   B   -1.018662
           two     A    0.921586
                   B   -0.216946
    dtype: float64
     

    unstack()方法,是stack()方法的反操作,例如:

    In [153]:
    stacked.unstack()
    
    Out[153]:
      AB
    firstsecond  
    barone 0.482226 -1.085646
    two 0.805270 1.785375
    bazone -0.860945 -1.018662
    two 0.921586 -0.216946
    In [154]:
    stacked.unstack(1)
    
    Out[154]:
     secondonetwo
    first   
    barA 0.482226 0.805270
    B -1.085646 1.785375
    bazA -0.860945 0.921586
    B -1.018662 -0.216946
    In [155]:
    stacked.unstack(0)
    
    Out[155]:
     firstbarbaz
    second   
    oneA 0.482226 -0.860945
    B -1.085646 -1.018662
    twoA 0.805270 0.921586
    B 1.785375 -0.216946
     

    数据透视表

    In [156]:
    df = pd.DataFrame({'A' : ['one', 'one', 'two', 'three'] * 3,
                       'B' : ['A', 'B', 'C'] * 4,
                       'C' : ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'] * 2,
                       'D' : np.random.randn(12),
                       'E' : np.random.randn(12)})
    df
    
    Out[156]:
     ABCDE
    0 one A foo -1.863191 -0.420588
    1 one B foo 1.308556 1.373186
    2 two C foo 0.969745 -0.244350
    3 three A bar 0.308731 0.904415
    4 one B bar 1.918195 -0.113832
    5 one C bar 0.121828 -0.364853
    6 two A foo 0.240318 -0.536804
    7 three B foo -1.067721 0.743653
    8 one C foo -0.837986 0.775367
    9 one A bar -0.622907 1.653944
    10 two B bar 0.221569 -2.451701
    11 three C bar -2.312243 0.502830
     

    可以获取透视数据:

    In [157]:
    pd.pivot_table(df, values='D', index=['A', 'B'], columns=['C'])
    
    Out[157]:
     Cbarfoo
    AB  
    oneA -0.622907 -1.863191
    B 1.918195 1.308556
    C 0.121828 -0.837986
    threeA 0.308731 NaN
    B NaN -1.067721
    C -2.312243 NaN
    twoA NaN 0.240318
    B 0.221569 NaN
    C NaN 0.969745
     

    Time Series(时间序列分析)

     

    Pandas具有简单,强大和高效的功能,用于在频率转换期间执行重采样操作

    In [158]:
    rng = pd.date_range('1/1/2012', periods=100, freq='S')
    ts = pd.Series(np.random.randint(0, 500, len(rng)), index=rng)
    ts.resample('5Min').sum()
    
    Out[158]:
    2012-01-01    26325
    Freq: 5T, dtype: int32
     

    时区表示

    In [159]:
    rng = pd.date_range('3/6/2012 00:00', periods=5, freq='D')
    ts = pd.Series(np.random.randn(len(rng)), rng)
    ts
    
    Out[159]:
    2012-03-06   -0.756171
    2012-03-07   -1.818210
    2012-03-08   -1.742229
    2012-03-09   -0.666278
    2012-03-10   -0.246013
    Freq: D, dtype: float64
    In [160]:
    ts_utc = ts.tz_localize('UTC')
    ts_utc
    
    Out[160]:
    2012-03-06 00:00:00+00:00   -0.756171
    2012-03-07 00:00:00+00:00   -1.818210
    2012-03-08 00:00:00+00:00   -1.742229
    2012-03-09 00:00:00+00:00   -0.666278
    2012-03-10 00:00:00+00:00   -0.246013
    Freq: D, dtype: float64
     

    转换成其他时区

    In [161]:
    ts_utc.tz_convert('US/Eastern')
    
    Out[161]:
    2012-03-05 19:00:00-05:00   -0.756171
    2012-03-06 19:00:00-05:00   -1.818210
    2012-03-07 19:00:00-05:00   -1.742229
    2012-03-08 19:00:00-05:00   -0.666278
    2012-03-09 19:00:00-05:00   -0.246013
    Freq: D, dtype: float64
     

    在时间跨度表示之间转换

    In [162]:
    rng = pd.date_range('1/1/2012', periods=5, freq='M')
    ts = pd.Series(np.random.randn(len(rng)), index=rng)
    ts
    
    Out[162]:
    2012-01-31   -2.280217
    2012-02-29   -1.265666
    2012-03-31   -0.693234
    2012-04-30   -0.160583
    2012-05-31    0.394237
    Freq: M, dtype: float64
    In [163]:
    ps = ts.to_period()
    ps
    
    Out[163]:
    2012-01   -2.280217
    2012-02   -1.265666
    2012-03   -0.693234
    2012-04   -0.160583
    2012-05    0.394237
    Freq: M, dtype: float64
    In [164]:
    ps.to_timestamp()
    
    Out[164]:
    2012-01-01   -2.280217
    2012-02-01   -1.265666
    2012-03-01   -0.693234
    2012-04-01   -0.160583
    2012-05-01    0.394237
    Freq: MS, dtype: float64
     

    在周期和时间戳之间进行转换,可以使用一些方便的数学函数。

    In [165]:
    prng = pd.period_range('1990Q1', '2000Q4', freq='Q-NOV')
    ts = pd.Series(np.random.randn(len(prng)), prng)
    ts.index = (prng.asfreq('M', 'e') + 1).asfreq('H', 's') + 9
    ts.head()
    
    Out[165]:
    1990-03-01 09:00   -0.116439
    1990-06-01 09:00    0.689428
    1990-09-01 09:00    1.603478
    1990-12-01 09:00   -0.074146
    1991-03-01 09:00   -0.642818
    Freq: H, dtype: float64
     

    Categorical Data

     

    pandas自0.15版本后,Pandas包含了Categorical Data

    In [166]:
    df = pd.DataFrame({"id":[1,2,3,4,5,6], "raw_grade":['a', 'b', 'b', 'a', 'a', 'e']})
    
    In [167]:
    df["grade"] = df["raw_grade"].astype("category")
    df["grade"]
    
    Out[167]:
    0    a
    1    b
    2    b
    3    a
    4    a
    5    e
    Name: grade, dtype: category
    Categories (3, object): [a, b, e]
     

    将Categorical Data重命名为更有意义的名称

    In [168]:
    df["grade"].cat.categories = ["very good", "good", "very bad"]
    
    In [169]:
    df["grade"] = df["grade"].cat.set_categories(["very bad", "bad", "medium", "good", "very good"])
    df["grade"]
    
    Out[169]:
    0    very good
    1         good
    2         good
    3    very good
    4    very good
    5     very bad
    Name: grade, dtype: category
    Categories (5, object): [very bad, bad, medium, good, very good]
     

    排序

    In [170]:
    df.sort_values(by="grade")
    
    Out[170]:
     idraw_gradegrade
    5 6 e very bad
    1 2 b good
    2 3 b good
    0 1 a very good
    3 4 a very good
    4 5 a very good
     

    按分类列分组还显示空类别:

    In [171]:
    df.groupby("grade").size()
    
    Out[171]:
    grade
    very bad     1
    bad          0
    medium       0
    good         2
    very good    3
    dtype: int64
     

    Plotting

    In [174]:
    ts = pd.Series(np.random.randn(1000), index=pd.date_range('1/1/2000', periods=1000))
    ts = ts.cumsum()
    ts.plot()
    
    Out[174]:
    <matplotlib.axes._subplots.AxesSubplot at 0x15444d8a518>
    In [176]:
    df = pd.DataFrame(np.random.randn(1000, 4), index=ts.index,
                      columns=['A', 'B', 'C', 'D'])
    df = df.cumsum()
    plt.figure(); df.plot(); plt.legend(loc='best')
    
    Out[176]:
    <matplotlib.legend.Legend at 0x154451d2048>
     

    获取数据

     

    CSV

    In [177]:
    df.to_csv('foo.csv')
    
    In [178]:
    pd.read_csv('foo.csv')
    
    Out[178]:
     Unnamed: 0ABCD
    0 2000-01-01 -1.393111 1.514788 0.280721 -0.294551
    1 2000-01-02 -0.642525 2.195259 0.592064 -0.057531
    2 2000-01-03 -1.001919 1.979959 2.552074 1.107618
    3 2000-01-04 0.949372 1.834230 2.081279 1.344826
    4 2000-01-05 1.031375 2.840040 1.999606 2.436098
    5 2000-01-06 0.709656 3.388670 1.952937 1.355730
    6 2000-01-07 1.723083 3.503932 0.290802 1.950817
    7 2000-01-08 2.374127 5.203101 -0.330689 3.287472
    8 2000-01-09 2.674692 8.018739 -0.036404 3.521545
    9 2000-01-10 2.630701 8.189795 0.385022 4.931897
    10 2000-01-11 3.126444 7.579697 0.679853 5.769419
    11 2000-01-12 2.709425 8.329440 -1.069793 5.963032
    12 2000-01-13 2.860447 7.253739 -1.309274 5.421635
    13 2000-01-14 3.314667 8.721476 -1.965677 4.064228
    14 2000-01-15 2.349331 8.863615 -2.761195 4.930499
    15 2000-01-16 1.958310 7.119070 -2.625528 2.593476
    16 2000-01-17 2.070067 9.004844 -2.052658 2.760979
    17 2000-01-18 1.965295 8.512214 -2.359618 0.699228
    18 2000-01-19 0.900034 8.133227 -1.982716 1.624482
    19 2000-01-20 0.035752 6.445181 -2.893294 1.594081
    20 2000-01-21 -2.014831 6.620557 -4.773599 1.002027
    21 2000-01-22 -0.923967 8.693741 -3.695795 0.558935
    22 2000-01-23 -1.668225 7.790933 -3.573539 0.506860
    23 2000-01-24 -2.056015 6.824887 -2.564427 -0.662262
    24 2000-01-25 -1.843934 8.955545 -2.683112 -0.809258
    25 2000-01-26 -1.856735 8.607138 -2.440229 -2.196696
    26 2000-01-27 -3.463581 8.966359 -3.873806 -1.524501
    27 2000-01-28 -4.788617 7.752132 -4.553026 -2.399321
    28 2000-01-29 -5.976016 7.036011 -4.460207 -1.605071
    29 2000-01-30 -5.034437 6.138244 -4.451524 -1.628326
    ... ... ... ... ... ...
    970 2002-08-28 7.087637 -10.915543 -58.482667 -74.821127
    971 2002-08-29 6.120060 -10.997083 -57.452650 -73.742471
    972 2002-08-30 7.691062 -11.176013 -57.091866 -73.281751
    973 2002-08-31 8.452480 -11.293687 -56.313169 -74.869274
    974 2002-09-01 7.553114 -10.510296 -56.900910 -75.732503
    975 2002-09-02 8.275944 -9.958087 -56.768495 -75.445624
    976 2002-09-03 10.713038 -10.430143 -56.750073 -76.824079
    977 2002-09-04 12.299215 -11.939397 -58.104176 -77.024297
    978 2002-09-05 13.160027 -12.323100 -58.055834 -79.144961
    979 2002-09-06 14.413937 -11.905493 -57.145252 -78.776055
    980 2002-09-07 14.514871 -11.271294 -58.854309 -78.735910
    981 2002-09-08 14.828057 -10.749718 -59.836208 -77.881782
    982 2002-09-09 14.525107 -9.981579 -59.973495 -77.019424
    983 2002-09-10 13.248493 -10.430729 -60.947579 -76.936804
    984 2002-09-11 13.866722 -9.581368 -61.176041 -76.366701
    985 2002-09-12 13.711199 -9.163406 -60.199849 -76.076018
    986 2002-09-13 12.278210 -9.744253 -61.488643 -77.830263
    987 2002-09-14 13.872331 -9.769701 -60.596225 -79.301200
    988 2002-09-15 13.742700 -11.035063 -61.132231 -79.930390
    989 2002-09-16 13.117977 -10.444062 -61.367571 -79.468517
    990 2002-09-17 12.547621 -13.403313 -61.542260 -77.685127
    991 2002-09-18 12.156448 -14.764817 -62.347422 -77.060493
    992 2002-09-19 12.356847 -14.440572 -62.154593 -75.846297
    993 2002-09-20 12.967553 -13.593848 -62.368733 -76.248747
    994 2002-09-21 12.188960 -13.960158 -64.801744 -77.560436
    995 2002-09-22 10.930343 -14.180484 -61.880778 -78.704844
    996 2002-09-23 11.546164 -14.640839 -60.011920 -79.081217
    997 2002-09-24 12.010131 -13.948313 -58.988687 -79.707070
    998 2002-09-25 10.203590 -14.170849 -59.165740 -78.678488
    999 2002-09-26 9.985623 -13.509717 -57.876236 -78.605853

    1000 rows × 5 columns

     

    HDF5

    In [179]:
    df.to_hdf('foo.h5','df')
    
    In [180]:
     pd.read_hdf('foo.h5','df')
    
    Out[180]:
     ABCD
    2000-01-01 -1.393111 1.514788 0.280721 -0.294551
    2000-01-02 -0.642525 2.195259 0.592064 -0.057531
    2000-01-03 -1.001919 1.979959 2.552074 1.107618
    2000-01-04 0.949372 1.834230 2.081279 1.344826
    2000-01-05 1.031375 2.840040 1.999606 2.436098
    2000-01-06 0.709656 3.388670 1.952937 1.355730
    2000-01-07 1.723083 3.503932 0.290802 1.950817
    2000-01-08 2.374127 5.203101 -0.330689 3.287472
    2000-01-09 2.674692 8.018739 -0.036404 3.521545
    2000-01-10 2.630701 8.189795 0.385022 4.931897
    2000-01-11 3.126444 7.579697 0.679853 5.769419
    2000-01-12 2.709425 8.329440 -1.069793 5.963032
    2000-01-13 2.860447 7.253739 -1.309274 5.421635
    2000-01-14 3.314667 8.721476 -1.965677 4.064228
    2000-01-15 2.349331 8.863615 -2.761195 4.930499
    2000-01-16 1.958310 7.119070 -2.625528 2.593476
    2000-01-17 2.070067 9.004844 -2.052658 2.760979
    2000-01-18 1.965295 8.512214 -2.359618 0.699228
    2000-01-19 0.900034 8.133227 -1.982716 1.624482
    2000-01-20 0.035752 6.445181 -2.893294 1.594081
    2000-01-21 -2.014831 6.620557 -4.773599 1.002027
    2000-01-22 -0.923967 8.693741 -3.695795 0.558935
    2000-01-23 -1.668225 7.790933 -3.573539 0.506860
    2000-01-24 -2.056015 6.824887 -2.564427 -0.662262
    2000-01-25 -1.843934 8.955545 -2.683112 -0.809258
    2000-01-26 -1.856735 8.607138 -2.440229 -2.196696
    2000-01-27 -3.463581 8.966359 -3.873806 -1.524501
    2000-01-28 -4.788617 7.752132 -4.553026 -2.399321
    2000-01-29 -5.976016 7.036011 -4.460207 -1.605071
    2000-01-30 -5.034437 6.138244 -4.451524 -1.628326
    ... ... ... ... ...
    2002-08-28 7.087637 -10.915543 -58.482667 -74.821127
    2002-08-29 6.120060 -10.997083 -57.452650 -73.742471
    2002-08-30 7.691062 -11.176013 -57.091866 -73.281751
    2002-08-31 8.452480 -11.293687 -56.313169 -74.869274
    2002-09-01 7.553114 -10.510296 -56.900910 -75.732503
    2002-09-02 8.275944 -9.958087 -56.768495 -75.445624
    2002-09-03 10.713038 -10.430143 -56.750073 -76.824079
    2002-09-04 12.299215 -11.939397 -58.104176 -77.024297
    2002-09-05 13.160027 -12.323100 -58.055834 -79.144961
    2002-09-06 14.413937 -11.905493 -57.145252 -78.776055
    2002-09-07 14.514871 -11.271294 -58.854309 -78.735910
    2002-09-08 14.828057 -10.749718 -59.836208 -77.881782
    2002-09-09 14.525107 -9.981579 -59.973495 -77.019424
    2002-09-10 13.248493 -10.430729 -60.947579 -76.936804
    2002-09-11 13.866722 -9.581368 -61.176041 -76.366701
    2002-09-12 13.711199 -9.163406 -60.199849 -76.076018
    2002-09-13 12.278210 -9.744253 -61.488643 -77.830263
    2002-09-14 13.872331 -9.769701 -60.596225 -79.301200
    2002-09-15 13.742700 -11.035063 -61.132231 -79.930390
    2002-09-16 13.117977 -10.444062 -61.367571 -79.468517
    2002-09-17 12.547621 -13.403313 -61.542260 -77.685127
    2002-09-18 12.156448 -14.764817 -62.347422 -77.060493
    2002-09-19 12.356847 -14.440572 -62.154593 -75.846297
    2002-09-20 12.967553 -13.593848 -62.368733 -76.248747
    2002-09-21 12.188960 -13.960158 -64.801744 -77.560436
    2002-09-22 10.930343 -14.180484 -61.880778 -78.704844
    2002-09-23 11.546164 -14.640839 -60.011920 -79.081217
    2002-09-24 12.010131 -13.948313 -58.988687 -79.707070
    2002-09-25 10.203590 -14.170849 -59.165740 -78.678488
    2002-09-26 9.985623 -13.509717 -57.876236 -78.605853

    1000 rows × 4 columns

     

    Excel

    In [181]:
    df.to_excel('foo.xlsx', sheet_name='Sheet1')
    
    In [182]:
     pd.read_excel('foo.xlsx', 'Sheet1', index_col=None, na_values=['NA'])
    
    Out[182]:
     ABCD
    2000-01-01 -1.393111 1.514788 0.280721 -0.294551
    2000-01-02 -0.642525 2.195259 0.592064 -0.057531
    2000-01-03 -1.001919 1.979959 2.552074 1.107618
    2000-01-04 0.949372 1.834230 2.081279 1.344826
    2000-01-05 1.031375 2.840040 1.999606 2.436098
    2000-01-06 0.709656 3.388670 1.952937 1.355730
    2000-01-07 1.723083 3.503932 0.290802 1.950817
    2000-01-08 2.374127 5.203101 -0.330689 3.287472
    2000-01-09 2.674692 8.018739 -0.036404 3.521545
    2000-01-10 2.630701 8.189795 0.385022 4.931897
    2000-01-11 3.126444 7.579697 0.679853 5.769419
    2000-01-12 2.709425 8.329440 -1.069793 5.963032
    2000-01-13 2.860447 7.253739 -1.309274 5.421635
    2000-01-14 3.314667 8.721476 -1.965677 4.064228
    2000-01-15 2.349331 8.863615 -2.761195 4.930499
    2000-01-16 1.958310 7.119070 -2.625528 2.593476
    2000-01-17 2.070067 9.004844 -2.052658 2.760979
    2000-01-18 1.965295 8.512214 -2.359618 0.699228
    2000-01-19 0.900034 8.133227 -1.982716 1.624482
    2000-01-20 0.035752 6.445181 -2.893294 1.594081
    2000-01-21 -2.014831 6.620557 -4.773599 1.002027
    2000-01-22 -0.923967 8.693741 -3.695795 0.558935
    2000-01-23 -1.668225 7.790933 -3.573539 0.506860
    2000-01-24 -2.056015 6.824887 -2.564427 -0.662262
    2000-01-25 -1.843934 8.955545 -2.683112 -0.809258
    2000-01-26 -1.856735 8.607138 -2.440229 -2.196696
    2000-01-27 -3.463581 8.966359 -3.873806 -1.524501
    2000-01-28 -4.788617 7.752132 -4.553026 -2.399321
    2000-01-29 -5.976016 7.036011 -4.460207 -1.605071
    2000-01-30 -5.034437 6.138244 -4.451524 -1.628326
    ... ... ... ... ...
    2002-08-28 7.087637 -10.915543 -58.482667 -74.821127
    2002-08-29 6.120060 -10.997083 -57.452650 -73.742471
    2002-08-30 7.691062 -11.176013 -57.091866 -73.281751
    2002-08-31 8.452480 -11.293687 -56.313169 -74.869274
    2002-09-01 7.553114 -10.510296 -56.900910 -75.732503
    2002-09-02 8.275944 -9.958087 -56.768495 -75.445624
    2002-09-03 10.713038 -10.430143 -56.750073 -76.824079
    2002-09-04 12.299215 -11.939397 -58.104176 -77.024297
    2002-09-05 13.160027 -12.323100 -58.055834 -79.144961
    2002-09-06 14.413937 -11.905493 -57.145252 -78.776055
    2002-09-07 14.514871 -11.271294 -58.854309 -78.735910
    2002-09-08 14.828057 -10.749718 -59.836208 -77.881782
    2002-09-09 14.525107 -9.981579 -59.973495 -77.019424
    2002-09-10 13.248493 -10.430729 -60.947579 -76.936804
    2002-09-11 13.866722 -9.581368 -61.176041 -76.366701
    2002-09-12 13.711199 -9.163406 -60.199849 -76.076018
    2002-09-13 12.278210 -9.744253 -61.488643 -77.830263
    2002-09-14 13.872331 -9.769701 -60.596225 -79.301200
    2002-09-15 13.742700 -11.035063 -61.132231 -79.930390
    2002-09-16 13.117977 -10.444062 -61.367571 -79.468517
    2002-09-17 12.547621 -13.403313 -61.542260 -77.685127
    2002-09-18 12.156448 -14.764817 -62.347422 -77.060493
    2002-09-19 12.356847 -14.440572 -62.154593 -75.846297
    2002-09-20 12.967553 -13.593848 -62.368733 -76.248747
    2002-09-21 12.188960 -13.960158 -64.801744 -77.560436
    2002-09-22 10.930343 -14.180484 -61.880778 -78.704844
    2002-09-23 11.546164 -14.640839 -60.011920 -79.081217
    2002-09-24 12.010131 -13.948313 -58.988687 -79.707070
    2002-09-25 10.203590 -14.170849 -59.165740 -78.678488
    2002-09-26 9.985623 -13.509717 -57.876236 -78.605853

    1000 rows × 4 columns

     
    In [ ]:
     
    In [ ]:
     
  • 相关阅读:
    JavaScript的for循环
    javaScript的执行机制-同步任务-异步任务-微任务-宏任务
    js排他性算法
    js倒计时
    微信小程序 简单获取屏幕视口高度
    小程序scroll-view实现回到顶部
    Vue使用js鼠标蜘蛛特效
    小程序获取当前播放长度和视频总长度,可在播放到某一时长暂停或停止视频
    Django学习笔记
    SQLite简单介绍
  • 原文地址:https://www.cnblogs.com/hgl0417/p/7625928.html
Copyright © 2011-2022 走看看