zoukankan      html  css  js  c++  java
  • Pandas 操作

    一、Series的创建: 

    pd.Series([ 数据 ])

    In [17]: import pandas as pd
    
    In [18]: import numpy as np
    
    In [19]: s = pd.Series([1,1,1,1,np.nan])
    
    In [20]: s
    Out[20]:
    0    1.0
    1    1.0
    2    1.0
    3    1.0
    4    NaN
    dtype: float64

    二、生成DataFrame

    1,Numpy 产生随机数组

    In [17]: np.random.rand(5,5)  # 生成5 x 5 的数组
    Out[17]:
    array([[ 0.67935636,  0.75784959,  0.85428253,  0.73356   ,  0.60555467],
           [ 0.93576592,  0.81445114,  0.18213442,  0.4784346 ,  0.14720462],
           [ 0.57083505,  0.62618339,  0.13534874,  0.19513107,  0.7190744 ],
           [ 0.66931535,  0.50888897,  0.00685189,  0.16140523,  0.68407209],
           [ 0.91081342,  0.67702016,  0.32823171,  0.43670926,  0.98735408]])

    2,Pandas 生成连续日期

    In [18]: pd.date_range('20180101',periods=6)
    Out[18]:
    DatetimeIndex(['2018-01-01', '2018-01-02', '2018-01-03', '2018-01-04',
                   '2018-01-05', '2018-01-06'],
                  dtype='datetime64[ns]', freq='D')

    3,生成带index和columns的DataFrame

    In [19]: df = pd.DataFrame(np.random.rand(6,4),index=pd.date_range('20180101',periods=6),columns=['a','b','c','d']) # 第一个参数为数据,第二个参数index为索引,第三个参数columns为列名
    
    In [20]: df
    Out[20]:
                       a         b         c         d
    2018-01-01  0.202113  0.205094  0.456936  0.535537
    2018-01-02  0.912747  0.812827  0.856495  0.872259
    2018-01-03  0.303067  0.832261  0.279915  0.297952
    2018-01-04  0.480393  0.315161  0.333675  0.072642
    2018-01-05  0.965324  0.561682  0.565189  0.503561
    2018-01-06  0.959792  0.227326  0.970319  0.757595

     4,Pandas 生成二维数组和一维数组

    In [6]: arr = np.arange(12) # 一维数组
    In [7]: arr
    Out[7]: array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])

     In [9]: arr = np.arange(12).reshape(3,4)  # 二维数组

     In [10]: arr
     Out[10]:
     array([[ 0, 1, 2, 3],
     [ 4, 5, 6, 7],
     [ 8, 9, 10, 11]])

    5,生成一个没有定义index和column的DataFrame (如果没有定义,index和column则为数字)

    In [11]: df = pd.DataFrame(arr) # 直接将二维数组传入即可
    
    In [12]: df
    Out[12]:
       0  1   2   3
    0  0  1   2   3
    1  4  5   6   7
    2  8  9  10  11

    6,利用数组生成DataFrame

    In [13]: arr = {"a":[1]*3,"b":[2]*3,"c":[3]*3} # 定义数组
    In [14]: arr
    Out[14]: {'a': [1, 1, 1], 'b': [2, 2, 2], 'c': [3, 3, 3]}

    #生成DataFrame
    In [16]: df = pd.DataFrame(arr)
    In [17]:
    In [17]: df
    Out[17]:
       a  b  c
    0  1  2  3
    1  1  2  3
    2  1  2  3

    三、DataFrame的基本操作

    1,取某一列

    In [20]: df['a']
    Out[20]:
    0    1
    1    1
    2    1
    Name: a, dtype: int64

    2,查看数组类型dtypes

    In [21]: df.dtypes
    Out[21]:
    a    int64
    b    int64
    c    int64
    dtype: object

    3,查看索引index

    In [23]: df.index
    Out[23]: RangeIndex(start=0, stop=3, step=1)

    4,查看列cloumns

    In [24]: df.columns
    Out[24]: Index([u'a', u'b', u'c'], dtype='object')

    5,查看值values

    In [25]: df.values
    Out[25]:
    array([[1, 2, 3],
           [1, 2, 3],
           [1, 2, 3]], dtype=int64)

    6,查看数据的总结describe

    In [32]: df.describe()
    Out[32]:
             a    b    c
    count  3.0  3.0  3.0
    mean   1.0  2.0  3.0
    std    0.0  0.0  0.0
    min    1.0  2.0  3.0
    25%    1.0  2.0  3.0
    50%    1.0  2.0  3.0
    75%    1.0  2.0  3.0
    max    1.0  2.0  3.0

    7,翻转数据 transpose、T

    In [37]: df.transpose()
    Out[37]:
       0  1  2
    a  1  1  1
    b  2  2  2
    c  3  3  3
    
    In [38]: df.T
    Out[38]:
       0  1  2
    a  1  1  1
    b  2  2  2
    c  3  3  3

    四,对DataFrame的索引(index)进行排序

    df.sort_index(axis=0,ascending=True) #axis为0代表对行排序1代表对列。ascending为True代表正序,False代表反序

    # 行,正序
    In [43]: df.sort_index(axis=0,ascending=True)
    Out[43]:
       a  b  c
    0  1  2  3
    1  1  2  3
    2  1  2  3

    # 行,反序

    In [44]: df.sort_index(axis=0,ascending=False)
    Out[44]:
       a  b  c
    2  1  2  3
    1  1  2  3
    0  1  2  3

    #列,正序

    In [46]: df.sort_index(axis=1,ascending=True)
    Out[46]:
       a  b  c
    0  1  2  3
    1  1  2  3
    2  1  2  3

    #列,反序

    In [45]: df.sort_index(axis=1,ascending=False)
    Out[45]:
       c  b  a
    0  3  2  1
    1  3  2  1
    2  3  2  1

    五,对DataFrame的值(values)进行排序

    # F列,正序
    In [205]: df.sort_values(by='F',ascending=True) # by 是根据某列排序,ascending代表的是正序或者反序 Out[205]: A B C D E F 2013-01-01 0 1 2 3 NaN 11 2013-01-02 0 0 0 0 NaN 22 2013-01-03 0 0 0 0 NaN 33 2013-01-04 0 0 0 0 NaN 44 2013-01-05 0 0 0 0 NaN 55 2013-01-06 0 0 0 0 NaN 66
    # F列,反序 In [
    206]: df.sort_values(by='F',ascending=False) Out[206]: A B C D E F 2013-01-06 0 0 0 0 NaN 66 2013-01-05 0 0 0 0 NaN 55 2013-01-04 0 0 0 0 NaN 44 2013-01-03 0 0 0 0 NaN 33 2013-01-02 0 0 0 0 NaN 22 2013-01-01 0 1 2 3 NaN 11

    六,对DataFrame数据进行选择

    1,简单筛选 df.A = df['A']

    # 生成一个DataFrame
    In [58]: dates = pd.date_range('20130101', periods=6) ...: df = pd.DataFrame(np.arange(24).reshape((6,4)),index=dates, columns=['A','B','C','D']) ...:

    In [63]: df.A
    Out[63]:
    2013-01-01     0
    2013-01-02     4
    2013-01-03     8
    2013-01-04    12
    2013-01-05    16
    2013-01-06    20
    Freq: D, Name: A, dtype: int32
    
    In [64]: df['A']
    Out[64]:
    2013-01-01     0
    2013-01-02     4
    2013-01-03     8
    2013-01-04    12
    2013-01-05    16
    2013-01-06    20
    Freq: D, Name: A, dtype: int32

    2,按索引或者索引下标进行筛选

    In [60]: df[0:3]  # 按照索引进行筛选
    Out[60]:
                A  B   C   D
    2013-01-01  0  1   2   3
    2013-01-02  4  5   6   7
    2013-01-03  8  9  10  11
    
    In [61]: df['20130101':'20130102']  # 按照索引下标值进行筛选
    Out[61]:
                A  B  C  D
    2013-01-01  0  1  2  3
    2013-01-02  4  5  6  7

    3,使用loc进行标签筛选 df.loc['行',[列]]

          列的格式为list,其中 : 代表全部。行只能筛选一行,列能筛选多列

    # 只有行
    In [66]: df.loc['20130102'] # 只有一个参数,默认是对行 Out[66]: A 4 B 5 C 6 D 7 Name: 2013-01-02 00:00:00, dtype: int32

    # 只有列
    In [71]: df.loc[:,['A']]
    Out[71]:
                 A
    2013-01-01   0
    2013-01-02   4
    2013-01-03   8
    2013-01-04  12
    2013-01-05  16
    2013-01-06  20
    # 全部行,特定列
    In [72]: df.loc[:,['A','B','C']]
    Out[72]:
                 A   B   C
    2013-01-01   0   1   2
    2013-01-02   4   5   6
    2013-01-03   8   9  10
    2013-01-04  12  13  14
    2013-01-05  16  17  18
    2013-01-06  20  21  22
    # 全部列,特定行(只能筛选一行)
    In [81]: df.loc['20130101',:]
    Out[81]:
    A    0
    B    1
    C    2
    D    3
    Name: 2013-01-01 00:00:00, dtype: int32
    # 特定列,特定行
    In [82]: df.loc['20130101',['A','B','C']]
    Out[82]:
    A    0
    B    1
    C    2
    Name: 2013-01-01 00:00:00, dtype: int32

    4,使用iloc进行位置筛选 df.loc[[行],[列]]

          行和列的格式为list,其中 : 代表全部。这里的取值并不是list中的范围,例如 [0:5] 并不代表第0条到第5条,而是代表第0条和第5条

    In [90]: df.iloc[[0,1],[0,3]]
    Out[90]:
                A  D
    2013-01-01  0  3
    2013-01-02  4  7
    
    In [91]: df.iloc[[1],[0,3]]
    Out[91]:
                A  D
    2013-01-02  4  7
    
    In [92]: df.iloc[[2,4],[0,3]]
    Out[92]:
                 A   D
    2013-01-03   8  11
    2013-01-05  16  19

     5,使用ix进行混合选择

    # 前三行,A,B列
    In [111]: df.ix[:3,['A','B']]   
    Out[111]:
                A  B
    2013-01-01  0  1
    2013-01-02  4  5
    2013-01-03  8  9
    # 前三行,前两列
    In [112]: df.ix[:3,:2]
    Out[112]:
                A  B
    2013-01-01  0  1
    2013-01-02  4  5
    2013-01-03  8  9
    # 20130101 行,A,B列
    In [113]: df.ix['20130101',['A','B']]
    Out[113]:
    A    0
    B    1
    Name: 2013-01-01 00:00:00, dtype: int32

    总结 : 使用loc以及,ix对DataFrame的行和列的操作的时候,如果对行的筛选条件为名称筛选,那么只能筛选一行

    例如: 

    df.loc['20130101',['A','B','C']]
    df.ix['20130101',['A','B']]

    6,使用判断进行筛选

    In [125]: df[df['A']>8]  # A列的值大于8
    Out[125]:
                 A   B   C   D
    2013-01-04  12  13  14  15
    2013-01-05  16  17  18  19
    2013-01-06  20  21  22  23
    
    In [126]: df[df.iloc[:]>8]  # 所有的值大于8
    Out[126]:
                   A     B     C     D
    2013-01-01   NaN   NaN   NaN   NaN
    2013-01-02   NaN   NaN   NaN   NaN
    2013-01-03   NaN   9.0  10.0  11.0
    2013-01-04  12.0  13.0  14.0  15.0
    2013-01-05  16.0  17.0  18.0  19.0
    2013-01-06  20.0  21.0  22.0  23.0

    七,为DataFrame设置新的值

    1,iloc定位赋值

    In [136]: df  # 先查看df
    Out[136]:
                 A   B   C   D
    2013-01-01   0   1   2   3
    2013-01-02   4   5   6   7
    2013-01-03   8   9  10  11
    2013-01-04  12  13  14  15
    2013-01-05  16  17  18  19
    2013-01-06  20  21  22  23
    
    In [137]: df.iloc[0,0] # 定位第0行第0个元素
    Out[137]: 0
    
    In [138]: df.iloc[0,0] = 10 # 将第0行第0个元素的值赋为10
    
    In [139]: df  # 查看结果
    Out[139]:
                 A   B   C   D
    2013-01-01  10   1   2   3
    2013-01-02   4   5   6   7
    2013-01-03   8   9  10  11
    2013-01-04  12  13  14  15
    2013-01-05  16  17  18  19
    2013-01-06  20  21  22  23

    2,loc定位赋值

    In [140]: df.loc['20130101','A'] # 定位行为'20130101',列为'A'的元素
    Out[140]: 10
    
    In [141]: df.loc['20130101','A'] = 0 # 将值赋为0
    
    In [142]: df # 查看结果
    Out[142]:
                 A   B   C   D
    2013-01-01   0   1   2   3
    2013-01-02   4   5   6   7
    2013-01-03   8   9  10  11
    2013-01-04  12  13  14  15
    2013-01-05  16  17  18  19
    2013-01-06  20  21  22  23

    3,使用判断赋值

    In [156]: df[df.B>4]=0 # B列的值大于4,就将值赋为0
    
    In [157]: df
    Out[157]:
                A  B  C  D
    2013-01-01  0  1  2  3
    2013-01-02  0  0  0  0
    2013-01-03  0  0  0  0
    2013-01-04  0  0  0  0
    2013-01-05  0  0  0  0
    2013-01-06  0  0  0  0

    4,新增一列,并赋值为NaN

    In [159]: df['E']=np.nan  # 新增一列 E ,并且赋值
    
    In [160]: df
    Out[160]:
                A  B  C  D   E
    2013-01-01  0  1  2  3 NaN
    2013-01-02  0  0  0  0 NaN
    2013-01-03  0  0  0  0 NaN
    2013-01-04  0  0  0  0 NaN
    2013-01-05  0  0  0  0 NaN
    2013-01-06  0  0  0  0 NaN

    5,新增一列,并赋值

    In [163]: df['F'] = pd.Series([11,22,33,44,55,66,], index=pd.date_range('20130101',periods=6))
    
    In [164]: df
    Out[164]:
                A  B  C  D   E   F
    2013-01-01  0  1  2  3 NaN  11
    2013-01-02  0  0  0  0 NaN  22
    2013-01-03  0  0  0  0 NaN  33
    2013-01-04  0  0  0  0 NaN  44
    2013-01-05  0  0  0  0 NaN  55
    2013-01-06  0  0  0  0 NaN  66

    八,DataFrame处理NaN

    1,直接删除含有NaN的行或列

    In [181]: df
    Out[181]:
                A  B  C  D   E   F
    2013-01-01  0  1  2  3 NaN  11
    2013-01-02  0  0  0  0 NaN  22
    2013-01-03  0  0  0  0 NaN  33
    2013-01-04  0  0  0  0 NaN  44
    2013-01-05  0  0  0  0 NaN  55
    2013-01-06  0  0  0  0 NaN  66
    
    In [182]: df.dropna(axis=0,how='any') # 0: 对行进行操作; 1: 对列进行操作 how='any' ,'any': 只要存在 NaN 就 drop 掉; 'all': 必须全部是 NaN 才 drop 。不会对原DataFrame操作,会返回一个新的DataFrame
    Out[182]: Empty DataFrame Columns: [A, B, C, D, E, F] Index: []  # 返回的结果

    2,将值为NaN的值替换为指定的值

    In [186]: df.fillna(value=10)  # 放值为NaN时,将值设置10
    Out[186]:
                A  B  C  D     E   F
    2013-01-01  0  1  2  3  10.0  11
    2013-01-02  0  0  0  0  10.0  22
    2013-01-03  0  0  0  0  10.0  33
    2013-01-04  0  0  0  0  10.0  44
    2013-01-05  0  0  0  0  10.0  55
    2013-01-06  0  0  0  0  10.0  66

    3,返回布尔值(为NaN则返回True,反之返回False)

    In [187]: df.isnull()
    Out[187]:
                    A      B      C      D     E      F
    2013-01-01  False  False  False  False  True  False
    2013-01-02  False  False  False  False  True  False
    2013-01-03  False  False  False  False  True  False
    2013-01-04  False  False  False  False  True  False
    2013-01-05  False  False  False  False  True  False
    2013-01-06  False  False  False  False  True  False

    4,检测数据中是否存在NaN

    In [193]: np.any(df.isnull()) == True
    Out[193]: True

    九,文件的读取和操作

    1,读取Excel或者csv

     pd.read_csv('test.csv',encoding='utf-8')
     pd.read_excel('test.xlsx',encoding='utf-8')

    2,将结果写入pickle

    data.to_pickle('test.pickle')

    十,读取文件拓展(写为函数)

    1,读取文件

    def get_dataframe_by_file(abs_filename, skiprows=0, nrows=None):
        ext = os.path.splitext(abs_filename)[1].lower()
        if ext == '.csv':
            try:
                df = pd.read_csv(abs_filename, encoding='utf8', skiprows=skiprows, nrows=nrows)
            except UnicodeDecodeError:
                df = pd.read_csv(abs_filename, encoding='gb18030', skiprows=skiprows, nrows=nrows)
        elif ext in ['.xlsx', '.xls']:
            df = pd.read_excel(abs_filename, skiprows=skiprows, nrows=nrows)
        else:
            raise Exception('not supported yet')
        return df

    2,将数据转为utf-8

    def convert_csv_to_utf8(csv_file):
        newpath = os.path.join(DATA_FILE_PATH, 'csvdata_utf8')
        if not os.path.exists(newpath):
            os.makedirs(newpath)
        basename = os.path.basename(csv_file)
        newname = os.path.join(newpath, basename)
        file(newname, 'wb').write(
            file(csv_file).read().decode('gb18030').encode('utf-8'))
        return newname

    十一,Panads DataFrame合并

    1,concat合并行

    In [207]: df1 = pd.DataFrame(np.ones((3,4))*0, columns=['a','b','c','d'])
         ...: df2 = pd.DataFrame(np.ones((3,4))*1, columns=['a','b','c','d'])
         ...: df3 = pd.DataFrame(np.ones((3,4))*2, columns=['a','b','c','d'])
         ...:
    
    In [208]:
    
    In [208]:
    
    In [208]: df1
    Out[208]:
         a    b    c    d
    0  0.0  0.0  0.0  0.0
    1  0.0  0.0  0.0  0.0
    2  0.0  0.0  0.0  0.0
    
    In [209]: df2
    Out[209]:
         a    b    c    d
    0  1.0  1.0  1.0  1.0
    1  1.0  1.0  1.0  1.0
    2  1.0  1.0  1.0  1.0
    
    In [210]: df3
    Out[210]:
         a    b    c    d
    0  2.0  2.0  2.0  2.0
    1  2.0  2.0  2.0  2.0
    2  2.0  2.0  2.0  2.0
    
    In [211]: re = pd.concat([df1,df2,df3],axis=0)  # pd.concat([DataFrame列表],axis=0) axis=0代表行合并,1代表列合并。合并后index是重复的,可使用ignore_index = True让索引不重复pd.concat([df1, df2, df3], axis=0, ignore_index=True)
    In [212]: re
    Out[212]:
         a    b    c    d
    0  0.0  0.0  0.0  0.0
    1  0.0  0.0  0.0  0.0
    2  0.0  0.0  0.0  0.0
    0  1.0  1.0  1.0  1.0
    1  1.0  1.0  1.0  1.0
    2  1.0  1.0  1.0  1.0
    0  2.0  2.0  2.0  2.0
    1  2.0  2.0  2.0  2.0
    2  2.0  2.0  2.0  2.0
    
    In [213]:

    2,concat合并列

    In [213]: re = pd.concat([df1,df2,df3],axis=1)
    
    In [214]: re
    Out[214]:
         a    b    c    d    a    b    c    d    a    b    c    d
    0  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0  2.0  2.0  2.0  2.0
    1  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0  2.0  2.0  2.0  2.0
    2  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0  2.0  2.0  2.0  2.0

    3,concat join='outer' 合并(相同的columns或者index进行合并,独自的index或者columns自成行或列。没有值的为NaN)

    # 先准备两个df
    In [227]: df1 Out[227]: a b c d 1 0.0 0.0 0.0 0.0 2 0.0 0.0 0.0 0.0 3 0.0 0.0 0.0 0.0 In [228]: df2 Out[228]: b c d e 2 1.0 1.0 1.0 1.0 3 1.0 1.0 1.0 1.0 4 1.0 1.0 1.0 1.0

    # join = 'outer',axis = 0 合并 (可以理解为按照行取并集合并)

    In [231]: re = pd.concat([df1,df2],axis=0,join='outer',ignore_index=True)
    
    In [232]: re
    Out[232]:
         a    b    c    d    e
    0  0.0  0.0  0.0  0.0  NaN
    1  0.0  0.0  0.0  0.0  NaN
    2  0.0  0.0  0.0  0.0  NaN
    3  NaN  1.0  1.0  1.0  1.0
    4  NaN  1.0  1.0  1.0  1.0
    5  NaN  1.0  1.0  1.0  1.0

    # join = 'outer',axis = 1 合并 (可以理解为按照列取交集进行合并)

    In [233]: re = pd.concat([df1,df2],axis=1,join='outer',ignore_index=True)
    
    In [234]: re
    Out[234]:
         0    1    2    3    4    5    6    7
    1  0.0  0.0  0.0  0.0  NaN  NaN  NaN  NaN
    2  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0
    3  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0
    4  NaN  NaN  NaN  NaN  1.0  1.0  1.0  1.0

    4,concat join='inner' 合并(相同的columns或者index进行合并,独自的index或者columns自成行或列。没有值的为NaN)

    # 准备两个df
    In [240]: df1
    Out[240]:
         a    b    c    d
    1  0.0  0.0  0.0  0.0
    2  0.0  0.0  0.0  0.0
    3  0.0  0.0  0.0  0.0
    
    In [241]: df2
    Out[241]:
         b    c    d    e
    2  1.0  1.0  1.0  1.0
    3  1.0  1.0  1.0  1.0
    4  1.0  1.0  1.0  1.0

    # join = 'inner', axis = 0 合并(相当于行取交集进行合并)

    In [235]: re = pd.concat([df1,df2],axis=0,join='inner',ignore_index=True)
    
    In [236]: re
    Out[236]:
         b    c    d
    0  0.0  0.0  0.0
    1  0.0  0.0  0.0
    2  0.0  0.0  0.0
    3  1.0  1.0  1.0
    4  1.0  1.0  1.0
    5  1.0  1.0  1.0

    # join = 'inner', axis = 1 合并(相当于列取交集进行合并)  

    In [237]: re = pd.concat([df1,df2],axis=1,join='inner',ignore_index=True)
    
    In [238]: re
    Out[238]:
         0    1    2    3    4    5    6    7
    2  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0
    3  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0

    5,concat  join_axes合并(可以按照某个DataFrame的index或者columns进行合并) 

    # 准备两个df
    In [253]: df1
    Out[253]:
         a    b    c    d
    1  0.0  0.0  0.0  0.0
    2  0.0  0.0  0.0  0.0
    3  0.0  0.0  0.0  0.0
    
    In [254]: df2
    Out[254]:
         b    c    d    e
    2  1.0  1.0  1.0  1.0
    3  1.0  1.0  1.0  1.0
    4  1.0  1.0  1.0  1.0

    # 按照df1的index进行合并

    In [255]: res = pd.concat([df1, df2], axis=1, join_axes=[df1.index])  # 这里需要注意,join_axes是要进行合并的某个df的index或者columns,axis=1 代表行。这里需要同步,既,后面为index,axis为1,后面为columns,axis为0
    
    In [256]: res
    Out[256]:
         a    b    c    d    b    c    d    e
    1  0.0  0.0  0.0  0.0  NaN  NaN  NaN  NaN
    2  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0
    3  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0

    # 按照df2的columns进行合并

    In [257]: res = pd.concat([df1, df2], axis=0, join_axes=[df1.columns])
    
    In [258]: res
    Out[258]:
         a    b    c    d
    1  0.0  0.0  0.0  0.0
    2  0.0  0.0  0.0  0.0
    3  0.0  0.0  0.0  0.0
    2  NaN  1.0  1.0  1.0
    3  NaN  1.0  1.0  1.0
    4  NaN  1.0  1.0  1.0

    6,append追加合并(追加只能行合并,不能列合并)

    # 创建两个df
    In [263]: df1
    Out[263]:
         a    b    c    d
    0  0.0  0.0  0.0  0.0
    1  0.0  0.0  0.0  0.0
    2  0.0  0.0  0.0  0.0
    
    In [264]: df2
    Out[264]:
         a    b    c    d
    0  1.0  1.0  1.0  1.0
    1  1.0  1.0  1.0  1.0
    2  1.0  1.0  1.0  1.0

    # 进行append操作

    In [265]: res = df1.append(df2, ignore_index=True)
    
    In [266]: res
    Out[266]:
         a    b    c    d
    0  0.0  0.0  0.0  0.0
    1  0.0  0.0  0.0  0.0
    2  0.0  0.0  0.0  0.0
    3  1.0  1.0  1.0  1.0
    4  1.0  1.0  1.0  1.0
    5  1.0  1.0  1.0  1.0
    更多内容,可访问:http://rexyan.cn
  • 相关阅读:
    【SCOI 2011】 糖果
    【POJ 3159】 Candies
    【POJ 1716】 Integer Intervals
    【POJ 2983】 Is the information reliable?
    【POJ 1364】 King
    【POJ 1201】 Intervals
    【POJ 1804】 Brainman
    6月10日省中提高组题解
    【POJ 3352】 Road Construction
    【POJ 1144】 Network
  • 原文地址:https://www.cnblogs.com/rexyan/p/7295602.html
Copyright © 2011-2022 走看看