  • Python for Data Analysis: NumPy Basics, Part 2

     

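    NumPy data types include:

    int8, uint8, int16, uint16, int32, uint32, int64, uint64, float16, float32, float64, float128, complex64, complex128, complex256, bool, object, string_, unicode_

    float128 and complex256 are available only on some platforms, and NumPy 2.0 removed the names string_ and unicode_ in favor of bytes_ and str_.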

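    astype

    A method that explicitly converts an array to another dtype. By default it returns a new array and leaves the original unchanged.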
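
A minimal sketch of astype, using made-up sample values (the array names and contents below are assumptions for illustration, not taken from the original post). It shows an integer-to-float conversion, a float-to-integer conversion that drops the fractional part, and parsing numeric strings:

```python
import numpy as np

# Hypothetical sample values, chosen only for illustration.
ints = np.array([1, 2, 3, 4, 5])
floats = ints.astype(np.float64)             # integer -> float, new array

decimals = np.array([3.7, -1.2, 0.5])
truncated = decimals.astype(np.int32)        # fractional part is discarded

numeric_strings = np.array(["1.25", "-9.6", "42"])
parsed = numeric_strings.astype(np.float64)  # numeric strings -> float

print(floats.dtype)   # float64
print(truncated)      # [ 3 -1  0]
print(parsed)
```

Because astype copies by default, modifying floats afterwards should not affect ints.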


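    Indexing and Slicing NumPy Arrays

    Indexing

    Indexing a one-dimensional array works essentially the same way as indexing a Python list. A multidimensional array additionally accepts comma-separated indices, one per axis.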
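
As a rough illustration of that similarity, the sketch below indexes a list and an array side by side; the sample values and variable names are assumptions made for this example:

```python
import numpy as np

values = [10, 20, 30, 40]          # hypothetical sample data
arr1d = np.array(values)

# 1-D arrays: same syntax and same results as the list
same = (arr1d[0] == values[0]) and (arr1d[-1] == values[-1])

arr2d = np.array([[1, 2, 3],
                  [4, 5, 6]])
nested = arr2d[1][2]   # list-style chained indexing
comma = arr2d[1, 2]    # NumPy-style comma-separated indexing

print(same, nested, comma)
```

Both forms pick the same element of the two-dimensional array; the comma form is the more common NumPy idiom.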

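    Slicing

    Changing a value in a slice of a NumPy array also changes the source array, because a slice is a view of the source array rather than a newly copied array. In the session below, arr = 0 only rebinds the name arr and arr == 0 is just a comparison, so data stays the same after both. Only the element assignment arr[1,1]=0 writes into the view: arr changes, and the corresponding position in data (data[3,2]) changes with it.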

    In [101]: data = np.random.randn(4,4)
    
    In [102]: data
    Out[102]:
    array([[-1.68867271, -0.89369286, -0.0288363 ,  0.73855122],
           [-0.13084603,  0.43972144,  0.73542583,  1.99925332],
           [ 0.04291022, -0.91963212,  3.09214837, -0.6070068 ],
           [-0.01416294, -1.46576298,  1.42196278,  0.84758994]])
    
    In [103]: arr = data[2:,1:]
    
    In [104]: arr
    Out[104]:
    array([[-0.91963212,  3.09214837, -0.6070068 ],
           [-1.46576298,  1.42196278,  0.84758994]])
    
    In [105]: arr = 0
    
    In [106]: data
    Out[106]:
    array([[-1.68867271, -0.89369286, -0.0288363 ,  0.73855122],
           [-0.13084603,  0.43972144,  0.73542583,  1.99925332],
           [ 0.04291022, -0.91963212,  3.09214837, -0.6070068 ],
           [-0.01416294, -1.46576298,  1.42196278,  0.84758994]])
    
    In [107]: arr
    Out[107]: 0
    
    In [108]: arr = data[2:,1:]
    
    In [109]: arr
    Out[109]:
    array([[-0.91963212,  3.09214837, -0.6070068 ],
           [-1.46576298,  1.42196278,  0.84758994]])
    
    In [110]: arr == 0
    Out[110]:
    array([[False, False, False],
           [False, False, False]], dtype=bool)
    
    In [111]: arr
    Out[111]:
    array([[-0.91963212,  3.09214837, -0.6070068 ],
           [-1.46576298,  1.42196278,  0.84758994]])
    
    In [112]: arr[1,1]=0
    
    In [113]: arr
    Out[113]:
    array([[-0.91963212,  3.09214837, -0.6070068 ],
           [-1.46576298,  0.        ,  0.84758994]])
    
    In [114]: data
    Out[114]:
    array([[-1.68867271, -0.89369286, -0.0288363 ,  0.73855122],
           [-0.13084603,  0.43972144,  0.73542583,  1.99925332],
           [ 0.04291022, -0.91963212,  3.09214837, -0.6070068 ],
           [-0.01416294, -1.46576298,  0.        ,  0.84758994]])
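
The same behaviour can be checked in a compact, deterministic form. This is only a sketch: it swaps the random data for np.arange so the values are predictable, and uses np.shares_memory to confirm that the slice is a view:

```python
import numpy as np

data = np.arange(16, dtype=np.float64).reshape(4, 4)

arr = data[2:, 1:]                      # a view onto rows 2-3, columns 1-3
shares = np.shares_memory(arr, data)    # True: no data was copied

arr[1, 1] = 0.0                         # writes through to data[3, 2]
after_elem = data[3, 2]

arr[:] = -1.0                           # in-place assignment to the whole slice
arr = 0                                 # rebinds the name only; data is untouched

print(shares, after_elem)
print(data)
```

Note the difference between arr[:] = value, which writes into the underlying data, and arr = value, which merely makes the name arr refer to something else.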
    

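    To get an independent copy of an array or slice instead of a view, copy it explicitly with copy(). Plain assignment (arr = data) only creates a second name for the same array, whereas np.copy(data) allocates a new one: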

    In [116]: data
    Out[116]:
    array([[-1.68867271, -0.89369286, -0.0288363 ,  0.73855122],
           [-0.13084603,  0.43972144,  0.73542583,  1.99925332],
           [ 0.04291022, -0.91963212,  3.09214837, -0.6070068 ],
           [-0.01416294, -1.46576298,  0.        ,  0.84758994]])
    
    In [117]: arr = data
    
    In [118]: arr
    Out[118]:
    array([[-1.68867271, -0.89369286, -0.0288363 ,  0.73855122],
           [-0.13084603,  0.43972144,  0.73542583,  1.99925332],
           [ 0.04291022, -0.91963212,  3.09214837, -0.6070068 ],
           [-0.01416294, -1.46576298,  0.        ,  0.84758994]])
    
    In [119]: arr = np.copy(data)
    
    In [120]: arr
    Out[120]:
    array([[-1.68867271, -0.89369286, -0.0288363 ,  0.73855122],
           [-0.13084603,  0.43972144,  0.73542583,  1.99925332],
           [ 0.04291022, -0.91963212,  3.09214837, -0.6070068 ],
           [-0.01416294, -1.46576298,  0.        ,  0.84758994]])
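
The session above copies the whole array; the sketch below applies the same idea to a slice and contrasts an alias, a view, and a copy. The values come from np.arange purely for predictability, and the variable names are illustrative:

```python
import numpy as np

data = np.arange(16).reshape(4, 4)

alias = data                       # same object, just another name
view = data[2:, 1:]                # shares memory with data
independent = data[2:, 1:].copy()  # separate memory

independent[0, 0] = -99            # does not touch data
unchanged = data[2, 1]             # still 9 at this point

view[0, 0] = 100                   # writes through to data[2, 1]

print(alias is data, unchanged, data[2, 1])
```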

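    Boolean Indexing

    Suppose each string in names corresponds to one row of the data array. Note that the boolean array must have the same length as the axis it indexes.

    Values are selected with a boolean index as follows: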

    In [140]: names = np.array(['aaa','bbb','ccc','ddd','eee','fff'])

    In [141]: data = np.random.randn(6,4)

    In [142]: names
    Out[142]:
    array(['aaa', 'bbb', 'ccc', 'ddd', 'eee', 'fff'],
           dtype='<U3')

    In [143]: data
    Out[143]:
    array([[ 0.49394026, -0.65887621, -0.26946242,  0.22042355],
           [-1.11606179, -1.94945158, -0.4866134 ,  0.67712409],
           [-2.33792045,  0.01639887, -0.46020647,  0.84180777],
           [-1.99622938,  1.937877  , -0.17134376,  0.56915872],
           [ 1.50980905,  0.07244016, -0.95650922,  1.23508517],
           [ 0.74706519, -0.03149619, -0.38235363,  0.69786257]])

    In [144]: names == 'aaa'
    Out[144]: array([ True, False, False, False, False, False], dtype=bool)

    In [145]: data[names=='aaa']
    Out[145]: array([[ 0.49394026, -0.65887621, -0.26946242,  0.22042355]])

    In [146]: names =='ccc'
    Out[146]: array([False, False,  True, False, False, False], dtype=bool)

    In [147]: data[names=='ccc']
    Out[147]: array([[-2.33792045,  0.01639887, -0.46020647,  0.84180777]])
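
To make the length requirement concrete, here is a small sketch with made-up labels and np.arange data. It assumes a reasonably recent NumPy, where a mask of the wrong length raises IndexError:

```python
import numpy as np

names = np.array(["aaa", "bbb", "ccc", "aaa"])   # hypothetical labels, one per row
data = np.arange(12).reshape(4, 3)

mask = names == "aaa"      # length 4, matches axis 0 of data
rows = data[mask]          # picks rows 0 and 3

try:
    data[mask[:3]]         # length 3 against an axis of length 4
    mismatch_raises = False
except IndexError:
    mismatch_raises = True

print(rows)
print(mismatch_raises)
```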

    A boolean array can also be combined with an integer index or a slice on the other axis:

    In [148]: data[names=='aaa',2]
    Out[148]: array([-0.26946242])
    
    In [149]: data[names=='aaa',2:]
    Out[149]: array([[-0.26946242,  0.22042355]])
    
    In [150]: data[names=='aaa',1:]
    Out[150]: array([[-0.65887621, -0.26946242,  0.22042355]])

    Inverse Selection

    In [155]: names !='aaa'
    Out[155]: array([False,  True,  True,  True,  True,  True], dtype=bool)
    
    In [156]: data[names!='aaa']
    Out[156]:
    array([[-1.11606179, -1.94945158, -0.4866134 ,  0.67712409],
           [-2.33792045,  0.01639887, -0.46020647,  0.84180777],
           [-1.99622938,  1.937877  , -0.17134376,  0.56915872],
           [ 1.50980905,  0.07244016, -0.95650922,  1.23508517],
           [ 0.74706519, -0.03149619, -0.38235363,  0.69786257]])
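
Besides !=, an existing boolean mask can be inverted with the ~ operator, which is handy when the condition is already stored in a variable. A short sketch with illustrative data:

```python
import numpy as np

names = np.array(["aaa", "bbb", "ccc"])   # hypothetical labels
data = np.arange(6).reshape(3, 2)

cond = names == "aaa"
by_ne = data[names != "aaa"]   # condition written with !=
by_not = data[~cond]           # same selection by inverting the mask

print(np.array_equal(by_ne, by_not))   # True
```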

    Combined Conditions

    In [171]: mask = (names == 'aaa')|(names == 'ccc')
    
    In [172]: mask
    Out[172]: array([ True, False,  True, False, False, False], dtype=bool)
    
    In [173]: data[mask]
    Out[173]:
    array([[ 0.49394026, -0.65887621, -0.26946242,  0.22042355],
           [-2.33792045,  0.01639887, -0.46020647,  0.84180777]])
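
Conditions on arrays are combined with the element-wise operators | and &, each condition wrapped in parentheses. The Python keywords and / or do not work here because they ask for a single truth value. The sketch below, on made-up data, shows both points and also uses np.isin as one possible alternative for membership tests:

```python
import numpy as np

names = np.array(["aaa", "bbb", "ccc", "ddd"])   # hypothetical labels
data = np.arange(8).reshape(4, 2)

either = (names == "aaa") | (names == "ccc")
neither = ~either
both_conditions = (names != "aaa") & (names != "ccc")   # equivalent to neither

try:
    (names == "aaa") or (names == "ccc")   # keyword `or` needs a single bool
    or_works = True
except ValueError:
    or_works = False

via_isin = np.isin(names, ["aaa", "ccc"])  # same mask as `either`

print(data[either])
print(or_works)
```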

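    Fancy Indexing

    Fancy indexing simply means indexing with a list or array of integers. Unlike slicing, fancy indexing copies the selected data into a new array.

    A list of integers

    Create a two-dimensional array arr and pass in [3,1]. The result contains the rows arr[3,:] and arr[1,:], in that order.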

    In [203]: arr = np.array(([1,2,3,4],[2,3,4,5],[3,4,5,6],[7,8,9,10]))
    
    In [204]: arr
    Out[204]:
    array([[ 1,  2,  3,  4],
           [ 2,  3,  4,  5],
           [ 3,  4,  5,  6],
           [ 7,  8,  9, 10]])
    
    In [205]: arr[[3,1]]
    Out[205]:
    array([[ 7,  8,  9, 10],
           [ 2,  3,  4,  5]])
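
The copy-versus-view difference can be verified directly. The sketch below reuses the same arr values as the session and compares writing into a fancy-indexed result with writing into a slice; negative indices, which count from the end, are shown as well:

```python
import numpy as np

arr = np.array([[1, 2, 3, 4],
                [2, 3, 4, 5],
                [3, 4, 5, 6],
                [7, 8, 9, 10]])

neg = arr[[-1, -3]]        # negative indices: rows 3 and 1

picked = arr[[3, 1]]       # fancy indexing: a copy
picked[0, 0] = -1          # arr[3, 0] is not affected

sliced = arr[1:3]          # slicing: a view
sliced[0, 0] = -2          # arr[1, 0] is changed

print(arr[3, 0], arr[1, 0])                 # 7 -2
print(np.shares_memory(picked, arr))        # False
```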

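    Passing multiple integer arrays

    Passing one integer array per axis returns a one-dimensional array: element k of the result is taken at the position given by the k-th entries of the index arrays.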
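
A small sketch of this pairing behaviour, on an np.arange array chosen so that each element equals 4*row + column (the array and index values are assumptions for illustration). It also shows np.ix_, one way to get the rectangular block of selected rows and columns instead of paired elements:

```python
import numpy as np

arr = np.arange(32).reshape(8, 4)   # arr[r, c] == 4 * r + c

rows = [1, 5, 7, 2]
cols = [0, 3, 1, 2]

# Pairs up the indices: elements (1, 0), (5, 3), (7, 1), (2, 2)
paired = arr[rows, cols]

# Every selected row crossed with every selected column: a 4x4 block
block = arr[np.ix_(rows, cols)]

print(paired)        # [ 4 23 29 10]
print(block.shape)   # (4, 4)
```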

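    Array Transposition and Axis Swapping

    Transposing an array A swaps its rows and columns: the element at position (i, j) of the result is the element at position (j, i) of A.

    Method 1: T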

    In [227]: arr = np.random.randn(10)
    
    In [228]: arr
    Out[228]:
    array([-1.42853867,  1.54300781, -0.74079757, -1.20272388, -1.00416459,
           -0.59571731,  1.16744662,  0.05739806,  1.01660691, -0.84625494])
    
    In [229]: arr.T
    Out[229]:
    array([-1.42853867,  1.54300781, -0.74079757, -1.20272388, -1.00416459,
           -0.59571731,  1.16744662,  0.05739806,  1.01660691, -0.84625494])
    
    In [230]: arr = np.random.randn(3,5)
    
    In [231]: arr
    Out[231]:
    array([[ 1.36114118,  0.48455027,  0.64847485,  0.01691785, -0.03622465],
           [-2.31302164,  1.14992892, -1.47836923,  1.08003907, -1.33663009],
           [-0.38005499,  1.3517217 ,  2.52024026, -0.3576492 ,  0.46016645]])
    
    In [232]: arr.T
    Out[232]:
    array([[ 1.36114118, -2.31302164, -0.38005499],
           [ 0.48455027,  1.14992892,  1.3517217 ],
           [ 0.64847485, -1.47836923,  2.52024026],
           [ 0.01691785,  1.08003907, -0.3576492 ],
           [-0.03622465, -1.33663009,  0.46016645]])
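
Two details of the session above are worth checking explicitly: .T leaves a one-dimensional array unchanged, and for a two-dimensional array it returns a view rather than a copy. A sketch with deterministic np.arange data (the names are illustrative):

```python
import numpy as np

vec = np.arange(5)
vec_t_shape = vec.T.shape          # still (5,): nothing to swap in 1-D
column = vec.reshape(-1, 1)        # an explicit (5, 1) column, if one is needed

mat = np.arange(15).reshape(3, 5)
t = mat.T                          # shape (5, 3)
t[0, 1] = -1                       # writes through to mat[1, 0]

print(vec_t_shape, column.shape, t.shape)
print(mat[1, 0])                   # -1
```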

    Method 2: transpose

    A three-dimensional array arr made up of four 3×4 arrays:

    In [275]: arr = np.arange(48).reshape(4,3,4)

    In [276]: arr
    Out[276]:
    array([[[ 0,  1,  2,  3],
            [ 4,  5,  6,  7],
            [ 8,  9, 10, 11]],

           [[12, 13, 14, 15],
            [16, 17, 18, 19],
            [20, 21, 22, 23]],

           [[24, 25, 26, 27],
            [28, 29, 30, 31],
            [32, 33, 34, 35]],

           [[36, 37, 38, 39],
            [40, 41, 42, 43],
            [44, 45, 46, 47]]])

         
     

    The arguments to transpose are indices into the array's shape tuple, that is, axis numbers. They give the new order of the axes.

    In [278]: arr.shape
    Out[278]: (4, 3, 4)

    The axis numbers of arr are 0, 1, and 2.

    Below, the axes are reordered as 2, 0, 1, so the result has shape (4, 4, 3):

    In [277]: arr.transpose(2,0,1)
    Out[277]:
    array([[[ 0,  4,  8],
            [12, 16, 20],
            [24, 28, 32],
            [36, 40, 44]],
    
           [[ 1,  5,  9],
            [13, 17, 21],
            [25, 29, 33],
            [37, 41, 45]],
    
           [[ 2,  6, 10],
            [14, 18, 22],
            [26, 30, 34],
            [38, 42, 46]],
    
           [[ 3,  7, 11],
            [15, 19, 23],
            [27, 31, 35],
            [39, 43, 47]]])

    Passing the axis numbers in their original order, 0, 1, 2, leaves arr exactly as it was:

    In [279]: arr.transpose(0,1,2)
    Out[279]:
    array([[[ 0,  1,  2,  3],
            [ 4,  5,  6,  7],
            [ 8,  9, 10, 11]],
    
           [[12, 13, 14, 15],
            [16, 17, 18, 19],
            [20, 21, 22, 23]],
    
           [[24, 25, 26, 27],
            [28, 29, 30, 31],
            [32, 33, 34, 35]],
    
           [[36, 37, 38, 39],
            [40, 41, 42, 43],
            [44, 45, 46, 47]]])
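
To undo a transpose applied to an already-transposed array, the inverse permutation is needed rather than (0, 1, 2). The sketch below derives it with np.argsort, which is one common way to invert a permutation (the variable names are illustrative):

```python
import numpy as np

arr = np.arange(48).reshape(4, 3, 4)

order = (2, 0, 1)
moved = arr.transpose(order)
# New shape is (arr.shape[2], arr.shape[0], arr.shape[1]),
# and moved[i, j, k] == arr[j, k, i].

# argsort of a permutation gives its inverse: here (1, 2, 0)
inverse = tuple(int(i) for i in np.argsort(order))
restored = moved.transpose(inverse)

print(moved.shape)                      # (4, 4, 3)
print(inverse)                          # (1, 2, 0)
print(np.array_equal(restored, arr))    # True
```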

    Method 3: swapaxes

    swapaxes returns a view of the source array.

    Whereas transpose takes the complete tuple of axis numbers, swapaxes takes just the pair of axis numbers to exchange.

    In [283]: arr.swapaxes(2,1)
    Out[283]:
    array([[[ 0,  4,  8],
            [ 1,  5,  9],
            [ 2,  6, 10],
            [ 3,  7, 11]],
    
           [[12, 16, 20],
            [13, 17, 21],
            [14, 18, 22],
            [15, 19, 23]],
    
           [[24, 28, 32],
            [25, 29, 33],
            [26, 30, 34],
            [27, 31, 35]],
    
           [[36, 40, 44],
            [37, 41, 45],
            [38, 42, 46],
            [39, 43, 47]]])
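
As a quick check of both statements, the sketch below compares swapaxes(2, 1) with the equivalent full transpose and confirms that the result shares memory with the source array:

```python
import numpy as np

arr = np.arange(48).reshape(4, 3, 4)

swapped = arr.swapaxes(2, 1)        # shape (4, 4, 3)
same_as_transpose = np.array_equal(swapped, arr.transpose(0, 2, 1))
shares = np.shares_memory(swapped, arr)

swapped[0, 3, 2] = -1               # writes through to arr[0, 2, 3]

print(swapped.shape, same_as_transpose, shares)
print(arr[0, 2, 3])                 # -1
```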
  • Original source: https://www.cnblogs.com/zhouwp/p/8425164.html