  • pd.Series() explained (the clearest explanation)

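    1. Introduction to Series

    pandas has two main data structures: 1. Series; 2. DataFrame

    A Series is a one-dimensional labeled array built on NumPy's ndarray. By default pandas uses 0 to n-1 as the index of a Series, but you can also specify the index yourself (think of the index as the keys of a dict).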
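
    To make the default and custom index concrete, here is a minimal sketch. The variable names and values are illustrative and not from the original post. It also shows one way the dict analogy is only approximate: index labels may repeat.

```python
import pandas as pd

# Without an explicit index, pandas labels the values 0..n-1.
default_s = pd.Series([10, 20, 30])
print(list(default_s.index))   # [0, 1, 2]

# With an explicit index, values are looked up by label, much like dict keys.
labeled_s = pd.Series([10, 20, 30], index=["x", "y", "z"])
print(labeled_s["y"])          # 20

# Unlike dict keys, index labels are allowed to repeat;
# looking up a repeated label returns every matching row.
dup_s = pd.Series([1, 2], index=["k", "k"])
print(len(dup_s["k"]))         # 2
```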

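    2. Creating a Series

    1. pd.Series([list], index=[list])

    The first argument is a list. index is optional: if omitted, the index defaults to 0, 1, ..., n-1; if given, its length must equal the number of values.

    import pandas as pd
    
    s = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'f', 'e'])
    print(s)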
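
    The length rule above can be checked directly. The following sketch (the `mismatch_raised` flag is a helper introduced only for this illustration) passes four labels for five values and catches the resulting error.

```python
import pandas as pd

values = [1, 2, 3, 4, 5]

try:
    # Four labels for five values: pandas rejects the mismatch.
    pd.Series(values, index=["a", "b", "c", "f"])
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("ValueError:", err)
```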
    
    2. pd.Series({dict})

    Takes a dict as the argument; the keys become the index and the values become the data.

    import pandas as pd
    
    s = pd.Series({'a': 1, 'b': 2, 'c': 3, 'f': 4, 'e': 5})
    print(s)
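
    A dict can also be combined with an explicit index. In that case the index acts as a selector rather than having to match in length. This is a small sketch with made-up data, not an example from the original post.

```python
import pandas as pd

data = {"a": 1, "b": 2, "c": 3}

# Passing index together with a dict selects and orders by those labels;
# a label missing from the dict ("z") gets NaN, which forces a float dtype,
# and a key that is not requested ("b") is dropped.
s = pd.Series(data, index=["c", "a", "z"])
print(s)
```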
    

    3. Selecting values from a Series

    s[index] or s[[list of indexes]]

    Selection works like array indexing; to pick several non-contiguous values, pass a list.

    import pandas as pd
    import numpy as np
    
    v = np.random.random_sample(50)
    s = pd.Series(v)
    s1 = s[[3, 13, 23, 33]]
    s2 = s[3:13]
    s3 = s[43]
    print("s1", s1)
    print("s2", s2)
    print("s3", s3)
    
    s1 3     0.064095
    13    0.354023
    23    0.225739
    33    0.959288
    dtype: float64
    
    s2 3     0.064095
    4     0.405651
    5     0.024181
    6     0.367606
    7     0.844005
    8     0.405313
    9     0.102824
    10    0.806400
    11    0.950502
    12    0.735310
    dtype: float64
    
    s3 0.42803253918
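
    In the example above, labels and positions happen to coincide because the index is the default 0 to n-1. When they differ, the explicit accessors `.loc` (by label) and `.iloc` (by position) remove the ambiguity. The sketch below uses hypothetical data whose labels start at 100.

```python
import numpy as np
import pandas as pd

# Labels deliberately differ from positions (hypothetical data).
s = pd.Series(np.arange(5) * 10, index=[100, 101, 102, 103, 104])

by_label = s.loc[102]          # label 102 -> 20
by_position = s.iloc[2]        # third element -> 20
label_slice = s.loc[101:103]   # label slices include the end label
pos_slice = s.iloc[1:3]        # position slices exclude the end position

print(by_label, by_position)             # 20 20
print(len(label_slice), len(pos_slice))  # 3 2
```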
    

    4. Getting the first and last values of a Series

    .head(n) and .tail(n)

    Return the first n or the last n rows; n is optional and defaults to 5.

    import pandas as pd
    import numpy as np
    
    v = np.random.random_sample(50)
    s = pd.Series(v)
    print("s.head()", s.head())
    print("s.head(3)", s.head(3))
    print("s.tail()", s.tail())
    print("s.tail(3)", s.tail(3))
    
    s.head() 0    0.714136
    1    0.333600
    2    0.683784
    3    0.044002
    4    0.147745
    dtype: float64
    s.head(3) 0    0.714136
    1    0.333600
    2    0.683784
    dtype: float64
    s.tail() 45    0.779509
    46    0.778341
    47    0.331999
    48    0.444811
    49    0.028520
    dtype: float64
    s.tail(3) 47    0.331999
    48    0.444811
    49    0.028520
    dtype: float64
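
    As a quick check of what head and tail return, the sketch below compares them with positional slicing and also shows negative n, which pandas documents as "all rows except". The data is a simple 0 to 49 range chosen for this illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(50))

# head(n) / tail(n) are positional: first n and last n rows.
first_three_match = s.head(3).equals(s.iloc[:3])
last_three_match = s.tail(3).equals(s.iloc[-3:])
print(first_three_match, last_three_match)  # True True

# A negative n means "all but": head(-3) drops the last 3 rows,
# tail(-3) drops the first 3 rows.
print(len(s.head(-3)), len(s.tail(-3)))  # 47 47
print(s.tail(-3).index[0])               # 3
```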
    

    5. Common Series operations

    import pandas as pd
    import numpy as np
    
    v = [10, 3, 2, 2, np.nan]
    v = pd.Series(v)
    print("len():", len(v))  # length of the Series, NaN included
    print("shape():", np.shape(v))  # shape tuple, here (5,)
    print("count():", v.count())  # number of values, NaN excluded
    print("unique():", v.unique())  # distinct values, NaN included
    print("value_counts():\n", v.value_counts())  # occurrences of each value, NaN excluded
    
    len(): 5
    shape(): (5,)
    count(): 4
    unique(): [ 10.   3.   2.  nan]
    value_counts():
    2.0     2
    3.0     1
    10.0    1
    dtype: int64
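
    The difference between len() and count() above comes entirely from the NaN entry. The following sketch, which reuses the same values, shows a few related calls that make the NaN handling explicit. It is an illustration of the `dropna` parameter, not part of the original post.

```python
import numpy as np
import pandas as pd

v = pd.Series([10, 3, 2, 2, np.nan])

print(v.isna().sum())                  # number of NaN entries: 1
print(v.nunique())                     # distinct non-NaN values: 3
print(v.nunique(dropna=False))         # counting NaN as a value: 4
counts = v.value_counts(dropna=False)  # keeps a row for NaN
print(counts)
```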
    

    6. Adding Series

    import pandas as pd
    import numpy as np
    
    v = [10, 3, 2, 2, np.nan]
    v = pd.Series(v)
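    sum0 = v[1:3] + v[1:3]  # same labels (1, 2) on both sides; named sum0 so the built-in sum() is not shadowed
    sum1 = v[1:4] + v[1:4]  # same labels (1, 2, 3) on both sides
    sum2 = v[1:3] + v[1:4]  # label 3 exists only on the right, so the result there is NaN
    sum3 = v[:3] + v[1:]  # labels 0, 3 and 4 exist on one side only, so they become NaN
    print("sum", sum0)
    print("sum1", sum1)
    print("sum2", sum2)
    print("sum3", sum3)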
    
    sum 1    6.0
    2    4.0
    dtype: float64
    
    sum1 1    6.0
    2    4.0
    3    4.0
    dtype: float64
    
    sum2 1    6.0
    2    4.0
    3    NaN
    dtype: float64
    
    sum3 0    NaN
    1    6.0
    2    4.0
    3    NaN
    4    NaN
    dtype: float64
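
    The NaN rows in sum2 and sum3 appear because addition aligns the two Series by index label, and a label present on only one side has nothing to add to. If a missing side should count as 0 instead, `Series.add` with `fill_value` is one option. The sketch below reuses the values from above; `left`, `right`, `plain` and `filled` are names chosen for this illustration.

```python
import numpy as np
import pandas as pd

v = pd.Series([10, 3, 2, 2, np.nan])
left, right = v.iloc[:3], v.iloc[1:]     # labels 0..2 and 1..4

plain = left + right                     # unmatched labels (0, 3, 4) become NaN
filled = left.add(right, fill_value=0)   # a label missing on one side counts as 0

print(plain.tolist())
print(filled.tolist())
```

    Label 4 stays NaN even with fill_value, because it is absent on the left and NaN on the right, so both sides are missing.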
    

    7. Searching a Series

    1. Filtering by range
    import pandas as pd
    import numpy as np
     
    s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}
    sa = pd.Series(s, name="age")
    print(sa[sa>19])
    
    jim    22.0
    lj     24.0
    ton    20.0
    Name: age, dtype: float64
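
    The filter above uses a single condition. For a bounded range, conditions can be combined with `&`, or expressed with `Series.between`. This sketch reuses the same ages; the bounds 19 and 22 are arbitrary values picked for the illustration.

```python
import pandas as pd

ages = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}
sa = pd.Series(ages, name="age")

# Combine conditions with & (and) or | (or); parenthesize each comparison.
in_range = sa[(sa >= 19) & (sa <= 22)]

# between() expresses the same closed interval [19, 22].
same = sa[sa.between(19, 22)]

print(sorted(in_range.index))  # ['jack', 'jim', 'ton']
```

    The missing value for "car" never passes a comparison, so it is excluded from both results.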
    
    2. Median
    import pandas as pd
    import numpy as np
    
    s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}
    sa = pd.Series(s, name="age")
    print("sa.median()", sa.median())
    
    sa.median() 20.0
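
    The result is 20.0 because median() skips the missing value for "car" by default. A short sketch of the `skipna` parameter on the same data, with the mean added for comparison:

```python
import pandas as pd

ages = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}
sa = pd.Series(ages, name="age")

print(sa.median())               # 20.0, the NaN is ignored
print(sa.median(skipna=False))   # nan, because one value is missing
print(sa.mean())                 # 20.6, also ignores the NaN
```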
    

    8. Assigning values in a Series

    import pandas as pd
    import numpy as np
     
    s = {"ton": 20, "mary": 18, "jack": 19, "jim": 22, "lj": 24, "car": None}
    sa = pd.Series(s, name="age")
    print(s)
    print('----------------')
    sa['ton'] = 99
    print(sa)
    
    {'ton': 20, 'mary': 18, 'jack': 19, 'jim': 22, 'lj': 24, 'car': None}
    ----------------
    car      NaN
    jack    19.0
    jim     22.0
    lj      24.0
    mary    18.0
    ton     99.0
    Name: age, dtype: float64
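
    Beyond overwriting one existing label, assignment through `.loc` can also add a new label or update every row that matches a condition. The sketch below uses a shortened version of the data; the label "new" and the threshold 20 are made up for the illustration.

```python
import pandas as pd

sa = pd.Series({"ton": 20, "mary": 18, "jack": 19}, name="age")

sa.loc["ton"] = 99          # overwrite an existing label
sa.loc["new"] = 30          # assigning to a new label appends a row
sa.loc[sa < 20] = 0         # boolean mask: update every matching row

print(sa)
```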
    

  • Original article: https://www.cnblogs.com/hzcya1995/p/13302751.html
Copyright © 2011-2022 走看看