  • pandas: simple descriptive statistics

    import pandas as pd
    import numpy as np
    
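    # 10x5 frame holding the integers 0..49, with columns a..e
    df = pd.DataFrame(np.arange(50).reshape(10, 5), columns=list('abcde'))
    print(df)
    
    # Per-column summary: count, mean, std, min, quartiles, max
    print(df.describe())
    # Per-column standard error of the mean
    print(df.sem())
    
    # A Series of strings: describe() reports count/unique/top/freq instead
    s = pd.Series(['a', 'b', 'c', 'd', 'a', 'a', 'f', 'd'])
    # Number of occurrences of each distinct value, most frequent first
    print(s.value_counts())
    print(s.describe())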
    
    
    
    Output:
        a   b   c   d   e
    0   0   1   2   3   4
    1   5   6   7   8   9
    2  10  11  12  13  14
    3  15  16  17  18  19
    4  20  21  22  23  24
    5  25  26  27  28  29
    6  30  31  32  33  34
    7  35  36  37  38  39
    8  40  41  42  43  44
    9  45  46  47  48  49
                   a          b          c          d          e
    count  10.000000  10.000000  10.000000  10.000000  10.000000
    mean   22.500000  23.500000  24.500000  25.500000  26.500000
    std    15.138252  15.138252  15.138252  15.138252  15.138252
    min     0.000000   1.000000   2.000000   3.000000   4.000000
    25%    11.250000  12.250000  13.250000  14.250000  15.250000
    50%    22.500000  23.500000  24.500000  25.500000  26.500000
    75%    33.750000  34.750000  35.750000  36.750000  37.750000
    max    45.000000  46.000000  47.000000  48.000000  49.000000
    a    4.787136
    b    4.787136
    c    4.787136
    d    4.787136
    e    4.787136
    dtype: float64
    a    3
    d    2
    b    1
    c    1
    f    1
    dtype: int64
    count     8
    unique    5
    top       a
    freq      3
    dtype: object
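The `sem()` values above can be cross-checked by hand. The following is a minimal sketch, assuming the same `df` as in the listing; the name `manual_sem` is introduced here purely for illustration. It computes the standard error of the mean as the sample standard deviation (pandas defaults to `ddof=1`) divided by the square root of the number of non-null rows, then compares that with what `df.sem()` returns.

```python
import numpy as np
import pandas as pd

# Same frame as in the listing above
df = pd.DataFrame(np.arange(50).reshape(10, 5), columns=list('abcde'))

# Standard error of the mean: sample std (ddof=1) / sqrt(non-null count).
# manual_sem is a hypothetical helper name, not part of pandas.
manual_sem = df.std() / np.sqrt(df.count())

print(manual_sem)
# Compare the hand-computed values with the built-in method
print(np.allclose(manual_sem, df.sem()))
```

Every column has the same spread (each is an arithmetic sequence with step 5), which is why `std` and `sem` are identical across columns in the output.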
  • Original article: https://www.cnblogs.com/yuxiangyang/p/11265797.html