  • pandas-08: What pd.cut() does and how to use it


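    pd.cut() works much like assigning grade bands to exam scores: for example, 0-59 is poor, 60-70 is fair, 71-80 is good, and so on. pandas provides this method to sort values into such intervals. Here is the code: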

    import numpy as np
    import pandas as pd
    from pandas import Series, DataFrame
    
    np.random.seed(666)
    
    score_list = np.random.randint(25, 100, size=20)
    print(score_list)
    # [27 70 55 87 95 98 55 61 86 76 85 53 39 88 41 71 64 94 38 94]
    
    # Define the bin edges for the score intervals
    bins = [0, 59, 70, 80, 100]
    
    score_cut = pd.cut(score_list, bins)
    print(type(score_cut)) # <class 'pandas.core.arrays.categorical.Categorical'>
    print(score_cut)
    '''
    [(0, 59], (59, 70], (0, 59], (80, 100], (80, 100], ..., (70, 80], (59, 70], (80, 100], (0, 59], (80, 100]]
    Length: 20
    Categories (4, interval[int64]): [(0, 59] < (59, 70] < (70, 80] < (80, 100]]
    '''
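    # Count how many scores fall in each interval
    # (the top-level pd.value_counts is deprecated in recent pandas, so wrap the result in a Series)
    print(Series(score_cut).value_counts())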
    '''
    (80, 100]    8
    (0, 59]      7
    (59, 70]     3
    (70, 80]     2
    dtype: int64
    '''
    
    df = DataFrame()
    df['score'] = score_list
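    # pd.util.testing.rands is no longer available in recent pandas releases,
    # so build random 3-character names with NumPy (names may differ from the sample output below)
    chars = list('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789')
    df['student'] = [''.join(np.random.choice(chars, 3)) for _ in range(len(score_list))]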
    print(df)
    '''
        score student
    0      27     1ul
    1      70     yuK
    2      55     WWK
    3      87     EU6
    4      95     Vqn
    5      98     KAf
    6      55     QNT
    7      61     HaE
    8      86     aBo
    9      76     MMa
    10     85     Ctc
    11     53     5BI
    12     39     wBp
    13     88     WMB
    14     41     q5t
    15     71     MjZ
    16     64     nTc
    17     94     Kyx
    18     38     Rlh
    19     94     2uV
    '''
    
    # Bin the score column with cut
    print(pd.cut(df['score'], bins))
    '''
    0       (0, 59]
    1      (59, 70]
    2       (0, 59]
    3     (80, 100]
    4     (80, 100]
    5     (80, 100]
    6       (0, 59]
    7      (59, 70]
    8     (80, 100]
    9      (70, 80]
    10    (80, 100]
    11      (0, 59]
    12      (0, 59]
    13    (80, 100]
    14      (0, 59]
    15     (70, 80]
    16     (59, 70]
    17    (80, 100]
    18      (0, 59]
    19    (80, 100]
    Name: score, dtype: category
    Categories (4, interval[int64]): [(0, 59] < (59, 70] < (70, 80] < (80, 100]]
    '''
    
    df['Categories'] = pd.cut(df['score'], bins)
    print(df)
    '''
        score student Categories
    0      27     1ul    (0, 59]
    1      70     yuK   (59, 70]
    2      55     WWK    (0, 59]
    3      87     EU6  (80, 100]
    4      95     Vqn  (80, 100]
    5      98     KAf  (80, 100]
    6      55     QNT    (0, 59]
    7      61     HaE   (59, 70]
    8      86     aBo  (80, 100]
    9      76     MMa   (70, 80]
    10     85     Ctc  (80, 100]
    11     53     5BI    (0, 59]
    12     39     wBp    (0, 59]
    13     88     WMB  (80, 100]
    14     41     q5t    (0, 59]
    15     71     MjZ   (70, 80]
    16     64     nTc   (59, 70]
    17     94     Kyx  (80, 100]
    18     38     Rlh    (0, 59]
    19     94     2uV  (80, 100]
    '''
    
    # Raw intervals are not very readable, so use the labels parameter of cut
    # to give each interval a name
    df['Categories'] = pd.cut(df['score'], bins, labels=['low', 'middle', 'good', 'perfect'])
    print(df)
    '''
        score student Categories
    0      27     1ul        low
    1      70     yuK     middle
    2      55     WWK        low
    3      87     EU6    perfect
    4      95     Vqn    perfect
    5      98     KAf    perfect
    6      55     QNT        low
    7      61     HaE     middle
    8      86     aBo    perfect
    9      76     MMa       good
    10     85     Ctc    perfect
    11     53     5BI        low
    12     39     wBp        low
    13     88     WMB    perfect
    14     41     q5t        low
    15     71     MjZ       good
    16     64     nTc     middle
    17     94     Kyx    perfect
    18     38     Rlh        low
    19     94     2uV    perfect
    '''
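
    The intervals printed above are right-closed, so a score of exactly 0 would fall outside (0, 59] and come back as NaN. Below is a minimal sketch of two ways to handle the edges, using the include_lowest and right parameters of pd.cut. The boundary values and the edge_scores name are made up for illustration and are not part of the original example.

```python
import pandas as pd

bins = [0, 59, 70, 80, 100]
edge_scores = pd.Series([0, 59, 60, 100])  # hypothetical boundary values

# Default: right-closed intervals, so 0 is not in (0, 59] and becomes NaN
default_cut = pd.cut(edge_scores, bins)
print(default_cut.isna().tolist())

# include_lowest=True makes the first interval include its left edge
lowest_cut = pd.cut(edge_scores, bins, include_lowest=True)
print(lowest_cut.isna().tolist())

# right=False switches to left-closed intervals such as [0, 59);
# now 100 falls outside [80, 100) and becomes NaN
left_cut = pd.cut(edge_scores, bins, right=False)
print(left_cut.isna().tolist())
```

    Which option fits depends on the data: include_lowest=True keeps the right-closed intervals and only widens the first one, while right=False moves every boundary value into the next interval up.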
    
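    The Categories line in the output shows that the bins are ordered (low < middle < good < perfect when labels are supplied). As a rough sketch, assuming a handful of the scores from the example, that ordering can be used to filter rows by grade. The grades, at_least_good and counts names are illustrative only.

```python
import pandas as pd

scores = pd.Series([27, 70, 55, 87, 76])  # a few scores taken from the example
bins = [0, 59, 70, 80, 100]
labels = ['low', 'middle', 'good', 'perfect']

grades = pd.cut(scores, bins, labels=labels)

# pd.cut returns an ordered categorical by default
print(grades.cat.ordered)

# Because the labels are ordered, comparisons follow the bin order
at_least_good = scores[grades >= 'good']
print(at_least_good.tolist())

# Count the scores that received each label
counts = grades.value_counts(sort=False)
print(counts)
```

    Comparing against a label only works for labels that are actual categories of the column, so a misspelled label raises an error instead of silently matching nothing.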
  • Original article: https://www.cnblogs.com/wenqiangit/p/11252758.html