  • Pandas Notes: Data Discretization (one-hot)

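    import pandas as pd

    # Heights in cm; each index label repeats the value for easy reading
    data = pd.Series([176, 174, 160, 180, 159, 163, 192, 184],
                     index=["No1:176", "No2:174", "No3:160", "No4:180", "No5:159", "No6:163", "No7:192", "No8:184"])
    print(data)

    # Automatic binning: qcut splits the data into 3 equal-frequency (quantile) bins
    quantile_bins = pd.qcut(data, 3)
    print()  # blank line before the next table
    print(pd.get_dummies(quantile_bins, prefix="height"))  # one-hot encode the bins

    # Custom binning: cut uses explicit edges, giving (150, 165], (165, 180], (180, 195]
    bins = [150, 165, 180, 195]
    custom_bins = pd.cut(data, bins)
    print(custom_bins)
    print(custom_bins.value_counts())
    print(pd.get_dummies(custom_bins, prefix="身高"))  # the prefix "身高" means "height"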
     
    No1:176    176
    No2:174    174
    No3:160    160
    No4:180    180
    No5:159    159
    No6:163    163
    No7:192    192
    No8:184    184
    dtype: int64
     
             height_(158.999, 166.667]  ...  height_(178.667, 192.0]
    No1:176                          0  ...                        0
    No2:174                          0  ...                        0
    No3:160                          1  ...                        0
    No4:180                          0  ...                        1
    No5:159                          1  ...                        0
    No6:163                          1  ...                        0
    No7:192                          0  ...                        1
    No8:184                          0  ...                        1
     
    [8 rows x 3 columns]
    No1:176    (165, 180]
    No2:174    (165, 180]
    No3:160    (150, 165]
    No4:180    (165, 180]
    No5:159    (150, 165]
    No6:163    (150, 165]
    No7:192    (180, 195]
    No8:184    (180, 195]
    dtype: category
    Categories (3, interval[int64]): [(150, 165] < (165, 180] < (180, 195]]
    (165, 180]    3
    (150, 165]    3
    (180, 195]    2
    dtype: int64
             身高_(150, 165]  身高_(165, 180]  身高_(180, 195]
    No1:176              0              1              0
    No2:174              0              1              0
    No3:160              1              0              0
    No4:180              0              1              0
    No5:159              1              0              0
    No6:163              1              0              0
    No7:192              0              0              1
    No8:184              0              0              1
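
The two binning calls above differ in how the edges are chosen: `pd.qcut` derives them from quantiles, while `pd.cut` takes them as given. The sketch below is a minimal illustration of that difference using the same height values. The label names `short`, `medium`, and `tall` are made up for this example and are not part of the original post. It also passes `dtype=int` to `get_dummies`, because pandas 2.0 and later return boolean columns by default, so this is one way to get 0/1 tables like the ones shown above.

```python
import pandas as pd

heights = pd.Series([176, 174, 160, 180, 159, 163, 192, 184])

# qcut: equal-frequency bins; retbins=True also returns the computed edges
by_quantile, edges = pd.qcut(heights, 3, retbins=True)
print(edges)

# cut: fixed edges; the labels (hypothetical names) replace the interval notation
by_range = pd.cut(heights, bins=[150, 165, 180, 195],
                  labels=["short", "medium", "tall"])

# dtype=int forces 0/1 columns instead of the boolean default in pandas >= 2.0
one_hot = pd.get_dummies(by_range, prefix="height", dtype=int)
print(one_hot)
```

Each row of `one_hot` contains exactly one 1, in the column of the bin that the height falls into.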
     
     
  • Original article: https://www.cnblogs.com/jumpkin1122/p/11509777.html