  • Pandas Notes: Data Discretization (one-hot)

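    import pandas as pd

    # Heights in cm; each index label encodes the sample number and its value.
    data = pd.Series([176, 174, 160, 180, 159, 163, 192, 184],
                     index=["No1:176", "No2:174", "No3:160", "No4:180", "No5:159", "No6:163", "No7:192", "No8:184"])
    print(data)

    # Automatic binning: qcut splits the data into 3 quantile-based groups.
    groups = pd.qcut(data, 3)
    print()  # blank separator line
    # One-hot encode the groups; dtype=int keeps 0/1 output on pandas >= 2.0,
    # where get_dummies returns booleans by default.
    print(pd.get_dummies(groups, prefix="height", dtype=int))

    # Custom binning: cut uses the explicit, right-closed bin edges given below.
    bins = [150, 165, 180, 195]
    groups = pd.cut(data, bins)
    print(groups)
    print(groups.value_counts())
    print(pd.get_dummies(groups, prefix="身高", dtype=int))  # "身高" means "height"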
     
    No1:176    176
    No2:174    174
    No3:160    160
    No4:180    180
    No5:159    159
    No6:163    163
    No7:192    192
    No8:184    184
    dtype: int64
     
             height_(158.999, 166.667]  ...  height_(178.667, 192.0]
    No1:176                          0  ...                        0
    No2:174                          0  ...                        0
    No3:160                          1  ...                        0
    No4:180                          0  ...                        1
    No5:159                          1  ...                        0
    No6:163                          1  ...                        0
    No7:192                          0  ...                        1
    No8:184                          0  ...                        1
     
    [8 rows x 3 columns]
    No1:176    (165, 180]
    No2:174    (165, 180]
    No3:160    (150, 165]
    No4:180    (165, 180]
    No5:159    (150, 165]
    No6:163    (150, 165]
    No7:192    (180, 195]
    No8:184    (180, 195]
    dtype: category
    Categories (3, interval[int64]): [(150, 165] < (165, 180] < (180, 195]]
    (165, 180]    3
    (150, 165]    3
    (180, 195]    2
    dtype: int64
             身高_(150, 165]  身高_(165, 180]  身高_(180, 195]
    No1:176              0              1              0
    No2:174              0              1              0
    No3:160              1              0              0
    No4:180              0              1              0
    No5:159              1              0              0
    No6:163              1              0              0
    No7:192              0              0              1
    No8:184              0              0              1
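
The interval edges that `pd.qcut` reports in the output above, `(158.999, 166.667]` through `(178.667, 192.0]`, are the tertiles of the data. The following is a minimal sketch, assuming NumPy is installed alongside pandas, that recomputes them with `np.quantile` and counts the members of each group. The names `heights`, `edges`, `qcut_bins` and `counts` are illustrative and do not come from the original post.

```python
import numpy as np
import pandas as pd

heights = pd.Series([176, 174, 160, 180, 159, 163, 192, 184])

# Tertile boundaries: minimum, 1/3 quantile, 2/3 quantile, maximum.
edges = np.quantile(heights.to_numpy(), [0, 1 / 3, 2 / 3, 1])
print(edges.round(3))

# qcut derives its bins from the same quantiles; retbins=True returns them.
groups, qcut_bins = pd.qcut(heights, 3, retbins=True)
print(np.asarray(qcut_bins).round(3))

# Count the samples in each quantile bin, in bin order.
counts = groups.value_counts().sort_index()
print(counts.tolist())
```

With this data the quantile bins hold 3, 2 and 3 samples, whereas the fixed edges passed to `pd.cut` produce 3, 3 and 2, as the `value_counts()` output above shows.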
     
  • Original article: https://www.cnblogs.com/jumpkin1122/p/11509777.html