    Python Data Analysis: Detecting Missing Values in Pandas

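    Background

    When data is read from a database into Pandas, None values are converted to NaN or NaT. When the Pandas data is written back to the database, however, those values must be converted back to None, otherwise the write raises an error. We therefore need a way to handle missing values in Pandas.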
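
    The conversion described above can be reproduced without a database. The following is a minimal sketch, assuming a hypothetical two-column frame built by hand in place of a real query result: a None in a numeric column becomes NaN, and a None in a datetime column becomes NaT.

```python
import pandas as pd

# Hypothetical data standing in for rows fetched from a database.
df = pd.DataFrame({
    "amount": [10.5, None],
    "login_at": pd.to_datetime(["2019-08-10 10:00:00", None]),
})

print(df.dtypes)  # amount is float64, login_at is a datetime64 dtype
print(df.isna())  # the second row is reported as missing in both columns
```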

    Sample data

       id         name  password   sn  sex  age  amount  content  remark  login_date  login_at           created_at
    0   1  123456789.0       NaN  NaN  NaN   20     NaN      NaN     NaN         NaN       NaT  2019-08-10 10:00:00
    1   2          NaN       NaN  NaN  NaN   20     NaN      NaN     NaN         NaN       NaT  2019-08-10 10:00:00
    

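    Detecting missing values

    If the value passed in (the column argument) is missing, it is uniformly converted to None.

    import pandas as pd

    def judge_null(column):
        # pd.isnull is True for NaN, NaT, and None
        if pd.isnull(column):
            return None
        return column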
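
    A quick check of how this function behaves on individual values may help. The sketch below repeats the definition so that it runs on its own; the input values are illustrative. Note that pd.isnull returns an array for list-like input, so the function as written only suits scalar cells.

```python
import numpy as np
import pandas as pd


def judge_null(column):
    if pd.isnull(column):
        return None
    return column


# NaN, NaT, and None all map to None; ordinary values pass through unchanged.
results = [judge_null(v) for v in (np.nan, pd.NaT, None, 20, "abc")]
print(results)  # [None, None, None, 20, 'abc']

# A list-like cell makes pd.isnull return an array, whose truth value is ambiguous.
try:
    judge_null([1, None])
except ValueError as exc:
    print("list-like input is not supported:", exc)
```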
    

    Handling missing values

    Process the missing values column by column.

    columns = ['id', 'name', 'password', 'sn', 'sex', 'age', 'amount',
               'content', 'remark', 'login_date', 'login_at', 'created_at']
    for col in columns:
        # Replace missing values in this column with None, row by row
        df[col] = df.apply(lambda row: judge_null(row[col]), axis=1)
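
    Depending on the Pandas version, a None assigned back into a numeric column may be coerced to NaN again. One way to sidestep this, offered as a sketch and not as the original author's method, is to convert the frame into plain Python records and replace missing values there. The helper name to_db_records and the sample frame are hypothetical, and scalar cell values are assumed.

```python
import pandas as pd


def to_db_records(df):
    """Return the rows as dicts, with every missing value replaced by None."""
    return [
        {key: (None if pd.isna(value) else value) for key, value in row.items()}
        for row in df.to_dict(orient="records")
    ]


# Hypothetical subset of the sample data.
df = pd.DataFrame({
    "id": [1, 2],
    "name": [123456789.0, None],
    "login_at": pd.to_datetime([None, None]),
    "created_at": pd.to_datetime(["2019-08-10 10:00:00"] * 2),
})

records = to_db_records(df)
for record in records:
    print(record)
```

    Because the result is a list of ordinary dicts, no column dtype remains that could turn None back into NaN or NaT.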
    

    Data after processing

       id         name  password    sn   sex  age  amount  content  remark  login_date  login_at           created_at
    0   1  123456789.0      None  None  None   20    None     None    None        None      None  2019-08-10 10:00:00
    1   2         None      None  None  None   20    None     None    None        None      None  2019-08-10 10:00:00
    

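    Additional notes

    Configure Pandas to display all rows and all columns, and to widen the displayed length of each value.

    # Show all columns
    pd.set_option('display.max_columns', None)
    # Show all rows
    pd.set_option('display.max_rows', None)
    # Set the displayed length of a value to 100 (the default is 50)
    pd.set_option('display.max_colwidth', 100)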
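
    If these settings are only needed while inspecting one frame, pd.option_context can apply them temporarily. A minimal sketch, with variable names chosen for illustration:

```python
import pandas as pd

before = pd.get_option("display.max_colwidth")

# The options apply only inside the with block and are restored afterwards.
with pd.option_context(
    "display.max_columns", None,
    "display.max_rows", None,
    "display.max_colwidth", 100,
):
    inside = pd.get_option("display.max_colwidth")

after = pd.get_option("display.max_colwidth")
print(inside, after == before)
```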
    

    Corresponding CREATE TABLE statement

    create table test
    (
      id         int(10)        not null primary key,
      name       varchar(32)    null,
      password   char(10)       null,
      sn         bigint         null,
      sex        tinyint(1)     null,
      age        int(5)         null,
      amount     decimal(10, 2) null,
      content    text           null,
      remark     json           null,
      login_date date           null,
      login_at   datetime       null,
      created_at timestamp      null
    );
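
    To see why None matters on the write path, the sketch below inserts cleaned rows into a database and counts the resulting NULLs. It makes several assumptions: an in-memory SQLite database from the standard library stands in for the real database, the table is a simplified four-column version of the one above, and the sample frame is hypothetical. Placeholder syntax differs between database drivers.

```python
import sqlite3

import pandas as pd

# Hypothetical frame with missing values in the name and amount columns.
df = pd.DataFrame({
    "id": [1, 2],
    "name": ["123456789", None],
    "age": [20, 20],
    "amount": [float("nan"), float("nan")],
})

# Replace every missing value with None before handing rows to the driver.
rows = [
    tuple(None if pd.isna(value) else value for value in record.values())
    for record in df.to_dict(orient="records")
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table test (id integer primary key, name text, age integer, amount real)"
)
conn.executemany(
    "insert into test (id, name, age, amount) values (?, ?, ?, ?)", rows
)

# None is stored as SQL NULL.
null_names = conn.execute(
    "select count(*) from test where name is null"
).fetchone()[0]
null_amounts = conn.execute(
    "select count(*) from test where amount is null"
).fetchone()[0]
print(null_names, null_amounts)  # 1 2
```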
    
  • Original article: https://www.cnblogs.com/yxhblogs/p/11330927.html