  • Filtering elements

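    How the filtering methods pair up:

    Use notnull() together with all(), and isnull() together with any().

    For example, filter the data below to keep only the usable rows.

    Method 1: drop rows that contain missing values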

    import pandas as pd
    
    import numpy as np
    
    from pandas import Series,DataFrame
    
    df=DataFrame(data=np.random.randint(10,60,size=(8,8)))
    
    df.iloc[1,3]=None
    df.iloc[2,6]=None
    df.iloc[4,1]=None
    df.iloc[5,6]=np.nan
    
    df
    
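    df.notnull().all(axis=1)                      # boolean mask: True for rows in which every value is present
    
    df.loc[df.notnull().all(axis=1)]              # select the rows flagged by that mask
    
    df.loc[df.isnull().any(axis=1)]               # the rows that contain at least one missing value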

     Alternatively, call dropna directly to remove the rows with missing values:

    df.dropna(axis=0)
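
    The pairing above can be checked on a small fixed table. This is only a sketch: the frame `demo` and its values are made up for illustration, and it shows that the notnull().all(axis=1) mask is the exact complement of the isnull().any(axis=1) mask, and that selecting with it gives the same rows as dropna(axis=0).

```python
import numpy as np
import pandas as pd

# Hypothetical 3x3 frame with two missing cells (not the random data above).
demo = pd.DataFrame({
    "a": [1.0, np.nan, 3.0],
    "b": [4.0, 5.0, np.nan],
    "c": [7.0, 8.0, 9.0],
})

# True where every cell in the row is present.
complete_mask = demo.notnull().all(axis=1)
# True where at least one cell in the row is missing.
missing_mask = demo.isnull().any(axis=1)

complete_rows = demo.loc[complete_mask]   # rows without gaps
dropped = demo.dropna(axis=0)             # same rows, obtained via dropna

print(complete_mask.tolist())
print(complete_rows.equals(dropped))
```

    Swapping the pairs (notnull with any, or isnull with all) answers different questions: "has at least one value" and "is entirely empty".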

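    Method 2: fill gaps with neighboring values

    # ffill()/bfill() do the same job as fillna(method='ffill'/'backfill'), which recent pandas versions deprecate
    df.ffill(axis=1)              # forward fill: axis=1 copies the value from the left along each row, axis=0 copies from the row above
    df.bfill(axis=1)              # backward fill: copies the next valid value (from the right when axis=1)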
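
    A small sketch of the two fill directions, using a made-up 2x3 frame named `demo`. It also shows a limitation worth knowing: a forward fill cannot repair a gap at the start of a row, and a backward fill cannot repair one at the end, because there is no neighbor to copy from.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one gap in the middle, one at each end of the second row.
demo = pd.DataFrame([[1.0, np.nan, 3.0],
                     [np.nan, 5.0, np.nan]])

filled_from_left = demo.ffill(axis=1)    # each gap takes the value to its left
filled_from_right = demo.bfill(axis=1)   # each gap takes the value to its right

print(filled_from_left)
print(filled_from_right)
```

    If no gap may remain, one option is to chain both calls, for example demo.ffill(axis=1).bfill(axis=1).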

     Removing duplicate rows:

    df.drop_duplicates(keep="last")  # of each group of identical rows, keep only the last one
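
    To see what the keep argument changes, here is a minimal sketch on a made-up frame `demo` in which rows 0 and 2 are identical. The retained index labels show which copy survives.

```python
import pandas as pd

# Hypothetical data: rows 0 and 2 are exact duplicates.
demo = pd.DataFrame({"x": [1, 2, 1, 3], "y": ["a", "b", "a", "c"]})

keep_first = demo.drop_duplicates(keep="first")  # keeps row 0, drops row 2
keep_last = demo.drop_duplicates(keep="last")    # keeps row 2, drops row 0

print(keep_first.index.tolist())
print(keep_last.index.tolist())
```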

    Replacing values:

    df.replace(to_replace={66:6666})          # replace every 66 with 6666
    df.replace(to_replace={2:66},value=999)        # 2 is treated as a column label: in column 2, replace 66 with 999
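
    The two dictionary forms are easy to confuse, so here is a sketch on a made-up frame `demo` whose columns are labeled 0, 1 and 2. Without value, the dictionary maps old values to new ones; with value, the dictionary keys are read as column labels.

```python
import pandas as pd

# Hypothetical data; column 1 deliberately contains the number 2.
demo = pd.DataFrame({0: [66, 1], 1: [2, 66], 2: [66, 66]})

# {old: new}: every 66 in the frame becomes 6666.
everywhere = demo.replace(to_replace={66: 6666})

# {column: old} plus value: only the 66s in column 2 become 999.
only_col_2 = demo.replace(to_replace={2: 66}, value=999)

print(everywhere)
print(only_col_2)
```

    In the second call the 2 stored in column 1 is left alone, which confirms that the key is matched against column labels and not against cell values.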

     Mapping and computing with map:

    import pandas as pd
    
    import numpy as np
    
    from pandas import Series,DataFrame
    
    dic={
        'name':['kevin','lisa','jack'],
        'money':[8500,12000,15000]
    }
    df=DataFrame(data=dic)
    
    df
    
    dic={
        'kevin':'凯文',
        'lisa':'丽莎',
        'jack':'杰克'
    }
    
    
    df['c_name']=df['name'].map(dic)          # map each English name to its Chinese name
    
    df
    
    def get_salary(s):
        if s<10000:
            return s
        else:
            s-=(s-10000)*0.3
            return s
    
    df['money'].map(get_salary)

    df['after']=df['money'].map(get_salary)
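
    Series.map accepts either a dictionary or a function, and the two behave a little differently. The sketch below repeats both uses on standalone Series; the name "tom" is a hypothetical extra entry added to show that a key missing from the dictionary yields NaN instead of raising an error.

```python
import pandas as pd

# Dictionary lookup: "tom" (hypothetical) has no entry in the lookup table.
names = pd.Series(["kevin", "lisa", "tom"])
lookup = {"kevin": "凯文", "lisa": "丽莎", "jack": "杰克"}
mapped = names.map(lookup)

def get_salary(s):
    """Deduct 30% of whatever exceeds 10000; smaller amounts pass through."""
    if s < 10000:
        return s
    return s - (s - 10000) * 0.3

# Function mapping: get_salary is called once per element.
money = pd.Series([8500, 12000, 15000])
after = money.map(get_salary)

print(mapped.tolist())
print(after.tolist())
```

    For 12000 the deduction is 2000 * 0.3 = 600, leaving 11400; for 15000 it is 5000 * 0.3 = 1500, leaving 13500.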

    Joining data from multiple tables:

    import numpy as np
    import pandas as pd
    from pandas import Series,DataFrame
    
    city=pd.read_excel(r'C:\Users\asaxh\Desktop\city.xlsx')
    
    people=pd.read_excel(r'C:\Users\asaxh\Desktop\people.xlsx')
    
    area=pd.read_excel(r'C:\Users\asaxh\Desktop\area.xlsx')
    
    display(city.head(2),area.head(2),people.head(2))
    # pd.merge(city,area,on='简称',how='outer')
    pd.merge(city,area,left_on='简称',right_on='简称',how='outer')
    
    city_peo=pd.merge(city,people,left_on='简称',right_on='简称',how='outer')
    cpa=pd.merge(city_peo,area,left_on='简称',right_on='简称',how='outer')               # combined city, population and area data
    
    cpa.drop(labels='Unnamed: 3',axis=1,inplace=True)                         # drop the empty column
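
    The Excel files are not available here, so the following sketch uses two small in-memory tables to show what how='outer' does. Only the key column name '简称' comes from the code above; the other column names and all values are placeholders. An outer join keeps every key from both sides and fills the missing side with NaN, while an inner join keeps only the keys present in both.

```python
import pandas as pd

# Placeholder tables sharing the key column '简称'.
city = pd.DataFrame({"简称": ["晋", "京"], "city_name": ["A", "B"]})
area = pd.DataFrame({"简称": ["晋", "沪"], "area": [1.0, 2.0]})

# on='简称' is shorthand for left_on='简称', right_on='简称'.
outer = pd.merge(city, area, on="简称", how="outer")
inner = pd.merge(city, area, on="简称", how="inner")

print(outer)
print(inner)
```

    Here '沪' appears only in `area`, so its city_name is NaN in the outer result, and the inner result contains the single shared key '晋'.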

    -------------------------------------------------------------------------------------

    cpa.isnull().any(axis=0)                       # flag the columns that contain missing values
    
    cpa.query('简称=="晋"')                       # select rows matching a condition
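
    A sketch of both checks on a stand-in table, since `cpa` depends on files that are not included here. The frame `cpa_demo` and its values are invented; only the column name '简称' is taken from the code above. It also shows that query gives the same rows as an ordinary boolean mask.

```python
import numpy as np
import pandas as pd

# Invented stand-in for the merged table.
cpa_demo = pd.DataFrame({
    "简称": ["晋", "京", "沪"],
    "population": [100.0, np.nan, 300.0],
    "area": [1.0, 2.0, 3.0],
})

# axis=0 collapses the rows, giving one True/False per column.
cols_with_gaps = cpa_demo.isnull().any(axis=0)

# The same row selection written two ways.
by_query = cpa_demo.query('简称 == "晋"')
by_mask = cpa_demo.loc[cpa_demo["简称"] == "晋"]

print(cols_with_gaps)
print(by_query)
```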
  • Original article: https://www.cnblogs.com/wen-kang/p/10989919.html