zoukankan      html  css  js  c++  java
  • Pandas文摘:Applying Operations Over pandas Dataframes

    原文地址:https://chrisalbon.com/python/data_wrangling/pandas_apply_operations_to_dataframes/

    Applying Operations Over pandas Dataframes

    20 Dec 2017

    Import Modules

    import pandas as pd
    import numpy as np

    Create a dataframe

    data = {'name': ['Jason', 'Molly', 'Tina', 'Jake', 'Amy'], 
            'year': [2012, 2012, 2013, 2014, 2014], 
            'reports': [4, 24, 31, 2, 3],
            'coverage': [25, 94, 57, 62, 70]}
    df = pd.DataFrame(data, index = ['Cochice', 'Pima', 'Santa Cruz', 'Maricopa', 'Yuma'])
    df
     coveragenamereportsyear
    Cochice 25 Jason 4 2012
    Pima 94 Molly 24 2012
    Santa Cruz 57 Tina 31 2013
    Maricopa 62 Jake 2 2014
    Yuma 70 Amy 3 2014

    Create a capitalization lambda function

    capitalizer = lambda x: x.upper()

    Apply the capitalizer function over the column ‘name’

    apply() can apply a function along any axis of the dataframe

    df['name'].apply(capitalizer)
    Cochice       JASON
    Pima          MOLLY
    Santa Cruz     TINA
    Maricopa       JAKE
    Yuma            AMY
    Name: name, dtype: object
    

    Map the capitalizer lambda function over each element in the series ‘name’

    map() applies an operation over each element of a series

    df['name'].map(capitalizer)
    Cochice       JASON
    Pima          MOLLY
    Santa Cruz     TINA
    Maricopa       JAKE
    Yuma            AMY
    Name: name, dtype: object
    

    Apply a square root function to every single cell in the whole data frame

    applymap() applies a function to every single element in the entire dataframe.

    # Drop the string variable so that applymap() can run
    df = df.drop('name', axis=1)
    
    # Return the square root of every cell in the dataframe
    df.applymap(np.sqrt)
     coveragereportsyear
    Cochice 5.000000 2.000000 44.855323
    Pima 9.695360 4.898979 44.855323
    Santa Cruz 7.549834 5.567764 44.866469
    Maricopa 7.874008 1.414214 44.877611
    Yuma 8.366600 1.732051 44.877611

    Applying A Function Over A Dataframe

    Create a function that multiplies all non-strings by 100

    # create a function called times100
    def times100(x):
        # that, if x is a string,
        if type(x) is str:
            # just returns it untouched
            return x
        # but, if not, return it multiplied by 100
        elif x:
            return 100 * x
        # and leave everything else
        else:
            return

    Apply the times100 over every cell in the dataframe

    df.applymap(times100)
     coveragereportsyear
    Cochice 2500 400 201200
    Pima 9400 2400 201200
    Santa Cruz 5700 3100 201300
    Maricopa 6200 200 201400
    Yuma 7000 300 201400
  • 相关阅读:
    记账本开发进程第四天
    记账本开发进程第三天
    记账本开发进程第二天
    记账本开发进程第一天
    《人月神话》阅读笔记三
    《人月神话》阅读笔记二
    一、计算机基础
    Fox and Minimal path CodeForces
    Maximum Value (二分+思维枚举)
    True Liars (思维想法+带权并并查集+01背包)
  • 原文地址:https://www.cnblogs.com/chickenwrap/p/10125568.html
Copyright © 2011-2022 走看看