  • pandas Exercises (4) --- Using the apply Function

    Exploring student alcohol consumption

    The data is available on GitHub

    Step 1 - Import the necessary libraries

    import pandas as pd
    import numpy as np

    Step 2 - Point to the dataset

    path4 = "./data/student-mat.csv"   

    Step 3 - Read the data into a DataFrame named student

    student = pd.read_csv(path4)
    student.head()

    Output:

    Step 4 - Slice the data from column 'school' through column 'guardian'

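    # .copy() makes the slice an independent DataFrame, so the column
    # assignments in later steps do not trigger SettingWithCopyWarning
    stud_alcoh = student.loc[:, "school":"guardian"].copy()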
    stud_alcoh.head()

    Output:
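
    A minimal sketch of how label-based slicing with .loc behaves, using a small invented frame rather than the real student-mat.csv. The column names are borrowed from the exercise, the values are made up for illustration. The point is that a label slice includes both endpoints, so 'guardian' itself is kept:

```python
import pandas as pd

# Toy stand-in for the student data (values are invented)
toy = pd.DataFrame({
    "school": ["GP", "MS"],
    "sex": ["F", "M"],
    "age": [18, 17],
    "guardian": ["mother", "father"],
    "traveltime": [2, 1],
})

# Label slices with .loc include BOTH the start and the end label
sliced = toy.loc[:, "school":"guardian"].copy()
print(list(sliced.columns))
```

    Columns after 'guardian' (here 'traveltime') are dropped, and .copy() keeps later edits to the slice from touching the original frame.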

     

    Step 5 - Create a lambda function that converts a string to uppercase

    captalizer = lambda x: x.upper()

    Step 6 - Convert every value in the 'Fjob' column to uppercase

    stud_alcoh['Fjob'].apply(captalizer)

    Output:

    0       TEACHER
    1         OTHER
    2         OTHER
    3      SERVICES
    4         OTHER
    5         OTHER
    6         OTHER
    7       TEACHER
    8         OTHER
    9         OTHER
    10       HEALTH
    11        OTHER
    12     SERVICES
    13        OTHER
    14        OTHER
    15        OTHER
    16     SERVICES
    17        OTHER
    18     SERVICES
    19        OTHER
    20        OTHER
    21       HEALTH
    22        OTHER
    23        OTHER
    24       HEALTH
    25     SERVICES
    26        OTHER
    27     SERVICES
    28        OTHER
    29      TEACHER
             ...   
    365       OTHER
    366    SERVICES
    367    SERVICES
    368    SERVICES
    369     TEACHER
    370    SERVICES
    371    SERVICES
    372     AT_HOME
    373       OTHER
    374       OTHER
    375       OTHER
    376       OTHER
    377    SERVICES
    378       OTHER
    379       OTHER
    380     TEACHER
    381       OTHER
    382    SERVICES
    383    SERVICES
    384       OTHER
    385       OTHER
    386     AT_HOME
    387       OTHER
    388    SERVICES
    389       OTHER
    390    SERVICES
    391    SERVICES
    392       OTHER
    393       OTHER
    394     AT_HOME
    Name: Fjob, dtype: object
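
    For string columns, the same result can usually be obtained without a lambda by using the vectorized .str accessor. The sketch below compares the two on a small invented Series (the values are only examples of the job labels seen above) and also shows that neither call modifies the original Series:

```python
import pandas as pd

# Invented sample of job labels
jobs = pd.Series(["teacher", "other", "services", "at_home"], name="Fjob")

# Element-wise apply with a lambda, as in the exercise
via_apply = jobs.apply(lambda x: x.upper())

# Vectorized string method
via_str = jobs.str.upper()

print(via_apply.tolist() == via_str.tolist())
print(jobs.tolist())  # the original Series is left unchanged
```

    One practical difference: .str.upper() passes missing values through as NaN, while the lambda would raise an AttributeError on a NaN cell.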

    Step 7 - Print the last few rows of the dataset

    stud_alcoh.tail()

    Output:

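    Step 8 - Notice that the DataFrame still holds lowercase values: apply returns a new Series and does not modify the column in place. Assign the result back to fix this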

    stud_alcoh['Mjob'] = stud_alcoh['Mjob'].apply(captalizer)
    stud_alcoh['Fjob'] = stud_alcoh['Fjob'].apply(captalizer)
    stud_alcoh.tail()

    Output:
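
    When several columns need the same treatment, DataFrame.apply can process them in one statement, since by default it passes each column to the function as a Series. A small sketch on invented data (the frame and the job_cols list are illustrative names, not part of the exercise):

```python
import pandas as pd

# Invented stand-in for the sliced student data
toy = pd.DataFrame({
    "Mjob": ["at_home", "health"],
    "Fjob": ["teacher", "other"],
    "age": [18, 17],
})

job_cols = ["Mjob", "Fjob"]

# DataFrame.apply hands each selected column (a Series) to the function;
# assigning the result back overwrites both columns at once
toy[job_cols] = toy[job_cols].apply(lambda col: col.str.upper())
print(toy)
```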

    Step 9 - Create a function named majority that returns a boolean, and use it to fill a new column named legal_drinker (the age of majority here is over 17)

    def majority(x):
        # The comparison already evaluates to a boolean
        return x > 17

    # Apply the function to every age and store the result in a new column
    stud_alcoh['legal_drinker'] = stud_alcoh['age'].apply(majority)
    stud_alcoh.head()

    Output:
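
    A simple threshold like this can also be written as a direct comparison on the whole column, which gives the same booleans as the apply version. A short sketch with invented ages:

```python
import pandas as pd

# Invented ages for illustration
ages = pd.Series([15, 17, 18, 22], name="age")

def majority(x):
    return x > 17

# Element-wise version, as in the exercise
via_apply = ages.apply(majority)

# Vectorized comparison over the whole Series
via_compare = ages > 17

print(via_compare.tolist())  # [False, False, True, True]
```

    Note that 17 itself maps to False, because the condition is strictly greater than 17.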

    Step 10 - Multiply every number in the dataset by 10

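    def times10(x):
        # Only plain integers are scaled; strings and booleans pass through unchanged
        if type(x) is int:
            return 10 * x
        return x

    # applymap calls the function on every cell (from pandas 2.1 it is named DataFrame.map)
    stud_alcoh.applymap(times10).head(10)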

    Output:
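
    An alternative sketch that avoids a per-cell function: select the numeric columns with select_dtypes and multiply them as a block. The frame below is invented for illustration. Unlike times10, this also scales float columns, so it is only equivalent when the numeric columns are all integers, as assumed here:

```python
import pandas as pd

# Invented stand-in for the sliced student data
toy = pd.DataFrame({
    "school": ["GP", "MS"],
    "age": [18, 17],
    "Medu": [4, 1],
    "legal_drinker": [True, False],
})

scaled = toy.copy()

# "number" matches integer and float columns, but not booleans or strings
num_cols = scaled.select_dtypes(include="number").columns
scaled[num_cols] = scaled[num_cols] * 10
print(scaled)
```

    The boolean legal_drinker column and the string school column are left untouched, matching the behaviour of the type check in times10.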

    References:

    1. http://pandas.pydata.org/pandas-docs/stable/cookbook.html#cookbook

    2. https://www.analyticsvidhya.com/blog/2016/01/12-pandas-techniques-python-data-manipulation/

    3. https://github.com/guipsamora/pandas_exercises

  • Original article: https://www.cnblogs.com/xiaxuexiaoab/p/9226772.html