- Construct the data
    import pandas as pd

    df = pd.DataFrame({
        'Country': ['China', 'China', 'India', 'India', 'America', 'Japan', 'China', 'India'],
        'Income': [10000, 10000, 5000, 5002, 40000, 50000, 8000, 5000],
        'Age': [5000, 4321, 1234, 4010, 250, 250, 4500, 4321],
    })
The result is as follows (the column order may differ depending on the pandas version):

        Age  Country  Income
    0  5000    China   10000
    1  4321    China   10000
    2  1234    India    5000
    3  4010    India    5002
    4   250  America   40000
    5   250    Japan   50000
    6  4500    China    8000
    7  4321    India    5000
- Group by a single column
    df_gb = df.groupby('Country')
    # Iterating over the grouped object yields (group key, sub-DataFrame) pairs.
    for index, data in df_gb:
        print(index)
        print(data)

Output:

    America
       Age  Country  Income
    4  250  America   40000
    China
        Age  Country  Income
    0  5000    China   10000
    1  4321    China   10000
    6  4500    China    8000
    India
        Age  Country  Income
    2  1234    India    5000
    3  4010    India    5002
    7  4321    India    5000
    Japan
       Age  Country  Income
    5  250    Japan   50000
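Iterating is not the only way to look inside a grouped object. As a minimal sketch (the DataFrame is rebuilt so the snippet runs on its own, and the names `china` and `sizes` are only illustrative), `get_group` returns the rows of one group and `size` counts the rows in each group:

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['China', 'China', 'India', 'India', 'America', 'Japan', 'China', 'India'],
    'Income': [10000, 10000, 5000, 5002, 40000, 50000, 8000, 5000],
    'Age': [5000, 4321, 1234, 4010, 250, 250, 4500, 4321],
})

df_gb = df.groupby('Country')

# Rows of a single group; the original index labels (0, 1, 6) are kept.
china = df_gb.get_group('China')
print(china)

# Number of rows per group, indexed by the group key.
sizes = df_gb.size()
print(sizes)
```

This is handy when only one group is of interest and looping over all of them would be wasteful.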
- Group by multiple columns
    df_gb = df.groupby(['Country', 'Income'])
    # With several grouping columns, each group key is a tuple.
    for (index1, index2), data in df_gb:
        print((index1, index2))
        print(data)

Output:

    ('America', 40000)
       Age  Country  Income
    4  250  America   40000
    ('China', 8000)
        Age  Country  Income
    6  4500    China    8000
    ('China', 10000)
        Age  Country  Income
    0  5000    China   10000
    1  4321    China   10000
    ('India', 5000)
        Age  Country  Income
    2  1234    India    5000
    7  4321    India    5000
    ('India', 5002)
        Age  Country  Income
    3  4010    India    5002
    ('Japan', 50000)
       Age  Country  Income
    5  250    Japan   50000
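With several grouping columns the keys are tuples, which also affects how a single group is looked up. The following is a small sketch under the same data (rebuilt here so it is self-contained; `pair`, `counts` and `flat` are illustrative names): it counts the groups, fetches one group by its tuple key, and turns the per-group sizes back into an ordinary table.

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['China', 'China', 'India', 'India', 'America', 'Japan', 'China', 'India'],
    'Income': [10000, 10000, 5000, 5002, 40000, 50000, 8000, 5000],
    'Age': [5000, 4321, 1234, 4010, 250, 250, 4500, 4321],
})

df_gb = df.groupby(['Country', 'Income'])

# Number of distinct (Country, Income) combinations.
print(df_gb.ngroups)

# A single group is addressed by a tuple of key values.
pair = df_gb.get_group(('China', 10000))
print(pair)

# size() returns a Series with a two-level index (Country, Income).
counts = df_gb.size()

# reset_index moves both key levels back into regular columns.
flat = counts.reset_index(name='count')
print(flat)
```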
- Aggregation functions: aggregate the grouped data
    # Apply min, mean and max to every non-grouping column.
    df_agg = df.groupby('Country').agg(['min', 'mean', 'max'])
    print(df_agg)

Output:

              Age                     Income
              min         mean   max     min          mean    max
    Country
    America   250   250.000000   250   40000  40000.000000  40000
    China    4321  4607.000000  5000    8000   9333.333333  10000
    India    1234  3188.333333  4321    5000   5000.666667   5002
    Japan     250   250.000000   250   50000  50000.000000  50000
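The list passed to `agg` does not have to contain only built-in names; it can also include your own functions. The sketch below assumes a hypothetical helper `value_range` (not part of pandas) that returns the spread of a column within one group, and mixes it with `'min'` and `'max'`. The resulting sub-column takes the function's name.

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['China', 'China', 'India', 'India', 'America', 'Japan', 'China', 'India'],
    'Income': [10000, 10000, 5000, 5002, 40000, 50000, 8000, 5000],
    'Age': [5000, 4321, 1234, 4010, 250, 250, 4500, 4321],
})


def value_range(s):
    """Hypothetical helper: max minus min of one column within one group."""
    return s.max() - s.min()


# Select the numeric columns explicitly, then mix built-in names with the helper.
agg_df = df.groupby('Country')[['Age', 'Income']].agg(['min', 'max', value_range])
print(agg_df)

# Individual cells are addressed with a (column, function name) tuple.
print(agg_df.loc['China', ('Age', 'value_range')])
```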
- Aggregate only some of the columns after grouping
    # Map each column to the list of aggregations to apply to it.
    num_agg = {'Age': ['min', 'mean', 'max']}
    print(df.groupby('Country').agg(num_agg))

Output:

              Age
              min         mean   max
    Country
    America   250   250.000000   250
    China    4321  4607.000000  5000
    India    1234  3188.333333  4321
    Japan     250   250.000000   250

    # Different columns can use different sets of aggregations.
    num_agg = {'Age': ['min', 'mean', 'max'], 'Income': ['min', 'max']}
    print(df.groupby('Country').agg(num_agg))

Output:

              Age                     Income
              min         mean   max     min    max
    Country
    America   250   250.000000   250   40000  40000
    China    4321  4607.000000  5000    8000  10000
    India    1234  3188.333333  4321    5000   5002
    Japan     250   250.000000   250   50000  50000
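The dictionary form above produces two-level column headers. If flat column names are preferred, named aggregation is one alternative. This is a sketch that assumes pandas 0.25 or later, and the output names `age_min`, `age_mean` and `income_max` are chosen here purely for illustration:

```python
import pandas as pd

df = pd.DataFrame({
    'Country': ['China', 'China', 'India', 'India', 'America', 'Japan', 'China', 'India'],
    'Income': [10000, 10000, 5000, 5002, 40000, 50000, 8000, 5000],
    'Age': [5000, 4321, 1234, 4010, 250, 250, 4500, 4321],
})

# Each keyword is an output column name; the value is (source column, aggregation).
summary = df.groupby('Country').agg(
    age_min=('Age', 'min'),
    age_mean=('Age', 'mean'),
    income_max=('Income', 'max'),
)
print(summary)
```

The result has a single header row, which is usually easier to merge with other tables or export.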