  • Python Notes #18# Pandas: Grouping

    10 Minutes to pandas

    By “group by” we are referring to a process involving one or more of the following steps:
    
    Splitting the data into groups based on some criteria
    Applying a function to each group independently
    Combining the results into a data structure
    See the Grouping section
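
    The three steps above can be written out by hand to see what groupby does. This is only a sketch on a small hypothetical dataset (the column names key and value are not from the original post), compared against the built-in one-liner:

```python
import pandas as pd

# Small hypothetical dataset (not from the original post).
df = pd.DataFrame({
    "key": ["a", "b", "a", "b"],
    "value": [1, 2, 3, 4],
})

# 1. Split: iterating over a GroupBy yields (group name, sub-DataFrame) pairs.
groups = {name: part for name, part in df.groupby("key")}

# 2. Apply: compute a sum for each group independently.
sums = {name: part["value"].sum() for name, part in groups.items()}

# 3. Combine: collect the per-group results into one Series.
manual = pd.Series(sums, name="value")
manual.index.name = "key"

# The built-in call performs the same three steps in one expression.
builtin = df.groupby("key")["value"].sum()

print(manual)
print(builtin)
```

    Both printed Series should hold the same per-group sums; the manual version exists only to make the split, apply, and combine stages visible.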

    Code

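    import numpy as np
    import pandas as pd

    df = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar', 'foo', 'bar', 'foo', 'foo'],
                       'B': ['one', 'one', 'two', 'three', 'two', 'two', 'one', 'three'],
                       'C': np.random.randn(8), 'D': np.random.randn(8)})
    print(df)
    # Sum columns C and D separately for foo and bar.
    # B holds strings, so the numeric columns are selected explicitly.
    print(df.groupby('A')[['C', 'D']].sum())

    # Same idea, but each value of A maps to several values of B (one-to-many)
    print(df.groupby(['A', 'B']).sum())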
    
    #      A      B         C         D
    # 0  foo    one  0.102071 -0.301926
    # 1  bar    one  1.161158  0.847451
    # 2  foo    two -0.023879  0.936338
    # 3  bar  three -0.353075 -0.834349
    # 4  foo    two -0.272542 -1.425635
    # 5  bar    two -1.016016 -0.031614
    # 6  foo    one -0.428517  0.892747
    # 7  foo  three -0.843796  0.614443
    #
    # Sum grouped by A:
    #             C         D
    # A                      
    # bar -0.207932 -0.018512
    # foo -1.466663  0.715967
    #
    # Sum grouped by A and B:
    #                   C         D
    # A   B                        
    # bar one    1.161158  0.847451
    #     three -0.353075 -0.834349
    #     two   -1.016016 -0.031614
    # foo one   -0.326445  0.590821
    #     three -0.843796  0.614443
    #     two   -0.296421 -0.489296
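
    Because the code above uses np.random.randn, every run prints different numbers. As a sketch for checking the output, the values from the printed DataFrame can be typed in directly so the grouped sums are repeatable. The variable names by_a, by_ab, foo_one and flat are chosen here for illustration, and the last digit may differ from the printout because the printed values are rounded:

```python
import pandas as pd

# Values copied from the printed DataFrame above, so results are repeatable.
df = pd.DataFrame({
    "A": ["foo", "bar", "foo", "bar", "foo", "bar", "foo", "foo"],
    "B": ["one", "one", "two", "three", "two", "two", "one", "three"],
    "C": [0.102071, 1.161158, -0.023879, -0.353075,
          -0.272542, -1.016016, -0.428517, -0.843796],
    "D": [-0.301926, 0.847451, 0.936338, -0.834349,
          -1.425635, -0.031614, 0.892747, 0.614443],
})

# Group by one key, then by two keys, summing the numeric columns.
by_a = df.groupby("A")[["C", "D"]].sum()
by_ab = df.groupby(["A", "B"])[["C", "D"]].sum()

# A two-key groupby returns a MultiIndex; a tuple selects one (A, B) pair.
foo_one = by_ab.loc[("foo", "one")]

# reset_index() turns the group keys back into ordinary columns.
flat = by_ab.reset_index()

print(by_a, foo_one, flat, sep="\n\n")
```

    The tuple lookup shows the one-to-many relationship noted in the code comment: each value of A owns several B rows in the hierarchical index, and reset_index() flattens them back into a regular table.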
  • Original post: https://www.cnblogs.com/xkxf/p/8394941.html