  • An analysis of the pandas.DataFrame object

    An analysis of the pandas.DataFrame object type

    df = pd.DataFrame([[1,"2",3,4],[5,"6",7,8]],columns=["a","b","c","d"])

    Method analysis

    1. The add() method: works like the + operator (the elements being added must hold data of the same type)

     |  add(self, other, axis='columns', level=None, fill_value=None)
     |      Addition of dataframe and other, element-wise (binary operator `add`).
     |      
     |      Equivalent to ``dataframe + other``, but with support to substitute a fill_value for
     |      missing data in one of the inputs.
     |      
     |      Parameters
     |      ----------
     |      other : Series, DataFrame, or constant
     |      axis : {0, 1, 'index', 'columns'}
     |          For Series input, axis to match Series index on
     |      level : int or name
     |          Broadcast across a level, matching Index values on the
     |          passed MultiIndex level
     |      fill_value : None or float value, default None
     |          Fill existing missing (NaN) values, and any new element needed for
     |          successful DataFrame alignment, with this value before computation.
     |          If data in both corresponding DataFrame locations is missing
     |          the result will be missing
    The pandas.DataFrame.add method
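A minimal sketch of how `fill_value` changes the result, assuming two small frames (`left` and `right` are illustrative names, not from the original post) whose columns only partly overlap:

```python
import pandas as pd

# Hypothetical frames: both have column "a"; "b" and "c" exist on one side only.
left = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
right = pd.DataFrame({"a": [10, 20], "c": [30, 40]})

# The + operator aligns on labels; a column missing on either side becomes NaN.
plain = left + right

# add() with fill_value=0 treats a value missing on one side as 0 before adding.
filled = left.add(right, fill_value=0)

print(plain)
print(filled)
```

With `fill_value=0`, columns `b` and `c` keep the values from the frame that has them instead of turning into NaN.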

    2. The aggregate() method: can be abbreviated as agg()

    aggregate(self, func, axis=0, *args, **kwargs)
     |      Aggregate using one or more operations over the specified axis.
     |      
     |      .. versionadded:: 0.20.0
     |      
     |      Parameters
     |      ----------
     |      func : function, string, dictionary, or list of string/functions
     |          Function to use for aggregating the data. If a function, must either
     |          work when passed a DataFrame or when passed to DataFrame.apply. For
     |          a DataFrame, can pass a dict, if the keys are DataFrame column names.
     |      
     |          Accepted combinations are:
     |      
     |          - string function name.
     |          - function.
     |          - list of functions.
     |          - dict of column names -> functions (or list of functions).
    The pandas.DataFrame.aggregate method

    example:

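    # coding=utf-8
    import pandas as pd

    # A Series that mixes int and str values has dtype object.
    ds = pd.Series([11, "2", 13, 14])
    print(ds, "\n")

    # Column "b" holds strings, so its sum is a string concatenation.
    df = pd.DataFrame([[1, "2", 3, 4], [5, "6", 7, 8]], columns=["a", "b", "c", "d"])
    print(df, "\n")

    # A list of function names yields one result row per function.
    print(df.agg(['sum', 'min']))
    # A dict restricts the aggregation to the named columns.
    print(df.agg({"a": ['sum', 'min']}))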

    output:

    0    11
    1     2
    2    13
    3    14
    dtype: object 
    
       a  b  c  d
    0  1  2  3  4
    1  5  6  7  8 
    
         a   b   c   d
    sum  6  26  10  12
    min  1   2   3   4
         a
    sum  6
    min  1

    Commonly used aggregation functions (`mean`, `median`, `prod`, `sum`, `std`, `var`)

    mad(self, axis=None, skipna=None, level=None)
        Return the mean absolute deviation of the values for the requested axis
    max(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
        This method returns the maximum of the values in the object. If you want the *index* of the maximum, use ``idxmax``. This is the equivalent of the ``numpy.ndarray`` method ``argmax``.
    mean(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
        Return the mean of the values for the requested axis
    median(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
        Return the median of the values for the requested axis
    min(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
        This method returns the minimum of the values in the object.
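The reductions listed above can be sketched as follows. The frame is a made-up example with one missing value, chosen to show that NaN is skipped by default; the variable names are illustrative.

```python
import pandas as pd

# Hypothetical data: column "c" contains one missing value.
df = pd.DataFrame({"a": [1, 5, 3], "c": [3.0, 7.0, None]})

col_max = df.max()            # per-column maximum, NaN skipped
col_mean = df.mean()          # per-column mean, NaN skipped
col_median = df.median()      # per-column median, NaN skipped
row_min = df.min(axis=1)      # per-row minimum
max_label = df["a"].idxmax()  # index label of the largest value in "a"

print(col_max, col_mean, col_median, row_min, max_label, sep="\n")
```

`max` returns the value itself, while `idxmax` returns the label at which it occurs.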
        
    memory_usage(self, index=True, deep=False)
        Return the memory usage of each column in bytes.
    merge(self, right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=False, suffixes=('_x', '_y'), copy=True, indicator=False, validate=None)
        Merge DataFrame objects by performing a database-style join operation by columns or indexes.
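As a rough illustration of the database-style join, assuming two small frames that share a `key` column (the names `orders` and `prices` and their contents are invented for this sketch):

```python
import pandas as pd

# Hypothetical frames sharing the column "key".
orders = pd.DataFrame({"key": ["k1", "k2", "k3"], "qty": [1, 2, 3]})
prices = pd.DataFrame({"key": ["k1", "k2", "k4"], "price": [9.5, 4.0, 1.0]})

# Default how='inner': only keys present in both frames survive.
inner = orders.merge(prices, on="key")

# how='outer' keeps every key; indicator=True adds a "_merge" column
# recording whether each row came from the left, the right, or both.
outer = orders.merge(prices, on="key", how="outer", indicator=True)

print(inner)
print(outer)
```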
    align(self, other, join='outer', axis=None, level=None, copy=True, fill_value=None, method=None, limit=None, fill_axis=0, broadcast_axis=None):
        Align two objects on their axes with the specified join method for each axis Index
    all(self, axis=None, bool_only=None, skipna=None, level=None, **kwargs):
        Return whether all elements are True over series or dataframe axis.
    any(self, axis=None, bool_only=None, skipna=None, level=None, **kwargs):
        Return whether any element is True over requested axis.
    apply(self, func, axis=0, broadcast=None, raw=False, reduce=None, result_type=None, args=(), **kwds):
        Apply a function along an axis of the DataFrame.
    applymap(self, func):
        Apply a function to a DataFrame elementwise. This method applies a function that accepts and returns a scalar to every element of a DataFrame.
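A short sketch of the two `apply` directions plus an element-wise transform. Because newer pandas releases deprecate `applymap`, the element-wise step here is written with `apply` and `Series.map`, which is a choice made for this sketch rather than something the original post prescribes.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 5], "c": [3, 7]})

# axis=0 (default): the function receives each column as a Series.
col_range = df.apply(lambda col: col.max() - col.min())

# axis=1: the function receives each row as a Series.
row_total = df.apply(lambda row: row["a"] + row["c"], axis=1)

# Element-wise: map a scalar function over every value, column by column.
squared = df.apply(lambda col: col.map(lambda v: v * v))

print(col_range, row_total, squared, sep="\n")
```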
    append(self, other, ignore_index=False, verify_integrity=False, sort=None):
        Append rows of `other` to the end of this frame, returning a new object. Columns not in this frame are added as new columns.
    assign(self, **kwargs):
        Assign new columns to a DataFrame, returning a new object (a copy) with the new columns added to the original ones. Existing columns that are re-assigned will be overwritten.
    insert(self, loc, column, value, allow_duplicates=False)
        Insert column into DataFrame at specified location.    
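The difference between `assign` and `insert` is easy to miss: the first returns a new object, the second modifies the frame in place. A minimal sketch with made-up column names:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 5], "c": [3, 7]})

# assign() returns a copy with the extra column; df itself is untouched.
with_total = df.assign(total=lambda d: d["a"] + d["c"])

# insert() mutates df, placing the new column "b" at position 1.
df.insert(1, "b", ["x", "y"])

print(with_total)
print(df)
```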
    
    combine(self, other, func, fill_value=None, overwrite=True):
        Add two DataFrame objects and do not propagate NaN values, so if for a (column, time) one frame is missing a value, it will default to the other frame's value (which might be NaN as well)
    count(self, axis=0, level=None, numeric_only=False):
        Count non-NA cells for each column or row.
    cov(self, min_periods=None):
        Compute pairwise covariance of columns, excluding NA/null values.
    drop(self, labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise'):
        Drop specified labels from rows or columns.
    drop_duplicates(self, subset=None, keep='first', inplace=False):
        Return DataFrame with duplicate rows removed, optionally only considering certain columns
    dropna(self, axis=0, how='any', thresh=None, subset=None, inplace=False)
        Remove missing values.
    duplicated(self, subset=None, keep='first')
        Return boolean Series denoting duplicate rows, optionally only considering certain columns
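The four row and column removal methods above can be compared on one small frame. The data is invented for illustration: row 1 repeats row 0, and row 3 has a missing value.

```python
import pandas as pd

# Hypothetical data: row 1 duplicates row 0; row 3 contains a NaN.
df = pd.DataFrame({"a": [1, 1, 2, None], "b": ["x", "x", "y", "z"]})

dup_mask = df.duplicated()       # True for rows that repeat an earlier row
deduped = df.drop_duplicates()   # keep='first': the first occurrence stays
no_missing = df.dropna()         # how='any': drop rows with any missing value
without_b = df.drop(columns=["b"])  # drop a column by label

print(dup_mask, deduped, no_missing, without_b, sep="\n")
```

None of these calls modify `df`, since `inplace` defaults to False.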
    eq(self, other, axis='columns', level=None)
        Wrapper for flexible comparison methods eq
    eval(self, expr, inplace=False, **kwargs)
        Evaluate a string describing operations on DataFrame columns.
    fillna(self, value=None, method=None, axis=None, inplace=False, limit=None, downcast=None, **kwargs)
        Fill NA/NaN values using the specified method
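A small sketch of filling missing values with a per-column dict. The `method` argument shown in the signature is deprecated in newer pandas releases, so the forward fill below uses `ffill()` instead; that substitution is an assumption of this example, not part of the original text.

```python
import pandas as pd

# Hypothetical frame with gaps in both columns.
df = pd.DataFrame({"a": [1.0, None, 3.0], "b": [None, "y", None]})

# A dict gives each column its own replacement value.
per_column = df.fillna({"a": 0.0, "b": "missing"})

# Forward fill: propagate the last valid value downwards.
forward = df["a"].ffill()

print(per_column)
print(forward)
```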
    ge(self, other, axis='columns', level=None)
        Wrapper for flexible comparison methods ge
    gt(self, other, axis='columns', level=None)
        Wrapper for flexible comparison methods gt
    le(self, other, axis='columns', level=None)
        Wrapper for flexible comparison methods le
    lt(self, other, axis='columns', level=None)
        Wrapper for flexible comparison methods lt
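The flexible comparison methods behave like the operators `==`, `>=`, `>`, `<=` and `<`, with an extra `axis` argument that controls how a Series is matched against the frame. A minimal sketch with made-up values:

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 5], "c": [3, 7]})

# Compare every element with a scalar (same as df == 3).
eq_three = df.eq(3)

# axis='columns': the Series index is matched against the column labels.
ge_per_column = df.ge(pd.Series({"a": 2, "c": 7}), axis="columns")

# axis='index': the Series index is matched against the row labels.
gt_per_row = df.gt(pd.Series([2, 6]), axis="index")

print(eq_three, ge_per_column, gt_per_row, sep="\n")
```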
    
    get_value(self, index, col, takeable=False)
        Quickly retrieve single value at passed column and index
    info(self, verbose=None, buf=None, max_cols=None, memory_usage=None, null_counts=None)
        Print a concise summary of a DataFrame.
    isin(self, values)
        Return boolean DataFrame showing whether each element in the DataFrame is contained in values.
    isna(self)
        Detect missing values. Return a boolean same-sized object indicating if the values are NA.
    isnull(self)
        Detect missing values. Return a boolean same-sized object indicating if the values are NA.
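A brief sketch of membership and missing-value tests, using an invented frame. Passing a dict to `isin` checks each column against its own list of values.

```python
import pandas as pd

# Hypothetical frame with one missing value in column "a".
df = pd.DataFrame({"a": [1, 5, None], "b": ["x", "y", "z"]})

# Each column is tested against the values listed under its name.
in_values = df.isin({"a": [1, 5], "b": ["x"]})

# isna() and isnull() produce the same boolean mask.
missing = df.isna()
same_mask = df.isnull().equals(missing)

print(in_values)
print(missing)
```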
    iteritems(self)
        Iterator over (column name, Series) pairs.
    iterrows(self)
        Iterate over DataFrame rows as (index, Series) pairs.
    itertuples(self, index=True, name='Pandas')
        Iterate over DataFrame rows as namedtuples, with index value as first element of the tuple.
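The two row iterators can be contrasted on a small frame (labels and values are made up for this sketch). `iterrows` yields a Series per row, while `itertuples` yields a namedtuple whose first field is `Index`.

```python
import pandas as pd

df = pd.DataFrame({"a": [1, 5], "c": [3, 7]}, index=["r0", "r1"])

# iterrows(): (index label, row as Series) pairs.
from_rows = [(label, int(row["a"])) for label, row in df.iterrows()]

# itertuples(): namedtuples with fields Index, a, c.
from_tuples = [(t.Index, int(t.a), int(t.c)) for t in df.itertuples()]

print(from_rows)
print(from_tuples)
```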
    join(self, other, on=None, how='left', lsuffix='', rsuffix='', sort=False)
        Join columns with other DataFrame either on index or on a key column. Efficiently Join multiple DataFrame objects by index at once by passing a list.
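A minimal sketch of index-based joining, assuming three small frames with partly overlapping index labels (`left`, `right` and `extra` are illustrative names):

```python
import pandas as pd

# Hypothetical frames indexed by key labels.
left = pd.DataFrame({"a": [1, 2]}, index=["k1", "k2"])
right = pd.DataFrame({"b": [10, 30]}, index=["k1", "k3"])
extra = pd.DataFrame({"d": [100, 200]}, index=["k1", "k2"])

# Default how='left': every row of left is kept; unmatched cells become NaN.
joined = left.join(right)

# how='inner': only index labels present in both frames survive.
both = left.join(right, how="inner")

# Passing a list joins several frames by index in one call.
many = left.join([right, extra])

print(joined, both, many, sep="\n")
```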
  • Original article: https://www.cnblogs.com/windyrainy/p/10949421.html