Understanding the pandas.DataFrame object type
df = pd.DataFrame([[1,"2",3,4],[5,"6",7,8]],columns=["a","b","c","d"])
Method walkthrough
1. add() method: element-wise addition (the elements being added must be of the same data type)
add(self, other, axis='columns', level=None, fill_value=None)
    Addition of dataframe and other, element-wise (binary operator `add`).

    Equivalent to ``dataframe + other``, but with support to substitute a fill_value for
    missing data in one of the inputs.

    Parameters
    ----------
    other : Series, DataFrame, or constant
    axis : {0, 1, 'index', 'columns'}
        For Series input, axis to match Series index on
    level : int or name
        Broadcast across a level, matching Index values on the
        passed MultiIndex level
    fill_value : None or float value, default None
        Fill existing missing (NaN) values, and any new element needed for
        successful DataFrame alignment, with this value before computation.
        If data in both corresponding DataFrame locations is missing
        the result will be missing
example:
output:
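The difference between the `+` operator and add() with fill_value is easiest to see on two frames whose columns only partly overlap. The sketch below is illustrative only: the frames `left` and `right` and the Series `row_offsets` are made-up names and data, not taken from the original post.

```python
import numpy as np
import pandas as pd

# Hypothetical frames: they share column "a" but not "b" or "c".
left = pd.DataFrame({"a": [1, 2], "b": [3, np.nan]})
right = pd.DataFrame({"a": [10, 20], "c": [30, 40]})

# Plain "+" aligns on labels; any cell missing on either side becomes NaN.
plain = left + right

# add() with fill_value substitutes 0 for a cell missing on ONE side only.
# A cell missing on both sides (row 1 of "b") stays missing.
filled = left.add(right, fill_value=0)

# For a Series operand, axis="index" matches the Series index against the rows.
row_offsets = pd.Series([100, 200])
shifted = left.add(row_offsets, axis="index")

print(plain)
print(filled)
print(shifted)
```

With plain `+`, columns "b" and "c" come out entirely NaN because each exists in only one frame; with fill_value=0 they keep the values from the frame that has them.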
2. aggregate() method: agg() is a shorthand alias
aggregate(self, func, axis=0, *args, **kwargs)
    Aggregate using one or more operations over the specified axis.

    .. versionadded:: 0.20.0

    Parameters
    ----------
    func : function, string, dictionary, or list of string/functions
        Function to use for aggregating the data. If a function, must either
        work when passed a DataFrame or when passed to DataFrame.apply. For
        a DataFrame, can pass a dict, if the keys are DataFrame column names.

        Accepted combinations are:

        - string function name.
        - function.
        - list of functions.
        - dict of column names -> functions (or list of functions).
example:
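import pandas as pd
import numpy as np

# A Series mixing int and str values gets dtype object
ds = pd.Series([11, "2", 13, 14])
print(ds, "\n")

# Column "b" holds strings, so its "sum" is string concatenation ("2" + "6" -> "26")
df = pd.DataFrame([[1, "2", 3, 4], [5, "6", 7, 8]], columns=["a", "b", "c", "d"])
print(df, "\n")

# Apply several aggregations to every column
print(df.agg(['sum', 'min']))
# Apply aggregations to selected columns only
print(df.agg({"a": ['sum', 'min']}))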
output:
0    11
1     2
2    13
3    14
dtype: object

   a  b  c  d
0  1  2  3  4
1  5  6  7  8

     a   b   c   d
sum  6  26  10  12
min  1   2   3   4
     a
sum  6
min  1
Commonly used aggregation functions (`mean`, `median`, `prod`, `sum`, `std`, `var`)
mad(self, axis=None, skipna=None, level=None)
    Return the mean absolute deviation of the values for the requested axis.
max(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
    This method returns the maximum of the values in the object. If you want the *index* of the maximum, use ``idxmax``. This is the equivalent of the ``numpy.ndarray`` method ``argmax``.
mean(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
    Return the mean of the values for the requested axis.
median(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
    Return the median of the values for the requested axis.
min(self, axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
    This method returns the minimum of the values in the object.
memory_usage(self, index=True, deep=False)
    Return the memory usage of each column in bytes.
merge(self, right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=False, suffixes=('_x', '_y'), copy=True, indicator=False, validate=None)
    Merge DataFrame objects by performing a database-style join operation by columns or indexes.
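As a rough illustration of the reductions and of merge() listed above, the following sketch uses two made-up frames, `scores` and `names`; the column names and values are assumptions for demonstration only. It sticks to mean, median, max, idxmax and merge, which should behave the same way across pandas versions.

```python
import pandas as pd

# Hypothetical data for illustration.
scores = pd.DataFrame({"id": [1, 2, 3], "math": [90, 70, 80], "art": [60, 100, 80]})
numeric = scores[["math", "art"]]

col_mean = numeric.mean()            # default axis=0: one value per column
row_median = numeric.median(axis=1)  # axis=1: one value per row
col_max = numeric.max()              # the maximum values themselves
col_idxmax = numeric.idxmax()        # the row labels where each maximum occurs

# Database-style inner join on the shared "id" column:
# only ids present in both frames survive.
names = pd.DataFrame({"id": [1, 2, 4], "name": ["Ann", "Bob", "Dan"]})
merged = scores.merge(names, on="id", how="inner")

print(col_mean)
print(row_median)
print(col_idxmax)
print(merged)
```

Note that max() answers "how large" while idxmax() answers "where", which is the distinction the max() docstring points at.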
align(self, other, join='outer', axis=None, level=None, copy=True, fill_value=None, method=None, limit=None, fill_axis=0, broadcast_axis=None)
    Align two objects on their axes with the specified join method for each axis Index.
all(self, axis=None, bool_only=None, skipna=None, level=None, **kwargs)
    Return whether all elements are True over series or dataframe axis.
any(self, axis=None, bool_only=None, skipna=None, level=None, **kwargs)
    Return whether any element is True over requested axis.
apply(self, func, axis=0, broadcast=None, raw=False, reduce=None, result_type=None, args=(), **kwds)
    Apply a function along an axis of the DataFrame.
applymap(self, func)
    Apply a function to a DataFrame elementwise. This method applies a function that accepts and returns a scalar to every element of a DataFrame.
append(self, other, ignore_index=False, verify_integrity=False, sort=None)
    Append rows of `other` to the end of this frame, returning a new object. Columns not in this frame are added as new columns.
assign(self, **kwargs)
    Assign new columns to a DataFrame, returning a new object (a copy) with the new columns added to the original ones. Existing columns that are re-assigned will be overwritten.
insert(self, loc, column, value, allow_duplicates=False)
    Insert column into DataFrame at specified location.
combine(self, other, func, fill_value=None, overwrite=True)
    Add two DataFrame objects and do not propagate NaN values, so if for a (column, time) one frame is missing a value, it will default to the other frame's value (which might be NaN as well).
count(self, axis=0, level=None, numeric_only=False)
    Count non-NA cells for each column or row.
cov(self, min_periods=None)
    Compute pairwise covariance of columns, excluding NA/null values.
drop(self, labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
    Drop specified labels from rows or columns.
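A few of the methods above differ in whether they return a new object or modify the frame in place, which a short sketch can make concrete. The frame below is a numeric variant of the post's `df` chosen for illustration, and the column name `ratio` is made up. Because append() was removed in pandas 2.0, the sketch uses pd.concat for stacking rows instead.

```python
import pandas as pd

# Illustrative frame (numeric columns only).
df = pd.DataFrame({"a": [1, 5], "c": [3, 7], "d": [4, 8]})

# apply(): axis=0 passes each column to the function, axis=1 passes each row.
col_range = df.apply(lambda col: col.max() - col.min())
row_total = df.apply(lambda row: row.sum(), axis=1)

# assign() returns a new frame; df itself is left unchanged.
with_ratio = df.assign(ratio=df["d"] / df["a"])

# insert() modifies the frame in place, so work on a copy here.
with_b = df.copy()
with_b.insert(1, "b", ["2", "6"])

# drop() returns a new frame without the given labels (inplace=False by default).
dropped = with_b.drop(columns=["b"])

# all() / any() reduce a boolean frame column by column.
all_positive = (df > 0).all()
any_gt7 = (df > 7).any()

# Stacking rows with pd.concat, the replacement for the removed append().
stacked = pd.concat([df, df], ignore_index=True)

print(col_range)
print(row_total)
print(with_ratio)
print(with_b)
```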
drop_duplicates(self, subset=None, keep='first', inplace=False)
    Return DataFrame with duplicate rows removed, optionally only considering certain columns.
dropna(self, axis=0, how='any', thresh=None, subset=None, inplace=False)
    Remove missing values.
duplicated(self, subset=None, keep='first')
    Return boolean Series denoting duplicate rows, optionally only considering certain columns.
eq(self, other, axis='columns', level=None)
    Wrapper for flexible comparison methods eq
eval(self, expr, inplace=False, **kwargs)
    Evaluate a string describing operations on DataFrame columns.
fillna(self, value=None, method=None, axis=None, inplace=False, limit=None, downcast=None, **kwargs)
    Fill NA/NaN values using the specified method.
ge(self, other, axis='columns', level=None)
    Wrapper for flexible comparison methods ge
gt(self, other, axis='columns', level=None)
    Wrapper for flexible comparison methods gt
le(self, other, axis='columns', level=None)
    Wrapper for flexible comparison methods le
lt(self, other, axis='columns', level=None)
    Wrapper for flexible comparison methods lt
get_value(self, index, col, takeable=False)
    Quickly retrieve single value at passed column and index.
info(self, verbose=None, buf=None, max_cols=None, memory_usage=None, null_counts=None)
    Print a concise summary of a DataFrame.
isin(self, values)
    Return boolean DataFrame showing whether each element in the DataFrame is contained in values.
isna(self)
    Detect missing values. Return a boolean same-sized object indicating if the values are NA.
isnull(self)
    Detect missing values. Return a boolean same-sized object indicating if the values are NA.
iteritems(self)
    Iterator over (column name, Series) pairs.
iterrows(self)
    Iterate over DataFrame rows as (index, Series) pairs.
itertuples(self, index=True, name='Pandas')
    Iterate over DataFrame rows as namedtuples, with index value as first element of the tuple.
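The missing-value, duplicate and iteration helpers above are often used together when cleaning data. The sketch below runs them on a small made-up frame (columns `k` and `v` are assumptions for illustration). It leaves out get_value() and iteritems(), which are no longer available in recent pandas releases.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one duplicate row (rows 0 and 1) and one missing value.
df = pd.DataFrame({"k": ["x", "x", "y", "z"], "v": [1.0, 1.0, np.nan, 4.0]})

missing = df.isna()                    # same shape as df, True where a value is NA
no_na = df.dropna()                    # how='any': drop rows containing any NA
filled = df.fillna({"v": 0.0})         # per-column fill values via a dict

dup_mask = df.duplicated()             # True for the second and later copies
unique_rows = df.drop_duplicates()     # keep='first' by default

in_set = df.isin({"k": ["x", "z"]})    # dict form: check only the listed columns
at_least_one = df[["v"]].ge(1.0)       # flexible ">="; NaN compares as False
doubled = df.eval("v * 2")             # expression over column names

# itertuples() yields namedtuples whose first field is the index value.
pairs = [(row.Index, row.k) for row in df.itertuples()]

# iterrows() yields (index, Series) pairs.
labels = [f"{idx}:{row['k']}" for idx, row in df.iterrows()]

print(no_na)
print(unique_rows)
print(pairs)
```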
join(self, other, on=None, how='left', lsuffix='', rsuffix='', sort=False)
    Join columns with other DataFrame either on index or on a key column. Efficiently join multiple DataFrame objects by index at once by passing a list.
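The three ways of calling join() described above (on a key column, by index with a list of frames, and with suffixes for overlapping column names) can be sketched as follows. The frames `left`, `right` and `extra` and their contents are made up for illustration.

```python
import pandas as pd

# Hypothetical frames: "right" and "extra" are indexed by the key values.
left = pd.DataFrame({"key": ["k0", "k1", "k2"], "a": [1, 2, 3]})
right = pd.DataFrame({"b": [10, 20]}, index=["k0", "k1"])
extra = pd.DataFrame({"c": [100, 300]}, index=["k0", "k2"])

# on="key": match left's "key" column against right's index (left join by default),
# so "k2" has no partner and gets NaN in column "b".
on_key = left.join(right, on="key")

# Passing a list joins several frames by index in one call.
by_index = left.set_index("key").join([right, extra])

# Overlapping column names need lsuffix / rsuffix to stay distinguishable.
overlap = left.join(left, lsuffix="_l", rsuffix="_r")

print(on_key)
print(by_index)
print(overlap)
```

Compared with merge(), join() defaults to a left join on the index, so it tends to be the shorter call when the lookup frame is already indexed by the key.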