  • 【338】Pandas.DataFrame

    Ref: Pandas Tutorial: DataFrames in Python

    Ref: pandas.DataFrame

    Ref: Pandas: basic operations on DataFrame objects


    Ref: Creating, reading, and writing reference

    • pandas.DataFrame()
    • pandas.Series()
    • pandas.read_csv()
    • pandas.DataFrame.shape
    • pandas.DataFrame.head
    • pandas.read_excel()
    • pandas.DataFrame.to_csv()
    • pandas.DataFrame.to_excel()
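
    As a rough illustration of how the functions listed above fit together, the sketch below builds a small frame, inspects it, and round-trips it through CSV. The column names and values are made up for the example, and an in-memory StringIO buffer stands in for a real file path. Note that to_csv() is a method of the DataFrame, while read_csv() is a top-level pandas function.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample data: a dict of columns becomes a DataFrame.
df = pd.DataFrame({"name": ["a", "b", "c"], "score": [90, 85, 77]})

print(df.shape)    # (number of rows, number of columns)
print(df.head(2))  # only the first two rows

# Write to an in-memory buffer instead of a file on disk, then read it back.
buffer = StringIO()
df.to_csv(buffer, index=False)  # index=False leaves the row labels out of the CSV
buffer.seek(0)
restored = pd.read_csv(buffer)
print(restored)
```

    read_excel() and to_excel() follow the same pattern, but they need an Excel engine such as openpyxl to be installed, so they are left out here.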

    Ref: Indexing, selecting, assigning reference

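    • pandas.DataFrame.iloc[]: position-based indexing, similar to addressing a cell by row and column number in Excel; the frame is treated as a matrix
    • pandas.DataFrame.loc[]: label-based indexing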
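
    The difference between the two indexers is easiest to see on a tiny frame. This is a minimal sketch; the row and column labels are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]],
                  index=["Row1", "Row2"],
                  columns=["Col1", "Col2"])

# iloc uses integer positions, like matrix coordinates (0-based).
by_position = df.iloc[0, 1]        # first row, second column

# loc uses the row and column labels.
by_label = df.loc["Row1", "Col2"]  # the same cell, addressed by name

first_col = df.iloc[:, 0]          # whole first column, as a Series
row2 = df.loc["Row2"]              # whole row labelled "Row2", as a Series

print(by_position, by_label)
print(first_col)
print(row2)
```

    Both are accessed with square brackets, not called like functions.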

    1. Basic concepts

    class pandas.DataFrame(data=None, index=None, columns=None, dtype=None, copy=False)
    Parameters:

    data : the main body of the data; numpy ndarray (structured or homogeneous), dict, or DataFrame

    Dict can contain Series, arrays, constants, or list-like objects

    Changed in version 0.23.0: If data is a dict, argument order is maintained for Python 3.6 and later.

    index : row labels, default 0, 1, 2, ..., n; Index or array-like

    Index to use for resulting frame. Will default to RangeIndex if no indexing information part of input data and no index provided

    columns : column labels, default 0, 1, 2, ..., n; Index or array-like

    Column labels to use for resulting frame. Will default to RangeIndex (0, 1, 2, …, n) if no column labels are provided

    dtype : data type; dtype, default None

    Data type to force. Only a single dtype is allowed. If None, infer

    copy : boolean, default False

    Copy data from inputs. Only affects DataFrame / 2d ndarray input
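
    The parameters above can be tried out in a few lines. The sketch below is only an illustration with made-up values: it passes a dict and an ndarray as data, relies on the default labels, and forces a dtype.

```python
import numpy as np
import pandas as pd

# data as a dict of list-likes: the keys become the column labels.
from_dict = pd.DataFrame({"Col1": [1, 3], "Col2": [2, 4]},
                         index=["Row1", "Row2"])

# data as a 2-D ndarray with no index/columns given:
# both default to a RangeIndex (0, 1, ...).
from_array = pd.DataFrame(np.array([[1, 2], [3, 4]]))

# dtype forces one single type onto every column.
as_float = pd.DataFrame([[1, 2], [3, 4]], dtype="float64")

print(from_dict)
print(from_array)
print(as_float.dtypes)
```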

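    In the array below, data[1:, 0] is the first column without the header cell (the row labels), data[0, 1:] is the first row without its first cell (the column labels), and data[1:, 1:] is the remaining block of values.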

    >>> import numpy as np
    >>> import pandas as pd
    >>> data = np.array([
    ...     ['', 'Col1', 'Col2'],
    ...     ['Row1', 1, 2],
    ...     ['Row2', 3, 4],
    ... ])
    >>> print(pd.DataFrame(data=data[1:, 1:],
    ...                    index=data[1:, 0],
    ...                    columns=data[0, 1:]))
         Col1 Col2
    Row1    1    2
    Row2    3    4
    

    or

    >>> data = np.array([
    ...     [1, 2],
    ...     [3, 4],
    ... ])
    >>> print(pd.DataFrame(data=data,
    ...                    index=['Row1', 'Row2'],
    ...                    columns=['Col1', 'Col2']))
          Col1  Col2
    Row1     1     2
    Row2     3     4
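
    One caveat about the first variant: a NumPy array holds a single element type, so mixing labels and numbers in one array turns every element into a string. The frame prints the same, but its values are text, not integers. The sketch below shows this and one possible fix, using astype(int).

```python
import numpy as np
import pandas as pd

data = np.array([
    ['', 'Col1', 'Col2'],
    ['Row1', 1, 2],
    ['Row2', 3, 4],
])

df = pd.DataFrame(data=data[1:, 1:], index=data[1:, 0], columns=data[0, 1:])
print(df.dtypes)          # text columns, because the array was all strings

# Convert every column to integers before doing arithmetic.
df_int = df.astype(int)
print(df_int.dtypes)
print(df_int.sum())       # column sums now work numerically
```

    The second variant avoids the problem, because the array contains only numbers and the labels are passed separately.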
    

    Ref: Using pandas DataFrame.apply() to process a row/column and produce a new row/column

    Ref: Iterating over DataFrame rows in pandas

    Ref: pandas.DataFrame.apply


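    2. Related methods:

    DataFrame.apply(func, axis=0, broadcast=None, raw=False, reduce=None, result_type=None, args=(), **kwds)

    (Signature as of pandas 0.23. The broadcast and reduce arguments were deprecated in that release and removed in pandas 1.0.)

    Apply a function along an axis of the DataFrame. (Similar to applying some function to a whole column or a whole row of data in Excel.)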

    Objects passed to the function are Series objects whose index is either the DataFrame's index (axis=0) or the DataFrame's columns (axis=1).
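
    A small sketch of what the axis argument changes, on an illustrative frame. With axis=0 the function receives each column as a Series; with axis=1 it receives each row. The "Total" column name is made up for the example.

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4]],
                  index=["Row1", "Row2"],
                  columns=["Col1", "Col2"])

# axis=0: the function is called once per column.
col_sums = df.apply(lambda col: col.sum(), axis=0)

# axis=1: the function is called once per row.
row_sums = df.apply(lambda row: row.sum(), axis=1)

# A common use: derive a new column from each row.
df["Total"] = df.apply(lambda row: row["Col1"] + row["Col2"], axis=1)

print(col_sums)
print(row_sums)
print(df)
```

    For simple arithmetic like this, df.sum(axis=...) or df["Col1"] + df["Col2"] does the same job; apply() is more useful when the per-row or per-column logic has no built-in equivalent.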

    Ref: pandas.Series.value_counts

    Series.value_counts(normalize=False, sort=True, ascending=False, bins=None, dropna=True)

    Returns object containing counts of unique values.

    The resulting object will be in descending order so that the first element is the most frequently occurring element. Excludes NA values by default.
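
    A minimal sketch of value_counts() on a made-up Series, showing the default behaviour and the dropna and normalize options:

```python
import numpy as np
import pandas as pd

s = pd.Series(["a", "b", "a", "c", "a", "b", np.nan])

# Default: counts sorted from most to least frequent, NaN left out.
counts = s.value_counts()

# dropna=False also counts the missing value.
with_na = s.value_counts(dropna=False)

# normalize=True returns relative frequencies instead of counts.
shares = s.value_counts(normalize=True)

print(counts)
print(with_na)
print(shares)
```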

    pandas.read_csv(): a string can be wrapped in StringIO() to turn it into a file-like buffer, which can then be passed directly to this function.

    >>> from io import StringIO
    >>> a = '''
    A, B, C
    1,2,3
    4,5,6
    7,8,9
    '''
    >>> a
    '\nA, B, C\n1,2,3\n4,5,6\n7,8,9\n'
    >>> data = pd.read_csv(StringIO(a))
    >>> data
       A   B   C
    0  1   2   3
    1  4   5   6
    2  7   8   9
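
    One detail in the example above is easy to miss: the header line is "A, B, C" with spaces after the commas, so the parsed column names are 'A', ' B' and ' C' (with a leading space), which is why the printed columns look widely spaced. The sketch below shows this and one way to avoid it, using the skipinitialspace option of read_csv().

```python
from io import StringIO

import pandas as pd

a = '''
A, B, C
1,2,3
4,5,6
7,8,9
'''

# The leading blank line is skipped by default; the header keeps its spaces.
raw = pd.read_csv(StringIO(a))
print(list(raw.columns))

# skipinitialspace=True drops the spaces that follow each delimiter.
clean = pd.read_csv(StringIO(a), skipinitialspace=True)
print(list(clean.columns))
print(clean["B"])
```

    With the raw frame, data["B"] would raise a KeyError; the column would have to be addressed as data[" B"].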
    

  • Original article: https://www.cnblogs.com/alex-bn-lee/p/9951877.html