  • [Translation] What is the difference between iloc and loc in pandas?

    loc gets rows (or columns) with particular labels from the index.
    iloc gets rows (or columns) at particular positions in the index (so it only takes integers).
    ix usually tries to behave like loc but falls back to behaving like iloc if a label is not present in the index.
    It's important to note some subtleties that can make ix slightly tricky to use:

    if the index is of integer type, ix will only use label-based indexing and not fall back to position-based indexing. If the label is not in the index, an error is raised.

    if the index does not contain only integers, then given an integer, ix will immediately use position-based indexing rather than label-based indexing. If however ix is given another type (e.g. a string), it can use label-based indexing.
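
The label/position distinction can be checked with loc and iloc alone (ix was deprecated in pandas 0.20.0 and removed in 1.0, so it is left out here). A minimal sketch, assuming a small made-up Series called prices that is not part of the original example:

```python
import pandas as pd

# Hypothetical Series with string labels, used only for illustration.
prices = pd.Series([10, 20, 30], index=["a", "b", "c"])

by_label = prices.loc["b"]      # row whose label is "b"
by_position = prices.iloc[1]    # row at integer position 1

label_slice = prices.loc["a":"b"]    # label slices include the end label
position_slice = prices.iloc[0:1]    # position slices exclude the end position

print(by_label, by_position)
print(list(label_slice.index), list(position_slice.index))
```

Both lookups reach the same row here, but the two slices differ in length because only the label-based slice includes its endpoint.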

    To illustrate the differences between the three methods, consider the following Series:

    >>> import numpy as np
    >>> import pandas as pd
    >>> s = pd.Series(np.nan, index=[49,48,47,46,45, 1, 2, 3, 4, 5])
    >>> s
    49   NaN
    48   NaN
    47   NaN
    46   NaN
    45   NaN
    1    NaN
    2    NaN
    3    NaN
    4    NaN
    5    NaN
    

    We'll look at slicing with the integer value 3.

    In this case, s.iloc[:3] returns us the first 3 rows (since it treats 3 as a position) and s.loc[:3] returns us the first 8 rows (since it treats 3 as a label):

    >>> s.iloc[:3] # slice the first three rows
    49   NaN
    48   NaN
    47   NaN
    
    >>> s.loc[:3] # slice up to and including label 3
    49   NaN
    48   NaN
    47   NaN
    46   NaN
    45   NaN
    1    NaN
    2    NaN
    3    NaN
    
    >>> s.ix[:3] # the integer is in the index so s.ix[:3] works like loc
    49   NaN
    48   NaN
    47   NaN
    46   NaN
    45   NaN
    1    NaN
    2    NaN
    3    NaN
    

    Notice s.ix[:3] returns the same Series as s.loc[:3] since it looks for the label first rather than working on the position (and the index for s is of integer type).
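
The loc and iloc parts of this comparison can be reproduced as a standalone script. This is only a sketch: the variable names first_three and up_to_label_3 are chosen here for illustration, and ix is omitted because it no longer exists in pandas 1.0 and later.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.nan, index=[49, 48, 47, 46, 45, 1, 2, 3, 4, 5])

first_three = s.iloc[:3]     # positions 0, 1 and 2
up_to_label_3 = s.loc[:3]    # every row up to and including the label 3

print(list(first_three.index))
print(list(up_to_label_3.index))
```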

    What if we try with an integer label that isn't in the index (say 6)?

    Here s.iloc[:6] returns the first 6 rows of the Series as expected. However, s.loc[:6] raises a KeyError since 6 is not in the index.

    >>> s.iloc[:6]
    49   NaN
    48   NaN
    47   NaN
    46   NaN
    45   NaN
    1    NaN
    
    >>> s.loc[:6]
    KeyError: 6
    
    >>> s.ix[:6]
    KeyError: 6

    As per the subtleties noted above, s.ix[:6] now raises a KeyError because it tries to work like loc but can't find a 6 in the index. Because our index is of integer type, ix doesn't fall back to behaving like iloc.
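
The missing-label case for loc can be handled explicitly with a try/except. The following is a sketch rather than a recommended pattern; the flag name raised and the variable first_six are illustrative only.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.nan, index=[49, 48, 47, 46, 45, 1, 2, 3, 4, 5])

# The label 6 is absent from this unsorted index, so the label-based
# slice cannot locate its endpoint.
try:
    s.loc[:6]
    raised = False
except KeyError:
    raised = True

# The position-based slice never consults the labels.
first_six = s.iloc[:6]

print(raised)
print(list(first_six.index))
```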

    If, however, our index was of mixed type, given an integer ix would behave like iloc immediately instead of raising a KeyError:

    >>> s2 = pd.Series(np.nan, index=['a','b','c','d','e', 1, 2, 3, 4, 5])
    >>> s2.index.is_mixed() # index is mix of different types
    True

    >>> s2.ix[:6] # now behaves like iloc given integer
    a   NaN
    b   NaN
    c   NaN
    d   NaN
    e   NaN
    1   NaN

    Keep in mind that ix can still accept non-integers and behave like loc:

    >>> s2.ix[:'c'] # behaves like loc given non-integer
    a   NaN
    b   NaN
    c   NaN

    As general advice, if you're only indexing using labels, or only indexing using integer positions, stick with loc or iloc to avoid unexpected results - try not to use ix.
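
Following that advice, the two ix calls on the mixed-type Series can be written with the indexer that states the intent. A minimal sketch, assuming a pandas version where ix is unavailable; by_position and by_label are illustrative names.

```python
import numpy as np
import pandas as pd

s2 = pd.Series(np.nan, index=["a", "b", "c", "d", "e", 1, 2, 3, 4, 5])

by_position = s2.iloc[:6]   # first six rows, whatever their labels are
by_label = s2.loc[:"c"]     # rows up to and including the label "c"

print(list(by_position.index))
print(list(by_label.index))
```

Spelling out iloc or loc removes the guesswork about which rule applies to an integer on a mixed index.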

    Combining position-based and label-based indexing
    Sometimes given a DataFrame, you will want to mix label and positional indexing methods for the rows and columns.

    For example, consider the following DataFrame. How best to slice the rows up to and including 'c' and take the first four columns?

    >>> df = pd.DataFrame(np.nan, 
                          index=list('abcde'),
                          columns=['x','y','z', 8, 9])
    
    >>> df
        x   y   z   8   9
    a NaN NaN NaN NaN NaN
    b NaN NaN NaN NaN NaN
    c NaN NaN NaN NaN NaN
    d NaN NaN NaN NaN NaN
    e NaN NaN NaN NaN NaN

    In earlier versions of pandas (before 0.20.0) ix lets you do this quite neatly - we can slice the rows by label and the columns by position (note that for the columns, ix will default to position-based slicing since 4 is not a column name):

    >>> df.ix[:'c', :4]
        x   y   z   8
    a NaN NaN NaN NaN
    b NaN NaN NaN NaN
    c NaN NaN NaN NaN

    In later versions of pandas, where ix is deprecated (and removed as of 1.0), we can achieve this result using iloc with the help of another method:

    >>> df.iloc[:df.index.get_loc('c') + 1, :4]
        x   y   z   8
    a NaN NaN NaN NaN
    b NaN NaN NaN NaN
    c NaN NaN NaN NaN

    get_loc() is an index method meaning "get the position of the label in this index". Note that since slicing with iloc is exclusive of its endpoint, we must add 1 to this value if we want row 'c' as well.
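
The get_loc() approach can be run as a script and compared with two other spellings that are not in the original answer: chaining loc and iloc, and passing the first four column labels to loc. Treat this as a sketch; the names stop, via_iloc, via_chain and via_loc are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.nan, index=list("abcde"), columns=["x", "y", "z", 8, 9])

# Convert the row label to a position, then add 1 because iloc slices
# exclude their endpoint.
stop = df.index.get_loc("c") + 1
via_iloc = df.iloc[:stop, :4]

# Alternative: slice rows by label first, then columns by position.
via_chain = df.loc[:"c"].iloc[:, :4]

# Alternative: turn the column positions into labels and use loc only.
via_loc = df.loc[:"c", df.columns[:4]]

print(via_iloc.shape)
print(list(via_iloc.columns))
```

All three select rows 'a' through 'c' and the first four columns; which one reads best is largely a matter of taste.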

    There are further examples in pandas' documentation.

    Note:
    https://stackoverflow.com/questions/31593201/pandas-iloc-vs-ix-vs-loc-explanation-how-are-they-different

  • Original article: https://www.cnblogs.com/everfight/p/iloc_vs_loc.html