Unique values of a Series object in Pandas
The unique() function returns the unique values of a Series object.
Unique values are returned in order of appearance. The method is hash table-based and therefore does NOT sort.
The result is a NumPy array. For an extension-array backed Series, a new ExtensionArray of that type containing only the unique values is returned instead.
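To make the ordering behaviour concrete, the following minimal sketch compares Series.unique() with NumPy's np.unique(), which does sort its result. The sample values and the variable names (s, appearance_order, sorted_order) are illustrative assumptions, not part of the original page.

```python
import numpy as np
import pandas as pd

# Illustrative data: 3 appears first, then 1, then 2.
s = pd.Series([3, 1, 3, 2, 1])

# Series.unique() keeps the order in which values first appear.
appearance_order = s.unique()

# np.unique() returns the sorted unique values instead.
sorted_order = np.unique(s.to_numpy())

print(appearance_order.tolist())  # [3, 1, 2]
print(sorted_order.tolist())      # [1, 2, 3]
```

If a sorted result is needed, sorting the output of unique() or calling np.unique() directly are both reasonable options.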
Syntax:
Series.unique()
Returns: ndarray or ExtensionArray
The unique values returned as a NumPy array. See Notes.
Notes: Returns the unique values as a NumPy array. For an extension-array backed Series, a new ExtensionArray of that type containing only the unique values is returned. This includes:
- Categorical
- Period
- Datetime with Timezone
- Interval
- Sparse
- IntegerNA
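The return types described in the notes can be checked directly. The sketch below is a rough illustration that covers only three of the cases (a plain integer Series, a Categorical, and the nullable "Int64" dtype); the variable names are assumptions chosen for this example.

```python
import numpy as np
import pandas as pd

# Plain NumPy-backed Series: unique() returns an ndarray.
plain = pd.Series([2, 4, 3, 3]).unique()

# Categorical-backed Series: unique() returns a Categorical.
categorical = pd.Series(pd.Categorical(list("qppqr"))).unique()

# Nullable integer Series: unique() returns an IntegerArray,
# and the missing value counts as one distinct entry.
nullable = pd.Series([1, 1, None], dtype="Int64").unique()

print(type(plain).__name__)
print(type(categorical).__name__)
print(type(nullable).__name__)
```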
Examples
In [1]:
import numpy as np
import pandas as pd
In [2]:
pd.Series([2, 4, 3, 3], name='P').unique()
Out[2]:
In [3]:
pd.Series([pd.Timestamp('2019-01-01') for _ in range(3)]).unique()
Out[3]:
In [4]:
pd.Series([pd.Timestamp('2019-01-01', tz='US/Eastern')
           for _ in range(3)]).unique()
Out[4]:
An unordered Categorical will return categories in the order of appearance.
In [5]:
pd.Series(pd.Categorical(list('qppqr'))).unique()
Out[5]:
An ordered Categorical preserves the category ordering.
In [6]:
pd.Series(pd.Categorical(list('qppqr'), categories=list('pqr'),
                         ordered=True)).unique()
Out[6]:
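As a small sketch of what "preserves the category ordering" means in practice, the snippet below inspects the result of the ordered Categorical example. The name result is an assumption for illustration. The unique values still come back in order of appearance, while the declared ordering p < q < r stays attached to the result.

```python
import pandas as pd

# Same data as the ordered Categorical example, with the ordering p < q < r.
result = pd.Series(
    pd.Categorical(list("qppqr"), categories=list("pqr"), ordered=True)
).unique()

# Unique values appear in the order they were first seen.
print(list(result))             # ['q', 'p', 'r']

# The category ordering is kept on the returned Categorical.
print(result.ordered)           # True
print(list(result.categories))  # ['p', 'q', 'r']
```

Because the ordering survives, order-aware operations such as result.min() and result.max() remain available on the returned Categorical.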