zoukankan      html  css  js  c++  java
  • pandas对时间列分组求diff遇到的问题

    例子:

    df = pd.DataFrame()
    df['A'] = [1, 1, 2]
    df['B'] = [datetime.date(2018, 1, 2), datetime.date(2018, 1, 3), datetime.date(2018, 1, 3)]
    df['C'] = df.groupby('A').B.diff()
    df['C'] = df.C.dt.days
    

     

    报错:

    Traceback (most recent call last):
      File "D:python_virtualenvcommonlibsite-packagespandas-0.20.3-py3.6-win-amd64.eggpandascoreseries.py", line 2820, in _make_dt_accessor
        return maybe_to_datetimelike(self)
      File "D:python_virtualenvcommonlibsite-packagespandas-0.20.3-py3.6-win-amd64.eggpandascoreindexesaccessors.py", line 84, in maybe_to_datetimelike
        "datetimelike index".format(type(data)))
    TypeError: cannot convert an object of type <class 'pandas.core.series.Series'> to a datetimelike index
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "D:/学习/pandas_test/pandas_learn_20190102.py", line 49, in <module>
        test2()
      File "D:/学习/pandas_test/pandas_learn_20190102.py", line 32, in test2
        df['C'] = df.C.dt.days
      File "D:python_virtualenvcommonlibsite-packagespandas-0.20.3-py3.6-win-amd64.eggpandascoregeneric.py", line 3077, in __getattr__
        return object.__getattribute__(self, name)
      File "D:python_virtualenvcommonlibsite-packagespandas-0.20.3-py3.6-win-amd64.eggpandascorease.py", line 243, in __get__
        return self.construct_accessor(instance)
      File "D:python_virtualenvcommonlibsite-packagespandas-0.20.3-py3.6-win-amd64.eggpandascoreseries.py", line 2822, in _make_dt_accessor
        raise AttributeError("Can only use .dt accessor with datetimelike "
    AttributeError: Can only use .dt accessor with datetimelike values

    原因:
    分组求diff后的结果是:

    A B C
    0 1 2018-01-02 NaT
    1 1 2018-01-03 1 days 00:00:00
    2 2 2018-01-03 NaN

    类型是:

    A int64
    B object
    C object
    dtype: object

    预想的类型是:

    A int64
    B object
    C timedelta64[ns]
    dtype: object

    解决:
    原本尝试使用astype强制将object列,转成timedelta列

    df['C'] = df.C.astype(pd.Timedelta)

    这句代码不会报错,但是C列的类型不会改变,没有作用。

    最后有两种处理方式:
    提前定义B列为时间列:

    df = pd.DataFrame()
    df['A'] = [1, 1, 2]
    df['B'] = [datetime.date(2018, 1, 2), datetime.date(2018, 1, 3), datetime.date(2018, 1, 3)]
    df.B = pd.to_datetime(df.B)
    df['C'] = df.groupby('A').B.diff()
    df['C'] = df.C.dt.days

    增加类型转换:

    df = pd.DataFrame()
    df['A'] = [1, 1, 2]
    df['B'] = [datetime.date(2018, 1, 2), datetime.date(2018, 1, 3), datetime.date(2018, 1, 3)]
    df['C'] = df.groupby('A').B.diff()
    df['C'] = pd.to_timedelta(df.C, unit='d').dt.days
  • 相关阅读:
    torch.nn.functional中softmax的作用及其参数说明
    from __future__ import包的作用
    python textwrap的使用
    深度学习框架PyTorch一书的学习-第五章-常用工具模块
    python实现命令行解析的argparse的使用
    pytorch visdom可视化工具学习—2—详细使用-1—基本使用函数
    pytorch visdom可视化工具学习—2—详细使用-3-Generic Plots和Others
    pytorch visdom可视化工具学习—2—详细使用-2-plotting绘图
    硬盘没有显示
    Eclipse中syso 快捷键 Alt + / 不能使用的问题
  • 原文地址:https://www.cnblogs.com/hufulinblog/p/10207678.html
Copyright © 2011-2022 走看看