zoukankan      html  css  js  c++  java
  • Splitting Time Series Data into Train/Test/Validation Sets

    Time-series (or other intrinsically ordered data) can be problematic for cross-validation. If some pattern emerges in year 3 and stays for years 4-6, then your model can pick up on it, even though it wasn’t part of years 1 & 2.

    An approach that’s sometimes more principled for time series is forward chaining, where your procedure would be something like this:

    • fold 1 : training [1], test [2]
    • fold 2 : training [1 2], test [3]
    • fold 3 : training [1 2 3], test [4]
    • fold 4 : training [1 2 3 4], test [5]
    • fold 5 : training [1 2 3 4 5], test [6]

    That more accurately models the situation you’ll see at prediction time, where you’ll model on past data and predict on forward-looking data. It also will give you a sense of the dependence of your modeling on data size.

     

     

    REF

    https://machinelearningmastery.com/backtest-machine-learning-models-time-series-forecasting/

    https://stats.stackexchange.com/questions/346907/splitting-time-series-data-into-train-test-validation-sets

    https://stats.stackexchange.com/questions/117350/how-to-split-dataset-for-time-series-prediction

    https://stats.stackexchange.com/questions/453386/working-with-time-series-data-splitting-the-dataset-and-putting-the-model-into

    https://stats.stackexchange.com/questions/14099/using-k-fold-cross-validation-for-time-series-model-selection

    https://community.dataquest.io/t/how-to-split-time-series-data-into-training-and-test-set/4116/2

  • 相关阅读:
    DA_06_iptables 与 firewalld 防火墙
    DA_05_Linux(CentOS6.7) 安装MySql5.7数据库
    DA_04_解决Xshell中文乱码问题
    3.NumPy
    2.NumPy简介
    1.python环境安装
    4.5. scrapy两大爬虫类_Spider
    redis 锦集
    一位资深程序员大牛给予Java初学者的学习路线建议
    idea 使用过程中的一些设置记录
  • 原文地址:https://www.cnblogs.com/emanlee/p/14456435.html
Copyright © 2011-2022 走看看