  • The Problem of Overfitting

    Solutions: select features in advance; regularization.

    Consider the problem of predicting y from x ∈ R. The leftmost figure below shows the result of fitting y = θ0 + θ1x to a dataset. We see that the data does not really lie on a straight line, so the fit is not very good.

    Underfitting, or high bias, is when the form of our hypothesis function h maps poorly to the trend of the data. It is usually caused by a function that is too simple or uses too few features. At the other extreme, overfitting, or high variance, is caused by a hypothesis function that fits the available data but does not generalize well to predict new data. It is usually caused by a complicated function that creates a lot of unnecessary curves and angles unrelated to the data.
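
The contrast between underfitting and overfitting can be sketched with polynomial fits of increasing degree. This is a minimal illustration, not part of the original notes: the synthetic quadratic dataset, the noise level, the degrees 1, 2 and 9, and the helper names `make_data` and `fit_and_score` are all assumptions chosen for the example.

```python
import numpy as np

def make_data(n, rng):
    """Synthetic data (assumed for illustration): a quadratic trend plus noise."""
    x = rng.uniform(0.0, 1.0, size=n)
    y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(0.0, 0.1, size=n)
    return x, y

def fit_and_score(degree, x_train, y_train, x_test, y_test):
    """Least-squares polynomial fit; returns (training MSE, held-out MSE)."""
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)
    return train_mse, test_mse

rng = np.random.default_rng(0)
x_train, y_train = make_data(12, rng)    # small training set
x_test, y_test = make_data(200, rng)     # fresh data the model never saw

# Degree 1 is too simple for a curved trend (high bias);
# degree 9 has nearly as many parameters as training points (high variance).
results = {d: fit_and_score(d, x_train, y_train, x_test, y_test) for d in (1, 2, 9)}

for degree, (train_mse, test_mse) in results.items():
    print(f"degree={degree}  train MSE={train_mse:.4f}  held-out MSE={test_mse:.4f}")
```

Because each higher-degree model contains the lower-degree ones, the training error can only fall as the degree rises. The held-out error is what shows whether the extra flexibility captured the trend or just the noise.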

    This terminology is applied to both linear and logistic regression. There are two main options to address the issue of overfitting:

    1) Reduce the number of features:

    • Manually select which features to keep.
    • Use a model selection algorithm (studied later in the course).

    2) Regularization

    • Keep all the features, but reduce the magnitude of parameters θj.
    • Regularization works well when we have a lot of slightly useful features.
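
The notes above do not give a formula for regularization. The sketch below assumes one common choice, an L2 (ridge) penalty λ·Σ θj² over j ≥ 1 added to the squared-error cost and solved in closed form. The function name `ridge_fit`, the synthetic data, and the λ values are hypothetical, chosen only to show the parameters shrinking while every feature is kept.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Solve (X^T X + lam * L) theta = X^T y.

    L is the identity with a zero in the top-left entry, so the
    intercept theta_0 is left unpenalized (an assumed convention).
    """
    L = np.eye(X.shape[1])
    L[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + lam * L, X.T @ y)

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=12)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(0.0, 0.1, size=12)

# Keep all ten polynomial features: columns 1, x, x^2, ..., x^9.
X = np.vander(x, N=10, increasing=True)

lams = [1e-6, 1e-3, 1.0]
norms = []
for lam in lams:
    theta = ridge_fit(X, y, lam)
    norms.append(np.linalg.norm(theta[1:]))   # size of the penalized parameters
    print(f"lambda={lam:g}  ||theta[1:]|| = {norms[-1]:.3f}")
```

A larger λ trades some training fit for smaller parameters, which flattens the unnecessary curves of a high-degree fit without removing any features. Option 1 would correspond to passing a smaller `N` to `np.vander` instead.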
  • Original article: https://www.cnblogs.com/ne-zha/p/7388149.html