  • [Machine Learning] Diagnosing Bias vs. Variance

    In this section we examine the relationship between the degree of the polynomial d and the underfitting or overfitting of our hypothesis.

    • We need to distinguish whether bias or variance is the problem contributing to bad predictions.
    • High bias corresponds to underfitting and high variance to overfitting. Ideally, we want to find a balance between the two.

    The training error will tend to decrease as we increase the degree d of the polynomial.

    At the same time, the cross validation error will tend to decrease as we increase d up to a point, and then it will increase as d is increased, forming a convex curve.
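    The two trends above can be reproduced on synthetic data. The sketch below is only an illustration under assumptions that are not in the original: the data come from a noisy sine curve, the error is plain mean squared error, and the helper names (`solve_linear`, `fit_poly`, `mse`, `make_data`) are made up for this example.

```python
import math
import random

def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_poly(xs, ys, d):
    """Least-squares polynomial of degree d, solved through the normal equations."""
    feats = [[x ** k for k in range(d + 1)] for x in xs]
    ata = [[sum(f[i] * f[j] for f in feats) for j in range(d + 1)]
           for i in range(d + 1)]
    aty = [sum(f[i] * y for f, y in zip(feats, ys)) for i in range(d + 1)]
    return solve_linear(ata, aty)

def predict(w, x):
    return sum(wk * x ** k for k, wk in enumerate(w))

def mse(w, xs, ys):
    """Mean squared error of the polynomial w on the given set."""
    return sum((predict(w, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def make_data(n, rng):
    # Assumed data-generating process: y = sin(3x) plus Gaussian noise.
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [math.sin(3.0 * x) + rng.gauss(0.0, 0.2) for x in xs]
    return xs, ys

rng = random.Random(0)
x_train, y_train = make_data(20, rng)
x_cv, y_cv = make_data(20, rng)

degrees = list(range(1, 7))
train_errors, cv_errors = [], []
for d in degrees:
    w = fit_poly(x_train, y_train, d)  # fit on the training set only
    train_errors.append(mse(w, x_train, y_train))
    cv_errors.append(mse(w, x_cv, y_cv))
    print(f"d={d}  train={train_errors[-1]:.4f}  cv={cv_errors[-1]:.4f}")

# Pick the degree with the lowest cross validation error.
best_d = degrees[cv_errors.index(min(cv_errors))]
print("degree with lowest cross validation error:", best_d)
```

    The training error cannot go up with d in this setup, because every higher-degree fit contains the lower-degree one as a special case. Where the cross validation error bottoms out depends on the data and the random seed.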

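    With high bias, both the training error and the cross validation error are high and close to each other. With high variance, the training error is low and the cross validation error is much higher. Once we know which problem we have, our decision process can be broken down as follows: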

    • Getting more training examples: Fixes high variance
    • Trying smaller sets of features: Fixes high variance
    • Adding features: Fixes high bias
    • Adding polynomial features: Fixes high bias
    • Decreasing λ: Fixes high bias
    • Increasing λ: Fixes high variance
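    The list above can be turned into a small lookup. This is a rough sketch rather than a fixed recipe: the function name `diagnose`, the `target_error` threshold and the three labels are hypothetical choices made for illustration, and a real model can suffer from both problems at once.

```python
# Remedies taken from the list above, keyed by diagnosis.
REMEDIES = {
    "high bias": [
        "add features",
        "add polynomial features",
        "decrease lambda",
    ],
    "high variance": [
        "get more training examples",
        "try a smaller set of features",
        "increase lambda",
    ],
    "ok": [],
}

def diagnose(train_error, cv_error, target_error):
    """Label a model from its training and cross validation errors.

    target_error is an assumed, user-chosen level of acceptable error;
    it is not something the original text defines.
    """
    if train_error > target_error:
        # The model cannot even fit the training set: underfitting.
        return "high bias"
    if cv_error > target_error:
        # Fits the training set but does not generalize: overfitting.
        return "high variance"
    return "ok"

label = diagnose(train_error=0.02, cv_error=0.40, target_error=0.10)
print(label, "->", REMEDIES[label])
```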

    Diagnosing Neural Networks

    • A neural network with fewer parameters is prone to underfitting. It is also computationally cheaper.
    • A large neural network with more parameters is prone to overfitting. It is also computationally expensive. In this case you can use regularization (increase λ) to address the overfitting.

    Using a single hidden layer is a good starting default. You can also train networks with different numbers of hidden layers, evaluate each one on your cross validation set, and select the one that performs best.
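    A minimal sketch of that selection loop follows. Everything specific in it is an assumption made for illustration: the toy task (points inside or outside a circle), the tiny pure-Python network with tanh hidden units and a sigmoid output, 4 units per hidden layer, and the training settings. It compares 1, 2 and 3 hidden layers by their error on the cross validation set.

```python
import math
import random

def init_net(sizes, rng):
    """One (weights, biases) pair per layer for sizes = [n_in, h1, ..., n_out]."""
    return [([[rng.uniform(-1.0, 1.0) for _ in range(n_in)] for _ in range(n_out)],
             [0.0] * n_out)
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def sigmoid(v):
    v = max(-60.0, min(60.0, v))  # clamp to keep exp() from overflowing
    return 1.0 / (1.0 + math.exp(-v))

def forward(net, x):
    """Return the activations of every layer, input included."""
    acts = [x]
    for i, (w, b) in enumerate(net):
        z = [sum(wk * a for wk, a in zip(row, acts[-1])) + bj
             for row, bj in zip(w, b)]
        last = i == len(net) - 1
        acts.append([sigmoid(v) if last else math.tanh(v) for v in z])
    return acts

def train(net, data, epochs, lr, rng):
    """Plain stochastic gradient descent on the cross-entropy loss."""
    data = list(data)  # shuffle a copy, not the caller's list
    for _ in range(epochs):
        rng.shuffle(data)
        for x, y in data:
            acts = forward(net, x)
            delta = [acts[-1][0] - y]  # sigmoid + cross-entropy gradient
            for i in range(len(net) - 1, -1, -1):
                w, b = net[i]
                a_prev = acts[i]
                if i > 0:
                    # Backpropagate through tanh before updating the weights.
                    back = [sum(w[j][k] * delta[j] for j in range(len(delta)))
                            * (1.0 - a_prev[k] ** 2)
                            for k in range(len(a_prev))]
                for j, dj in enumerate(delta):
                    for k, ak in enumerate(a_prev):
                        w[j][k] -= lr * dj * ak
                    b[j] -= lr * dj
                if i > 0:
                    delta = back

def error_rate(net, data):
    """Fraction of examples classified incorrectly at a 0.5 threshold."""
    wrong = sum((forward(net, x)[-1][0] >= 0.5) != (y == 1) for x, y in data)
    return wrong / len(data)

def make_data(n, rng):
    # Assumed toy task: label 1 if the point lies inside a circle.
    pts = [[rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)] for _ in range(n)]
    return [(p, 1 if p[0] ** 2 + p[1] ** 2 < 0.5 else 0) for p in pts]

rng = random.Random(0)
train_set = make_data(60, rng)
cv_set = make_data(40, rng)

cv_error_by_layers = {}
for n_hidden in (1, 2, 3):
    net = init_net([2] + [4] * n_hidden + [1], rng)
    train(net, train_set, epochs=200, lr=0.1, rng=rng)
    cv_error_by_layers[n_hidden] = error_rate(net, cv_set)
    print(f"hidden layers={n_hidden}  cv error={cv_error_by_layers[n_hidden]:.3f}")

# Keep the architecture with the lowest cross validation error.
best_layers = min(cv_error_by_layers, key=cv_error_by_layers.get)
print("selected number of hidden layers:", best_layers)
```

    Because the cross validation set was used to choose the architecture, a separate test set would be needed to estimate how well the selected network generalizes.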

    Model Complexity Effects:

    • Lower-order polynomials (low model complexity) have high bias and low variance. In this case, the model consistently fits poorly.
    • Higher-order polynomials (high model complexity) fit the training data extremely well and the test data extremely poorly. These have low bias on the training data, but very high variance.
    • In practice, we want a model somewhere in between: one that generalizes well and also fits the data reasonably well.
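    For squared error, this trade-off is often written as the bias-variance decomposition. The identity below is the standard textbook form, not something stated in the original post. It assumes y = f(x) + ε with zero-mean noise of variance σ², and f̂ denotes the model fitted on a randomly drawn training set; all of these symbols are introduced here for illustration.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{noise}}
```

    Low-complexity models tend to make the first term large, high-complexity models the second; the noise term is not affected by the choice of model.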
  • Original article: https://www.cnblogs.com/Answer1215/p/13712627.html