zoukankan      html  css  js  c++  java
  • svmrank 的误差惩罚因子c选择 经验

    C是一个由用户去指定的系数,表示对分错的点加入多少的惩罚,当C很大的时候,分错的点就会更少,但是过拟合的情况可能会比较严重,当C很小的时候,分错的点可能会很多,不过可能由此得到的模型也会不太正确,所以如何选择C是有很多学问的,不过在大部分情况下就是通过经验尝试得到的。

    Trade-off between Maximum Margin and Classification Errors

    http://mi.eng.cam.ac.uk/~kkc21/thesis_main/node29.html

    The trade-off between maximum margin and the classification error (during training) is defined by the value C in Eqn. gif. The value C is called the Error Penalty. A high error penalty will force the SVM training to avoid classification errors (Section gif gives a brief overview of the significance of the value of C).

    A larger C will result in a larger search space for the QP optimiser. This generally increases the duration of the QP search, as results in Table gif show. Other experiments with larger numbers of data points (1200) fail to converge whenC is set higher than 1000. This is mainly due to numerical problems. The cost function of the QP does not decrease monotonically gif. A larger search space does contribute to these problems.

    The number of SVs does not change significantly with different C value. A smaller C does cause the average number of SVs to increases slightly. This could be due to more support vectors being needed to compensate the bound on the other support vectors. The tex2html_wrap_inline1712 norm of w decreases with smaller C. This is as expected, because if errors are allowed, then the training algorithm can find a separating plane with much larger margin. Figures gifgifgif and gif show the decision boundaries for two very different error penalties on two classifiers (2-to-rest and 5-to-rest). It is clear that with higher error penalty, the optimiser gives a boundary that classifies all the training points correctly. This can give very irregular boundaries.

    One can easily conclude that the more regular boundaries (Figures gif and gif) will give better generalisation. This conclusion is also supported by the value of ||w|| which is lower for these two classifiers, i.e. they have larger margin. One can also use the expected error bound to predict the best error penalty setting. First the expected error bound is computed using Eqn. gif and gif ( tex2html_wrap_inline2446 ). This is shown in Figure gif. It predicts that the best setting isC=10 and C=100. The accuracy obtained from testing data (Figure gif) agrees with this prediction.

       table1242 

    所以c一般 选用10,100

    实测:

    用svm_rank测试数据时,

    经验参数,c=1,效果不如c=3.
    故c=1,放弃。

     但c=1 训练时间比c=3训练时间短。

    总的来说,c越大,svm_rank learn的迭代次数越大,所耗训练时间越长。

  • 相关阅读:
    vue使用axios调用api接口
    vue引用echarts
    C# 倒计时,显示天,时,分,秒。时间可以是从数据库捞出来
    DataGridView 控件操作大全 (内容居中显示,右键绑定菜单)
    Oracle使用row_number()函数查询时增加序号列
    Oracle 相关操作SQL
    oracle rac切换到单实例DG后OGG的处理
    oracle dg库因为standby_file_management参数导致应用停止
    oracle rac与单实例DG切换
    oracle rac搭建单实例DG步骤(阅读全篇后再做)
  • 原文地址:https://www.cnblogs.com/lifegoesonitself/p/3513383.html
Copyright © 2011-2022 走看看