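Cross-Validation
Q1: What is the drawback of using a train/test split for model evaluation?

Q2: How does K-fold cross-validation overcome this limitation?

Q3: How can cross-validation be used to select tuning parameters, choose between models, and select features?

Q4: What are some possible improvements to cross-validation?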

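Motivation: the goal is to estimate how well a model is likely to perform on out-of-sample data.

Initial idea: train and test on the same data. However, maximizing training accuracy rewards overly complex models that overfit the training data.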
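The overfitting risk can be made concrete with a small sketch. It assumes the same iris data and KNN classifier used later in this post; the choice of K=1 is an assumption made here for illustration, because a 1-nearest-neighbor model essentially memorizes its training data.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn import metrics

# load the iris data
iris = load_iris()
X, y = iris.data, iris.target

# K=1 is an illustrative assumption: each training point is its own nearest neighbor
knn = KNeighborsClassifier(n_neighbors=1)

# train and test on the SAME data
knn.fit(X, y)
training_accuracy = metrics.accuracy_score(y, knn.predict(X))
print(training_accuracy)
```

The training accuracy comes out at or very near 100%, which says little about how the model would do on observations it has never seen.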

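Alternative idea: train/test split. Split the dataset into two pieces so that the model is trained on one piece and tested on the other.

Testing accuracy is a better estimate of out-of-sample performance than training accuracy.

But it is a high-variance estimate, because changing which observations happen to fall in the testing set can significantly change the testing accuracy.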
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn import metrics

# read in the iris data
iris = load_iris()

# create X (features) and y (response)
X = iris.data
y = iris.target

# use train/test split; changing random_state changes which rows land in the test set
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=4)

# check classification accuracy of KNN with K=5
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)
y_pred = knn.predict(X_test)
print(metrics.accuracy_score(y_test, y_pred))

Accuracy:
0.9736842105263158
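To see the high variance mentioned above, the same experiment can be repeated with several `random_state` values. This is a minimal sketch, not part of the original post: the helper name `split_accuracy` and the seed range 1 to 5 are assumptions chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn import metrics

iris = load_iris()
X, y = iris.data, iris.target


def split_accuracy(seed):
    """Hypothetical helper: testing accuracy of KNN (K=5) for one train/test split."""
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=seed)
    knn = KNeighborsClassifier(n_neighbors=5)
    knn.fit(X_train, y_train)
    return metrics.accuracy_score(y_test, knn.predict(X_test))


# the seeds 1..5 are arbitrary; each one puts different observations in the test set
scores = {seed: split_accuracy(seed) for seed in range(1, 6)}
for seed, score in scores.items():
    print(f"random_state={seed}: accuracy={score:.4f}")

# spread between the best and worst split
print("range:", max(scores.values()) - min(scores.values()))
```

Each seed is the same model on the same dataset, so any difference between the printed accuracies comes only from which observations ended up in the testing set.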

Original source: https://www.cnblogs.com/wywshtc/p/12498131.html