KFold and StratifiedKFold: k-fold cross-validation splitting


Original link

https://blog.csdn.net/wqh_jingsong/article/details/77896449

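StratifiedKFold is used in much the same way as KFold, but it performs stratified sampling: it ensures that the proportion of each class in the training and test sets matches the proportion in the original dataset.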

Example:

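import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

X = np.array([
    [1, 2, 3, 4],
    [11, 12, 13, 14],
    [21, 22, 23, 24],
    [31, 32, 33, 34],
    [41, 42, 43, 44],
    [51, 52, 53, 54],
    [61, 62, 63, 64],
    [71, 72, 73, 74],
])
y = np.array([1, 1, 0, 0, 1, 1, 0, 0])

# There is no n_folds parameter here: sklearn.model_selection differs from the
# older sklearn.cross_validation module and uses n_splits instead.
# random_state is omitted because it only applies when shuffle=True
# (recent scikit-learn versions raise an error if it is set with shuffle=False).
floder = KFold(n_splits=4, shuffle=False)
sfolder = StratifiedKFold(n_splits=4, shuffle=False)

# 1. Splits produced by StratifiedKFold
for train, test in sfolder.split(X, y):
    print('Train: %s | test: %s' % (train, test))
    print(" ")

# 2. Splits produced by KFold
for train, test in floder.split(X, y):
    print('Train: %s | test: %s' % (train, test))
    print(" ")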
Results:

1. StratifiedKFold:
Train: [1 3 4 5 6 7] | test: [0 2]

Train: [0 2 4 5 6 7] | test: [1 3]

Train: [0 1 2 3 5 7] | test: [4 6]

Train: [0 1 2 3 4 6] | test: [5 7]

2. KFold:
Train: [2 3 4 5 6 7] | test: [0 1]

Train: [0 1 4 5 6 7] | test: [2 3]

Train: [0 1 2 3 6 7] | test: [4 5]

Train: [0 1 2 3 4 5] | test: [6 7]

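Analysis: StratifiedKFold splits with stratified sampling, so the proportion of each class in the training and test sets matches the original dataset. Here every StratifiedKFold test fold holds one sample of class 1 and one of class 0, the same 50/50 ratio as y. KFold without shuffling simply takes consecutive indices, so each of its test folds contains a single class: indices [0 1] and [4 5] are all class 1, while [2 3] and [6 7] are all class 0.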
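
The class balance described in the analysis can also be checked numerically. The following is a minimal sketch, not part of the original post: the helper name class_counts_per_test_fold is hypothetical, and X is replaced by placeholder zeros on the assumption that only the labels influence these unshuffled splits.

```python
import numpy as np
from sklearn.model_selection import KFold, StratifiedKFold

# Same labels as in the example; the feature values are placeholders
# (assumption: with shuffle=False only the labels and order matter).
y = np.array([1, 1, 0, 0, 1, 1, 0, 0])
X = np.zeros((len(y), 1))


def class_counts_per_test_fold(splitter, X, y):
    """Hypothetical helper: [count of class 0, count of class 1] per test fold."""
    counts = []
    for _, test_idx in splitter.split(X, y):
        # minlength=2 keeps a slot for a class even if the fold lacks it
        counts.append(np.bincount(y[test_idx], minlength=2).tolist())
    return counts


stratified_counts = class_counts_per_test_fold(StratifiedKFold(n_splits=4), X, y)
plain_counts = class_counts_per_test_fold(KFold(n_splits=4), X, y)

print("StratifiedKFold:", stratified_counts)
print("KFold:          ", plain_counts)
```

With these labels, each StratifiedKFold test fold should report one sample per class, while each KFold test fold should report two samples of a single class, which is consistent with the index lists in the results.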


Source: https://www.cnblogs.com/webRobot/p/10438341.html
