Before shuffling training data and labels together, it is best to convert both to NumPy arrays.
Method 1: use np.random.shuffle
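import numpy as np
state = np.random.get_state()  # save the current state of the global RNG
np.random.shuffle(train)       # shuffle the data in place along the first axis
np.random.set_state(state)     # restore the saved state
np.random.shuffle(label)       # same state and same length, so the same order is applied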
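To see why saving and restoring the state keeps data and labels paired, here is a minimal sketch on toy arrays. The array contents are made up for illustration, and the check assumes both arrays have the same length along the first axis.

```python
import numpy as np

# Hypothetical toy data: row k of `train` is [2k, 2k+1] and belongs to label k.
train = np.arange(10).reshape(5, 2)
label = np.array([0, 1, 2, 3, 4])

state = np.random.get_state()   # remember the RNG state
np.random.shuffle(train)        # shuffles the rows in place
np.random.set_state(state)      # rewind the RNG to the saved state
np.random.shuffle(label)        # same state, same length: same reordering

# If the pairing survived, every row [2k, 2k+1] still sits next to label k.
aligned = bool(np.all(train[:, 0] == 2 * label))
print(aligned)
```

Without the set_state call, the second shuffle would draw fresh random numbers and the rows would generally no longer match their labels.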
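Alternatively, shuffle a list of row indices and use it to index both arrays.
Note that the indexing below uses [rows, :], which requires two dimensions. If an array has shape (4,), for example ['a','b','c','d'],
convert it to shape (4,1) first, for example [['a'],['b'],['c'],['d']].
import random
train_row = list(range(len(train_label)))  # row indices 0..n-1
random.shuffle(train_row)                  # shuffle the indices in place
train_image = train_image[train_row, :]    # reorder the images
train_label = train_label[train_row, :]    # reorder the labels the same way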
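The shape conversion mentioned above can be done with reshape(-1, 1). The following is a small self-contained sketch of the index-list approach; the toy arrays are assumptions made for illustration only.

```python
import random
import numpy as np

# Hypothetical toy data: image row k is [2k, 2k+1], label k is the k-th letter.
train_image = np.arange(8).reshape(4, 2)
train_label = np.array(['a', 'b', 'c', 'd'])   # shape (4,)

# Convert (4,) to (4, 1) so that [rows, :] indexing is valid.
train_label = train_label.reshape(-1, 1)

train_row = list(range(len(train_label)))      # [0, 1, 2, 3]
random.shuffle(train_row)                      # shuffle the indices in place

train_image = train_image[train_row, :]
train_label = train_label[train_row, :]

print(train_label.shape)   # the labels keep their (4, 1) shape
```

Because both arrays are indexed with the same list, row i of train_image still corresponds to row i of train_label.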
Method 2: use np.random.permutation()
shuffle_ix = np.random.permutation(np.arange(len(train_data)))  # a shuffled copy of the indices 0..n-1
train_data = train_data[shuffle_ix, :]
train_label = train_label[shuffle_ix, :]
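As a variation on this method, one-dimensional labels can be indexed with the permutation directly, without a trailing ", :", so no reshape is needed. The sketch below uses made-up toy arrays and stores the results under new names (shuffled_data and shuffled_label, chosen here for illustration) so the originals stay untouched.

```python
import numpy as np

# Hypothetical toy data: row k of train_data is [2k, 2k+1], label k is k.
train_data = np.arange(12).reshape(6, 2)
train_label = np.arange(6)                 # 1-D labels, shape (6,)

# Passing an int n to permutation is equivalent to permuting np.arange(n).
shuffle_ix = np.random.permutation(len(train_data))

# Indexing with an index array returns new arrays; the inputs are not modified.
shuffled_data = train_data[shuffle_ix]
shuffled_label = train_label[shuffle_ix]

print(shuffle_ix)
print(shuffled_label)
```

Unlike np.random.shuffle, this does not modify the input arrays in place, which can be convenient when the original order is still needed.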
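Method 3: use a PyTorch Dataset together with a DataLoader, which also lets you set the batch size
import torch
# data and target must be tensors with the same first dimension; batch_size is an int
dataset = torch.utils.data.TensorDataset(data, target)  # build the dataset
train_iter = torch.utils.data.DataLoader(dataset, batch_size, shuffle=True)  # reshuffle every epoch and yield batches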
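For a runnable picture of the PyTorch route, here is a minimal sketch. The tensors and the batch size are assumptions invented for the example; in practice data and target would come from your own dataset.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy tensors: sample i has features [i, i] and target i.
data = torch.arange(6, dtype=torch.float32).unsqueeze(1).repeat(1, 2)  # shape (6, 2)
target = torch.arange(6)                                               # shape (6,)
batch_size = 2

dataset = TensorDataset(data, target)   # pairs data[i] with target[i]
train_iter = DataLoader(dataset, batch_size=batch_size, shuffle=True)

for x, y in train_iter:
    # x has shape (batch_size, 2) and y has shape (batch_size,);
    # each feature row is still paired with its own target.
    print(x[:, 0].tolist(), y.tolist())
```

Since TensorDataset returns each sample together with its target, the shuffling done by the DataLoader cannot separate them.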
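An example:
import numpy as np
tes = np.array([['a'],['b'],['c'],['d']])
shuffle_ix = np.random.permutation(len(tes))
shuffle_ix = shuffle_ix.tolist()  # plain Python ints, so the list prints cleanly
print(shuffle_ix)
tes = tes[shuffle_ix, :]
Sample output from one run (the order is random), followed by the resulting value of tes:
[1, 3, 0, 2]
array([['b'], ['d'], ['a'], ['c']], dtype='<U1')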
Reference:
https://blog.csdn.net/sinat_38682860/article/details/108813209