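predict_proba returns an array with n rows and k columns, where n is the number of samples being predicted and k is the number of classes. The value in row i, column j is the model's predicted probability that sample i belongs to class clf.classes_[j]. Each row therefore sums to 1.

For example:

>>> from sklearn.linear_model import LogisticRegression
>>> import numpy as np
>>> x_train = np.array([[1,2,3], [1,3,4], [2,1,2], [4,5,6], [3,5,3], [1,7,2]])
>>> y_train = np.array([0, 0, 0, 1, 1, 1])
>>> x_test = np.array([[2,2,2], [3,2,6], [1,7,4]])
>>> clf = LogisticRegression()
>>> clf.fit(x_train, y_train)
# Return the predicted labels
>>> clf.predict(x_test)
array([1, 0, 1])
# Return the predicted probability of each label
>>> clf.predict_proba(x_test)
array([[ 0.43348191, 0.56651809],
       [ 0.84401838, 0.15598162],
       [ 0.13147498, 0.86852502]])

Here the classes are 0 and 1, so column 0 holds the probability of label 0 and column 1 the probability of label 1:

For [2,2,2], the probability of label 0 is 0.43348191 and of label 1 is 0.56651809.
For [3,2,6], the probability of label 0 is 0.84401838 and of label 1 is 0.15598162.
For [1,7,4], the probability of label 0 is 0.13147498 and of label 1 is 0.86852502.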
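The properties described above can be checked directly. The following is a minimal sketch that reuses the data from the example. The names row_sums and labels_from_proba are introduced here for illustration only. The exact probabilities depend on the scikit-learn version and solver, so the sketch checks structural properties rather than specific numbers.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Same training and test data as in the example above.
x_train = np.array([[1, 2, 3], [1, 3, 4], [2, 1, 2], [4, 5, 6], [3, 5, 3], [1, 7, 2]])
y_train = np.array([0, 0, 0, 1, 1, 1])
x_test = np.array([[2, 2, 2], [3, 2, 6], [1, 7, 4]])

clf = LogisticRegression().fit(x_train, y_train)
proba = clf.predict_proba(x_test)

# Column j of proba corresponds to clf.classes_[j],
# which is not necessarily the label with value j.
print(clf.classes_)

# Each row is a probability distribution over the classes, so it sums to 1.
row_sums = proba.sum(axis=1)
print(np.allclose(row_sums, 1.0))

# Picking the most probable column and mapping it through classes_
# should give the same labels as predict().
labels_from_proba = clf.classes_[proba.argmax(axis=1)]
print(np.array_equal(labels_from_proba, clf.predict(x_test)))
```

Indexing through clf.classes_ matters when the labels are not 0..k-1 (for example strings, or integers such as 3 and 7). In that case the column position and the label value no longer coincide.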