  • paper 130: A roundup of MATLAB classifiers (SVM, KNN, random forest, etc.)

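    train_data holds the training features and train_label holds the class labels.
    Predict_label holds the predicted labels.
    Training in MATLAB also yields Scores, the semantic label vector (probability output).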


    1. Logistic regression (multinomial logistic regression)
    Factor = mnrfit(train_data, train_label);
    Scores = mnrval(Factor, test_data);
    Scores is the semantic vector (probability output). This method cannot cope with high-dimensional features.
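
    Since mnrval returns only probabilities, a predicted label has to be derived from Scores. A minimal sketch, assuming Scores has one row per test sample and one column per class; the toy matrix and the name class_names are illustrative and not part of the original post:

    ```matlab
    % Toy probability matrix standing in for the output of mnrval:
    % one row per test sample, one column per class (each row sums to 1).
    Scores = [0.7 0.2 0.1;
              0.1 0.3 0.6;
              0.2 0.5 0.3];

    % Assumed class labels, listed in the column order of Scores.
    class_names = [1; 2; 3];

    % Take the most probable class in each row.
    [max_prob, col] = max(Scores, [], 2);
    Predict_label = class_names(col);
    ```

    The same row-wise max can be applied to the Scores matrix of any classifier below that returns per-class probabilities.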


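    2. Random forest classifier (Random Forest)
    Factor = TreeBagger(nTree, train_data, train_label);
    [Predict_label,Scores] = predict(Factor, test_data);
    Scores is the semantic vector (probability output). The experiments used nTree = 500.
    It works well but is rather slow: 2,500 rows took 400 seconds. What would 5 million rows be like? Have a novel ready to read while you wait ^_^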
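
    One practical detail: for classification, predict on a TreeBagger returns the labels as a cell array of strings, so they cannot be compared to numeric test labels directly. A minimal sketch of the conversion, assuming the training labels were numeric; the toy values and the names Predict_num and is_correct are illustrative:

    ```matlab
    % Toy stand-in for the cell array of label strings returned by
    % predict on a TreeBagger classification model.
    Predict_label = {'1'; '3'; '2'};
    test_label = [1; 3; 3];          % assumed numeric ground-truth labels

    % Convert the label strings back to numbers before comparing.
    Predict_num = str2double(Predict_label);
    is_correct = (Predict_num == test_label);
    ```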


    3. Naive Bayes classifier (Naive Bayes)
    Factor = NaiveBayes.fit(train_data, train_label);
    Scores = posterior(Factor, test_data);
    [Scores,Predict_label] = posterior(Factor, test_data);
    Predict_label = predict(Factor, test_data);
    accuracy = length(find(Predict_label == test_label))/length(test_label)*100;
    Results were poor.
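
    The accuracy line can be written more compactly, and a confusion matrix often explains a poor result better than a single number. A minimal sketch using only base MATLAB, assuming the labels are integers 1..num_class; the toy label vectors and the names num_class and confusion are illustrative:

    ```matlab
    % Toy labels; both vectors are assumed to hold integers in 1..num_class.
    test_label    = [1; 2; 2; 3];
    Predict_label = [1; 2; 3; 3];
    num_class = 3;

    % Percentage of test samples whose prediction matches the true label.
    accuracy = mean(Predict_label == test_label) * 100;

    % confusion(i, j) counts samples of true class i predicted as class j.
    confusion = accumarray([test_label, Predict_label], 1, [num_class, num_class]);
    ```

    Note that MATLAB is case-sensitive, so Predict_label and predict_label are different variables.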


    4. Support vector machine (SVM) classification
    Factor = svmtrain(train_data, train_label);
    predict_label = svmclassify(Factor, test_data);
    This does not provide the semantic vector Scores (probability output).


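    Support vector machine (SVM) with LIBSVM
    Factor = svmtrain(train_label, train_data, '-b 1');
    [predicted_label, accuracy, Scores] = svmpredict(test_label, test_data, Factor, '-b 1');
    Note that LIBSVM's svmtrain takes the labels as its first argument and shares its name with the toolbox function above, so whichever one comes first on the MATLAB path is the one that runs.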


    5. K-nearest-neighbor classifier (KNN)
    predict_label = knnclassify(test_data, train_data, train_label, num_neighbors);
    accuracy = length(find(predict_label == test_label))/length(test_label)*100;
    This does not provide the semantic vector Scores (probability output).


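    IDX = knnsearch(train_data, test_data);
    IDX = knnsearch(train_data, test_data, 'K', num_neighbors);
    [IDX, Dist] = knnsearch(train_data, test_data, 'K', num_neighbors);
    IDX holds the indices of the nearest training samples and Dist holds the corresponding distances.
    The probability output Scores has to be computed by your own code.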
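
    One simple way to get such a probability output is to use the fraction of the K neighbors that vote for each class. A minimal sketch under that assumption, with labels taken to be integers 1..num_class; the toy IDX matrix and the names num_class and neighbor_label are illustrative, and this is not necessarily the scheme the original author used:

    ```matlab
    % Toy training labels, assumed to be integers in 1..num_class.
    train_label = [1; 1; 2; 2; 3];
    num_class = 3;

    % Toy stand-in for the knnsearch output: one row per test sample,
    % each row lists the indices of its K nearest training samples.
    IDX = [1 2 3;
           3 4 5;
           5 4 1];
    [num_test, K] = size(IDX);

    % Look up the label of every neighbor (reshape keeps the
    % num_test-by-K layout even when there is a single test sample).
    neighbor_label = reshape(train_label(IDX), size(IDX));

    % Scores(i, c) is the fraction of sample i's neighbors in class c.
    Scores = zeros(num_test, num_class);
    for c = 1:num_class
        Scores(:, c) = sum(neighbor_label == c, 2) / K;
    end

    % Majority vote; ties go to the lowest class index.
    [~, Predict_label] = max(Scores, [], 2);
    ```

    Weighting each vote by the inverse of the matching entry in Dist would be a natural refinement, at the cost of having to handle zero distances.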


    In the newer MATLAB 2012 release:
    Factor = ClassificationKNN.fit(train_data, train_label, 'NumNeighbors', num_neighbors);
    predict_label = predict(Factor, test_data);
    [predict_label, Scores] = predict(Factor, test_data);


    6. Ensemble learners (Ensembles for Boosting, Bagging, or Random Subspace)
    In the newer MATLAB 2012 release:
    Factor = fitensemble(train_data, train_label, 'AdaBoostM2', 100, 'tree');
    Factor = fitensemble(train_data, train_label, 'AdaBoostM2', 100, 'tree', 'type', 'classification');
    Factor = fitensemble(train_data, train_label, 'Subspace', 50, 'KNN');
    predict_label = predict(Factor, test_data);
    [predict_label, Scores] = predict(Factor, test_data);
    Results were much worse than expected.


    7. Discriminant analysis classifier (discriminant analysis classifier)
    Factor = ClassificationDiscriminant.fit(train_data, train_label);
    Factor = ClassificationDiscriminant.fit(train_data, train_label, 'discrimType', 'discriminant type: pseudolinear...');
    predict_label = predict(Factor, test_data);

    [predict_label, Scores] = predict(Factor, test_data);

  • Original source: https://www.cnblogs.com/molakejin/p/6124397.html