  • Emgu decision trees

    MCvDTreeParams

      cvFolds        //If this parameter is >1, the tree is pruned using cv_folds-fold cross validation.

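      maxCategories     //Defaults to 10. A categorical variable with more possible values than this has its values clustered into maxCategories groups before splits are searched.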

      maxDepth      //The maximum possible depth of the tree. The training algorithm attempts to split a node only while its depth is less than max_depth. The actual depth may be smaller if the other termination criteria are met, and/or if the tree is pruned.

      minSampleCount   //A node is not split if the number of samples directed to the node is less than the parameter value.

      priors        //Misclassification costs (class priors)

      regressionAccuracy  //Another stopping criterion, used only for regression trees. As soon as the estimated node value differs from the responses of the node's training samples by less than the parameter value, the node is not split further.

      truncatePrunedTree  //If true, the cut off nodes (with Tn<=CvDTree::pruned_tree_idx) are physically removed from the tree. Otherwise they are kept, and by decreasing CvDTree::pruned_tree_idx (e.g. setting it to -1) it is still possible to get the results from the original un-pruned (or pruned less aggressively) tree.

      use1seRule      //If true, the pruning procedure truncates the tree a bit more. This yields a more compact tree that is more resistant to noise in the training data, but slightly less accurate.

      useSurrogates    //If true, surrogate splits are built. Surrogate splits are needed to handle missing measurements and for variable importance estimation.

     public bool Train(

       Matrix<float> trainData,  //A 32-bit floating-point, single-channel matrix, one sample vector per row

      Emgu.CV.ML.MlEnum.DATA_LAYOUT_TYPE tflag,  //data layout type:  COL_SAMPLE or ROW_SAMPLE

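      Matrix<float> responses,  //Same number of rows as trainData, with a single column; holds the response (result) for each training sample
      Matrix<byte> varIdx,     //Can be null if not needed. When specified, identifies variables (features) of interest. It is a Matrix<int> of nx1

      Matrix<byte> sampleIdx,    //Can be null if not needed. When specified, identifies samples of interest. It is a Matrix<int> of nx1. Like varIdx, it is either a list of zero-based integer indices or an 8-bit mask in which 1 marks an entry to use and 0 an entry to skip

      Matrix<byte> varType,    //The types of input variables. One flag per feature giving that feature's type (CV_VAR_CATEGORICAL or CV_VAR_ORDERED); its length is the number of features plus 1, and the last entry specifies the type of the response

      Matrix<byte> missingMask,  //Mask matrix: 1 means the corresponding value is missing (invalid), 0 means the corresponding value is usable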

      MCvDTreeParams param    //The parameters for training the decision tree

    )
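
    The varType layout described above (one flag per feature, plus a final flag for the response) can be sketched without any Emgu types. This is only an illustration: the class name VarTypeLayout, the method Build and its parameters are hypothetical, and plain byte arrays stand in for Matrix<byte>. The two constants are the values OpenCV defines for CV_VAR_ORDERED and CV_VAR_CATEGORICAL.

```csharp
using System;

// Sketch only: a plain byte[] stands in for Emgu's Matrix<byte>. Copy the values into a
// Matrix<byte> of size (featureCount + 1) x 1 before passing it to Train.
public static class VarTypeLayout
{
    // Values of OpenCV's CV_VAR_ORDERED and CV_VAR_CATEGORICAL constants.
    public const byte Ordered = 0;
    public const byte Categorical = 1;

    // Builds the varType flags: one entry per feature plus a final entry for the response.
    // categoricalFeatures (hypothetical input) lists the zero-based indices of categorical features.
    public static byte[] Build(int featureCount, int[] categoricalFeatures, bool categoricalResponse)
    {
        if (featureCount < 0)
            throw new ArgumentOutOfRangeException(nameof(featureCount));

        // A new array is zero-initialized, so every entry starts out as Ordered.
        byte[] varType = new byte[featureCount + 1];

        foreach (int index in categoricalFeatures)
        {
            if (index < 0 || index >= featureCount)
                throw new ArgumentOutOfRangeException(nameof(categoricalFeatures));
            varType[index] = Categorical;
        }

        // The last entry describes the response: categorical means classification,
        // ordered means regression.
        varType[featureCount] = categoricalResponse ? Categorical : Ordered;
        return varType;
    }
}
```

    For example, VarTypeLayout.Build(3, new[] { 0, 2 }, true) describes three features, of which the first and third are categorical, with a categorical response. Note that an all-zero array marks every feature and the response as ordered, which makes the tree a regression tree.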

    Example program:

      private static DTree MushroomCreateDTree(Matrix<float> data, Matrix<byte> missing, Matrix<float> responses, int weight)

      {

      DTree dtree=new DTree();
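      Matrix<byte> varType=new Matrix<byte>(data.Width+1,1);
      // Assumption: every feature and the response are categorical (as in the mushroom data set). Left at zero, every entry would mean "ordered".
      varType.SetValue((byte)Emgu.CV.ML.MlEnum.VAR_TYPE.CATEGORICAL);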
      float[] priors = new float[] { 1, weight };
      MCvDTreeParams mcd = new MCvDTreeParams();
      mcd.maxDepth=8;
      mcd.minSampleCount=10;
      mcd.regressionAccuracy=0;
      mcd.useSurrogates=true;
      mcd.maxCategories=20;
      mcd.cvFolds=10;
      mcd.use1seRule=true;
      mcd.truncatePrunedTree=true;
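      // Pin priors so the GC cannot move the array while native code reads it through the pointer.
      GCHandle priorsHandle = GCHandle.Alloc(priors, GCHandleType.Pinned);
      mcd.priors = priorsHandle.AddrOfPinnedObject(); // float[] -> IntPtr
      dtree.Train(data,Emgu.CV.ML.MlEnum.DATA_LAYOUT_TYPE.ROW_SAMPLE,responses,null,null,varType,missing,mcd);
      priorsHandle.Free(); // release the pin once training has finished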

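      // "\d" is not a valid C# escape sequence, so build the path with Path.Combine instead.
      dtree.Save(System.IO.Path.Combine(System.Environment.CurrentDirectory, "dtee.xml"));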
      int varSuccess = 0;
      for (int i = 0; i < data.Rows; i++)
      {
        Matrix<float> sample=new Matrix<float>(1,data.Width);
        Matrix<byte> missingDataMask=new Matrix<byte>(1,missing.Width);
        for (int j = 0; j < data.Width; j++)
        {
          sample[0, j] = data[i, j];
        }
        for (int j = 0; j < missing.Width; j++)
        {
          missingDataMask[0, j] = missing[i, j];
        }
        MCvDTreeNode mcn = dtree.Predict(sample, missingDataMask, false);
        if (mcn.value == responses[i, 0])
          varSuccess++;
      }
      Console.WriteLine(responses.Rows +": " + varSuccess);

      return dtree;

    }
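
    The counting loop at the end of the example measures how many training samples the tree classifies correctly. A minimal sketch of that check, separated from Emgu, is shown here. The class TrainingAccuracy and its methods are hypothetical names. The predicted array stands for the MCvDTreeNode.value results and the expected array for the responses column. A small tolerance replaces the exact == comparison, which is an added precaution and not part of the original example.

```csharp
using System;

// Sketch only: plain arrays stand in for the per-sample Predict results and the responses matrix.
public static class TrainingAccuracy
{
    // Counts the samples whose predicted value matches the expected response within a tolerance.
    public static int CountHits(double[] predicted, float[] expected, double tolerance = 1e-6)
    {
        if (predicted.Length != expected.Length)
            throw new ArgumentException("predicted and expected must have the same length");

        int hits = 0;
        for (int i = 0; i < predicted.Length; i++)
        {
            if (Math.Abs(predicted[i] - expected[i]) <= tolerance)
                hits++;
        }
        return hits;
    }

    // Fraction of matching samples; returns 0 when there are no samples.
    public static double HitRate(double[] predicted, float[] expected, double tolerance = 1e-6)
    {
        if (predicted.Length == 0)
            return 0.0;
        return (double)CountHits(predicted, expected, tolerance) / predicted.Length;
    }
}
```

    Because the example predicts on the same rows it trained on, the resulting figure is a training-set hit rate and will usually be more optimistic than the accuracy on unseen data.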

  • Original article: https://www.cnblogs.com/alsofly/p/3600526.html