  • Weka console commands

    java weka.classifiers.trees.J48 -t data/weather.arff

    java is followed by the fully qualified class name; -t indicates that the next argument is the name of the training dataset.

    java weka.classifiers.trees.J48 -h

    Prints what each command-line option means:

    -h or -help
        Output help information.
    -synopsis or -info
        Output synopsis for classifier (use in conjunction  with -h)
    -t <name of training file>
        Sets training file.
    -T <name of test file>
        Sets test file. If missing, a cross-validation will be performed
        on the training data.
    -c <class index>
        Sets index of class attribute (default: last).
    -x <number of folds>
        Sets number of folds for cross-validation (default: 10).
    -no-cv
        Do not perform any cross validation.
    -force-batch-training
        Always train classifier in batch mode, never incrementally.
    -split-percentage <percentage>
        Sets the percentage for the train/test set split, e.g., 66.
    -preserve-order
        Preserves the order in the percentage split.
    -s <random number seed>
        Sets random number seed for cross-validation or percentage split
        (default: 1).
    -m <name of file with cost matrix>
        Sets file with cost matrix.
    -disable <comma-separated list of evaluation metric names>
        Comma separated list of metric names not to print to the output.
        Available metrics:
        Correct,Incorrect,Kappa,Total cost,Average cost,KB relative,KB information,
        Correlation,Complexity 0,Complexity scheme,Complexity improvement,
        MAE,RMSE,RAE,RRSE,Coverage,Region size,TP rate,FP rate,Precision,Recall,
        F-measure,MCC,ROC area,PRC area
    -l <name of input file>
        Sets model input file. In case the filename ends with '.xml',
        a PMML file is loaded or, if that fails, options are loaded
        from the XML file.
    -d <name of output file>
        Sets model output file. In case the filename ends with '.xml',
        only the options are saved to the XML file, not the model.
    -v
        Outputs no statistics for training data.
    -o
        Outputs statistics only, not the classifier.
    -i
        Outputs detailed information-retrieval statistics for each class.
    -k
        Outputs information-theoretic statistics.
    -classifications "weka.classifiers.evaluation.output.prediction.AbstractOutput + options"
        Uses the specified class for generating the classification output.
        E.g.: weka.classifiers.evaluation.output.prediction.PlainText
    -p range
        Outputs predictions for test instances (or the train instances if
        no test instances provided and -no-cv is used), along with the 
        attributes in the specified range (and nothing else). 
        Use '-p 0' if no attributes are desired.
        Deprecated: use "-classifications ..." instead.
    -distribution
        Outputs the distribution instead of only the prediction
        in conjunction with the '-p' option (only nominal classes).
        Deprecated: use "-classifications ..." instead.
    -r
        Only outputs cumulative margin distribution.
    -z <class name>
        Only outputs the source representation of the classifier,
        giving it the supplied name.
    -g
        Only outputs the graph representation of the classifier.
    -xml filename | xml-string
        Retrieves the options from the XML-data instead of the command line.
    -threshold-file <file>
        The file to save the threshold data to.
        The format is determined by the extensions, e.g., '.arff' for ARFF 
        format or '.csv' for CSV.
    -threshold-label <label>
        The class label to determine the threshold data for
        (default is the first label)
    
    Options specific to weka.classifiers.trees.J48:
    
    -U
        Use unpruned tree.
    -O
        Do not collapse tree.
    -C <pruning confidence>
        Set confidence threshold for pruning.
        (default 0.25)
    -M <minimum number of instances>
        Set minimum number of instances per leaf.
        (default 2)
    -R
        Use reduced error pruning.
    -N <number of folds>
        Set number of folds for reduced error
        pruning. One fold is used as pruning set.
        (default 3)
    -B
        Use binary splits only.
    -S
        Don't perform subtree raising.
    -L
        Do not clean up after the tree has been built.
    -A
        Laplace smoothing for predicted probabilities.
    -J
        Do not use MDL correction for info gain on numeric attributes.
    -Q <seed>
        Seed for random data shuffling (default 1).
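
    The -x and -s options listed above control how the training data is divided for cross-validation. As a rough illustration of the idea, the plain-Java sketch below shuffles instance indices with a seed and deals them into folds. It is not Weka's own implementation (Weka additionally stratifies the folds when the class is nominal), and the FoldSketch class and makeFolds method are hypothetical names used only for this example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not part of Weka: splits instance indices into folds.
class FoldSketch {

    // Shuffle indices 0..numInstances-1 with the given seed, then deal them
    // round-robin into numFolds folds. Each fold serves once as the test set.
    static List<List<Integer>> makeFolds(int numInstances, int numFolds, long seed) {
        if (numFolds < 2 || numFolds > numInstances) {
            throw new IllegalArgumentException("need 2 <= numFolds <= numInstances");
        }
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < numInstances; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, new Random(seed));

        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < numFolds; f++) {
            folds.add(new ArrayList<>());
        }
        for (int i = 0; i < indices.size(); i++) {
            folds.get(i % numFolds).add(indices.get(i));
        }
        return folds;
    }

    public static void main(String[] args) {
        // 14 instances is just an example size; 10 folds and seed 1 match the defaults above.
        List<List<Integer>> folds = makeFolds(14, 10, 1L);
        for (int f = 0; f < folds.size(); f++) {
            System.out.println("fold " + f + " test indices: " + folds.get(f));
        }
    }
}
```

    Using the same seed always yields the same folds, which is why two runs with identical -s values are comparable.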

    weka.core

    The core Weka package; almost every other class is connected to it.

    Key classes in the core package:

    Attribute: holds the attribute's name, its type, and, in the case of a nominal or string attribute, its possible values

    Instance: contains the attribute values of a particular instance

    Instances: holds an ordered set of instances, in other words, a dataset
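
    To make the relationship between these three classes concrete, here is a simplified plain-Java stand-in. The DataModelSketch, Attr, and Dataset names are hypothetical and this is not the real weka.core API. It mirrors one design point of Weka: an instance is stored as an array of doubles, and a nominal value is stored as the index of that value in the attribute's list of possible values.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified stand-ins for the Attribute / Instance / Instances idea (not Weka classes).
class DataModelSketch {

    // An attribute: a name plus, for nominal attributes, the allowed values.
    static class Attr {
        final String name;
        final List<String> nominalValues; // empty list means the attribute is numeric

        Attr(String name, String... nominalValues) {
            this.name = name;
            this.nominalValues = Arrays.asList(nominalValues);
        }

        boolean isNominal() {
            return !nominalValues.isEmpty();
        }

        // Encode a raw string as the double that is stored inside an instance.
        double encode(String raw) {
            if (!isNominal()) {
                return Double.parseDouble(raw);
            }
            int index = nominalValues.indexOf(raw);
            if (index < 0) {
                throw new IllegalArgumentException(raw + " is not a value of " + name);
            }
            return index;
        }
    }

    // A dataset: a header (the attributes) plus an ordered list of instances.
    static class Dataset {
        final List<Attr> attributes;
        final List<double[]> instances = new ArrayList<>();

        Dataset(List<Attr> attributes) {
            this.attributes = attributes;
        }

        void add(String... rawValues) {
            if (rawValues.length != attributes.size()) {
                throw new IllegalArgumentException("expected " + attributes.size() + " values");
            }
            double[] row = new double[rawValues.length];
            for (int i = 0; i < rawValues.length; i++) {
                row[i] = attributes.get(i).encode(rawValues[i]);
            }
            instances.add(row);
        }
    }

    public static void main(String[] args) {
        Dataset data = new Dataset(Arrays.asList(
                new Attr("outlook", "sunny", "overcast", "rainy"),
                new Attr("temperature"),
                new Attr("play", "yes", "no")));
        data.add("sunny", "85", "no");
        data.add("overcast", "83", "yes");
        for (double[] row : data.instances) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```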

    weka.classifiers

    Contents: contains implementations of most of the algorithms for classification and numeric prediction

    Key abstract class: Classifier, which defines the general structure of any scheme for classification or numeric prediction

    It has three core methods: buildClassifier(), classifyInstance(), and distributionForInstance()

    An example of a class that extends this abstract class:

    • weka.classifiers.trees.DecisionStump
    • Overrides distributionForInstance()
    • Has getRevision(), which simply returns the revision number of the classifier; it is used by Weka maintainers when diagnosing and debugging problems reported by users
    • Has globalInfo(), which returns a string describing the classifier, shown along with the scheme's options
    • Has toString(), which returns a textual representation of the classifier
    • Has toSource(), which is used to obtain a source code representation of the learned classifier
    • Has main(), which is called when you ask for a decision stump from the command line; it is effectively the entry point for running this class
    • Has getCapabilities(), which is called by the generic object editor to provide information about the capabilities of a learning scheme
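
    As a rough picture of how the three core methods fit together, the sketch below defines a hypothetical SketchClassifier base class and a tiny one-split stump in plain Java. None of this is Weka source code: instances are reduced to double arrays with the class value in the last position, and the stump's split attribute and threshold are supplied by the caller instead of being learned. The one design point it borrows is that classifyInstance() can default to picking the most probable class from distributionForInstance(), so a subclass only needs to override the latter.

```java
import java.util.Arrays;

class ClassifierSketch {

    // Hypothetical base class mirroring the three core methods named above.
    abstract static class SketchClassifier {
        abstract void buildClassifier(double[][] data, int numClasses);

        abstract double[] distributionForInstance(double[] instance);

        // Default behaviour: predict the class with the highest probability.
        double classifyInstance(double[] instance) {
            double[] dist = distributionForInstance(instance);
            int best = 0;
            for (int i = 1; i < dist.length; i++) {
                if (dist[i] > dist[best]) {
                    best = i;
                }
            }
            return best;
        }
    }

    // One-level tree: splits on a single numeric attribute at a fixed threshold.
    static class SketchStump extends SketchClassifier {
        private final int attIndex;
        private final double threshold;
        private double[][] branchDist; // [0]: value <= threshold, [1]: value > threshold

        SketchStump(int attIndex, double threshold) {
            this.attIndex = attIndex;
            this.threshold = threshold;
        }

        @Override
        void buildClassifier(double[][] data, int numClasses) {
            double[][] counts = new double[2][numClasses];
            for (double[] row : data) {
                int branch = row[attIndex] <= threshold ? 0 : 1;
                int cls = (int) row[row.length - 1]; // class value is the last column
                counts[branch][cls]++;
            }
            // Turn counts into probabilities; an empty branch gets a uniform distribution.
            for (double[] c : counts) {
                double sum = Arrays.stream(c).sum();
                for (int i = 0; i < c.length; i++) {
                    c[i] = sum > 0 ? c[i] / sum : 1.0 / c.length;
                }
            }
            branchDist = counts;
        }

        @Override
        double[] distributionForInstance(double[] instance) {
            if (branchDist == null) {
                throw new IllegalStateException("call buildClassifier() first");
            }
            return branchDist[instance[attIndex] <= threshold ? 0 : 1].clone();
        }
    }

    public static void main(String[] args) {
        // Made-up toy rows of {humidity, class}; class 0 = "yes", class 1 = "no".
        double[][] data = { {65, 0}, {70, 0}, {75, 0}, {85, 1}, {90, 1}, {95, 0} };
        SketchStump stump = new SketchStump(0, 80.0);
        stump.buildClassifier(data, 2);

        double[] query = {88, 0}; // the class column is ignored at prediction time
        System.out.println(Arrays.toString(stump.distributionForInstance(query)));
        System.out.println(stump.classifyInstance(query));
    }
}
```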

    Some other important packages

    weka.associations

    : contains association-rule learners

    weka.clusterers

    : contains methods for unsupervised learning

    weka.datagenerators

    : generates artificial data

    weka.estimators

    : computes different types of probability distributions

    weka.filters

    : provides methods related to data cleaning

  • Original article: https://www.cnblogs.com/yican/p/3810985.html