  • sklearn Random Forest Methods

    Notes
    
    The default values for the parameters controlling the size of the trees (e.g. max_depth, min_samples_leaf, etc.) lead to fully grown and unpruned trees, which can potentially be very large on some data sets. To reduce memory consumption, the complexity and size of the trees should be controlled by setting those parameter values.

    The features are always randomly permuted at each split. Therefore, the best found split may vary, even with the same training data, max_features=n_features and bootstrap=False, if the improvement of the criterion is identical for several splits enumerated during the search for the best split. To obtain deterministic behaviour during fitting, random_state has to be fixed.

    References: [R157] Breiman, "Random Forests", Machine Learning, 45(1), 5-32, 2001.
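    To illustrate the determinism note above, here is a minimal sketch (the synthetic dataset and the parameter values are illustrative assumptions, not from the original text): two forests fit on the same data with the same random_state make identical predictions.

```python
# Two forests with the same training data AND the same fixed random_state
# are deterministic, so they agree on every prediction.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=200, n_features=8, random_state=0)

clf_a = RandomForestClassifier(n_estimators=25, random_state=42).fit(X, y)
clf_b = RandomForestClassifier(n_estimators=25, random_state=42).fit(X, y)

# With random_state fixed, the fitted forests predict identically.
same = bool((clf_a.predict(X) == clf_b.predict(X)).all())
```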

    Methods

    apply(X) Apply trees in the forest to X, return leaf indices.
    decision_path(X) Return the decision path in the forest.
    fit(X, y[, sample_weight]) Build a forest of trees from the training set (X, y).
    get_params([deep]) Get parameters for this estimator.
    predict(X) Predict class for X.
    predict_log_proba(X) Predict class log-probabilities for X.
    predict_proba(X) Predict class probabilities for X.
    score(X, y[, sample_weight]) Returns the mean accuracy on the given test data and labels.
    set_params(**params) Set the parameters of this estimator.
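    A short sketch of the two less self-explanatory methods in the table, apply and decision_path (the iris dataset and tree count are illustrative assumptions):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(n_estimators=5, random_state=0).fit(X, y)

# apply() returns one leaf index per tree for each sample,
# i.e. an array of shape (n_samples, n_estimators).
leaves = clf.apply(X[:3])

# decision_path() returns a sparse node-indicator matrix together with
# n_nodes_ptr, the column offsets where each tree's nodes begin.
indicator, n_nodes_ptr = clf.decision_path(X[:3])
```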
    predict(X)

    Predict class for X.

    The predicted class of an input sample is a vote by the trees in the forest, weighted by their probability estimates. That is, the predicted class is the one with highest mean probability estimate across the trees.

    Parameters:

    X : array-like or sparse matrix of shape = [n_samples, n_features]

    The input samples. Internally, its dtype will be converted to dtype=np.float32. If a sparse matrix is provided, it will be converted into a sparse csr_matrix.

    Returns:

    y : array of shape = [n_samples] or [n_samples, n_outputs]

    The predicted classes.
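    A minimal usage sketch of predict (the iris dataset is an assumption for illustration): the returned array holds one class label per sample, each drawn from the fitted classes_.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# One predicted class label per input sample.
y_pred = clf.predict(X[:5])
all_known_classes = bool(np.isin(y_pred, clf.classes_).all())
```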

    predict_log_proba(X)

    Predict class log-probabilities for X.

    The predicted class log-probabilities of an input sample is computed as the log of the mean predicted class probabilities of the trees in the forest.

    Parameters:

    X : array-like or sparse matrix of shape = [n_samples, n_features]

    The input samples. Internally, its dtype will be converted to dtype=np.float32. If a sparse matrix is provided, it will be converted into a sparse csr_matrix.

    Returns:

    p : array of shape = [n_samples, n_classes], or a list of n_outputs such arrays if n_outputs > 1.

    The class probabilities of the input samples. The order of the classes corresponds to that in the attribute classes_.
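    As the description says, predict_log_proba is the log of the mean predicted class probabilities, so exponentiating it recovers predict_proba. A minimal sketch (the iris dataset is an illustrative assumption):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# Log of the mean class probabilities, shape (n_samples, n_classes).
log_proba = clf.predict_log_proba(X[:5])

# Exponentiating recovers the plain probabilities.
proba = np.exp(log_proba)
```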

    predict_proba(X)

    Predict class probabilities for X.

    The predicted class probabilities of an input sample are computed as the mean predicted class probabilities of the trees in the forest. The class probability of a single tree is the fraction of samples of the same class in a leaf.

    Parameters:

    X : array-like or sparse matrix of shape = [n_samples, n_features]

    The input samples. Internally, its dtype will be converted to dtype=np.float32. If a sparse matrix is provided, it will be converted into a sparse csr_matrix.

    Returns:

    p : array of shape = [n_samples, n_classes], or a list of n_outputs such arrays if n_outputs > 1.

    The class probabilities of the input samples. The order of the classes corresponds to that in the attribute classes_.
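    A sketch of predict_proba tying it back to predict (iris dataset assumed for illustration): each row is a probability distribution over classes_, and predict returns the class with the highest mean probability, as stated above.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

X, y = load_iris(return_X_y=True)
clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)

# Mean class probabilities across the trees, shape (n_samples, n_classes).
proba = clf.predict_proba(X[:5])

# Each row is a distribution over clf.classes_ and sums to 1.
rows_sum_to_one = bool(np.allclose(proba.sum(axis=1), 1.0))

# predict() picks the class with the highest mean probability.
pred_from_proba = clf.classes_[np.argmax(proba, axis=1)]
```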

    score(X, y, sample_weight=None)

    Returns the mean accuracy on the given test data and labels.

    In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.

    Parameters:

    X : array-like, shape = (n_samples, n_features)

    Test samples.

    y : array-like, shape = (n_samples) or (n_samples, n_outputs)

    True labels for X.

    sample_weight : array-like, shape = [n_samples], optional

    Sample weights.

    Returns:

    score : float

    Mean accuracy of self.predict(X) wrt. y.
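    A usage sketch of score (the iris dataset and the train/test split are illustrative assumptions): the returned value equals the mean accuracy of self.predict(X) against y.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X_tr, y_tr)

# score() is the fraction of test samples predicted correctly.
acc = clf.score(X_te, y_te)
```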

    From the sklearn documentation:

    http://sklearn.apachecn.org/cn/stable/modules/generated/sklearn.ensemble.RandomForestClassifier.html#sklearn.ensemble.RandomForestClassifier

  • Original source: https://www.cnblogs.com/Allen-rg/p/9577848.html