  • 3.2. Grid Search: Searching for estimator parameters

    Parameters that are not directly learnt within estimators can be set by searching a parameter space for the best cross-validation score (see Cross-validation: evaluating estimator performance). Typical examples include C, kernel and gamma for Support Vector Classifier, alpha for Lasso, etc.

    Any parameter provided when constructing an estimator may be optimized in this manner. Specifically, to find the names and current values for all parameters for a given estimator, use:

    estimator.get_params()
    

    Such parameters are often referred to as hyperparameters (particularly in Bayesian learning), distinguishing them from the parameters optimised in a machine learning procedure.
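
    As a minimal sketch of inspecting and changing such parameters, the snippet below calls get_params() on an SVC. The choice of SVC and the values C=1.0 and C=10.0 are illustrative assumptions, not taken from the text above.

```python
from sklearn.svm import SVC

# Build an estimator with two explicitly chosen constructor parameters.
estimator = SVC(C=1.0, kernel="rbf")

# get_params() returns a dict mapping parameter names to current values.
params = estimator.get_params()
for name in sorted(params):
    print(name, "=", params[name])

# set_params() updates constructor parameters on the same object;
# the search tools rely on this to try out each candidate setting.
estimator.set_params(C=10.0)
```

    Every key printed here is a name that can appear in a parameter grid or a parameter distribution.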

    A search consists of:

    • an estimator (regressor or classifier such as sklearn.svm.SVC());
    • a parameter space;
    • a method for searching or sampling candidates;
    • a cross-validation scheme; and
    • a score function.

    Some models allow for specialized, efficient parameter search strategies, outlined below. Two generic approaches to sampling search candidates are provided in scikit-learn: for given values, GridSearchCV exhaustively considers all parameter combinations, while RandomizedSearchCV can sample a given number of candidates from a parameter space with a specified distribution. After describing these tools we detail best practice applicable to both approaches.
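
    The exhaustive approach can be sketched as follows. This assumes a recent scikit-learn release, where the search classes live in sklearn.model_selection; the iris dataset, the grid values, cv=5 and the accuracy score are illustrative choices only.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Parameter space: 3 values of C times 2 values of gamma times 1 kernel.
param_grid = {"C": [1, 10, 100], "gamma": [0.001, 0.01], "kernel": ["rbf"]}

# Estimator, parameter space, cross-validation scheme and score function
# are the ingredients listed above; the search method is the full grid.
search = GridSearchCV(SVC(), param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

# Every combination in the grid is evaluated exactly once.
n_candidates = len(search.cv_results_["params"])
print(n_candidates)
print(search.best_params_)
```

    The number of candidates is the product of the list lengths, so the cost grows quickly as parameters or values are added.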

    3.2.2. Randomized Parameter Optimization

    While using a grid of parameter settings is currently the most widely used method for parameter optimization, other search methods have more favourable properties. RandomizedSearchCV implements a randomized search over parameters, where each setting is sampled from a distribution over possible parameter values. This has two main benefits over an exhaustive search:

    • A budget can be chosen independent of the number of parameters and possible values.
    • Adding parameters that do not influence the performance does not decrease efficiency.

    Specifying how parameters should be sampled is done using a dictionary, very similar to specifying parameters for GridSearchCV. Additionally, a computation budget, being the number of sampled candidates or sampling iterations, is specified using the n_iter parameter. For each parameter, either a distribution over possible values or a list of discrete choices (which will be sampled uniformly) can be specified:

    import scipy.stats  # provides the expon distribution used below
    [{'C': scipy.stats.expon(scale=100), 'gamma': scipy.stats.expon(scale=.1),
      'kernel': ['rbf'], 'class_weight': ['balanced', None]}]
    

    This example uses the scipy.stats module, which contains many useful distributions for sampling parameters, such as expon, gamma, uniform or randint. In principle, any object can be passed that provides an rvs (random variate sample) method to sample a value. A call to the rvs method should provide independent random samples from possible parameter values on consecutive calls.
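
    A randomized search over distributions like these might look as follows. This is a sketch under stated assumptions: a recent scikit-learn (sklearn.model_selection, and 'balanced' as a class_weight option), the iris dataset, and the values n_iter=8, cv=3 and random_state=0, none of which come from the text above.

```python
import scipy.stats
from sklearn.datasets import load_iris
from sklearn.model_selection import RandomizedSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Distributions are sampled through their rvs method;
# lists of discrete choices are sampled uniformly.
param_distributions = {
    "C": scipy.stats.expon(scale=100),
    "gamma": scipy.stats.expon(scale=0.1),
    "kernel": ["rbf"],
    "class_weight": ["balanced", None],
}

# n_iter is the computation budget: the number of sampled candidates.
search = RandomizedSearchCV(
    SVC(), param_distributions, n_iter=8, cv=3, random_state=0
)
search.fit(X, y)

n_sampled = len(search.cv_results_["params"])
print(n_sampled)
print(search.best_params_)
```

    The budget stays at n_iter candidates no matter how many parameters the dictionary contains.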

    Warning

    The distributions in scipy.stats prior to scipy 0.16 do not allow specifying a random state. Instead, they use the global numpy random state, which can be seeded via np.random.seed or set using np.random.set_state.
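
    The following sketch illustrates two ways of making the sampling reproducible. It assumes a recent scipy and scikit-learn, where ParameterSampler forwards its random_state to the distributions; the seed values and the expon distribution are illustrative.

```python
import numpy as np
import scipy.stats
from sklearn.model_selection import ParameterSampler

dist = scipy.stats.expon(scale=100)

# Option 1: seed the global numpy random state before sampling directly.
np.random.seed(0)
a = dist.rvs(size=3)
np.random.seed(0)
b = dist.rvs(size=3)

# Option 2: pass random_state to the sampler (assumes recent versions).
first = list(ParameterSampler({"C": dist}, n_iter=3, random_state=42))
second = list(ParameterSampler({"C": dist}, n_iter=3, random_state=42))

print(np.allclose(a, b), first == second)
```

    The second option is usually preferable because it avoids changing global state that other code may rely on.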

    For continuous parameters, such as C above, it is important to specify a continuous distribution to take full advantage of the randomization. This way, increasing n_iter will always lead to a finer search.
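
    The difference can be seen by sampling candidates directly with ParameterSampler, as in this sketch. The three-value list, the expon distribution, n_iter=10 and the seed are illustrative assumptions; a recent scikit-learn is assumed.

```python
import warnings

import scipy.stats
from sklearn.model_selection import ParameterSampler

n_iter = 10

# Discrete choices: only three settings exist, so a larger budget cannot
# produce new candidates (scikit-learn warns about this, silenced here).
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    discrete = list(
        ParameterSampler({"C": [1, 10, 100]}, n_iter=n_iter, random_state=0)
    )

# Continuous distribution: each draw yields a new value of C.
continuous = list(
    ParameterSampler(
        {"C": scipy.stats.expon(scale=100)}, n_iter=n_iter, random_state=0
    )
)

distinct_discrete = len({p["C"] for p in discrete})
distinct_continuous = len({p["C"] for p in continuous})
print(distinct_discrete, distinct_continuous)
```

    With the list, raising n_iter beyond three buys nothing; with the distribution, every extra iteration explores a new value.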

    References:

    • Bergstra, J. and Bengio, Y., Random search for hyper-parameter optimization, The Journal of Machine Learning Research (2012)
  • Original article: https://www.cnblogs.com/yymn/p/4598419.html