  • Setting a dynamic learning rate in MXNet

    https://blog.csdn.net/xiaotao_1/article/details/78874336

If the learning rate is too large, the algorithm bounces back and forth around a local optimum and never converges;
  if it is too small, each step moves only a short distance and convergence becomes very slow.
  A common strategy is therefore to start with a relatively large learning rate and gradually reduce it as the number of iterations grows. MXNet ships with ready-made scheduler classes that can be used directly.
  There are three schedulers in mxnet.lr_scheduler.
  The first is:

class mxnet.lr_scheduler.FactorScheduler(step, factor=1, stop_factor_lr=1e-08)
    # Reduce the learning rate by a factor for every n steps.
    # It returns a new learning rate by:
    base_lr * pow(factor, floor(num_update/step))

    # Parameters:
    step (int) – Changes the learning rate for every n updates.
    factor (float, optional) – The factor to change the learning rate.
    stop_factor_lr (float, optional) – Stop updating the learning rate if it is less than this value.
  For example:

import mxnet

lr_sch = mxnet.lr_scheduler.FactorScheduler(step=500, factor=0.9)
model.fit(
    train_iter,                  # training data iterator
    eval_data=val_iter,          # validation data iterator
    optimizer='sgd',
    optimizer_params={'learning_rate': 0.1, 'lr_scheduler': lr_sch},
    eval_metric=metric,
    num_epoch=num_epoch,
)
  This means: the initial learning rate is 0.1. After 500 parameter updates the learning rate becomes 0.1×0.9; after 1000 updates it becomes 0.1×0.9×0.9, and so on.
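  To make the schedule concrete, a scheduler object can also be called directly with an update count. The following is a minimal sketch; the exact update at which each drop takes effect may differ by one across MXNet versions, and base_lr is set as an attribute here because older versions do not accept it in the FactorScheduler constructor:

import mxnet as mx

lr_sch = mx.lr_scheduler.FactorScheduler(step=500, factor=0.9)
lr_sch.base_lr = 0.1  # initial learning rate

for num_update in (1, 501, 1001, 1501):
    # Roughly base_lr * 0.9 ** floor(num_update / 500):
    # 0.1, 0.09, 0.081, 0.0729
    print(num_update, lr_sch(num_update))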
  The second is:

    class mxnet.lr_scheduler.LRScheduler(base_lr=0.01)
    # Base class of a learning rate scheduler.
    # A scheduler returns a new learning rate based on the number of updates that have been performed.
    Parameters: base_lr (float, optional) – The initial learning rate.

    __call__(num_update)
    # Return a new learning rate.
    # The num_update is the upper bound of the number of updates applied to every weight.
    # Assume the optimizer has updated i-th weight by k_i times, namely optimizer.update(i, weight_i) is called by k_i times. Then:
    num_update = max([k_i for all i])
    Parameters: num_update (int) – the maximal number of updates applied to a weight.
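
  Since LRScheduler is the base class, a custom schedule can be implemented by subclassing it and overriding __call__. The cosine schedule below is a hypothetical illustration, not part of the library (CosineScheduler and its parameters are made-up names):

import math
import mxnet as mx

class CosineScheduler(mx.lr_scheduler.LRScheduler):
    # Hypothetical scheduler: cosine decay from base_lr down to final_lr
    # over the first max_update updates, then constant at final_lr.
    def __init__(self, max_update, base_lr=0.01, final_lr=0.0):
        super(CosineScheduler, self).__init__(base_lr)
        self.max_update = max_update
        self.final_lr = final_lr

    def __call__(self, num_update):
        if num_update >= self.max_update:
            return self.final_lr
        progress = float(num_update) / self.max_update
        return self.final_lr + (self.base_lr - self.final_lr) * (1 + math.cos(math.pi * progress)) / 2

  A scheduler like this is passed as 'lr_scheduler' in optimizer_params, exactly like FactorScheduler above.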
  The third is:

    class mxnet.lr_scheduler.MultiFactorScheduler(step, factor=1)
# Reduce the learning rate at each step in a given list of steps.
    # Assume there exists k such that:
    step[k] <= num_update and num_update < step[k+1]

    # Then calculate the new learning rate by:
    base_lr * pow(factor, k+1)
    # Parameters:
    step (list of int) – The list of steps to schedule a change
    factor (float) – The factor to change the learning rate.
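
  For example, a minimal sketch (the step values and base_lr here are arbitrary; as with FactorScheduler, the exact update at which each drop takes effect may differ by one across MXNet versions):

import mxnet as mx

# Halve the learning rate after 4000 updates and again after 6000
lr_sch = mx.lr_scheduler.MultiFactorScheduler(step=[4000, 6000], factor=0.5)
lr_sch.base_lr = 0.1

for num_update in (1000, 5000, 7000):
    # Roughly: 0.1 before step 4000, 0.05 until 6000, 0.025 afterwards
    print(num_update, lr_sch(num_update))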
    Reference: https://mxnet.incubator.apache.org/api/python/optimization/optimization.html#mxnet.lr_scheduler.LRScheduler
    ---------------------
    Author: xiaotao_1
    Source: CSDN
    Original: https://blog.csdn.net/xiaotao_1/article/details/78874336
    Copyright notice: This is an original post by the author. Please include a link to the original when reposting.
