  • darknet-yolov3 burn_in learning_rate policy

    In darknet-yolov3, learning_rate is a hyperparameter; tuning it can help the model converge to a better state.

    In the cfg file it appears as follows (screenshot in the original post):

    I just set an arbitrary value here.
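
    For orientation, the learning-rate settings live in the [net] section of the cfg. The fragment below is a sketch that uses the values shipped in the stock yolov3.cfg, not the arbitrary value mentioned above; `power` is normally absent from the file, in which case darknet falls back to 4.

```ini
[net]
# base learning rate
learning_rate=0.001
# number of warm-up batches
burn_in=1000
max_batches=500200
# schedule applied once burn_in is over
policy=steps
# batch numbers at which the rate is rescaled
steps=400000,450000
# multiplier applied at each of those steps
scales=.1,.1
```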

    Next, a few words on burn_in and policy.

    Both appear in the code as follows:

    float get_current_rate(network *net)
    {
        size_t batch_num = get_current_batch(net);
        int i;
        float rate;
        // While batch_num is below burn_in, return the warm-up learning rate
        if (batch_num < net->burn_in)
            return net->learning_rate * pow((float)batch_num / net->burn_in, net->power);
        // Once burn_in is over, follow the configured policy (the stock cfg uses STEPS)
        switch (net->policy) {
            case CONSTANT:
                return net->learning_rate;
            case STEP:
                return net->learning_rate * pow(net->scale, batch_num/net->step);
            case STEPS:
                rate = net->learning_rate;
                for(i = 0; i < net->num_steps; ++i){
                    if(net->steps[i] > batch_num) return rate;
                    rate *= net->scales[i];
                }
                return rate;
            case EXP:
                return net->learning_rate * pow(net->gamma, batch_num);
            case POLY:
                return net->learning_rate * pow(1 - (float)batch_num / net->max_batches, net->power);
            case RANDOM:
                return net->learning_rate * pow(rand_uniform(0,1), net->power);
            case SIG:
                return net->learning_rate * (1./(1.+exp(net->gamma*(batch_num - net->step))));
            default:
                fprintf(stderr, "Policy is weird!\n");
                return net->learning_rate;
        }
    }

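    I made some adjustments here.

    Motivation: I found that the learning rate I configured always differed considerably from the learning rate at the end of burn_in, which caused the loss to stall or oscillate sharply.

    Approach: make the starting learning rate of the steps policy equal to the learning rate at the end of burn_in.

    The implementation is as follows: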

    // Most recent learning rate returned during burn-in. As a file-scope
    // variable it starts at 0 and is only updated while batch_num < burn_in.
    float last_rate;
    float get_current_rate(network *net)
    {
        size_t batch_num = get_current_batch(net);
        int i;
        float rate;
        if (batch_num < net->burn_in)
        {
            // Remember the warm-up rate so the STEPS policy can start from it
            last_rate = net->learning_rate * pow((float)batch_num / net->burn_in, net->power);
            return last_rate;
        }
        switch (net->policy) {
            case CONSTANT:
                return net->learning_rate;
            case STEP:
                return net->learning_rate * pow(net->scale, batch_num/net->step);
            case STEPS:
                // was: rate = net->learning_rate;
                rate = last_rate;
                for(i = 0; i < net->num_steps; ++i){
                    if(net->steps[i] > batch_num) return rate;
                    rate *= net->scales[i];
                }
                return rate;
            case EXP:
                return net->learning_rate * pow(net->gamma, batch_num);
            case POLY:
                return net->learning_rate * pow(1 - (float)batch_num / net->max_batches, net->power);
            case RANDOM:
                return net->learning_rate * pow(rand_uniform(0,1), net->power);
            case SIG:
                return net->learning_rate * (1./(1.+exp(net->gamma*(batch_num - net->step))));
            default:
                fprintf(stderr, "Policy is weird!\n");
                return net->learning_rate;
        }
    }
  • Original article: https://www.cnblogs.com/zhibei/p/12165360.html