  • Learning the weights in a multi-objective loss

    Recently I have been thinking about turning collaborative learning into something like federated learning, where each device computes independently and only parameters are exchanged, and I planned to borrow the multi-task framework for it. Along the way I read up on multi-task learning and found a paper in the agnostic-model / average-model vein. The authors also provide a toy example; my coding is rusty, so it was a good chance to get familiar with it again.

    Multi-task

    I am not familiar with multi-task learning, so I will not attempt a survey here; I will only cover what is relevant to the paper's idea and code.

    Usually the optimization objective is a scalar; once multiple tasks are introduced, the objective becomes a vector. The simplest way to reduce that vector to a scalar is a weighted average of the tasks, but the weights $w$ are hyper-parameters and hard to tune. What the average model or agnostic model does is let the model learn these weights itself, and that is the idea of this paper too. What I find even better about this paper is that the weights also carry a regularization term, i.e. Eq. (10) of the original:

    $$\frac{1}{2\sigma_1^2}\mathcal{L}_1(\mathrm{W})+\frac{1}{2\sigma_2^2}\mathcal{L}_2(\mathrm{W})+\underbrace{\log(\sigma_1\sigma_2)}_{\text{regularization}}$$

    When $\sigma$ grows, the weight of the corresponding $\mathcal{L}$ shrinks, while the last regularization term keeps $\sigma$ from growing without bound. (In the paper this form is derived by maximising a Gaussian likelihood in which $\sigma$ is the observation noise of each task.)
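
    To see the trade-off concretely, here is a minimal numeric sketch (my own illustration, not from the paper; the loss values are arbitrary) that evaluates the two-task objective at several values of $\sigma_1$, holding $\mathcal{L}_1$, $\mathcal{L}_2$ and $\sigma_2$ fixed:

    import numpy as np

    L1, L2 = 4.0, 1.0  # fixed per-task losses, chosen arbitrarily
    sigma2 = 1.0       # hold the second task's sigma fixed

    for sigma1 in [0.5, 1.0, 2.0, 4.0]:
        total = L1 / (2 * sigma1**2) + L2 / (2 * sigma2**2) + np.log(sigma1 * sigma2)
        print(f"sigma1={sigma1:4.1f}  weighted L1={L1 / (2 * sigma1**2):6.3f}  "
              f"log term={np.log(sigma1 * sigma2):+6.3f}  total={total:6.3f}")

    The total is smallest near $\sigma_1^2=\mathcal{L}_1$ (here $\sigma_1=2$): the noisier, higher-loss task automatically receives a smaller weight, yet the $\log$ term stops $\sigma_1$ from simply growing without bound.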

    I have recently seen four or five papers with this weight-learning idea: some call it adaptive, some agnostic, some average, and there is also a Huawei meta-learning paper (around 2018) that dynamically adjusts the learning rates in MAML. They all amount to roughly the same thing.

    (I used to dislike this kind of incremental tweaking, but then I cannot even manage an incremental tweak myself /(ㄒoㄒ)/~~! A nine-story tower rises from a heap of earth; even digging pits starts with filling them!)

    code

    The author's code is written in Keras. Keras only lets a loss function take the two inputs y_true and y_pred, whereas the loss of this paper is more complex: it depends on $\sigma$, and $\sigma$ is itself a parameter to be learned. The author therefore implements the loss through a layer that carries its own trainable parameters:

    from keras.layers import Input, Dense, Lambda, Layer
    from keras.initializers import Constant
    from keras.models import Model
    from keras import backend as K

    # dimensions of the toy example (values from the author's notebook)
    N = 100             # number of training samples
    Q = 1               # input dimension
    D1, D2 = 1, 1       # output dimension of each task
    nb_features = 1024  # hidden-layer width
    
    # Custom loss layer
    # Inherits from Layer; a custom layer must implement build() and call()
    class CustomMultiLossLayer(Layer):
        def __init__(self, nb_outputs=2, **kwargs):
            self.nb_outputs = nb_outputs
            self.is_placeholder = True
            super(CustomMultiLossLayer, self).__init__(**kwargs)
            
        def build(self, input_shape=None):
            # initialise log_vars: one trainable parameter per task, created
            # with add_weight(trainable=True) so the optimizer updates it
            self.log_vars = []
            for i in range(self.nb_outputs):
                self.log_vars += [self.add_weight(name='log_var' + str(i), shape=(1,),
                                                  initializer=Constant(0.), trainable=True)]
            super(CustomMultiLossLayer, self).build(input_shape)

        def multi_loss(self, ys_true, ys_pred):
            # Keras loss functions only accept y_true and y_pred, so this more
            # complex loss reads its extra parameters from the layer's own weights
            assert len(ys_true) == self.nb_outputs and len(ys_pred) == self.nb_outputs
            loss = 0
            for y_true, y_pred, log_var in zip(ys_true, ys_pred, self.log_vars):
                # log_var = log(sigma^2), so precision = 1/sigma^2; the trailing
                # +log_var term is the regularizer of Eq. (10)
                precision = K.exp(-log_var[0])
                loss += K.sum(precision * (y_true - y_pred)**2. + log_var[0], -1)
            return K.mean(loss)
    
        def call(self, inputs):
            ys_true = inputs[:self.nb_outputs]
            ys_pred = inputs[self.nb_outputs:]
            loss = self.multi_loss(ys_true, ys_pred)
            self.add_loss(loss, inputs=inputs)  # register the loss on the layer; compile(loss=None) picks it up
            # We won't actually use the output.
            return K.concatenate(inputs, -1)
    
    def get_prediction_model():
        inp = Input(shape=(Q,), name='inp')
        x = Dense(nb_features, activation='relu')(inp)
        y1_pred = Dense(D1)(x)
        y2_pred = Dense(D2)(x)
        return Model(inp, [y1_pred, y2_pred])
    
    def get_trainable_model(prediction_model):
        inp = Input(shape=(Q,), name='inp')
        y1_pred, y2_pred = prediction_model(inp)
        y1_true = Input(shape=(D1,), name='y1_true')
        y2_true = Input(shape=(D2,), name='y2_true')
        out = CustomMultiLossLayer(nb_outputs=2)([y1_true, y2_true, y1_pred, y2_pred])
        return Model([inp, y1_true, y2_true], out)
    
    prediction_model = get_prediction_model()
    trainable_model = get_trainable_model(prediction_model)
    trainable_model.compile(optimizer='adam', loss=None)
    assert len(trainable_model.layers[-1].trainable_weights) == 2  # two log_vars, one for each output
    assert len(trainable_model.losses) == 1
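
    For completeness, a training sketch along the lines of the author's notebook (the synthetic data below is my own stand-in). Because the loss was registered inside the layer via add_loss, fit() is called without target arrays, and the learned $\sigma_i$ can be read back from the layer's weights:

    import numpy as np

    # hypothetical synthetic data matching the shapes above (Q=1, D1=D2=1)
    X = np.random.randn(N, Q)
    Y1 = 2. * X + 1. + np.random.randn(N, D1)
    Y2 = 1.5 * X + 1. + np.random.randn(N, D2)

    trainable_model.fit([X, Y1, Y2], epochs=200, batch_size=20, verbose=0)

    # log_var = log(sigma^2), hence sigma = exp(log_var) ** 0.5
    print([np.exp(K.get_value(log_var[0])) ** 0.5
           for log_var in trainable_model.layers[-1].log_vars])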
    

    The author implements the complex loss by defining his own loss layer. Later I found an article on Zhihu explaining how to write custom losses in Keras; its main code is pasted below:

    # imports assumed from the article's TF 2.x context
    from tensorflow.keras import layers as KL
    from tensorflow.keras import models as KM
    from tensorflow.keras import backend as K

    class WbceLoss(KL.Layer):
        def __init__(self, **kwargs):
            super(WbceLoss, self).__init__(**kwargs)
    
        def call(self, inputs, **kwargs):
            """
            # inputs:Input tensor, or list/tuple of input tensors.
            如上,父类KL.Layer的call方法明确要求inputs为一个tensor,或者包含多个tensor的列表/元组
            所以这里不能直接接受多个入参,需要把多个入参封装成列表/元组的形式然后在函数中自行解包,否则会报错。
            """
            # unpack the inputs
            y_true, y_weight, y_pred = inputs
            # the complex (weighted binary cross-entropy) loss
            bce_loss = K.binary_crossentropy(y_true, y_pred)
            wbce_loss = K.mean(bce_loss * y_weight)
            # key step: register the custom loss on the layer so it takes effect,
            # and add it as a metric to track it live on the Keras progress bar
            self.add_loss(wbce_loss, inputs=True)
            self.add_metric(wbce_loss, aggregation="mean", name="wbce_loss")
            return wbce_loss
        
    def my_model():
        # input layers
        input_img = KL.Input([64, 64, 3], name="img")
        input_lbl = KL.Input([64, 64, 1], name="lbl")
        input_weight = KL.Input([64, 64, 1], name="weight")
        
        # a single sigmoid channel, so the prediction matches the 1-channel
        # label and is a valid probability for binary cross-entropy
        predict = KL.Conv2D(1, [1, 1], padding="same", activation="sigmoid")(input_img)
        my_loss = WbceLoss()([input_lbl, input_weight, predict])
    
        model = KM.Model(inputs=[input_img, input_lbl, input_weight], outputs=[predict, my_loss])
        model.compile(optimizer="adam")
        return model
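
    A quick smoke test of this pattern might look as follows (the dummy arrays are my own; the dict keys must match the name= arguments of the Input layers). Again no y argument is passed to fit(), since the loss is registered inside WbceLoss:

    import numpy as np

    model = my_model()

    # hypothetical dummy batch; a real pipeline would supply actual data
    imgs = np.random.rand(8, 64, 64, 3).astype("float32")
    lbls = (np.random.rand(8, 64, 64, 1) > 0.5).astype("float32")
    wgts = np.ones((8, 64, 64, 1), dtype="float32")

    # the wbce_loss metric added in the layer shows up on the progress bar
    model.fit({"img": imgs, "lbl": lbls, "weight": wgts}, epochs=2, batch_size=4)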
    

    References

    1. GitHub: yaringal, multi-task-learning-example
    2. Kendall, Gal, Cipolla, "Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics", CVPR 2018
    3. 知乎: Ziyigogogo, "Tensorflow2.0中复杂损失函数实现" (implementing complex loss functions in TensorFlow 2.0)