zoukankan      html  css  js  c++  java
  • 【489】高级的深度学习最佳实践

    参考:Deep Learning with Python P196


    一、不用 Sequential 模型的解决方案:Keras 函数式 API

    • 一个多输入模型
    • 一个多输出(或多头)模型

     

    1.1 函数式 API 简介

      都是按照输入输出的模式,以下两种模式是一致的。

    from keras.models import Sequential, Model
    from keras import layers, Input
    
    seq_model = Sequential()
    seq_model.add(layers.Dense(32, activation='relu', input_shape=(64,)))
    seq_model.add(layers.Dense(32, activation='relu'))
    seq_model.add(layers.Dense(10, activation='softmax'))
    
    input_tensor = Input(shape=(64,))
    h1 = layers.Dense(32, activation='relu')(input_tensor)
    h2 = layers.Dense(32, activation='relu')(h1)
    output_tensor = layers.Dense(10, activation='softmax')(h2)
    
    model = Model(input_tensor, output_tensor)
    
    model.summary()
    

      output

    _________________________________________________________________
    Layer (type)                 Output Shape              Param #   
    =================================================================
    input_1 (InputLayer)         (None, 64)                0         
    _________________________________________________________________
    dense_1 (Dense)              (None, 32)                2080      
    _________________________________________________________________
    dense_2 (Dense)              (None, 32)                1056      
    _________________________________________________________________
    dense_3 (Dense)              (None, 10)                330       
    =================================================================
    Total params: 3,466
    Trainable params: 3,466
    Non-trainable params: 0
    _________________________________________________________________
    

      对这种 Model 实例进行编译、训练或评估时,其 API 与 Sequential 模型相同。

    1.2 多输入模型

    参考:3.keras实现-->高级的深度学习最佳实践

      典型的问答模型有两个输入:一个自然语言描述的问题和一个文本片段(比如新闻文章),后者提供用于回答问题的信息。然后模型要生成一个回答,在最简单的情况下,这个回答只包含一个词,可以通过对摸个预定义的词表做softmax得到。

      输入:问题 + 文本片段

      输出:回答(一个词)

    from keras.models import Model
    from keras import layers
    from keras import Input
     
    text_vocabulary_size = 10000
    question_vocabulary_size = 10000
    answer_vocabulary_size = 500
     
    text_input = Input(shape=(None,),
                       dtype='int32',
                       name='text')
    embeded_text = layers.Embedding(text_vocabulary_size,64)(text_input)
    encoded_text = layers.LSTM(32)(embeded_text)
     
     
    question_input = Input(shape=(None,),
                          dtype = 'int32',
                          name = 'question')
    embeded_question = layers.Embedding(question_vocabulary_size,32)(question_input)
    encoded_question = layers.LSTM(16)(embeded_question)
     
    concatenated = layers.concatenate([encoded_text,encoded_question],axis=-1)
    answer = layers.Dense(answer_vocabulary_size,activation='softmax')(concatenated)
     
    model = Model([text_input,question_input],answer)
    model.compile(optimizer='rmsprop',
                 loss = 'categorical_crossentropy',
                 metrics = ['acc'])
     
    model.summary()
    

      output:

    __________________________________________________________________________________________________
    Layer (type)                    Output Shape         Param #     Connected to                     
    ==================================================================================================
    text (InputLayer)               (None, None)         0                                            
    __________________________________________________________________________________________________
    question (InputLayer)           (None, None)         0                                            
    __________________________________________________________________________________________________
    embedding_5 (Embedding)         (None, None, 64)     640000      text[0][0]                       
    __________________________________________________________________________________________________
    embedding_6 (Embedding)         (None, None, 32)     320000      question[0][0]                   
    __________________________________________________________________________________________________
    lstm_5 (LSTM)                   (None, 32)           12416       embedding_5[0][0]                
    __________________________________________________________________________________________________
    lstm_6 (LSTM)                   (None, 16)           3136        embedding_6[0][0]                
    __________________________________________________________________________________________________
    concatenate_3 (Concatenate)     (None, 48)           0           lstm_5[0][0]                     
                                                                     lstm_6[0][0]                     
    __________________________________________________________________________________________________
    dense_6 (Dense)                 (None, 500)          24500       concatenate_3[0][0]              
    ==================================================================================================
    Total params: 1,000,052
    Trainable params: 1,000,052
    Non-trainable params: 0
    __________________________________________________________________________________________________
    

      训练这种模型需要能够对网络的各个头指定不同的损失函数,例如:年龄预测是标量回归任务,而性别预测是二分类任务,二者需要不同的损失过程。 但是,梯度下降要求将一个标量最小化,所以为了能够训练模型,我们必须将这些损失合并为单个标量。合并不同损失最简单的方法就是对所有损失求和。

    #多输出模型的编译选项:多重损失
    #方法一
    model.compile(optimizer='rmsprop',
                 loss = ['mse','categorical_crossentropy','binary_crossentropy'])
    #方法二
    # model.compile(optimizer='rmsprop',
    #              loss={'age':'mse',
    #                   'income':'categorical_crossentropy',
    #                   'gender':'binary_crossentropy'})
    

      #多输出模型的编译选项:损失加权

    #方法一
    model.compile(optimizer='rmsprop',
                 loss = ['mse','categorical_crossentropy',<br>'binary_crossentropy'],
                 loss_weights=[0.25,1.,10.])
    #方法二
    # model.compile(optimizer='rmsprop',
    #              loss={'age':'mse',
    #                   'income':'categorical_crossentropy',
    #                   'gender':'binary_crossentropy'},
    #              loss_weights={'age':0.25,
    #                            'income':1.,
    #                            'gender':10.})
    

      不同的损失值具有不同的取值范围,为了平衡不同损失的贡献,应该对loss_weights进行设置

    #将数据输入到多输出模型中
    #方法一
    model.fit(posts,[age_targets,income_targets,gender_targets],
             epochs=10,batch_size=64)
    #方法二
    # model.fit(posts,{'age':age_targets,
    #                 'income':income_targets,
    #                 'gender':gender_targets},
    #          epochs=10,batch_size=64)
    

      

  • 相关阅读:
    5.21 CSS样式表练习
    5.20 c#验证码练习
    5.20 邮箱注册,及网页嵌套,知识点复习
    5.19 网页注册练习
    5.19练习标签及其 定义
    5.16 兔子生兔子,日期时间练习
    5.15 复习;共5题
    5.11 集合 与 特殊集合
    5.11 集合与特殊集合
    WinForm1
  • 原文地址:https://www.cnblogs.com/alex-bn-lee/p/13762696.html
Copyright © 2011-2022 走看看