  • Two ways to set initial values for a Keras Embedding layer

    Randomly initialized Embedding

    from keras.models import Sequential
    from keras.layers import Embedding
    import numpy as np
    
    model = Sequential()
    model.add(Embedding(1000, 64, input_length=10))
    # the model will take as input an integer matrix of size (batch, input_length).
    # the largest integer (i.e. word index) in the input should be no larger than 999 (vocabulary size).
    # now model.output_shape == (None, 10, 64), where None is the batch dimension.
    
    input_array = np.random.randint(1000, size=(32, 10))
    
    model.compile('rmsprop', 'mse')
    output_array = model.predict(input_array)
    print(output_array)
    assert output_array.shape == (32, 10, 64)
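
    Conceptually, an embedding layer is a lookup table: each integer index selects one row of an (input_dim, output_dim) matrix. The following is a minimal pure-Python sketch of that lookup, independent of Keras. The names make_table and embedding_lookup are hypothetical and not part of any library, and the value range used for the random table is arbitrary.

```python
import random


def make_table(input_dim, output_dim, seed=0):
    """Build a randomly initialized (input_dim x output_dim) table of floats."""
    rng = random.Random(seed)
    # The range below is an arbitrary choice for this sketch.
    return [[rng.uniform(-0.05, 0.05) for _ in range(output_dim)]
            for _ in range(input_dim)]


def embedding_lookup(table, batch):
    """Replace every integer index in a (batch, length) nested list by its row."""
    return [[table[index] for index in sequence] for sequence in batch]


table = make_table(input_dim=1000, output_dim=64)
index_rng = random.Random(1)
batch = [[index_rng.randrange(1000) for _ in range(10)] for _ in range(32)]
output = embedding_lookup(table, batch)

# Same (32, 10, 64) shape as the Keras example above.
print(len(output), len(output[0]), len(output[0][0]))
```

    Every index must be smaller than input_dim, otherwise the row lookup fails; that is why the indices above are drawn with randrange(1000).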
    
    

    Specifying the initial embedding values with the weights argument

    import numpy as np
    
    import keras
    
    m = keras.models.Sequential()
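    """
    Initial values can be supplied through the weights argument.

    The Embedding layer is not differentiable with respect to its integer
    input, so gradients cannot flow back past it. Placing an Embedding in the
    middle of a network is therefore pointless; it can only be the first layer.

    Note that the path from weights to embeddings is fairly involved, and that
    weights must be a list.
    """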
    embedding = keras.layers.Embedding(input_dim=3, output_dim=2, input_length=1, weights=[np.arange(3 * 2).reshape((3, 2))], mask_zero=True)
    m.add(embedding)  # add() triggers the embedding's build() automatically
    print(keras.backend.get_value(embedding.embeddings))
    m.compile(keras.optimizers.RMSprop(), keras.losses.mse)
    print(m.predict([1, 2, 2, 1, 2, 0]))
    print(m.get_layer(index=0).get_weights())
    print(keras.backend.get_value(embedding.embeddings))
    
    

    The second way to set the initial embedding values: use an initializer

    import numpy as np
    
    import keras
    
    m = keras.models.Sequential()
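    """
    Initial values can also be supplied without the weights argument.

    The Embedding layer is not differentiable with respect to its integer
    input, so gradients cannot flow back past it. Placing an Embedding in the
    middle of a network is therefore pointless; it can only be the first layer.

    This is the second way to set the embedding values: a constant initializer.
    """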
    embedding = keras.layers.Embedding(input_dim=3, output_dim=2, input_length=1, embeddings_initializer=keras.initializers.constant(np.arange(3 * 2, dtype=np.float32).reshape((3, 2))))
    m.add(embedding)
    print(keras.backend.get_value(embedding.embeddings))
    m.compile(keras.optimizers.RMSprop(), keras.losses.mse)
    print(m.predict([1, 2, 2, 1, 2]))
    print(m.get_layer(index=0).get_weights())
    print(keras.backend.get_value(embedding.embeddings))
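
    An initializer can be thought of as a callable that receives the shape of a variable and returns its initial values. The sketch below illustrates that idea in pure Python; ConstantTable is a hypothetical class, not the Keras implementation, and the shape check is an addition of this sketch rather than documented Keras behaviour.

```python
class ConstantTable:
    """Hypothetical initializer: a callable mapping a shape to initial values."""

    def __init__(self, value):
        self.value = value

    def __call__(self, shape):
        rows, cols = shape
        if len(self.value) != rows or any(len(r) != cols for r in self.value):
            raise ValueError('table does not match requested shape %r' % (shape,))
        # Return a float copy so the caller never shares state with the source.
        return [[float(x) for x in row] for row in self.value]


init = ConstantTable([[0, 1], [2, 3], [4, 5]])
embeddings = init((3, 2))
print(embeddings)
```

    Because the values are produced at the moment the variable is created, no later overwrite step is needed, in contrast to the weights route.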
    

    The hard part is working out how weights ends up inside the embedding.embeddings tensor.

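    Embedding is a layer and inherits from Layer. Layer accepts a weights keyword argument, which is a list whose elements are all NumPy arrays. When the Layer constructor runs, weights is stored in the _initial_weights attribute.
    The Layer class in base_layer.py: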

            if 'weights' in kwargs:
                self._initial_weights = kwargs['weights']
            else:
                self._initial_weights = None
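
    To see what this constructor logic does in isolation, here is a minimal sketch. StashingLayer is a hypothetical stand-in for Layer that keeps only the weights handling; everything else about the real class is omitted.

```python
class StashingLayer:
    """Hypothetical stand-in for Layer; only the weights handling is shown."""

    def __init__(self, **kwargs):
        # Mirrors the excerpt above: remember the arrays, do not apply them yet.
        if 'weights' in kwargs:
            self._initial_weights = kwargs['weights']
        else:
            self._initial_weights = None
        self.built = False


with_weights = StashingLayer(weights=[[[0, 1], [2, 3], [4, 5]]])
without_weights = StashingLayer()

print(with_weights._initial_weights)
print(without_weights._initial_weights)
```

    Note that nothing is assigned to any variable at this point: the layer has not been built, so there is no variable to assign to yet.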
    

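    When the Embedding layer is added to a model and connected to the previous layer, layer(previous_layer_output) is called, where layer is the Embedding instance. Embedding inherits from Layer and does not override __call__(), so Layer.__call__() is what actually runs; it in turn calls the subclass's call() method to compute the result. Along the way, Layer.__call__() checks whether the layer has already been built, using the boolean self.built.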
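
    The dispatch described here can be sketched in a few lines of plain Python. BaseLayer and Doubler are hypothetical classes written for illustration; they keep only the built flag and the __call__ to call() hand-off, and leave out everything else the real Layer does.

```python
class BaseLayer:
    """Hypothetical base class: __call__ builds once, then delegates to call()."""

    def __init__(self):
        self.built = False
        self.build_calls = 0

    def build(self, input_shape):
        self.build_calls += 1
        self.built = True

    def call(self, inputs):
        raise NotImplementedError

    def __call__(self, inputs):
        if not self.built:
            # First use only: create state before computing anything.
            self.build((len(inputs),))
        return self.call(inputs)


class Doubler(BaseLayer):
    """Subclass that overrides call() but not __call__()."""

    def call(self, inputs):
        return [2 * x for x in inputs]


layer = Doubler()
first = layer([1, 2, 3])
second = layer([4, 5])
print(first, second, layer.build_calls)
```

    The subclass never touches __call__, yet build() runs exactly once before the first call(), which is the behaviour the paragraph above relies on.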

    The Layer.__call__ function is very important.

        def __call__(self, inputs, **kwargs):
            """Wrapper around self.call(), for handling internal references.
    
            If a Keras tensor is passed:
                - We call self._add_inbound_node().
                - If necessary, we `build` the layer to match
                    the _keras_shape of the input(s).
                - We update the _keras_shape of every input tensor with
                    its new shape (obtained via self.compute_output_shape).
                    This is done as part of _add_inbound_node().
                - We update the _keras_history of the output tensor(s)
                    with the current layer.
                    This is done as part of _add_inbound_node().
    
            # Arguments
                inputs: Can be a tensor or list/tuple of tensors.
                **kwargs: Additional keyword arguments to be passed to `call()`.
    
            # Returns
                Output of the layer's `call` method.
    
            # Raises
                ValueError: in case the layer is missing shape information
                    for its `build` call.
            """
            if isinstance(inputs, list):
                inputs = inputs[:]
            with K.name_scope(self.name):
                # Handle laying building (weight creating, input spec locking).
                if not self.built:  # not built yet: run build() first, then call()
                    # Raise exceptions in case the input is not compatible
                    # with the input_spec specified in the layer constructor.
                    self.assert_input_compatibility(inputs)
    
                    # Collect input shapes to build layer.
                    input_shapes = []
                    for x_elem in to_list(inputs):
                        if hasattr(x_elem, '_keras_shape'):
                            input_shapes.append(x_elem._keras_shape)
                        elif hasattr(K, 'int_shape'):
                            input_shapes.append(K.int_shape(x_elem))
                        else:
                            raise ValueError('You tried to call layer "' +
                                             self.name +
                                             '". This layer has no information'
                                             ' about its expected input shape, '
                                             'and thus cannot be built. '
                                             'You can build it manually via: '
                                             '`layer.build(batch_input_shape)`')
                    self.build(unpack_singleton(input_shapes))
                    self.built = True  # somewhat redundant: self.build() has already set built to True
    
                    # Load weights that were specified at layer instantiation.
                    # If weights was passed, assign it to the variables; this
                    # overwrites the values assigned in self.build() above.
                    if self._initial_weights is not None:
                        self.set_weights(self._initial_weights)
    
                # Raise exceptions in case the input is not compatible
                # with the input_spec set at build time.
                self.assert_input_compatibility(inputs)
    
                # Handle mask propagation.
                previous_mask = _collect_previous_mask(inputs)
                user_kwargs = copy.copy(kwargs)
                if not is_all_none(previous_mask):
                    # The previous layer generated a mask.
                    if has_arg(self.call, 'mask'):
                        if 'mask' not in kwargs:
                            # If mask is explicitly passed to __call__,
                            # we should override the default mask.
                            kwargs['mask'] = previous_mask
                # Handle automatic shape inference (only useful for Theano).
                input_shape = _collect_input_shape(inputs)
    
                # Actually call the layer,
                # collecting output(s), mask(s), and shape(s).
                output = self.call(inputs, **kwargs)
                output_mask = self.compute_mask(inputs, previous_mask)
    
                # If the layer returns tensors from its inputs, unmodified,
                # we copy them to avoid loss of tensor metadata.
                output_ls = to_list(output)
                inputs_ls = to_list(inputs)
                output_ls_copy = []
                for x in output_ls:
                    if x in inputs_ls:
                        x = K.identity(x)
                    output_ls_copy.append(x)
                output = unpack_singleton(output_ls_copy)
    
                # Inferring the output shape is only relevant for Theano.
                if all([s is not None
                        for s in to_list(input_shape)]):
                    output_shape = self.compute_output_shape(input_shape)
                else:
                    if isinstance(input_shape, list):
                        output_shape = [None for _ in input_shape]
                    else:
                        output_shape = None
    
                if (not isinstance(output_mask, (list, tuple)) and
                        len(output_ls) > 1):
                    # Augment the mask to match the length of the output.
                    output_mask = [output_mask] * len(output_ls)
    
                # Add an inbound node to the layer, so that it keeps track
                # of the call and of all new variables created during the call.
                # This also updates the layer history of the output tensor(s).
                # If the input tensor(s) had not previous Keras history,
                # this does nothing.
                self._add_inbound_node(input_tensors=inputs,
                                       output_tensors=output,
                                       input_masks=previous_mask,
                                       output_masks=output_mask,
                                       input_shapes=input_shape,
                                       output_shapes=output_shape,
                                       arguments=user_kwargs)
    
                # Apply activity regularizer if any:
                if (hasattr(self, 'activity_regularizer') and
                        self.activity_regularizer is not None):
                    with K.name_scope('activity_regularizer'):
                        regularization_losses = [
                            self.activity_regularizer(x)
                            for x in to_list(output)]
                    self.add_loss(regularization_losses,
                                  inputs=to_list(inputs))
            return output
    
    

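    If the layer has not been built yet, the build() function of the Embedding class is called automatically. Embedding.build() pays no attention to weights. If no embeddings_initializer was passed, self.embeddings_initializer falls back to random initialization. If one was passed, the embeddings already receive their intended values at this step. If both embeddings_initializer and weights are passed, weights is applied afterwards and overwrites Embedding.embeddings.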

    The build function of the Embedding class in embeddings.py:

    
        def build(self, input_shape):
            self.embeddings = self.add_weight(
                shape=(self.input_dim, self.output_dim),
                initializer=self.embeddings_initializer,
                name='embeddings',
                regularizer=self.embeddings_regularizer,
                constraint=self.embeddings_constraint,
                dtype=self.dtype)
            self.built = True
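
    Putting the pieces together, the order of events (initializer first, weights second) can be reproduced with a small pure-Python model. MiniEmbedding and zeros are hypothetical names used only for this sketch; it imitates the control flow quoted above and is not Keras code.

```python
class MiniEmbedding:
    """Hypothetical miniature of the build/set_weights order; not Keras code."""

    def __init__(self, input_dim, output_dim, initializer, weights=None):
        self.input_dim = input_dim
        self.output_dim = output_dim
        self.initializer = initializer   # callable: (rows, cols) -> table
        self._initial_weights = weights  # list holding one table, or None
        self.embeddings = None
        self.built = False

    def build(self):
        # Like Embedding.build(): only the initializer is consulted here.
        self.embeddings = self.initializer(self.input_dim, self.output_dim)
        self.built = True

    def set_weights(self, weights):
        (table,) = weights  # exactly one array is expected in the list
        self.embeddings = [[float(x) for x in row] for row in table]

    def __call__(self, indices):
        if not self.built:
            self.build()
            # Like Layer.__call__(): weights given at construction win.
            if self._initial_weights is not None:
                self.set_weights(self._initial_weights)
        return [self.embeddings[i] for i in indices]


def zeros(rows, cols):
    """Toy initializer that fills the table with zeros."""
    return [[0.0] * cols for _ in range(rows)]


table = [[0, 1], [2, 3], [4, 5]]
with_weights = MiniEmbedding(3, 2, initializer=zeros, weights=[table])
without_weights = MiniEmbedding(3, 2, initializer=zeros)

print(with_weights([1, 2, 0]))     # rows come from `table`
print(without_weights([1, 2, 0]))  # rows come from the zeros initializer
```

    In the first case the zeros written by build() are replaced before the first lookup, so the initializer's values are never observed.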
    

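    In summary, assigning a Layer's variables through weights is a fairly general mechanism in Keras, but it is not very intuitive. Keras encourages using an explicit initializer and avoiding weights wherever possible.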

  • Original article: https://www.cnblogs.com/weiyinfu/p/9873001.html