The Difference Between concatenate and add Layers in Neural Networks

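In discussions of network architecture design, it is often said that DenseNet and Inception mostly use the concatenate operation, while ResNet mostly uses add. So how do these two operations differ, and what do they have in common?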

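Concatenate is an important operation in architecture design. It is commonly used to combine features: to fuse the features extracted by several convolutional feature extractors, or to fuse information from output layers. The add layer is closer to a superposition of information.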

This reveals that both DenseNets and ResNets densely aggregate features from prior layers and their essential difference is how features are aggregated: ResNets aggregate features by summation and DenseNets aggregate them by concatenation.

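ResNet sums values, so the number of channels stays the same; DenseNet merges channels. One way to think about it: add increases the amount of information carried by each feature that describes the image, but the number of dimensions describing the image does not grow. Only the information within each dimension increases, which should help the final image classification. Concatenate merges channels, so the number of features describing the image grows, while the information under each individual feature does not.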
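The shape behaviour described above can be checked with a small NumPy sketch. The two feature maps here are made up for illustration, and a channels-last layout of (batch, height, width, channels) is assumed.

```python
import numpy as np

# Two hypothetical feature maps in channels-last layout:
# (batch, height, width, channels).
a = np.ones((1, 4, 4, 8))
b = np.full((1, 4, 4, 8), 2.0)

# add: element-wise sum; shapes must match and the channel count stays at 8.
added = a + b

# concatenate: join along the channel axis; the channel count becomes 8 + 8 = 16.
merged = np.concatenate([a, b], axis=-1)

print(added.shape)   # (1, 4, 4, 8)
print(merged.shape)  # (1, 4, 4, 16)
```

After add, every position holds a single blended value (3.0 here). After concatenate, the original values of `a` and `b` are both still present, side by side in separate channels.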

At the code level, ResNet uses add throughout, while DenseNet uses concatenate.
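As a rough sketch of the two connection patterns (not the actual ResNet or DenseNet code), the snippet below stacks three toy layers each way. `toy_layer` is a hypothetical stand-in for a convolution, and the growth rate of 4 is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def toy_layer(x, out_channels):
    """Hypothetical stand-in for a conv layer: random linear map over channels, then ReLU."""
    w = rng.standard_normal((x.shape[-1], out_channels))
    return np.maximum(x @ w, 0.0)

x = rng.standard_normal((2, 16))  # (batch, channels)

# ResNet-style: each layer's output is added to its input, so the width stays 16.
res = x
for _ in range(3):
    res = res + toy_layer(res, out_channels=res.shape[-1])

# DenseNet-style: each layer's output is concatenated onto everything before it,
# so the width grows by `growth` channels per layer.
growth = 4
dense = x
widths = [dense.shape[-1]]
for _ in range(3):
    dense = np.concatenate([dense, toy_layer(dense, out_channels=growth)], axis=-1)
    widths.append(dense.shape[-1])

print(res.shape)  # (2, 16)
print(widths)     # [16, 20, 24, 28]
```

In the dense variant the original input survives unchanged in the first 16 channels, whereas in the residual variant it has been blended into the running sum.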

This offers useful guidance for designing our own network architectures.
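One way to make the relationship concrete, offered as an illustration rather than something taken from the Keras source: if a linear layer follows the merge, add behaves like concatenate with the weights of the two branches tied together. The shapes and the weight matrix `w` below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 8))  # features from branch 1
y = rng.standard_normal((5, 8))  # features from branch 2
w = rng.standard_normal((8, 4))  # weights of the layer that follows the merge

# add, then a linear layer
out_add = (x + y) @ w

# concatenate, then a linear layer whose weight matrix is w stacked twice
w_tied = np.concatenate([w, w], axis=0)  # shape (16, 4)
out_cat = np.concatenate([x, y], axis=1) @ w_tied

print(np.allclose(out_add, out_cat))  # True
```

Seen this way, concatenate is the more flexible choice because the following layer can learn separate weights per branch, at the cost of more parameters and wider tensors; add is the cheaper, constrained version.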

Looking at the Keras source code, the add operation is implemented as:
```
def _merge_function(self, inputs):
    output = inputs[0]
    for i in range(1, len(inputs)):
        output += inputs[i]
    return output
```
It simply computes a sum. For example:
```
import keras

input1 = keras.layers.Input(shape=(16,))
x1 = keras.layers.Dense(8, activation='relu')(input1)
input2 = keras.layers.Input(shape=(32,))
x2 = keras.layers.Dense(8, activation='relu')(input2)
# Element-wise sum of the two 8-dimensional outputs
added = keras.layers.add([x1, x2])

out = keras.layers.Dense(4)(added)
model = keras.models.Model(inputs=[input1, input2], outputs=out)
model.summary()
```
The printed model structure is:

```
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to
==================================================================================================
input_1 (InputLayer)            (None, 16)           0
__________________________________________________________________________________________________
input_2 (InputLayer)            (None, 32)           0
__________________________________________________________________________________________________
dense_1 (Dense)                 (None, 8)            136         input_1[0][0]
__________________________________________________________________________________________________
dense_2 (Dense)                 (None, 8)            264         input_2[0][0]
__________________________________________________________________________________________________
add_1 (Add)                     (None, 8)            0           dense_1[0][0]
                                                                 dense_2[0][0]
__________________________________________________________________________________________________
dense_3 (Dense)                 (None, 4)            36          add_1[0][0]
==================================================================================================
Total params: 436
Trainable params: 436
Non-trainable params: 0
__________________________________________________________________________________________________
```

This is fairly easy to understand: the add layer follows dense_1 and dense_2 and merely merges their outputs, so it has no trainable parameters.
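The parameter counts in the summary can be reproduced by hand. A fully connected layer has one weight per input-output pair plus one bias per output; `dense_params` below is a helper written for this check, not a Keras function.

```python
def dense_params(in_features, units):
    """Weights plus biases of a fully connected layer."""
    return in_features * units + units

counts = {
    "dense_1": dense_params(16, 8),  # 16*8 + 8 = 136
    "dense_2": dense_params(32, 8),  # 32*8 + 8 = 264
    "add_1": 0,                      # the merge itself adds no weights
    "dense_3": dense_params(8, 4),   # 8*4 + 4 = 36
}
total = sum(counts.values())
print(total)  # 436
```

Note that dense_3 sees only 8 inputs because add keeps the width at 8. Had the two branches been concatenated instead, it would see 16 inputs and need 16*4 + 4 = 68 parameters.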

The concatenate operation is, comparatively, a little harder to understand.
```
if py_all([is_sparse(x) for x in tensors]):
    return tf.sparse_concat(axis, tensors)
else:
    return tf.concat([to_dense(x) for x in tensors], axis)
```
The Keras source shows two branches: one returns tf.sparse_concat and the other returns tf.concat, which makes things much clearer.

The concat operation, by example:
```
t1 = [[1, 2, 3], [4, 5, 6]]
t2 = [[7, 8, 9], [10, 11, 12]]
tf.concat([t1, t2], 0) ==> [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
tf.concat([t1, t2], 1) ==> [[1, 2, 3, 7, 8, 9], [4, 5, 6, 10, 11, 12]]

# tensor t3 with shape [2, 3]
# tensor t4 with shape [2, 3]
tf.shape(tf.concat([t3, t4], 0)) ==> [4, 3]
tf.shape(tf.concat([t3, t4], 1)) ==> [2, 6]
```
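In effect, concat joins tensors along one dimension: with axis=0 the result has more rows, and with axis=1 it has more columns. For feature maps, the two tensors are joined along the channel dimension (the last axis in a channels-last layout, which is also the default axis of the Keras Concatenate layer). The other branch, sparse_concat, does the same concatenation for sparse tensors and is equally easy to understand.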
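The axis behaviour can be checked with NumPy, whose `np.concatenate` uses the same axis convention as `tf.concat`. The sketch below reuses the tensors from the example and adds a hypothetical `wide` array to show that only the concatenation axis may differ in size.

```python
import numpy as np

t1 = np.array([[1, 2, 3], [4, 5, 6]])
t2 = np.array([[7, 8, 9], [10, 11, 12]])

rows = np.concatenate([t1, t2], axis=0)  # shape (4, 3): more rows
cols = np.concatenate([t1, t2], axis=1)  # shape (2, 6): more columns

# Only the concatenation axis may differ in size.
wide = np.zeros((2, 5), dtype=t1.dtype)
ok = np.concatenate([t1, wide], axis=1)  # (2, 3) and (2, 5) give (2, 8)
try:
    np.concatenate([t1, wide], axis=0)   # column counts 3 and 5 do not match
    mismatch_raised = False
except ValueError:
    mismatch_raised = True

print(rows.shape, cols.shape, ok.shape, mismatch_raised)
```

This is the practical contrast with add: add needs every dimension to match, while concatenate only needs the non-joined dimensions to match.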

Author: 柒月

Source: https://www.cnblogs.com/Ph-one/

Open source: https://github.com/iqiy/

Site: https://qiy.net/


Original article: https://www.cnblogs.com/Ph-one/p/13873427.html