batch normalization, instance normalization, layer normalization, group normalization (Zhihu)
Comparison of batch normalization, instance normalization, layer normalization, group normalization (blog)
Batch Normalization forces each channel, taken over all samples in a batch, to have $\mu = 0, \sigma = 1$.
Layer Normalization forces all channels of a single sample, taken together, to have $\mu = 0, \sigma = 1$.
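
The only difference between the two definitions is the axis over which the mean and standard deviation are computed. A minimal NumPy sketch of that difference follows, assuming a 2-D input of shape `(N, C)` (batch size by channels). The function names `batch_norm` and `layer_norm`, the `eps` value, and the random test data are illustrative assumptions, and the learnable scale and shift parameters used in practice are left out.

```python
import numpy as np


def batch_norm(x, eps=1e-5):
    """Normalize each channel over the batch axis (axis 0) of an (N, C) array."""
    mean = x.mean(axis=0, keepdims=True)  # one mean per channel, shape (1, C)
    var = x.var(axis=0, keepdims=True)    # one variance per channel, shape (1, C)
    return (x - mean) / np.sqrt(var + eps)


def layer_norm(x, eps=1e-5):
    """Normalize each sample over its channel axis (axis 1) of an (N, C) array."""
    mean = x.mean(axis=1, keepdims=True)  # one mean per sample, shape (N, 1)
    var = x.var(axis=1, keepdims=True)    # one variance per sample, shape (N, 1)
    return (x - mean) / np.sqrt(var + eps)


# Hypothetical data: 8 samples, 4 channels, deliberately not zero-mean or unit-variance.
rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(8, 4))

bn = batch_norm(x)  # every column (channel) now has mean ~0 and std ~1
ln = layer_norm(x)  # every row (sample) now has mean ~0 and std ~1
```

After `batch_norm`, the statistics are fixed column by column, so the result for one sample depends on the other samples in the batch. After `layer_norm`, they are fixed row by row, so each sample is normalized independently of the rest of the batch.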