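TensorFlow 2.0 (5): Tensor Clipping
Note: all posts in this series are continually updated and published on GitHub, where you can download the notebook files for the whole series.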

1 maximum() and minimum()

maximum() limits the minimum value of a tensor: every element smaller than the given value is replaced by that value:

In [1]:

import tensorflow as tf

In [2]:

a = tf.range(10)
a

Out[2]:

<tf.Tensor: id=3, shape=(10,), dtype=int32, numpy=array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], dtype=int32)>

In [3]:

tf.maximum(a, 4)

Out[3]:

<tf.Tensor: id=5, shape=(10,), dtype=int32, numpy=array([4, 4, 4, 4, 4, 5, 6, 7, 8, 9], dtype=int32)>

In [4]:

b = tf.random.uniform([3,4], minval=1, maxval=10, dtype=tf.int32)
b

Out[4]:

<tf.Tensor: id=9, shape=(3, 4), dtype=int32, numpy= array([[8, 2, 4, 1], [9, 5, 4, 7], [6, 5, 8, 6]], dtype=int32)>

In [5]:

tf.maximum(b, 4)

Out[5]:

<tf.Tensor: id=11, shape=(3, 4), dtype=int32, numpy= array([[8, 4, 4, 4], [9, 5, 4, 7], [6, 5, 8, 6]], dtype=int32)>

minimum() is the opposite of maximum(): it limits the maximum value of a tensor, replacing every element larger than the given value with that value:

In [6]:

tf.minimum(a, 6)

Out[6]:

<tf.Tensor: id=13, shape=(10,), dtype=int32, numpy=array([0, 1, 2, 3, 4, 5, 6, 6, 6, 6], dtype=int32)>

In [7]:

tf.minimum(b, 6)

Out[7]:

<tf.Tensor: id=15, shape=(3, 4), dtype=int32, numpy= array([[6, 2, 4, 1], [6, 5, 4, 6], [6, 5, 6, 6]], dtype=int32)>

To limit both the maximum and the minimum of a tensor at the same time, you can nest the two calls:

In [8]:

tf.minimum(tf.maximum(b,4),6)

Out[8]:

<tf.Tensor: id=19, shape=(3, 4), dtype=int32, numpy= array([[6, 4, 4, 4], [6, 5, 4, 6], [6, 5, 6, 6]], dtype=int32)>

Calling minimum() and maximum() together like this is not very convenient, so TensorFlow provides clip_by_value() for this purpose.

2 clip_by_value()

Under the hood, clip_by_value() also calls minimum() and maximum() to limit the maximum and minimum at the same time. Let's try it out:

In [9]:

b

Out[9]:

<tf.Tensor: id=9, shape=(3, 4), dtype=int32, numpy= array([[8, 2, 4, 1], [9, 5, 4, 7], [6, 5, 8, 6]], dtype=int32)>

In [10]:

tf.clip_by_value(b,4,6)

Out[10]:

<tf.Tensor: id=23, shape=(3, 4), dtype=int32, numpy= array([[6, 4, 4, 4], [6, 5, 4, 6], [6, 5, 6, 6]], dtype=int32)>
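The equivalence between clip_by_value(b, 4, 6) and the nested minimum(maximum(b, 4), 6) call can be checked by hand. The following is a minimal plain-Python sketch of the element-wise arithmetic, not TensorFlow's implementation; the helper name clamp_nested is hypothetical, and the input is the value of b shown in Out[9].

```python
def clamp_nested(rows, low, high):
    """Clamp every element of a nested list into [low, high].

    Mirrors min(max(v, low), high) applied element by element.
    Hypothetical helper, for illustration only.
    """
    return [[min(max(v, low), high) for v in row] for row in rows]


# Values of b copied from Out[9].
b_values = [[8, 2, 4, 1], [9, 5, 4, 7], [6, 5, 8, 6]]

clipped = clamp_nested(b_values, 4, 6)
print(clipped)
```

The printed nested list can be compared row by row with Out[8] and Out[10].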

3 relu()

relu() limits the minimum value of a tensor to 0, which is equivalent to tf.maximum(a, 0). Note that relu() lives in the tf.nn module:

In [11]:

a = tf.range(-5,5,1)
a

Out[11]:

<tf.Tensor: id=27, shape=(10,), dtype=int32, numpy=array([-5, -4, -3, -2, -1, 0, 1, 2, 3, 4], dtype=int32)>

In [12]:

tf.nn.relu(a)

Out[12]:

<tf.Tensor: id=28, shape=(10,), dtype=int32, numpy=array([0, 0, 0, 0, 0, 0, 1, 2, 3, 4], dtype=int32)>

In [13]:

b = tf.random.uniform([3,4],minval=-10, maxval=10, dtype=tf.int32)
b

Out[13]:

<tf.Tensor: id=32, shape=(3, 4), dtype=int32, numpy= array([[-8, -1, -4, 7], [-6, -3, 2, -8], [ 5, 6, 2, 5]], dtype=int32)>

In [14]:

tf.nn.relu(b)

Out[14]:

<tf.Tensor: id=33, shape=(3, 4), dtype=int32, numpy= array([[0, 0, 0, 7], [0, 0, 2, 0], [5, 6, 2, 5]], dtype=int32)>
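Since relu() amounts to an element-wise max(x, 0), the result in Out[14] can be reproduced without TensorFlow. This is a small plain-Python sketch under that assumption; relu_sketch is a hypothetical name, and the input values are copied from Out[13].

```python
def relu_sketch(rows):
    """Replace every negative element of a nested list with 0.

    Hypothetical helper: plain-Python stand-in for an element-wise max(v, 0).
    """
    return [[max(v, 0) for v in row] for row in rows]


# Values of b copied from Out[13].
b_values = [[-8, -1, -4, 7], [-6, -3, 2, -8], [5, 6, 2, 5]]

activated = relu_sketch(b_values)
print(activated)
```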

4 clip_by_norm()

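clip_by_norm() clips a tensor proportionally, based on the tensor's L2 norm (its magnitude) and a given clipping value. This rescales a vector without changing its direction. Let's first implement the process by hand, starting by defining a vector: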

In [15]:

a = tf.random.normal([2,3],mean=10)
a

Out[15]:

<tf.Tensor: id=39, shape=(2, 3), dtype=float32, numpy= array([[ 9.93618 , 10.367402, 9.617832], [ 8.890949, 9.650288, 9.430309]], dtype=float32)>

Then compute the L2 norm of this vector, i.e. its magnitude:

In [16]:

n = tf.norm(a)
n

Out[16]:

<tf.Tensor: id=44, shape=(), dtype=float32, numpy=23.66054>

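Dividing the vector by its norm scales it to unit length, so the absolute value of every element falls in the range 0 to 1: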

In [17]:

a1 = a / n
a1

Out[17]:

<tf.Tensor: id=45, shape=(2, 3), dtype=float32, numpy= array([[0.41994733, 0.43817267, 0.4064925 ], [0.3757712 , 0.4078642 , 0.39856696]], dtype=float32)>

To clip the vector, for example to a norm of 10, multiply the unit vector by the clipping value:

In [18]:

a2 = a1 * 10
a2

Out[18]:

<tf.Tensor: id=47, shape=(2, 3), dtype=float32, numpy= array([[4.1994734, 4.3817267, 4.064925 ], [3.757712 , 4.078642 , 3.9856696]], dtype=float32)>

clip_by_norm() performs exactly these steps:

In [19]:

tf.clip_by_norm(a,10)

Out[19]:

<tf.Tensor: id=63, shape=(2, 3), dtype=float32, numpy= array([[4.1994734, 4.3817267, 4.064925 ], [3.757712 , 4.0786424, 3.9856696]], dtype=float32)>

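clip_by_norm() also performs a check internally: if the given clipping value is greater than the norm of the tensor, the tensor is not modified and is returned as it is. Continuing the example above, the norm of a is 23.66054, so a clipping value greater than this leaves a unchanged: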

In [20]:

tf.clip_by_norm(a,26)

Out[20]:

<tf.Tensor: id=79, shape=(2, 3), dtype=float32, numpy= array([[ 9.936181, 10.367402, 9.617832], [ 8.890949, 9.650288, 9.430309]], dtype=float32)>
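Both behaviours, proportional scaling and the "leave it alone" branch, can be put into one function. The following is a plain-Python sketch of the logic described in this section, not TensorFlow's actual implementation; the names l2_norm and clip_by_norm_sketch are hypothetical, and the input values are copied from Out[15].

```python
import math


def l2_norm(rows):
    """L2 norm over all elements of a nested list."""
    return math.sqrt(sum(v * v for row in rows for v in row))


def clip_by_norm_sketch(rows, clip_norm):
    """Scale rows so that their L2 norm is at most clip_norm.

    Hypothetical helper: if the norm is already within the limit,
    a copy of the input is returned unchanged.
    """
    norm = l2_norm(rows)
    if norm <= clip_norm:
        return [list(row) for row in rows]
    scale = clip_norm / norm
    return [[v * scale for v in row] for row in rows]


# Values of a copied from Out[15].
a_values = [[9.93618, 10.367402, 9.617832],
            [8.890949, 9.650288, 9.430309]]

norm_a = l2_norm(a_values)
clipped_10 = clip_by_norm_sketch(a_values, 10)   # norm exceeds 10: scaled down
clipped_26 = clip_by_norm_sketch(a_values, 26)   # norm below 26: unchanged

print(norm_a)
print(clipped_10)
print(clipped_26 == a_values)
```

Scaling every element by the same factor is what keeps the direction of the vector intact; only its length changes.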

5 clip_by_global_norm()

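In many scenarios, gradient updates among them, several parameters (tensors) have to be considered together, and clip_by_norm() is no longer sufficient. That is why clip_by_global_norm() exists. It works on the same principle as clip_by_norm(), combining a norm with a given clipping value; the difference is that clip_by_global_norm() computes the norm jointly over all the given tensors.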

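Note: clip_by_global_norm() is used to rescale gradient values and keep exploding gradients under control. Exploding and vanishing gradients share the same root cause: the chain rule multiplies many factors together, so gradients can grow (explode) or shrink (vanish) exponentially. To avoid exploding gradients, the gradients need to be clipped.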
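As an illustration of how this is typically used in a training step, here is a hedged sketch. The variables w and bias, the toy loss, the clipping value 1.0 and the learning rate are all assumptions made up for this example; only tf.GradientTape, tf.clip_by_global_norm and tf.linalg.global_norm are TensorFlow APIs.

```python
import tensorflow as tf

# Hypothetical trainable parameters standing in for a model's weights.
w = tf.Variable([3.0, 4.0])
bias = tf.Variable([12.0])

with tf.GradientTape() as tape:
    # Toy loss chosen so the gradients are large:
    # d(loss)/dw = 10 * w and d(loss)/dbias = 10 * bias.
    loss = 5.0 * (tf.reduce_sum(w ** 2) + tf.reduce_sum(bias ** 2))

grads = tape.gradient(loss, [w, bias])

# Rescale all gradients together so that their joint L2 norm is at most 1.0.
clipped_grads, global_norm = tf.clip_by_global_norm(grads, clip_norm=1.0)

# Plain gradient-descent update with the clipped gradients.
learning_rate = 0.1
for var, grad in zip([w, bias], clipped_grads):
    var.assign_sub(learning_rate * grad)

print(float(global_norm))                            # norm before clipping
print(float(tf.linalg.global_norm(clipped_grads)))   # norm after clipping
```

Because all gradients are scaled by one common factor, their relative sizes are preserved; clipping each tensor separately with clip_by_norm() would not guarantee that.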

Take the following three vectors as an example and clip them together:

In [21]:

t1 = tf.random.normal([3],mean=10)
t1

Out[21]:

<tf.Tensor: id=85, shape=(3,), dtype=float32, numpy=array([8.257121, 7.466969, 8.756236], dtype=float32)>

In [22]:

t2 = tf.random.normal([3],mean=10)
t2

Out[22]:

<tf.Tensor: id=91, shape=(3,), dtype=float32, numpy=array([10.112761, 10.555879, 9.646121], dtype=float32)>

In [23]:

t3 = tf.random.normal([3],mean=10)
t3

Out[23]:

<tf.Tensor: id=97, shape=(3,), dtype=float32, numpy=array([9.884818, 8.648524, 9.125227], dtype=float32)>

In [24]:

t_list = [t1,t2,t3]

First compute the global L2 norm, using the formula: global_norm = sqrt(sum([L2norm(t)**2 for t in t_list]))

In [25]:

global_norm = tf.norm([tf.norm(t) for t in t_list])

Suppose the given clipping value is 25:

In [26]:

[t*25/global_norm for t in t_list]

Out[26]:

[<tf.Tensor: id=121, shape=(3,), dtype=float32, numpy=array([7.4725804, 6.757504 , 7.9242725], dtype=float32)>, <tf.Tensor: id=124, shape=(3,), dtype=float32, numpy=array([9.151909, 9.552924, 8.729607], dtype=float32)>, <tf.Tensor: id=127, shape=(3,), dtype=float32, numpy=array([8.945623, 7.826795, 8.258204], dtype=float32)>]

In [27]:

tf.clip_by_global_norm(t_list,25)

Out[27]:

([<tf.Tensor: id=148, shape=(3,), dtype=float32, numpy=array([7.47258 , 6.7575035, 7.9242725], dtype=float32)>, <tf.Tensor: id=149, shape=(3,), dtype=float32, numpy=array([9.151908, 9.552924, 8.729606], dtype=float32)>, <tf.Tensor: id=150, shape=(3,), dtype=float32, numpy=array([8.945623 , 7.8267946, 8.2582035], dtype=float32)>], <tf.Tensor: id=136, shape=(), dtype=float32, numpy=27.624733>)

The results are the same up to floating-point rounding, but clip_by_global_norm() returns two values: the list of clipped tensors and the global norm.
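The manual computation above always rescales, which is only correct because the global norm here (about 27.62) exceeds the clipping value 25. A fuller sketch also has to handle the case where the global norm is already within the limit. The following plain-Python version is an illustration of that logic, not TensorFlow's implementation; clip_by_global_norm_sketch is a hypothetical name, and the inputs are the values of t1, t2 and t3 from Out[21] to Out[23].

```python
import math


def clip_by_global_norm_sketch(vectors, clip_norm):
    """Jointly rescale several vectors so their global L2 norm is at most clip_norm.

    Hypothetical helper. Returns (clipped_vectors, global_norm), where
    global_norm is the norm before clipping.
    """
    global_norm = math.sqrt(sum(v * v for vec in vectors for v in vec))
    # The factor is 1.0 when the global norm is already within the limit.
    scale = clip_norm / max(global_norm, clip_norm)
    clipped = [[v * scale for v in vec] for vec in vectors]
    return clipped, global_norm


# Values copied from Out[21], Out[22] and Out[23].
t_values = [
    [8.257121, 7.466969, 8.756236],
    [10.112761, 10.555879, 9.646121],
    [9.884818, 8.648524, 9.125227],
]

clipped_25, norm_before = clip_by_global_norm_sketch(t_values, 25)
clipped_30, _ = clip_by_global_norm_sketch(t_values, 30)

print(norm_before)
print(clipped_25)
print(clipped_30 == t_values)   # clipping value above the global norm
```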

Original article: https://www.cnblogs.com/chenhuabin/p/11638224.html