
TensorFlow Basics [6]: Shared Variables and Scopes

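Shared variables

tf.Variable creates a unique variable: when TensorFlow detects a name conflict it resolves it automatically, which guarantees uniqueness. In other words, such a variable cannot be reused.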

######################### Variable cannot be shared #########################
import tensorflow as tf     # all examples in this post use the TensorFlow 1.x graph API
v1 = tf.Variable(tf.random_normal((2, 3)), name='v1')
print(v1)       # <tf.Variable 'v1:0' shape=(2, 3) dtype=float32_ref>
v2 = tf.Variable(tf.random_normal((3, 3)), name='v1')
print(v2)       # <tf.Variable 'v1_1:0' shape=(3, 3) dtype=float32_ref>   ### name conflict resolved automatically

Put simply, no two variables created this way can end up being the same variable.
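The automatic renaming can be pictured with a small pure-Python sketch. This is only a toy model of the behaviour seen in the output above, not TensorFlow's implementation; `NameRegistry` and `unique_name` are hypothetical names chosen for illustration.

```python
class NameRegistry:
    """Toy model of automatic name de-duplication (hypothetical, not TensorFlow code)."""

    def __init__(self):
        self._used = {}  # base name -> how many times it has been requested

    def unique_name(self, name):
        count = self._used.get(name, 0)
        self._used[name] = count + 1
        # The first request keeps the name; later requests get a numeric suffix.
        return name if count == 0 else "%s_%d" % (name, count)


registry = NameRegistry()
first = registry.unique_name("v1")
second = registry.unique_name("v1")
print(first, second)  # v1 v1_1
```

The sketch ignores corner cases such as a user explicitly asking for the name v1_1; it is only meant to show why the second variable above is printed as v1_1.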

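tf.get_variable creates shared variables. "Shared" here does not mean sharing as in file sharing, nor sharing as in global variables in programming.

It means that variable A can be handed directly to variable B, so that B shares A. Put simply, the two variables are one and the same.

In practice, get_variable looks up an existing variable by its name. If the lookup succeeds, the variable is shared; if no variable with that name exists, a new one is created. Which of the two happens depends on the reuse setting of the scope, as the later sections show.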

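######################### get_variable shares variables #########################
############# Showing that get_variable only shares under certain conditions #############

### test1: create a variable with tf.Variable
# 1. get_variable cannot fetch this existing variable; it raises an error
# 2. get_variable creates a new variable with the same name; the name conflict is resolved automatically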
v3 = tf.Variable(tf.random_normal((3, 3)), name='v3')
print(v3)       # <tf.Variable 'v3:0' shape=(3, 3) dtype=float32_ref>
# v4 = tf.get_variable(name='v3')         ### raises ValueError: The initializer passed is not valid. It should be a callable with no arguments and the shape should not be provided or an instance of `tf.keras.initializers.*' and `shape` should be fully defined.
v5 = tf.get_variable(name='v3', shape=(2, 2))
print(v5)       # <tf.Variable 'v3_1:0' shape=(2, 2) dtype=float32_ref>

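### test2: create a variable with tf.get_variable
# 1. get_variable cannot fetch this existing variable either; it reports that the variable already exists, so it is trying to create, not fetch
# 2. Creating a new variable with the same name through get_variable also reports that it already exists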
gv1 = tf.get_variable(name='gv1', shape=(2, 3))
print(gv1)      # <tf.Variable 'gv1:0' shape=(2, 3) dtype=float32_ref>
# gv2 = tf.get_variable(name='gv1')      ### raises ValueError: Variable gv1 already exists, disallowed. Did you mean to set reuse=True or reuse=tf.AUTO_REUSE in VarScope? Originally defined at:
# gv3 = tf.get_variable(name='gv1', shape=(2, 2))      ### raises the same "already exists" ValueError

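#### Conclusion: get_variable on its own cannot share variables

Summary

1. A shared variable means two names refer to one and the same variable

2. tf.get_variable cannot fetch a variable created by tf.Variable, because tf.Variable creates unique variables that cannot be shared

3. tf.get_variable cannot directly fetch a variable created by tf.get_variable either, because sharing requires certain conditions

Variable scopes

TensorFlow has a default variable scope, and you can also create new scopes of your own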

##### The default variable scope
print(tf.get_variable_scope().name)         # empty string

##### Explicitly creating a variable scope
with tf.variable_scope('scope1'):
    print(tf.get_variable_scope().name)     # scope1

with tf.variable_scope('scope2') as scope:
    print(tf.get_variable_scope().name)     # scope2

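The condition for tf.get_variable to share a variable is that variables in the scope are reusable

############# Exploring how get_variable shares variables #############
### test1: create a variable with tf.Variable
# 1. Even after making the scope's variables reusable, get_variable cannot fetch this existing variable; it reports that the variable does not exist, because it was not created with tf.get_variable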
v11 = tf.Variable(tf.random_normal((2, 3)), name='v11')
# tf.get_variable_scope().reuse_variables()       ### make variables in this scope reusable
# v12 = tf.get_variable(name='v11')

### test2: create a variable with tf.get_variable
# 1. After making the scope's variables reusable, get_variable can fetch this existing variable
gv11 = tf.get_variable(name='gv11', shape=(2, 3))
print(gv11)     # <tf.Variable 'gv11:0' shape=(2, 3) dtype=float32_ref>
tf.get_variable_scope().reuse_variables()       ### make variables in this scope reusable
gv12 = tf.get_variable(name='gv11')
print(gv12)     # <tf.Variable 'gv11:0' shape=(2, 3) dtype=float32_ref>
print(gv11 is gv12)     # True

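Summary

1. tf.Variable creates unique variables, which can never be shared

2. tf.get_variable can only share variables created by tf.get_variable

3. The condition for tf.get_variable to share a variable is that variables in the scope are reusable

Making variables in a scope reusable

There are several ways, shown below

##### Making variables in a scope reusable - method1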
with tf.variable_scope('test1') as test:
    v1 = tf.get_variable('v1', shape=(2, 3))
    tf.get_variable_scope().reuse_variables()       ### make variables in this scope reusable
    v2 = tf.get_variable('v1')
    print(v1 is v2)         # True

##### Making variables in a scope reusable - method2
with tf.variable_scope('test2') as test:
    v3 = tf.get_variable('v3', shape=(2, 3))
    test.reuse_variables()              ### make variables in this scope reusable
    v4 = tf.get_variable('v3')
    print(v3 is v4)         # True

##### Making variables in a scope reusable - method3
with tf.variable_scope('test3') as test:
    v5 = tf.get_variable('v5', shape=(2, 3))
with tf.variable_scope(test, reuse=True):           ### re-open the scope with reuse enabled
    v6 = tf.get_variable('v5')
    print(v5 is v6)         # True

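Summary

1. get_variable and variable_scope have to be used together

2. Within a single variable_scope block you do not pass reuse=True; instead, call scope.reuse_variables() or tf.get_variable_scope().reuse_variables() once the variable has been created

3. Across separate variable_scope blocks, the first one does not need reuse=True, but the later ones do.

How get_variable works

get_variable checks whether a variable with the given name already exists in the current scope. If it exists and the scope has reuse=True, the existing variable is shared; if it does not exist and reuse is not set, a new one is created.

If the variable exists but reuse=True was not set, a naming conflict is reported. Conversely, with reuse=True a variable that does not exist raises an error instead of being created.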
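The decision logic described above can be sketched as a toy variable store in plain Python. This is an illustrative model under my own assumptions, not TensorFlow's code: `ToyVariableStore` and the string constant `AUTO_REUSE` are hypothetical stand-ins, and a "variable" here is just a dict.

```python
AUTO_REUSE = "auto"  # hypothetical stand-in for tf.AUTO_REUSE in this toy model


class ToyVariableStore:
    """Toy model of the create-or-fetch decision (not TensorFlow code)."""

    def __init__(self):
        self._vars = {}  # full variable name -> variable object

    def get_variable(self, name, shape=None, reuse=False):
        exists = name in self._vars
        if reuse is True:
            # Reuse mode only fetches; it never creates.
            if not exists:
                raise ValueError("Variable %s does not exist" % name)
            return self._vars[name]
        if reuse == AUTO_REUSE and exists:
            # Auto mode fetches when it can and falls through to creation otherwise.
            return self._vars[name]
        if exists:
            # Plain mode only creates; an existing name is a conflict.
            raise ValueError("Variable %s already exists" % name)
        if shape is None:
            raise ValueError("shape is required to create %s" % name)
        self._vars[name] = {"name": name, "shape": tuple(shape)}
        return self._vars[name]


store = ToyVariableStore()
a = store.get_variable("gv1", shape=(2, 3))
b = store.get_variable("gv1", reuse=True)
c = store.get_variable("gv1", reuse=AUTO_REUSE)
print(a is b, a is c)  # True True
```

Calling store.get_variable("gv1", shape=(2, 3)) a second time without reuse raises ValueError in this model, which mirrors the "already exists" error shown earlier.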

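Variable scopes, advanced

Nested scopes

A scope's reuse flag defaults to False; calling reuse_variables() sets it to True.

Once it is set to True it cannot be set back to False inside that scope, and every sub-scope of that scope has reuse=True as well.

If you do not want to reuse variables, go back to the enclosing scope, that is, exit the current scope.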

with tf.variable_scope("root"):
    # At start, the scope is not reusing.
    assert tf.get_variable_scope().reuse == False
    with tf.variable_scope("foo"):
        # Opened a sub-scope, still not reusing.
        assert tf.get_variable_scope().reuse == False
    with tf.variable_scope("foo", reuse=True):
        # Explicitly opened a reusing scope.
        assert tf.get_variable_scope().reuse == True
        with tf.variable_scope("bar"):
            # Now sub-scope inherits the reuse flag.
            assert tf.get_variable_scope().reuse == True
        # with tf.variable_scope("bar2"):
        #     # Now sub-scope inherits the reuse flag.
        #     assert tf.get_variable_scope().reuse == False       # AssertionError
    # Exited the reusing scope, back to a non-reusing one.
    assert tf.get_variable_scope().reuse == False

As the commented-out bar2 block shows, asserting reuse == False inside bar2 fails, because its parent scope has reuse=True.
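The inheritance rule can be mimicked with a small stack of flags. The sketch below is a toy model written for this post, not TensorFlow code; `toy_scope` and `current_reuse` are hypothetical helpers.

```python
import contextlib

_reuse_stack = [False]  # the default (root) scope does not reuse


@contextlib.contextmanager
def toy_scope(reuse=None):
    """Open a toy scope; a reusing parent forces reuse on the child."""
    parent = _reuse_stack[-1]
    _reuse_stack.append(parent or bool(reuse))
    try:
        yield
    finally:
        # Leaving the scope restores the parent's flag.
        _reuse_stack.pop()


def current_reuse():
    return _reuse_stack[-1]


observed = []
with toy_scope():
    observed.append(current_reuse())          # not reusing yet
    with toy_scope(reuse=True):
        observed.append(current_reuse())      # explicitly reusing
        with toy_scope():
            observed.append(current_reuse())  # inherited from the parent
    observed.append(current_reuse())          # back in the non-reusing scope
print(observed)  # [False, True, True, False]
```

Because the flag lives on a stack, the only way to get back to False is to leave the reusing scope, which matches the behaviour of the assertions above.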

Referring to a scope

A scope can be passed as an argument

with tf.variable_scope("foo") as foo_scope:     # capture the scope object
    v = tf.get_variable("v", [1])
with tf.variable_scope(foo_scope):          # pass the scope object as the argument
    w = tf.get_variable("w", [1])
with tf.variable_scope(foo_scope, reuse=True):
    v1 = tf.get_variable("v", [1])
    w1 = tf.get_variable("w", [1])
assert v1 is v
assert w1 is w

with tf.variable_scope('scope1'):
    w1 = tf.Variable(1, name='w1')
    w2 = tf.get_variable(name='w2', initializer=2.)

with tf.variable_scope('scope1', reuse=True):   # the other way: pass the name string
    w1_p = tf.Variable(1, name='w1')
    w2_p = tf.get_variable(name='w2', initializer=2.)

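Either the object captured with as or the plain name string can be passed as the argument. The two behave differently, though, as summarized later.

Jumping between scopes

No matter how scopes are nested, passing an existing scope object to with tf.variable_scope() jumps straight to that scope.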

with tf.variable_scope("foo") as foo_scope:
    assert foo_scope.name == "foo"
with tf.variable_scope("bar"):
    with tf.variable_scope("baz") as other_scope:
        assert other_scope.name == "bar/baz"

        with tf.variable_scope(foo_scope) as foo_scope2:
            print(tf.get_variable_scope().name)     # foo
            assert foo_scope2.name == "foo"         # Not changed
        with tf.variable_scope('foo') as foo_scope3:
            print(tf.get_variable_scope().name)     # bar/baz/foo
            # assert foo_scope3.name == "foo"       # AssertionError

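Here, opening the scope by its name string did not jump, whereas passing the object captured with as did.

Variables under nested scopes

Every variable is identified by scope/variable name, and scopes can be nested like file paths.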

# encoding:utf-8
__author__ = 'HP'
import tensorflow as tf

with tf.variable_scope('s1'):
    x1 = tf.get_variable('data1', [3, 4])
    print(x1)                   # <tf.Variable 's1/data1:0' shape=(3, 4) dtype=float32_ref>
    tf.get_variable_scope().reuse_variables()

    with tf.variable_scope('s11'):
        # x2 = tf.get_variable('data1')           # ValueError: Variable s1/s11/data1 does not exist
        # print(x2)
        pass

        with tf.variable_scope('s1'):
            # x3 = tf.get_variable('data1')       # ValueError: Variable s1/s11/s1/data1 does not exist
            # print(x3)
            pass

    with tf.variable_scope('s1', reuse=True):
        # x4 = tf.get_variable('data1')       # ValueError: Variable s1/s1/data1 does not exist
        # print(x4)
        pass

with tf.variable_scope('s1', reuse=True):
    x5 = tf.get_variable('data1')
    print(x5)                       # <tf.Variable 's1/data1:0' shape=(3, 4) dtype=float32_ref>

with tf.variable_scope('s2'):
    with tf.variable_scope('s1', reuse=True):
        print(tf.get_variable_scope().name)     # s2/s1
        # x6 = tf.get_variable('data1')         # ValueError: Variable s2/s1/data1 does not exist
        # print(x6)

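As you can see, variables behave like files: file a in one folder is not the same file as file a in another folder.

To summarize:

1. A name string only jumps among scopes at the same level, as in the example above.

2. A scope object captured with as jumps to that scope from anywhere.

Note: a name string works like a relative path. It effectively creates a folder with that name under the current directory.

A scope object captured with as works like an absolute path. Wherever it is used, it points to the same location.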

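Name scopes

A name scope is also a kind of scope.

name_scope only affects ordinary operations (tf.Variable included); it has no effect on get_variable.

variable_scope affects ordinary operations and get_variable alike.

The code first: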

with tf.name_scope('name_test'):
    n1 = tf.constant(1, name='cs1')
    n2 = tf.Variable(tf.zeros([1]), name='v1')
    ww1 = tf.multiply(n2, [1])

    nv1 = tf.get_variable(name='nv1', initializer=1.0)

with tf.variable_scope('v_test'):
    v_n1 = tf.constant(2, name='cs2')
    v_n2 = tf.Variable(tf.zeros([1]), name='v2')
    ww2 = tf.multiply(v_n2, [1])

    v1 = tf.get_variable(name='vv1', initializer=2.0)

### name_scope
print('n1', n1.name)        # n1 name_test/cs1:0
print('n2', n2.name)        # n2 name_test/v1:0
print('nv1', nv1.name)      # nv1 nv1:0                 # unlike the two above: name_scope has no effect on get_variable

print('ww1', ww1.name)      # ww1 name_test/Mul:0       # the name_scope prefix is added here too

### variable_scope
print('v_n1', v_n1.name)    # v_n1 v_test/cs2:0
print('v_n2', v_n2.name)    # v_n2 v_test/v2:0
print('v1', v1.name)        # v1   v_test/vv1:0         # same as the two above: variable_scope does affect get_variable

print('ww2', ww2.name)      # ww2 v_test/Mul:0          # the variable_scope prefix is added here too

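By analogy, the Python variable name is like the name of a pointer, and the variable's name attribute is like the address it points to.

Advanced usage of shared variables

On every use after the first, you have to re-open the variable_scope with reuse=True to get the shared variable. That quickly becomes tedious.

A simpler way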

def myfunc():
    with tf.variable_scope('test', reuse=tf.AUTO_REUSE):
        w = tf.get_variable('data', [2, 2])
    return w

for i in range(3):
    print(myfunc())

# <tf.Variable 'test/data:0' shape=(2, 2) dtype=float32_ref>
# <tf.Variable 'test/data:0' shape=(2, 2) dtype=float32_ref>
# <tf.Variable 'test/data:0' shape=(2, 2) dtype=float32_ref>