zoukankan html css js c++ java

机器学习基石作业2

出现noisy时错误的可能性，这题是简单的概率问题，μ是h出错的概率，当y=f(x)时出错：λμ；当otherwise时出错：(1-λ)(1-μ)。

=> λμ+(1-λ)(1-μ)

h的表现和μ无关，λμ+(1-λ)(1-μ)=1-λ-(1-2λ)μ => λ=0.5

根据公式4(2N)^d_VC exp(-1/8 ξ²N)=0.05求得sample size最接近的是460000

 1 import matplotlib.pyplot as plt
 2 import numpy as np
 3 
 4 t = np.arange(-10,10,0.01)
 5 
 6 def original_vc_bound(n):
 7     return np.sqrt(((8.0 / n) * (np.log(80) + 50 * np.log(2 * n))))
 8 
 9 def rademacher_penalty_bound(n):
10     return 1.0 / n + np.sqrt((2.0 / n) * np.log(20)) + np.sqrt((2.0 * (np.log(2 * n) + 50 * np.log(n))) / n)
11 
12 def parrondo_vandenBroek(t, n):
13     return t - np.sqrt(1.0 / n * (2 * t + np.log(120) + 50 * np.log(2 * n)))
14 
15 def devroye(t, n):
16     return t - np.sqrt(1.0 / (2 * n) * (4 * t * (1 + t) + np.log(80) + 100 * np.log(n)))
17 
18 def variant_VC_bound(n):
19     return np.sqrt(16.0 / n * (np.log(2 / np.sqrt(0.05)) + 50 * np.log(n)))
20 
21 print original_vc_bound(10000)
22 print rademacher_penalty_bound(10000)
23 print variant_VC_bound(10000)
24 
25 plt.subplot(211)
26 plt.plot(t,parrondo_vandenBroek(t, 10000))
27 plt.subplot(212)
28 plt.plot(t, devroye(t, 10000))
29 plt.show()
30 
31 print parrondo_vandenBroek(0.331308785962, 10000)
32 print devroye(0.331308785962, 10000)
33 
34 print parrondo_vandenBroek(0.22, 10000)
35 print devroye(0.22, 10000)

parrondo_vandenBroek在图上可以看到大概当x = 0.22时，y < 0

devroye在图书可以看到大概当x = 0.22时，y > 0，x的bound肯定小于0.22

综上Devroye bound smallest

 1 import matplotlib.pyplot as plt
 2 import numpy as np
 3 
 4 t = np.arange(-10,10,0.01)
 5 
 6 def original_vc_bound(n):
 7     return np.sqrt(((8.0 / n) * (np.log(80) + 50 * np.log(2 * n))))
 8 
 9 def rademacher_penalty_bound(n):
10     return 1.0 / n + np.sqrt((2.0 / n) * np.log(20)) + np.sqrt((2.0 * (np.log(2 * n) + 50 * np.log(n))) / n)
11 
12 def parrondo_vandenBroek(t, n):
13     return t - np.sqrt(1.0 / n * (2 * t + np.log(120) + 50 * np.log(2 * n)))
14 
15 def devroye(t, n):
16     return t - np.sqrt(1.0 / (2 * n) * (4 * t * (1 + t) + np.log(80) + 100 * np.log(n)))
17 
18 def variant_VC_bound(n):
19     return np.sqrt(16.0 / n * (np.log(2 / np.sqrt(0.05)) + 50 * np.log(n)))
20 
21 print original_vc_bound(5)
22 print rademacher_penalty_bound(5)
23 print variant_VC_bound(5)
24 
25 plt.subplot(211)
26 plt.plot(t,parrondo_vandenBroek(t, 5))
27 plt.subplot(212)
28 plt.plot(t, devroye(t, 5))
29 plt.show()
30 
31 print parrondo_vandenBroek(7.04877656418, 5)
32 print devroye(7.04877656418, 5)
33 
34 print parrondo_vandenBroek(5.5, 5)
35 print devroye(5.5, 5)

很明显，最tightest的是

parrondo_vandenBroek


此题就是将N个数，拆成positive和negtive,最多只能三部分，排列组合：

C(n-1,2)*2 + C(n-1,1)*2 +2 = n^2_-n+2

由于已经有了m_h^（N)所以直接计算break point，n=3时m=8,n=4时m=14 => k=4 => dvc=3




可以把此题看成点在以a为半径的圆和以b为半径的圆之间时为1，其余情况-1

可将所有情况分为2种，N个点全不在a，b之间和其他 

=> 在n个点的n+1个空间里选2个+n个点之外的任意空间 => C(n+1,2)+1



degree is D，hypothesis parameters c = (c0;c1;····;cd );

=> dVC = D + 1

把[xi > ti]看成一个d维度的决策树，那么可以选出这样2^d个点使得[xi > ti]后，得到决策树的2^d个叶子节点v。可以选择

不同的S，使得2^d个叶子节点v，得到所有2^2^d个结果。即N=2^d时可以shatter，N=2^d+1时，[xi > ti]后最多只能得到

2^d个叶子节点，所有必定有一个重复，所以不管使用什么S，都无法得到2^(2^d+1)个结果，即N=2^d+1时不可shatter，综上

N=2^d+1是break point => dvc =2^d

感觉上由于α是随便取的，每个点都有机会取到两种情况所以2ⁿ，不知道该如何证明。。。。

这题很明显，等于是求每个函数的复杂度，√N^dvc是O(N^(dvc/2))；mh(n/2)是O((n/2)^dvc)；2^dvc；

2ⁱ*mh(n-i)由于指数的存在肯定是大于mh(n)所以至少是O(N^(dvc));

由于N ≥ dvc ≥ 2 => O(N^(dvc)) > O(N^(dvc/2))，O((n/2)^dvc)，2^dvc

=> 2ⁱ*mh(n-i) upper

很明显N是不对的，如果存在成长函数mh(n)=n，那么它的break point就在n=1的时候，那么如果当n=2

时，mh(2)=2，那么其中至少有一个点存在2种情况 => 该点shatter =>矛盾

此时mh(n)=1

intersection，肯定是越交集越小，依题意最小肯定是0，最大肯定也是小于H1~Hk里最小的

union，肯定是越并越大，所以最小也得H1~Hk里最大的，关于为什么不是0老师给了很好的解答

（any single hypothesis is of VC dimension 0，but the union of many hypotheses (which is a hypothesis set) can be of non-zero VC dimension

upper bound比较麻烦，思考了很久，在论坛里看老师和小伙伴的回帖，有了些启发。我们思考这样一种情况H_k

dvc为n，总共有2^n+k-1-(k-1)中情况也就是缺k-1种情况shatter，而这k-1种情况分别是single hypothesis H₁~H_K-1,

很明显unionH₁~H_k的dvc是n+k-1 = sum(dvc)+k-1

并且：不考虑dvc max的H，每个H的情况2^dvc_hi≤ hi ≤ 2^dvc_hi+t_i < 2^dvc_hi⁺¹

因为x>1时，2^x₁*2^x₂ > 2^x₁+2^x₂=> union(hi) ≤ sum(2^dvc_hi+t_i) ≤ ∏(2^dvc_hi+1)

=> dvc(union(hi)) ≤ dvc(∏(2^dvc_hi+1)) = sum(dvc(hi)+1)

=>加上max 后dvc(∏(2^dvc_hi+1)) = sum(dvc(hi))+k-1

当Θ>0时：

当s==1:Eout=((1-Θ)*s*0.2+(1-(1-Θ)*s)*0.8+1*0.2)/2=0.5+0.3*(Θ-1)*s

当s==-1:Eout=(-(1-Θ)*s*0.8+(1+(1-Θ)*s)*0.2+1*0.8)/2=0.5+0.3*(Θ-1)*s

由于Θ<0时和Θ>0时是对称的，综上

=> Eout=0.5+0.3*(|Θ|-1)*s

x = [i / 10.0 for i in range(-10, 11)]
x.remove(0)
y = []
error = 0
best = 100
for i in x:
    if i <= 0:
        y.append(-1)
    else:
        y.append(1)
y[0] = 1
y[9] = 1
y[19] = -1
y[10] = -1

for i in range(5000):
    if i < 2500:
        s = -1
    else:
        s = 1
    theta = -1 + (i % 2500) / 1250.0

    for j in range(20):
        if x[j] > theta:
            t = 1
        else:
            t = -1
        if t * s * y[j] < 0:
            error += 1
    if best > error:
        best = error
    error = 0

print best / 20.0

 1 import math
 2 x = [i / 10.0 for i in range(-10, 11)]
 3 x.remove(0)
 4 y = []
 5 error = 0
 6 best = 100
 7 bestTheta = 0
 8 bestS = 0
 9 
10 for i in x:
11     if i <= 0:
12         y.append(-1)
13     else:
14         y.append(1)
15 y[0] = 1
16 y[9] = 1
17 y[19] = -1
18 y[10] = -1
19 
20 for i in range(5000):
21     if i < 2500:
22         s = -1
23     else:
24         s = 1
25     theta = -1 + (i % 2500) / 1250.0
26 
27     for j in range(20):
28         if x[j] > theta:
29             t = 1
30         else:
31             t = -1
32         if t * s * y[j] < 0:
33             error += 1
34     if best > error:
35         best = error
36         bestTheta = theta
37         bestS = s
38     error = 0
39 
40 print 0.5 + 0.3 * (math.fabs(bestTheta) - 1) * bestS

 1 import math
 2 x1 = []
 3 x2 = []
 4 x3 = []
 5 x4 = []
 6 y = []
 7 file = open("C:/Users/samsung/Desktop/ntumlone_hw1_hw1_18_train.dat")
 8 for line in file:
 9     lines = line.split(' ')
10     x1.append(float(lines[0]))
11     x2.append(float(lines[1]))
12     x3.append(float(lines[2]))
13     x4.append(float(lines[3].split('	')[0]))
14     y.append(int(lines[3].split('	')[1].split('
')[0]))
15 file.close()
16 x = [x1, x2, x3, x4]
17 error = 0
18 best = 10000
19 bestTheta = 0
20 bestS = 0
21 for t in range(4):
22     for i in range(5000):
23         if i < 2500:
24             s = -1
25         else:
26             s = 1
27         theta = 0 + (i % 2500) / 2500.0
28 
29         for j in range(500):
30             if x[t][j] > theta:
31                 a = 1
32             else:
33                 a = -1
34             if a * s * y[j] < 0:
35                 error += 1
36         if best > error:
37             best = error
38             bestTheta = theta
39             bestS = s
40         error = 0
41 
42 print best / 500.0

 1 import math
 2 x1 = []
 3 x2 = []
 4 x3 = []
 5 x4 = []
 6 y = []
 7 file = open("C:/Users/samsung/Desktop/ntumlone_hw1_hw1_18_train.dat")
 8 for line in file:
 9     lines = line.split(' ')
10     x1.append(float(lines[0]))
11     x2.append(float(lines[1]))
12     x3.append(float(lines[2]))
13     x4.append(float(lines[3].split('	')[0]))
14     y.append(int(lines[3].split('	')[1].split('
')[0]))
15 file.close()
16 x = [x1, x2, x3, x4]
17 error = 0
18 best = 10000
19 bestTheta = 0
20 bestS = 0
21 bestT = 0
22 for t in range(4):
23     for i in range(100000):
24         if i < 50000:
25             s = -1
26         else:
27             s = 1
28         theta = 0 + (i % 50000) / 50000.0
29 
30         for j in range(500):
31             if x[t][j] > theta:
32                 a = 1
33             else:
34                 a = -1
35             if a * s * y[j] < 0:
36                 error += 1
37         if best > error:
38             best = error
39             bestTheta = theta
40             bestS = s
41             bestT = t
42         error = 0
43 print bestT
44 print bestTheta
45 print bestS
46 print best / 500.0
47 
48 file = open("C:/Users/samsung/Desktop/ntumlone_hw1_hw1_18_test.dat")
49 xtest = []
50 ytest = []
51 for line in file:
52     lines = line.split(' ')
53     xtest.append(float(lines[3].split('	')[0]))
54     ytest.append(int(lines[3].split('	')[1].split('
')[0]))
55 
56 file.close()
57 for j in range(500):
58     if xtest[j] > bestTheta:
59         a = 1
60     else:
61         a = -1
62     if a * bestS * ytest[j] < 0:
63         error += 1
64 print error / 500.0

查看全文

相关阅读:
列表,字典,元组,集合内置方法
 数据类型内置方法（1）
if判断与while、for循环语句
 与用户交互、格式化输出、基本运算符
 执行python程序的两种方式
 # 操作系统与编程语言分类
 drf框架2-序列化与反序列化
 drf框架1
前端-vue路由传参、axios前后台交互、cookie设置
 前端-vue的配置和使用

原文地址：https://www.cnblogs.com/wanderingzj/p/5000299.html