1. Reference: Introduction to Data Compression (4th edition), page 100
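5. Given the probability model in Table 4-9, find the real-valued tag for the sequence a1a1a3a2a3a1.
Solution: The alphabet is {a1, a2, a3}, with p(a1) = 0.2, p(a2) = 0.3, p(a3) = 0.5.
Define the random variable X(ai) = i, so X(a1) = 1, X(a2) = 2, X(a3) = 3.
The cumulative distribution is FX(0) = 0, FX(1) = 0.2, FX(2) = 0.5, FX(3) = 1.0, and the initial bounds are L(0) = 0, U(0) = 1.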
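The bounds are updated by the following formulas, where xn is the index of the n-th symbol.
Lower bound: L(n) = L(n-1) + (U(n-1) - L(n-1)) FX(xn - 1)
Upper bound: U(n) = L(n-1) + (U(n-1) - L(n-1)) FX(xn)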
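The update rule above can be sketched as a single function. This is a minimal illustration, not part of the textbook solution: the names `CDF` and `update_interval` are made up for this example, and exact fractions are used as an assumption to avoid floating-point rounding.

```python
from fractions import Fraction

# Cumulative distribution from this exercise:
# FX(0) = 0, FX(1) = 0.2, FX(2) = 0.5, FX(3) = 1.0
CDF = [Fraction(0), Fraction(1, 5), Fraction(1, 2), Fraction(1)]

def update_interval(low, high, symbol):
    """Apply one step of the bound update for a symbol index (1, 2, or 3)."""
    width = high - low
    new_low = low + width * CDF[symbol - 1]   # L(n)
    new_high = low + width * CDF[symbol]      # U(n)
    return new_low, new_high

# First symbol a1, starting from L(0) = 0 and U(0) = 1.
low, high = update_interval(Fraction(0), Fraction(1), 1)
print(float(low), float(high))  # 0.0 0.2
```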
First symbol, a1:
L(1) = L(0) + (U(0) - L(0)) FX(0) = 0
U(1) = L(0) + (U(0) - L(0)) FX(1) = 0.2
Second symbol, a1:
L(2) = L(1) + (U(1) - L(1)) FX(0) = 0
U(2) = L(1) + (U(1) - L(1)) FX(1) = 0.04
Third symbol, a3:
L(3) = L(2) + (U(2) - L(2)) FX(2) = 0.02
U(3) = L(2) + (U(2) - L(2)) FX(3) = 0.04
Fourth symbol, a2:
L(4) = L(3) + (U(3) - L(3)) FX(1) = 0.024
U(4) = L(3) + (U(3) - L(3)) FX(2) = 0.03
Fifth symbol, a3:
L(5) = L(4) + (U(4) - L(4)) FX(2) = 0.027
U(5) = L(4) + (U(4) - L(4)) FX(3) = 0.03
Sixth and last symbol, a1:
L(6) = L(5) + (U(5) - L(5)) FX(0) = 0.027
U(6) = L(5) + (U(5) - L(5)) FX(1) = 0.0276
Therefore the real-valued tag of the sequence a1a1a3a2a3a1 is the midpoint of the final interval: T(113231) = (L(6) + U(6))/2 = 0.0273
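The hand calculation for Exercise 5 can be checked with a short script. This is only a sketch under the assumptions of this exercise (three symbols with the cumulative values listed in the solution); the function name `tag_for` is hypothetical, and exact fractions are used so the printed values match the decimals in the solution.

```python
from fractions import Fraction

# Cumulative distribution FX(0..3) for p(a1)=0.2, p(a2)=0.3, p(a3)=0.5
CDF = [Fraction(0), Fraction(1, 5), Fraction(1, 2), Fraction(1)]

def tag_for(sequence):
    """Return (lower bound, upper bound, midpoint tag) for a list of symbol indices."""
    low, high = Fraction(0), Fraction(1)
    for x in sequence:
        width = high - low
        # Both new bounds are computed from the old lower bound.
        low, high = low + width * CDF[x - 1], low + width * CDF[x]
    return low, high, (low + high) / 2

low, high, tag = tag_for([1, 1, 3, 2, 3, 1])
print(float(low), float(high), float(tag))  # 0.027 0.0276 0.0273
```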
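6. For the probability model in Table 4-9, decode a sequence of length 10 with the tag 0.63215699.
Solution: The sequence of length 10 with the tag 0.63215699 is decoded as follows.
The bounds are updated by the following formulas, where xn is the index of the n-th symbol.
Lower bound: L(n) = L(n-1) + (U(n-1) - L(n-1)) FX(xn - 1)
Upper bound: U(n) = L(n-1) + (U(n-1) - L(n-1)) FX(xn)
Start with L(0) = 0 and U(0) = 1. Then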
L(1) = 0 + (1 - 0) FX(x1 - 1) = FX(x1 - 1)
U(1) = 0 + (1 - 0) FX(x1) = FX(x1)
If x1 = 1, the interval is [0, 0.2)
If x1 = 2, the interval is [0.2, 0.5)
If x1 = 3, the interval is [0.5, 1)
Since 0.63215699 lies in [0.5, 1), x1 = 3
L(2) = 0.5 + (1 - 0.5) FX(x2 - 1) = 0.5 + 0.5 FX(x2 - 1)
U(2) = 0.5 + (1 - 0.5) FX(x2) = 0.5 + 0.5 FX(x2)
If x2 = 1, the interval is [0.5, 0.6)
If x2 = 2, the interval is [0.6, 0.75)
If x2 = 3, the interval is [0.75, 1)
Since 0.63215699 lies in [0.6, 0.75), x2 = 2
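The remaining symbols x3 through x10 are found in the same way:
L(3) = 0.6 + (0.75 - 0.6) FX(x3 - 1) = 0.6 + 0.15 FX(x3 - 1)
U(3) = 0.6 + (0.75 - 0.6) FX(x3) = 0.6 + 0.15 FX(x3)
If x3 = 1, the interval is [0.6, 0.63)
If x3 = 2, the interval is [0.63, 0.675)
If x3 = 3, the interval is [0.675, 0.75)
Since 0.63215699 lies in [0.63, 0.675), x3 = 2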
L(4) = 0.63 + (0.675 - 0.63) FX(x4 - 1) = 0.63 + 0.045 FX(x4 - 1)
U(4) = 0.63 + (0.675 - 0.63) FX(x4) = 0.63 + 0.045 FX(x4)
If x4 = 1, the interval is [0.63, 0.639)
If x4 = 2, the interval is [0.639, 0.6525)
If x4 = 3, the interval is [0.6525, 0.675)
Since 0.63215699 lies in [0.63, 0.639), x4 = 1
L(5) = 0.63 + (0.639 - 0.63) FX(x5 - 1) = 0.63 + 0.009 FX(x5 - 1)
U(5) = 0.63 + (0.639 - 0.63) FX(x5) = 0.63 + 0.009 FX(x5)
If x5 = 1, the interval is [0.63, 0.6318)
If x5 = 2, the interval is [0.6318, 0.6345)
If x5 = 3, the interval is [0.6345, 0.639)
Since 0.63215699 lies in [0.6318, 0.6345), x5 = 2
L(6) = 0.6318 + (0.6345 - 0.6318) FX(x6 - 1) = 0.6318 + 0.0027 FX(x6 - 1)
U(6) = 0.6318 + (0.6345 - 0.6318) FX(x6) = 0.6318 + 0.0027 FX(x6)
If x6 = 1, the interval is [0.6318, 0.63234)
If x6 = 2, the interval is [0.63234, 0.63315)
If x6 = 3, the interval is [0.63315, 0.6345)
Since 0.63215699 lies in [0.6318, 0.63234), x6 = 1
L(7) = 0.6318 + (0.63234 - 0.6318) FX(x7 - 1) = 0.6318 + 0.00054 FX(x7 - 1)
U(7) = 0.6318 + (0.63234 - 0.6318) FX(x7) = 0.6318 + 0.00054 FX(x7)
If x7 = 1, the interval is [0.6318, 0.631908)
If x7 = 2, the interval is [0.631908, 0.63207)
If x7 = 3, the interval is [0.63207, 0.63234)
Since 0.63215699 lies in [0.63207, 0.63234), x7 = 3
L(8) = 0.63207 + (0.63234 - 0.63207) FX(x8 - 1) = 0.63207 + 0.00027 FX(x8 - 1)
U(8) = 0.63207 + (0.63234 - 0.63207) FX(x8) = 0.63207 + 0.00027 FX(x8)
If x8 = 1, the interval is [0.63207, 0.632124)
If x8 = 2, the interval is [0.632124, 0.632205)
If x8 = 3, the interval is [0.632205, 0.63234)
Since 0.63215699 lies in [0.632124, 0.632205), x8 = 2
L(9) = 0.632124 + (0.632205 - 0.632124) FX(x9 - 1) = 0.632124 + 0.000081 FX(x9 - 1)
U(9) = 0.632124 + (0.632205 - 0.632124) FX(x9) = 0.632124 + 0.000081 FX(x9)
If x9 = 1, the interval is [0.632124, 0.6321402)
If x9 = 2, the interval is [0.6321402, 0.6321645)
If x9 = 3, the interval is [0.6321645, 0.632205)
Since 0.63215699 lies in [0.6321402, 0.6321645), x9 = 2
L(10) = 0.6321402 + (0.6321645 - 0.6321402) FX(x10 - 1) = 0.6321402 + 0.0000243 FX(x10 - 1)
U(10) = 0.6321402 + (0.6321645 - 0.6321402) FX(x10) = 0.6321402 + 0.0000243 FX(x10)
If x10 = 1, the interval is [0.6321402, 0.63214506)
If x10 = 2, the interval is [0.63214506, 0.63215235)
If x10 = 3, the interval is [0.63215235, 0.6321645)
Since 0.63215699 lies in [0.63215235, 0.6321645), x10 = 3
Therefore the decoded sequence is 3221213223, that is, a3a2a2a1a2a1a3a2a2a3
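The ten decoding steps of Exercise 6 all repeat the same test: split the current interval into three subintervals and pick the one that contains the tag. A minimal sketch of that loop follows. The name `decode` is hypothetical, the model values are those used in the solution, and exact fractions are an assumption made here to avoid floating-point rounding at the interval boundaries.

```python
from fractions import Fraction

# Cumulative distribution FX(0..3) for p(a1)=0.2, p(a2)=0.3, p(a3)=0.5
CDF = [Fraction(0), Fraction(1, 5), Fraction(1, 2), Fraction(1)]

def decode(tag, length):
    """Recover `length` symbol indices from a tag in [0, 1)."""
    low, high = Fraction(0), Fraction(1)
    symbols = []
    for _ in range(length):
        width = high - low
        for x in (1, 2, 3):
            sub_low = low + width * CDF[x - 1]
            sub_high = low + width * CDF[x]
            # Subintervals are half-open: [sub_low, sub_high)
            if sub_low <= tag < sub_high:
                symbols.append(x)
                low, high = sub_low, sub_high
                break
    return symbols

print(decode(Fraction("0.63215699"), 10))  # [3, 2, 2, 1, 2, 1, 3, 2, 2, 3]
```

Passing the tag as a decimal string keeps it exact; the same function applied to the tag 0.0273 with length 6 recovers the sequence from Exercise 5.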