Information Theory is one of the most popular courses at Marjar University. In this course, there is an important chapter about information entropy.
Entropy is the average amount of information contained in each message received. Here, a message stands for an event, a sample, or a character drawn from a distribution or a data stream. Entropy thus characterizes our uncertainty about our source of information; the source is likewise characterized by the probability distribution of the samples drawn from it. The idea is that the less likely an event is, the more information it provides when it occurs.
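Quantitatively, this idea is captured by the information content (self-information) of a single outcome x, I(x) = -log P(x), with the logarithm taken in any fixed base: the smaller P(x) is, the more information the outcome carries.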
Generally, "entropy" stands for "disorder" or uncertainty. The entropy we talk about here was introduced by Claude E. Shannon in his 1948 paper "A Mathematical Theory of Communication". We also call it Shannon entropy or information entropy to distinguish it from other occurrences of the term, which appears in various parts of physics in different forms.
Named after Boltzmann's H-theorem, Shannon defined the entropy H (Greek capital letter eta) of a discrete random variable X with possible values {x1, x2, ..., xn} and probability mass function P(X) as:

H(X) = E[I(X)] = E[-logb(P(X))]

Here E is the expected value operator and I(X) is the information content of X. When taken from a finite sample, the entropy can explicitly be written as

H(X) = -Σ (i = 1 to n) P(xi) logb P(xi)
Where b is the base of the logarithm used. Common values of b are 2, Euler's number e, and 10. The unit of entropy is bit for b = 2, nat for b = e, and dit (or digit) for b = 10 respectively.
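Note that since logb(x) = ln(x) / ln(b), entropies expressed in different units differ only by a constant factor; for instance, an entropy of H bits equals H × ln(2) nats.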
In the case of P(xi) = 0 for some i, the value of the corresponding summand 0 logb(0) is taken to be 0, consistent with the well-known limit:

lim (p → 0+) p logb(p) = 0
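As a quick worked example (this is also the first sample case below): for probabilities 0.25, 0.25 and 0.5 measured in bits,

H = -(0.25 log2 0.25 + 0.25 log2 0.25 + 0.5 log2 0.5) = 0.5 + 0.5 + 0.5 = 1.5 bits.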
Your task is to calculate the entropy of a finite sample with N values.
Input
There are multiple test cases. The first line of input contains an integer T indicating the number of test cases. For each test case:
The first line contains an integer N (1 <= N <= 100) and a string S. The string S is one of "bit", "nat" or "dit", indicating the unit of entropy.
In the next line, there are N non-negative integers P1, P2, ..., PN. Pi is the probability of the i-th value expressed as a percentage, and the sum of all Pi is 100.
Output
For each test case, output the entropy in the corresponding unit.
Any solution with a relative or absolute error of at most 10^-8 will be accepted.
Sample Input
3
3 bit
25 25 50
7 nat
1 2 4 8 16 32 37
10 dit
10 10 10 10 10 10 10 10 10 10
Sample Output
1.500000000000
1.480810832465
1.000000000000
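As a sanity check, the third case is a uniform distribution over 10 values, so its entropy is log10(10) = 1 dit exactly, matching the output above; the first case is the 1.5-bit example worked out earlier.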
This was the second giveaway problem of the Mudanjiang regional contest. It is not difficult: soon after problem A started getting accepted, nearly everyone began getting problem I accepted quickly as well. The statement looks hard to understand, but once you get through it, this turns out to be mostly an English reading exercise. You are given n numbers a1, a2, ..., an; the answer is the sum of -(ai/sum) × log(ai/sum) over all i, where the logarithm is base 2 for "bit", base e for "nat", and base 10 for "dit". As the limit above shows, terms with ai = 0 contribute nothing to the result, so they can simply be skipped; plugging everything in directly is enough to get AC. The AC code is as follows:
#include <cstdio>
#include <iostream>
#include <cmath>
using namespace std;

int a[105];

int main()
{
    int t;
    cin >> t;
    while (t--)
    {
        int n;
        string s;
        scanf("%d", &n);
        cin >> s;                          // unit: "bit", "nat" or "dit"
        int sum = 0;
        for (int i = 0; i < n; i++)
        {
            scanf("%d", &a[i]);
            sum += a[i];                   // total is guaranteed to be 100
        }
        double res = 0;
        if (s == "bit")                    // base-2 logarithm
        {
            for (int i = 0; i < n; i++)
            {
                if (a[i] == 0) continue;   // 0 * log(0) contributes nothing
                res += -log2(a[i] * 1.0 / sum) * a[i] / sum;
            }
        }
        else if (s == "nat")               // natural logarithm
        {
            for (int i = 0; i < n; i++)
            {
                if (a[i] == 0) continue;
                res += -log(a[i] * 1.0 / sum) * a[i] / sum;
            }
        }
        else                               // "dit": base-10 logarithm
        {
            for (int i = 0; i < n; i++)
            {
                if (a[i] == 0) continue;
                res += -log10(a[i] * 1.0 / sum) * a[i] / sum;
            }
        }
        printf("%.12f\n", res);
    }
    return 0;
}
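As a side note, the three branches differ only in the base of the logarithm, so they can be collapsed using the identity logb(x) = ln(x) / ln(b). Below is a minimal sketch of that variant — not the original AC code — which also exploits the input guarantee that the Pi sum to 100:

#include <cmath>
#include <cstdio>
#include <cstring>

int main()
{
    int t;
    scanf("%d", &t);
    while (t--)
    {
        int n;
        char unit[8];
        scanf("%d%s", &n, unit);
        // Choose the logarithm base from the unit name.
        double base = strcmp(unit, "bit") == 0 ? 2.0
                    : strcmp(unit, "nat") == 0 ? exp(1.0)
                    : 10.0;
        double res = 0;
        for (int i = 0; i < n; i++)
        {
            int p;
            scanf("%d", &p);
            if (p == 0) continue;           // 0 * log(0) is taken to be 0
            double q = p / 100.0;           // Pi is a percentage; they sum to 100
            res -= q * log(q) / log(base);  // accumulate -P(xi) * logb P(xi)
        }
        printf("%.12f\n", res);
    }
    return 0;
}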