Description
The relative frequency of characters in natural language texts is very important for cryptography. However, the statistics vary for different languages. Here are the top 9 characters sorted by their relative frequencies for several common languages:
English: ETAOINSHR
German: ENIRSATUD
French: EAISTNRUL
Spanish: EAOSNRILD
Italian: EAIONLRTS
Finnish: AITNESLOK
Just as important as the relative frequencies of single characters are those of pairs of characters, so-called digrams. Given several text samples, calculate the digrams with the top relative frequencies.
Input
The input contains several test cases. Each starts with a number n on a separate line, denoting the number of lines of the test case. The input is terminated by n=0. Otherwise, 1<=n<=64, and there follow n lines, each with a maximal length of 80 characters. The concatenation of these n lines, where the end-of-line characters are omitted, gives the text sample you have to examine. The text sample will contain printable ASCII characters only.
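The concatenation rule means a digram can straddle a line break: the last character of one line pairs with the first character of the next. A minimal sketch of the counting step, assuming a hypothetical helper named countDigrams and a std::map for storage (neither is part of the problem statement):

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: count every pair of adjacent characters in the
// concatenation of `lines`. Line breaks are dropped, as the statement requires.
std::map<std::string, int> countDigrams(const std::vector<std::string>& lines) {
    std::string text;
    for (const std::string& line : lines)
        text += line;  // no '\n' is inserted between lines

    std::map<std::string, int> counts;
    for (std::size_t i = 0; i + 1 < text.size(); ++i)
        ++counts[text.substr(i, 2)];  // digram starting at position i
    return counts;
}
```

Called on the two lines of the first sample, this counts "!!" three times, because the two exclamation marks ending the first line join the two that start the second.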
Output
For each test case generate 5 lines containing the top 5 digrams together with their absolute and relative frequencies. Output the latter rounded to a precision of 6 decimal places. If two digrams should have the same frequency, sort them in (ASCII) lexicographical order. Output a blank line after each test case.
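The ordering and rounding rules can be sketched as follows. The helper name formatTop and its parameters are assumptions made for illustration; `total` is taken to be the number of digrams in the sample, that is, the text length minus one.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <map>
#include <string>
#include <utility>
#include <vector>

typedef std::pair<std::string, int> Entry;  // (digram, absolute frequency)

// Higher count first; equal counts fall back to ASCII order of the digram.
bool byCountThenAscii(const Entry& a, const Entry& b) {
    if (a.second != b.second) return a.second > b.second;
    return a.first < b.first;
}

// Hypothetical helper: format the k most frequent digrams as
// "<digram> <count> <relative frequency>" with 6 decimal places.
std::vector<std::string> formatTop(const std::map<std::string, int>& counts,
                                   int total, std::size_t k) {
    std::vector<Entry> items(counts.begin(), counts.end());
    std::sort(items.begin(), items.end(), byCountThenAscii);

    std::vector<std::string> out;
    for (std::size_t i = 0; i < items.size() && i < k; ++i) {
        char buf[64];
        std::snprintf(buf, sizeof buf, "%s %d %.6f", items[i].first.c_str(),
                      items[i].second,
                      static_cast<double>(items[i].second) / total);
        out.push_back(buf);
    }
    return out;
}
```

This sketch only ranks digrams that actually occur; a text with fewer than five distinct digrams would need the zero-count pairs added to the map first.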
Sample Input
2
Take a look at this!!
!!siht ta kool a ekaT
5
P=NP
 Authors: A. Cookie, N. D. Fortune, L. Shalom
 Abstract: We give a PTAS algorithm for MaxSAT and apply the PCP-Theorem [3]
 Let F be a set of clauses. The following PTAS algorithm gives an optimal
 assignment for F:
0
Sample Output
 a 3 0.073171
!! 3 0.073171
a  3 0.073171
 t 2 0.048780
oo 2 0.048780

 a 8 0.037209
or 7 0.032558
.  5 0.023256
e  5 0.023256
al 4 0.018605
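As a quick check of the relative frequencies: a text of L characters contains L - 1 digrams, so the relative frequency of a digram is its count divided by L - 1. The two lines of the first sample have 21 characters each, giving L = 42 and 41 digrams. The values printed for the second sample are consistent with 215 digrams.

```latex
\[
f(d) = \frac{\operatorname{count}(d)}{L - 1}, \qquad
\frac{3}{41} \approx 0.073171, \qquad
\frac{2}{41} \approx 0.048780, \qquad
\frac{8}{215} \approx 0.037209
\]
```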
***********************************************************************************************************
Storage is very important.
***********************************************************************************************************
#include <algorithm>
#include <cstdio>
#include <iostream>
#include <string>
using namespace std;

// One entry per possible digram (ch1, ch2) over 7-bit ASCII.
struct node
{
    char ch1;
    char ch2;
    int num;
} p[128 * 128];

// Higher count first; ties are broken by ASCII order of the two characters.
bool cmp(const node &a, const node &b)
{
    if (a.num != b.num)
        return a.num > b.num;
    if (a.ch1 != b.ch1)
        return a.ch1 < b.ch1;
    return a.ch2 < b.ch2;
}

int main()
{
    int n;
    string str;
    while (cin >> n && n)
    {
        getline(cin, str); // skip the rest of the line holding n
        for (int i = 0; i < 128; i++)
            for (int j = 0; j < 128; j++)
            {
                p[i * 128 + j].ch1 = i;
                p[i * 128 + j].ch2 = j;
                p[i * 128 + j].num = 0;
            }
        int last = -1; // previous character of the sample, -1 if none yet
        int tot = 0;   // total number of digrams in the sample
        for (int i = 0; i < n; i++)
        {
            getline(cin, str); // replaces gets(), which was removed from C++
            for (size_t j = 0; j < str.size(); j++)
            {
                int cur = str[j];
                if (last >= 0) // also pairs across line breaks, skips empty lines safely
                {
                    p[last * 128 + cur].num++;
                    tot++;
                }
                last = cur;
            }
        }
        sort(p, p + 128 * 128, cmp); // exactly 128 * 128 entries are in use
        for (int i = 0; i < 5; i++)
            printf("%c%c %d %.6f\n", p[i].ch1, p[i].ch2, p[i].num,
                   (double)p[i].num / tot);
        printf("\n"); // blank line after each test case
    }
    return 0;
}