
HDU 1686 Oulipo: KMP template problem / hashing template problem

The French author Georges Perec (1936–1982) once wrote a book, La disparition, without the letter 'e'. He was a member of the Oulipo group. A quote from the book:

Tout avait l’air normal, mais tout s’affirmait faux. Tout avait l’air normal, d’abord, puis surgissait l’inhumain, l’affolant. Il aurait voulu savoir où s’articulait l’association qui l’unissait au roman : sur son tapis, assaillant à tout instant son imagination, l’intuition d’un tabou, la vision d’un mal obscur, d’un quoi vacant, d’un non-dit : la vision, l’avision d’un oubli commandant tout, où s’abolissait la raison : tout avait l’air normal mais…

Perec would probably have scored high (or rather, low) in the following contest. People are asked to write a perhaps even meaningful text on some subject with as few occurrences of a given “word” as possible. Our task is to provide the jury with a program that counts these occurrences, in order to obtain a ranking of the competitors. These competitors often write very long texts with nonsense meaning; a sequence of 500,000 consecutive 'T's is not unusual. And they never use spaces.

So we want to quickly find out how often a word, i.e., a given string, occurs in a text. More formally: given the alphabet {'A', 'B', 'C', …, 'Z'} and two finite strings over that alphabet, a word W and a text T, count the number of occurrences of W in T. All the consecutive characters of W must exactly match consecutive characters of T. Occurrences may overlap.
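
As a reference for what "occurrences may overlap" means, here is a minimal brute-force sketch. The function name `countNaive` is made up for illustration, and this is not one of the solutions discussed in this post: it tries every start position, so it takes O(|W| · |T|) time in the worst case and is only meant for checking small inputs by hand.

```cpp
#include <cstddef>
#include <string>

// Brute-force reference counter (hypothetical helper, for illustration only).
// Every start position in the text is tried, so overlapping matches are counted.
int countNaive(const std::string& w, const std::string& t) {
    if (w.empty() || w.size() > t.size()) return 0;
    int count = 0;
    for (std::size_t start = 0; start + w.size() <= t.size(); ++start) {
        // compare() returns 0 when t[start .. start+|w|) equals w
        if (t.compare(start, w.size(), w) == 0) ++count;
    }
    return count;
}
```

For W = AZA and T = AZAZAZA, matches start at positions 0, 2 and 4, so `countNaive("AZA", "AZAZAZA")` returns 3.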

Input

The first line of the input file contains a single number: the number of test cases to follow. Each test case has the following format:

One line with the word W, a string over {'A', 'B', 'C', …, 'Z'}, with 1 ≤ |W| ≤ 10,000 (here |W| denotes the length of the string W).
One line with the text T, a string over {'A', 'B', 'C', …, 'Z'}, with |W| ≤ |T| ≤ 1,000,000.

Output

For every test case in the input file, the output should contain a single number, on a single line: the number of occurrences of the word W in the text T.

Sample Input

3
BAPC
BAPC
AZA
AZAZAZA
VERDI
AVERDXIVYERDIAN

Sample Output

1
3
0


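Problem: count how many times the pattern string occurs in the text.

Understanding the next array is the key to understanding KMP.
next[i]: the length of the longest proper prefix of the first i characters of the pattern that is also a suffix of those i characters.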
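
To make the definition concrete, here is a small sketch that builds the array for an arbitrary pattern. It follows the same convention as the KMP code in this post (next[0] = -1 as a sentinel, next[i] describing the first i characters); the function name `buildNext` and the use of `std::vector` are choices made for this illustration.

```cpp
#include <string>
#include <vector>

// Builds the KMP next array with next[0] = -1.
// next[i] is the length of the longest proper prefix of pattern[0..i)
// that is also a suffix of pattern[0..i).
std::vector<int> buildNext(const std::string& pattern) {
    const int len = static_cast<int>(pattern.size());
    std::vector<int> next(len + 1, 0);
    next[0] = -1;
    int i = 0, j = -1;
    while (i < len) {
        if (j < 0 || pattern[i] == pattern[j]) {
            ++i;
            ++j;
            next[i] = j;   // the matched border grows by one character
        } else {
            j = next[j];   // fall back to the next shorter border
        }
    }
    return next;
}
```

For the pattern AZA this gives {-1, 0, 0, 1}: the prefix "AZA" has the border "A" of length 1. That is why, after a full match in AZAZAZA, the search can continue with one character already matched instead of starting over.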


KMP:

#include<stdio.h>
#include<string.h>

const int N=1000020;
const int M=10020;

char s[N];//text
char t[M];//pattern
int nextt[M];

void getnext(int len)//builds the next array of the pattern
{
    int i=0,j=-1;
    nextt[0]=-1;
    while(i<len)
    {
        if(j<0||t[i]==t[j])
            nextt[++i]=++j;
        else
            j=nextt[j];
    }
}

int kmp(int m,int n)//m: pattern length, n: text length
{
    int i=0,j=0,ans=0;
    while(i<n)
    {
        if(j==-1||t[j]==s[i])
        {
            i++;
            j++;
        }
        else
            j=nextt[j];
        if(j==m)
        {
            ans++;
            j=nextt[j];//fall back to the longest border so overlapping matches are counted
        }
    }
    return ans;
}

int main()
{
    int tt;
    scanf("%d",&tt);
    while(tt--)
    {
        //scanf("%s") writes the terminating '\0' itself, so these resets are not strictly required
        memset(s,'\0',sizeof(s));
        memset(t,'\0',sizeof(t));
        memset(nextt,0,sizeof(nextt));
        scanf("%s%s",t,s);//pattern, then text
        int len1=strlen(t);//pattern length
        int len2=strlen(s);//text length
        getnext(len1);
        printf("%d\n",kmp(len1,len2));
    }
    return 0;
}

Hashing:

#include<stdio.h>
#include<string.h>

typedef unsigned long long ull;
const int N=1e6+20;

char a[N],b[N];
ull p[N],sum[N],x=131;
//count how many times a (the pattern) occurs in b (the text)

void w()
{
    p[0]=1;
    for(int i=1; i<1000000; i++)
        p[i]=p[i-1]*x;//precompute the powers x^i
}

int main()
{
    w();
    int t;
    scanf("%d",&t);
    while(t--)
    {
        scanf("%s %s",a+1,b+1);//store both strings starting at index 1
        int la=strlen(a+1);//pattern (short)
        int lb=strlen(b+1);//text (long)
        sum[0]=0;
        for(int i=1; i<=lb; i++)
            sum[i]=sum[i-1]*x+(ull)(b[i]-'A'+1);//prefix hashes of the text
        ull s=0;
        for(int i=1; i<=la; i++)
            s=s*x+(ull)(a[i]-'A'+1);//multiplying by x treats the string as a base-x number
        int ans=0;
        for(int i=0; i<=lb-la; i++)
        {
            //hash of b[i+1..i+la], computed modulo 2^64 by unsigned overflow
            if(s==sum[i+la]-sum[i]*p[la])
                ans++;
        }
        printf("%d\n",ans);
    }
    return 0;
}

I don't understand why the hashing solution is slower than KMP and uses nearly 10 times as much memory.
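
A likely explanation for the memory gap: the hashing code keeps two arrays of about 1,000,000 `unsigned long long` values, `p` and `sum`, at 8 bytes per entry, which is roughly 16 MB, plus about 1 MB each for `a` and `b`. The KMP code needs only about 1 MB for the text and two small arrays for the pattern. The extra time plausibly comes from precomputing a million powers when only `p[la]` is used, and from the additional pass that fills `sum`. The sketch below is one possible alternative, not the original author's method: a rolling hash that keeps only the current window hash, so no prefix or power arrays are needed. The name `countByRollingHash` is made up for illustration. Like the code above, it treats equal hashes as a match and does not verify the characters, so a collision could in principle produce a false count.

```cpp
#include <cstddef>
#include <string>

typedef unsigned long long ull;

// Counts occurrences of w in t with a rolling hash modulo 2^64
// (unsigned overflow). Uses O(1) extra memory besides the strings.
// Assumes both strings contain only 'A'..'Z', as in the problem statement.
int countByRollingHash(const std::string& w, const std::string& t) {
    const ull base = 131;
    const std::size_t m = w.size(), n = t.size();
    if (m == 0 || m > n) return 0;

    ull target = 0;  // hash of the pattern
    ull window = 0;  // hash of the current length-m window of the text
    ull top = 1;     // base^(m-1), the weight of the window's first character
    for (std::size_t i = 0; i < m; ++i) {
        target = target * base + (ull)(w[i] - 'A' + 1);
        window = window * base + (ull)(t[i] - 'A' + 1);
        if (i + 1 < m) top *= base;
    }

    int count = (window == target) ? 1 : 0;
    for (std::size_t i = m; i < n; ++i) {
        window -= (ull)(t[i - m] - 'A' + 1) * top;       // drop the leftmost character
        window = window * base + (ull)(t[i] - 'A' + 1);  // append the new character
        if (window == target) ++count;
    }
    return count;
}
```

On the sample input, `countByRollingHash("AZA", "AZAZAZA")` returns 3, the same as the KMP solution.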

Original post: https://www.cnblogs.com/OFSHK/p/11390965.html