  • HDU 1686 Oulipo: KMP template problem / hash template problem

    The French author Georges Perec (1936–1982) once wrote a book, La disparition, without the letter 'e'. He was a member of the Oulipo group. A quote from the book: 

    Tout avait l’air normal, mais tout s’affirmait faux. Tout avait l’air normal, d’abord, puis surgissait l’inhumain, l’affolant. Il aurait voulu savoir où s’articulait l’association qui l’unissait au roman : sur son tapis, assaillant à tout instant son imagination, l’intuition d’un tabou, la vision d’un mal obscur, d’un quoi vacant, d’un non-dit : la vision, l’avision d’un oubli commandant tout, où s’abolissait la raison : tout avait l’air normal mais… 

    Perec would probably have scored high (or rather, low) in the following contest. People are asked to write a perhaps even meaningful text on some subject with as few occurrences of a given “word” as possible. Our task is to provide the jury with a program that counts these occurrences, in order to obtain a ranking of the competitors. These competitors often write very long texts with nonsense meaning; a sequence of 500,000 consecutive 'T's is not unusual. And they never use spaces. 

    So we want to quickly find out how often a word, i.e., a given string, occurs in a text. More formally: given the alphabet {'A', 'B', 'C', …, 'Z'} and two finite strings over that alphabet, a word W and a text T, count the number of occurrences of W in T. All the consecutive characters of W must exactly match consecutive characters of T. Occurrences may overlap. 

    Input

    The first line of the input file contains a single number: the number of test cases to follow. Each test case has the following format: 

    One line with the word W, a string over {'A', 'B', 'C', …, 'Z'}, with 1 ≤ |W| ≤ 10,000 (here |W| denotes the length of the string W). 
    One line with the text T, a string over {'A', 'B', 'C', …, 'Z'}, with |W| ≤ |T| ≤ 1,000,000. 

    Output

    For every test case in the input file, the output should contain a single number, on a single line: the number of occurrences of the word W in the text T. 

    Sample Input

    3
    BAPC
    BAPC
    AZA
    AZAZAZA
    VERDI
    AVERDXIVYERDIAN

    Sample Output

    1
    3
    0
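
As a sanity check on the statement above, a brute-force counter is enough to reproduce the sample output, including the overlapping matches in the second case. This is only a minimal reference sketch: `count_naive` is a hypothetical helper that does not appear in the original post, and its O(|W|·|T|) running time makes it unsuitable for the full limits (for example, a text of 500,000 'T's).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical reference implementation (not from the original post).
// Tries every start position in t and compares |w| characters, so
// overlapping occurrences are counted naturally. Cost: O(|w| * |t|).
int count_naive(const std::string& w, const std::string& t) {
    if (w.empty() || w.size() > t.size()) return 0;
    int count = 0;
    for (std::size_t start = 0; start + w.size() <= t.size(); ++start) {
        if (t.compare(start, w.size(), w) == 0) ++count;
    }
    return count;
}
```

Such a slow counter is mainly useful for cross-checking a faster solution on small random inputs.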


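    Problem: count how many times the pattern string occurs in the text.

    Understanding the next array is the key to understanding KMP.
    next[i] records, for the first i characters of the pattern, the length of the longest proper prefix that is also a suffix.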
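
To make the definition concrete, the sketch below builds the next array for a short pattern using the same convention as the KMP code in this post (next[0] = -1, and the array has one more entry than the pattern). The name `build_next` and the use of `std::string` and `std::vector` are assumptions made for illustration; the post itself works on global char arrays.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post).
// Returns nxt of size |w| + 1 where nxt[0] = -1 and, for i >= 1,
// nxt[i] is the length of the longest proper prefix of w[0..i-1]
// that is also a suffix of w[0..i-1].
std::vector<int> build_next(const std::string& w) {
    const int len = static_cast<int>(w.size());
    std::vector<int> nxt(len + 1);
    nxt[0] = -1;
    int i = 0, j = -1;
    while (i < len) {
        if (j < 0 || w[i] == w[j]) {
            ++i;
            ++j;
            nxt[i] = j;   // prefix of length j matches a suffix of w[0..i-1]
        } else {
            j = nxt[j];   // fall back to the next shorter candidate border
        }
    }
    return nxt;
}
```

For the pattern "AZAZA" this gives -1 0 0 1 2 3: for instance, "AZAZ" has the border "AZ" (length 2) and "AZAZA" has the border "AZA" (length 3). The last entry is what lets the search continue after a full match without missing overlapping occurrences.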

    KMP:
    
    
    
     
    #include <stdio.h>
    #include <string.h>

    const int N = 1000020;
    const int M = 10020;

    char s[N];    // text
    char t[M];    // pattern
    int nextt[M]; // next array of the pattern, entries 0..len

    // Build the next array of the pattern t (len = pattern length).
    void getnext(int len)
    {
        int i = 0, j = -1;
        nextt[0] = -1;
        while (i < len)
        {
            if (j < 0 || t[i] == t[j])
                nextt[++i] = ++j;
            else
                j = nextt[j];
        }
    }

    // Count occurrences of t in s. m = pattern length, n = text length.
    int kmp(int m, int n)
    {
        int i = 0, j = 0, ans = 0;
        while (i < n)
        {
            if (j == -1 || t[j] == s[i])
            {
                i++;
                j++;
            }
            else
                j = nextt[j];
            if (j == m)
            {
                ans++;
                j = nextt[j]; // fall back so that overlapping matches are counted
            }
        }
        return ans;
    }

    int main()
    {
        int tt;
        scanf("%d", &tt);
        while (tt--)
        {
            // No memset is needed: scanf null-terminates both strings
            // and getnext overwrites nextt[0..len1].
            scanf("%s%s", t, s);  // pattern first, then text
            int len1 = strlen(t); // pattern length
            int len2 = strlen(s); // text length
            getnext(len1);
            printf("%d\n", kmp(len1, len2));
        }
        return 0;
    }

    Hash:

    #include <stdio.h>
    #include <string.h>

    typedef unsigned long long ull;
    const int N = 1e6 + 20;

    char a[N], b[N];
    ull p[N], sum[N], x = 131;
    // Count how many times a (the pattern) occurs in b (the text).

    // Precompute the powers x^n; overflow wraps modulo 2^64.
    void w()
    {
        p[0] = 1;
        for (int i = 1; i < N; i++)
            p[i] = p[i - 1] * x;
    }

    int main()
    {
        w();
        int t;
        scanf("%d", &t);
        while (t--)
        {
            scanf("%s %s", a + 1, b + 1); // read at offset 1 so indices start at 1
            int la = strlen(a + 1);       // pattern length
            int lb = strlen(b + 1);       // text length
            sum[0] = 0;
            // sum[i] = hash of the prefix b[1..i]
            for (int i = 1; i <= lb; i++)
                sum[i] = sum[i - 1] * x + (ull)(b[i] - 'A' + 1);
            ull s = 0;
            // Multiplying by x treats the string as a number in base x.
            for (int i = 1; i <= la; i++)
                s = s * x + (ull)(a[i] - 'A' + 1);
            int ans = 0;
            for (int i = 0; i <= lb - la; i++)
            {
                // hash of b[i+1..i+la] compared with the pattern hash
                if (s == sum[i + la] - sum[i] * p[la])
                    ans++;
            }
            printf("%d\n", ans);
        }
        return 0;
    }
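
The comparison inside the last loop relies on the usual prefix-hash identity. Writing v(c) = c - 'A' + 1 for the value of a letter, and taking all arithmetic modulo 2^64 (which is what unsigned long long overflow does), the derivation can be sketched as follows. The symbols v and b_j are introduced here for illustration only.

```latex
\begin{align*}
\mathrm{sum}[k] &\equiv \sum_{j=1}^{k} v(b_j)\, x^{\,k-j} \pmod{2^{64}}, \\
\mathrm{sum}[i]\cdot p[la] &\equiv \sum_{j=1}^{i} v(b_j)\, x^{\,i+la-j} \pmod{2^{64}}, \\
\mathrm{sum}[i+la] - \mathrm{sum}[i]\cdot p[la] &\equiv \sum_{j=i+1}^{i+la} v(b_j)\, x^{\,i+la-j} \pmod{2^{64}}.
\end{align*}
```

The last line is exactly the hash of the substring b[i+1..i+la], computed the same way as the pattern hash s. Equal substrings always give equal hashes; the converse is only very likely, since two different substrings can collide.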

    I don't understand why the hash version runs slower than KMP and uses nearly 10 times as much memory.
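
A plausible explanation for the memory gap lies in the array declarations. The hash program keeps two unsigned long long arrays of about 10^6 entries (p and sum, roughly 8 MB each) plus two char arrays of about 1 MB each, while the KMP program needs only the 1 MB text buffer and two small pattern-sized arrays. The extra time is probably a constant-factor effect: the power table is filled for all 10^6 exponents and the prefix array is rebuilt for every test case. The sketch below is one way to avoid both arrays with a rolling window hash; `count_rolling` is a hypothetical name, and the base 131 and letter mapping are copied from the post.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

typedef unsigned long long ull;

// Hypothetical alternative (not from the original post): slides one window
// of length |w| over t and updates its hash in O(1) per step, so no prefix
// or power arrays are stored. Arithmetic wraps modulo 2^64, as in the post.
int count_rolling(const std::string& w, const std::string& t) {
    const ull x = 131;
    const std::size_t m = w.size(), n = t.size();
    if (m == 0 || m > n) return 0;

    ull target = 0, window = 0, top = 1; // top becomes x^(m-1)
    for (std::size_t i = 0; i < m; ++i) {
        target = target * x + (ull)(w[i] - 'A' + 1);
        window = window * x + (ull)(t[i] - 'A' + 1);
        if (i + 1 < m) top *= x;
    }

    int count = (window == target) ? 1 : 0;
    for (std::size_t i = m; i < n; ++i) {
        // Remove the leading character t[i-m], shift, append t[i].
        window = (window - (ull)(t[i - m] - 'A' + 1) * top) * x
                 + (ull)(t[i] - 'A' + 1);
        if (window == target) ++count;
    }
    return count;
}
```

Like the prefix-hash version, this counts hash matches without verifying the characters, so it carries the same small risk of collisions.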

  • Original post: https://www.cnblogs.com/OFSHK/p/11390965.html