  • HDU 1686 Oulipo: count how many times a short string occurs in a long string, overlaps included (KMP)

    http://acm.hdu.edu.cn/showproblem.php?pid=1686

    Oulipo

    Time Limit: 3000/1000 MS (Java/Others)    Memory Limit: 32768/32768 K (Java/Others)
    Total Submission(s): 6098    Accepted Submission(s): 2448


    Problem Description
    The French author Georges Perec (1936–1982) once wrote a book, La disparition, without the letter 'e'. He was a member of the Oulipo group. A quote from the book:

    Tout avait l’air normal, mais tout s’affirmait faux. Tout avait l’air normal, d’abord, puis surgissait l’inhumain, l’affolant. Il aurait voulu savoir où s’articulait l’association qui l’unissait au roman : sur son tapis, assaillant à tout instant son imagination, l’intuition d’un tabou, la vision d’un mal obscur, d’un quoi vacant, d’un non-dit : la vision, l’avision d’un oubli commandant tout, où s’abolissait la raison : tout avait l’air normal mais…

    Perec would probably have scored high (or rather, low) in the following contest. People are asked to write a perhaps even meaningful text on some subject with as few occurrences of a given “word” as possible. Our task is to provide the jury with a program that counts these occurrences, in order to obtain a ranking of the competitors. These competitors often write very long texts with nonsense meaning; a sequence of 500,000 consecutive 'T's is not unusual. And they never use spaces.

    So we want to quickly find out how often a word, i.e., a given string, occurs in a text. More formally: given the alphabet {'A', 'B', 'C', …, 'Z'} and two finite strings over that alphabet, a word W and a text T, count the number of occurrences of W in T. All the consecutive characters of W must exactly match consecutive characters of T. Occurrences may overlap.

     
    Input
    The first line of the input file contains a single number: the number of test cases to follow. Each test case has the following format:

    One line with the word W, a string over {'A', 'B', 'C', …, 'Z'}, with 1 ≤ |W| ≤ 10,000 (here |W| denotes the length of the string W).
    One line with the text T, a string over {'A', 'B', 'C', …, 'Z'}, with |W| ≤ |T| ≤ 1,000,000.
     
    Output
    For every test case in the input file, the output should contain a single number, on a single line: the number of occurrences of the word W in the text T.

     
    Sample Input
    3
    BAPC
    BAPC
    AZA
    AZAZAZA
    VERDI
    AVERDXIVYERDIAN
     
    Sample Output
    1
    3
    0
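Before turning to KMP, the "occurrences may overlap" rule can be pinned down with a brute-force counter. This is only a reference sketch, not part of the original post: the function name `countOccurrencesNaive` is made up for illustration, and its worst case is O(|W| * |T|), which is too slow for the stated limits but useful for checking a faster solution on small inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Brute-force reference (hypothetical helper, not from the original post):
// try every start position in the text and compare the word there.
long long countOccurrencesNaive(const std::string& word, const std::string& text) {
    assert(!word.empty());
    if (word.size() > text.size()) return 0;
    long long count = 0;
    for (std::size_t start = 0; start + word.size() <= text.size(); ++start) {
        // compare() returns 0 when text[start, start + |word|) equals word.
        if (text.compare(start, word.size(), word) == 0) {
            ++count;  // advance by one only, so overlapping occurrences are counted
        }
    }
    return count;
}
```

For example, `countOccurrencesNaive("AZA", "AZAZAZA")` counts the matches starting at positions 0, 2 and 4, which overlap each other.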
     
     
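    #include <cstdio>
    #include <cstring>
    using namespace std;

    int n, m, nxt[10005], t;
    char b[10005], a[1000005];   // b: the word W, a: the text T

    // This problem adds repeated matching on top of basic KMP.
    // After each complete match of the word we have to jump to the best position and keep searching.
    // The KMP idea still applies: positions that are already matched are not compared again.
    //        xxxxxxxabbaab*xxxxxx
    //               abbaaba
    // Where the pattern "abbaaba" lands after the jump is determined by the next array.
    // In KMP the position in the text never moves backwards; only the pattern index is reset
    // through the next array. (My first version moved the text index around as well.)
    void buildnxt()
    {
        int j, k;
        m = strlen(b);
        nxt[0] = -1;
        j = 0; k = -1;
        while (j < m)
        {
            if ((k == -1) || b[j] == b[k])
            {
                j++;
                k++;
                nxt[j] = k;      // also fills nxt[m], which is needed after a full match
            }
            else k = nxt[k];
        }
    }

    int kmp()
    {
        int k = 0, l = 0, cou = 0;   // k: index in the text, l: index in the word
        n = strlen(a);
        /* Abandoned first attempt:
           int ans = m, kk = nxt[m];   // ans: index in the word, ans+1 away from the start
           while (1)
           {
               if (kk != 0 && kk != -1) { ans = kk; kk = nxt[ans]; }
               else break;
           }   // walk back from the end of next to find the smallest jump point
           The idea is not wrong, but it is still too slow.
           The smallest jump point is only one of the intermediate jump points, and the
           alignment it corresponds to can fail, so there is no need to insist on it.
           The simplest approach is to fall back a single step through next, which may already match. */
        while (k < n)
        {
            if ((l == -1) || a[k] == b[l])
            {
                k++;
                l++;
            }
            else l = nxt[l];
            if (l == m)
            {
                cou++;
                /* Abandoned first attempt:
                   if (kk == 0) continue;     // next at the end is 0: the matched word has no repeated border,
                                              // so there is no jump point inside the matched part of the text
                   if (k == n - 1) break;     // k is already at the end of the text: no further match is possible
                   k = k - l + ans;           // k - l (start of the match) + ans
                   l = 0; */
                l = nxt[l];   // fall back to the next-longest border and continue, skipping what is already matched
            }
        }
        return cou;
    }

    int main()
    {
        scanf("%d", &t);
        while (t--)
        {
            // The inputs contain no spaces, so scanf("%s") replaces the unsafe gets().
            scanf("%s", b);
            scanf("%s", a);
            memset(nxt, 0, sizeof(nxt));
            buildnxt();
            printf("%d\n", kmp());
        }
        return 0;
    }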
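The same two steps (build the next table, then scan the text and fall back through the table after every full match) can also be written without global arrays. The following is a minimal sketch under that reading of the listing above; the names `buildFailure` and `countOccurrencesKmp` are hypothetical and do not appear in the original post.

```cpp
#include <cassert>
#include <string>
#include <vector>

// fail[j] is the length of the longest proper prefix of word[0..j) that is
// also a suffix of it; fail[0] = -1 acts as a sentinel, as in the listing above.
std::vector<int> buildFailure(const std::string& word) {
    const int m = static_cast<int>(word.size());
    std::vector<int> fail(m + 1, 0);
    fail[0] = -1;
    int j = 0, k = -1;
    while (j < m) {
        if (k == -1 || word[j] == word[k]) {
            ++j;
            ++k;
            fail[j] = k;  // the last iteration fills fail[m]
        } else {
            k = fail[k];
        }
    }
    return fail;
}

// Counts occurrences of word in text, overlaps included, in O(|W| + |T|).
long long countOccurrencesKmp(const std::string& word, const std::string& text) {
    assert(!word.empty());
    const std::vector<int> fail = buildFailure(word);
    const int m = static_cast<int>(word.size());
    const int n = static_cast<int>(text.size());
    long long count = 0;
    int i = 0;  // index in the text, never decreases
    int l = 0;  // number of word characters currently matched
    while (i < n) {
        if (l == -1 || text[i] == word[l]) {
            ++i;
            ++l;
        } else {
            l = fail[l];
        }
        if (l == m) {
            ++count;
            l = fail[l];  // keep the longest border so overlapping matches are found
        }
    }
    return count;
}
```

For the word "AZA" the table is -1, 0, 0, 1. After each full match the scan keeps one matched character (the trailing "A"), which is how all three overlapping occurrences in "AZAZAZA" are found without moving the text index backwards.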
  • Original post: https://www.cnblogs.com/linxhsy/p/4449084.html