  • ZOJ3228 Searching the String: Aho-Corasick automaton + overlapping/non-overlapping matches

    Problem link: https://vjudge.net/problem/ZOJ-3228

    Searching the String

    Time Limit: 7 Seconds      Memory Limit: 129872 KB

    Little jay really hates to deal with string. But moondy likes it very much, and she's so mischievous that she often gives jay some dull problems related to string. And one day, moondy gave jay another problem, poor jay finally broke out and cried, " Who can help me? I'll bg him! "

    So what is the problem this time?

    First, moondy gave jay a very long string A. Then she gave him a sequence of very short substrings, and asked him to find how many times each substring appeared in string A. What's more, she would denote whether or not founded appearances of this substring are allowed to overlap.

    At first, jay just read string A from begin to end to search all appearances of each given substring. But he soon felt exhausted and couldn't go on any more, so he gave up and broke out this time.

    I know you're a good guy and will help with jay even without bg, won't you?

    Input

    Input consists of multiple cases( <= 20 ) and terminates with end of file.

    For each case, the first line contains string A ( length <= 10^5 ). The second line contains an integer N ( N <= 10^5 ), which denotes the number of queries. The next N lines, each with an integer type and a string a ( length <= 6 ), type = 0 denotes substring a is allowed to overlap and type = 1 denotes not. Note that all input characters are lowercase.

    There is a blank line between two consecutive cases.

    Output

    For each case, output the case number first ( based on 1 , see Samples ).

    Then for each query, output an integer in a single line denoting the maximum times you can find the substring under certain rules.

    Output an empty line after each case.

    Sample Input

    ab
    2
    0 ab
    1 ab
    
    abababac
    2
    0 aba
    1 aba
    
    abcdefghijklmnopqrstuvwxyz
    3
    0 abc
    1 def
    1 jmn
    

    Sample Output

    Case 1
    1
    1
    
    Case 2
    3
    2
    
    Case 3
    1
    1
    0
    
    

    Hint

    In Case 2,you can find the first substring starting in position (indexed from 0) 0,2,4, since they're allowed to overlap. The second substring starts in position 0 and 4, since they're not allowed to overlap.

    For C++ users, kindly use scanf to avoid TLE for huge inputs.


    Author: LI, Jie
    Source: ZOJ Monthly, July 2009

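    Problem summary:

    Given a string and n queries, each query gives a word and asks how many times it occurs in the string. There is one extra constraint: in the input, 0 means occurrences of the word may overlap, and 1 means they may not.

    Solution:

    1. First insert the n words into the Aho-Corasick automaton. Since the input words may repeat, the words need to be renumbered so that duplicates share one id.

    2. Match the string against the automaton. Matches fall into two classes:

    Class 1, overlapping allowed: counted as usual.

    Class 2, overlapping not allowed: during matching, record the position where this word was last counted, called last. Let i be the position where the current occurrence ends and len the length of the word. If i-last>=len, the current occurrence does not overlap the previously counted one.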

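    Note:

    A C syntax pitfall. When an array is passed to a function as a parameter and the function calls memset on it with sizeof(a), the array is not fully initialized. The reason is that an array parameter decays to a pointer, so sizeof(a) inside the function is the size of a pointer, not the size of the array.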

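    #include <cstdio>
    #include <cstring>

    // int a[5];  defining the array here does not help either
    void f(int a[]) // array parameter: a is really an int*
    {
        memset(a, -1, sizeof(a));   // sizeof(a) is the pointer size, not 5*sizeof(int)
        for(int i = 0; i<5; i++)
            printf("%d ", a[i]);
        /*
            Observed output: -1 0 0 0 0   (with 4-byte pointers, only a[0] is set)
            Intended output: -1 -1 -1 -1 -1
            Defining the array a above f() gives the same result.
        */
    }
    
    int a[5];
    int main()
    {
        memset(a, 0, sizeof(a));
        f(a);
    }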

    So it is best to define the array where the function can see it and use it directly, as follows:

    int a[5];
    void f() // no parameter: uses the global array, so sizeof(a) is the full array size
    {
        memset(a, -1, sizeof(a));
        for(int i = 0; i<5; i++)
            printf("%d ", a[i]);
    }
    
    int main()
    {
        memset(a, 0, sizeof(a));
        f();
    }

    The code is as follows:

    #include <iostream>
    #include <cstdio>
    #include <cstring>
    #include <algorithm>
    #include <vector>
    #include <cmath>
    #include <queue>
    #include <stack>
    #include <map>
    #include <string>
    #include <set>
    using namespace std;
    typedef long long LL;
    const double EPS = 1e-6;
    const int INF = 2e9;
    const LL LNF = 9e18;
    const int MOD = 20090717;
    const int MAXN = 6e5+10;

    // ans[id][0]: overlapping count, ans[id][1]: non-overlapping count
    int ans[100010][2], type[100010], last[100010], Len[100010], Index[100010];
    struct Trie
    {
        int sz, base;
        int next[MAXN][26], fail[MAXN], end[MAXN];
        int root, L, id;
        int newnode()
        {
            for(int i = 0; i<sz; i++)
                next[L][i] = -1;
            end[L++] = 0;
            return L-1;
        }
        void init(int _sz, int _base)
        {
            sz = _sz;
            base = _base;
            id = L = 0;
            root = newnode();
        }
        // Inserts a word and returns its id; duplicate words get the same id.
        // The original took an extra int id parameter that shadowed the member.
        int insert(char buf[])
        {
            int len = strlen(buf);
            int now = root;
            for(int i = 0; i<len; i++)
            {
                if(next[now][buf[i]-base] == -1) next[now][buf[i]-base] = newnode();
                now = next[now][buf[i]-base];
            }
            if(!end[now]) end[now] = ++id;  // number the words stored in the automaton
            return end[now];
        }
        void build()
        {
            queue<int>Q;
            fail[root] = root;
            for(int i = 0; i<sz; i++)
            {
                if(next[root][i] == -1) next[root][i] = root;
                else fail[next[root][i]] = root, Q.push(next[root][i]);
            }
            while(!Q.empty())
            {
                int now = Q.front();
                Q.pop();
                for(int i = 0; i<sz; i++)
                {
                    if(next[now][i] == -1) next[now][i] = next[fail[now]][i];
                    else fail[next[now][i]] = next[fail[now]][i], Q.push(next[now][i]);
                }
            }
        }

        void query(char buf[])
        {
            int len = strlen(buf);
            int now = root;
            for(int i = 0; i<len; i++)
            {
                now = next[now][buf[i]-base];
                int tmp = now;
                while(tmp != root)  // walk the fail chain: every suffix that is a word
                {
                    if(end[tmp])    // a word ends at this node
                    {
                        ans[end[tmp]][0]++; // overlapping allowed
                        if(i-last[end[tmp]]>=Len[end[tmp]]) // overlapping not allowed
                        {
                            ans[end[tmp]][1]++;
                            last[end[tmp]] = i;   // Note: "last occurrence" only refers to occurrences counted as non-overlapping, so this update belongs inside the if block.
                        }
                    }
                    tmp = fail[tmp];
                }
            }
        }
    };

    Trie ac;
    char buf[20], s[100010];
    int main()
    {
        int n, kase = 0;
        while(scanf("%s", s)!=EOF)
        {
            scanf("%d", &n);
            ac.init(26,'a');
            for(int i = 1; i<=n; i++)
            {
                scanf("%d%s", &type[i], buf);
                Index[i] = ac.insert(buf);    // Index[i] stores the id of query i's word in the automaton
                Len[Index[i]] = strlen(buf);
            }
            ac.build();

            memset(last, -1, sizeof(last));
            memset(ans, 0, sizeof(ans));
            ac.query(s);

            printf("Case %d\n", ++kase);
            for(int i = 1; i<=n; i++)
                printf("%d\n",ans[Index[i]][type[i]]);
            printf("\n");
        }
    }
  • Original post: https://www.cnblogs.com/DOLFAMINGO/p/8459259.html