zoukankan      html  css  js  c++  java
  • Keywords Search(AC自动机)

     In the modern time, Search engine came into the life of everybody like Google, Baidu, etc. 
     Wiskey also wants to bring this feature to his image retrieval system. 
     Every image have a long description, when users type some keywords to find the image, the system will match the keywords with description of image and show the image which the most keywords be matched. 
     To simplify the problem, giving you a description of image, and some keywords, you should tell me how many keywords will be match. 

    Input

     First line will contain one integer means how many cases will follow by. 
     Each case will contain two integers N means the number of keywords and N keywords follow. (N <= 10000) 
     Each keyword will only contains characters 'a'-'z', and the length will be not longer than 50. 
     The last line is the description, and the length will be not longer than 1000000. 
    Output

     Print how many keywords are contained in the description.

    Sample Input

    1
    5
    she
    he
    say
    shr
    her
    yasherhs

    Sample Output

    3

    多模匹配,AC自动机的模板题。

     1 #include <bits/stdc++.h>
     2 
     3 using namespace std;
     4 
     5 int tree[500005][28], counword[500005], fail[500005];
     6 int coun;
     7 
     8 void AddWord(char a[], int len)   //建树,把每一个模式串加到树上
     9 {
    10     int root, i;
    11     root = 0;
    12     for(i=0;i<len;i++)
    13     {
    14         if(tree[root][a[i]-'a']==0) tree[root][a[i]-'a'] = ++coun;   
    //tree为0表示为空(就算是root也没表示字母),
    //那么就给当前节点赋上++coun,一一对应的标记节点
    15 root = tree[root][a[i]-'a']; 16 } 17 counword[root]++; //当前节点表示的单词个数 18 } 19 20 void getfail() //fail指针的意义类似于KMP的next数组,当前节点匹配失败的时候,
    //我们可以借助fail到达前缀能对应上当前串后缀的节点,匹配继续
    21 { 22 queue <int> q; //每个节点都有自己的fail指针,用队列挨个看 23 int now, i; 24 fail[0] = 0; 25 for(i=0;i<26;i++) //第二层不为空的全部垫在队列里 26 { 27 if(tree[0][i]) 28 { 29 fail[tree[0][i]] = 0; 30 q.push(tree[0][i]); 31 } 32 }
    33 while(!q.empty()) 34 { 35 now = q.front(); 36 q.pop(); 37 for(i=0;i<26;i++) 38 { 39 if(tree[now][i])
    //now的该子节点不为空,fail指向的是now(i的父亲)的fail的i儿子,节点不为空,有意义,进队列
    40 { 41 fail[tree[now][i]] = tree[fail[now]][i]; 42 q.push(tree[now][i]); 43 } 44 else tree[now][i] = tree[fail[now]][i];
    //该节点为空,该节点不要,其编号被赋成now(i的父亲)的fail的i儿子,
    //节点本来为空,不要进队列,否则会有相同节点重复进队列
    45 } 46 } 47 } 48 49 long long query(char a[], int len) //树和fail准备就绪,跑一遍文本串出结果 50 { 51 int now, i, j; 52 long long re; 53 now = 0; 54 re = 0; 55 for(i=0;i<len;i++) 56 { 57 now = tree[now][a[i]-'a']; 58 for(j=now;j&&counword[j]!=-1;j=fail[j]) 59 { 60 re+=counword[j]; 61 counword[j] = -1; 62 } 63 } 64 return re; 65 } 66 67 int main() 68 { 69 int t, n, len, i; 70 long long re; 71 char a[55], txt[1000005]; 72 scanf("%d", &t); 73 while(t--) 74 { 75 coun = 0; 76 memset(counword, 0, sizeof(counword)); 77 memset(tree, 0, sizeof(tree)); 78 scanf("%d", &n); 79 getchar(); 80 for(i=0;i<n;i++) 81 { 82 scanf("%s", a); 83 len = strlen(a); 84 AddWord(a, len); 85 } 86 getfail(); 87 scanf("%s", txt); 88 len = strlen(txt); 89 re = query(txt, len); 90 printf("%lld ", re); 91 } 92 return 0; 93 }

    单模匹配KMP,多模用字典树+fail



  • 相关阅读:
    vnode之update 还是没太懂
    vnodec创建之标签
    1054 求平均值
    1053 住房空置率
    1052 卖个萌
    1051 复数乘法
    1050 螺旋矩阵
    1049 数列的片段和
    1048 数字加密
    1047 编程团体赛
  • 原文地址:https://www.cnblogs.com/0xiaoyu/p/11300966.html
Copyright © 2011-2022 走看看