  • HDU 4644 BWT (KMP)

    BWT

    Time Limit: 12000/6000 MS (Java/Others)    Memory Limit: 65535/32768 K (Java/Others)
    Total Submission(s): 775    Accepted Submission(s): 242


    Problem Description
    When the problem of matching a string S in a string T comes up, people usually reach for KMP, Aho-Corasick, or a suffix array. But Mr Liu tells Canoe that there is an algorithm called the Burrows–Wheeler Transform (BWT) which solves the problem remarkably efficiently.
    But how does the BWT solve the S-in-T matching problem? Mr Liu tells Canoe its first three steps.
    First, we append ‘$’ to the end of T and, for convenience, still call the new string T. Then, for every suffix of T starting at position i, we append the prefix of T that ends at position (i – 1) to its end. Second, we sort these new strings in dictionary order, and we call the matrix formed by the sorted strings the Burrows–Wheeler Matrix. Third, we read the characters of the last column to get a new string, which we call BWT(T). You can get more information from the example below.
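The three steps above can be sketched directly in code. This is only a naive illustration, not part of the problem statement: `buildBwt` is a hypothetical helper name, and it assumes '$' does not occur in the input and compares smaller than every lowercase letter (true in ASCII). It sorts rotations character by character, so it is far too slow for large inputs.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper: builds BWT(T) by sorting all rotations of T + '$'.
std::string buildBwt(const std::string& text) {
    const std::string t = text + '$';
    const int n = static_cast<int>(t.size());

    // start[row] is the position in t where the rotation of that row begins.
    std::vector<int> start(n);
    std::iota(start.begin(), start.end(), 0);

    // Step 2: sort the rotations in dictionary order (naive comparison).
    std::sort(start.begin(), start.end(), [&](int a, int b) {
        for (int k = 0; k < n; ++k) {
            const char ca = t[(a + k) % n];
            const char cb = t[(b + k) % n];
            if (ca != cb) return ca < cb;
        }
        return false;  // unreachable for distinct rotations; '$' is unique
    });

    // Step 3: the last column holds the character just before each rotation start.
    std::string bwt(n, ' ');
    for (int row = 0; row < n; ++row) {
        bwt[row] = t[(start[row] + n - 1) % n];
    }
    return bwt;
}
```

For T = "acaacg" this produces the string used in the sample input, "gc$aaac".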



    Then Mr Liu tells Canoe that we only need to store BWT(T) to solve the matching problem. But can that really work? Mr Liu smiles and says yes. We can decide whether a string S such as “aac” is a substring of a string T such as “acaacg” knowing only BWT(T)! What an amazing algorithm the BWT is! But Canoe is puzzled by the tricky method of matching S in T. Would you please help Canoe work it out? Given BWT(T) and a string S, can you determine whether S is a substring of T?
     
    Input
    There are multiple test cases.
    First line: the string BWT(T) (1 <= length(BWT(T)) <= 100086).
    Second line: an integer n (1 <= n <= 10086), the number of S strings.
    Then n lines follow.
    Each line contains one string S (n * length(S) is less than 2000000, and all characters of S are lowercase letters).
     
    Output
    For every S, output “YES” on its own line if S is a substring of T, and “NO” on its own line otherwise.
     
    Sample Input
    gc$aaac
    2
    aac
    gc
     
    Sample Output
    YES
    NO
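    Analysis:
    The idea is to convert the transformed string back into the original string and then run KMP on it.
    The conversion works as follows.
    First, number the characters of the transformed string:
    gc$aaac
    0123456
    Then sort the characters in dictionary order; when two characters are equal, the one with the smaller original number comes first (a stable sort):
    $aaaccg
    2345160
    Next, walk through the sorted characters, using each character's number as the index of the next sorted position to visit. The walk starts at '$', whose number is 2; the character at index 2 of the sorted string is a, giving "a". That a has number 4, and the character at index 4 is c, giving "ac". That c has number 1, and the character at index 1 is a, giving "aca".
    Continuing in the same way finally yields "acaacg".
    Then simply run KMP.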
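The reconstruction walk on its own can be sketched as a small standalone function, which makes it easy to check against the sample. `invertBwt` is a hypothetical name used only for this illustration; it assumes the input contains exactly one '$' and otherwise only lowercase letters.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper: recovers T (without the trailing '$') from BWT(T).
std::string invertBwt(const std::string& bwt) {
    const int n = static_cast<int>(bwt.size());

    // id[k] is the original number of the character at sorted position k.
    std::vector<int> id(n);
    std::iota(id.begin(), id.end(), 0);
    // The sort must be stable so equal characters keep their original order.
    std::stable_sort(id.begin(), id.end(),
                     [&](int a, int b) { return bwt[a] < bwt[b]; });

    std::string text;
    int now = 0;  // sorted position 0 holds '$', the smallest character
    for (int i = 0; i + 1 < n; ++i) {
        now = id[now];         // jump to the sorted position given by the number
        text += bwt[id[now]];  // the character stored at that sorted position
    }
    return text;
}
```

Calling `invertBwt("gc$aaac")` follows exactly the walk described above and returns "acaacg".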
    The code is as follows:
    #include <cstdio>
    #include <cstring>
    #include <algorithm>
    using namespace std;
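    // One character of BWT(T) together with its original position.
    struct node
    {
        int id;   // index of the character in BWT(T)
        char r;   // the character itself
    }str[100186];
    char s[100186];     // BWT(T) as read from input
    char str2[100186];  // the reconstructed original string T
    char T[2000100];    // the current query string S (the KMP pattern)
    int Next[2000100];  // KMP failure table for the pattern
    int tlen;           // length of the pattern
    // Order by character only; stable_sort keeps equal characters in original order.
    bool cmp(const node &x,const node &y)
    {
        return x.r<y.r;
    }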
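    // Build the KMP failure table for the pattern T[0..tlen).
    void getNext()
    {
        int j = 0, k = -1;
        Next[0] = -1;
        while(j < tlen)
        {
            if(k == -1 || T[j] == T[k])
                Next[++j] = ++k;   // extend the current border by one character
            else
                k = Next[k];       // fall back to the next shorter border
        }
    }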
    
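    // Return true if the pattern T occurs in the text S[0..slen).
    bool KMP_Index(char S[],int slen)
    {
        int i = 0, j = 0;
        getNext();
    
        while(i < slen && j < tlen)
        {
            if(j == -1 || S[i] == T[j])
            {
                i++; j++;
            }
            else
                j = Next[j];   // mismatch: shift the pattern using the failure table
        }
        return j == tlen;      // true only if the whole pattern was matched
    }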
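    int main()
    {
        int n;
        while(scanf("%s",s)==1)
        {
            int len=strlen(s);
            for(int i=0;i<len;i++)
            {
                str[i].id=i;
                str[i].r=s[i];
            }
            // Sorting the last column gives the first column; the sort must be stable.
            stable_sort(str,str+len,cmp);
            // Position 0 holds '$'; following the ids spells out T character by character.
            int now=0;
            for(int i=0;i<len-1;i++)
            {
                now=str[now].id;
                str2[i]=str[now].r;
            }
            len=len-1;   // length of T without the trailing '$'
            str2[len]=0;
            scanf("%d",&n);
            while(n--)
            {
                scanf("%s",T);
                tlen=strlen(T);
                // KMP_Index builds the failure table itself, so no separate getNext() call is needed.
                if(KMP_Index(str2,len))puts("YES");
                else puts("NO");
            }
        }
        return 0;
    }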
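The solution above rebuilds T and then runs KMP. The problem statement hints that matching is possible from BWT(T) alone; one well-known way to do that is backward search over the BWT (the core of an FM-index). The sketch below is a different technique from the one used in this post and is shown only for comparison. `FmIndex`, `symbol`, `smaller`, `occ`, and `contains` are hypothetical names, and the code assumes the BWT contains exactly one '$' plus lowercase letters and that patterns are lowercase only. The full occurrence table uses roughly 27 integers per BWT character.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch of backward search on BWT(T); not the post's method.
struct FmIndex {
    std::array<int, 28> smaller{};         // smaller[c]: BWT characters below symbol c
    std::vector<std::array<int, 27>> occ;  // occ[i][c]: count of symbol c in bwt[0..i)

    // Map '$' to 0 and 'a'..'z' to 1..26.
    static int symbol(char ch) { return ch == '$' ? 0 : ch - 'a' + 1; }

    explicit FmIndex(const std::string& bwt) : occ(bwt.size() + 1) {
        // occ[0] is already all zeros; build running counts row by row.
        for (std::size_t i = 0; i < bwt.size(); ++i) {
            occ[i + 1] = occ[i];
            ++occ[i + 1][symbol(bwt[i])];
        }
        for (int c = 0; c < 27; ++c) {
            smaller[c + 1] = smaller[c] + occ[bwt.size()][c];
        }
    }

    // True if pattern occurs in T. Rows [lo, hi) of the sorted matrix are
    // those that start with the part of the pattern processed so far.
    bool contains(const std::string& pattern) const {
        int lo = 0;
        int hi = static_cast<int>(occ.size()) - 1;
        for (auto it = pattern.rbegin(); it != pattern.rend(); ++it) {
            const int c = symbol(*it);
            lo = smaller[c] + occ[lo][c];
            hi = smaller[c] + occ[hi][c];
            if (lo >= hi) return false;  // no row starts with this suffix
        }
        return true;
    }
};
```

With `FmIndex index("gc$aaac");`, the call `index.contains("aac")` returns true and `index.contains("gc")` returns false, matching the sample output. Each query costs time proportional to the pattern length, independent of the length of T.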
  • Original article: https://www.cnblogs.com/a249189046/p/7624426.html