  • POJ3450 Corporate Identity: Suffix Array, Longest Common Substring

    Problem link: https://vjudge.net/problem/POJ-3450

    Corporate Identity
    Time Limit: 3000MS   Memory Limit: 65536K
    Total Submissions: 8046   Accepted: 2710

    Description

    Beside other services, ACM helps companies to clearly state their “corporate identity”, which includes company logo but also other signs, like trademarks. One of such companies is Internet Building Masters (IBM), which has recently asked ACM for a help with their new identity. IBM do not want to change their existing logos and trademarks completely, because their customers are used to the old ones. Therefore, ACM will only change existing trademarks instead of creating new ones.

    After several other proposals, it was decided to take all existing trademarks and find the longest common sequence of letters that is contained in all of them. This sequence will be graphically emphasized to form a new logo. Then, the old trademarks may still be used while showing the new identity.

    Your task is to find such a sequence.

    Input

    The input contains several tasks. Each task begins with a line containing a positive integer N, the number of trademarks (2 ≤ N ≤ 4000). The number is followed by N lines, each containing one trademark. Trademarks will be composed only from lowercase letters, the length of each trademark will be at least 1 and at most 200 characters.

    After the last trademark, the next task begins. The last task is followed by a line containing zero.

    Output

    For each task, output a single line containing the longest string contained as a substring in all trademarks. If there are several strings of the same length, print the one that is lexicographically smallest. If there is no such non-empty string, output the words “IDENTITY LOST” instead.

    Sample Input

    3
    aabbaabb
    abbababb
    bbbbbabb
    2
    xyz
    abc
    0

    Sample Output

    abb
    IDENTITY LOST

    Source

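    Problem summary:

    Given n strings, find their longest common substring (a contiguous run of letters, not a subsequence). If several common substrings have the maximum length, output the lexicographically smallest one.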
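
    For small inputs the requirement can be cross-checked by brute force: any common substring must occur in the first string, so it is enough to try that string's substrings from longest to shortest. The following is a minimal sketch only; `bruteForce` is a hypothetical helper name and is not part of the original solution.

```cpp
#include <string>
#include <vector>

// Hypothetical reference helper: returns the longest substring shared by all
// strings, lexicographically smallest on ties, or "" if there is none.
// Assumes words is non-empty. Far too slow for N = 4000; it is only meant
// for checking small cases by hand.
std::string bruteForce(const std::vector<std::string>& words) {
    const std::string& first = words[0];
    for (std::size_t len = first.size(); len >= 1; --len) {
        std::string best;
        bool found = false;
        for (std::size_t start = 0; start + len <= first.size(); ++start) {
            std::string cand = first.substr(start, len);
            bool inAll = true;
            for (const std::string& w : words) {
                if (w.find(cand) == std::string::npos) { inAll = false; break; }
            }
            // Keep the smallest candidate among those of the current length.
            if (inAll && (!found || cand < best)) { best = cand; found = true; }
        }
        if (found) return best;   // longest length wins, so stop here
    }
    return "";
}
```

    On the first sample both "abb" and "bba" are common substrings of length 3, which is why the tie-breaking rule matters.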

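    Solution:

    1. Concatenate the n strings into one new string, placing a separator between each adjacent pair. The separators must all be distinct, so that no common prefix can extend across a string boundary.

    2. Build the suffix array of the new string, then binary search on the length mid of the common substring. For a given mid, walk the suffixes in rank order and cut wherever height < mid; this splits them into groups in which any two suffixes share a common prefix of length at least mid. Within each group, count how many distinct original strings the suffixes come from. If some group covers all n strings, mid is feasible; otherwise it is not. Feasibility is monotone (a common substring of length mid contains common substrings of every shorter length), so the binary search finds the maximum length.

    3. The problem also asks for the lexicographically smallest answer, so whenever mid is feasible, record the start and end positions of the common substring. Because the suffixes are enumerated in increasing rank order (the order of sa), the first group that satisfies the condition for a given mid yields the lexicographically smallest common substring of that length.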
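
    The three steps can be sketched compactly if a naive suffix sort is swapped in for the doubling algorithm. This is an illustration under that assumption, suitable only for small inputs; `longestCommon`, `owner` and `firstGroup` are hypothetical names, and at least two input strings are assumed.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sketch of steps 1-3. The suffix array is built by directly sorting suffixes,
// which is NOT the doubling algorithm and is much slower on large inputs.
std::string longestCommon(const std::vector<std::string>& words) {
    const int n = static_cast<int>(words.size());
    std::vector<int> text, owner;          // symbols, and the string each one belongs to
    std::size_t minLen = words.empty() ? 0 : words[0].size();
    for (int i = 0; i < n; ++i) {
        minLen = std::min(minLen, words[i].size());
        for (char ch : words[i]) { text.push_back(ch - 'a' + 1); owner.push_back(i); }
        text.push_back(30 + i);            // step 1: a distinct separator per string
        owner.push_back(i);
    }
    const int len = static_cast<int>(text.size());

    // Naive suffix array: sort start positions by comparing the suffixes.
    std::vector<int> sa(len);
    for (int i = 0; i < len; ++i) sa[i] = i;
    std::sort(sa.begin(), sa.end(), [&](int a, int b) {
        return std::lexicographical_compare(text.begin() + a, text.end(),
                                            text.begin() + b, text.end());
    });

    // height[i] = length of the common prefix of the suffixes ranked i-1 and i.
    std::vector<int> height(len, 0);
    for (int i = 1; i < len; ++i) {
        int a = sa[i - 1], b = sa[i], k = 0;
        while (a + k < len && b + k < len && text[a + k] == text[b + k]) ++k;
        height[i] = k;
    }

    // Step 2: start position inside the first group (in rank order) whose
    // suffixes share a prefix of length >= k and cover all n strings, or -1.
    auto firstGroup = [&](int k) {
        std::vector<char> seen(n, 0);
        int cnt = 0;
        for (int i = 1; i < len; ++i) {
            if (height[i] < k) {           // group boundary: reset the counters
                if (cnt) { std::fill(seen.begin(), seen.end(), 0); cnt = 0; }
                continue;
            }
            for (int p : {sa[i - 1], sa[i]})
                if (!seen[owner[p]]) { seen[owner[p]] = 1; ++cnt; }
            if (cnt == n) return sa[i];
        }
        return -1;
    };

    // Step 3: binary search on the length; the last feasible probe is the
    // longest one, and its first group gives the smallest string.
    int lo = 1, hi = static_cast<int>(minLen), bestStart = -1, bestLen = 0;
    while (lo <= hi) {
        int mid = (lo + hi) / 2;
        int start = firstGroup(mid);
        if (start >= 0) { bestStart = start; bestLen = mid; lo = mid + 1; }
        else hi = mid - 1;
    }
    std::string out;
    for (int i = 0; i < bestLen; ++i)
        out += static_cast<char>('a' + text[bestStart + i] - 1);
    return out;                            // empty string means "IDENTITY LOST"
}
```

    The sketch bounds the search by the shortest input string, since no common substring can be longer than that.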

    The code is as follows:

    #include <cstdio>
    #include <cstring>
    #include <algorithm>
    using namespace std;

    const int MAXN = 1e6+100;

    int id[MAXN];   // index of the input string each position belongs to
    int r[MAXN], sa[MAXN], Rank[MAXN], height[MAXN];
    int t1[MAXN], t2[MAXN], c[MAXN];

    // True if positions a and b have equal rank pairs (first half, second half).
    bool cmp(int *r, int a, int b, int l)
    {
        return r[a]==r[b] && r[a+l]==r[b+l];
    }

    // Doubling algorithm with radix sort.
    // str[0..n-1] holds values in [1, m) and str[n] must be 0.
    // Fills sa[0..n], Rank[0..n] and height[1..n].
    void DA(int str[], int sa[], int Rank[], int height[], int n, int m)
    {
        n++;    // include the terminating 0
        int i, j, p, *x = t1, *y = t2;
        for(i = 0; i<m; i++) c[i] = 0;
        for(i = 0; i<n; i++) c[x[i] = str[i]]++;
        for(i = 1; i<m; i++) c[i] += c[i-1];
        for(i = n-1; i>=0; i--) sa[--c[x[i]]] = i;
        for(j = 1; j<=n; j <<= 1)
        {
            // Sort by the second key: suffixes with no second half come first.
            p = 0;
            for(i = n-j; i<n; i++) y[p++] = i;
            for(i = 0; i<n; i++) if(sa[i]>=j) y[p++] = sa[i]-j;

            // Stable counting sort by the first key.
            for(i = 0; i<m; i++) c[i] = 0;
            for(i = 0; i<n; i++) c[x[y[i]]]++;
            for(i = 1; i<m; i++) c[i] += c[i-1];
            for(i = n-1; i>=0; i--) sa[--c[x[y[i]]]] = y[i];

            // Recompute ranks for prefixes of length 2*j.
            swap(x, y);
            p = 1; x[sa[0]] = 0;
            for(i = 1; i<n; i++)
                x[sa[i]] = cmp(y, sa[i-1], sa[i], j)?p-1:p++;

            if(p>=n) break;     // all ranks distinct
            m = p;
        }

        // height[Rank[i]] = LCP of suffix i and the suffix ranked just before it.
        int k = 0;
        n--;
        for(i = 0; i<=n; i++) Rank[sa[i]] = i;
        for(i = 0; i<n; i++)
        {
            if(k) k--;
            j = sa[Rank[i]-1];
            while(str[i+k]==str[j+k]) k++;
            height[Rank[i]] = k;
        }
    }

    bool vis[4040];
    int Le, Ri;     // start and end of the answer in the concatenated array

    // Is there a common substring of length k shared by all n strings?
    // On success, Le and Ri describe the first such substring in rank order.
    bool test(int n, int len, int k)
    {
        int cnt = 0;
        memset(vis, false, sizeof(vis));
        for(int i = 2; i<=len; i++)
        {
            if(height[i]<k)
            {
                // Group boundary. vis is already clear when cnt is 0.
                if(cnt)
                {
                    cnt = 0;
                    memset(vis, false, sizeof(vis));
                }
            }
            else
            {
                if(!vis[id[sa[i-1]]]) vis[id[sa[i-1]]] = true, cnt++;
                if(!vis[id[sa[i]]]) vis[id[sa[i]]] = true, cnt++;
                if(cnt==n)
                {
                    Le = sa[i]; Ri = sa[i]+k-1;
                    return true;
                }
            }
        }
        return false;
    }

    char str[MAXN];
    int main()
    {
        int n;
        while(scanf("%d", &n)==1 && n)
        {
            int len = 0;
            for(int i = 0; i<n; i++)
            {
                scanf("%s", str);
                int LEN = strlen(str);
                for(int j = 0; j<LEN; j++)
                {
                    r[len] = str[j]-'a'+1;
                    id[len++] = i;
                }
                r[len] = 30+i;  // separators must be distinct
                id[len++] = i;
            }
            r[len] = 0;
            DA(r,sa,Rank,height,len,30+n);

            // The last string's length is an upper bound on the answer.
            int L = 0, R = strlen(str);
            while(L<=R)
            {
                int mid = (L+R)>>1;
                if(test(n,len,mid))
                    L = mid + 1;
                else
                    R = mid - 1;
            }

            if(R==0) puts("IDENTITY LOST");
            else
            {
                for(int i = Le; i<=Ri; i++)
                    printf("%c", r[i]+'a'-1);
                putchar('\n');
            }
        }
        return 0;
    }
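
    The `test` routine clears `vis` with `memset` at group boundaries. One common alternative, given here as a sketch and not as the author's method, is to stamp each entry with the number of the current group so that starting a new group costs O(1). `GroupCounter` and its members are hypothetical names.

```cpp
#include <vector>

// Hypothetical helper: counts distinct string ids inside the current group
// without clearing an array between groups.
struct GroupCounter {
    std::vector<int> stamp;   // stamp[id] == group means id was seen in this group
    int group = 1;            // number of the current group
    int distinct = 0;         // distinct ids seen in the current group

    explicit GroupCounter(int n) : stamp(n, 0) {}

    // Start a new group; old stamps become stale automatically.
    void newGroup() { ++group; distinct = 0; }

    // Mark id as seen and return the number of distinct ids in this group.
    int add(int id) {
        if (stamp[id] != group) { stamp[id] = group; ++distinct; }
        return distinct;
    }
};
```

    Inside `test`, `newGroup()` would take the place of the `memset` call, and `add(id[sa[i-1]])` and `add(id[sa[i]])` would take the place of the two `vis` updates.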
  • Original article: https://www.cnblogs.com/DOLFAMINGO/p/8480366.html