  • SPOJ Repeats (suffix array + RMQ sparse table)

    REPEATS - Repeats


    A string s is called a (k,l)-repeat if s is obtained by concatenating k>=1 copies of some seed string t with length l>=1. For example, the string

    s = abaabaabaaba

    is a (4,3)-repeat with t = aba as its seed string. That is, the seed string t is 3 characters long, and the whole string s is obtained by repeating t 4 times.

    Write a program for the following task: Your program is given a long string u consisting of characters ‘a’ and/or ‘b’ as input. Your program must find some (k,l)-repeat that occurs as substring within u with k as large as possible. For example, the input string

    u = babbabaabaabaabab

    contains the (4,3)-repeat s = abaabaabaaba starting at position 5. Since u contains no other contiguous substring with more than 4 repeats, your program must output the maximum k, which is 4.

     

    Input

    The first line of the input contains H, the number of test cases (H <= 20). H test cases follow. The first line of each test case is n, the length of the input string (n <= 50000). The next n lines contain the input string, one character (either ‘a’ or ‘b’) per line, in order.

    Output

    For each test case, you should write exactly one integer k on a line: the maximum repeat count.

    Example

    Input:
    1
    17
    b
    a
    b
    b
    a
    b
    a
    a
    b
    a
    a
    b
    a
    a
    b
    a
    b
    
    Output:
    4
    
    since a (4, 3)-repeat is found starting at the 5th character of the input string.
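
As a reference point for the definition above, here is a minimal brute-force sketch. It is not the solution used in this post, and the names `countCopies` and `maxRepeatBrute` are hypothetical helpers introduced only for illustration. It tries every start position and every seed length, so it is cubic in the worst case and only suitable for checking small inputs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Number of consecutive copies of the seed u[start, start + L) that
// begin at `start`. Hypothetical helper, not part of the original post.
int countCopies(const std::string& u, std::size_t start, std::size_t L) {
    assert(L >= 1 && start + L <= u.size());
    int k = 1;
    std::size_t pos = start + L;
    // Keep going while the next block of L characters equals the seed.
    while (pos + L <= u.size() && u.compare(pos, L, u, start, L) == 0) {
        ++k;
        pos += L;
    }
    return k;
}

// Brute force over every start position and every seed length.
// Returns 0 for an empty string (an assumption; the task has n >= 1).
int maxRepeatBrute(const std::string& u) {
    int best = u.empty() ? 0 : 1;
    for (std::size_t L = 1; L <= u.size(); ++L)
        for (std::size_t start = 0; start + L <= u.size(); ++start)
            best = std::max(best, countCopies(u, start, L));
    return best;
}
```

Calling `maxRepeatBrute("babbabaabaabaabab")` on the sample string should agree with the expected output of 4.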

     

    Problem link: SPOJ Repeats

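    The paper is rather vague on this problem: it suddenly starts matching forward, then also matches backward, without explaining how the matching is actually done. I ended up writing the code by following this blog post: (link)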

    Here is my own understanding of why $LCP(i,i+L)/L+1$ (with integer division) is the number of repetitions.

    First, for a string $str$ built from a repeating unit, with length $len$ and smallest period $k$, we have $str[i]==str[k+i]$ for every $0 \le i \le len-1-k$.
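
The property above can be checked directly. The following is a small sketch; `hasPeriod` is a hypothetical helper name, not something from the original solution. Note that the check also passes when the length is not a whole multiple of $k$, which is exactly why leftover characters show up in the argument that follows.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// True if str[i] == str[i + k] for every 0 <= i <= len - 1 - k,
// i.e. k is a period of str. Hypothetical helper for illustration.
bool hasPeriod(const std::string& str, std::size_t k) {
    assert(k >= 1);
    for (std::size_t i = 0; i + k < str.size(); ++i)
        if (str[i] != str[i + k])
            return false;
    return true;
}
```

For instance, `hasPeriod("abaabaabaaba", 3)` holds for the (4,3)-repeat from the problem statement, while `hasPeriod("abaabaabaaba", 2)` does not.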

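    Now back to the LCP. Let $lcp$ be the longest common prefix of the suffixes starting at $i$ and $i+L$, where $L$ is the period length being enumerated and $i$ is the current position. Then $S[i+j]==S[i+L+j]$ for $0 \le j \le lcp-1$. This has the same shape as the property above with $k=L$, so $len-1-k=lcp-1$, which simplifies to $len=lcp+k$. Extending only forward from $i$, the number of repetitions is therefore $\lfloor (lcp+k)/k \rfloor = \lfloor lcp/k \rfloor + 1$.

    That is only the best result when extending forward. A few characters just before $i$ might also match, and together with the $lcp\%L$ leftover characters at the end they could make up one more full $L$. For that, at least $L-lcp\%L$ characters must be added in front, so we test exactly that position, $j=i-(L-lcp\%L)$: if the run starting at $j$ gains one more copy of $L$, extending backward pays off and the count goes up by one.

    Why not look further back and try to gain 2, 3, 4 or more copies of $L$? It is not necessary: if more copies could be gained in front, they would already have been counted in the forward $lcp$ at an earlier position of the enumeration. Index validity must be checked throughout, since extending back to a negative position is not allowed. Finally, the answer must be initialized to 1, because a single copy is always possible; the code below then enumerates $L$ starting from 1.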
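
To isolate the enumeration from the suffix array machinery, here is a sketch of the same forward and backward extension with the LCP computed by direct character comparison. `naiveLcp` and `maxRepeats` are hypothetical names, and the naive LCP is a deliberate stand-in for the suffix array + RMQ query, so this version is only meant for small inputs.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Longest common prefix of the suffixes s[a..] and s[b..], by direct
// comparison (stand-in for the suffix array + sparse table query).
int naiveLcp(const std::string& s, int a, int b) {
    assert(a >= 0 && b >= 0);
    int n = static_cast<int>(s.size());
    int k = 0;
    while (a + k < n && b + k < n && s[a + k] == s[b + k])
        ++k;
    return k;
}

// Enumerate the period L and the positions 0, L, 2L, ... as in the text.
int maxRepeats(const std::string& s) {
    int n = static_cast<int>(s.size());
    int ans = n > 0 ? 1 : 0;  // one copy is always possible
    for (int L = 1; L < n; ++L) {
        for (int i = 0; i + L < n; i += L) {
            int lcp = naiveLcp(s, i, i + L);
            int cnt = lcp / L + 1;      // forward extension only
            int j = i - (L - lcp % L);  // start shifted left by L - lcp % L
            // Backward extension: only worth testing if there is a leftover.
            if (lcp % L != 0 && j >= 0 && naiveLcp(s, j, j + L) / L + 1 > cnt)
                ++cnt;
            ans = std::max(ans, cnt);
        }
    }
    return ans;
}
```

On the sample string, the period $L=3$ at $i=6$ gives $lcp=7$ and a forward count of 3; shifting the start to $j=4$ gives $lcp=9$, which raises the count to 4.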

    Code:

    #include <stdio.h>
    #include <algorithm>
    using namespace std;
    const int N = 50010;
    int wa[N], wb[N], cnt[N], sa[N];
    int ran[N], height[N];
    char s[N];
    
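    // True if positions a and b have equal rank pairs (first half, second half),
    // i.e. the suffixes at a and b agree on their first 2 * d characters.
    inline int cmp(int r[], int a, int b, int d)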
    {
        return r[a] == r[b] && r[a + d] == r[b + d];
    }
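    // Prefix-doubling suffix array construction with radix sort.
    // n counts the characters of s including the terminating '\0' sentinel;
    // m is an upper bound on the character values. The result is stored in sa[].
    void DA(int n, int m)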
    {
        int i;
        int *x = wa, *y = wb;
        for (i = 0; i < m; ++i)
            cnt[i] = 0;
        for (i = 0; i < n; ++i)
            ++cnt[x[i] = s[i]];
        for (i = 1; i < m; ++i)
            cnt[i] += cnt[i - 1];
        for (i = n - 1; i >= 0; --i)
            sa[--cnt[x[i]]] = i;
        for (int k = 1; k <= n; k <<= 1)
        {
            int p = 0;
            for (i = n - k; i < n; ++i)
                y[p++] = i;
            for (i = 0; i < n; ++i)
                if (sa[i] >= k)
                    y[p++] = sa[i] - k;
            for (i = 0; i < m; ++i)
                cnt[i] = 0;
            for (i = 0; i < n; ++i)
                ++cnt[x[y[i]]];
            for (i = 1; i < m; ++i)
                cnt[i] += cnt[i - 1];
            for (i = n - 1; i >= 0; --i)
                sa[--cnt[x[y[i]]]] = y[i];
            swap(x, y);
            x[sa[0]] = 0;
            p = 1;
            for (i = 1; i < n; ++i)
                x[sa[i]] = cmp(y, sa[i - 1], sa[i], k) ? p - 1 : p++;
            m = p;
            if (m >= n)
                break;
        }
    }
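    // Builds ran[] (the inverse of sa[]) and height[], where height[r] is the
    // LCP of the suffixes ranked r - 1 and r. n is the length without the sentinel.
    void gethgt(int n)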
    {
        int i, k = 0;
        for (i = 1; i <= n; ++i)
            ran[sa[i]] = i;
        for (i = 0; i < n; ++i)
        {
            if (k)
                --k;
            int j = sa[ran[i] - 1];
            while (s[j + k] == s[i + k])
                ++k;
            height[ran[i]] = k;
        }
    }
    // Sparse table (RMQ-ST) over height[] for O(1) range-minimum queries.
    namespace SG
    {
        int dp[N][17];
        void init(int l, int r)
        {
            int i, j;
            for (i = l; i <= r; ++i)
                dp[i][0] = height[i];
            for (j = 1; l + (1 << j) - 1 <= r; ++j)
            {
                for (i = l; i + (1 << j) - 1 <= r; ++i)
                    dp[i][j] = min(dp[i][j - 1], dp[i + (1 << (j - 1))][j - 1]);
            }
        }
        int ask(int l, int r)
        {
            int len = r - l + 1;
            int k = 0;
            while (1 << (k + 1) <= len)
                ++k;
            return min(dp[l][k], dp[r - (1 << k) + 1][k]);
        }
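        // LCP of the suffixes starting at positions l and r of s:
        // the minimum of height[] over the ranks strictly after the smaller one.
        int LCP(int l, int r, int len)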
        {
            l = ran[l], r = ran[r];
            if (l > r)
                swap(l, r);
            if (l == r)
                return len - sa[l];
            return ask(l + 1, r);
        }
    }
    int main(void)
    {
        int T, len, i;
        scanf("%d", &T);
        while (T--)
        {
            scanf("%d", &len);
            for (i = 0; i < len; ++i)
                scanf("%s", s + i);
            // len + 1 so that the '\0' terminator acts as the smallest sentinel suffix.
            DA(len + 1, 130);
            gethgt(len);
            SG::init(1, len);
            int ans = 1;
            for (int L = 1; L < len; ++L)
            {
                for (i = 0; i + L < len; i += L)
                {
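                    // Forward extension: count the copies of length L starting at i.
                    int lcp = SG::LCP(i, i + L, len);
                    int cnt = lcp / L + 1;
                    // Backward extension: shift the start left by L - lcp % L so the
                    // leftover lcp % L characters can complete one more copy.
                    int j = i - (L - lcp % L);
                    if (j >= 0 && lcp % L != 0 && SG::LCP(j, j + L, len) / L + 1 > cnt)
                        ++cnt;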
                    ans = max(ans, cnt);
                }
            }
            printf("%d\n", ans);
        }
        return 0;
    }
  • Original article: https://www.cnblogs.com/Blackops/p/7510139.html