  • ACdream 1430: SETI (suffix array, counting non-overlapping repeated substrings)

    SETI

    Time Limit: 4000/2000MS (Java/Others) Memory Limit: 128000/64000KB (Java/Others)

    Problem Description

          Amateur astronomers Tom and Bob try to find radio broadcasts of extraterrestrial civilizations in the air. Recently they received some strange signal and represented it as a word consisting of small letters of the English alphabet. Now they wish to decode the signal. But they do not know what to start with.
          They think that the extraterrestrial message consists of words, but they cannot identify them. Tom and Bob call a subword of the message a potential word if it has at least two non-overlapping occurrences in the message.

          For example, if the message is “abacabacaba”, “abac” is a potential word, but “acaba” is not because two of its occurrences overlap.
          Given a message m help Tom and Bob to find the number of potential words in it.
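
    The definition above can be checked directly for a single candidate word. The following is a minimal sketch, not part of the original problem statement; the helper name `isPotentialWord` is hypothetical. It relies on the observation that the leftmost and rightmost occurrences are the pair farthest apart, so testing that one pair is sufficient.

```cpp
#include <string>

// Hypothetical helper (an illustration, not from the problem statement):
// returns true if `word` has at least two non-overlapping occurrences in `message`.
bool isPotentialWord(const std::string& message, const std::string& word) {
    if (word.empty()) return false;
    std::size_t first = message.find(word);    // start of the leftmost occurrence
    if (first == std::string::npos) return false;
    std::size_t last = message.rfind(word);    // start of the rightmost occurrence
    // Two occurrences do not overlap when their starts are at least
    // word.size() positions apart.
    return last - first >= word.size();
}
```

    For the message "abacabacaba", "abac" occurs at positions 0 and 4 (distance 4, length 4), so it qualifies; "acaba" occurs at positions 2 and 6 (distance 4, length 5), so it does not.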

    Input

          Input file contains one string that consists of small letters of the English alphabet. The length of the message doesn’t exceed 10 000.

    Output

          Output one integer number — the number of potential words in a message.

    Sample Input

    abacabacaba

    Sample Output

    15

    Source

    Andrew Stankevich Contest 23
    Problem summary: given a string, count the distinct substrings that occur at least twice without overlapping.
     
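    Approach: a standard suffix array template problem. Build the height array of the string, then for every common-prefix length from 1 to n/2 count how many substrings of that length qualify, and add up the results.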
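
    The counting step of this approach can be sketched in isolation. The version below is an illustration under stated assumptions, not the author's code: the function name `countByHeightGroups` is hypothetical, and the suffixes are sorted naively with `std::sort`, which is fine for a demo but too slow for long inputs. For each length, consecutive suffixes whose height is at least that length form one group (one distinct substring), and the group counts if its smallest and largest start positions are far enough apart.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical sketch: count distinct substrings that have two
// non-overlapping occurrences, using a naively built suffix array.
long long countByHeightGroups(const std::string& s) {
    int n = (int)s.size();
    std::vector<int> sa(n);
    std::iota(sa.begin(), sa.end(), 0);
    // Naive suffix sort (assumption: acceptable only for small inputs).
    std::sort(sa.begin(), sa.end(), [&](int a, int b) {
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });
    // height[i] = longest common prefix of suffixes sa[i-1] and sa[i].
    std::vector<int> height(n, 0);
    for (int i = 1; i < n; i++) {
        int a = sa[i - 1], b = sa[i], k = 0;
        while (a + k < n && b + k < n && s[a + k] == s[b + k]) k++;
        height[i] = k;
    }
    long long total = 0;
    for (int len = 1; len <= n / 2; len++) {
        int i = 0;
        while (i < n) {
            // Extend the group while neighbours share a prefix of >= len.
            int lo = sa[i], hi = sa[i], j = i + 1;
            while (j < n && height[j] >= len) {
                lo = std::min(lo, sa[j]);
                hi = std::max(hi, sa[j]);
                j++;
            }
            // hi - lo >= len means two occurrences that do not overlap.
            if (hi - lo >= len) total++;
            i = j;
        }
    }
    return total;
}
```

    A group with a single suffix has hi == lo, so it is never counted. Lengths above n/2 are skipped because two non-overlapping copies would not fit in the string.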
     
     
    #include<stdio.h>
    #include<algorithm>
    #include<string.h>
    #include<stdlib.h>
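    // Import only the names used below: with `using namespace std`, the global
    // array `rank` can become ambiguous with std::rank on C++11 and later.
    using std::min;
    using std::max;
    using std::swap;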
    const int maxn = 1e5+200;
    const int INF = 0x3f3f3f3f;
    char s[maxn];
    int sa[maxn], t[maxn], t2[maxn], c[maxn];
    int rank[maxn], height[maxn];
    void build_sa(int n, int m){    // build sa for s[0..n-1]; character codes must lie in [0, m)
        int i, *x = t, *y = t2;
        // initial pass: counting sort on single characters
        for(i = 0; i < m; i++) c[i] = 0;
        for(i = 0; i < n; i++) c[x[i] = s[i]]++;
        for(i = 1; i < m; i++) c[i] += c[i-1];
        for(i = n-1; i >= 0; i--) sa[--c[x[i]]] = i;
        for(int k = 1; k <= n; k <<= 1){
            int p = 0;
            // sort by the second key by reusing the previous sa; y becomes a provisional sa
            for(i = n-k; i < n; i++) y[p++] = i;
            for(i = 0; i < n; i++) if(sa[i] >= k) y[p++] = sa[i]-k;
            // counting sort on the first key: the provisional order y and the previous ranks x give the new sa
            for(i = 0; i < m; i++) c[i] = 0;
            for(i = 0; i < n; i++) c[x[y[i]]]++;    // x plays the role of the rank array here
            for(i = 1; i < m; i++) c[i] += c[i-1];
            for(i = n-1; i >= 0; i--) sa[--c[x[y[i]]]] = y[i];
            // after the swap, y holds the previous ranks; derive the new ranks x from the new sa and y
            swap(x,y);
            p = 1; x[sa[0]] = 0;
            for(i = 1; i < n; i++)
                x[sa[i]] = y[sa[i-1]] == y[sa[i]] && y[sa[i-1]+k] == y[sa[i]+k] ? p-1 : p++;
            if(p >= n) break;   // all ranks are distinct, so the order is final
            m = p;              // the next round has only p distinct keys
        }
    }
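    void getheight(int n) {     // n is the number of suffixes in sa, including the terminator
        int i, j, k = 0;
        for(i = 0; i < n; i++) rank[sa[i]] = i;
        for(i = 0; i < n; i++) {
            if(rank[i] == 0) {  // the smallest suffix has no predecessor; skipping it avoids reading sa[-1]
                height[0] = 0;
                k = 0;
                continue;
            }
            if(k) k--;
            j = sa[rank[i]-1];
            while(s[i+k] == s[j+k]) k++;    // stops at the terminator, which occurs only once
            height[rank[i]] = k;
        }
    }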
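    int check(int mid, int n){  // number of distinct substrings of length mid with two non-overlapping occurrences
        int mi = INF, mx = 0, num = 0, ret = 0;
        for(int i = 2; i <= n+1; i++){  // start at i = 2: the smallest suffix sa[0] is always the terminator
            if(i == n+1 || height[i] < mid){
                // printf("%d %d %d\n", i, height[i], mid);
                // sa[i-1] is the last member of the current group
                mi = min(mi, sa[i-1]);
                mx = max(mx, sa[i-1]);
                if(mx - mi >= mid && num >= 1){  // mx - mi >= mid: no overlap; num >= 1: the substring repeats
                    ret++;
                }
                mx = 0; mi = INF;
                num = 0;
            }
            else if(height[i] >= mid){  // sa[i-1] and sa[i] share a common prefix of length at least mid
                mi = min(mi, sa[i-1]);
                mx = max(mx, sa[i-1]);
                num++;
            }
        }
        return ret;
    }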
    int main(){
        int n;
        while(scanf("%s", s) != EOF){
            n = strlen(s);
            // Alternative: append '#' as the terminator and pass n to build_sa instead:
            //     s[n++] = '#';
            //     s[n] = '\0';
            //     build_sa(n, 200);
            // Either way the terminator must be the smallest character and must be >= 0.
            build_sa(n+1, 200);     // pass n+1 so the string's own '\0' acts as the terminator
            getheight(n+1);         // same length as passed to build_sa

            int ans = 0;
            for(int i = 1; i <= n/2; i++){  // try every common-prefix length
                // printf("%d %d+++\n", i, check(i, n));
                ans += check(i, n);
            }
            printf("%d\n", ans);
        }
        return 0;
    }
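
    To sanity-check a suffix array solution on small inputs, a brute-force counter is handy. The sketch below is an addition, not from the original post, and the name `countBruteForce` is hypothetical. For every length it records the first and last start position of each distinct substring and counts those whose positions are at least that length apart. It copies every substring, so it is only meant for short test strings.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical cross-check: brute-force count of distinct substrings with
// two non-overlapping occurrences. Intended for short test strings only.
long long countBruteForce(const std::string& s) {
    int n = (int)s.size();
    long long total = 0;
    for (int len = 1; len <= n / 2; len++) {
        // substring -> (first start position, last start position)
        std::map<std::string, std::pair<int, int>> span;
        for (int i = 0; i + len <= n; i++) {
            std::string sub = s.substr(i, len);
            auto it = span.find(sub);
            if (it == span.end()) span.emplace(sub, std::make_pair(i, i));
            else it->second.second = i;  // positions are visited in increasing order
        }
        for (const auto& kv : span)
            if (kv.second.second - kv.second.first >= len) total++;
    }
    return total;
}
```

    On the sample input "abacabacaba" this gives 15, matching the expected output: 3 substrings of length 1, and 4 each of lengths 2, 3 and 4.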
    

      

  • Original post: https://www.cnblogs.com/chengsheng/p/4885618.html