  • [Strings] [Hash] [Doubling] Luogu P3502 [POI2010] CHO-Hamsters: Solution

        This problem combines string modeling with graph theory.

    Problem description

    Byteasar breeds hamsters.

    Each hamster has a unique name, consisting of lower case letters of the English alphabet.

    The hamsters have a vast and comfortable cage.
    Byteasar intends to place a display under the cage to visualize the names of his hamsters.
    This display is simply a sequence of letters, each of which can be either lit or not independently.

    Only one name will be displayed simultaneously.

    The lit letters forming the name have to stand next to each other, i.e., form a contiguous subsequence.

    Byteasar wants to be able to display the names of the hamsters on at least $m$ different positions.

    However, he allows displaying the same name on multiple different positions, and does not require to be able to display each and every hamster's name.

    Note that the occurrences of the names on the display can overlap.

    You can assume that no hamster's name occurs (as a contiguous fragment) in any other hamster's name.

    Byteasar asks your help in determining the minimum number of letters the display has to have.

    In other words, you are to determine the minimum length of a string (consisting of non-capital letters of the English alphabet) that has at least $m$ total occurrences of the hamsters' names (counting multiplicities).

    (We say that a string $s$ occurs in the string $t$ if $s$ forms a contiguous fragment of $t$.)

    Input and output format

    Input format:

    The first line of the standard input holds two integers $n$ and $m$ ($1\le n\le 200$, $1\le m\le 10^9$), separated by a single space, that denote the number of Byteasar's hamsters and the minimum number of occurrences of the hamsters' names on the display.

    Each of the following $n$ lines contains a non-empty string of non-capital letters of the English alphabet that is the hamster's name.

    The total length of all names does not exceed $100000$ letters.

    Output format:

    The first and only line of the standard output should hold a single integer - the minimum number of letters the display has to have.

    Sample input and output

    Sample input #1:

    4 5
    monika
    tomek
    szymon
    bernard

    Sample output #1:

    23

    Summary:

        Given $n$ strings $s_i$, none of which contains another, find a shortest string $S$ in which the strings $s_i$ occur $m$ times in total. Output the length of $S$.

    Solution:

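        Building a graph is the natural idea. But how should distances be defined, and how do we cope with a path of up to $10^9$ vertices? There are at most 200 strings, so a Floyd-style all-pairs approach is feasible. Edges carry weights and every vertex has weight 1 (it contributes one occurrence): for the string to contain $m$ occurrences, a path must visit $m$ vertices. The weight of the edge $(i,j)$ is the minimum number of characters that must be appended after $s_i$ so that the result ends with $s_j$.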
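
        The edge weight described above can be sketched with a naive overlap check. This is only an illustration, not the matching method used in the code below: the function name `appendCost` is made up for this example, and the quadratic comparison is assumed to be acceptable only for tiny inputs.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical helper: cost of the edge a -> b, i.e. the fewest characters
// that must be appended after a so that the text ends with b.
// Naive O(|a| * |b|) check, for illustration only.
long long appendCost(const std::string& a, const std::string& b) {
    assert(!a.empty() && !b.empty());
    std::size_t maxOverlap = std::min(a.size(), b.size());
    // A name may not overlap itself completely: that would be the same
    // occurrence, not a new one.
    if (a == b) --maxOverlap;
    for (std::size_t k = maxOverlap; k > 0; --k) {
        // Do the last k characters of a equal the first k characters of b?
        if (a.compare(a.size() - k, k, b, 0, k) == 0)
            return static_cast<long long>(b.size() - k);
    }
    return static_cast<long long>(b.size());  // no overlap at all
}
```

        On the sample, `appendCost("szymon", "monika")` is 3 because the two names share "mon", while `appendCost("monika", "tomek")` is 5 because nothing overlaps.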

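        So we can use Floyd combined with doubling; the Floyd state is comprehensive and can express many things. Let $f[k][i][j]$ be the cost of the shortest path from $i$ to $j$ that uses exactly $2^k$ edges, that is, appends $2^k$ further names. The transitions are Floyd-style, but layer $k$ may only be computed from layer $k-1$: $f[k][i][j]=\min_t\left(f[k-1][i][t]+f[k-1][t][j]\right)$.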

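        Each inner step is an ordinary Floyd-style triple loop and the outer loop is the doubling, so this part costs $O(n^3\log m)$. Matching the strings takes some technique; here I used string hashing. Its complexity is not the intended one, but with -O2 enabled it still passed. The intended solution uses an Aho-Corasick automaton and KMP to guarantee the complexity, but using string hashing still taught me something.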
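
        The doubling idea can be sketched as repeated min-plus squaring of the cost matrix, combined along the binary digits of $m-1$. The names `Matrix`, `minPlus` and `shortestWithOccurrences` are invented for this sketch, and it squares the matrix on the fly instead of storing every layer as the code below does.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

using Matrix = std::vector<std::vector<long long>>;
const long long INF = 1000000000000000000LL;  // larger than any real path cost

// Min-plus product: c[i][j] = min over k of a[i][k] + b[k][j].
Matrix minPlus(const Matrix& a, const Matrix& b) {
    std::size_t n = a.size();
    Matrix c(n, std::vector<long long>(n, INF));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t k = 0; k < n; ++k)
            for (std::size_t j = 0; j < n; ++j)
                c[i][j] = std::min(c[i][j], a[i][k] + b[k][j]);
    return c;
}

// len[i]: length of name i; cost[i][j]: characters appended to go from
// name i to name j. Returns the minimum text length with m occurrences.
long long shortestWithOccurrences(const std::vector<long long>& len,
                                  Matrix cost, long long m) {
    assert(m >= 1 && len.size() == cost.size());
    std::vector<long long> dist = len;  // one occurrence: the name itself
    long long steps = m - 1;            // names that still have to be appended
    while (steps > 0) {
        if (steps & 1) {
            // Extend every best path by the number of edges cost represents.
            std::vector<long long> next(dist.size(), INF);
            for (std::size_t k = 0; k < dist.size(); ++k)
                for (std::size_t j = 0; j < dist.size(); ++j)
                    next[j] = std::min(next[j], dist[k] + cost[k][j]);
            dist = next;
        }
        steps >>= 1;
        if (steps > 0) cost = minPlus(cost, cost);  // now covers twice as many edges
    }
    return *std::min_element(dist.begin(), dist.end());
}
```

        With the sample names in input order (lengths 6, 5, 6, 7, and the only overlapping edge being szymon to monika with cost 3), calling this with $m=5$ returns 23, which agrees with the sample output.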

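        String hashing represents a string as a number in base 26 or 27: the $i$-th character is multiplied by $26^i$ or $26^{|s|-i-1}$. To test two substrings for equality, one of them has to be multiplied up to the same order as the other. Take abc and bcd: they decompose into $1+2\times 26+3\times 26^2$ and $2+3\times 26+4\times 26^2$. To check whether the bc in the first string equals the bc in the second, extract the two segments (prefix sums are enough). This gives $2\times 26+3\times 26^2$ and $2+3\times 26$; the ratio between the two follows from their positions in the original strings (here it is $26$), and multiplying the smaller one by this ratio brings both to the same order.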
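
        The comparison just described can be sketched as follows. The helper names `prefixHash`, `power26` and `suffixMatchesPrefix` are made up for this example, and the hashes are recomputed on every call for brevity; a real solution would precompute them once per name. The modulus is the one used in the code below. A hash match is only probabilistic evidence of equality.

```cpp
#include <cassert>
#include <string>
#include <vector>

const long long MOD = 19260817;  // modulus used by the code in this post

// pre[i] = sum of (s[t] - 'a' + 1) * 26^t for t < i, modulo MOD (pre[0] = 0).
std::vector<long long> prefixHash(const std::string& s) {
    std::vector<long long> pre(s.size() + 1, 0);
    long long pw = 1;
    for (std::size_t i = 0; i < s.size(); ++i) {
        pre[i + 1] = (pre[i] + pw * (s[i] - 'a' + 1)) % MOD;
        pw = pw * 26 % MOD;
    }
    return pre;
}

// 26^e modulo MOD, computed by a simple loop.
long long power26(std::size_t e) {
    long long r = 1;
    while (e-- > 0) r = r * 26 % MOD;
    return r;
}

// True if the hashes of the last l characters of x and the first l
// characters of y agree after shifting both to the same order.
bool suffixMatchesPrefix(const std::string& x, const std::string& y, std::size_t l) {
    assert(l <= x.size() && l <= y.size());
    std::vector<long long> hx = prefixHash(x), hy = prefixHash(y);
    std::size_t shift = x.size() - l;  // the suffix of x starts at weight 26^shift
    long long suffix = (hx[x.size()] - hx[shift] + MOD) % MOD;
    long long prefix = hy[l] * power26(shift) % MOD;  // lift the prefix of y to that order
    return suffix == prefix;
}
```

        For the example in the text, `suffixMatchesPrefix("abc", "bcd", 2)` compares $2\times 26+3\times 26^2$ with $(2+3\times 26)\times 26$ and reports a match.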

    Code:

    #include<cstdio>
    #include<cstring>
    long long Min(long long x,long long y)
    {
        return x<y?x:y;
    }
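    long long f[35][205][205];//f[t][i][j]: min characters appended going from name i to name j over 2^t edges
    char s[205][100010];
    int L[205];//L[i]: length of name i
    long long dis[205],tmp[205];//dis[j]: shortest text so far that ends with name j
    int Hash[205][100010];//Hash[i][j]: hash of s[i][0..j], character t weighted by 26^t
    int pow26[100100];
    bool Equal(int x,int y,int l)//compares the last l characters of s[x] with the first l characters of s[y]
    {
        //when l==L[x] the suffix is the whole string, so nothing is subtracted (avoids reading Hash[x][-1])
        long long pre=l<L[x]?Hash[x][L[x]-l-1]:0;
        return (Hash[x][L[x]-1]-pre+19260817)%19260817==(long long)Hash[y][l-1]*pow26[L[x]-l]%19260817;
    }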
    int main()
    {
        pow26[0]=1;
        for(int i=1;i<=100000;++i)
            pow26[i]=pow26[i-1]*26%19260817;
        memset(f,0x3f,sizeof(f));
        int n,m;
        scanf("%d%d",&n,&m);
        --m;//the first name is covered by dis[], so only m-1 more names are appended
        for(int i=1;i<=n;++i)
        {
            scanf("%s",s[i]);
            L[i]=strlen(s[i]);
            dis[i]=L[i];
            for(int j=0;j<L[i];++j)
                if(j)
                    Hash[i][j]=(Hash[i][j-1]+pow26[j]*(s[i][j]-'a'+1))%19260817;
                else
                    Hash[i][j]=s[i][j]-'a'+1;
        }
        for(int i=1;i<=n;++i)
            for(int j=1;j<=n;++j)
            {
                int l=Min(L[i],L[j]);
                for(int k=(i==j?l-1:l);k;--k)
                    if(Equal(i,j,k))
                    {
                        f[0][i][j]=L[j]-k;
                        break;
                    }
                if(f[0][i][j]>10000000)
                    f[0][i][j]=L[j];
            }
        //f[t] is built only from f[t-1]: a path of 2^t edges is two paths of 2^(t-1) edges
        for(int t=1;t<=30;++t)
            for(int k=1;k<=n;++k)
                for(int j=1;j<=n;++j)
                    for(int i=1;i<=n;++i)
                        f[t][i][j]=Min(f[t-1][i][k]+f[t-1][k][j],f[t][i][j]);
        //apply the layers that correspond to the set bits of m (already reduced by one)
        for(int i=0;i<=30;++i)
            if(m&(1<<i))
            {
                for(int j=1;j<=n;++j)
                {
                    tmp[j]=0x7ffffffffffffffll;
                    for(int k=1;k<=n;++k)
                        tmp[j]=tmp[j]<dis[k]+f[i][k][j]?tmp[j]:dis[k]+f[i][k][j];
                }
                for(int j=1;j<=n;++j)
                    dis[j]=tmp[j];
            }
        long long ans=0x7ffffffffffffffll;
        for(int i=1;i<=n;++i)
            ans=ans<dis[i]?ans:dis[i];
        printf("%lld\n",ans);
        return 0;
    }
     
  • Original article: https://www.cnblogs.com/wjyyy/p/lg3502.html