  • BZOJ4779: [Usaco2017 Open]Bovine Genomics

    Problem Description

    Farmer John owns N cows with spots and N cows without spots. Having just completed a course in bovine genetics, he is convinced that the spots on his cows are caused by mutations in the bovine genome. At great expense, Farmer John sequences the genomes of his cows. Each genome is a string of length M built from the four characters A, C, G, and T. When he lines up the genomes of his cows, he gets a table like the following, shown here for N=3 and M=8:

    Positions:    1 2 3 4 5 6 7 8
    Spotty Cow 1: A A T C C C A T
    Spotty Cow 2: A C T T G C A A
    Spotty Cow 3: G G T C G C A A
    Plain Cow 1:  A C T C C C A G
    Plain Cow 2:  A C T C G C A T
    Plain Cow 3:  A C T T C C A T

    Looking carefully at this table, he surmises that the sequence from position 2 through position 5 is sufficient to explain spottiness. That is, by looking at the characters in just these positions (that is, positions 2…5), Farmer John can predict which of his cows are spotty and which are not. For example, if he sees the characters GTCG in these locations, he knows the cow must be spotty. Please help FJ find the length of the shortest sequence of positions that can explain spottiness.

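    In short: given n strings of type A and n strings of type B, all of length m, find the shortest interval [l,r] such that no A string a and B string b satisfy a[l,r]=b[l,r].

    n,m≤500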
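
    As a cross-check for small inputs, the condition above can be tested directly on substrings, with no hashing. The following is a minimal sketch and not the method used in this post: the names `windowExplains` and `shortestBrute` are made up for illustration, positions are 0-indexed, and the running time is far too slow for n, m = 500.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper: true if the window starting at l (0-indexed) with
// length len separates the groups, i.e. no substring taken from a spotty
// genome equals the substring taken from any plain genome.
bool windowExplains(const std::vector<std::string>& spotty,
                    const std::vector<std::string>& plain,
                    std::size_t l, std::size_t len) {
    std::set<std::string> seen;
    for (const auto& s : spotty) seen.insert(s.substr(l, len));
    for (const auto& t : plain)
        if (seen.count(t.substr(l, len))) return false;
    return true;
}

// Hypothetical helper: tries every length from 1 upward and every start
// position, returning the first length that works (0 if none does).
std::size_t shortestBrute(const std::vector<std::string>& spotty,
                          const std::vector<std::string>& plain) {
    assert(!spotty.empty() && !plain.empty());
    const std::size_t m = spotty[0].size();
    for (std::size_t len = 1; len <= m; ++len)
        for (std::size_t l = 0; l + len <= m; ++l)
            if (windowExplains(spotty, plain, l, len)) return len;
    return 0;
}
```

    Called on the six sample genomes, `shortestBrute` returns 4, and `windowExplains(spotty, plain, 1, 4)` is true because that window is positions 2…5 from the statement.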


    Input Format

    The first line of input contains N (1≤N≤500) and M (3≤M≤500). The next N lines each contain a string of M characters; these describe the genomes of the spotty cows. The final N lines describe the genomes of the plain cows. No spotty cow has the same exact genome as a plain cow.


    Output Format

    Please print the length of the shortest sequence of positions that is sufficient to explain spottiness. A sequence of positions explains spottiness if the spottiness trait can be predicted with perfect accuracy among Farmer John's population of cows by looking at just those locations in the genome.


    Sample Input

    3 8
    AATCCCAT
    ACTTGCAA
    GGTCGCAA
    ACTCCCAG
    ACTCGCAT
    ACTTCCAT


    Sample Output

    4


    Hint

    No hint provided.


    Source

    Gold

    Solution

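    My approach runs in O(nm log n log m) time, which is O(nm log^2 n) when n and m are of the same order.

    First, hash the strings. Feasibility is clearly monotone in the window length: if [l,r] separates the two groups, any longer window containing it does too. So we can binary search on the answer. For the check, use a set to hold the hash values.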
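
    The substring hash computed by `get` in the code follows from the prefix-hash definition. A short derivation, assuming 1-indexed strings, base B = 13131 as in the code, and all arithmetic taken modulo 2^64 (the wraparound of unsigned long long):

```latex
\begin{aligned}
h_0 &= 0, \qquad h_j = h_{j-1} B + s_j = \sum_{k=1}^{j} s_k B^{\,j-k}, \\
h_r - h_{l-1} B^{\,r-l+1}
  &= \sum_{k=1}^{r} s_k B^{\,r-k} - \sum_{k=1}^{l-1} s_k B^{\,r-k}
   = \sum_{k=l}^{r} s_k B^{\,r-k}.
\end{aligned}
```

    The right-hand side depends only on the characters s_l, …, s_r, so two equal substrings at the same window always get the same value. Unequal substrings can still collide, which is the usual caveat of hashing.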

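    Enumerate every interval of length x (the binary-search candidate) and insert the hash of that interval for each A string into the set. Then, for each B string, call find on the set to see whether its substring has already appeared. If some interval has no match for any B string, x is feasible.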

    #include <bits/stdc++.h>
    #define ll long long
    #define inf 0x3f3f3f3f
    #define il inline
    #define ull unsigned long long
    
    namespace io {
    
    #define in(a) a = read()
    #define out(a) write(a)
    #define outn(a) out(a), putchar('\n')
    
    #define I_int ll
    inline I_int read() {
        I_int x = 0, f = 1;
        char c = getchar();
        while (c < '0' || c > '9') {
            if (c == '-') f = -1;
            c = getchar();
        }
        while (c >= '0' && c <= '9') {
            x = x * 10 + c - '0';
            c = getchar();
        }
        return x * f;
    }
    char F[200];
    inline void write(I_int x) {
        if (x == 0) return (void) (putchar('0'));
        I_int tmp = x > 0 ? x : -x;
        if (x < 0) putchar('-');
        int cnt = 0;
        while (tmp > 0) {
            F[cnt++] = tmp % 10 + '0';
            tmp /= 10;
        }
        while (cnt > 0) putchar(F[--cnt]);
    }
    #undef I_int
    
    }
    using namespace io;
    
    using namespace std;
    
    #define N 510
    #define base 13131
    
    int n = read(), m = read();
    char s[N][N], t[N][N];
    ull h1[N][N], h2[N][N], p[N];
    set<ull>S;
    
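    // Hash of the substring [l, r] (1-indexed) from prefix hashes h;
    // unsigned overflow wraps modulo 2^64.
    ull get(ull *h, int l, int r) {
    	return h[r] - h[l-1] * p[r-l+1];
    }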
    
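    // Returns true if some window of length x has no hash shared
    // between the spotty strings (h1) and the plain strings (h2).
    bool check(int x) {
    	bool ans = 0;
    	for(int l = 1; l + x - 1 <= m; ++l) {
    		int r = l + x - 1, flag = 0;
    		S.clear();
    		// Collect the window hash of every spotty string.
    		for(int i = 1; i <= n; ++i) {
    			S.insert(get(h1[i], l, r));
    		}
    		// A plain string whose window hash is already in S is a clash.
    		for(int i = 1; i <= n; ++i) {
    			if(S.find(get(h2[i], l, r)) != S.end()) {
    				flag = 1;
    				break;
    			}
    		}
    		// No clash in this window: length x is feasible.
    		if(!flag) {
    			ans = 1;
    			break;
    		}
    	}
    	return ans;
    }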
    
    int main() { 
    	for(int i = 1; i <= n; ++i) scanf("%s",s[i]+1);
    	for(int i = 1; i <= n; ++i) scanf("%s",t[i]+1);
    	p[0] = 1;
    	for(int i = 1; i <= m; ++i) p[i] = p[i - 1] * base;
    	for(int i = 1; i <= n; ++i) {
    		for(int j = 1; j <= m; ++j) h1[i][j] = h1[i][j-1]*base+(ull)s[i][j];
    		for(int j = 1; j <= m; ++j) h2[i][j] = h2[i][j-1]*base+(ull)t[i][j]; 
    	}
    	int l = 1, r = m, ans = m;
    	while(l <= r) {
    		int mid = (l + r) >> 1;
    		if(check(mid)) ans = mid, r = mid - 1;
    		else l = mid + 1;
    	}
    	outn(ans);
    	return 0;
    }
    
  • Original article: https://www.cnblogs.com/henry-1202/p/10631415.html