  • HDU DNA repair (AC automaton + DP)

    DNA repair

    Time Limit : 5000/2000ms (Java/Other)   Memory Limit : 32768/32768K (Java/Other)
    Total Submission(s) : 2   Accepted Submission(s) : 2

    Problem Description

    Biologists have finally invented techniques for repairing DNA that contains segments causing various inherited diseases. For the sake of simplicity, a DNA is represented as a string containing the characters 'A', 'G', 'C' and 'T'. The repairing techniques simply change some characters to eliminate all segments causing diseases. For example, we can repair the DNA "AAGCAG" to "AGGCAC", eliminating the disease-causing segments "AAG", "AGC" and "CAG" by changing two characters. Note that the repaired DNA can still contain only the characters 'A', 'G', 'C' and 'T'.

    You are to help the biologists repair a DNA by changing the least number of characters.

    Input

    The input consists of multiple test cases. Each test case starts with a line containing one integer N (1 ≤ N ≤ 50), which is the number of DNA segments causing inherited diseases.
    The following N lines give N non-empty strings of length not greater than 20 containing only characters in "AGCT", which are the DNA segments causing inherited diseases.
    The last line of the test case is a non-empty string of length not greater than 1000 containing only characters in "AGCT", which is the DNA to be repaired.

    The last test case is followed by a line containing a single zero.

    Output

    For each test case, print a line containing the test case number (beginning with 1) followed by the
    number of characters that need to be changed. If it is impossible to repair the given DNA, print -1.

    Sample Input

    2
    AAA
    AAG
    AAAG    
    2
    A
    TG
    TGAATG
    4
    A
    G
    C
    T
    AGT
    0
    

    Sample Output

    Case 1: 1
    Case 2: 4
    Case 3: -1
    

    Source

    2008 Asia Hefei Regional Contest Online by USTC
     
     
     
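    Problem summary: given a DNA string and a set of virus DNA sequences, find the minimum number of characters of the DNA string that must be changed so that it no longer contains any virus sequence.
    Solution idea: if the DNA string contains a virus string, the AC automaton reaches a matching state while scanning it; otherwise it never does. To avoid a match, we change characters of the DNA while walking the automaton so that no matching state is ever entered. The states we enumerate on the automaton are exactly the feasible ones, i.e. those that do not complete a virus string (in other words, the AC automaton is used to compress the DP state).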
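The matching idea can be sketched with a small array-based automaton. This is only an illustration, not the post's code: the names `Automaton`, `code`, and `containsPattern` are hypothetical, and the alphabet order "ACTG" is chosen to mirror `find()` in the listing further down. A state is flagged `bad` when some virus string ends there or at one of its suffix states; the DNA contains a virus string exactly when the walk enters a `bad` state.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <queue>
#include <string>
#include <vector>

// Hypothetical array-based Aho-Corasick automaton over the alphabet "ACTG".
struct Automaton {
    std::vector<std::array<int, 4>> next;  // goto table, completed with failure moves by build()
    std::vector<int> fail;                 // failure links
    std::vector<bool> bad;                 // true if a pattern ends here or at a suffix state

    Automaton() { addNode(); }             // node 0 is the root

    int addNode() {
        std::array<int, 4> row;
        row.fill(-1);
        next.push_back(row);
        fail.push_back(0);
        bad.push_back(false);
        return static_cast<int>(next.size()) - 1;
    }

    static int code(char ch) {
        std::size_t pos = std::string("ACTG").find(ch);
        assert(pos != std::string::npos);  // input is assumed to contain only A, C, T, G
        return static_cast<int>(pos);
    }

    void insert(const std::string& p) {
        int u = 0;
        for (char ch : p) {
            int c = code(ch);
            if (next[u][c] == -1) {
                int v = addNode();         // allocate first: push_back may reallocate next
                next[u][c] = v;
            }
            u = next[u][c];
        }
        bad[u] = true;
    }

    // BFS over the trie: set failure links and fill in the missing transitions.
    void build() {
        std::queue<int> q;
        for (int c = 0; c < 4; ++c) {
            int v = next[0][c];
            if (v == -1) next[0][c] = 0;
            else { fail[v] = 0; q.push(v); }
        }
        while (!q.empty()) {
            int u = q.front(); q.pop();
            if (bad[fail[u]]) bad[u] = true;   // fail[u] is shallower, so its flag is final
            for (int c = 0; c < 4; ++c) {
                int v = next[u][c];
                if (v == -1) next[u][c] = next[fail[u]][c];
                else { fail[v] = next[fail[u]][c]; q.push(v); }
            }
        }
    }
};

// Returns true if dna contains at least one of the patterns as a substring.
bool containsPattern(const std::vector<std::string>& patterns, const std::string& dna) {
    Automaton ac;
    for (const auto& p : patterns) ac.insert(p);
    ac.build();
    int state = 0;
    for (char ch : dna) {
        state = ac.next[state][Automaton::code(ch)];
        if (ac.bad[state]) return true;
    }
    return false;
}
```

With the segments from the problem statement, `containsPattern({"AAG", "AGC", "CAG"}, "AAGCAG")` is true, while the repaired string "AGGCAC" gives false. The DP then only needs to keep the walk inside the states that are not `bad`.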
     

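    Problem: m DNA fragments (each containing a disease-causing gene) are given first, followed by a DNA sequence of length n. Find the minimum number of modifications needed so that the DNA sequence contains no disease-causing gene. One modification changes a single base of the DNA into another base, for example A into G.

    Constraints: 1<=m<=50, 1<=n<=1000

    Analysis: build the automaton first, then run a DP over it.

    State: dp[i][j] is the minimum number of modifications needed to reach automaton state j after taking i steps from the root.

    Transitions (states in which a disease fragment has been matched are never entered):

    1. dp[i][j]=MIN(dp[i-1][k]) when state k moves to state j on the character s[i], so no modification is needed;

    2. dp[i][j]=MIN(dp[i-1][k])+1 when state k moves to state j only on a character other than s[i], so s[i] has to be modified. (Take care to distinguish DP states from automaton states.)

    Initialization: dp[0][0]=0, and every other dp[0][i]=INF.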
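The two transition cases can be merged into one recurrence. The notation here is an assumption introduced for illustration and does not appear in the post: \delta(k,c) is the automaton transition from state k on base c, D is the set of states in which a disease fragment has been matched, s_i is the i-th character of the DNA (1-indexed), and [P] equals 1 if P holds and 0 otherwise. The `\substack` command requires the amsmath package.

```latex
\[
dp[0][0] = 0, \qquad dp[0][j] = \infty \quad (j \neq 0),
\]
\[
dp[i][j] \;=\; \min_{\substack{k,\; c \,\in\, \{A,C,G,T\} \\ \delta(k,c) = j}}
\bigl( dp[i-1][k] + [\, c \neq s_i \,] \bigr), \qquad j \notin D,
\]
\[
\text{answer} \;=\; \min_{j \notin D} dp[n][j], \quad \text{or } -1 \text{ if this minimum is } \infty .
\]
```

Since dp[0][j] is infinite for every j other than the root and states in D are never assigned a finite value, only paths that start at the root and avoid D contribute to the minimum.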

    #include<iostream>
    #include<cstdio>
    #include<cstring>
    #include<algorithm>   // std::min
    
    using namespace std;
    
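    const int N=1010;           // bounds both the trie nodes (at most 50*20+1) and the DNA length (at most 1000)
    const int INF=0x3f3f3f3f;   // same byte pattern as memset(dp,0x3f,...); INF+1 does not overflow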
    
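    struct Trie{
        int count;          // nonzero if a disease segment ends here or at a suffix state
        Trie *fail;         // failure link
        Trie *next[4];      // transitions, indexed by find()
        void init(){
            count=0;
            fail=NULL;
            memset(next,0,sizeof(next));   // memset takes an int fill value, so pass 0 rather than NULL
        }
    }*root,*q[N],a[N];      // a[]: node pool, q[]: BFS queue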
    
    int k,dp[N][N];     // k: number of nodes allocated; dp[i][j]: min changes after i characters, ending in node j
    char wrd[30];
    char str[1010];
    
    // Map a base to its child index (the input contains only A, G, C, T).
    int find(char ch){
        switch(ch){
            case 'A':return 0;
            case 'C':return 1;
            case 'T':return 2;
            case 'G':return 3;
        }
        return 0;
    }
    
    void Insert(char *str){
        Trie *loc=root;
        int i=0;
        while(str[i]!='\0'){
            int id=find(str[i]);
            if(loc->next[id]==NULL){
                a[k].init();
                loc->next[id]=&a[k++];
            }
            loc=loc->next[id];
            i++;
        }
        loc->count=1;
    }
    
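    // BFS over the trie: set failure links and fill in missing transitions,
    // so that afterwards every node has all four outgoing edges.
    void AC_automation(){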
        int head=0,tail=0;
        q[tail++]=root;
        Trie *cur,*tmp;
        while(head!=tail){
            cur=q[head++];
            tmp=NULL;
            for(int i=0;i<4;i++){
                if(cur->next[i]==NULL){
                    if(cur==root)
                        cur->next[i]=root;
                    else
                        cur->next[i]=cur->fail->next[i];
                }else{
                    if(cur==root)
                        cur->next[i]->fail=root;
                    else{
                        tmp=cur->fail;
                        while(tmp!=NULL){
                            if(tmp->next[i]!=NULL){
                                cur->next[i]->fail=tmp->next[i];
                                cur->next[i]->count |= tmp->next[i]->count;   // a dangerous suffix makes this node dangerous too
                                break;
                            }
                            tmp=tmp->fail;
                        }
                        if(tmp==NULL)
                            cur->next[i]->fail=root;
                    }
                    q[tail++]=cur->next[i];
                }
            }
        }
    }
    
    int main(){
    
        //freopen("input.txt","r",stdin);
    
        int n,cases=0;
        while(~scanf("%d",&n) && n){
            k=0;
            root=&a[k++];
            root->init();
            for(int i=0;i<n;i++){
                scanf("%s",wrd);
                Insert(wrd);
            }
            AC_automation();
            scanf("%s",str);
            int len=strlen(str);
            memset(dp,0x3f,sizeof(dp));
            dp[0][0]=0;
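            for(int i=1;i<=len;i++)                 // i: number of characters consumed so far
                for(int j=0;j<k;j++){               // j: node reached after i-1 characters
                    for(int idx=0;idx<4;idx++){     // idx: base placed at position i
                        Trie *ptr=a[j].next[idx];
                        if(ptr->count)              // this move would complete a disease segment
                            continue;
                        int tmp=ptr-root;           // index of the target node in the pool a[]
                        // cost 1 if the chosen base differs from the original character
                        dp[i][tmp]=min(dp[i][tmp],dp[i-1][j]+(idx!=find(str[i-1])));
                    }
                }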
            int ans=INF;
            for(int i=0;i<k;i++)
                ans=min(ans,dp[len][i]);
            printf("Case %d: %d\n",++cases,ans==INF?-1:ans);
        }
        return 0;
    }
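One way to gain confidence in the DP is to cross-check it against exhaustive search on very short inputs. The function below is a hypothetical helper, not part of the original solution: `bruteForceRepair` tries every string over "ACGT" of the same length as the DNA, so it is only practical for tiny cases (4^len candidates; the length cap of 10 is an arbitrary assumption).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical cross-check: exhaustive search over all 4^len candidate strings.
// Returns the minimum number of changed characters, or -1 if no clean string exists.
int bruteForceRepair(const std::vector<std::string>& segments, const std::string& dna) {
    const std::string alphabet = "ACGT";
    const int len = static_cast<int>(dna.size());
    assert(len <= 10);                       // keep the search space small

    long long total = 1;
    for (int i = 0; i < len; ++i) total *= 4;

    int best = -1;
    std::string cand(len, 'A');
    for (long long mask = 0; mask < total; ++mask) {
        // Decode mask as a base-4 number into a candidate string.
        long long m = mask;
        int changes = 0;
        for (int i = 0; i < len; ++i) {
            cand[i] = alphabet[m % 4];
            m /= 4;
            if (cand[i] != dna[i]) ++changes;
        }
        // A candidate is clean if it contains none of the disease segments.
        bool clean = true;
        for (const auto& s : segments) {
            if (cand.find(s) != std::string::npos) { clean = false; break; }
        }
        if (clean && (best == -1 || changes < best)) best = changes;
    }
    return best;
}
```

On the three sample cases this returns 1, 4 and -1, matching the sample output, so it can be compared against the DP on randomly generated short strings.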
  • Original post: https://www.cnblogs.com/jackge/p/3057868.html