  • The DNA repair problem

    Problem: Biologists have finally invented techniques for repairing DNA that contains segments causing various inherited diseases. For simplicity, a DNA is represented as a string over the characters 'A', 'G', 'C' and 'T'. The repair technique simply changes some characters so that all disease-causing segments are eliminated. For example, we can repair the DNA "AAGCAG" to "AGGCAC", eliminating the original disease-causing segments "AAG", "AGC" and "CAG" by changing two characters. Note that the repaired DNA may still contain only the characters 'A', 'G', 'C' and 'T'.

    You are to help the biologists repair a DNA by changing the least number of characters.
     
    Input
    The input consists of multiple test cases. Each test case starts with a line containing one integer N (1 ≤ N ≤ 50), the number of DNA segments causing inherited diseases.
    The following N lines give N non-empty strings, each of length at most 20 and containing only characters in "AGCT"; these are the DNA segments causing inherited diseases.
    The last line of the test case is a non-empty string of length at most 1000 containing only characters in "AGCT"; this is the DNA to be repaired.

    The last test case is followed by a line containing a single zero.
     
    Output
    For each test case, print a line containing the test case number (beginning with 1) followed by the
    number of characters that need to be changed. If it is impossible to repair the given DNA, print -1.
     

    Sample Input
    2
    AAA
    AAG
    AAAG    
    2
    A
    TG
    TGAATG
    4
    A
    G
    C
    T
    AGT
    0

     
    Sample Output
    Case 1: 1
    Case 2: 4
    Case 3: -1

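    Answer: We are given a set of forbidden pattern strings and an original DNA string, and must find the minimum number of characters to change so that the original string contains none of the forbidden patterns.
    Matching against many patterns at once suggests an Aho-Corasick automaton, and minimizing the number of changes suggests DP,
    so the two are combined.
    Each step walks down the trie, with the one restriction that we may never enter a node marking the end of a pattern (including nodes whose fail chain reaches such a node). Children that are empty but still legal need handling: they are redirected to the corresponding child of the fail node, so every state has a transition for all four characters.
    For the DP, dp[i][j] is the minimum number of changes needed for the first i characters when the automaton is in state j.
    From each state at i-1 we try all four characters to find the next state; if the chosen character equals the one in the original string the cost stays the same, otherwise it increases by 1. The code is as follows: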
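    For reference, the recurrence described above can be written out explicitly. The notation is an assumption introduced only for this sketch: \delta(j, c) is the automaton transition (s[j].next[c] in the code once the fail links are built), bad(j) means state j is marked isword, x_i is the i-th character of the DNA, n is its length, and [P] is 1 when P holds and 0 otherwise.

```latex
dp[0][0] = 0, \qquad dp[0][j] = \infty \quad (j \neq 0)

dp[i][r] = \min_{\substack{j,\; c \in \{A, G, C, T\} \\ \delta(j, c) = r,\; \neg\mathrm{bad}(j),\; \neg\mathrm{bad}(r)}}
           \bigl( dp[i-1][j] + [\, c \neq x_i \,] \bigr), \qquad 1 \le i \le n

\text{answer} = \min_{j} dp[n][j], \quad \text{reported as } -1 \text{ if this minimum is } \infty
```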

    #include<iostream>
    #include<cstdio>
    #include<cstring>
    #include<cmath>
    #include<algorithm>
    #define N 100005
    #define MOD 100000
    #define inf (1<<29)  // parenthesized so the macro expands safely inside larger expressions
    #define LL long long
    using namespace std;
    struct Trie{
        Trie *next[4];
        Trie *fail;
        int kind,isword;
    };
    Trie *que[N],s[N];
    int idx;
    int id(char ch){
        if(ch=='A') return 0;
        else if(ch=='T') return 1;
        else if(ch=='C') return 2;
        return 3;
    }
    Trie *NewNode(){
        Trie *tmp=&s[idx];
        for(int i=0;i<4;i++) tmp->next[i]=NULL;
        tmp->isword=0;
        tmp->kind=idx++;
        tmp->fail=NULL;
        return tmp;
    }
    void Insert(Trie *root,char *s,int len){
        Trie *p=root;
        for(int i=0;i<len;i++){
            if(p->next[id(s[i])]==NULL)
                p->next[id(s[i])]=NewNode();
            p=p->next[id(s[i])];
        }
        p->isword=1;
    }
    void Bulid_Fail(Trie *root){
        int head=0,tail=0;
        que[tail++]=root;
        root->fail=NULL;
        while(head<tail){
            Trie *tmp=que[head++];
            for(int i=0;i<4;i++){
                if(tmp->next[i]){
                    if(tmp==root) tmp->next[i]->fail=root;
                    else{
                        Trie *p=tmp->fail;
                        while(p!=NULL){
                            if(p->next[i]){
                               tmp->next[i]->fail=p->next[i];
                               break;
                            }
                            p=p->fail;
                        }
                        if(p==NULL) tmp->next[i]->fail=root;
                    }
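                    // A state whose fail chain reaches a pattern end also ends a pattern (as a suffix).
                    if(tmp->next[i]->fail->isword) tmp->next[i]->isword=1;
                    que[tail++]=tmp->next[i];
                }
                // Missing children are redirected so every state has all four transitions.
                else if(tmp==root) tmp->next[i]=root;
                else tmp->next[i]=tmp->fail->next[i];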
            }
        }
    }
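    // dp[i][j]: fewest changes over the first i characters with the automaton in state j.
    // At most 50*20+1 = 1001 states exist, so 2005 columns are enough.
    int dp[1005][2005];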
    int slove(char *str,int len){
        for(int i=0;i<=len;i++) for(int j=0;j<idx;j++) dp[i][j]=inf;
        dp[0][0]=0;
        for(int i=1;i<=len;i++){
            for(int j=0;j<idx;j++){
                if(s[j].isword) continue;
                if(dp[i-1][j]==inf) continue;
                for(int k=0;k<4;k++){
                    int r=s[j].next[k]->kind;
                    if(s[r].isword) continue;
                    dp[i][r]=min(dp[i][r],dp[i-1][j]+(id(str[i-1])!=k));
                }
            }
        }
        int ans=inf;
        for(int i=0;i<idx;i++) ans=min(ans,dp[len][i]);
        return ans==inf?-1:ans;
    }
    char str[1005];
    int main(){
        int n,cas=0;
        while(scanf("%d",&n)!=EOF&&n){
            idx=0;
            Trie *root=NewNode();
            for(int i=0;i<n;i++){
                scanf("%s",str);
                Insert(root,str,strlen(str));
            }
            Bulid_Fail(root);
            scanf("%s",str);
            printf("Case %d: %d\n",++cas,slove(str,strlen(str)));
        }
        return 0;
    }
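
    As a cross-check on the program above, here is a minimal array-based sketch of the same idea (Aho-Corasick automaton plus DP) packaged as a function, so it can be run against the sample cases without reading from stdin. The function name minRepairs and the variables go, fail and bad are hypothetical names chosen for this sketch and do not appear in the original code. It assumes every pattern is non-empty and uses only the characters A, G, C and T, as the problem guarantees.

```cpp
#include <algorithm>
#include <array>
#include <climits>
#include <queue>
#include <string>
#include <vector>

// Hypothetical helper (not part of the original program): returns the minimum
// number of character changes, or -1 if the DNA cannot be repaired.
// Assumes non-empty patterns over the alphabet "AGCT".
int minRepairs(const std::vector<std::string>& patterns, const std::string& dna) {
    const std::string alphabet = "AGCT";
    const std::array<int, 4> emptyRow = {-1, -1, -1, -1};
    std::vector<std::array<int, 4>> go(1, emptyRow);  // transitions; state 0 is the root
    std::vector<int> fail(1, 0);                      // fail links
    std::vector<char> bad(1, 0);                      // 1 if the state ends a pattern

    // Build the trie.
    for (const std::string& p : patterns) {
        int cur = 0;
        for (char ch : p) {
            int c = static_cast<int>(alphabet.find(ch));
            if (go[cur][c] == -1) {
                go[cur][c] = static_cast<int>(go.size());
                go.push_back(emptyRow);
                fail.push_back(0);
                bad.push_back(0);
            }
            cur = go[cur][c];
        }
        bad[cur] = 1;
    }

    // BFS: fill fail links, redirect missing children, propagate the bad flag.
    std::queue<int> q;
    for (int c = 0; c < 4; ++c) {
        if (go[0][c] == -1) go[0][c] = 0;
        else q.push(go[0][c]);
    }
    while (!q.empty()) {
        int u = q.front();
        q.pop();
        if (bad[fail[u]]) bad[u] = 1;  // a forbidden pattern is a suffix of this state
        for (int c = 0; c < 4; ++c) {
            int v = go[u][c];
            if (v == -1) {
                go[u][c] = go[fail[u]][c];
            } else {
                fail[v] = go[fail[u]][c];
                q.push(v);
            }
        }
    }

    // DP over positions, keeping only the previous row.
    const int INF = INT_MAX / 2;
    const int m = static_cast<int>(go.size());
    std::vector<int> cur(m, INF), nxt;
    cur[0] = 0;
    for (char ch : dna) {
        int want = static_cast<int>(alphabet.find(ch));
        nxt.assign(m, INF);
        for (int s = 0; s < m; ++s) {
            if (cur[s] == INF) continue;
            for (int c = 0; c < 4; ++c) {
                int t = go[s][c];
                if (bad[t]) continue;  // never step into a forbidden state
                nxt[t] = std::min(nxt[t], cur[s] + (c != want ? 1 : 0));
            }
        }
        cur.swap(nxt);
    }
    int best = *std::min_element(cur.begin(), cur.end());
    return best >= INF ? -1 : best;
}
```

    Calling minRepairs({"AAA", "AAG"}, "AAAG") corresponds to the first sample case. Keeping only two DP rows instead of the full dp[1005][2005] table is a design choice of this sketch; the original keeps every row, which is equally valid at these input sizes.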

  • Original article: https://www.cnblogs.com/benchao/p/4537927.html