  • sicily 1035 DNA matching

    1035. DNA matching

    Constraints

    Time Limit: 1 secs, Memory Limit: 32 MB

    Description

    DNA (deoxyribonucleic acid) is found in every living creature as the storage medium for genetic information. It is composed of subunits called nucleotides that are strung together into polymer chains. DNA polymer chains are more commonly called DNA strands.

        There are four kinds of nucleotides in DNA, distinguished by the chemical group, or base, attached to each. The four bases are adenine, guanine, cytosine and thymine, abbreviated as A, G, C and T (these letters will be used to refer to nucleotides containing these bases). Single nucleotides are linked together end-to-end to form DNA single strands via chemical reactions. For simplicity, we can use a string composed of the letters A, T, C and G to denote a single strand, such as ATTCGAC, but we must also note that the sequence of nucleotides in any strand has a natural orientation, so ATTCGAC and CAGCTTA cannot be viewed as identical strands.

        DNA does not usually exist in nature as free single strands, though. Under appropriate conditions single strands will pair up and twist around each other, forming the famous double helix structure. This pairing occurs because of a mutual attraction, called hydrogen bonding, that exists between As and Ts, and between Gs and Cs. Hence A/T and G/C are called complementary base pairs.

    In molecular biology experiments dealing with DNA, one important process is to match two complementary single strands and make a DNA double strand. Here we give the constraint that two complementary single strands must have equal length, and the nucleotides in the same position of the two single strands must be complementary pairs. For example, ATTCGAC and TAAGCTG are complementary, but CAGCTTA and TAAGCTG are not, and neither are ATTCGAC and GTAAGCT.
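The pairing rule above can be sketched in a few lines of C++. This is only an illustration of the definition, not part of the original problem statement; the names `complementBase`, `complementOf` and `isComplementary` are hypothetical helpers introduced here.

```cpp
#include <string>

// Map one nucleotide to its complementary base (A<->T, C<->G).
// Any other character is returned unchanged; the problem guarantees
// that only A, T, C and G appear in the input.
char complementBase(char base) {
    switch (base) {
        case 'A': return 'T';
        case 'T': return 'A';
        case 'C': return 'G';
        case 'G': return 'C';
        default:  return base;
    }
}

// Build the position-by-position complement of a strand.
std::string complementOf(const std::string& strand) {
    std::string result(strand);
    for (char& base : result) {
        base = complementBase(base);
    }
    return result;
}

// Two strands can pair only if they have equal length and every
// position holds a complementary base pair. No reversal is applied,
// matching the examples in the problem statement.
bool isComplementary(const std::string& a, const std::string& b) {
    return a.size() == b.size() && complementOf(a) == b;
}
```

With these helpers, `isComplementary("ATTCGAC", "TAAGCTG")` holds, while the pairs CAGCTTA/TAAGCTG and ATTCGAC/GTAAGCT from the statement are rejected.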

    As a biology research assistant, your boss has assigned you a job: given n single strands, find out the maximum number of double strands that could be made (of course each strand can be used at most once). If n is small, of course you can find the answer with the help of pen and paper, however, sometimes n could be quite large… Fortunately you are good at programming and there is a computer in front of you, so you can write a program to help yourself. But you must know that you have many other assignments to finish, and you should not waste too much time here, so, hurry up please!

    Input

    Input may contain multiple test cases. The first line is a positive integer T (T <= 20), the number of test cases that follow. In each test case, the first line is a positive integer n (n <= 100), the number of single strands. Then n lines follow, each a string made up of the four capital letters A, T, C and G. The length of each string is no more than 100.

    Output

    For each test case, the output is one line containing a single integer, the maximum number of double strands that can be formed using those given single strands.

    Sample Input

    2
    3
    ATCG
    TAGC
    TAGG
    2
    AATT
    ATTA
    

    Sample Output

    1
    0
    

    Problem Source

    ZSUACM Team Member

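    This is a very simple problem: solve it with a map. At first I did not understand what a hash function's return value actually guarantees, so I kept getting WA. I later learned that a string hash function may return the same hash value for different strings! Keep that firmly in mind.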
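To make the collision point concrete, here is a minimal sketch that searches for two different strands falling into the same bucket. It assumes the same BKDR hash and the same `% 101` reduction used in the incorrect solution; `bkdrHash` is a `std::string` variant of that function, and `findBucketCollision` is a hypothetical helper written for this illustration. There are 256 distinct strands of length 4, so with 101 buckets a collision must exist by the pigeonhole principle.

```cpp
#include <string>
#include <utility>
#include <vector>

// Same BKDR hash as in the incorrect solution, taking a std::string.
unsigned int bkdrHash(const std::string& str) {
    unsigned int seed = 131;
    unsigned int hash = 0;
    for (unsigned char ch : str) {
        hash = hash * seed + ch;
    }
    return hash & 0x7FFFFFFF;
}

// Hypothetical helper: enumerate the 256 distinct strands of length 4
// over {A, T, C, G} and return the first two that land in the same
// bucket when the hash is reduced modulo `buckets` (assumed > 0).
std::pair<std::string, std::string> findBucketCollision(unsigned int buckets) {
    const std::string bases = "ATCG";
    std::vector<std::string> firstInBucket(buckets); // empty = bucket unused
    for (int code = 0; code < 256; ++code) {
        // Decode `code` as a 4-digit base-4 number, one base per digit.
        std::string strand;
        int rest = code;
        for (int pos = 0; pos < 4; ++pos) {
            strand += bases[rest % 4];
            rest /= 4;
        }
        std::string& seen = firstInBucket[bkdrHash(strand) % buckets];
        if (!seen.empty()) {
            return {seen, strand}; // two different strands, one bucket
        }
        seen = strand;
    }
    return {"", ""}; // only reachable when buckets >= 256
}
```

Calling `findBucketCollision(101)` returns two distinct strands whose counters would be merged by an array indexed with `hash % 101`, which is why a bucket count alone cannot prove that a complement is present.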

    Incorrect solution:

    // Problem#: 1035
    // Submission#: 2933983
    // The source code is licensed under Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License
    // URI: http://creativecommons.org/licenses/by-nc-sa/3.0/
    // All Copyright reserved by Informatic Lab of Sun Yat-sen University
    #include <iostream>
    #include <stdio.h>
    #include <string.h>
    using namespace std;
    // BKDR Hash Function
    unsigned int BKDRHash(char *str)
    {
        unsigned int seed = 131; // 31 131 1313 13131 131313 etc..
        unsigned int hash = 0;
    
        while (*str)
        {
            hash = hash * seed + (*str++);
        }
    
        return (hash & 0x7FFFFFFF);
    }
    void doubleStrand(char *strand, char *doublestrand);
    
    int main() {
    	int t;
    	int n;
    	
    	cin >> t;
    	while (t-- > 0) {
    		cin >> n;
    		char strand[101];
    		char strands[101][101];
    		int hash[101];
    		for (int i = 0; i < 101; i++)
    			hash[i] = 0;
    		int max = 0;
    		for (int i = 0; i < n; i++) {
    			cin >> strand;
    			strcpy(strands[i], strand); 
    			hash[BKDRHash(strands[i]) % 101]++;
    		}
    		for (int i = 0; i < n; i++) {
    			char *dstrand = new char[strlen(strands[i])+1];
    			
    			doubleStrand(strands[i], dstrand);
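    			// Flaw: different strands can share a bucket, so a non-zero
    			// counter does not prove that the complement is actually present.
    			if (hash[BKDRHash(dstrand) % 101] != 0 && hash[BKDRHash(strands[i]) % 101] != 0) {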
    				max++;
    				hash[BKDRHash(strands[i]) % 101]--;	
    				hash[BKDRHash(dstrand) % 101]--;
    			}
    			
    			delete []dstrand;		
    		}
    		cout << max << endl;
    	}
    	return 0;
    }
    
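    void doubleStrand(char *strand, char *doublestrand) {
    	// Bug: sizeof(strand) is the size of a char* pointer, not the
    	// length of the string; strlen(strand) was intended.
    	for (int i = 0; i < sizeof(strand); i++) {
    		switch (strand[i]) {
    			case 'A':
    				doublestrand[i] = 'T';
    				break;
    			case 'T':
    				doublestrand[i] = 'A';
    				break;
    			case 'C':
    				doublestrand[i] = 'G';
    				break;
    			case 'G':
    				doublestrand[i] = 'C';
    				break;
    			default:
    				break;
    		}
    	}
    	// Same sizeof bug: the terminator is written at the wrong index.
    	doublestrand[sizeof(strand)] = '\0';
    }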
    

      

     

    Correct solution:

    #include <iostream>
    #include <map>
    #include <string>
    using namespace std;
    
    void doubleStrand(const string&, string&);
    
    int main() {
    	int t;
    	int n;
    	
    	cin >> t;
    	while (t-- > 0) {
    		cin >> n;
    		string strand;
    		string strands[101];
    		map<string, int> s;
    	
    		int max = 0;
    		for (int i = 0; i < n; i++) {
    			cin >> strand;
    			strands[i] = strand;
    			// Use the map to count how many copies of each single strand exist
    			s[strand]++;
    		}
    		for (int i = 0; i < n; i++) {
    			string dstrand = strands[i];
    			// Compute the complementary (partner) strand of this single strand
    			doubleStrand(strands[i], dstrand);
    			// If both this strand and its partner still have unmatched copies,
    			// pair them up and increment max
    			if (s[dstrand] != 0 && s[strands[i]] != 0) {
    				max++;
    				s[dstrand]--; // one fewer unmatched copy of each strand
    				s[strands[i]]--;
    			}	
    		}
    		cout << max << endl;
    	}
    	return 0;
    }
    // Write the complementary strand of `strand` into `doublestrand`,
    // which must already have the same length
    void doubleStrand(const string& strand, string& doublestrand) {
    	for (string::size_type i = 0; i < strand.length(); i++) {
    		switch (strand[i]) {
    			case 'A':
    				doublestrand[i] = 'T';
    				break;
    			case 'T':
    				doublestrand[i] = 'A';
    				break;
    			case 'C':
    				doublestrand[i] = 'G';
    				break;
    			case 'G':
    				doublestrand[i] = 'C';
    				break;
    			default:
    				break;
    		}
    	}
    }
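The same answer can also be computed directly from the counts, which may make it easier to see why the greedy loop is correct: a strand can only pair with its exact complement, so each strand/complement group contributes the smaller of the two counts. The following is a sketch of that alternative formulation, not the author's submission; `maxDoubleStrands` is a hypothetical helper name.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: maximum number of double strands for one test case.
int maxDoubleStrands(const std::vector<std::string>& strands) {
    // Count how many copies of each distinct strand exist.
    std::map<std::string, int> count;
    for (const std::string& s : strands) {
        ++count[s];
    }

    int pairs = 0;
    for (const auto& entry : count) {
        // Build the position-wise complement of this strand.
        std::string partner = entry.first;
        for (char& base : partner) {
            switch (base) {
                case 'A': base = 'T'; break;
                case 'T': base = 'A'; break;
                case 'C': base = 'G'; break;
                case 'G': base = 'C'; break;
                default: break;
            }
        }
        // A non-empty strand never equals its own complement, so every
        // group has two different keys; count it once, from the smaller key.
        if (entry.first < partner) {
            auto it = count.find(partner);
            if (it != count.end()) {
                pairs += std::min(entry.second, it->second);
            }
        }
    }
    return pairs;
}
```

For the two sample test cases, `{"ATCG", "TAGC", "TAGG"}` gives 1 and `{"AATT", "ATTA"}` gives 0, matching the sample output. Using `find` instead of `operator[]` avoids inserting empty entries into the map while iterating over it.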
    

      

  • Original article: https://www.cnblogs.com/xieyizun-sysu-programmer/p/sicily1035.html