  • [LeetCode 187] Repeated DNA Sequences

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

    Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

    For example,

    Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",
    
    Return:
    ["AAAAACCCCC", "CCCCCAAAAA"].
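As a point of comparison for the bit-packed solution that follows, the problem can also be approached by storing every 10-letter window as a string. This is only a minimal baseline sketch and not the original author's code: the class name `RepeatedDnaBaseline` and method name `findRepeated` are hypothetical. It trades extra memory (one `String` per window) for simplicity.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical baseline class, not part of the original post.
class RepeatedDnaBaseline {
    static List<String> findRepeated(String s) {
        Set<String> seen = new HashSet<>();            // every 10-letter window seen so far
        Set<String> repeated = new LinkedHashSet<>();  // windows seen at least twice, in first-repeat order
        for (int i = 0; i + 10 <= s.length(); i++) {
            String window = s.substring(i, i + 10);
            if (!seen.add(window)) {                   // add() returns false if the window was already present
                repeated.add(window);
            }
        }
        return new ArrayList<>(repeated);
    }

    public static void main(String[] args) {
        System.out.println(findRepeated("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"));
    }
}
```

For the example string, this returns the two sequences listed above.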
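
    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    public class Solution {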
        public List<String> findRepeatedDnaSequences(String s) {
            // There are only four letters, so we can build our own hash key: each incoming character takes two bits. Once the key exceeds 20 bits (10 characters), keep only the low 20 bits.
            Map<Character,Integer> map=new HashMap<Character,Integer>();
            map.put('A',0);
            map.put('C',1);
            map.put('G',2);
            map.put('T',3);
            List<String> res=new ArrayList<String>();
            int hash=0;
            Set<Integer> set=new HashSet<Integer>();
            for(int i=0;i<s.length();i++){
                char c=s.charAt(i);
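                // Shift in the 2-bit code for c and keep only the low 20 bits (the last 10 characters).
                hash=((hash<<2)|map.get(c))&((1<<20)-1);
                if(i<9){
                    continue; // the first complete 10-character window ends at index 9
                }
                if(set.contains(hash)){
                    // This window has been seen before; report it only once.
                    String seq=s.substring(i-9,i+1);
                    if(!res.contains(seq)){
                        res.add(seq);
                    }
                }else{
                    set.add(hash);
                }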
            }
            return res;
        }
    }
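To make the 2-bit encoding in the solution above more concrete, here is a small sketch that packs a single 10-letter window into an integer and then slides it by one character. The class `DnaEncodingDemo` and its methods `code`, `encode`, and `roll` are hypothetical names introduced for illustration; only the A/C/G/T to 0/1/2/3 mapping and the 20-bit mask come from the original code.

```java
// Hypothetical demo class, not part of the original post.
class DnaEncodingDemo {
    // Maps A, C, G, T to the 2-bit codes 0..3, as in the solution's lookup table.
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("not a nucleotide: " + c);
        }
    }

    // Packs the 10 characters starting at index start into the low 20 bits of an int.
    static int encode(String s, int start) {
        int hash = 0;
        for (int i = start; i < start + 10; i++) {
            hash = (hash << 2) | code(s.charAt(i));
        }
        return hash;
    }

    // Slides the window by one position: shift in the next character,
    // then mask off everything above bit 19 so the oldest character drops out.
    static int roll(int hash, char next) {
        return ((hash << 2) | code(next)) & ((1 << 20) - 1);
    }

    public static void main(String[] args) {
        String s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT";
        int first = encode(s, 0);                 // key for "AAAAACCCCC"
        int second = roll(first, s.charAt(10));   // key for "AAAACCCCCA"
        System.out.println(first + " " + second + " " + (second == encode(s, 1)));
    }
}
```

Because each window maps to a distinct 20-bit value, two windows share a key only when they are the same sequence, which is why the solution can compare integers instead of strings.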
  • Original article: https://www.cnblogs.com/qiaomu/p/4697092.html