  • Repeated DNA Sequences: Solution

    Question

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

    Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

    For example,

    Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", Return: ["AAAAACCCCC", "CCCCCAAAAA"].

    Solution -- Bit Manipulation

    The straightforward idea is to store every 10-letter substring in a set. Both time and space complexity are O(n). Looking at the space cost in more detail: a Java char is 2 bytes, so each 10-letter substring needs roughly 20 bytes of character data (ignoring object overhead), which comes to about 20n bytes overall.
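
    For comparison, the set-based baseline described above might look like the following minimal sketch. The class and method names (NaiveRepeatedDna, findRepeated) are made up for illustration and do not come from the original solution.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical baseline: stores each 10-letter window as a String.
class NaiveRepeatedDna {
    static List<String> findRepeated(String s) {
        Set<String> seen = new HashSet<>();           // windows seen at least once
        Set<String> repeated = new LinkedHashSet<>(); // windows seen more than once, in order of detection

        for (int i = 0; i + 10 <= s.length(); i++) {
            String window = s.substring(i, i + 10);
            // Set.add returns false when the window is already present
            if (!seen.add(window)) {
                repeated.add(window);
            }
        }
        return new ArrayList<>(repeated);
    }

    public static void main(String[] args) {
        System.out.println(findRepeated("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT"));
    }
}
```

    Every window is kept as a full String here, which is where the per-substring character cost discussed above comes from.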

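    If we represent each DNA substring as an integer instead, the space drops to about 4n bytes: each of the four nucleotides fits in 2 bits, so a 10-letter sequence fits in the low 20 bits of a 4-byte int.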
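
    To make the integer representation concrete, here is a small sketch that packs a 10-letter sequence into the low 20 bits of an int and unpacks it again. It assumes the mapping A=0, C=1, G=2, T=3; DnaEncoding, code, encode and decode are hypothetical helper names used only for illustration.

```java
// Hypothetical helpers illustrating the 2-bits-per-nucleotide encoding.
class DnaEncoding {
    // Assumed mapping: A=0, C=1, G=2, T=3.
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("not a nucleotide: " + c);
        }
    }

    // Packs a 10-letter sequence into 20 bits, first letter in the highest bits.
    static int encode(String tenLetters) {
        if (tenLetters.length() != 10) {
            throw new IllegalArgumentException("expected exactly 10 letters");
        }
        int hash = 0;
        for (int i = 0; i < 10; i++) {
            hash = (hash << 2) | code(tenLetters.charAt(i));
        }
        return hash;
    }

    // Reverses encode: reads 2 bits at a time, starting from the last letter.
    static String decode(int hash) {
        char[] letters = {'A', 'C', 'G', 'T'};
        char[] out = new char[10];
        for (int i = 9; i >= 0; i--) {
            out[i] = letters[hash & 3];
            hash >>>= 2;
        }
        return new String(out);
    }

    public static void main(String[] args) {
        int h = encode("AAAAACCCCC");
        System.out.println(h);         // prints 341 (binary 0101010101)
        System.out.println(decode(h)); // prints AAAAACCCCC
    }
}
```

    Because the encoding is reversible, two different 10-letter sequences never share the same integer, so comparing integers is equivalent to comparing the substrings themselves.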

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Map;
    import java.util.Set;

    class Solution {
        public List<String> findRepeatedDnaSequences(String s) {
            List<String> result = new ArrayList<String>();

            int len = s.length();
            if (len < 10) {
                return result;
            }

            // 2-bit code for each nucleotide
            Map<Character, Integer> map = new HashMap<Character, Integer>();
            map.put('A', 0);
            map.put('C', 1);
            map.put('G', 2);
            map.put('T', 3);

            Set<Integer> temp = new HashSet<Integer>();  // hashes seen at least once
            Set<Integer> added = new HashSet<Integer>(); // hashes already added to result

            int hash = 0;
            for (int i = 0; i < len; i++) {
                // each of A, C, G, T fits in 2 bits, so shift left by 2 and append
                hash = (hash << 2) + map.get(s.charAt(i));

                if (i >= 9) {
                    // keep only the low 20 bits, i.e. the last 10 letters
                    hash = hash & ((1 << 20) - 1);

                    if (temp.contains(hash) && !added.contains(hash)) {
                        result.add(s.substring(i - 9, i + 1));
                        added.add(hash); // track added
                    } else {
                        temp.add(hash);
                    }
                }
            }

            return result;
        }
    }
  • Original article: https://www.cnblogs.com/ireneyanglan/p/4809078.html