  • LeetCode: Repeated DNA Sequences

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

    Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

    For example, given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT", return ["AAAAACCCCC", "CCCCCAAAAA"].

    Analysis:

    We encode each 10-letter substring as an integer, so that a hash set can store the codes and detect duplicates cheaply.

    Since each letter has only four possible values (A, C, G, T), 2 bits are enough to represent it. A 10-letter substring therefore fits in a 20-bit integer.
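
    The 2-bit encoding described above can be sketched in isolation. The names DnaEncodingSketch, bitsOf and encode are illustrative and not part of the original solution; the mapping A=0, C=1, G=2, T=3 is the one the solution uses in moveCode.

```java
class DnaEncodingSketch {
    // Map one nucleotide to its 2-bit value: A=0, C=1, G=2, T=3.
    static int bitsOf(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'G': return 2;
            case 'T': return 3;
            default: throw new IllegalArgumentException("Not a nucleotide: " + c);
        }
    }

    // Pack a window of up to 15 letters into an int, 2 bits per letter,
    // with the first letter ending up in the most significant position.
    static int encode(String window) {
        int code = 0;
        for (int i = 0; i < window.length(); i++) {
            code = (code << 2) | bitsOf(window.charAt(i));
        }
        return code;
    }

    public static void main(String[] args) {
        System.out.println(encode("AAAAACCCCC"));                   // 341
        System.out.println(Integer.toBinaryString(encode("ACGT"))); // 11011
    }
}
```

    Two different 10-letter windows always get different codes, so comparing integers is equivalent to comparing the substrings themselves.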

    Solution:

    import java.util.ArrayList;
    import java.util.HashSet;
    import java.util.List;

    public class Solution {
        // Mask that keeps only the low 20 bits (10 letters x 2 bits each).
        int mask = (1 << 20) - 1;
    
        public List<String> findRepeatedDnaSequences(String s) {
            List<String> resList = new ArrayList<String>();
            if (s.length() < 10)
                return resList;
    
            // Codes of every window seen so far.
            HashSet<Integer> codeSet = new HashSet<Integer>();
            // Codes of windows already added to the result.
            HashSet<Integer> resSet = new HashSet<Integer>();
            char[] charArray = s.toCharArray();
    
            // Pre-load the code with the first 9 letters; the loop below adds the 10th.
            int code = 0;
            for (int i = 0; i < 9; i++) {
                code = moveCode(code, charArray[i]);
            }
    
            for (int i = 9; i < s.length(); i++) {
                // Shift in the next letter to get the code of the window ending at i.
                code = moveCode(code, charArray[i]);
                // If this code has been seen before and has not yet been reported,
                // add the corresponding substring to resList.
                if (!codeSet.add(code) && resSet.add(code)) {
                    resList.add(s.substring(i - 9, i + 1));
                }
            }
            return resList;
        }
    
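        // Append letter c to the code: shift left by 2 bits, add the letter's
        // value (A=0, C=1, G=2, T=3), then drop everything above 20 bits.
        public int moveCode(int value, char c) {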
            value <<= 2;
            // if (c=='A') value += 0;
            if (c == 'C')  value += 1;
            if (c == 'G')  value += 2;
            if (c == 'T')  value += 3;
            value &= mask;
            return value;
        }
    
    }
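
    To see why masking to 20 bits is enough, the sketch below checks that the rolling update (shift in one letter, mask off the oldest) yields the same code as encoding each 10-letter window from scratch. All names here (RollingCodeSketch, roll, encode, rollingMatchesDirect) are hypothetical helpers written for illustration, and the input is assumed to contain only A, C, G and T.

```java
class RollingCodeSketch {
    // Keeps the low 20 bits, i.e. the most recent 10 letters.
    static final int MASK = (1 << 20) - 1;

    // A=0, C=1, G=2, T=3. Assumes c is a valid nucleotide.
    static int bitsOf(char c) {
        return "ACGT".indexOf(c);
    }

    // Rolling update: append one letter and drop the letter that falls out
    // of the 10-letter window.
    static int roll(int code, char next) {
        return ((code << 2) | bitsOf(next)) & MASK;
    }

    // Encode a 10-letter window directly, without any rolling state.
    static int encode(String window) {
        int code = 0;
        for (int i = 0; i < window.length(); i++) {
            code = (code << 2) | bitsOf(window.charAt(i));
        }
        return code;
    }

    // Returns true if, for every 10-letter window of s, the rolling code
    // equals the code computed from scratch.
    static boolean rollingMatchesDirect(String s) {
        if (s.length() < 10) {
            return true;
        }
        int code = 0;
        for (int i = 0; i < 9; i++) {
            code = roll(code, s.charAt(i));
        }
        for (int i = 9; i < s.length(); i++) {
            code = roll(code, s.charAt(i));
            if (code != encode(s.substring(i - 9, i + 1))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(rollingMatchesDirect("AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT")); // true
    }
}
```

    Because each step costs a constant number of operations, the whole scan is linear in the length of s, and the only substrings built are those added to the result.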
  • Original article: https://www.cnblogs.com/lishiblog/p/5824312.html