  • 187. Repeated DNA Sequences

    All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.

    Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.

    For example,

    Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",
    
    Return:
    ["AAAAACCCCC", "CCCCCAAAAA"].


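    A Java int occupies 4 bytes, 32 bits in total. This solution encodes each of the four letters A, T, C, and G in two bits, so a 10-letter sequence takes 20 bits and fits in a single int. The 20-bit value of every 10-letter window is stored in a HashSet.
    One clever detail in the answer is that it uses two HashSets: one records values that have been seen once, and the other records values that have been seen a second time. Set.add returns false when the value is already present, so !words.add(v) is true only for a repeat. Because the two calls in the if condition are joined by &&, the second call is skipped whenever the first operand is false, so doublewords is only touched for sequences that have already appeared, and each repeated sequence is added to the result exactly once. The code is as follows: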
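    import java.util.ArrayList;
    import java.util.HashSet;
    import java.util.List;
    import java.util.Set;

    public class Solution {
        public List<String> findRepeatedDnaSequences(String s) {
            List<String> res = new ArrayList<String>();
            // Encoded windows seen at least once.
            Set<Integer> words = new HashSet<Integer>();
            // Encoded windows already added to the result.
            Set<Integer> doublewords = new HashSet<Integer>();
            // 2-bit code per nucleotide, indexed by (letter - 'A').
            int[] map = new int[26];
            map['A' - 'A'] = 0;
            map['C' - 'A'] = 1;
            map['T' - 'A'] = 2;
            map['G' - 'A'] = 3;
            for (int i = 0; i < s.length() - 9; i++) {
                // Pack the 10 letters starting at i into 20 bits.
                int v = 0;
                for (int j = i; j < i + 10; j++) {
                    v <<= 2;
                    v |= map[s.charAt(j) - 'A'];
                }
                // add() returns false if v is already present, so this is true
                // only the second time v is seen; && short-circuits otherwise.
                if (!words.add(v) && doublewords.add(v)) {
                    res.add(s.substring(i, i + 10));
                }
            }
            return res;
        }
    }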
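
The solution above re-encodes all 10 letters for every window. As a variation (not part of the original answer), the 20-bit value can be carried from one window to the next: shift in the new letter and mask off the letter that fell out. The sketch below assumes the input contains only A, C, G, and T; the class name RollingDna, the helper code(), and the constants LEN and MASK are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sliding-window variant of the two-HashSet solution.
final class RollingDna {
    private static final int LEN = 10;                     // window length in letters
    private static final int MASK = (1 << (2 * LEN)) - 1;  // keeps the low 20 bits

    // Same 2-bit codes as the map[] array in the solution above.
    static int code(char c) {
        switch (c) {
            case 'A': return 0;
            case 'C': return 1;
            case 'T': return 2;
            case 'G': return 3;
            default:
                throw new IllegalArgumentException("not a nucleotide: " + c);
        }
    }

    static List<String> findRepeated(String s) {
        List<String> res = new ArrayList<>();
        Set<Integer> seen = new HashSet<>();      // windows seen at least once
        Set<Integer> reported = new HashSet<>();  // windows already in res
        int v = 0;
        for (int i = 0; i < s.length(); i++) {
            // Shift in the new letter and drop the one that left the window.
            v = ((v << 2) | code(s.charAt(i))) & MASK;
            if (i < LEN - 1) {
                continue;  // the first full window ends at index LEN - 1
            }
            // Same short-circuit trick: reported.add runs only on a repeat.
            if (!seen.add(v) && reported.add(v)) {
                res.add(s.substring(i - LEN + 1, i + 1));
            }
        }
        return res;
    }
}
```

For the example string from the problem statement, RollingDna.findRepeated returns [AAAAACCCCC, CCCCCAAAAA], the same as the original solution, while doing constant work per character instead of ten encoding steps per window.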
  • Original post: https://www.cnblogs.com/codeskiller/p/6592521.html