  • Codility--GenomicRangeQuery

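    Idea: for every position, count how many times each of the four letters occurs up to and including that position (using prefix-sum arrays). The number of occurrences of any letter in any range can then be obtained in O(1) time.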

    My first idea needed O(n^2) space; this approach is better, since it needs only O(n).

    #include <string>
    #include <vector>
    using namespace std;

    vector<int> solution(string &S, vector<int> &P, vector<int> &Q) {
        int len = S.length();
        vector<int> res;
        // arr[i][j]: how many times letter j (0 = A, 1 = C, 2 = G, 3 = T)
        // occurs in S[0..i]; every counter starts at 0.
        vector<vector<int> > arr(len, vector<int>(4, 0));

        int i, j;
        // Mark the letter found at each position.
        for(i = 0; i < len; i++)
        {
            char c = S[i];
            if(c == 'A') arr[i][0] = 1;
            if(c == 'C') arr[i][1] = 1;
            if(c == 'G') arr[i][2] = 1;
            if(c == 'T') arr[i][3] = 1;
        }
        // Turn the marks into running (prefix) counts.
        for(i = 1 ; i < len ; ++i)
        {
            for(j = 0 ; j < 4 ; ++j)
            {
                arr[i][j] += arr[i-1][j];
            }
        }
        // Answer each query in constant time.
        int m = P.size();
        for(i = 0 ; i < m ; ++i)
        {
            int x = P[i];
            int y = Q[i];
            for(j = 0 ; j < 4 ; ++j)
            {
                // Count of letter j before position x (0 when x is the first position).
                int sub = 0;
                if(x-1>=0)
                {
                    sub = arr[x-1][j];
                }
                // Letters are tested in order of increasing impact factor,
                // so the first one present in S[x..y] gives the minimum.
                if(arr[y][j] - sub > 0)
                {
                    res.push_back(j+1);
                    break;
                }
            }
        }
        return res;
    }
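
The boundary check on `x-1` can be avoided by shifting the prefix table by one row, so that row 0 holds all zeros. The following is only a sketch of that variant, not the code from the post; the name `solutionShifted` and the use of `std::array` are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical variant of the solution above: prefix[i][j] counts letter j
// in S[0..i-1], so prefix[0] is all zeros and no special case is needed.
std::vector<int> solutionShifted(const std::string &S,
                                 const std::vector<int> &P,
                                 const std::vector<int> &Q) {
    assert(P.size() == Q.size());
    const std::string letters = "ACGT";  // index j has impact factor j + 1
    std::vector<std::array<int, 4>> prefix(S.size() + 1, std::array<int, 4>{});

    for (std::size_t i = 0; i < S.size(); ++i) {
        prefix[i + 1] = prefix[i];                 // copy the running counts
        const std::size_t j = letters.find(S[i]);  // which letter is S[i]?
        if (j != std::string::npos) {
            ++prefix[i + 1][j];
        }
    }

    std::vector<int> res;
    res.reserve(P.size());
    for (std::size_t k = 0; k < P.size(); ++k) {
        for (int j = 0; j < 4; ++j) {
            // Occurrences of letter j in S[P[k]..Q[k]], both ends inclusive.
            if (prefix[Q[k] + 1][j] - prefix[P[k]][j] > 0) {
                res.push_back(j + 1);
                break;
            }
        }
    }
    return res;
}
```

The table has N + 1 rows of four counters, so time stays O(N+M) and space stays O(N).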

    A DNA sequence can be represented as a string consisting of the letters A, C, G and T, which correspond to the types of successive nucleotides in the sequence. Each nucleotide has an impact factor, which is an integer. Nucleotides of types A, C, G and T have impact factors of 1, 2, 3 and 4, respectively. You are going to answer several queries of the form: What is the minimal impact factor of nucleotides contained in a particular part of the given DNA sequence?

    The DNA sequence is given as a non-empty string S = S[0]S[1]...S[N-1] consisting of N characters. There are M queries, which are given in non-empty arrays P and Q, each consisting of M integers. The K-th query (0 ≤ K < M) requires you to find the minimal impact factor of nucleotides contained in the DNA sequence between positions P[K] and Q[K] (inclusive).

    For example, consider string S = CAGCCTA and arrays P, Q such that:

        P[0] = 2    Q[0] = 4
        P[1] = 5    Q[1] = 5
        P[2] = 0    Q[2] = 6

    The answers to these M = 3 queries are as follows:

    • The part of the DNA between positions 2 and 4 contains nucleotides G and C (twice), whose impact factors are 3 and 2 respectively, so the answer is 2.
    • The part between positions 5 and 5 contains a single nucleotide T, whose impact factor is 4, so the answer is 4.
    • The part between positions 0 and 6 (the whole string) contains all nucleotides, in particular nucleotide A whose impact factor is 1, so the answer is 1.
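
The three answers above can be reproduced by scanning each range directly. The sketch below is a brute-force reference only; the names `impactFactor` and `bruteForce` are made up for illustration. It runs in O(N·M) time, so it is useful for cross-checking a faster solution on small inputs but does not meet the expected complexity.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Impact factor of one nucleotide, as defined in the problem statement.
int impactFactor(char c) {
    switch (c) {
        case 'A': return 1;
        case 'C': return 2;
        case 'G': return 3;
        default:  return 4;  // 'T' (the input contains only A, C, G, T)
    }
}

// Hypothetical reference checker: scans every queried range letter by letter.
std::vector<int> bruteForce(const std::string &S,
                            const std::vector<int> &P,
                            const std::vector<int> &Q) {
    assert(P.size() == Q.size());
    std::vector<int> res;
    res.reserve(P.size());
    for (std::size_t k = 0; k < P.size(); ++k) {
        int best = 4;  // largest possible impact factor
        for (int i = P[k]; i <= Q[k]; ++i) {
            best = std::min(best, impactFactor(S[i]));
        }
        res.push_back(best);
    }
    return res;
}
```

Called with S = CAGCCTA, P = [2, 5, 0] and Q = [4, 5, 6], it returns [2, 4, 1].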

    Write a function:

    vector<int> solution(string &S, vector<int> &P, vector<int> &Q);

    that, given a non-empty zero-indexed string S consisting of N characters and two non-empty zero-indexed arrays P and Q consisting of M integers, returns an array consisting of M integers specifying the consecutive answers to all queries.

    The sequence should be returned as:

    • a Results structure (in C), or
    • a vector of integers (in C++), or
    • a Results record (in Pascal), or
    • an array of integers (in any other programming language).

    For example, given the string S = CAGCCTA and arrays P, Q such that:

        P[0] = 2    Q[0] = 4
        P[1] = 5    Q[1] = 5
        P[2] = 0    Q[2] = 6

    the function should return the values [2, 4, 1], as explained above.

    Assume that:

    • N is an integer within the range [1..100,000];
    • M is an integer within the range [1..50,000];
    • each element of arrays P, Q is an integer within the range [0..N − 1];
    • P[K] ≤ Q[K], where 0 ≤ K < M;
    • string S consists only of upper-case English letters A, C, G, T.

    Complexity:

    • expected worst-case time complexity is O(N+M);
    • expected worst-case space complexity is O(N), beyond input storage (not counting the storage required for input arguments).
  • Original article: https://www.cnblogs.com/cane/p/3973828.html