  • Longest Common Substring

    Problem Statement

    Given two strings $s_1$ and $s_2$, find their longest common substring (LCS), i.e. the longest run of consecutive characters that appears in both. For example, for $s_1$ = [111001] and $s_2$ = [11011], the longest common substring is [110], with length 3.
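
    To make the definition concrete, here is a minimal brute-force sketch that reads the problem statement literally. It is not the method developed in this post; the function name `bruteForceLcsLength` is hypothetical and only serves as a reference to compare against.

    ```cpp
    #include <algorithm>
    #include <cstddef>
    #include <string>

    // Hypothetical brute-force reference (not the post's method): for every
    // pair of start positions, count how many consecutive characters agree.
    std::size_t bruteForceLcsLength(const std::string &s1, const std::string &s2)
    {
        std::size_t best = 0;
        for (std::size_t i = 0; i < s1.size(); ++i) {
            for (std::size_t j = 0; j < s2.size(); ++j) {
                std::size_t k = 0;
                // Extend the match starting at s1[i] and s2[j] as far as it goes.
                while (i + k < s1.size() && j + k < s2.size() && s1[i + k] == s2[j + k])
                    ++k;
                best = std::max(best, k);
            }
        }
        return best;
    }
    ```

    This rescans the same characters many times, which is the redundancy a table of partial results avoids.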

    A compact way to solve this problem is dynamic programming (DP).

    Instead of dealing with arbitrary substrings directly, we first classify common substrings by the position of their last character.

    Define $dp[i][j]$ as the length of the longest common substring of $s_1[0..i]$ and $s_2[0..j]$ that ends exactly at $s_1[i]$ and $s_2[j]$.

    Then the LCS length is the maximum value in the array $dp$.

    To compute $dp[i][j]$, we check whether $s_1[i]$ == $s_2[j]$. If so, the common substring ending at $s_1[i-1]$ and $s_2[j-1]$ is extended by one character, so $dp[i][j] = dp[i-1][j-1]+1$; otherwise no common substring ends at this pair and $dp[i][j] = 0$. Thus, for $i, j \ge 1$:

    dp[i][j] = (s1[i] == s2[j] ? (dp[i-1][j-1] + 1) : 0);
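
    The recurrence can be sketched as a small table-filling routine. This is only an illustration under stated assumptions: the helper name `lcsLengthTable` is hypothetical, and a missing diagonal neighbour (when $i = 0$ or $j = 0$) is assumed to count as 0.

    ```cpp
    #include <cstddef>
    #include <string>
    #include <vector>

    // Hypothetical helper (not from the original post): dp[i][j] holds the length
    // of the longest common substring that ends at s1[i] and s2[j].
    std::vector<std::vector<int>> lcsLengthTable(const std::string &s1, const std::string &s2)
    {
        std::vector<std::vector<int>> dp(s1.size(), std::vector<int>(s2.size(), 0));
        for (std::size_t i = 0; i < s1.size(); ++i) {
            for (std::size_t j = 0; j < s2.size(); ++j) {
                if (s1[i] == s2[j]) {
                    // Assumption: a missing diagonal neighbour counts as 0.
                    int prev = (i > 0 && j > 0) ? dp[i - 1][j - 1] : 0;
                    dp[i][j] = prev + 1;
                }
                // On a mismatch the cell keeps its initial value 0.
            }
        }
        return dp;
    }
    ```

    For the example strings [111001] and [11011], the entry at $i = 3$, $j = 2$ is 3, which corresponds to the common substring [110].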
    

    Since we also want the LCS itself, not just its length, a few modifications are needed.

    Whenever we find a $dp[i][j]$ larger than the current maxLen, we update maxLen to $dp[i][j]$.

    if(dp[i][j] > maxLen)
        maxLen = dp[i][j];
    

    At the same time, we record the starting index of this new, longer substring. In $s_1$ the LCS ends at the current index $i$, so its starting index is $i$ plus 1 minus the length of the LCS, i.e.

    if(dp[i][j] > maxLen){
        maxLen = dp[i][j];
        maxIndex = i + 1 - maxLen; 
    }
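
    Once the starting index and length are known, the substring itself can be read from $s_1$. The sketch below shows one way to do this; the function name `longestCommonSubstring` is hypothetical, and it assumes that when several common substrings share the maximum length, returning the first one found is acceptable.

    ```cpp
    #include <string>
    #include <vector>

    // Hypothetical helper (not from the original post): returns the first
    // longest common substring found, or an empty string if there is none.
    std::string longestCommonSubstring(const std::string &s1, const std::string &s2)
    {
        const int n1 = static_cast<int>(s1.size());
        const int n2 = static_cast<int>(s2.size());
        std::vector<std::vector<int>> dp(n1, std::vector<int>(n2, 0));

        int maxLen = 0;
        int maxIndex = 0;
        for (int i = 0; i < n1; ++i) {
            for (int j = 0; j < n2; ++j) {
                if (s1[i] == s2[j])
                    dp[i][j] = ((i > 0 && j > 0) ? dp[i - 1][j - 1] : 0) + 1;

                if (dp[i][j] > maxLen) {
                    maxLen = dp[i][j];
                    maxIndex = i + 1 - maxLen;  // start of the match within s1
                }
            }
        }
        return s1.substr(maxIndex, maxLen);
    }
    ```

    For the example strings, `longestCommonSubstring("111001", "11011")` yields `110`.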
    

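    Finally, we need to initialize $dp$. Entries in the first row and first column have no diagonal predecessor, so each is simply 1 if the two characters match and 0 otherwise: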

    for(int i = 0; i < s1.length(); ++i)
        dp[i][0] = (s1[i] == s2[0] ? 1 : 0);
    
    for(int j = 0; j < s2.length(); ++j)
        dp[0][j] = (s1[0] == s2[j] ? 1 : 0);
    

    The complete code is:

    #include <string>
    #include <vector>
    using namespace std;

    // Finds the longest common substring of s1 and s2.
    // Outputs: sIndex = its starting index in s1 (-1 if there is none),
    //          length = its length.
    void LCS(const string &s1, const string &s2, int &sIndex, int &length)
    {
        const int n1 = s1.length();
        const int n2 = s2.length();

        sIndex = -1;
        length = 0;
        if(0 == n1 || 0 == n2)
            return;

        // initialize dp: every cell starts at 0
        vector<vector<int> > dp(n1, vector<int>(n2, 0));
        for(int i = 0; i < n1; ++i)
            dp[i][0] = (s1[i] == s2[0] ? 1 : 0);  // Initialize the first column
        for(int j = 0; j < n2; ++j)
            dp[0][j] = (s1[0] == s2[j] ? 1 : 0);  // Initialize the first row

        // compute max length and index; the scan includes the first row and
        // column so that matches recorded there are not missed
        for(int i = 0; i < n1; ++i){
            for(int j = 0; j < n2; ++j){
                if(i > 0 && j > 0 && s1[i] == s2[j])
                    dp[i][j] = dp[i-1][j-1] + 1;

                if(dp[i][j] > length){
                    length = dp[i][j];
                    sIndex = i + 1 - length;
                }
            }
        }
    }
    
  • Original article: https://www.cnblogs.com/kid551/p/4321392.html