  • Longest Common Substring

    Problem Statement

    Given two strings $s_1$ and $s_2$, find their longest common substring (LCS). For example, for $s_1$ = [111001] and $s_2$ = [11011], the longest common substring is [110], with length 3.

    A concise way to solve this problem is dynamic programming (DP).

    Arbitrary substrings are awkward to compare directly, so we first restrict attention to common substrings that end at a given pair of characters.

    Define $dp[i][j]$ as the length of the longest common substring of $s_1[0..i]$ and $s_2[0..j]$ that ends at both $s_1[i]$ and $s_2[j]$.

    The length of the LCS is then the maximum entry of $dp$, since every common substring ends at some pair $(i, j)$.

    To compute $dp[i][j]$, compare $s_1[i]$ with $s_2[j]$. If they are equal, the common substring ending at $(i-1, j-1)$ is extended by one character, so $dp[i][j] = dp[i-1][j-1] + 1$; otherwise no common substring ends at this pair and $dp[i][j] = 0$. Thus:

    dp[i][j] = (s1[i] == s2[j] ? (dp[i-1][j-1] + 1) : 0);
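
    As a minimal sketch of this recurrence on its own, the function below fills the whole table and returns only the maximum entry. The name lcsLength is hypothetical, and treating a match in the first row or column as length 1 is an assumption made here so that dp[i-1][j-1] is never read out of range.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): length of the longest
// common substring of s1 and s2, computed with the recurrence above.
int lcsLength(const std::string& s1, const std::string& s2) {
    const std::size_t n1 = s1.length(), n2 = s2.length();
    // dp[i][j]: length of the common substring ending at s1[i] and s2[j]
    std::vector<std::vector<int>> dp(n1, std::vector<int>(n2, 0));
    int best = 0;
    for (std::size_t i = 0; i < n1; ++i) {
        for (std::size_t j = 0; j < n2; ++j) {
            if (s1[i] == s2[j]) {
                // Assumption: with no diagonal neighbour (first row or
                // column), a match simply has length 1.
                dp[i][j] = (i > 0 && j > 0) ? dp[i - 1][j - 1] + 1 : 1;
            }
            best = std::max(best, dp[i][j]);
        }
    }
    return best;
}
```

    For the example from the problem statement, lcsLength("111001", "11011") evaluates to 3.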
    

    To recover the substring itself, not just its length, only a few modifications are needed.

    Whenever $dp[i][j]$ exceeds the current maximum length maxLen, we update maxLen to $dp[i][j]$.

    if(dp[i][j] > maxLen)
        maxLen = dp[i][j];
    

    At the same time, we can record the starting index of the new, longer substring. In $s_1$, the LCS ending at index $i$ starts at $i + 1$ minus its length, i.e.

    if(dp[i][j] > maxLen){
        maxLen = dp[i][j];
        maxIndex = i + 1 - maxLen; 
    }
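
    Once maxLen and maxIndex are known, the substring can be cut out of $s_1$ with substr. The sketch below puts the update step in context; the function name longestCommonSubstring and the choice to return a std::string are assumptions for illustration, not part of the original code.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): returns the longest
// common substring itself, or an empty string if there is none.
std::string longestCommonSubstring(const std::string& s1, const std::string& s2) {
    const int n1 = static_cast<int>(s1.length());
    const int n2 = static_cast<int>(s2.length());
    std::vector<std::vector<int>> dp(n1, std::vector<int>(n2, 0));
    int maxLen = 0;
    int maxIndex = 0;  // start of the best match in s1
    for (int i = 0; i < n1; ++i) {
        for (int j = 0; j < n2; ++j) {
            if (s1[i] != s2[j]) continue;  // dp[i][j] stays 0
            dp[i][j] = (i > 0 && j > 0) ? dp[i - 1][j - 1] + 1 : 1;
            if (dp[i][j] > maxLen) {
                maxLen = dp[i][j];
                maxIndex = i + 1 - maxLen;  // the match ends at i
            }
        }
    }
    return s1.substr(maxIndex, maxLen);
}
```

    Because the comparison is a strict >, ties are resolved in favour of the first maximum met in row-major order.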
    

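    Finally, we need to initialize $dp$. The recurrence reads $dp[i-1][j-1]$, which does not exist when $i = 0$ or $j = 0$, so the first row and the first column are set directly: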

    for(int i = 0; i < s1.length(); ++i)
        dp[i][0] = (s1[i] == s2[0] ? 1 : 0);
    
    for(int j = 0; j < s2.length(); ++j)
        dp[0][j] = (s1[0] == s2[j] ? 1 : 0);
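
    The two loops above assume that dp has already been allocated and that neither string is empty. A small sketch of allocation plus border initialization, under those assumptions, could look like this; initTable is a hypothetical name.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): allocates dp and fills
// in its first column and first row. Returns an empty table when either
// string is empty, since s1[0] or s2[0] would not exist.
std::vector<std::vector<int>> initTable(const std::string& s1, const std::string& s2) {
    if (s1.empty() || s2.empty()) return {};
    std::vector<std::vector<int>> dp(s1.length(), std::vector<int>(s2.length(), 0));
    for (std::size_t i = 0; i < s1.length(); ++i)
        dp[i][0] = (s1[i] == s2[0] ? 1 : 0);  // first column: compare with s2[0]
    for (std::size_t j = 0; j < s2.length(); ++j)
        dp[0][j] = (s1[0] == s2[j] ? 1 : 0);  // first row: compare with s1[0]
    return dp;
}
```

    Interior cells start at 0 and are overwritten later by the recurrence.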
    

    The complete code is:

    #include <string>
    #include <vector>
    using namespace std;

    void LCM(const string &s1, const string &s2, int &sIndex, int &length)
    {
        int n1 = s1.length();
        int n2 = s2.length();

        // no common substring until one is found
        sIndex = -1;
        length = 0;
        if(0 == n1 || 0 == n2)
            return;

        // initialize dp: all zeros, then the first column and the first row
        vector<vector<int> > dp(n1, vector<int>(n2, 0));
        for(int i = 0; i < n1; ++i)
            dp[i][0] = (s1[i] == s2[0] ? 1 : 0);
        for(int j = 0; j < n2; ++j)
            dp[0][j] = (s1[0] == s2[j] ? 1 : 0);

        // compute max length and index; the loops start at 0 so that
        // matches in the first row and column are counted too
        for(int i = 0; i < n1; ++i){
            for(int j = 0; j < n2; ++j){
                if(i > 0 && j > 0 && s1[i] == s2[j])
                    dp[i][j] = dp[i-1][j-1] + 1;

                if(dp[i][j] > length){
                    length = dp[i][j];
                    sIndex = i + 1 - length;
                }
            }
        }
    }
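
    Since $dp[i][j]$ depends only on the previous row, the full table is not strictly necessary. The following is a sketch of a two-row variant, which is not part of the original post: the name lcsTwoRows, the extra padding column, and the std::string return value are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical variant (not from the original post): same recurrence, but
// only two rows of dp are kept. Returns the substring itself, or an empty
// string if the inputs share no character.
std::string lcsTwoRows(const std::string& s1, const std::string& s2) {
    const std::size_t n1 = s1.length(), n2 = s2.length();
    // Slot j + 1 holds column j; slot 0 is a padding column that stays 0,
    // so the first row and column need no special case.
    std::vector<int> prev(n2 + 1, 0), curr(n2 + 1, 0);
    int length = 0;
    std::size_t sIndex = 0;
    for (std::size_t i = 0; i < n1; ++i) {
        for (std::size_t j = 0; j < n2; ++j) {
            // prev[j] plays the role of dp[i-1][j-1]
            curr[j + 1] = (s1[i] == s2[j]) ? prev[j] + 1 : 0;
            if (curr[j + 1] > length) {
                length = curr[j + 1];
                sIndex = i + 1 - length;
            }
        }
        std::swap(prev, curr);  // the row just computed becomes "previous"
    }
    return s1.substr(sIndex, length);
}
```

    This keeps the running time proportional to the product of the two lengths while storing only two rows of length n2 + 1.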
    
  • Original article: https://www.cnblogs.com/kid551/p/4321392.html