  • EditDistance: finding the minimum edit distance between two strings with dynamic programming

    Problem description:

    Edit Distance
    Given two words word1 and word2, find the minimum number of steps required to convert word1 to word2. (each operation is counted as 1 step.)
    You have the following 3 operations permitted on a word:
         a) Insert a character
         b) Delete a character
         c) Replace a character

    Algorithm analysis:

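    In other words, we want the minimum number of operations needed to turn one string into another, where each operation inserts, deletes, or replaces a single character.
    Take word1 = michaelab and word2 = michaelxy as the running example (to keep things simple, the two words have the same length). Let dis[i][j] be the Edit Distance between the first i characters of word1 and the first j characters of word2; the goal is the minimum number of steps needed to turn michaelab into michaelxy.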

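    First, what dis[i][j] means: it is the Edit Distance between the first i characters of word1 and the first j characters of word2. dis[0][0] is the case where both are empty, so the Edit Distance is 0. dis[0][j] is the case where word1 is empty and word2 has length j; the Edit Distance is j, because j characters must be inserted into the empty string to obtain word2. Likewise, dis[i][0] is the case where word1 has length i and word2 is empty; word1 must delete i characters to become empty, so the Edit Distance is i. The initialization code follows (row and col are taken here to be the table dimensions, len1 + 1 and len2 + 1, matching the full code at the end):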

           for (int i = 0; i < row; i++) dis[i][0] = i;
           for (int j = 0; j < col; j++) dis[0][j] = j;
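
    The two loops above can be wrapped into a small self-contained sketch. The class name BaseCaseTable and the method name init are hypothetical, and the sizes row = word1.length() + 1 and col = word2.length() + 1 are an assumption, since the snippet above does not define row and col:

```java
// Hypothetical helper, not part of the original post.
final class BaseCaseTable {
    // Allocates the table and fills only row 0 and column 0 (the base cases).
    static int[][] init(String word1, String word2) {
        int row = word1.length() + 1; // assumed meaning of "row"
        int col = word2.length() + 1; // assumed meaning of "col"
        int[][] dis = new int[row][col];
        for (int i = 0; i < row; i++) dis[i][0] = i; // delete i characters to reach the empty string
        for (int j = 0; j < col; j++) dis[0][j] = j; // insert j characters into the empty string
        return dis;
    }
}
```

    All other cells stay at Java's default value 0 until the recurrence fills them in.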

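    Now consider the three operations the problem allows: insert, delete, and replace.
    Suppose the first i characters of word1 and the first j characters of word2 (here i = j) are michaelab and michaelxy.
    Clearly, if b == y, then dis[i][j] = dis[i-1][j-1].
    If b != y, then:
    Insert: append a y to michaelab, so that word1 becomes michaelaby. Then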
    dis[i][j] = 1 + dis[i][j-1];
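    Here the 1 is the insertion just performed. After it, word1 is michaelaby and word2 is michaelxy. Because the trailing characters now match (y == y), what remains is to turn michaelab into michaelx, and dis[i][j-1] is exactly that: the minimum Edit Distance from the first i characters of word1 to the first j-1 characters of word2.
    Delete: remove the trailing b from michaelab, so that word1 becomes michaela. Then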
    dis[i][j] = 1 + dis[i-1][j];
    Here the 1 is the deletion just performed. After it, word1 is michaela and word2 is michaelxy. What remains is to turn michaela into michaelxy, and dis[i-1][j] is exactly that: the minimum Edit Distance from the first i-1 characters of word1 to the first j characters of word2.
    Replace: change the trailing b of michaelab to y, so that word1 becomes michaelay. Then
    dis[i][j] = 1 + dis[i-1][j-1];
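    Here the 1 is the replacement just performed. After it, word1 is michaelay and word2 is michaelxy. Because the trailing characters now match (y == y), what remains is to turn michaela into michaelx, and dis[i-1][j-1] is exactly that: the minimum Edit Distance from the first i-1 characters of word1 to the first j-1 characters of word2. Since any of the three operations may be chosen, dis[i][j] is the smallest of the three candidates.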
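
    The three cases can also be written top-down, as a direct transcription of the analysis. This is only a sketch: the class EditDistanceMemo and the helper dis are hypothetical names, and the memo table is an addition here so that each (i, j) pair is computed once. The recursion depth grows with the combined length of the two words, so it suits short inputs only.

```java
// Hypothetical top-down variant, not part of the original post.
final class EditDistanceMemo {
    static int minDistance(String word1, String word2) {
        int[][] memo = new int[word1.length() + 1][word2.length() + 1];
        for (int[] r : memo) java.util.Arrays.fill(r, -1); // -1 marks "not computed yet"
        return dis(word1, word2, word1.length(), word2.length(), memo);
    }

    // Edit distance between the first i chars of w1 and the first j chars of w2.
    private static int dis(String w1, String w2, int i, int j, int[][] memo) {
        if (i == 0) return j; // word1 prefix is empty: insert j characters
        if (j == 0) return i; // word2 prefix is empty: delete i characters
        if (memo[i][j] >= 0) return memo[i][j];
        int result;
        if (w1.charAt(i - 1) == w2.charAt(j - 1)) {
            result = dis(w1, w2, i - 1, j - 1, memo); // last characters match, no cost
        } else {
            int insert = 1 + dis(w1, w2, i, j - 1, memo);
            int delete = 1 + dis(w1, w2, i - 1, j, memo);
            int replace = 1 + dis(w1, w2, i - 1, j - 1, memo);
            result = Math.min(insert, Math.min(delete, replace));
        }
        memo[i][j] = result;
        return result;
    }
}
```

    For the running example, EditDistanceMemo.minDistance("michaelab", "michaelxy") returns 2: replace a with x and b with y.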

    /*
    Let x = word1[i-1] and y = word2[j-1], the last characters of the two prefixes.
    if x == y, then dp[i][j] = dp[i-1][j-1]
    if x != y, and we insert y into word1, then dp[i][j] = dp[i][j-1] + 1
    if x != y, and we delete x from word1, then dp[i][j] = dp[i-1][j] + 1
    if x != y, and we replace x with y in word1, then dp[i][j] = dp[i-1][j-1] + 1
    When x != y, dp[i][j] is the minimum of the three cases.
    Initial conditions:
    dp[i][0] = i, dp[0][j] = j
    */
    public class EditDistance
    {
        public int minDistance(String word1, String word2)
        {
            int len1 = word1.length();
            int len2 = word2.length();

            // dp[i][j] = edit distance between the first i chars of word1 and the first j chars of word2
            int[][] dp = new int[len1 + 1][len2 + 1];
            for (int i = 0; i <= len1; i++) // word2 is empty: delete i characters from word1
            {
                dp[i][0] = i;
            }
            for (int j = 0; j <= len2; j++) // word1 is empty: insert j characters into word1
            {
                dp[0][j] = j;
            }

            for (int i = 0; i < len1; i++)
            {
                char c1 = word1.charAt(i);
                for (int j = 0; j < len2; j++)
                {
                    char c2 = word2.charAt(j);
                    if (c1 == c2)
                    {
                        // Last characters match: no extra operation needed.
                        dp[i + 1][j + 1] = dp[i][j];
                    }
                    else
                    {
                        int insert = dp[i + 1][j] + 1;
                        int delete = dp[i][j + 1] + 1;
                        int replace = dp[i][j] + 1;
                        dp[i + 1][j + 1] = Math.min(insert, Math.min(delete, replace));
                    }
                }
            }
            return dp[len1][len2];
        }
    }
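
    To see how the table fills in, the sketch below repeats the same bottom-up loop but returns the whole table and prints it for a tiny input. The class EditDistanceTableDemo and the method fill are hypothetical names introduced only for this illustration:

```java
// Hypothetical demo, not part of the original post.
final class EditDistanceTableDemo {
    // Same bottom-up fill as minDistance, but the full table is returned.
    static int[][] fill(String word1, String word2) {
        int len1 = word1.length();
        int len2 = word2.length();
        int[][] dp = new int[len1 + 1][len2 + 1];
        for (int i = 0; i <= len1; i++) dp[i][0] = i;
        for (int j = 0; j <= len2; j++) dp[0][j] = j;
        for (int i = 0; i < len1; i++) {
            for (int j = 0; j < len2; j++) {
                if (word1.charAt(i) == word2.charAt(j)) {
                    dp[i + 1][j + 1] = dp[i][j];
                } else {
                    int insert = dp[i + 1][j] + 1;
                    int delete = dp[i][j + 1] + 1;
                    int replace = dp[i][j] + 1;
                    dp[i + 1][j + 1] = Math.min(insert, Math.min(delete, replace));
                }
            }
        }
        return dp;
    }

    public static void main(String[] args) {
        // Rows correspond to prefixes of "ab", columns to prefixes of "xy".
        for (int[] row : fill("ab", "xy")) {
            StringBuilder sb = new StringBuilder();
            for (int v : row) sb.append(v).append(' ');
            System.out.println(sb.toString().trim());
        }
        // Prints:
        // 0 1 2
        // 1 1 2
        // 2 2 2
    }
}
```

    The bottom-right cell is the answer: turning ab into xy takes 2 replacements, which is also the value the last two characters contribute in the michaelab / michaelxy example.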
    
    
    
    
     
  • Original article: https://www.cnblogs.com/masterlibin/p/5785092.html