  • [Leetcode] Edit Distance

    Given two words word1 and word2, find the minimum number of steps required to convert word1 to word2. (each operation is counted as 1 step.)

    You have the following 3 operations permitted on a word:

    a) Insert a character
    b) Delete a character
    c) Replace a character

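    In natural language processing (NLP), a basic problem is computing the minimal edit distance between two strings, also known as the Levenshtein distance. Inspired by an introductory article on edit distance, this post computes the minimal edit distance between two strings with dynamic programming. The recurrence is explained below.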

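    1. What is minimal edit distance?

    Simply put, it is the minimum number of steps needed to transform a string s1 into another string s2 using only three operations: insert, delete, and substitute. Anyone familiar with algorithms will recognize this as a dynamic programming problem.

    A substitution can be viewed as one delete plus one insert, so we define the weights as follows:

    I (insert): 1

    D (delete): 1

    S (substitute): 2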

    2. Example:

    intention -> execution

    Minimal edit distance:

    delete i; n->e; t->x; insert c; n->u, for a total cost of 1 + 2 + 2 + 1 + 2 = 8

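    3. Calculate minimal edit distance dynamically

    The idea is described in the code comments. Here D[i,j] is the minimal edit distance between the first i characters of s1 and the first j characters of s2.

    The three operations update the table as follows:

    D(i,j) = min { D(i-1, j) + 1, D(i, j-1) + 1, D(i-1, j-1) + (s1[i] == s2[j] ? 0 : 2) }

    with s1 and s2 indexed from 1. The three terms correspond to D, I, and S respectively. (For details, see my classmate's blog.)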
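
As a rough illustration of the recurrence with the weights defined in section 1 (insert = 1, delete = 1, substitute = 2), here is a minimal sketch. The function name `weightedEditDistance` and the `subCost` parameter are hypothetical, introduced only for this example; they do not appear in the original post.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): edit distance where
// insert and delete cost 1 and a substitution costs subCost.
int weightedEditDistance(const std::string& s1, const std::string& s2, int subCost) {
    const std::size_t n = s1.size();
    const std::size_t m = s2.size();
    // D[i][j]: minimal cost to turn the first i chars of s1 into the first j chars of s2.
    std::vector<std::vector<int>> D(n + 1, std::vector<int>(m + 1, 0));
    for (std::size_t i = 0; i <= n; ++i) D[i][0] = static_cast<int>(i);  // i deletions
    for (std::size_t j = 0; j <= m; ++j) D[0][j] = static_cast<int>(j);  // j insertions
    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            // Strings are 0-indexed in C++, so s1[i] in the formula is s1[i - 1] here.
            const int sub = D[i - 1][j - 1] + (s1[i - 1] == s2[j - 1] ? 0 : subCost);
            D[i][j] = std::min({D[i - 1][j] + 1,   // delete
                                D[i][j - 1] + 1,   // insert
                                sub});             // substitute or match
        }
    }
    return D[n][m];
}
```

Calling `weightedEditDistance("intention", "execution", 2)` reproduces the cost of 8 from the example in section 2; passing `subCost = 1` instead gives the unit-cost distance that the LeetCode problem asks for.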

    #include <algorithm>
    #include <string>
    #include <vector>
    using namespace std;

    class Solution {
    public:
        int minDistance(string word1, string word2) {
            int len1 = word1.length();
            int len2 = word2.length();
            if (len1 == 0) return len2;
            if (len2 == 0) return len1;
            // dp[i][j]: edit distance between the first i characters of word1
            // and the first j characters of word2.
            vector<vector<int> > dp(len1 + 1, vector<int>(len2 + 1));
            for (int i = 0; i <= len1; ++i) dp[i][0] = i;  // delete all i characters
            for (int j = 0; j <= len2; ++j) dp[0][j] = j;  // insert all j characters
            int cost;
            for (int i = 1; i <= len1; ++i) {
                for (int j = 1; j <= len2; ++j) {
                    // LeetCode counts a replace as 1 step, not the weight of 2 used in the text.
                    cost = (word1[i-1] == word2[j - 1]) ? 0 : 1;
                    dp[i][j] = min(dp[i-1][j-1] + cost, min(dp[i][j-1] + 1, dp[i-1][j] + 1));
                }
            }
            return dp[len1][len2];
        }
    };
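
Each row of the table depends only on the row before it, so the memory can be reduced from O(len1 * len2) to O(len2) if desired. The following is a sketch of that variant under the same unit costs as the LeetCode solution; the name `minDistanceTwoRows` is hypothetical and not part of the original post.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical two-row variant of the unit-cost solution (insert, delete,
// and replace each cost 1). Only the previous and current rows are stored.
int minDistanceTwoRows(const std::string& word1, const std::string& word2) {
    const int len1 = static_cast<int>(word1.size());
    const int len2 = static_cast<int>(word2.size());
    std::vector<int> prev(len2 + 1), curr(len2 + 1);
    for (int j = 0; j <= len2; ++j) prev[j] = j;  // row 0: j insertions
    for (int i = 1; i <= len1; ++i) {
        curr[0] = i;  // column 0: i deletions
        for (int j = 1; j <= len2; ++j) {
            const int cost = (word1[i - 1] == word2[j - 1]) ? 0 : 1;
            curr[j] = std::min({prev[j - 1] + cost,  // replace or match
                                curr[j - 1] + 1,     // insert
                                prev[j] + 1});       // delete
        }
        std::swap(prev, curr);  // the current row becomes the previous row
    }
    return prev[len2];  // after the final swap, prev holds the last computed row
}
```

The trade-off is that the full table is no longer available, so the sequence of operations cannot be recovered by backtracking; only the distance itself is returned.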
  • Original post: https://www.cnblogs.com/easonliu/p/3661537.html