zoukankan      html  css  js  c++  java
  • Edit Distance

    Introduce

    Through the following, u will know what's edit distance.

    target : you are cute  and  I         love u.

    source:   I   am cute          I don't love u.

    For the two sentences, u can use edit including insert, delete and substitute the source so that it can be same to the target.

    u will subsutitute "you" for "I", which will cost 2

    subsutitue "are" for "am", which will cost 2

    insert "and", which will cost 1

    delete "don't", which will cost 1

    so, totally, u will cost 2 + 2 + 1 + 1 = 6, which is the edit distance.

    How to calculate the minimum edit distance

    Now, we take another example

    target : s t o p

    source: s o t

    u can insert "t", which will cost 1, to get "stot", then u can substitute "p" for "t", which will cost 2, to get "stop".

    Besides, u can substitue "t" for "o" to get "stt", then u can substitute "o" for "t" to get "sto", last, u can insert "p" to get "stop"

    Alright, u can have plenty of method to edit to get the target, so how to get the minimum edit distance. We will introduce the algorithm as following, which is dynamic programming.

    i : the length of the target

    j : the length of the source

    D(0, 0) = 0

    D(i, 0) = insertCost * i, means the length of the target is i and the length of the source is 0, so u should insert i characters

    D(0, j) = deleteCost * j, means the length of the target is 0 and the length of the source is j, so u should delete j characters

    D(i, j) = min {D(i - 1, j) + insertCost(target[i]);

                        D(i - 1, j - 1) + substituteCost(source[j], target[i]);

                        D(i, j - 1) + deleteCost(source[j])} means u should choose the method which will cost least.substituteCost = {0, 2}, if source[j] is equal to target[i], it's 0, or it's 2.

    insertCost = 1

    deleteCost = 1

    Well, maybe u will feel confuse, but no problem, We will use the algorithm to solve the above question.

    firstly, u can gain the belew form.

     

    D(1, 1) = min {D[0, 1] + insert(t[1]) = 2;

                          D[0, 0] + substitute(s[1], t[1]) = 0;

            D[1, 0] + delete(s[1]) = 2}          =    0

    D[1, 2] = min{ D[0, 2] + insert(t[1]) = 3;

            D[0, 1] + substitute(s[2], t[1]) = 3;

            D[1, 1] + delete(s[2] = 1)}            =   1

    ................

    then u will get the last number

    D[4, 3] = min {D[3, 2] + insert(t[4]) = 3;

            D[3, 2] + substitute(s[3], t[4]) = 3;

            D[4, 2] + delete(s[3]) = 3}    =    3

    Actually, u can get the path how to get edit through every step. The below picture show the results.

  • 相关阅读:
    超强、超详细Redis数据库入门教程
    性能测试学习线路图(建议)
    性能测试分析软件汇总–开源、商业全部收集
    基于Locust、Tsung的百万并发秒杀压测案例[转]
    brew update失败提示:/System/Library/Frameworks/Ruby.framework/。。。解决方法
    XML特殊符号
    tsuang简易教程-MAC版
    【转】SQL Server获取日期或时间
    autoIT 传参数含引号
    Autoit 脚本以及 Java 调用
  • 原文地址:https://www.cnblogs.com/chuanlong/p/2993774.html
Copyright © 2011-2022 走看看