  • Favorite Algorithm(s)

    String Matching: Levenshtein distance

    • Purpose: to find the minimum number of single-character edits needed to convert one string into another
    • Intuition behind the method: each edit is a replacement, addition, or deletion of a single character in a string
    • Steps

    Step  Description

    1     Set n to be the length of s.
          Set m to be the length of t.
          If n = 0, return m and exit.
          If m = 0, return n and exit.
          Construct a matrix d containing 0..m rows and 0..n columns.

    2     Initialize the first row to 0..n.
          Initialize the first column to 0..m.

    3     Examine each character of s (i from 1 to n).

    4     Examine each character of t (j from 1 to m).

    5     If s[i] equals t[j], the cost is 0.
          If s[i] doesn't equal t[j], the cost is 1.

    6     Set cell d[i,j] of the matrix (column i, row j) equal to the minimum of:
          a. The cell immediately to the left plus 1: d[i-1,j] + 1.
          b. The cell immediately above plus 1: d[i,j-1] + 1.
          c. The cell diagonally above and to the left plus the cost: d[i-1,j-1] + cost.

    7     After the iteration steps (3, 4, 5, 6) are complete, the distance is found in cell d[n,m].
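
The seven steps above can be sketched in Python roughly as follows. This is a minimal illustration, not a reference implementation: the function name `levenshtein` is an assumption of this sketch, and because Python strings are 0-indexed, `s[i - 1]` and `t[j - 1]` stand in for the 1-based `s[i]` and `t[j]` used in the steps. The matrix is stored as a list of rows, so cell d[i,j] of the text is `d[j][i]` here.

```python
def levenshtein(s: str, t: str) -> int:
    """Levenshtein distance between s and t, following steps 1 to 7."""
    # Step 1: lengths, trivial cases, and the (m+1) x (n+1) matrix.
    n, m = len(s), len(t)
    if n == 0:
        return m
    if m == 0:
        return n
    d = [[0] * (n + 1) for _ in range(m + 1)]  # d[j][i]: row j, column i

    # Step 2: first row is 0..n, first column is 0..m.
    for i in range(n + 1):
        d[0][i] = i
    for j in range(m + 1):
        d[j][0] = j

    # Steps 3 and 4: visit every character of s, then every character of t.
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Step 5: cost is 0 on a match, 1 otherwise.
            cost = 0 if s[i - 1] == t[j - 1] else 1
            # Step 6: minimum over the left, upper, and diagonal neighbours.
            d[j][i] = min(
                d[j][i - 1] + 1,         # cell to the left
                d[j - 1][i] + 1,         # cell above
                d[j - 1][i - 1] + cost,  # diagonal cell plus cost
            )

    # Step 7: the distance is in the lower right-hand cell.
    return d[m][n]


print(levenshtein("GUMBO", "GAMBOL"))
```

The full matrix costs O(n·m) time and memory. It is kept here so the code mirrors the steps one to one.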

    • Example

    This section shows how the Levenshtein distance is computed when the source string is "GUMBO" and the target string is "GAMBOL".

    Steps 1 and 2

        G U M B O
      0 1 2 3 4 5
    G 1          
    A 2          
    M 3          
    B 4          
    O 5          
    L 6          

    Steps 3 to 6 When i = 1

        G U M B O
      0 1 2 3 4 5
    G 1 0        
    A 2 1        
    M 3 2        
    B 4 3        
    O 5 4        
    L 6 5        

    Steps 3 to 6 When i = 2

        G U M B O
      0 1 2 3 4 5
    G 1 0 1      
    A 2 1 1      
    M 3 2 2      
    B 4 3 3      
    O 5 4 4      
    L 6 5 5      

    Steps 3 to 6 When i = 3

        G U M B O
      0 1 2 3 4 5
    G 1 0 1 2    
    A 2 1 1 2    
    M 3 2 2 1    
    B 4 3 3 2    
    O 5 4 4 3    
    L 6 5 5 4    

    Steps 3 to 6 When i = 4

        G U M B O
      0 1 2 3 4 5
    G 1 0 1 2 3  
    A 2 1 1 2 3  
    M 3 2 2 1 2  
    B 4 3 3 2 1  
    O 5 4 4 3 2  
    L 6 5 5 4 3  

    Steps 3 to 6 When i = 5

        G U M B O
      0 1 2 3 4 5
    G 1 0 1 2 3 4
    A 2 1 1 2 3 4
    M 3 2 2 1 2 3
    B 4 3 3 2 1 2
    O 5 4 4 3 2 1
    L 6 5 5 4 3 2

    Step 7

    The distance is in the lower right-hand corner of the matrix, i.e. 2. This matches the intuition that "GUMBO" can be transformed into "GAMBOL" by substituting "A" for "U" and adding "L" (1 substitution and 1 insertion = 2 changes).
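
To check the worked example by machine, the sketch below builds the whole matrix and prints it in the same layout as the tables above (source string across the top, target string down the side). The helper names `levenshtein_matrix` and `format_matrix` are made up for this illustration, and the simple column alignment assumes every distance is a single digit, which holds for this example.

```python
def levenshtein_matrix(s: str, t: str) -> list[list[int]]:
    """Return the filled matrix; rows follow t, columns follow s."""
    n, m = len(s), len(t)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    d[0] = list(range(n + 1))      # first row: 0..n
    for j in range(m + 1):
        d[j][0] = j                # first column: 0..m
    for i in range(1, n + 1):      # one column per character of s
        for j in range(1, m + 1):  # one row per character of t
            cost = 0 if s[i - 1] == t[j - 1] else 1
            d[j][i] = min(d[j][i - 1] + 1,
                          d[j - 1][i] + 1,
                          d[j - 1][i - 1] + cost)
    return d


def format_matrix(s: str, t: str, d: list[list[int]]) -> str:
    """Render the matrix with s as column labels and t as row labels."""
    lines = ["    " + " ".join(s)]
    for j, row in enumerate(d):
        label = t[j - 1] if j > 0 else " "
        lines.append(label + " " + " ".join(str(v) for v in row))
    return "\n".join(lines)


matrix = levenshtein_matrix("GUMBO", "GAMBOL")
print(format_matrix("GUMBO", "GAMBOL", matrix))
print("distance:", matrix[-1][-1])
```

The printed grid should match the final table (i = 5), with the distance 2 in the last cell.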

     

  • Original article: https://www.cnblogs.com/postmodernist/p/5177424.html