  • Implementing edit distance in Python

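    1. Definition

    Edit distance is the minimum number of edit operations needed to turn one string into another, where each operation inserts, deletes, or replaces a single character.

    The smaller the edit distance, the more similar the two strings are.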
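
The definition above can also be written as a recurrence. This is a sketch in notation that is not in the original: a and b are the two strings, and D(i, j) is the edit distance between the first i characters of a and the first j characters of b. The answer for the full strings is D(|a|, |b|).

```latex
\begin{aligned}
D(i, 0) &= i, \qquad D(0, j) = j, \\
D(i, j) &=
\begin{cases}
D(i-1,\, j-1) & \text{if } a_i = b_j, \\
1 + \min\bigl\{\, D(i-1,\, j-1),\; D(i-1,\, j),\; D(i,\, j-1) \,\bigr\} & \text{otherwise.}
\end{cases}
\end{aligned}
```

The three terms inside the minimum correspond to a replacement, a deletion, and an insertion, respectively.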

    Example 1:

    Input: word1 = "horse", word2 = "ros"
    Output: 3
    Explanation:
    horse -> rorse (replace 'h' with 'r')
    rorse -> rose (delete 'r')
    rose -> ros (delete 'e')

    Example 2:

    Input: word1 = "intention", word2 = "execution"
    Output: 5
    Explanation:
    intention -> inention (delete 't')
    inention -> enention (replace 'i' with 'e')
    enention -> exention (replace 'n' with 'x')
    exention -> exection (replace 'n' with 'c')
    exection -> execution (insert 'u')
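
The step-by-step explanations in the two examples can be recovered by walking back through the dynamic-programming table. The sketch below is one possible way to do that; the function name `edit_script` and the wording of the operation strings are illustrative and not part of the original post.

```python
def edit_script(src, dst):
    """Return (distance, operations) for turning src into dst."""
    m, n = len(src), len(dst)
    # dist[i][j] = edit distance between src[:i] and dst[:j]
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dist[i][0] = i
    for j in range(1, n + 1):
        dist[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if src[i - 1] == dst[j - 1]:
                dist[i][j] = dist[i - 1][j - 1]
            else:
                dist[i][j] = 1 + min(dist[i - 1][j - 1],
                                     dist[i - 1][j],
                                     dist[i][j - 1])

    # Walk back from the bottom-right cell, recording one operation
    # for every step that lowers the distance by one.
    ops = []
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0 and src[i - 1] == dst[j - 1]:
            i, j = i - 1, j - 1  # characters match, no edit needed
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops.append(f"replace '{src[i - 1]}' with '{dst[j - 1]}'")
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(f"delete '{src[i - 1]}'")
            i -= 1
        else:
            ops.append(f"insert '{dst[j - 1]}'")
            j -= 1
    ops.reverse()  # operations were collected from the end of the strings
    return dist[m][n], ops


distance, operations = edit_script("horse", "ros")
print(distance)
for op in operations:
    print(op)
```

Several different minimal edit sequences usually exist, so the operations printed here may differ from the ones listed in the examples; only their count is guaranteed to equal the edit distance.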

    2. Python implementation

    # -*- coding: utf-8 -*-
    def ld(str1, str2):
        """Return the Levenshtein (edit) distance between str1 and str2."""
        m, n = len(str1) + 1, len(str2) + 1

        # Initialize the matrix: matrix[i][j] is the distance between
        # str1[:i] and str2[:j]
        matrix = [[0] * n for _ in range(m)]
        for i in range(1, m):
            matrix[i][0] = i  # delete i characters
        for j in range(1, n):
            matrix[0][j] = j  # insert j characters

        # Fill in the matrix by dynamic programming
        for i in range(1, m):
            for j in range(1, n):
                if str1[i - 1] == str2[j - 1]:
                    matrix[i][j] = matrix[i - 1][j - 1]
                else:
                    matrix[i][j] = min(matrix[i - 1][j - 1], matrix[i - 1][j], matrix[i][j - 1]) + 1

        # The bottom-right cell holds the distance between the full strings
        return matrix[m - 1][n - 1]


    str1 = 'GAATTCAGTTA'
    str2 = 'GGATCGA'
    print(ld(str1, str2))
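
Each row of the matrix depends only on the row above it, so the full table is not strictly required when only the distance is needed. The following is a minimal sketch of that variant, which keeps two rows at a time. The names `ld_two_rows` and `similarity` are illustrative, and normalizing by the longer string's length is an assumption made here as one common way to turn "smaller distance means more similar" into a score between 0 and 1; the original post does not define a similarity formula.

```python
def ld_two_rows(str1, str2):
    """Edit distance using two rows of the table instead of the full matrix."""
    # Edit distance is symmetric, so put the shorter string along the row
    # to keep the rows as small as possible.
    if len(str1) < len(str2):
        str1, str2 = str2, str1

    # prev[j] = distance between the empty prefix of str1 and str2[:j]
    prev = list(range(len(str2) + 1))
    for i, c1 in enumerate(str1, start=1):
        curr = [i]  # distance between str1[:i] and the empty string
        for j, c2 in enumerate(str2, start=1):
            if c1 == c2:
                curr.append(prev[j - 1])
            else:
                curr.append(1 + min(prev[j - 1], prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(str1, str2):
    """Assumed normalization: 1.0 for identical strings, 0.0 for no overlap."""
    longest = max(len(str1), len(str2))
    if longest == 0:
        return 1.0  # two empty strings are identical
    return 1.0 - ld_two_rows(str1, str2) / longest


print(ld_two_rows('GAATTCAGTTA', 'GGATCGA'))
print(similarity('GAATTCAGTTA', 'GGATCGA'))
```

The trade-off is that the edit operations can no longer be traced back, since earlier rows are discarded.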

     

  • Original article: https://www.cnblogs.com/qijiujiu/p/13259572.html