  • Computing String Edit Distance with Dynamic Programming

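    String edit distance
    Given a source string str1 and a target string str2, three operations may be applied to str1, each at a chosen position:
    1. Insert a character
    2. Replace a character
    3. Delete a character
    The string edit distance is the minimum number of such operations needed to turn str1 into str2.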
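As a small illustration of the three operations, the sketch below turns 'kitten' into 'sitting' in three steps. The helper names insert_at, replace_at and delete_at are made up for this example, and the strings are a commonly used textbook pair rather than anything from the original post.

```python
def insert_at(text, index, char):
    """Insert char so that it ends up at position index."""
    return text[:index] + char + text[index:]


def replace_at(text, index, char):
    """Replace the character at position index with char."""
    return text[:index] + char + text[index + 1:]


def delete_at(text, index):
    """Delete the character at position index."""
    return text[:index] + text[index + 1:]


step1 = replace_at('kitten', 0, 's')  # 'sitten'
step2 = replace_at(step1, 4, 'i')     # 'sittin'
step3 = insert_at(step2, 6, 'g')      # 'sitting'
print(step1, step2, step3)
```

Three operations are enough here, so the edit distance between these two strings is at most 3. Showing that no shorter sequence exists is what the dynamic programming solution is for.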

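    Dynamic programming
    Divide and conquer solves a problem by combining the solutions of its subproblems, but it may end up solving the same subproblem many times. Dynamic programming is similar, except that each subproblem's solution is stored once it has been computed (in Python, the caching decorator @lru_cache from functools can do this). When the same subproblem comes up again, the stored result is looked up instead of calling the function again, which avoids the repeated work and saves computation time.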
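A minimal sketch of the caching idea, using Fibonacci numbers as a stand-in problem (the function fib is not part of the original post). The cache_info() method that functools.lru_cache attaches to the decorated function reports how many calls were answered from the cache.

```python
from functools import lru_cache


@lru_cache(maxsize=None)  # unbounded cache: every solved subproblem is kept
def fib(n):
    """Return the n-th Fibonacci number, with fib(0) == 0 and fib(1) == 1."""
    if n < 2:
        return n
    # fib(n - 2) was already solved while computing fib(n - 1),
    # so the second call is served from the cache.
    return fib(n - 1) + fib(n - 2)


value = fib(30)
info = fib.cache_info()
print(value, info.hits, info.misses)
```

Each distinct argument is computed only once (a cache miss); every other call is a lookup (a cache hit). Without the decorator, the same recursion would recompute small subproblems an exponential number of times.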

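    • Dynamic programming has three characteristics:
    1. Subproblems
      The problem is broken into subproblems in divide-and-conquer fashion, and these are solved recursively.
    2. Subproblem solutions are stored in a table
      Each solved subproblem is saved so that it is not solved again later.
    3. Parsing the stored results
      The solution is built recursively, so recovering how the answer was reached (and not just its value) sometimes requires walking back through the stored results.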
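The "table" in the second point can also be made explicit. The sketch below is a bottom-up variant of the edit distance computation, not the recursive approach this post uses; the function name edit_distance_table is chosen for this example. Cell table[i][j] holds the edit distance between the first i characters of source and the first j characters of target.

```python
def edit_distance_table(source, target):
    """Fill and return the full table of prefix-to-prefix edit distances."""
    rows, cols = len(source) + 1, len(target) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        table[i][0] = i  # delete all i characters of the source prefix
    for j in range(cols):
        table[0][j] = j  # insert all j characters of the target prefix
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            table[i][j] = min(
                table[i - 1][j] + 1,         # delete source[i - 1]
                table[i][j - 1] + 1,         # insert target[j - 1]
                table[i - 1][j - 1] + cost,  # keep or replace the last character
            )
    return table


table = edit_distance_table('kitten', 'sitting')
print(table[-1][-1])
```

The bottom-right cell is the distance between the two full strings. Every cell is computed exactly once from three neighbours that are already filled in, which is the same saving that caching gives the recursive version.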

    String edit distance in Python

    from functools import lru_cache
    
    # Maps (string1, string2) to the operation chosen for that subproblem,
    # so the sequence of edits can be reconstructed afterwards.
    solution = {}
    
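    # The cache holds at most 2**10 results; for long inputs, evicted
    # subproblems are simply recomputed.
    @lru_cache(maxsize=2**10)
    def edit_distance(string1, string2):
        """Return the minimum number of edits that turn string1 into string2."""
        # Base cases: one string is empty, so insert or delete everything left.
        if len(string1) == 0: return len(string2)
        if len(string2) == 0: return len(string1)
        
        tail_s1 = string1[-1]
        tail_s2 = string2[-1]
        
        candidates = [
            (edit_distance(string1[:-1], string2) + 1, 'DEL {}'.format(tail_s1)),  # delete the tail of string1
            (edit_distance(string1, string2[:-1]) + 1, 'ADD {}'.format(tail_s2)),  # append the tail of string2 to string1
        ]
        
        if tail_s1 == tail_s2:
            # Tails match: drop both at no cost.
            both_forward = (edit_distance(string1[:-1], string2[:-1]) + 0, '')
        else:
            # Tails differ: replace the tail of string1 with the tail of string2.
            both_forward = (edit_distance(string1[:-1], string2[:-1]) + 1, 'SUB {} => {}'.format(tail_s1, tail_s2))
    
        candidates.append(both_forward)
        
        # On ties, min() keeps the first candidate in the list.
        min_distance, operation = min(candidates, key=lambda x: x[0])
        
        solution[(string1, string2)] = operation
        
        return min_distance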
    
    def parse_solution(string1, string2):
        """List the edits that turn string1 into string2.
    
        Operations are listed from the end of string1 toward its start, so
        each index refers to a position in the original string1.
        """
        edit_distance(string1, string2)  # make sure `solution` is populated
        parsed_solution = []
        while string1 and string2:
            operation = solution[(string1, string2)]
            operator = operation[:3]
            if operator == 'ADD':
                parsed_solution.append(operation + ' after ind={}'.format(len(string1) - 1))
                string2 = string2[:-1]
            elif operator == 'DEL':
                parsed_solution.append(operation + ' at ind={}'.format(len(string1) - 1))
                string1 = string1[:-1]
            else:
                # 'SUB', or '' when the tails already match: drop both tails.
                if operator == 'SUB':
                    parsed_solution.append(operation + ' at ind={}'.format(len(string1) - 1))
                string1, string2 = string1[:-1], string2[:-1]
        # Base cases are never stored in `solution`: one string is now empty.
        for char in reversed(string2):
            parsed_solution.append('ADD {} before ind=0'.format(char))
        for index in range(len(string1) - 1, -1, -1):
            parsed_solution.append('DEL {} at ind={}'.format(string1[index], index))
        return parsed_solution
    

    Result

    edit_distance('ABCDE', 'ABCCEF')
    >> 2
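One way to sanity-check a computed distance is to recover an explicit list of edits and apply it. The sketch below is self-contained and uses a bottom-up table instead of the recursive function above; the names edit_script and apply_script, and the (operator, index, character) tuple format, are assumptions made for this example.

```python
def edit_script(source, target):
    """Return (distance, operations) for turning source into target.

    Operations are (operator, index, char) tuples listed from the end of
    source toward its start, so indices refer to the original source.
    """
    rows, cols = len(source) + 1, len(target) + 1
    table = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        table[i][0] = i
    for j in range(cols):
        table[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            table[i][j] = min(table[i - 1][j] + 1,
                              table[i][j - 1] + 1,
                              table[i - 1][j - 1] + cost)

    # Walk back from the bottom-right cell to recover one optimal script.
    operations = []
    i, j = len(source), len(target)
    while i > 0 or j > 0:
        if i > 0 and j > 0 and source[i - 1] == target[j - 1]:
            i, j = i - 1, j - 1                             # tails match, no edit
        elif i > 0 and j > 0 and table[i][j] == table[i - 1][j - 1] + 1:
            operations.append(('SUB', i - 1, target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and table[i][j] == table[i - 1][j] + 1:
            operations.append(('DEL', i - 1, source[i - 1]))
            i -= 1
        else:
            operations.append(('ADD', i, target[j - 1]))    # insert before index i
            j -= 1
    return table[-1][-1], operations


def apply_script(source, operations):
    """Apply the operations in the order given (right to left)."""
    chars = list(source)
    for operator, index, char in operations:
        if operator == 'SUB':
            chars[index] = char
        elif operator == 'DEL':
            del chars[index]
        else:
            chars.insert(index, char)
    return ''.join(chars)


distance, operations = edit_script('ABCDE', 'ABCCEF')
print(distance, operations, apply_script('ABCDE', operations))
```

Because the edits are listed right to left, applying them in that order never shifts the position of an edit that has not been applied yet. For 'ABCDE' and 'ABCCEF' the script has two edits: replace D with C, and add F at the end.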
    
    
  • Original article: https://www.cnblogs.com/bitbitbyte/p/12536584.html