  • The KMP algorithm in Python

    The theory behind the KMP algorithm is drawn mainly from:

      Ruan Yifeng's blog: http://www.ruanyifeng.com/blog/2013/05/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm.html

      The GeeksforGeeks article: https://www.geeksforgeeks.org/searching-for-patterns-set-2-kmp-algorithm/

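    Advantages of KMP over naive search:

      1 The index into wholeString never moves backwards after a mismatch

      2 The partial match table records how the pattern overlaps with itself, so characters that have already been matched are not compared again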
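
    To make the two advantages concrete, here is a minimal sketch that counts character comparisons for a brute-force search and for a KMP search on the same input. The function names naive_count and kmp_count and the sample strings are illustrative assumptions, not part of the original post; the KMP count covers only the scan over the text, not building the table, and both functions assume a non-empty pattern.

```python
def naive_count(text, pattern):
    """Brute-force search; returns (match positions, character comparisons)."""
    positions, comparisons = [], 0
    for start in range(len(text) - len(pattern) + 1):
        k = 0
        while k < len(pattern):
            comparisons += 1
            if text[start + k] != pattern[k]:
                break
            k += 1
        if k == len(pattern):
            positions.append(start)
    return positions, comparisons


def kmp_count(text, pattern):
    """KMP search; returns (match positions, character comparisons)."""
    # table[k] = length of the longest proper prefix of pattern[:k+1]
    # that is also a suffix of it.
    table = [0] * len(pattern)
    for k in range(1, len(pattern)):
        j = table[k - 1]
        while j > 0 and pattern[j] != pattern[k]:
            j = table[j - 1]
        table[k] = j + 1 if pattern[j] == pattern[k] else j

    positions, comparisons, j = [], 0, 0
    for i, ch in enumerate(text):
        while True:
            comparisons += 1
            if ch == pattern[j]:
                j += 1
                break
            if j == 0:
                break
            j = table[j - 1]  # fall back in the pattern; i never moves back
        if j == len(pattern):
            positions.append(i - j + 1)
            j = table[j - 1]  # keep going, allowing overlapping matches
    return positions, comparisons


text = "A" * 30 + "B"   # hypothetical worst-case-style input
pattern = "AAAAB"
naive_pos, naive_cmp = naive_count(text, pattern)
kmp_pos, kmp_cmp = kmp_count(text, pattern)
print(naive_pos, naive_cmp)
print(kmp_pos, kmp_cmp)
```

    On inputs like this one, where the pattern almost matches at every position, the brute-force version re-reads the same text characters many times, while the KMP scan needs at most about two comparisons per text character.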

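    Steps of the KMP algorithm:

      1 Compare the first character of Pattern against wholeString; while they differ, advance wholeString by one

      2 Once the first character matches, advance wholeString and Pattern together until a position fails to match

      3 On a mismatch, use the partial match table to slide Pattern forward (its index falls back); if the index falls back to the start of Pattern and still does not match, return to step 1

      4 Whenever Pattern is matched completely, add (current position - length of Pattern) to the result

      5 Continue searching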
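
    Step 3 depends on the partial match table, so a small worked example may help. The sketch below computes the table straight from its definition (longest proper prefix that is also a suffix) for the pattern 'ABCDABD', then derives how far the pattern slides after a mismatch. The name partial_table is an illustrative assumption, and the table here is unshifted, whereas the partial method of the first class in this post returns it shifted right by one.

```python
def partial_table(pattern):
    """table[k] = length of the longest proper prefix of pattern[:k+1]
    that is also a suffix of pattern[:k+1] (computed by brute force)."""
    table = []
    for end in range(1, len(pattern) + 1):
        prefix = pattern[:end]
        best = 0
        for length in range(1, end):  # proper prefixes only
            if prefix[:length] == prefix[-length:]:
                best = length
        table.append(best)
    return table


pattern = "ABCDABD"
table = partial_table(pattern)
print(table)  # [0, 0, 0, 0, 1, 2, 0]

# After matching the first 6 characters ("ABCDAB") and failing on the 7th,
# the pattern slides forward by: matched - table[matched - 1]
matched = 6
shift = matched - table[matched - 1]
print(shift)  # 4
```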

    Python implementation:

    class KMP(object):

        # partial match table
        def partial(self, pattern):
            """Calculate the shifted partial match ("next") table: String -> [Int]"""
            partialList = []
            for i in range(len(pattern)):
                p = pattern[:i+1]
                pre, last = len(p)-1, 1
                # Try candidate prefix/suffix pairs from longest to shortest
                while pre > 0 and p[:pre] != p[last:]:
                    pre -= 1
                    last += 1
                partialList.append(pre)
            # Shift right by one: entry j describes the j characters matched before P[j]
            return [0] + partialList[:-1]  # nextList

        def search(self, T, P):
            """
            KMP search main algorithm: String -> String -> [Int]
            Return all the matching positions of pattern string P in T
            (non-overlapping, since j restarts at 0 after each match)
            """
            ansList = []
            if not P:  # an empty pattern has no meaningful match positions
                return ansList
            partial = self.partial(P)
            print(partial)
            i, j = 0, 0  # index into T; index into P
            while i < len(T):
                if T[i] == P[j]:
                    # Both indices move forward
                    i += 1
                    j += 1
                    # Full match: i is now one past the end of the match
                    if j == len(P):
                        ansList.append(i - j)
                        j = 0
                elif j > 0:
                    # Mismatch after a partial match: fall back in P, keep i
                    j = partial[j]
                else:
                    # Mismatch at the first character of P: advance in T
                    i += 1
            print(ansList)
            return ansList

    Test cases

    s1 = 'BBCABCDABABCDABCDABDEABCDABD'
    p1 = 'ABCDABD'
    s2 = 'ABABDABACDABABCABAB'
    p2 = 'ABABCAB'
    KMP().search(s2, p2)
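
    One way to check a KMP implementation against these test cases is to compare its output with a simple reference built on Python's str.find. The helper name find_all is an illustrative assumption, not part of the original post, and s2 is taken here as the bare string ABABDABACDABABCABAB. The helper reports non-overlapping matches from left to right, which is what a KMP search that resets j to 0 after each hit should also return.

```python
def find_all(text, pattern):
    """Reference search built on str.find; reports non-overlapping matches,
    scanning left to right."""
    positions = []
    if not pattern:
        return positions
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        # Resume after the end of this match, so matches never overlap
        start = text.find(pattern, start + len(pattern))
    return positions


s1, p1 = 'BBCABCDABABCDABCDABDEABCDABD', 'ABCDABD'
s2, p2 = 'ABABDABACDABABCABAB', 'ABABCAB'
print(find_all(s1, p1))  # [13, 21]
print(find_all(s2, p2))  # [10]
```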

    Another, cleaner implementation comes from m00nlight's GitHub gist: https://gist.github.com/m00nlight/daa6786cc503fde12a77#file-gistfile1-py. The code is as follows:

    class KMP:
        def partial(self, pattern):
            """ Calculate partial match table: String -> [Int]"""
            ret = [0]
            
            for i in range(1, len(pattern)):
                j = ret[i - 1]
                while j > 0 and pattern[j] != pattern[i]:
                    j = ret[j - 1]
                ret.append(j + 1 if pattern[j] == pattern[i] else j)
            return ret
            
        def search(self, T, P):
            """ 
            KMP search main algorithm: String -> String -> [Int] 
            Return all the matching positions of pattern string P in T
            """
            partial, ret, j = self.partial(P), [], 0
            
            for i in range(len(T)):
                while j > 0 and T[i] != P[j]:
                    j = partial[j - 1]
                if T[i] == P[j]: j += 1
                if j == len(P):
                    ret.append(i - (j - 1))
                    j = 0  # restart from scratch, so overlapping matches are not reported
                
            return ret
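
    Because the version above sets j back to 0 after a full match, it skips matches that overlap an earlier one: for T = 'AAAA' and P = 'AA' it returns [0, 2]. The sketch below is a variation, not the gist author's code, that falls back through the table after a match so overlapping occurrences are reported too. The function name kmp_search_all is an illustrative assumption.

```python
def kmp_search_all(text, pattern):
    """Return the start index of every match, including overlapping ones."""
    if not pattern:
        return []
    # Prefix table, built the same way as the partial method above.
    table = [0]
    for i in range(1, len(pattern)):
        j = table[i - 1]
        while j > 0 and pattern[j] != pattern[i]:
            j = table[j - 1]
        table.append(j + 1 if pattern[j] == pattern[i] else j)

    matches, j = [], 0
    for i, ch in enumerate(text):
        while j > 0 and ch != pattern[j]:
            j = table[j - 1]
        if ch == pattern[j]:
            j += 1
        if j == len(pattern):
            matches.append(i - j + 1)
            j = table[j - 1]  # fall back instead of resetting to 0
    return matches


print(kmp_search_all('AAAA', 'AA'))      # [0, 1, 2]
print(kmp_search_all('ABABABA', 'ABA'))  # [0, 2, 4]
```

    Whether overlapping matches should be reported depends on the use case; resetting j to 0 mirrors what a repeated str.find with a jump past each hit would return.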
  • Original article: https://www.cnblogs.com/fuzzier/p/9182568.html