  • KMP string pattern matching

    The function signature used here comes from LeetCode. Details can be found in the LeetCode problem "Implement strStr()".

    The explanation is given in the code comments, which are best read alongside the code they describe.

    #include <cstring>	// strlen() and NULL, used by strStr() below

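    // next[j]: the position in pattern to compare next when a mismatch is detected at pattern[j]:
    // the largest valid position below j whose character differs from pattern[j], or -1 if there is none.
    // Here, a valid position means "pattern[0, ..., next[j]-1]" still matches the text that ends just before the mismatch.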
    
    void getNext(char *pattern, int next[]){
    	int i = 0, j = -1;
    	
    	// If "text[i]" fails to match "pattern[0]", then we need to 
    	// check "text[i+1]" against "pattern[0]", which is the same as saying
    	// "text[i]" is compared with "pattern[-1]" before both indices advance.
    	next[i] = j;	 
    	
    	// while loop 1:
    	while(pattern[i] != '\0'){
    			// while loop 2:
    			while(j >= 0 && pattern[i] != pattern[j]){
    				// First, j needs to be a valid index, so it must not be less than 0.
    				// Then, if "pattern[i]" fails to match "pattern[j]", we can treat this as
    				// "text[i]" failing to match "pattern[j]", with the pattern itself playing the role of the text.
    				// So we need to check whether "text[i]" matches "pattern[next[j]]", as next[j]
    				// is the position to check after a failure at position j.
    				j = next[j];				
    			}
    			
    			// After the above while loop, either j is -1 or "text[i-j, ..., i]" matches 
    			// "pattern[0, ..., j]", so we can move one more step for both "text" and "pattern".
    			++i; ++j;
    			
    			// For the new i, marked as i_new, we can determine its "next value" now!!
    			// We know that "text[i_new - j_new, ..., i_new - 1]" matches "pattern[0, ..., j_new - 1]", so
    			// if we fail to match at position "text[i_new]", we can move the pattern to the j_new position and
    			// check whether "text[i_new]" matches "pattern[j_new]".
    			// P.S.:
    			// j_new is also the best such position, i.e. the largest valid one (valid 
    			// means "pattern[0, ..., j'-1]" is matched with "text"). If some valid position j' were larger
    			// than j_new, then "j' - 1" would be a larger valid position that the while loop above passed over, 
    			// which contradicts the definition of the "next" table.
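    			// If "pattern[i]" equals "pattern[j]", falling back to j after a mismatch at i would
    			// compare the same character and fail again, so fall back to next[j] directly.
    			if(pattern[i] == pattern[j])
    				next[i] = next[j];
    			else
    				next[i] = j;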
    	}	
    }
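
    To make the effect of the final if/else in getNext() concrete, here is a minimal sketch that builds the table with and without that shortcut. The name buildNext and the optimized flag are assumptions made for this illustration and are not part of the original listing; the loop itself mirrors getNext(), using std::string and std::vector instead of raw arrays.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not part of the original listing): builds the table
// with the same loop as getNext(). When `optimized` is false, the final
// equal-character shortcut is skipped and the plain fallback position is stored.
std::vector<int> buildNext(const std::string &pattern, bool optimized) {
    const int n = static_cast<int>(pattern.size());
    std::vector<int> next(n + 1);   // one extra slot for the end-of-pattern position
    int i = 0, j = -1;
    next[0] = -1;                   // mismatch at pattern[0]: advance the text
    while (i < n) {
        // Follow the table until pattern[i] matches pattern[j] or j becomes -1.
        while (j >= 0 && pattern[i] != pattern[j])
            j = next[j];
        ++i; ++j;
        // Shortcut: if the fallback character equals the current one,
        // it would fail in the same way, so reuse its fallback instead.
        if (optimized && i < n && pattern[i] == pattern[j])
            next[i] = next[j];
        else
            next[i] = j;
    }
    return next;
}
```

    For the pattern "abab", the plain table is {-1, 0, 0, 1, 2} and the table with the shortcut is {-1, 0, -1, 0, 2}: positions 2 and 3 hold the same characters as positions 0 and 1, so a mismatch there falls back one step further.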
    
    
    char *strStr(char *text, char *pattern){
    	if(NULL == text || NULL == pattern)
    		return NULL;
    	if('\0' == pattern[0])
    		return text;
    	
    	// i is the index into text, j is the index into pattern.
    	int i = 0, j = 0;
    	char *pos = NULL;
    	int *next = new int[strlen(pattern) + 1];	// include the '\0'
    	
    	getNext(pattern, next);
    	
    	while(text[i] != '\0'){
    		// Same fallback as in getNext(): 
    		// if we fail at one position, we may also fail at the 
    		// position "next" sends us to, so we keep following the "next" table
    		// until the characters match or j becomes invalid (-1).
    		while(j >= 0 && text[i] != pattern[j])
    			j = next[j];
    		
    		// After the while loop, either j is -1 or "text[i-j, ..., i]" matches "pattern[0, ..., j]".
    		// So we move one more step for both "text" and "pattern".
    		++i; ++j;
    		
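    		if(pattern[j] == '\0'){
    			pos = (text + i) - j;	// The start of the match in text: i and j have advanced together since the match began.
    			delete[] next;	// Release the table before returning.
    			return pos;
    		}		
    	}
    	
    	delete[] next;	// No match found: release the table, pos is still NULL.
    	return pos;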
    	
    }
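
    As a quick way to experiment with the search loop, the following is a minimal sketch of the same idea using std::string and returning an index instead of a pointer. The names buildNext and kmpFind are assumptions made for this example, not functions from LeetCode or from the listing above, and -1 is chosen here as the "not found" value.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: the same table as getNext(), stored in a vector
// so that no manual delete[] is needed.
std::vector<int> buildNext(const std::string &pattern) {
    const int n = static_cast<int>(pattern.size());
    std::vector<int> next(n + 1);
    int i = 0, j = -1;
    next[0] = -1;
    while (i < n) {
        while (j >= 0 && pattern[i] != pattern[j])
            j = next[j];
        ++i; ++j;
        next[i] = (i < n && pattern[i] == pattern[j]) ? next[j] : j;
    }
    return next;
}

// Hypothetical wrapper: index of the first occurrence of pattern in text,
// or -1 if there is none. An empty pattern matches at index 0, as in strStr().
int kmpFind(const std::string &text, const std::string &pattern) {
    if (pattern.empty())
        return 0;
    const std::vector<int> next = buildNext(pattern);
    const int n = static_cast<int>(text.size());
    const int m = static_cast<int>(pattern.size());
    int j = 0;                       // index into pattern
    for (int i = 0; i < n; ++i) {    // index into text, never moves backwards
        while (j >= 0 && text[i] != pattern[j])
            j = next[j];
        ++j;
        if (j == m)                  // whole pattern matched, ending at text[i]
            return i + 1 - j;
    }
    return -1;
}
```

    With this sketch, kmpFind("abababc", "ababc") returns 2: the mismatch at text[4] falls back to pattern position 2 rather than restarting, so the text index never moves backwards.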
    
  • Original article: https://www.cnblogs.com/kid551/p/4370387.html