zoukankan      html  css  js  c++  java
  • Minimum Window Substring @LeetCode

    不好做的一道题,发现String Algorithm可以出很多很难的题,特别是多指针,DP,数学推导的题。参考了许多资料:

    http://leetcode.com/2010/11/finding-minimum-window-in-s-which.html

    http://www.geeksforgeeks.org/find-the-smallest-window-in-a-string-containing-all-characters-of-another-string/

    http://tianrunhe.wordpress.com/2013/03/23/minimum-window-substring/


    最有用的最后一个资料,因为里面有一个例子详细说明了如何变化,我加上了一些中文备注:

    For example,
    S = “ADOBECODEBANC”
    T = “ABC”
    Minimum window is “BANC”.

    Thoughts:
    The idea is from here. I try to rephrase it a little bit here. The general idea is that we find a window first, not necessarily the minimum, but it’s the first one we could find, traveling from the beginning of S. We could easily do this by keeping a count of the target characters we have found. After finding an candidate solution, we try to optimize it. We do this by going forward in S and trying to see if we could replace the first character of our candidate. If we find one, we then find a new candidate and we update our knowledge about the minimum. We keep doing this until we reach the end of S. For the giving example:

    1. We start with our very first window: “ADOBEC”, windowSize = 6. We now have “A”:1, “B”:1, “C”:1 (保存在needToFind数组里)
    2. We skip the following character “ODE” since none of them is in our target T. We then see another “B” so we update “B”:2. Our candidate solution starts with an “A” so getting another “B” cannot make us a “trade”. (体现在代码就是只有满足hasFound[S.charAt(start)] > needToFind[S.charAt(start)]) 才能移动左指针start)
    3. We then see another “A” so we update “A”:2. Now we have two “A”s and we know we only need 1. If we keep the new position of this “A” and disregard the old one, we could move forward of our starting position of window. We move from A->D->O->B. Can we keep moving? Yes, since we know we have 2 “B”s so we can also disregard this one. So keep moving until we hit “C”: we only have 1 “C” so we have to stop. Therefore, we have a new candidate solution, “CODEBA”. Our new map is updated to “A”:1, “B”:1, “C”:1.
    4. We skip the next “N” (这里忽略所有不在T的字符:用needToFind[S.charAt(start)] == 0来判断) and we arrive at “C”. Now we have two “C”s so we can move forward the starting position of last candidate: we move along this path C->O->D->E until we hit “B”. We only have one “B” so we have to stop. We have yet another new candidate, “BANC”.
    5. We have hit the end of S so we just output our best candidate, which is “BANC”.



    package Level4;
    
    /**
     * Minimum Window Substring
     * 
     * Given a string S and a string T, find the minimum window in S which will contain all the characters in T in complexity O(n).
    
    For example,
    S = "ADOBECODEBANC"
    T = "ABC"
    Minimum window is "BANC".
    
    Note:
    If there is no such window in S that covers all characters in T, return the emtpy string "".
    
    If there are multiple such windows, you are guaranteed that there will always be only one unique minimum window in S.
    
    Discuss
    
    
     *
     */
    public class S76 {
    
    	public static void main(String[] args) {
    	}
    	
    	public String minWindow(String S, String T) {
    		// 因为处理的是字符,所以可以利用ASCII字符来保存
    		int[] needToFind = new int[256];	// 保存T中需要查找字符的个数,该数组一旦初始化完毕就不再改动
    		int[] hasFound = new int[256];		// 保存S中已经找到字符的个数,该数组会动态变化
    		
    		for(int i=0; i<T.length(); i++){	// 初始化needToFind为需要查找字符的个数,
    			needToFind[T.charAt(i)]++;	// 如例子中T为ABC,则将会被初始化为:needToFind[65]=1, nTF[66]=2, nTF[67]=3
    		}
    		
    		int count = 0;		// 用于检测第一个符合T的S的字串
    		int minWindowSize = Integer.MAX_VALUE;	// 最小窗口大小
    		int start = 0, end = 0;		// 窗口的开始喝结束指针
    		String window = "";			// 最小窗口对应的字串
    		
    		for(; end<S.length(); end++){		// 用end来遍历S字符串
    			if(needToFind[S.charAt(end)] == 0){	// 表示可以忽略的字符,即除了T(ABC)外的所有字符
    				continue;
    			}
    			char c = S.charAt(end);
    			hasFound[c]++;		// 找到一个需要找的字符
    			
    			if(hasFound[c] <= needToFind[c]){	// 如果找到的已经超过了需要的,就没必要继续增加count
    				count++;
    			}
    			if(count == T.length()){	// 该窗口中至少包含了T
    				while(needToFind[S.charAt(start)] == 0 || 	// 压缩窗口,往后移start指针,一种情况是start指针指的都是可忽略的字符
    						hasFound[S.charAt(start)] > needToFind[S.charAt(start)]){	// 另一种情况是已经找到字符的个数超过了需要找的个数,因此可以舍弃掉多余的部分
    					if(hasFound[S.charAt(start)] > needToFind[S.charAt(start)]){
    						hasFound[S.charAt(start)]--;		// 舍弃掉多余的部分
    					}
    					start++;	// 压缩窗口
    				}
    				
    				if(end-start+1 < minWindowSize){		// 保存最小窗口
    					minWindowSize = end-start+1;
    					window = S.substring(start, end+1);
    				}
    			}
    		}
    		return window;
    	}
    }
    


  • 相关阅读:
    HDU5470 Typewriter SAM 动态规划 单调队列
    BZOJ4556 [Tjoi2016&Heoi2016]字符串 SA ST表 二分答案 主席树
    Codeforces 235C Cyclical Quest 字符串 SAM KMP
    HDU4622 Reincarnation 字符串 SAM
    Codeforces 452E Three strings 字符串 SAM
    BZOJ3926 [Zjoi2015]诸神眷顾的幻想乡 字符串 SAM
    2018牛客网暑假ACM多校训练赛(第三场)I Expected Size of Random Convex Hull 计算几何,凸包,其他
    2018牛客网暑假ACM多校训练赛(第三场)G Coloring Tree 计数,bfs
    2018牛客网暑假ACM多校训练赛(第三场)D Encrypted String Matching 多项式 FFT
    UOJ#310 【UNR #2】黎明前的巧克力 FWT 多项式
  • 原文地址:https://www.cnblogs.com/riasky/p/3478565.html
Copyright © 2011-2022 走看看