滑动窗口(Sliding Window)问题经常使用快慢指针(slow, fast pointer)
[0, slow) 的区域为滑动窗口已经探索过的区域
[slow, fast]的区域为滑动窗口正在探索的区域
(fast, end of array)为待探索的区域
Sliding Window的问题主要分为:
fixed size sliding window 和 dynamic size sliding window
fixed size sliding window: 当快指针增加的时候慢指针必须增加
non-fixed size sliding window: 快指针增加,慢指针不一定变化
使用滑动窗口可以线性时间解决问题而且可以减少空间消耗
Fixed Length Sliding Window:
1.Strstr:
Return the index of the first occurrence of needle in haystack, or -1 if needle is not part of haystack.
Input: haystack = "hello", needle = "ll"
Output: 2
要求找到短字符串在的起始位置在长字符串中的位置
所以只需要保持一个fixed sliding window的长度为短字符串的长度然后扫长字符串来寻找起始位置
class Solution{
public int strStr(String long, String short) {
//sanity check
if(long == null || short == null) return -1;
int i = 0;
int j = needle.length();
while(i <= haystack.length() - needle.length() && j <= haystack.length()) {
if(haystack.substring(i, j).equals(needle)) {
return i;
}
i++;
j++;
}
return -1;
}
}
2.Repeated DNA Sequennce
All DNA is composed of a series of nucleotides abbreviated as A, C, G, and T, for example: "ACGAATTCCG". When studying DNA, it is sometimes useful to identify repeated sequences within the DNA.
Write a function to find all the 10-letter-long sequences (substrings) that occur more than once in a DNA molecule.
Given s = "AAAAACCCCCAAAAACCCCCCAAAAAGGGTTT",
Return:
["AAAAACCCCC", "CCCCCAAAAA"]
这道题给一个碱基序列,要求我们返回在given的碱基序列中重复的碱基序列
所以这道题我们可以用一个定长的滑动窗口,每次去match在given的碱基序列中任意的position从而返回所用出现过的重复的碱基序列,可以用一个HashSet的数据结构来判断是否已经检查过已经出现的序列
class Solution{
public List<String> repeatedDNASequence(String s) {
HashSet<String> window = new HashSet<String>();
HashSet<String> repeated = new HashSet<String>();
for(int i = 0; i < s.length() - 9; i++) {
if(!window.add(s.substring(i, i + 10))) {
repeated.add(s.substring(i, i + 10));
}
}
return new ArrayList<String>(repeated);
}
}
Non-fixed Size Sliding-Window
3.find all anagrams of shortString in longString
Given a string s and a non-empty string p, find all the start indices of p's anagrams in s.Strings consists of lowercase English letters only and the length of both strings s and p will not be larger than 20,100.The order of output does not matter.
Example 1:
Input:s: "cbaebabacd" p: "abc"
Output:[0, 6]
Explanation:
The substring with start index = 0 is "cba", which is an anagram of "abc".
The substring with start index = 6 is "bac", which is an anagram of "abc".
Example 2:
Input: s: "abab" p: "ab"
Output: [0, 1, 2]
Explanation:
The substring with start index = 0 is "ab", which is an anagram of "ab".
The substring with start index = 1 is "ba", which is an anagram of "ab".
The substring with start index = 2 is "ab", which is an anagram of "ab".
这道题是寻找input长字符串中所有出现子串的起始字母在长字符串中的位置
因为我们需要找到长字符串中所有match子串的字符串并且返回需要match的字串中第一个字母在长字符串中的位置,所以需要用一个动态的滑动窗口慢指针在match的子字符串的第一个字母在长字符串中的位置,快指针在最后一个match的字母在长字符串中的位置, 然后需要一个hashmap来记录每个字母出现的频率,利用length来teminate
class Solution{
public List<Integer> findAnagrams(String s, String p) {
//sanity check
List<Integer> res = new ArrayList<Integer>();
//count the frequency of each appeared character
Map<Character, Integer> map = new HashMap<Character, Integer>();
for(char c : p.toCharArray()) {
map.put(c, map.getOrDefault(0, c) + 1);
}
int fast = 0;
int slow = 0;
int count = map.size();//记录所有出现过字符的频率
//update fast pointer
while(fast < s.length()) {
char c = s.charAt(fast);
if(map.containsKey(s.charAt(fast)) {
map.put(c, map.get(fast) - 1);
if(map.get(c) == 0) count--;
}
fast++;
//update slow pointer
while(count == 0) {
char temp = s.charAt(slow);
if(map.containsKey(temp)) {
map.put(temp, map.get(temp) + 1));
if(map.get(temp) > 0) count++;
}
//store res;
if(fast - slow == p.length()) {
res.add(slow);
}
slow++;
}
}
return res;
}
}
4.Maximum Value of size K subarray
Given an array nums, there is a sliding window of size k which is moving from the very left of the array to the very right. You can only see the k numbers in the window. Each time the sliding window moves right by one position.
这题要求找到given数组中任意定长的滑动窗口中数的最大值,因此需要考虑一个数据结构可以在移动的滑动窗口中找到最大值,因此有几种想法:
1.在定长的滑动窗口里维持一个最大堆,因此我们可以用constant时间去找到最大值,但是考虑到每次heapify的时间需要O(logn),所以找到k个最大值需要花费O(klogn)的时间
2.还是同样在定长的滑动窗口里维持一个treeset,但是考虑到每次在treeset中添加或者删除元素需要花费O(logn)的时间,所以是否存在一个数据结构可以在线性时间内得到定长滑动窗口里的最大值?
3.因而,想到了双端队列(Deque),可以维持一个递增的双端队列
EX:[|1, 4|, 5, 3, 9], k = 3
我们先将k-1个元素放入队列:|2|
然后从第k个元素开始,一次加入新元素并删除旧元素,并且保持滑动窗口的size不变
[|1, 4, 5|, 3, 9], Deque: 5, Output: [5];
[1, |4, 5, 3|, 9], Deque: 5, 5, Output: [5, 5];
[1, 4, |5, 3, 9|], Deque: 8, Output: [5, 5, 8];
因为对于每个数组中的元素只扫描一次,所以总体时间在deque操作中也近似于线性,所以总运行时间:O(n)(amortized), 空间复杂度:O(1)
class slidingWindowMax{
public void inQueue(Deque<Integer> deque, int k) {
while(!deque.isEmpty() && deque.peekLast() < k) {
deque.pollLast();
}
deque.offerLast(num);
}
public void outQueue(Deque<Integer> deque, int k) {
if(deque.peekFirst() == k) {
deque.pollFirst();
}
}
public int[] maxSlidingWindow(int[] nums, int k) {
List<Integer> ans = new ArrayList<Integer>();
Deque<Integer> deque = new ArrayDeque<Integer>();
if(nums == null || nums.length == 0) {
return new int[]{};
}
for(int i = 0; i < k - 1; i++) {
inQueue(deque, nums[i]);
}
for(int i = k - 1; i < nums.length; i++) {
inQueue(deque, nums[i]);
res.add(deque.peekFirst());
outQueue(deque, nums[i - k + 1]);//delete old element
}
int[] res = new int[ans.size()];
int h = 0;
for(int num : res) {
res[h++] = num;
}
return res;
}
}