With respect to a given puzzle
string, a word
is valid if both the following conditions are satisfied:
word
contains the first letter ofpuzzle
.- For each letter in
word
, that letter is inpuzzle
.
For example, if the puzzle is "abcdefg", then valid words are "faced", "cabbage", and "baggage"; while invalid words are "beefed" (doesn't include "a") and "based" (includes "s" which isn't in the puzzle).
Return an array answer
, where answer[i]
is the number of words in the given word list words
that are valid with respect to the puzzle puzzles[i]
.
Example :
Input:
words = ["aaaa","asas","able","ability","actt","actor","access"],
puzzles = ["aboveyz","abrodyz","abslute","absoryz","actresz","gaswxyz"]
Output: [1,1,3,2,4,0]
Explanation:
1 valid word for "aboveyz" : "aaaa"
1 valid word for "abrodyz" : "aaaa"
3 valid words for "abslute" : "aaaa", "asas", "able"
2 valid words for "absoryz" : "aaaa", "asas"
4 valid words for "actresz" : "aaaa", "asas", "actt", "access"
There're no valid words for "gaswxyz" cause none of the words in the list contains letter 'g'.
Constraints:
1 <= words.length <= 10^5
4 <= words[i].length <= 50
1 <= puzzles.length <= 10^4
puzzles[i].length == 7
words[i][j]
,puzzles[i][j]
are English lowercase letters.- Each
puzzles[i]
doesn't contain repeated characters.
这道题说对于一个 puzzle 字符串,当满足两个条件就表示某个 word 是合法的。第一个是当 word 包含 puzzle 的首字母,第二个是对于 word 中的所有字母,均在 puzzle 中出现(这里不考虑次数,只考虑字母种类)。现在给了一个单词数组,和一个谜语数组,问对于每个 puzzle,各有多少个单词是合法的。这道题的题目挺长的,但是好在给的例子有详细的解释,题意并不难理解。大家能想到的最简单暴力的破解方法就是对于每个 puzzle,都遍历一遍所有的单词,去验证给的两个条件是否成立。但这种方法根本对不住本题的 Hard 身价,会被 OJ 无情拒绝,所以必须想更高效巧妙的方法。先从两个条件入手,第一个需要包含 puzzle 的首字母,这个没啥说的,很好满足,主要是第二个条件,word 中所有字母都需要在 puzzle 中出现,暴力搜索中的遍历会很耗时。因为这里不考虑出现次数,所以不用建立次数映射,而只是考虑字母种类的话,就可以用位操作 Bit Manipulation 来做,因为只有 26 个字母,若用每一个位来表示对应的字母,1表示存在,0表示不存在,那么最多只需要大小为 2^26 - 1 的整型数就能表示所有的情况。这样每个单词都可以用一个 mask 来表示了,注意不同的单词可能会有相同的 mask,比如 apple 和 pale,但不影响做题,因为字母的次数和顺序都无关紧要。由于这里关心的是相同 mask 的个数,则可以用个 HashMap 来建立 mask 和其出现次数之间的映射,方便之后的查询。
接下里就要开始处理 puzzles 数组了,遍历每个 puzzle 字符串,同样需要统计其字母种类,即算出 mask。跟 puzzle 的 mask 相同的单词肯定都是满足题意的,但不仅仅是相同的,还有 mask 的严格子集也是满足题意的,比如 puzzle 是 apple 的话,那么 word 可以是 apple,也可以是 apple 的子集,比如 app,ap,al 等等,只要包含首字母就可以了。那么如何求子集的 mask 呢,由于子集是减少字母的,所以 mask 值一定比原来的小,所以可以每次减1,但由于只能将原来1的位置变为0,而不能把0变为1,所以还需要 '与' 上原来的 mask。这样就可以使用一个 while 循环,若当前 sub(即为 mask)'与' 上 first 还是 first,且 sub 在 maskMap 中存在,则将其的映射值累加到 cnt。若此时 sub 为0,则 break 掉循环,否则 sub 自减1并 '与' 上原 mask 即可,参见代码如下:
class Solution {
public:
vector<int> findNumOfValidWords(vector<string>& words, vector<string>& puzzles) {
vector<int> res;
unordered_map<int, int> maskMap;
for (string word : words) {
int mask = 0;
for (char c : word) {
mask |= 1 << (c - 'a');
}
++maskMap[mask];
}
for (string puzzle : puzzles) {
int mask = 0;
for (char c : puzzle) {
mask |= 1 << (c - 'a');
}
int sub = mask, cnt = 0, first = 1 << (puzzle[0] - 'a');
while (true) {
if ((sub & first) == first && maskMap.count(sub)) {
cnt += maskMap[sub];
}
if (sub == 0) break;
sub = (sub - 1) & mask;
}
res.push_back(cnt);
}
return res;
}
};
Github 同步地址:
https://github.com/grandyang/leetcode/issues/1178
参考资料:
https://leetcode.com/problems/number-of-valid-words-for-each-puzzle/