Let's define (pi) function to descripe the longest common prefix and suffix ( also called "border" ) of substring (s_i) :
We appoint that (pi^{(t)}(s)) means (pi(s_{pi^{(t-1)}(s)})), and (pi^{(1)}(s)=pi(s)). Then define
Obviously, (pi(s_i)=pi^{(eta)}(s_{i-1})+1),so we can get (pi(s_{1sim |s|})) in (mathcal{O}(|s|)) time to solve the problem.
Of course, if we can't find a positive integer (eta) satisfying the condition, let (pi(s_i)=0) : )
#include <bits/stdc++.h>
using namespace std;
const int N = 1e6 + 20;
int n, m, Fail[N]; char s[N], t[N];
int main(void) {
scanf( "%s", s + 1 ), n = strlen( s + 1 );
scanf( "%s", t + 1 ), m = strlen( t + 1 );
for (int i = 2, j = 0; i <= m; i++) {
while ( j && t[j+1] != t[i] ) j = Fail[j];
j += t[j+1] == t[i], Fail[i] = j;
}
for (int i = 1, j = 0; i <= n; i++) {
while ( j && ( t[j+1] != s[i] || j == m ) ) j = Fail[j];
j += t[j+1] == s[i];
if ( j == m ) printf( "%d
", i - m + 1 );
}
for (int i = 1; i <= m; i++)
printf( "%d%c", Fail[i], "
"[ i == m ] );
return 0;
}
A deterministic finite automaton (mathcal{T}) called Knuth–Morris–Pratt Automaton when (mathcal{T}=(Q,alpha,delta,q_0,F)) refers to :
- (|Q|=|s|+1,alpha = { exttt{a}, exttt{b}, exttt{c},cdots})
- (delta(p,c)=pi(s_p+c))
- (q_0=0,|F|=1,|s|in F,mathcal{T}(s)= ext{true})
Actually, Knuth–Morris–Pratt Automaton is a simple automaton (mathcal{T}') which only accepts string (s) extended several transition accroding to our (pi) function. So we intend to construct the automaton by using the transition method of (pi(s_i)).
Let's assume for a case where we have already constructed the automaton of (s_i),then we want to extend a character (c). How the algorithm works ? Firstly, we can get the length of the border of (s_i) ( (=pi(s_i)) ) . So it's not hard to find that the state (delta(pi(s_i),c)) is the exact state which we are finding. That is (delta(s_i,c)gets delta(pi(s_i),c)).
Specially, let (delta(s_i,s(i+1))gets i+1) in order to recognize the whole string (s). To sum up, we construct the automaton in (mathcal O(|s| imes |alpha|)) time.
inline void Build(char *s) {
int len = strlen( s + 1 );
trans[0][s[1]-'a'] = 1;
for (int i = 1, j = 0; i < len; i++) {
for (int k = 0; k < 26; k++) trans[i][k] = trans[j][k];
trans[i][s[i+1]-'a'] = i+1, j = trans[j][s[i+1]-'a'];
}
}