zoukankan      html  css  js  c++  java
  • CodeForces 1098F. Ж-function

    题目简述:给定字符串$s[1 dots n](n leq 2 imes 10^5)$,以及$Q leq 2 imes 10^5$个询问,每个询问有两个参数$1 leq l leq r leq n$,求

    $$ sum_{i=l}^r operatorname{lcp}(s[l dots r], s[i dots r]), $$

    其中$operatorname{lcp}(s, t)$表示字符串$s$和$t$的最长公共前缀(Longest Common Prefix)的长度。

    解:

    code

    后缀自动机(Suffix Automaton)与后缀树(Suffix Tree)

    后缀自动机学习资料:

    0. 论文:E. Ukkonen. On-line construction of suffix trees. Algorithmica, 14(3): 249-260 (1995)

    1. CSDN

    2. SAISUMIT

    3. CodeForces

    后缀树学习资料:

    1. cnblogs

    为了方便起见,我们在这里再次简要申明后缀自动机和后缀树的定义,以防止混淆各种不同定义的细微差别。

    后缀自动机

    一个(确定)有限状态自动机((Deterministic) Finite-State Automaton)$A = (Q, Sigma, delta, q_0, F)$,其中

      ·$Q$是一个有限的状态集合。

      ·$Sigma$是一个有限的字符集。

      ·$delta: Q imes Sigma o Q$是转移函数。若$delta(p,a)=q$,则可看作是$p$至$q$有一条权值为$a$的边。

      ·$q_0 in Q$是初始状态。

      ·$F subseteq Q$是接受状态的集合。

    一个非空字符串$s$的后缀自动机$ ext{SA}(s)$是一个有限状态自动机$A = (Q, Sigma, delta, q_0, F)$,其中

      ·$Q$和$delta$构成了一个连通的有向无环图(Directed Acyclic Graph)。

      ·$F = { f_1, f_2, dots, f_{|s|} }$,其中$f_{k}$对应后缀$s[kdots |s|]$。若对$q in Q$,定义$ ext{str}(q) = { t in Sigma^*: delta(q_0, t) = q } $,则$ ext{str}(f_k) supseteq { s[k dots |s|] }$。

    这样,被$ ext{SA}(s)$接受的字符串即为$s$的后缀,严格的说,$mathcal{L}( ext{SA}(s)) = ext{suffix}(s)$,其中$ ext{suffix}(s) = { s[k dots |s|]: 1 leq k leq |s| }$。

    状态$q$的代表字符串是$ ext{str}(q)$中的最长串,为方便起见仍然记作$q$。对任意$q in Q$,存在$1 leq k leq |q|$,使得$ ext{str}(q) = { q[idots |q|]: 1 leq i leq k }$。定义状态$q$的后缀链接(或者称为失败指针)$ ext{link}(q)$为$|p|$最长的$p$,使得$p in ext{suffix}(q)$且$p otin ext{str}(q)$,即$ ext{link}(q) = q[k+1 dots |q|]$。注意到$|q| > | ext{link}(q)|$,从而$Q$和$ ext{link}$构成了一棵有根树,树根为$q_0$,记这棵树为$T( ext{SA}(s))$。

    定理:存在$O(n)$算法,对给定字符串$s$,构造其后缀自动机$ ext{SA}(s)$。

    后缀树

    一个非空字符串$s$的后缀树$ ext{ST}(s)$是一棵(最简)字典树(Trie),这棵字典树的单词表为$ ext{suffix}(s)$,即$s$的所有后缀。$ ext{ST}(s)$的每条边上都有$s$的某个子串作为其权值,从而$ ext{ST}(s)$每个节点$v$都对应一个$s$的子串(即从根到这个节点路径上所有边权的连接),记作$ ext{str}(v)$。每个$s$的后缀$s[k dots |s|]$都对应$ ext{ST}(s)$的一个叶子节点。而每个节点$v$的父节点$w$满足$ ext{str}(w)$是$ ext{str}(v)$的前缀,且$ ext{str}(w)$长度最长。

    定理:$T( ext{SA}(s)) = ext{ST}( ext{rev}(s))$,其中$ ext{rev}(s)$表示$s$的翻转字符串。

    因此,可以通过先构造字符串$ ext{rev}(s)$的后缀自动机,来构造$s$的后缀树。(这是因为代码量较短)

    问题转化

    定义$ ext{ST}(s)$中,节点$v$的深度为$ ext{depth}(v) = | ext{str}(v)|$。对$1 leq l leq |s| = n$,我们把$s[ldots |s|]$对应的后缀树节点记作$l$,即$ ext{str}(l) = s[l dots n]$。令$L = r-l+1$,则

    $$
    egin{aligned}
    sum_{i=l}^r operatorname{lcp}(s[l dots r], s[i dots r])
    & = sum_{k=1}^{L} operatorname{lcp}(s[l dots r], s[l+k-1 dots r]) \
    & = sum_{k=1}^{L} min{ ext{depth}( ext{LCA}(l, l+k-1)), L-k+1 },
    end{aligned}
    $$

    其中$ ext{LCA}(u, v)$表示节点$u$和$v$的最近公共祖先(Least/Lowest Common Ancestor)。令$v(k)$表示节点$v$的祖先中,其父节点深度$< k$的最近祖先,即:若$u = v(k)$,则$ ext{depth}(u) geq k$且$ ext{depth}( ext{father}(u)) < k$。令$ ext{leaf}(v) subseteq [n]$表示节点$v$后代中的所有叶子节点,其中$[n] = {1,2,dots,n}$。从而,

    $$
    egin{aligned}
    & = sum_{k=1}^L left| ext{leaf}(l(k)) cap [l, l+L-k] ight| \
    & = sum_{k=1}^L left| ext{leaf}(l(k)) cap [l+L-k] ight| - sum_{k=1}^L left| ext{leaf}(l(k)) cap [l-1] ight| \
    & = S_2(L, l, l+L) - S_1(L, l, l-1)
    end{aligned}
    $$

    其中$S_1(L, l, M) = sum_{k=1}^L left| ext{leaf}(l(k)) cap [M] ight|$,$S_2(L, l, M) = sum_{k=1}^L left| ext{leaf}(l(k)) cap [M-k] ight|$。

    树链剖分

    令$ ext{son}(v)$表示节点$v$的所有儿子节点,$ ext{size}(v)$表示以节点$v$为根的子树的大小。定义$ ext{heavy}(v) = arg max_{u in ext{son}(v)} ext{size}(u)$表示节点$v$具有最大子树的儿子,他们之间的边称为重边,而与其余儿子之间的边称为轻边。

    设根节点为$r$,对任意节点$v$,节点$v$到根节点$r$的路径$P = ext{path}(v o r)$可划分成若干段子路径$P_1, P_2, dots, P_m$,其中$P_i = ext{path}(x_i o y_i)$,满足$x_1 = v$,$y_m = r$且$ ext{father}(y_{i}) = x_{i+1}$,每段子路径$P_i$都仅由重边组成。我们称$P_1, P_2, dots, P_m$是路径$P$的剖分。我们把$P_i = (mathit{belong}_i, mathit{no}_i, mathit{len}_i)$记作$(y_i, v, ext{depth}(x_i)- ext{depth}( ext{father}(y_i)))$,表示该路径的根为$mathit{belong}_i$,是路径$ ext{path}(mathit{no}_i, r)$的一段子路径,且这段路径的长度为$mathit{len}_i$。

    定理:子路径段数$m leq lfloor log_2 |V| floor+1$,其中$|V|$表示节点个数。

    记$ ext{route}(v)$表示路径$ ext{path}(v o r)$的剖分,记$ ext{route}_{leq L}(v)$表示路径$ ext{path}(v o r)$中深度$leq L$部分的剖分,即

    $$ ext{route}_{leq L}(v) = { (mathit{belong}, mathit{no}, min{mathit{len},L- ext{depth}( ext{father}(mathit{belong}))}): (mathit{belong}, mathit{no}, mathit{len}) in ext{route}(v) land ext{depth}( ext{father}(mathit{belong}))<L }. $$

    特别地,令$ ext{route}_{=L}(v) = ext{route}_{leq L}(v) - ext{route}_{leq L-1}(v)$。

    计算$S_1(L,l,M)$

    于是

    $$
    egin{aligned}
    S_1(L, l, M) 
    & = sum_{k=1}^L sum_{v = 1}^{M} [v in ext{leaf}(l(k))] \
    & = sum_{b} sum_{k=1}^L sum_{(b, l, mathit{len}_q) in ext{route}_{= k}(l)} sum_{v=1}^{M} sum_{(b, v, mathit{len}_p) in ext{route}(v)} [mathit{len}_q leq mathit{len}_p] \
    & = sum_{(b, l, mathit{len}_q) in ext{route}_{leq L}(l)} sum_{v=1}^{M} sum_{(b, v, mathit{len}_p) in ext{route}(v)} min{mathit{len}_q, mathit{len}_p}
    end{aligned}
    $$

    在固定$b$和$mathit{len}_q$后,令

    $$ a[i] = sum_{v=1}^M sum_{(b,v,i)in ext{route}(v)} 1. $$

    $$
    egin{aligned}
    sum_{v=1}^{M} sum_{(b, v, mathit{len}_p) in ext{route}(v)} min{mathit{len}_q, mathit{len}_p}
    & = sum_{i} a[i] min{mathit{len}_q, i} \
    & = mathit{len}_q sum_{i > mathit{len}_q} a[i] + sum_{i=1}^{mathit{len}_q} ia[i],
    end{aligned}
    $$

    可通过维护$a[i]$的前缀和以及$ia[i]$的前缀和来计算。

    我们可以在枚举$b$之后,从小到大枚举$v$(这就是扫描线),并维护$a[i]$及其相关前缀和,当枚举到$v = M$时,可计算$S_1(L,l,M)$在固定$b$后的贡献值。

    计算$S_2(L,l,m)$

    $$
    egin{aligned}
    S_2(L, l, M) 
    & = sum_{k=1}^L sum_{v = 1}^{M-k} [v in ext{leaf}(l(k))] \
    & = sum_{b} sum_{k=1}^L sum_{(b, l, mathit{len}_q) in ext{route}_{leq k}(l)} sum_{v=1}^{M-k} sum_{(b, v, mathit{len}_p) in ext{route}(v)} [mathit{len}_q leq mathit{len}_p] \
    & = sum_{(b, l, mathit{len}_q) in ext{route}_{leq L}(l)} sum_{k=1}^{mathit{len}_q} sum_{v=1}^n sum_{(b, v, mathit{len}_p) in ext{route}(v)} [k leq mathit{len}_p land mathit{pos}_p+k-1 leq M],
    end{aligned}
    $$

    其中$mathit{pos}_p = v+ ext{depth}( ext{father}(b))+1$。

    固定$b$和$mathit{len}_q$后,讨论$mathit{pos}_p+mathit{len}_p$与$M+1$的大小关系,$S_2(L,l,M)$求和可分为两部分:

    $$
    egin{aligned} & quad sum_{k=1}^{mathit{len}_q} sum_{v=1}^n sum_{(b, v, mathit{len}_p) in ext{route}(v)} [k leq mathit{len}_p land mathit{pos}_p+k-1 leq M] \
    & = sum_{k=1}^{mathit{len}_q} sum_{v=1}^n sum_{(b, v, mathit{len}_p) in ext{route}(v)} [k leq mathit{len}_p land mathit{pos}_p+k-1 leq M land mathit{pos}_p+mathit{len}_p leq M+1] \ & quad + sum_{k=1}^{mathit{len}_q} sum_{v=1}^n sum_{(b, v, mathit{len}_p) in ext{route}(v)} [k leq mathit{len}_p land mathit{pos}_p+k-1 leq M land mathit{pos}_p+mathit{len}_p > M+1] end{aligned}
    $$

    Part 1

    若$mathit{pos}_p+mathit{len}_p leq M+1$,则条件$mathit{pos}_p+k-1 leq M$是冗余的,于是求和可简化为

    $$ egin{aligned} & quad sum_{k=1}^{mathit{len}_q} sum_{v=1}^n sum_{(b, v, mathit{len}_p) in ext{route}(v)} [k leq mathit{len}_p land mathit{pos}_p+k-1 leq M land mathit{pos}_p+mathit{len}_p leq M+1] \ & = sum_{k=1}^{mathit{len}_q} sum_{v=1}^n sum_{(b, v, mathit{len}_p) in ext{route}(v)} [k leq mathit{len}_p land mathit{pos}_p+mathit{len}_p leq M+1] \ & = sum_{v=1}^n sum_{(b, v, mathit{len}_p) in ext{route}(v)} [mathit{pos}_p+mathit{len}_p leq M+1] min{mathit{len}_q, mathit{len}_p} end{aligned} $$

    类似于$S_1$的计算方式,令

    $$ a[i] = sum_{v=1}^n sum_{(b,v,mathit{len}_p)in ext{route}(v)} [mathit{len}_p+mathit{pos}_p leq M+1 land mathit{len}_p = i]. $$

    $$egin{aligned}
     sum_{v=1}^n sum_{(b, v, mathit{len}_p) in ext{route}(v)} [mathit{pos}_p+mathit{len}_p leq M+1] min{mathit{len}_q, mathit{len}_p} 
    & = sum_{i} a[i] min{mathit{len}_q, i} \
    & = mathit{len}_q sum_{i > mathit{len}_q} a[i] + sum_{i=1}^{mathit{len}_q} ia[i],
    end{aligned}$$

    我们以$mathit{pos}_p+mathit{len}_p$从小到大做扫描线,维护$a[i]$及其相关前缀和,当枚举到$mathit{pos}_p+mathit{len}_p = M+1$时,可计算 Part 1 的贡献值。

    Part 2

    若$mathit{pos}_p+mathit{len}_p > M+1$,则$1+mathit{pos}_p leq k+mathit{pos}_p leq M+1 < mathit{len}_p+mathit{pos}_p$。

    $$
    egin{aligned}
    & quad sum_{k=1}^{mathit{len}_q} sum_{v=1}^n sum_{(b, v, mathit{len}_p) in ext{route}(v)} [k leq mathit{len}_p land mathit{pos}_p+k-1 leq M land mathit{pos}_p+mathit{len}_p > M+1] \
    & = sum_{k=1}^{mathit{len}_q} sum_{v=1}^n sum_{(b, v, mathit{len}_p) in ext{route}(v)} [1+mathit{pos}_p leq k+mathit{pos}_p leq M+1 land mathit{pos}_p+mathit{len}_p > M+1] \
    & = sum_{k=1}^{mathit{len}_q} sum_{v=1}^n sum_{(b, v, mathit{len}_p) in ext{route}(v)} [mathit{pos}_p leq M-k+1 land mathit{pos}_p+mathit{len}_p > M+1]
    end{aligned}
    $$

    $$ a[i] = sum_{v=1}^n sum_{(b,v,mathit{len}_p)in ext{route}(v)} [mathit{len}_p+mathit{pos}_p > M+1 land mathit{pos}_p = i]. $$

    $$
    egin{aligned}
    & quad sum_{k=1}^{mathit{len}_q} sum_{v=1}^n sum_{(b, v, mathit{len}_p) in ext{route}(v)} [mathit{pos}_p leq M-k+1 land mathit{pos}_p+mathit{len}_p > M+1] \
    & = sum_{k=1}^{mathit{len}_q} s[M-k+1] \
    & = mathit{len}_q s[M-mathit{len}_q+1] + (M+1) (s[M]-s[M-mathit{len}_q+1]) - (t[M]-t[M-mathit{len}_q+1]),
    end{aligned}
    $$

    其中$s[i] = a[1]+dots+a[i], t[i] = a[1]+2a[2]+dots+ia[i]$。

    我们以$mathit{pos}_p+mathit{len}_p$从大到小做扫描线,维护$a[i]$及其相关前缀和,当枚举到$mathit{pos}_p+mathit{len}_p = M+2$时,可计算 Part 2 的贡献值。

    时间复杂度

    设字符串$s$的长度为$|s| = n$,构建后缀树的时间为$O(n)$。根据树链剖分,子问题一共有$O((n+q) log n)$个,每个子问题需要离线做扫描线利用树状数组求解,故总的时间复杂度为$O((n+q)log^2 n)$。

  • 相关阅读:
    react脚手架
    快速创建一个node后台管理系统
    vue脚手架结构及vue-router路由配置
    Spring 事务管理-只记录xml部分
    Spring-aspectJ
    Maven 自定义Maven插件
    JVM
    JVM
    Spring
    Digester
  • 原文地址:https://www.cnblogs.com/TinyWong/p/10363110.html
Copyright © 2011-2022 走看看