String Powers
Published:
We are given as input a string $w$ of length $n$ asked to check whether it can be written as $u^k$ for some $k>1$. A straight-forward algorithm for this problem is to try all divisors $k$ of $n$ and check whether $w$ can be written as $u^k$. The implementation that tests for divisors up to $√n$ takes time $O(n^{1.5})$. Can we do better? Yes. We can do it in $O(n)$ time as follows: $w=u^k$ for some $u$ and some $k>1$ if and only if $w$ is a substring of $uv$ where $u$ is the string obtained from $w$ by removing the first symbol and $v$ is the string obtained from $w$ by removing the last symbol. Since $uv$ and $w$ have length $O(n)$, the Knuth-Morris-Pratt string search algorithm can be used to obtain the required $O(n)$ time algorithm.
we now prove that the algorithm is correct. First, we prove the following lemma.
Let $w_1$ and $w_2$ be non-empty strings. Then, the equality $w_1w_2=w_2w_1$ holds if and only if there exists a string $u$ and natural numbers $k_1$, $k_2$ such that $w_1=u^k_1$ and $w_2=u^{k_2}$.
We use induction on $|w_1w_2|$. In the base case, we have $|w_1|=|w_2|=1$. So $w_1w_2=w_2w_1$ implies $w_1=w_2$. We pick $u=w_1$ and $k_1=k_2=1$. We split the inductive case into three sub-cases:
- $|w_1|=|w_2|$: $w_1w_2=w_2w_1$ implies $w_1=w_2$. So we pick $u=w_1$ and $k_1=k_2=1$.
- $|w_1|<|w_2|$: $w_1w_2=w_2w_1$ implies that $w_2=w_1x$ for some non-empty $x$. So we have the equation $w_1w_1x=w_1xw_1$. This implies $w_1x=xw_1$. Since both $w_1$ and $x$ are non-empty and $x$ is strictly shorter than $w_2$, we can apply the inductive hypothesis to obtain $u$, $k_1$, and $k_2$ such that $w_1=u^{k_1}$ and $x=u^{k_2}$. Then, observe that $w_2=u^{k_1+k_2}$. This completes the proof of the lemma.
Now, we prove that $w=x^k$ for some $k > 1$ if and only if $w$ is a sub-string of $uv$ as in the algorithm. The “only if” direction is trivial. For the “if” direction, suppose that $w$ is a sub-string of $uv$. Since both $u$ and $v$ are strictly shorter than $w$, there must be non-empty $w_1$ and $w_2$ such that $w=w_1w_2$, $w_1$ is a suffix of $u$, and $w_2$ is a prefix of $v$. Since $u$ is a suffix of $w$, $w_1$ must be a suffix of $w$. Since $v$ is a prefix of $w$, $w_2$ must be a prefix of $w$. So $w=w_2xw_1$ for some string $x$. Since $w=w_1w_2$, $x$ must be the empty string. So $w=w_1w_2=w_2w_1$. We now apply the above lemma and conclude that the algorithm is correct.