- the number $n$ of times the head visited $C_i$ modulo a fixed constant $k$;
- if it has entered $C_i$ from the left or from the right.

This information is enough to build $k$*-ary counters* and also a sort of *comparator between numbers written in different bases*. So, some one-state Turing machines can *“recognize” languages that are not Context-Free* (the quotes are due to the different accept/reject conditions, which are necessary because we cannot distinguish between accepting and non-accepting states).

]]>

Busy Beavers are almost incompressible! … easy fact, but I didn’t find it anywhere.

Let $\sigma(M)$ be the number of $1$s left on the tape after the execution of the Turing machine $M$ (the tape is initially blank). A halting Turing machine $M$ of length $n$ is a Busy Beaver if, for all halting Turing machines $M'$ of length $n$, we have $\sigma(M') \leq \sigma(M)$; and we define the busy-beaver function $\Sigma: \mathbb{N} \to \mathbb{N}$ as $\Sigma(n) = \sigma(M)$ for such an $M$. Note that our definition of “Busy Beaver” is slightly different from the standard one: we consider the *length of the Turing machines* in a fixed reasonable encoding scheme, not the number of states.

Suppose that a Turing machine $B$ of length $n$ is a Busy Beaver. If $p$ is the smallest Turing machine that outputs a representation of $B$ when executed on a blank tape ($U(p) = B$), we can build a Turing machine $p'$ that extends the behaviour of $p$ in this way:

- mirror the execution of $p$ until it halts;
- then simulate the execution of the Turing machine represented by the output of $p$ until it halts (i.e. it simulates the execution of $B$);
- and finally find a $0$, overwrite it with a $1$ and halt.

In the simulation of $B$ we can make a one-to-one correspondence between the $i$-th simulated cell $T^B[i]$ and a group of $k$ cells $T[i_1],\ldots,T[i_k]$ of the actual tape of $p'$, in such a way that if $T^B[i] = 1$ during a simulation step, then at least one of $T[i_1],\ldots,T[i_k]$ is $1$.

Hence, by construction, at the end of the whole execution of $p'$, the number of $1$s on the tape is greater than the number left on the tape by $B$: $\sigma(p') > \sigma(B) = \Sigma(n)$.

So we must have $|p'| > n$ (a machine of length at most $n$ cannot leave more than $\Sigma(n)$ ones, provided that $\Sigma$ is non-decreasing, which holds for instance when the encoding allows padding a machine with unreachable transitions). But $|p'| = |p| + c' = C(B) + c'$, where $C(B) = |p|$ is the Kolmogorov complexity of $B$ and $c'$ is the size of the added simulation code; and finally:

$$C(B) > n - c'$$

where $c'$ doesn’t depend on $n$ but only on the computational model.

]]>

**Definition [Pizza Regina problem]**

**Input**: A grid $M$ with some marked squares, a threshold $T\in \mathbb{N}$, a maximal area $A \in\mathbb{N}$

**Output**: The largest possible total area of a set of disjoint rectangles with integer coordinates in $M$ such that each rectangle includes at least $T$ marked squares and each rectangle has area at most $A$.

The problem can be converted to a decision problem adding a parameter $k \in \mathbb{N}$ and asking:

**Question**: Does there exist a set of disjoint rectangles satisfying the conditions (each rectangle has integer coordinates in $M$, includes at least $T$ marked squares and has area at most $A$) whose total area is at least $k$ squares?
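A candidate solution for this decision question can be checked in polynomial time. The following is only a minimal sketch of such a check, under assumptions that are not in the original statement: the grid is a list of rows of 0/1 values (1 means a marked square), and a rectangle is a hypothetical tuple `(row, col, height, width)`.

```python
from typing import List, Set, Tuple

# Assumed representation: (row, col, height, width), (row, col) = top-left cell.
Rect = Tuple[int, int, int, int]


def cells(rect: Rect) -> Set[Tuple[int, int]]:
    """Return the set of grid cells covered by a rectangle."""
    r, c, h, w = rect
    return {(r + i, c + j) for i in range(h) for j in range(w)}


def verify_slices(grid: List[List[int]], rects: List[Rect],
                  T: int, A: int, k: int) -> bool:
    """Check a certificate for the decision version of the problem."""
    rows, cols = len(grid), len(grid[0])
    covered: Set[Tuple[int, int]] = set()
    for rect in rects:
        r, c, h, w = rect
        # The rectangle must be non-empty and lie inside the grid.
        if h <= 0 or w <= 0 or r < 0 or c < 0 or r + h > rows or c + w > cols:
            return False
        # Area at most A.
        if h * w > A:
            return False
        area = cells(rect)
        # At least T marked squares inside the rectangle.
        if sum(grid[i][j] for i, j in area) < T:
            return False
        # Rectangles must be pairwise disjoint.
        if covered & area:
            return False
        covered |= area
    # Total area of the chosen rectangles must reach the bound k.
    return len(covered) >= k


grid = [[1, 1, 1, 0, 0, 0],
        [0, 0, 0, 1, 1, 1]]
print(verify_slices(grid, [(0, 0, 1, 6), (1, 0, 1, 6)], T=3, A=6, k=12))
```

The check runs in time polynomial in the grid size and the number of rectangles, which is all that is needed for a certificate.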

The problem is clearly in $\mathsf{NP}$, and after struggling a little bit I found that it is $\mathsf{NP}$-hard (so the Pizza Regina problem is $\mathsf{NP}$-complete). This is a sketch of a reduction from MONOTONE CUBIC PLANAR 1-3 SAT:

**Definition [1-3 SAT problem]:**

**Input:** A 3-CNF formula $\varphi = C_1 \land C_2 \land \ldots \land C_m$, in which every clause $C_j$ contains exactly three literals: $C_j = (\ell_{j,1} \lor \ell_{j,2} \lor \ell_{j,3})$.

**Question:** Does there exist a satisfying assignment for $\varphi$ such that each clause $C_j$ contains exactly one true literal?

The problem remains NP-complete even if all literals in the clauses are positive (MONOTONE), if the graph built connecting clauses with variables is planar (PLANAR) and every variable is contained in exactly 3 clauses (CUBIC); see C. Moore and J. M. Robson, Hard tiling problems with simple tiles, Discrete Comput. Geom. 26 (2001), 573-590.
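To make the “exactly one true literal per clause” condition concrete, here is a small brute-force sketch for the monotone case. The representation (clauses as tuples of variable names) and the function names are illustrative assumptions; the search is exponential and only meant for tiny instances.

```python
from itertools import product
from typing import Dict, List, Optional, Tuple

# Assumed representation: a monotone clause is a triple of variable names.
Clause = Tuple[str, str, str]


def is_one_in_three(clauses: List[Clause], assignment: Dict[str, bool]) -> bool:
    """True if every clause contains exactly one true variable."""
    return all(sum(assignment[v] for v in clause) == 1 for clause in clauses)


def solve_one_in_three(clauses: List[Clause]) -> Optional[Dict[str, bool]]:
    """Try all assignments; return a valid one, or None if there is none."""
    variables = sorted({v for clause in clauses for v in clause})
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if is_one_in_three(clauses, assignment):
            return assignment
    return None


satisfiable = [("a", "b", "c"), ("a", "d", "e")]
# Every variable occurs in exactly 3 clauses, yet no 1-in-3 assignment exists.
unsatisfiable = [("a", "b", "c"), ("a", "b", "d"), ("a", "c", "d"), ("b", "c", "d")]
print(solve_one_in_three(satisfiable) is not None)
print(solve_one_in_three(unsatisfiable) is None)
```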

We use $T=3, A=6$, and in the figures ham is represented with blue boxes (transgenic ham?), pizza with orange boxes.

The idea is to use tracks of ham that carry positive or negative signals; the track is made with an alternation of 1 and 2 pieces of ham placed far enough apart so that they can be covered exactly by one slice of pizza of area $A$; the segments of the track are marked alternately with $+$ or $-$, and the track carries a positive signal if slices are cut on the positive segments:

Each variable $x_i$, which is connected to exactly 3 SAT clauses, is represented by three adjacent endpoints of three ham tracks (positive segment), in such a way that there are 2 distinct ways to cut it: one “generates” a positive signal on all 3 tracks (it represents the $x_i = \text{TRUE}$ assignment), the other a negative signal ($x_i = \text{FALSE}$). Notice that we can also generate mixed positive and negative signals, but in that case *at least one ham remains uncovered*.

Each clause $C_j$ of the 1-3 SAT formula with 3 literals $\ell_{j,1}, \ell_{j,2}, \ell_{j,3}$ is simply represented by a single ham with three incoming positive segments of three distinct ham tracks; by construction *only one of the three tracks* carrying a positive signal can “cover” the ham-clause.

Finally we can build shift and turn gadgets to carry the signals according to the underlying planar graph and adjust the endpoints:

Suppose that the resulting graph contains $H$ hams. By construction every slice of pizza must contain exactly 3 hams, and in all cases every slice can be enlarged up to area $A$.

If the original 1-3 SAT formula is satisfiable then by construction we can cut $H /3$ pieces of pizza (with total area of $A H / 3$) and no ham remains uncovered.

In the opposite direction, if we can cut $H /3$ pieces of pizza (with total area $A H / 3$) then no ham remains uncovered, and the signals on the variable gadgets and on the clauses are consistent: the ham on the clause is covered by exactly one positive slice coming from a positive variable, and every variable generates 3 positive signals or 3 negative signals (no mixed signals); so the cuts induce a valid 1-3 SAT assignment.

**Conclusion**: … so unless $\mathsf{(P)izza} =\mathsf{(NP)izza}$, cutting a pizza can be really hard. I would like to thank Antoine for posting the funny question and for spending a bit of time checking my proof.

The (easy) proof that the uncomputability of Kolmogorov complexity implies the undecidability of the Halting problem can be found in many lecture notes and books; usually the proof assumes that the Halting problem is decidable and derives the computability of Kolmogorov complexity, which is a contradiction. In other words, given an oracle for the Halting problem, we can compute the Kolmogorov complexity of a string $x$.

But we can also derive the uncomputability of Kolmogorov complexity from the undecidability of the Halting problem; the proof is “less popular” but nevertheless can be found after a few searches on Google. For example the technical report: Gregory J. Chaitin, Asat Arslanov, Cristian Calude: Program-size Complexity Computes the Halting Problem. Bulletin of the EATCS 57 (1995) contains two different proofs, and the great book Li, Ming, Vitányi, Paul M.B.; An Introduction to Kolmogorov Complexity and Its Applications presents it as an exercise (with a hint on how to solve it that is credited to P. Gács by W. Gasarch in a personal communication Feb 13, 1992). Here we give an extended proof, with more details, that the Halting problem can be decided using an oracle that computes the Kolmogorov complexity of a string, i.e. that the Halting problem is Turing reducible to the Kolmogorov complexity.

The Halting problem is represented using the halting set:

$$HALT = \{ \langle M,x \rangle \mid M \text{ is a Turing machine that halts on input } x\}$$

Let $T_{max}(n)$ be the longest running time among the Turing machines of size $n$ that halt on empty tape; and let $BB_n$ be the Turing machine that achieves $T_{max}(n)$ (note that $T_{max}$ is simply a variant of the Busy Beaver function in which the score is the number of steps required to halt).

Starting from the given pair $\langle M, x \rangle$, we can build $M'$ that on empty tape writes $x$ on it and simulates $M(x)$; let $|M'| = n$. Clearly if $M'$ of size $n$ is still running after $T_{max}(n)$ steps then it will run forever.

For large enough $n$, there exists another machine $Z$ of size $n+c < 2n$ that “embeds” the machine $BB_n$ and:

- calculates $T_{max}(n)$ simulating $BB_n$;
- then enumerates all Turing machines $M_1, M_2, \ldots$ of size $< 2n$ and simulates them on empty tape;
- if $M_i$ halts within $T_{max}(n)$ steps then $Z$ keeps track of the string $y_i$ left on tape and its length $l_i$; otherwise it continues with the next machine;
- $Z$ halts when all the $M_i$s of size $< 2n$ have been scanned;
- $Z$ outputs one of the strings of length $2n$ that hasn’t been generated during the process; at least one of them exists by the pigeonhole principle (there are only $2^{2n}-1$ programs of length less than $2n$, but $2^{2n}$ strings of length $2n$).

Note that we are not able to actually build $Z$ (because we don’t know $BB_n$), but it exists.

Now suppose that the Kolmogorov complexity $C(x)$ of a string $x$ is computable; then for every string $y_i$ of length $l$ we can actually find a shortest Turing machine that generates it: simply dovetail the execution of all the Turing machines of length $C(y_i)$ until one of them halts with $y_i$ on the output tape.
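The dovetailing step can be illustrated with a toy sketch. Here a “machine” is not a real Turing machine but a hypothetical stand-in: a Python generator that yields once per simulated step and returns its output string when it halts. All names below are assumptions made for the illustration.

```python
from typing import Callable, Generator, List, Optional

# Hypothetical stand-in for a Turing machine: yields once per step,
# returns its output string if and when it halts.
Machine = Callable[[], Generator[None, None, str]]


def dovetail(machines: List[Machine], target: str) -> Optional[int]:
    """Advance every machine one step per round.

    Returns the index of the first machine found to halt with `target`,
    or None if all machines halt with a different output. If some machine
    never halts and none outputs `target`, this loops forever; in the proof
    this cannot happen, because knowing C(y_i) guarantees a hit.
    """
    running = {i: machine() for i, machine in enumerate(machines)}
    while running:
        for i in list(running):
            try:
                next(running[i])          # simulate one more step
            except StopIteration as halt:
                if halt.value == target:  # halted with the wanted output
                    return i
                del running[i]            # halted with something else
    return None


def loops_forever() -> Generator[None, None, str]:
    while True:
        yield


def prints_after(steps: int, output: str) -> Machine:
    def machine() -> Generator[None, None, str]:
        for _ in range(steps):
            yield
        return output
    return machine


found = dovetail([loops_forever, prints_after(5, "0101"), prints_after(2, "1111")], "0101")
print(found)
```

Running the machines in lockstep is what keeps the non-halting one from blocking the search.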

So we can find all the strings $y_i$ of length $2n$ that have $C(y_i) < 2n$ and the corresponding Turing machines $M_i$ ($|M_i| < 2n$) that generate them; we can also calculate the time $t_i$ needed for $M_i$ to generate $y_i$.

If $T = \max \{ t_i \}$ then we must have $T > T_{max}(n)$. Otherwise $Z$ above would be able to simulate all the programs of size $< 2n$ that generate a string of length $2n$ until they halt, and record their outputs; so, by construction, its output would be a string $y$ of length $2n$ with $C(y) \geq 2n$ (i.e. incompressible). But $|Z| = n + c < 2n$, and this is a contradiction: we would have a program of size less than $2n$ that outputs a string whose Kolmogorov complexity is at least $2n$.

So $T$ is an upper bound for $T_{max}(n)$ and can be used to check if $M'$ halts and thus decide $\langle M, x \rangle \in HALT$.

$\Box$

The two reductions, from the Halting problem to the Kolmogorov complexity and from the Kolmogorov complexity to the Halting problem, imply that the two problems are Turing equivalent (they belong to the same Turing degree).

]]>

In recent years the study of the complexity of puzzles and (video)games has gained much attention (see for example the survey [1]). Most games can be generalized to arbitrary instance size and transformed into decision problems in which the question is usually: “Given an instance of size $m \times n$ of the game X, does it have a solution?”. It turns out that most static puzzles (sudoku, kakuro, binary puzzle, light up, …) are NP-complete and that most dynamic puzzles (sokoban, rush hour, atomix, …) are PSPACE-complete.

One of the puzzles for which the complexity was still unknown is Subway Shuffle; we proved, as conjectured in [2], that its rules are rich enough to be PSPACE-complete. The proof uses the framework of the nondeterministic constraint logic model of computation ([2], [3]): given a planar constraint graph in normal form, it is PSPACE-complete to find a sequence of edge reversals (moves) that keep the constraint graph valid, ending in the reversal of a special edge $e^*$.

We build an equivalent subway shuffle board with EDGE, AND and LATCH gadgets that has a solution (i.e. a sequence of moves that shift the special token to its final target position) if and only if there is a sequence of moves that reverses $e^*$.

You can download a preliminary draft version of the paper here:

click here to download a preliminary draft of the paper

[1] Graham Kendall, Andrew J. Parkes, Kristian Spoerer: A Survey of NP-Complete Puzzles. ICGA Journal 31(1): 13-34 (2008)

[2] Robert A. Hearn, Erik D. Demaine: Games, puzzles and computation. A K Peters 2009, ISBN 978-1-56881-322-6, pp. I-IX, 1-237

[3] Robert A. Hearn, Erik D. Demaine: The Nondeterministic Constraint Logic Model of Computation: Reductions and Applications. ICALP 2002: 401-413

]]>[SQUARE-FREE SUBSET PRODUCT] Given $3N$ integers, find $N$ of them whose product is square-free.

I didn’t find it anywhere, so it can be somewhat “original”.

Proof sketch: starting from an Exact Cover by 3 sets (X3C) instance (strongly NPC), label each element of the universe with a distinct prime (you can generate $3|X|$ of them in polynomial time); then convert every triple $(x,y,z)$ of the subsets to the product $xyz$ of the three corresponding primes.

It obviously resembles the better known SUBSET PRODUCT (which is not strongly NPC due to the presence of the target product $B$, see David S. Johnson: The NP-Completeness Column: An Ongoing Guide. 393-405).

It can also be hacked a little bit to get other variants, like:

* Given $3N$ integers, find $N$ of them whose product is a perfect $21$-th power;

* Given $N$ integers, find a subset whose product is the $3N$-th primorial (kind of cheating :-).
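The reduction from the proof sketch above can be written down in a few lines. This is only an illustrative sketch: the function names are made up, the primes are generated by naive trial division, and the final search is a brute force over all subsets of the required size (so it is exponential, as expected).

```python
from itertools import combinations
from math import prod
from typing import Dict, Hashable, List, Optional, Sequence, Tuple


def first_primes(n: int) -> List[int]:
    """Return the first n primes (naive trial division)."""
    primes: List[int] = []
    candidate = 2
    while len(primes) < n:
        if all(candidate % p for p in primes):
            primes.append(candidate)
        candidate += 1
    return primes


def is_square_free(value: int) -> bool:
    """True if no square d*d with d >= 2 divides value."""
    d = 2
    while d * d <= value:
        if value % (d * d) == 0:
            return False
        d += 1
    return True


def x3c_to_products(universe: Sequence[Hashable],
                    triples: Sequence[Tuple]) -> List[int]:
    """Label each element with a distinct prime; map each triple to a product."""
    label: Dict[Hashable, int] = dict(zip(universe, first_primes(len(universe))))
    return [prod(label[x] for x in triple) for triple in triples]


def find_square_free_subset(numbers: Sequence[int],
                            size: int) -> Optional[Tuple[int, ...]]:
    """Brute force: indices of `size` numbers whose product is square-free."""
    for indices in combinations(range(len(numbers)), size):
        if is_square_free(prod(numbers[i] for i in indices)):
            return indices
    return None


universe = [1, 2, 3, 4, 5, 6]
triples = [(1, 2, 3), (4, 5, 6), (1, 4, 5), (2, 3, 6)]
numbers = x3c_to_products(universe, triples)
print(numbers, find_square_free_subset(numbers, len(universe) // 3))
```

A product of $|X|/3$ of these numbers is square-free exactly when the corresponding triples are pairwise disjoint, i.e. when they form an exact cover.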

If you find that SQUARE-FREE SUBSET PRODUCT has been used/defined in some paper (possibly under another name), let me know!

]]>The following variant of the 3-Dimensional Matching problem (3DM) was posted on cstheory.stackexchange.com, a question and answer site for professional researchers in theoretical computer science and related fields:

**Definition.** 3-DIMENSIONAL MATCHING VARIANT

**Input:** Set $M \subseteq X \times Y \times Z$ where $X,Y,Z$ are disjoint sets; we call $M_{XY}= \{(x,y)\mid \exists z \text{ s.t. } (x,y,z)\in M \}$ the set of pairs of $X \times Y$ that appear in the triples of $M$, and $M_{XZ}= \{(x,z)\mid \exists y \text{ s.t. } (x,y,z)\in M \}$ the set of pairs of $X \times Z$ that appear in the triples of M.

**Question:** Does there exist a set $M' \subseteq M \cup M_{XY} \cup M_{XZ}$ such that every element of $X \cup Y \cup Z$ is included in a triple or a pair of $M'$ exactly once?

Informally we want to build an exact cover of $X \cup Y \cup Z$ using the triples of $M$ or one of the two pairs $(x,y), (x,z)$ that are contained in a triple $(x,y,z) \in M$.
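As a sanity check of the definition, a candidate $M'$ can be verified as follows. This is a minimal sketch with assumed names and an assumed representation (triples and pairs as Python tuples, elements as strings); it only checks a given solution and does not search for one.

```python
from typing import Iterable, List, Set, Tuple


def is_valid_cover(X: Set[str], Y: Set[str], Z: Set[str],
                   M: Iterable[Tuple[str, str, str]],
                   chosen: List[Tuple[str, ...]]) -> bool:
    """Check that `chosen` uses only allowed triples/pairs and covers
    every element of X, Y and Z exactly once."""
    M = list(M)
    triples = set(M)
    pairs_xy = {(x, y) for x, y, _ in M}   # M_XY
    pairs_xz = {(x, z) for x, _, z in M}   # M_XZ
    seen: List[str] = []
    for item in chosen:
        if len(item) == 3:
            if item not in triples:
                return False
        elif len(item) == 2:
            if item not in pairs_xy and item not in pairs_xz:
                return False
        else:
            return False
        seen.extend(item)
    # No element repeated, and nothing left uncovered.
    return len(seen) == len(set(seen)) and set(seen) == X | Y | Z


X, Y, Z = {"x1", "x2"}, {"y1"}, {"z1"}
M = [("x1", "y1", "z1"), ("x2", "y1", "z1")]
print(is_valid_cover(X, Y, Z, M, [("x1", "y1"), ("x2", "z1")]))
```

In this tiny instance no triple-only cover exists (any single triple leaves one element of $X$ uncovered), but the two pairs $(x_1,y_1)$ and $(x_2,z_1)$ cover everything exactly once.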

The problem is NP-complete, click the following link to download the technical report with the reduction (we also discuss some other variants of the problem).

Click here to download the technical report

If you have some comments or find some errors, contact me via email.

]]>

We begin with a weird problem that involves those cute animals known as Busy Beavers: in computability theory a **busy beaver** is a Turing machine that attains the maximum number of steps performed before halting, or the maximum number of nonblank symbols finally on the tape, among all halting Turing machines of the same size. The Turing machine must follow some rigid design specifications: it operates on a single two-way unbounded tape initially filled with $0$s and the tape alphabet is $\{0,1\}$ (0 is the blank symbol); it has $n$ states plus a halting state, and every transition is a 5-tuple: (current non-halting state, current symbol, symbol to write, direction of shift, next state).

The *n-state busy beaver (BB-n) game* (introduced by Tibor Radó in 1962, [1]) is a contest to find such an n-state Turing machine having the largest possible score, i.e. the largest number of 1s on its tape after halting. A machine that attains the largest possible score among all n-state Turing machines is called an *n-state busy beaver*.

The busy beaver function $\Sigma(n)$ (with $\Sigma: \mathbb{N} \to \mathbb{N}$) is the maximum attainable “score” (the maximum number of 1s finally on the tape) among all halting 2-symbol $n$-state Turing machines, when started on a blank tape. In addition to $\Sigma$, Radó also considered the maximum number of attainable steps (or, equivalently, number of shifts, because the Turing machine makes a move at every transition) before halting and he denoted it with $S(n)$ (while we denote with $s(M)$ the number of steps made by $M$ before halting).
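To make the 5-tuple format and the two scores concrete, here is a minimal simulator sketch. The dictionary encoding, the state names and the `max_steps` cutoff are assumptions of this example; the machine shown is the well-known 2-state champion.

```python
from collections import defaultdict
from typing import Dict, Optional, Tuple

# (state, read symbol) -> (symbol to write, shift 'L' or 'R', next state);
# key and value together form the 5-tuple of a transition.
Table = Dict[Tuple[str, int], Tuple[int, str, str]]


def run(machine: Table, start: str = "A", halt: str = "H",
        max_steps: int = 10_000) -> Optional[Tuple[int, int]]:
    """Run on a blank tape; return (steps, number of 1s) or None on timeout."""
    tape: Dict[int, int] = defaultdict(int)   # blank symbol is 0
    pos, state, steps = 0, start, 0
    while state != halt:
        if steps >= max_steps:
            return None                       # gave up: may or may not halt
        write, shift, state = machine[(state, tape[pos])]
        tape[pos] = write
        pos += 1 if shift == "R" else -1
        steps += 1
    return steps, sum(tape.values())


# The 2-state, 2-symbol busy beaver.
bb2: Table = {
    ("A", 0): (1, "R", "B"),
    ("A", 1): (1, "L", "B"),
    ("B", 0): (1, "L", "A"),
    ("B", 1): (1, "R", "H"),
}
print(run(bb2))  # (6, 4)
```

The output matches the known values $S(2) = 6$ and $\Sigma(2) = 4$. The `max_steps` cutoff is only a practical guard: no computable bound works for every machine.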

It is easy to prove that both functions are **uncomputable**: for example $S(n)$ could be used to decide the Halting problem of a Turing machine $M$ on input $x$: just build $M'$ that writes $x$ on the empty tape and then simulates $M$, and run it for $S(|M'|)$ steps; if it doesn’t halt by then it will run forever (because $S(n)$ is the maximum number of steps that a halting Turing machine of size $n$ can perform). We have that $\Sigma$ grows faster than any computable function $f : \mathbb{N} \to \mathbb{N}$, and $S(n) \geq \Sigma(n)$.

The current record holder (a busy-beaver candidate) among the 6-state 2-symbol Turing machines runs for $s(M) > 7.4 \times 10^{36534}$ steps (and it writes more than $3.5 \times 10^{18267}$ 1s on the tape). For more information on record holders, busy beaver history and “how to **search yourself a busy beaver**” you can visit Heiner Marxen’s site or Pascal Michel’s historical survey.

Now the open problem; if we consider two Turing machines *different only if their underlying transition graphs are not isomorphic*:

**Yet Another (Tiny) Open Problem #1**: What can be said about the number of busy beavers of a given size? Does the set:

$$BB(n) = \{ M \mid M \text{ has } n \text{ states and is a busy beaver, i.e. } s(M)=S(n)\}$$

always contain only **one** busy beaver for all $n$?

If the answer is yes, then $\#BB(n) =| BB(n) |$ is trivially computable, but what can be said if there can be two different busy beavers $M_1 \neq M_2$, both with $n$ states (and whose underlying transition graphs are not isomorphic), that run for $s(M_1)=s(M_2)=S(n)$ steps? Can we show that $\# BB(n)$ is uncomputable?

[1] Radó, Tibor (May 1962). “On non-computable functions”. Bell System Technical Journal 41 (3): 877–884

]]>The *Traveling Salesman Problem* (TSP) is a well-known problem from graph theory: we are given $n$ cities and a nonnegative integer distance $d_{ij}$ between any two cities $i$ and $j$ (assume that the distances are symmetric, i.e. $d_{ij} = d_{ji}$ for all $i,j$). We are asked to find the *shortest tour* of the cities, that is a permutation $\pi$ of $[1..n]$ such that $\sum_{i=1}^n d_{\pi(i),\pi(i+1)}$ (where $\pi(n+1) = \pi(1)$) is as small as possible.

Its well-known NP-complete version is the following (TSPDECISION): If a nonnegative integer bound $B$ (the traveling salesman’s “budget”) is given along with the distances, does there exist a tour of all the cities having total length no more than $B$?

But what about checking that a tour has effectively minimal length? We prove that the problem:

TSPMINDECISION: Given a set of $n$ cities, the distance between all city pairs and a tour $T$, does $T$ visit each city exactly once and does it have minimal length?

is coNP-complete. As a secondary result we prove that given a graph $G$ and a Hamiltonian path, it is NP-complete to check if $G$ contains a Hamiltonian cycle as well.
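For tiny instances TSPMINDECISION can of course be answered by exhaustive search. The sketch below is only that: a brute-force illustration with assumed names (cities numbered from $0$, distances in a square matrix with at least one city, a tour that closes back to its first city), and it takes time factorial in $n$.

```python
from itertools import permutations
from typing import List, Sequence


def tour_length(dist: Sequence[Sequence[int]], tour: Sequence[int]) -> int:
    """Length of the closed tour, returning from the last city to the first."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def visits_each_city_once(n: int, tour: Sequence[int]) -> bool:
    """True if the tour is a permutation of the cities 0..n-1."""
    return sorted(tour) == list(range(n))


def is_minimal_tour(dist: Sequence[Sequence[int]], tour: Sequence[int]) -> bool:
    """Brute-force TSPMINDECISION: compare against every tour from city 0."""
    n = len(dist)
    if not visits_each_city_once(n, tour):
        return False
    # Fixing the first city loses nothing: tours are closed cycles.
    best = min(tour_length(dist, (0,) + rest)
               for rest in permutations(range(1, n)))
    return tour_length(dist, tour) == best


# Four cities on the corners of a square (sides 1, diagonals 2).
dist: List[List[int]] = [
    [0, 1, 2, 1],
    [1, 0, 1, 2],
    [2, 1, 0, 1],
    [1, 2, 1, 0],
]
print(is_minimal_tour(dist, [0, 1, 2, 3]), is_minimal_tour(dist, [0, 2, 1, 3]))
```

A “no” answer has a short certificate (a strictly shorter tour), which is why the problem sits in coNP.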

**UPDATE 2014-03-21**: after publishing the proof on arXiv, it turned out that the same result was proved by Papadimitriou and Steiglitz in [1] (see also [2] Section 19.9). Our proof is slightly different and it may be interesting in and of itself, so we decided not to withdraw the paper.

[1] Christos H. Papadimitriou and Kenneth Steiglitz. 1982. *Combinatorial Optimization: Algorithms and Complexity*. Prentice-Hall, Inc., Upper Saddle River, NJ, USA.

[2] Christos H. Papadimitriou and Kenneth Steiglitz. *On the Complexity of Local Search for the Traveling Salesman Problem*. SIAM J. Comput., 6(1), 76–83, 1977.

The EXACT COVER BY 3-SETS (X3C) problem is:

**Instance**: Set $X = \{x_1,x_2,\ldots,x_{3q}\}$ and a collection $C = \{C_1,\ldots,C_m\}$ of 3-element subsets of $X$.

**Question**: Does $C$ contain an exact cover for $X$, i.e. a subcollection $C' \subseteq C$ such that every element of $X$ occurs in exactly one member of $C'$?

X3C is NP-complete [1], and as shown in [2] it remains NP-complete even if every element $x_i$ is contained in exactly 3 subsets of $C$ (*Restricted Exact Cover by 3-Sets*, RX3C).

We proved that it remains NP-complete even if every pair of subsets in $C$ shares at most one element, i.e. for all $i \neq j,\; | C_i \cap C_j | \leq 1$; we call this restricted version *SINGLE OVERLAP RX3C*.
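The three conditions (exact cover, every element in exactly 3 subsets, pairwise overlap at most 1) are easy to state in code. The sketch below uses assumed function names and a brute-force search, so it is only suitable for toy instances; the example instance is built by hand from the rows, columns and one family of diagonals of a $3 \times 3$ grid.

```python
from itertools import combinations
from typing import List, Optional, Sequence, Tuple

Triple = Tuple[int, int, int]


def is_restricted(universe: Sequence[int], collection: Sequence[Triple]) -> bool:
    """RX3C condition: every element occurs in exactly 3 subsets."""
    return all(sum(x in s for s in collection) == 3 for x in universe)


def has_single_overlap(collection: Sequence[Triple]) -> bool:
    """Every pair of subsets shares at most one element."""
    return all(len(set(a) & set(b)) <= 1 for a, b in combinations(collection, 2))


def find_exact_cover(universe: Sequence[int],
                     collection: Sequence[Triple]) -> Optional[List[Triple]]:
    """Brute force: try every subcollection of q = |X|/3 triples."""
    q, remainder = divmod(len(universe), 3)
    if remainder:
        return None
    for choice in combinations(collection, q):
        covered = sorted(x for s in choice for x in s)
        if covered == sorted(universe):   # each element exactly once
            return list(choice)
    return None


universe = list(range(1, 10))
collection: List[Triple] = [
    (1, 2, 3), (4, 5, 6), (7, 8, 9),   # rows
    (1, 4, 7), (2, 5, 8), (3, 6, 9),   # columns
    (1, 5, 9), (2, 6, 7), (3, 4, 8),   # wrapped diagonals
]
print(is_restricted(universe, collection), has_single_overlap(collection))
print(find_exact_cover(universe, collection))
```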

- Link to the original question on cstheory
- Printed page with the question and a sketch of the proof (NPC_Pill_001_restriction_of_exact_cover_by_3_sets.pdf)

[1] M. R. Garey, David S. Johnson: Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman 1979, ISBN 0-7167-1044-7.

[2] Teofilo F. Gonzalez: Clustering to Minimize the Maximum Intercluster Distance. Theor. Comput. Sci. 38: 293-306 (1985).

]]>