… nearly 42 …

Noodling with TCS: PCP variant

admin — Sun, 20 Dec 2020 21:52:19 +0000

On my favourite TCS site cstheory.stackexchange.com I found a simple question about the Post Correspondence Problem:

If the upper and lower words of each domino must have different lengths, is the problem still undecidable? (we call this variant $PCP^{\neq}$).

The answer is yes, it remains undecidable …

A simple reduction from standard $PCP$ to $PCP^{\neq}$ is the following:

Suppose that we have $n$ dominos $[\alpha_1 / \beta_1], [\alpha_2 / \beta_2], \ldots, [\alpha_n / \beta_n]$ with $\alpha_i, \beta_i \in \Sigma^*$.

We expand the alphabet $\Sigma$ to a new alphabet $\Sigma'$ in which each symbol $a_i \in \Sigma$ is represented by $m$ distinct new symbols $\{a_{i_1},a_{i_2},\ldots,a_{i_m} \}$, where $m$ is the first odd number greater than or equal to $n+1$ ($m = 2 \lfloor (n + 1)/ 2 \rfloor + 1$).

We replace each symbol in the words of the original dominos with the sequence of the $m$ corresponding new symbols. For example, if $n = 3$ (so $m = 5$) and the dominos are $D_1 = [a/baa]$, $D_2 = [ab/aa]$, $D_3 = [bba/bb]$, we get the new set of dominos:

$D'_1 = [a_1 a_2 a_3 a_4 a_5/ b_1 b_2 b_3 b_4 b_5 \; a_1 a_2 a_3 a_4 a_5 \; a_1 a_2 a_3 a_4 a_5]$
$D'_2 = [a_1 a_2 a_3 a_4 a_5 \; b_1 b_2 b_3 b_4 b_5 / a_1 a_2 a_3 a_4 a_5 \; a_1 a_2 a_3 a_4 a_5 ]$
$D'_3 = [b_1 b_2 b_3 b_4 b_5 \; b_1 b_2 b_3 b_4 b_5 \; a_1 a_2 a_3 a_4 a_5 / b_1 b_2 b_3 b_4 b_5 \; b_1 b_2 b_3 b_4 b_5 ]$

Finally we transform each domino $D'_i$ in which the upper and lower words have the same length into two dominos $D^L_i, D^R_i$. We take the first sequence of length $m$ (which represents the first symbol of the original word) in the upper word of $D'_i$ and split it at position $i$; we take the first sequence of length $m$ in the lower word and split it at position $m-i$. Since $m$ is odd, $i \neq m-i$, so in both $D^L_i$ and $D^R_i$ the upper and lower words have different lengths. For example, domino $D'_2$ becomes:

$D^L_2 = [a_1 a_2 / a_1 a_2 a_3 ]$
$D^R_2 = [ a_3 a_4 a_5 \; b_1 b_2 b_3 b_4 b_5 / a_4 a_5 \; a_1 a_2 a_3 a_4 a_5 ]$

Also note that the split point is different for each domino $D'_i$, so there is no way to “reconstruct” the same upper/lower subwords using a combination of other dominos; in other words each domino $D^L_i$ must be followed by $D^R_i$.

So the original $PCP$ problem has a solution if and only if the corresponding $PCP^{\neq}$ has a solution; so $PCP^{\neq}$ is undecidable.
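The construction above can be sketched in a few lines of Python. This is only an illustration under some assumptions: the names `expand` and `pcp_to_pcp_neq` are made up for this sketch, the new symbol $a_k$ is modelled as the pair `(a, k)`, and every word is assumed to be non-empty.

```python
def expand(word, m):
    """Replace each symbol c by the m new symbols (c, 1), ..., (c, m)."""
    return [(c, k) for c in word for k in range(1, m + 1)]


def pcp_to_pcp_neq(dominos):
    """Map a list of (upper, lower) words to dominos with unequal lengths."""
    n = len(dominos)
    m = 2 * ((n + 1) // 2) + 1  # first odd number >= n + 1
    result = []
    for i, (upper, lower) in enumerate(dominos, start=1):
        u, l = expand(upper, m), expand(lower, m)
        if len(u) != len(l):
            result.append((u, l))            # already in PCP^{!=} form
        else:
            result.append((u[:i], l[:m - i]))  # D^L_i
            result.append((u[i:], l[m - i:]))  # D^R_i
    return result


new_dominos = pcp_to_pcp_neq([("a", "baa"), ("ab", "aa"), ("bba", "bb")])
```

On the example with $n = 3$, only $D_2$ is split, and `new_dominos[1]` is the pair corresponding to $D^L_2 = [a_1 a_2 / a_1 a_2 a_3]$.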

Final stuff: … don’t forget that the PCP variant in which all tiles have equal length is decidable.

Final stuff 2: I also found a nice relation between PCP and Context Free palindromes, but it happens that someone else had the same insight before me; see “Post’s Correspondence Problem PCP is about context-free grammars” by Wim H. Hesselink.

Open question: the above reduction is simple but it could probably be further simplified and optimized. Perhaps there is also a simpler reduction … let me know.

Cellular Automata variant

admin — Sun, 11 Oct 2020 22:11:53 +0000

This is a simple note on Turing completeness of 2-neighbourhood 1-dimensional Cellular Automata.

A Cellular Automaton (pl. Cellular Automata) is a model of computation based on a grid of cells that evolve according to a simple set of rules. The grid can be multi-dimensional; the most famous and studied cellular automata are 1-dimensional and 2-dimensional. Each cell is in one of finitely many states ($k$-states CA); at each step of the evolution (generation) the cell changes its state according to its current state and the states of its neighbouring cells. The number of neighbours considered is finite; for a 1-dimensional CA, the usual choice is the adjacent neighbours. An $m$-neighbourhood 1-dimensional CA is a CA in which the next state of cell $c_i$ depends on the state of $c_i$ and the states of the $m-1$ adjacent cells; usually $c_i$ is the central cell.

Formally, if $t_n(c_i)$ is the state of cell $c_i$ at time $t_n$:

$$t_{n+1}(c_i) = f( t_n(c_{i-\ell}), …, t_n(c_{i-2}), t_n(c_{i-1}), t_n(c_i), t_n(c_{i+1}), t_n(c_{i+2}), …, t_n(c_{i+\ell}) )$$

and $m = 2\ell+1$.

For more details and basic references see: Das D. (2012) A Survey on Cellular Automata and Its Applications.

Even basic cellular automata can be Turing complete; see for example the (controversial) proof of Turing completeness of the 3-neighbourhood Cellular Automaton identified by Rule 110.

If only 1 neighbour is considered (i.e. the next state of a cell depends only on its current state), then we cannot achieve Turing completeness. But with a 2-neighbourhood and enough states it’s possible to emulate every 1-dimensional CA.

We give a simple proof by showing how a 2-states 3-neighbourhood 1-dimensional CA $A_{CA}$ can be simulated by a 6-states 2-neighbourhood 1-dimensional CA $B_{CA}$. Without loss of generality we will assume that in $B_{CA}$ the state of cell $c_i$ at time $n+1$ depends only on the state of cell $c_i$ at time $n$ and the state of the cell $c_{i+1}$ on its right at time $n$:

$$t_{n+1}(c_i)=f( t_n(c_i), t_n(c_{i+1}) )$$

The idea is to spend one generation to “stack” the state of the adjacent cell and then apply the original rule.

The 6 states are: $(0,*), (1,*), (0,0),(0,1),(1,0),(1,1)$

Initially, at generation 1, the cells are only in state $(0,*)$ or $(1,*)$, which represent the $0,1$ states of the original configuration of the simulated CA $A_{CA}$. We add the following rules that stack the current state of the cell $c_i$ and the current state of the adjacent cell $c_{i+1}$:

These rules are applied and lead to generation 2 of $B_{CA}$. Now every cell $c_i$ is able to “see” its current state, the current state of cell $c_{i+1}$ and the current state of cell $c_{i+2}$ and we can treat it as the original cell $c_{i+1}$ of the simulated CA and we can apply the corresponding 3-neighbourhood rule.

For every rule of the original CA $(x,y,z) \to y’$ we add the following rules:

In this way at generation 3 of $B_{CA}$, the state of cell $c_i$ is the state of generation 2 of cell $c_{i+1}$ in the simulated CA; i.e. informally the configuration is the same as that of the simulated CA, but shifted to the left by one cell.
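The two-phase simulation can be checked with a short Python sketch. The two rule families used here, $(x,*),(y,*) \to (x,y)$ for stacking and $(x,y),(y,z) \to (y',*)$ for applying an original rule $(x,y,z) \to y'$, are my reading of the description above, not a transcription of the rule tables; the cyclic tape and the function names are also assumptions made for the sketch.

```python
STAR = "*"  # second component of a cell that holds a single, unstacked bit


def rule110(x, y, z):
    """Rule 110 as a lookup on the 3-bit neighbourhood (x, y, z)."""
    return (110 >> (x << 2 | y << 1 | z)) & 1


def step_a(cells):
    """One generation of the simulated 3-neighbourhood CA (cyclic tape)."""
    n = len(cells)
    return [rule110(cells[i - 1], cells[i], cells[(i + 1) % n]) for i in range(n)]


def step_b(cells):
    """One generation of the 6-state CA: cell i only reads cells i and i+1."""
    n = len(cells)
    out = []
    for i in range(n):
        (a, b), (c, d) = cells[i], cells[(i + 1) % n]
        if b == STAR:
            out.append((a, c))                    # (x,*),(y,*) -> (x,y)
        else:
            out.append((rule110(a, b, d), STAR))  # (x,y),(y,z) -> (y',*)
    return out


def simulate(config, rounds):
    """True if, after every 2 steps of B, B equals A shifted left by one more cell."""
    n = len(config)
    a = list(config)
    b = [(x, STAR) for x in config]
    for k in range(1, rounds + 1):
        a = step_a(a)
        b = step_b(step_b(b))
        if [s for s, _ in b] != [a[(i + k) % n] for i in range(n)]:
            return False
    return True
```

Calling `simulate(config, rounds)` on any list of bits compares the two automata generation by generation, taking the accumulated left shift into account.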

The same technique can be applied to simulate more complex CAs:

Theorem 1: A $k$-states $d$-neighbourhood 1-dimensional CA can be simulated by a $(k + k^{d-1})$-states 2-neighbourhood 1-dimensional CA.

Note that the simulation is only slowed by a constant factor: the $n(d-1)$-th generation of the 2-neighbourhood CA is equivalent to the $n$-th generation of the simulated $d$-neighbourhood CA. We can simulate Rule 110, so we can also conclude:

Corollary 2: There exists a 6-states 2-neighbourhood 1-dimensional CA which is (weakly) Turing Complete.

The image below represents the evolution of the 6-states 2-neighbourhood CA that simulates Rule 110 (on the left) and the corresponding evolution of the original Rule 110 (on the right); note that odd generations are the same, only shifted to the left.

Open problem: we didn’t dig too much in the literature, but we think no one has studied the intermediate cases: $k$-states 2-neighbourhood 1D CA for $1 < k < 6$, perhaps some of them could be Turing complete. For $k=2$ (two states) the CA behaviour is simple, but for $k=3$ there are interesting examples that seem to fit in Class-4 of the classification presented in Stephen Wolfram’s A New Kind of Science, and perhaps they could be universal. For example the 3-states 2-neighbourhood CA $(1,2,2,0,1,0,2,2,1)$ exhibits the following behaviour (starting from a configuration in which all cells are 0 except three of them):

They exist but you cannot catch ’em

admin — Mon, 31 Aug 2020 22:19:48 +0000

(“A few lines where incompressibility meets unprovability”)

The Kolmogorov Complexity $K(x)$ of a string $x$ relative to a universal Turing machine $U$ is the length of the shortest program $p$ that “prints” $x$:

$$K(x) = \min\{ |p| \mid U(p) = x \}$$

A string $x$ is incompressible if $K(x) \geq |x|$. Assuming a binary alphabet $\Sigma = \{0,1\}$, for each $n \geq 1$, there are $2^n$ strings of length $n$, but there are only $2^n-1$ programs shorter than $n$, so there is at least one incompressible string among them. And it follows immediately that there are infinitely many incompressible strings (they exist …).
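For completeness, the count of short programs is just a geometric sum over the possible program lengths $0, 1, \ldots, n-1$:

$$\sum_{i=0}^{n-1} 2^i = 2^n - 1 < 2^n$$

so, by the pigeonhole principle, at least one of the $2^n$ strings of length $n$ is not printed by any program shorter than $n$.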

Can we catch some of them? … No! Indeed if we are reasoning in a formal theory $T$ that is powerful enough to formalize Turing machines and the notion of compressibility – e.g. Peano Arithmetic – we have:

Theorem 1: There exists $n$ such that for all strings $x$ such that $|x| \geq n$ the statement “$K(x) \geq |x|$” (i.e. $x$ is incompressible) is unprovable in $T$.

Proof: Suppose that there are infinitely many strings $x$ such that there is a proof of “$K(x) \geq |x|$”. We can build a program $p$ that enumerates all valid proofs of $T$ and whenever it finds a proof of “$K(x_i) \geq |x_i|$” for some $x_i$, it compares $|x_i|$ with $|p|$ (by the recursion theorem we can build a program that knows its own length), and if $|p| < |x_i|$ then $p$ halts and prints $x_i$. Since there are provably incompressible strings of arbitrary length, $p$ eventually halts. So $T$ proves that $x_i$ is incompressible, but $p$ is a program shorter than $|x_i|$ which prints $x_i$, a contradiction.

Note that Theorem 1 is not provable in $T$ ! … we need $T + Con(T)$ to prove it, because no powerful enough theory can prove its own consistency or prove that some sentence is unprovable.

Box Shift Puzzle

admin — Wed, 11 Dec 2019 22:52:30 +0000

While thinking about simple “puzzles” that seem hard at first glance, but do not have enough rules and structural constraints to make it easy to prove that they are NP-complete, I designed the following game (but perhaps it already has a name … let me know if you know it):

an $N \times N$ grid contains $S \times S$ blue boxes in the upper left area, the home area; each box occupies a cell;
a column (or row) that contains at least one box is picked at random and it is shifted downward (or rightward). If a box exits from one border it re-enters on the opposite side;
the random shift is repeated maxmoves times (e.g. $maxmoves = S^2$);
the aim of the game is to repack the boxes in the upper left home area, using at most maxmoves upward column shifts or leftward row shifts.
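The scrambling phase can be sketched in Python as follows. This is a minimal sketch, assuming that each shift moves a line by exactly one cell and that the shifted row or column is drawn uniformly among those containing a box; the function names are hypothetical and are not taken from the javascript version.

```python
import random


def make_grid(n, s):
    """N x N grid with an S x S block of boxes (1) in the upper-left corner."""
    return [[1 if r < s and c < s else 0 for c in range(n)] for r in range(n)]


def shift_column_down(grid, c):
    """Shift column c down by one cell, wrapping around the bottom border."""
    n = len(grid)
    column = [grid[r][c] for r in range(n)]
    for r in range(n):
        grid[r][c] = column[r - 1]  # column[-1] wraps to the last row


def shift_row_right(grid, r):
    """Shift row r right by one cell, wrapping around the right border."""
    grid[r] = grid[r][-1:] + grid[r][:-1]


def scramble(grid, maxmoves, rng):
    """Apply maxmoves random shifts to lines that contain at least one box."""
    n = len(grid)
    moves = []
    for _ in range(maxmoves):
        cols = [("col", c) for c in range(n) if any(grid[r][c] for r in range(n))]
        rows = [("row", r) for r in range(n) if any(grid[r])]
        kind, index = rng.choice(cols + rows)
        if kind == "col":
            shift_column_down(grid, index)
        else:
            shift_row_right(grid, index)
        moves.append((kind, index))
    return moves
```

For example, `scramble(make_grid(8, 4), 16, random.Random())` produces a starting position for a game with a 4×4 home area; the returned list of moves, undone in reverse order, is one solution within `maxmoves`.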

A simple javascript version of the game can be played here.

Can the blue boxes be packed in the upper-left 4×4 yellow area using at most 16 moves?

The rules are simple, but even in a small game with a 4×4 home area it’s hard to find the correct shift sequence …

The Box Shift Puzzle could be analyzed from a computational complexity perspective:

Input: an $N \times N$ grid containing $S \times S$ boxes, and an integer $M$ represented in unary.

Question: can the $S \times S$ boxes be packed in the upper-left corner (home area) using at most $M$ column or row shifts?
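For very small instances the question can be answered by exhaustive search. The sketch below is a plain breadth-first search over configurations (exponential in general, so it says nothing about the complexity of the problem); grids are tuples of tuples of 0/1, and all names are hypothetical.

```python
def shift_column_up(grid, c):
    """Return a copy of grid with column c shifted up by one cell (wrapping)."""
    n = len(grid)
    return tuple(
        tuple(grid[(r + 1) % n][k] if k == c else grid[r][k] for k in range(n))
        for r in range(n)
    )


def shift_row_left(grid, r):
    """Return a copy of grid with row r shifted left by one cell (wrapping)."""
    return tuple(row[1:] + row[:1] if i == r else row for i, row in enumerate(grid))


def is_home(grid, s):
    """True if the boxes are exactly the S x S upper-left block."""
    n = len(grid)
    return all(grid[r][c] == int(r < s and c < s) for r in range(n) for c in range(n))


def solvable(grid, s, m):
    """Can the boxes be packed in the home area using at most m shifts?"""
    n = len(grid)
    seen = {grid}
    frontier = [grid]
    for depth in range(m + 1):
        if any(is_home(g, s) for g in frontier):
            return True
        if depth == m:
            break
        next_frontier = []
        for g in frontier:
            for k in range(n):
                for h in (shift_column_up(g, k), shift_row_left(g, k)):
                    if h not in seen:
                        seen.add(h)
                        next_frontier.append(h)
        frontier = next_frontier
    return False
```

With a single box in the middle of a 3×3 grid, `solvable(((0, 0, 0), (0, 1, 0), (0, 0, 0)), 1, m)` is false for `m = 1` and true for `m = 2`.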

Open problem: what is the complexity of the Box Shift Puzzle? Is it NP-complete?

Also the following variants (or any combination of them) seem interesting:

V1. in addition to the initial configuration, a target configuration is given and the problem is to decide if the initial configuration can be transformed into the target configuration using at most $M$ moves;
V2. one of the sides of the grid is fixed (the size of the grid is $N \times k$);
V3. the columns (resp. rows) can be moved in both up-down directions (resp. left-right);
V4. the boxes (cells) could be colored with colors in $[1..c]$.

… I’ll spend some time on it, if I find something interesting I’ll update this page.

A funny way to prove that the set of primes is not regular

admin — Tue, 30 Apr 2019 19:38:55 +0000

It is well known that $\text{Primes}= \{ a^p \mid p \text{ is prime}\}$ is not regular. The standard proof uses the pumping lemma for regular languages, but you can also use Parikh’s theorem or the Myhill-Nerode theorem: see this question on cs.stackexchange.com. I tried to figure out another way to prove it, and came up with an alternate proof that uses a bit of number theory and Busy Beavers.

First of all: Theorem 1 (Prime gap theorem). There are gaps between primes that are arbitrarily large.

Another easy property of DFAs is:

Property 2. Given a DFA $A = \{ Q, \Sigma, \delta, q_0, F \}$, if $w = uv$ and the state of $A$ on input $w$ after scanning $u$ is $q_i$ (when the head is at the beginning of subword $v$), and $A' = \{ Q, \Sigma, \delta, q_i, F \}$, then $w \in L(A)$ if and only if $v \in L(A')$.
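Property 2 is easy to check experimentally. The following sketch uses a hypothetical unary DFA that accepts $a^k$ with $k \equiv 2 \pmod 3$ (chosen only for illustration; it has nothing to do with primes) and compares $A$ on $w = uv$ with $A'$ on $v$.

```python
def run(delta, state, word):
    """Return the state reached from `state` after reading `word`."""
    for symbol in word:
        state = delta[(state, symbol)]
    return state


def accepts(delta, start, finals, word):
    """True if the DFA with the given start state accepts `word`."""
    return run(delta, start, word) in finals


# Hypothetical example DFA: states 0, 1, 2 count the input length modulo 3.
delta = {(q, "a"): (q + 1) % 3 for q in range(3)}
finals = {2}


def property2_holds(u, v):
    """Check that A accepts uv iff A' (same DFA started in q_i) accepts v."""
    q_i = run(delta, 0, u)  # state of A after scanning u
    return accepts(delta, 0, finals, u + v) == accepts(delta, q_i, finals, v)
```

The check holds for every split $w = uv$ because the run of $A$ on $w$ is, by definition of $\delta$, the run on $u$ followed by the run on $v$ from $q_i$.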

We can combine them in a (funny) proof that “uses” Busy Beavers:

Theorem 3. $\text{Primes}= \{ a^p \mid p \text{ is prime}\}$ is not regular.

Proof:

suppose that you have a $DFA$ $A$ that accepts the primes $\{ a^p\mid p \text{ prime} \}$
given a state $q$ of $A$, you can build a Turing machine $M_{\langle A,q \rangle}$ that sequentially simulates $A$ starting from state $q$ on inputs $a^1, a^2, a^3, a^4, \ldots$ until $A$ accepts some $a^k$ (or never halts)
let $|M_{\langle A,q \rangle}| = n$ be the size of such a Turing machine, and $BB(n)$ the maximum number of steps achievable by a halting Turing machine of size $n$ (uncomputable)
by the prime gap theorem, there exists a prime $p_i$ such that $p_{i+1} - p_i \gg BB(n)$
$A$ accepts $a^{p_{i+1}}$, so let $q_i$ be the state of $A$ on input $a^{p_{i+1}}$ after $p_{i}$ steps (i.e. it has scanned the first part $a^{p_i}$ and the head is at the beginning of the remaining part of the input, $a^{p_{i+1} - p_{i}}$)
so there exists $M_{\langle A,q_i\rangle}$ of size $n$ that by construction will run for a number of steps greater than $p_{i+1} - p_i \gg BB(n)$ and halt (Property 2), contradicting the hypothesis that $BB(n)$ is the maximum number of steps achievable by a halting TM of size $n$

$\square$

With the same technique, we could also use a Kolmogorov Complexity argument instead of Busy Beavers: pick a large enough incompressible binary string $x$ and let $p' - p$ be a large enough prime gap such that $p' - p > x$. If there was a DFA $A$ that recognizes $\text{Primes}$, we could reconstruct $x$ using only the description of $A$ and its state after scanning the first part $a^{p'-x}$ of $a^{p'}$, i.e. $K(x) \leq |A| + c \ll |x|$.

Open problem: are there other funny proofs that the set $\text{Primes}$ is not regular, out there?

Minimizing the COL4 to COL3 reduction

admin — Fri, 29 Mar 2019 21:16:18 +0000

I just bought the nice book “Problems with a POINT – Exploring Math and Computer Science” by William Gasarch and Clyde Kruskal. The book explains many nice mathematical and theoretical computer science problems that are “easy-to-understand-but-not-so-easy-to-solve”, and most of them solicit the reader’s curiosity and invite him to further explore the related topics (plenty of references are given at the end of each chapter).

Chapter 23 explains a sane reduction from COL4 (the problem of deciding whether a graph is 4-colorable) to COL3 (the problem of deciding whether a graph is 3-colorable). Both are well known NP-complete problems. I started reading the chapter, but suddenly stopped and tried to figure out a reduction by myself.

The idea presented in the book is to replace each node of the graph $G$ of the COL4 problem with four nodes, force them to be colorable with colors F and T (the third color R is not allowed), force exactly one of them to be T, and finally add a gadget that prevents two adjacent quadruples from having two corresponding nodes colored with T.

I came up with a very similar reduction, but the gadgets involved are slightly smaller: indeed a node with 4 possible colors can be represented “in binary” with two nodes, each one colorable with T or F (two bits).

The construction to build a graph $G’$ that is 3-colorable if and only if the original graph $G$ is 4-colorable is the following.

Every node $n_i$ of the original graph $G$ is replaced with two nodes $n_i’$ and $n_i”$. We add a common triangle with three new nodes and an edge between its “Red” $R$ node and each $n_i’$ and $n_i”$, in this way all $n_i’, n_i”$ must be colored $T$ or $F$. Note that we simply label “red” the color assigned to the vertex of the common triangle that we connect to the nodes $n_i’, n_i”$ (in a valid coloring R,F,T are interchangeable) .

We replace each edge $e_h = (n_i, n_j)$ of $G$ with the edge gadget shown in the figure.

Edge gadget.

The edge gadget that replaces the original edge $e_h = (n_i, n_j)$ acts as a comparator between the two sides: if $n_i'$ and $n_j'$ (resp. $n_i''$ and $n_j''$) have the same color, then one of the two inner nodes that connect them in the edge gadget must be red. As a consequence, if both $n_i' = n_j'$ and $n_i'' = n_j''$ then no node of the central triangle can be colored with red; note that red is already forbidden in the middle node of the central triangle because it is connected to the common $R$.

Allowed/forbidden 3-colorings of the edge gadget.

So the edge gadget is 3-colorable if and only if $(n_i' \neq n_j') \lor (n_i'' \neq n_j'')$. But the colors of $n_i', n_i''$ represent the two bits of one of the 4 colors, so the edge gadget is 3-colorable if and only if in the original graph we can assign to $n_i$ and $n_j$ two different colors among the 4 available.
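The binary encoding behind this argument can be spelled out in a few lines of Python. This is only a sketch of the encoding, not of the gadget itself (which is given in the figure); the mapping of the four colors to the integers 0..3 and the function names are assumptions.

```python
def bits(color):
    """Two-bit encoding (n', n'') of a color in {0, 1, 2, 3}: True = T, False = F."""
    return bool(color & 2), bool(color & 1)


def edge_ok(color_i, color_j):
    """Condition enforced by an edge gadget: the encodings differ in some bit."""
    (hi_i, lo_i), (hi_j, lo_j) = bits(color_i), bits(color_j)
    return hi_i != hi_j or lo_i != lo_j


def is_proper_4_coloring(edges, coloring):
    """True if every edge joins nodes whose two-bit encodings differ."""
    return all(edge_ok(coloring[u], coloring[v]) for u, v in edges)
```

Two colors are different exactly when their encodings differ in at least one of the two bits, which is the condition each edge gadget tests.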

We can conclude that the whole resulting graph $G’$ can be 3-colored if and only if the original graph $G$ can be 4-colored.

A similar technique could be applied to the reduction $COL_a \leq COL_b$ when $a = 2^m$ expanding the edge gadget.

Open problem: is there a simpler $COL4 \leq COL3$ reduction ?

MiniZinc plays Dragster

admin — Tue, 18 Dec 2018 11:17:39 +0000

What happens if a modern constraint satisfaction and optimization program decides to play a Video Game Classic?

I’m a fan of old Video Games (80s games) and recently I found an interesting debate about the world record of the Atari’s videogame Dragster by Activision.

T.R. [I’m polite :-)] claimed to have achieved a time of 5.51 in Dragster in 1982, and after 35 years nobody has been able to replicate such a time: the best time achieved by a human player is 5.61, and with a Tool-Assisted Speedrun (TAS) is 5.57.

Omnigamer disassembled the original cartridge of the game and analyzed the code. He came to the following conclusion: 5.51 is not achievable with normal gameplay (assuming that the hardware behavior is consistent with the technical specifications). Even 5.54, claimed to be the best possible time by Activision’s game developers, seems impossible. Twin Galaxies decided to ban T.R. and cancel his world record.

Omnigamer made a publicly available interactive spreadsheet with the rules of the game; if you change the values of “Gas” and “Shift” in the spreadsheet, you get the exact behaviour of the game frame by frame. For more details about the game model see the first page of the spreadsheet or watch this detailed video explanation.

I decided to write a MiniZinc program (MiniZinc is a powerful and open-source constraint modeling language) to simulate the gameplay and see what happens. Note that MiniZinc implicitly tries ALL POSSIBLE combinations of Gas, Shift, starting Tach and Starting Frame.

In short, the “computer-verified” results are:

MiniZinc correctly played the game achieving 5.57;
The maximum total distance achievable is 97.3828125 (=24930 in-game distance);
A time < 5.57 is not possible;
If one could start from 2nd gear, then 5.51 would be achievable.

These are the definitive (and already known) results about Dragster unless the game model is wrong (I myself spent some time on the disassembly of the game and it seems correct).
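The two distance figures quoted above differ by a factor of 256. That factor is inferred from the numbers themselves ($24930 / 256 = 97.3828125$), not taken from the game model, so the constant below is an assumption; the helper name is hypothetical.

```python
from fractions import Fraction

SUBUNITS_PER_UNIT = 256  # assumption: inferred from 24930 / 256 = 97.3828125


def displayed_distance(internal_distance):
    """Convert an in-game (internal) distance to the displayed distance."""
    return Fraction(internal_distance, SUBUNITS_PER_UNIT)
```

Using `Fraction` keeps the conversion exact, so `displayed_distance(24930)` can be compared with the quoted value without rounding issues.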

Here you can download and play with the MiniZinc Dragster source code (remember to choose the built-in Chuffed solver in order to get a solution in a reasonable time):

Download dragster.mzn source code

This is the sequence of Shift/Gas moves generated by MiniZinc that achieves a “regular” 5.57 (and max distance 97.38):

STARTFRAME = 12;
STARTTACH = 24;
Shift=array1d(0..167,[
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,
0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,
0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,
0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]);
Gas=array1d(0..167,[
0,0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,0,1,1,1,1,1,1,1,1,1,
0,0,1,1,1,1,0,1,1,1,0,0,1,1,0,1,1,1,1,1,1,1,0,1,1,1,0,1,1,1,
1,1,0,0,1,0,1,0,0,1,0,0,0,1,1,1,1,1,1,1,1,0,1,1,0,0,1,0,1,1,
1,0,0,0,1,1,1,1,1,0,1,1,0,0,0,1,1,1,1,1,1,0,0,1,1,1,0,0,0,1,
0,1,0,0,1,0,0,1,0,1,1,1,1,1,0,1,1,1,0,0,1,1,0,1,0,1,0,1,0,1,
0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,0]);

It took MiniZinc about 7 minutes to calculate the optimal run.

If we allow MiniZinc to start from 2nd gear, then this is a sequence of moves that achieves a “cheated” 5.51 time and max dist ~97.25 (internal dist. 24898):

% Change 
% constraint GEAR[0] = 2;
% in the source code!!!
STARTFRAME = 0;
STARTTACH = 30;
Shift=array1d(0..165,[
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,1,0,0,0,0,
0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,1,0,0,0,
0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,1,0,1,1,1,1,1,1,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0]);
Gas=array1d(0..165,[
1,1,1,0,1,1,1,1,1,1,1,0,1,1,1,0,1,0,1,1,1,1,1,1,1,1,1,1,1,1,
1,0,1,1,1,1,1,1,1,1,1,0,0,1,1,1,0,0,1,1,1,0,1,1,1,1,1,0,1,0,
1,0,1,1,1,1,1,1,0,1,1,1,1,1,0,1,1,1,0,1,1,1,0,1,0,1,1,1,1,1,
1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,0,1,1,1,1,1,0,1,1,1,1,0,1,1,0,
0,1,1,1,0,1,0,1,1,0,0,1,0,1,1,1,0,0,1,1,0,1,0,1,0,1,0,1,0,1,
0,1,0,1,0,1,0,1,0,1,0,1,0,1,0,1]);

… if we fill the spreadsheet with these values after setting Initial gear = 2, we indeed get a “cheated” 5.51:

The full output of the MiniZinc runs is here: results.txt.

The Power of One-State Turing Machines

admin — Wed, 17 Jan 2018 20:45:55 +0000

A one-state Turing machine is a very weak device: it has no internal memory and it cannot even recognize the trivial language $L = \{1\}$. Its transition function is a simple map $\delta : \Sigma \to \Sigma \times \{L,R\}$, i.e. given the symbol under its head it can rewrite it with another one and move left or right, but the state remains the same. Nevertheless it can use its ability to write on the tape to “gain” some memory; in particular in each cell $C_i$ of the tape it can store:

the number $n$ of times the head visited $C_i$ modulo a fixed constant $k$;
if it has entered $C_i$ from the left or from the right.

This information is enough to build $k$-ary counters and also a sort of comparator between numbers written in different bases. So, some one-state Turing machines can “recognize” languages that are not Context-Free (the quotes are due to the different accept/reject conditions that are necessary because we cannot distinguish between accepting and non-accepting states).
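To experiment with such machines, a one-state Turing machine can be simulated in a few lines of Python. This is a minimal sketch: the halting convention (the machine stops when no transition is defined for the symbol under the head), the blank symbol `_` and the step bound are assumptions made for the sketch, and they are not necessarily the accept/reject conditions used in the paper.

```python
def run_one_state(delta, tape, head=0, blank="_", max_steps=1000):
    """Run a one-state TM whose delta maps a symbol to (new symbol, 'L' or 'R').

    Returns (steps, tape contents) when the machine halts on a symbol with
    no transition, or None if it is still running after max_steps steps.
    """
    cells = dict(enumerate(tape))
    steps = 0
    while steps < max_steps:
        symbol = cells.get(head, blank)
        if symbol not in delta:  # assumed halting condition
            if not cells:
                return steps, ""
            lo, hi = min(cells), max(cells)
            contents = "".join(cells.get(i, blank) for i in range(lo, hi + 1))
            return steps, contents.strip(blank)
        written, move = delta[symbol]
        cells[head] = written
        head += 1 if move == "R" else -1
        steps += 1
    return None


# A trivial machine that rewrites every a as b while moving right.
result = run_one_state({"a": ("b", "R")}, "aaa")
```

Here `result` is `(3, "bbb")`: the machine rewrites the three cells, reaches the first blank and stops because no transition is defined for it.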

click here to download the paper

Busy Beavers are (almost) incompressible

admin — Wed, 06 Sep 2017 19:58:06 +0000

After a long time, here we are again … let’s resume with a small (and rather trivial) post …

Busy Beavers are almost incompressible! … easy fact, but I didn’t find it anywhere.

Let $\sigma(M)$ be the number of $1$s left on the tape after the execution of the Turing machine $M$ (the tape is initially blank). A Turing machine $M$ of length $n$ is a Busy Beaver if, for all Turing machines $M'$ of length $n$, we have $\sigma(M') \leq \sigma(M)$; and we define the busy-beaver function $\Sigma: \mathbb{N} \to \mathbb{N} $ as $\Sigma(n) = \sigma(M)$. Note that our definition of “Busy Beaver” is slightly different from the standard one: we consider the length of the Turing machines in a fixed reasonable encoding scheme, not the number of states.

Suppose that a Turing machine $B$ of length $n$ is a Busy Beaver. If $p$ is the smallest Turing machine that outputs a representation of $B$ when executed on a blank tape ($U(p) = B$) we can build a Turing machine $p’$ that extends the behaviour of $p$ in this way:

mirror the execution of $p$ until it halts;
then simulate the execution of the Turing machine represented by the output of $p$ until it halts (i.e. it simulates the execution of $B$);
and finally find a $0$, overwrite it with a $1$ and halt.

In the simulation of $B$ we can make a one-to-one correspondence between the $i$-th simulated cell $T^B[i]$ and a group of $k$ cells of the actual tape $T[i_1,..,i_k]$ of $p’$ in such a way that if cell $T^B[i] = 1$ during a simulation step, then at least one of $T[i_1],..,T[i_k]$ is $1$.

Hence, by construction, at the end of the whole execution of $p'$, the number of $1$s on the tape is greater than the number left on the tape by $B$: $\sigma(p') > \sigma(B) = \Sigma(n)$.

So we must have $|p'| > n$, but $|p'| = |p| + c' = C(B) + c'$ where $C(B)$ is the Kolmogorov complexity of $B$; and finally:

$$C(B) > n - c'$$

where $c'$ doesn’t depend on $n$ but only on the computational model.

NPC Pill #5: (P)izza =?= (NP)izza

admin — Thu, 12 Nov 2015 20:57:37 +0000

Recently A. Amarilli (a3nm) posted a question on cs.stackexchange.com about the computational complexity of a Test Round problem of the Google France #Hash Code 2015: the “Pizza Regina” problem (March 27th, 2015):

Definition [Pizza Regina problem]

Input: A grid $M$ with some marked squares, a threshold $T\in \mathbb{N}$, a maximal area $A \in\mathbb{N}$

Output: The largest possible total area of a set of disjoint rectangles with integer coordinates in $M$ such that each rectangle includes at least $T$ marked squares and each rectangle has area at most $A$.

The problem can be converted to a decision problem adding a parameter $k \in \mathbb{N}$ and asking:

Question: Does there exist a set of disjoint rectangles satisfying the conditions (each rectangle has integer coordinates in $M$, includes at least $T$ marked squares and has area at most $A$) whose total area is at least $k$ squares?
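Membership in $\mathsf{NP}$ amounts to checking a proposed set of rectangles in polynomial time. A minimal Python sketch of such a check, assuming the grid is a list of rows of 0/1 values (1 = marked square) and each rectangle is given by inclusive corner coordinates `(r0, c0, r1, c1)`; the function name and this representation are hypothetical.

```python
def verify(grid, T, A, k, rects):
    """Check a candidate solution of the Pizza Regina decision problem."""
    rows, cols = len(grid), len(grid[0])
    covered = set()
    total_area = 0
    for r0, c0, r1, c1 in rects:
        if not (0 <= r0 <= r1 < rows and 0 <= c0 <= c1 < cols):
            return False  # rectangle not inside the grid
        cells = {(r, c) for r in range(r0, r1 + 1) for c in range(c0, c1 + 1)}
        if len(cells) > A:
            return False  # area larger than A
        if sum(grid[r][c] for r, c in cells) < T:
            return False  # fewer than T marked squares
        if covered & cells:
            return False  # overlaps a previous rectangle
        covered |= cells
        total_area += len(cells)
    return total_area >= k
```

The check touches each covered cell a constant number of times, so it runs in time polynomial in the size of the grid.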

The problem is clearly in $\mathsf{NP}$, and after struggling a little bit I found that it is $\mathsf{NP}$-hard (so the Pizza Regina problem is $\mathsf{NP}$-complete). This is a sketch of a reduction from MONOTONE CUBIC PLANAR 1-3 SAT:

Definition [1-3 SAT problem]:
Input: A 3-CNF formula $\varphi = C_1 \land C_2 \land … \land C_m$, in which every clause $C_j$ contains exactly three literals: $C_j = (\ell_{j,1} \lor \ell_{j,2} \lor \ell_{j,3})$.
Question: Does there exist a satisfying assignment for $\varphi$ such that each clause $C_j$ contains exactly one true literal?

The problem remains NP-complete even if all literals in the clauses are positive (MONOTONE), if the graph built connecting clauses with variables is planar (PLANAR) and every variable is contained in exactly 3 clauses (CUBIC) (C. Moore and J. M. Robson, Hard tiling problems with simple tiles, Discrete Comput. Geom. 26 (2001), 573-590.).

We use $T=3, A=6$, and in the figures ham is represented with blue boxes (transgenic ham?), pizza with orange boxes.

The idea is to use tracks of ham that carry positive or negative signals. The track is made with an alternation of 1 and 2 pieces of ham placed far enough so that they can be covered exactly by one slice of pizza of area $A$; the segments of the track are marked alternately with $+$ or $-$, and the track will carry a positive signal if slices are cut on the positive segments:

Each variable $x_i$, which is connected to exactly 3 SAT clauses, is represented by three adjacent endpoints of three ham tracks (positive segment), in such a way that there are 2 distinct ways to cut it: one will “generate” a positive signal on all 3 tracks (it represents the $x_i = TRUE$ assignment), the other a negative signal ($x_i = FALSE$). Notice that we can also generate mixed positive and negative signals, but in that case *at least one ham remains uncovered*.

Each clause $C_j$ of the 1-3 SAT formula with 3 literals $\ell_{j,1}, \ell_{j,2}, \ell_{j,3}$ is simply represented by a single ham with three incoming positive segments of three distinct ham tracks; by construction *only one of the three tracks* carrying a positive signal can “cover” the ham-clause.

Finally we can build shift and turn gadgets to carry the signals according to the underlying planar graph and adjust the endpoints:

Suppose that the resulting graph contains $H$ hams. By construction every slice of pizza must contain exactly 3 hams, and in all cases every slice can be enlarged up to area $A$.

If the original 1-3 SAT formula is satisfiable then by construction we can cut $H /3$ pieces of pizza (with total area of $A H / 3$) and no ham remains uncovered.

In the opposite direction, if we can cut $H /3$ pieces of pizza (with total area $A H / 3$) then no ham remains uncovered, and the signals on the variable gadgets and on the clauses are consistent: the ham on the clause is covered by exactly one positive slice coming from a positive variable, and every variable generates 3 positive signals or 3 negative signals (no mixed signals); so the cuts induce a valid 1-3 SAT assignment.

Conclusion: … so unless $\mathsf{(P)izza} =\mathsf{(NP)izza}$, cutting a pizza can be really hard. I would like to thank Antoine for posting the funny question and for spending a bit of time checking my proof.