The k-colourings problem is NP-complete
Statement
Proof
We have already show The k-colourings problem is in NP. So all that is left to show is that it is NP-hard.
We can prove NP-complete using the SAT problem. (Which we have SAT is NP-complete.)
Proof outline.
Suppose we have an instance of the $f$ of the SAT problem with $n$ variables $x_i$ for $1 \leq i \leq n$ and $m$ clauses $c_j$ for $1 \leq j \leq m$. Note we assume each variable appears in at most one literal per clause so the clauses are of size at most $n$. Then we are going to define a graph $G$ for the $n+1$ colouring problem. (Note: If $n \leq 2$, we solve the problem in polynomial time using the 2-SAT algorithm using SCC - so from now on assume $k \geq 3$.)
The graph $G$ has the following vertices:
- A $T$ and $F$ vertex for true and false,
- Base vertices $b_k$ for $1 \leq k \leq n - 1$,
- Variable vertices $x_i$ and $\overline{x_i}$ with $1 \leq i \leq n$, and
- Clause vertices $c_j^k$ with $1 \leq j \leq m$ and $1 \leq k \leq n+1$ Suppose clauses are of size at most.
The graph $G$ has the following edges:
- With $T$, $F$ and $b_k$ for $1 \leq k \leq n-1$ make a clique of size $n+1$,
- For each pair variable vertices $x_i$ and $\overline{x_i}$ connect the to one another i.e. $(x_i, \overline{x_i}) \in E$,
- For each variable vertex $x_i$ and $\overline{x_i}$ connect them to all $b_k$ i.e. $(x_i, b_k), (\overline{x_i}, b_k) \in E$,
- For each clause $c_j$ the vertices $c_j^k$ form a clique of size $n+1$,
- For each clause $c_j = \bigwedge_{k=1}^{a_j} l^k_j$ for some $1 \leq a_j \leq n$ connect $l^k_j$ to $c_j^k$ i.e. $(l^k_j,c^k_j) \in E$ , and
- For $a_j < k \leq n+1$ connect $c^k_j$ to $F$, i.e. $(c^k_j, F) \in E$.
This graph takes $O(n + m)$ to build the graph as we need to iterate through clauses and variables so can be done in polynomial time. We provide graph $G$ to the $n+1$-colouring problem.
If the graph has no $n+1$ colouring, say that $f$ has no assignment. If the graph has an $n+1$ colouring from Claim 1 we know each variable vertex $x_i$ and $\overline{x_i}$ as the same colour as either $T$ or $F$. Return the assignment $a(x_i)$ as true if $c(x_i) = c(T)$ or false otherwise. This takes at most $O(n)$ as we have to loop through the variables checking their assignment - so can be done in polynomial time.
Suppose the constructed colouring problem has a solution. From Claim 1 we have a valid assignment and from Claim 2 one literal in every clauses is assigned True. Therefore, this assignment satisfies $f$.
Suppose we have an assignment $a$ to $x_i$ (where $a(x)$ is 0 if it is False and 1 if it is True) that is a solution to the SAT problem $f$. Then build the following colouring
- Set $c(F) = 1$, $c(T) = 2$, and $c(b_k) = k + 2$,
- Set $c(x_i) = a(x_i) + 1$ and $c(\overline{x_i}) = a(\overline{x_i}) + 1$, and
- Lastly for each clause $c_j$ find literal $l^{k'}_j$ where we have $a(l^{k'}_j)$ is True set the corresponding $c(c_j^{k'}) = 1 = c(F)$ and assign $c(c^{n+1}_j) = 2 = c(T)$ (note as there are at most $n$ literals $k' \not = n+1$) then assign each other $c_j^k$ a different one of the remaining $n-1$ colours.
Lets check the edges to ensure this is a valid colouring:
- As $T$, $F$ and $b_k$ are assigned distinct colours these edges all connect vertices of different colours,
- As $a(x_i) \not = a(\overline{x_i})$ then $c(x_i) \not = c(\overline{x_I})$ and the edge $(x_i, \overline{x_i})$ is respected,
- As $c(x_i), c(\overline{x_i}) \leq 2$ and $c(b_k) \geq 3$ we have these connected edges are respected,
- As each of the clause vertices $c_j^k$ receive a distinct colour the clique edges are respected,
- For each clause $c(l^k_j) \leq 2$ as the only $c(c_j^k) \leq 2$ have $k = k'$ or $k = n+1$ and we have $c(l^{k'}_j) = 2$ the rest of the literals $l^k_j$ connect to vertex with $c(c_j^k) \geq 3$ so these edges are respected, and lastly
- As $c(F) = 1$, and $c(l^k_j) \geq 1$ for all $k \not = k'$ where $k' \leq a_j$ we have all the edges $(c_j^k, F)$ are respected.
Therefore this is a valid $n+1$-colouring of our constructed $G$.
So we have this is a many-one reduction reduction of the SAT problem to the k-colourings problem (graphs). Giving it is NP-complete.
Claim 1
If the constructed graph $G$ has a $n+1$-colouring $c$ then the following holds
- $c(T) \not = c(F)$ moreover $T$, $F$, and $b_k$ all have distinct vertices that use all $n + 1$ colours,
- $c(x_i) \not = c(\overline{x_i})$ for every $1 \leq i \leq n$, and
- $\{c(T), c(F)\} = \{c(x_i), c(\overline{x_i})\}$ for every $1 \leq i \leq n$.
Proof of Claim 1
The first statement holds as $T$, $F$, and $b_k$ are all connected in a clique of size $n+1$.
The second statement holds as $x_i$ and $\overline{x_i}$ are connected.
The last statement holds as $x_i$, $\overline{x_i}$, and $b_k$ are all connected in a clique of size $n+1$ so have different colours from one another. Therefore the colours each $x_i$ and $\overline{x_i}$ are assignment must be the same as that of $T$ and $F$ up to some order.
Claim 2
If the constructed graph $G$ has a $n+1$-colouring $c$ then for each clause $c_j = \bigwedge_{k=1}^{a_j} l^k_j$ for some $1 \leq a_j \leq n$ the following holds:
- The clause vertices $c_j^k$ all have distinct colours that use the $n+1$ colours,
- The clause vertex with the colouring $c(c_{j}^{k'}) = c(F)$ is connected with a literal vertex of the clause $l_j^{k'}$, i.e. $1 \leq k' \leq a_j$, and
- The literal vertex $l_j^{k'}$ has $c(l_j^{k'}) = c(T)$
Proof of Claim 2
The first statement holds as $c_j^k$ form a clique of size $n+1$.
As the vertices $c_j^k$ have every colour there exists a vertex $c_j^{k'}$ where $c(c_j^{k'}) = c(F)$. As vertices $c_j^k$ with $k > a_j$ are connected with vertex $F$ we have $c(c_j^k) \not = c(F)$ for all $k > a_j$. So $k' \leq a_j$ and it is connected to a literal vertex. This shows the second statement.
From Claim 1 we have $c(l^{k'}_j) \in \{c(T), c(F)\}$, however as $l^{k'}_j$ is connected to $c_j^{k'}$ with $c(c_j^{k=1}) = c(F)$ by exclusion we have $c(l^{k'}_j) = c(T$). Giving the final statement.