Cheap and Secure Web Hosting Provider : See Now

[Solved]: How do I find the shortest representation for a subset of a powerset?

, , No Comments
Problem Detail: 

I'm looking for an efficient algorithm for the following problem or a proof of NP-hardness.

Let $\Sigma$ be a set and $A\subseteq\mathcal{P}(\Sigma)$ a set of subsets of $\Sigma$. Find a sequence $w\in \Sigma^*$ of least length such that for each $L\in A$, there is a $k\in\mathbb{N}$ such that $\{ w_{k+i} \mid 0\leq i < |L| \} = L$.

For example, for $A = \{\{a,b\},\{a,c\}\}$, the word $w = bac$ is a solution to the problem, since for $\{a,b\}$ there's $k=0$, for $\{a,c\}$ there's $k=1$.

As for my motivation, I'm trying to represent the set of edges of a finite automaton, where each edge can be labeled by a set of letters from the input alphabet. I'd like to store a single string and then keep a pair of pointers to that string in each edge. My goal is to minimize the length of that string.

Asked By : avakar

Answered By : avakar

I believe I found a reduction from Hamiltonian path, thus proving the problem NP-hard.

Call the word $w\in\Sigma^*$ a witness for $A$, if it satisfies the condition from the question (for each $L\in A$, there's $m\geq 1$ such that $\{w_{m+i}\mid 0\leq i<|L|\} = L$).

Consider the decision version of the original problem, i.e. decide whether for some $A$ and $k\geq 0$, there's a witness for $A$ of length at most $k$. This problem can be solved using the original problem as an oracle in polynomial time (find the shortest witness, then compare its length to $k$).

Now for the core of the reduction. Let $G=(V,E)$ be a simple, undirected, connected graph. For each $v\in V$, let $L_v=\{v\}\cup\{e\in E\mid v\in e\}$ be the set containing the vertex $v$ and all of its adjacent edges. Set $\Sigma=E$ and $A=\{L_v\mid v\in V\}$. Then $G$ has a Hamiltonian path if and only if there is a witness for $A$ of length at most $2|E|+1$.

Proof. Let $v_1e_1v_2\ldots e_{n-1}v_n$ be a Hamiltonian path in $G$ and $H=\{e_1, e_2, \ldots, e_{n-1}\}$ the set of all edges on the path. For each vertex $v$, define the set $U_v=L_v\setminus H$. Choose an arbitrary ordering $\alpha_v$ for each $U_v$. The word $w=\alpha_{v_1}e_1\alpha_{v_2}e_2\ldots e_{n-1}\alpha_{v_n}$ is a witness for $A$, since $L_{v_1}$ is represented by the substring $\alpha_1e_1$, $L_{v_n}$ by $e_{n-1}\alpha_n$, and for each $v_i$, $i\notin\{1, n\}$, $L_{v_i}$ is represented by $e_{i-1}u_{v_i}e_i$. Furthermore, each edge in $E$ occurs twice in $w$ with the exception of $|V|-1$ edges in $H$, which occur once, and each vertex in $V$ occurs once, giving $|w|=2|E|+1$.

For the other direction, let $w$ be an arbitrary witness for $A$ of length at most $2|E|+1$. Clearly, each $e\in E$ and $v\in V$ occurs in $w$ at least once. Without loss of generality, assume that each $e\in E$ occurs in $w$ at most twice and each $v\in V$ occurs exactly once; otherwise a shorter witness can be found by removing elements from $w$. Let $H\subseteq E$ be the set of all edges occurring in $w$ exactly once. Given the assumptions above, it holds that $|w|=2|E|-|H|+|V|$.

Consider a contiguous substring of $w$ of the form $ue_1e_2\ldots e_kv$, where $u,v\in V$, $e_i\in E$. We say that $u,v$ are adjacent. Notice that if $e_i\in H$, then $e_i=\{u,v\}$, because $e_i$ occurs only once, yet it is adjacent to two vertices in $G$. Therefore, at most one of $e_i$ can be in $H$. Similarly, no edge in $H$ can occur in $w$ before the first vertex or after the last vertex.

Now, there are $|V|$ vertices, therefore $|H|\leq |V|-1$. From there, it follows that $|w|\geq 2|E|+1$. Since we assume $|w|\leq 2|E|+1$, we get equality. From there we get $|H|=|V|-1$. By pigeonhole principle, there is an edge from $H$ between each pair of vertices adjacent in $w$. Denote $h_1h_2\ldots h_{n-1}$ all elements from $H$ in the order they appear in $w$. It follows that $v_1h_1v_2h_2\ldots h_{n-1}v_n$ is a Hamiltonian path in $G$. $\square$

Since the problem of deciding the existence of Hamiltonian path is NP-hard and the above reduction is polynomial, the original problem is NP-hard too.

Best Answer from StackOverflow

Question Source :

 Ask a Question

 Download Related Notes/Documents


Post a Comment

Let us know your responses and feedback