Cheap and Secure Web Hosting Provider : See Now

# Is there a fast solution for this set reduction problem?

, ,
Problem Detail:

Let there be a set $S$ and denote by $P=2^S$ the power set of $S$. We are given a finite subset $L = \{p_0, \ldots, p_{n-1}\}$ of $P$ and we are allowed to transform the set as follows: if two sets $p_i$ and $p_j$ share some elements, i.e. $p_i \cap p_j \neq \emptyset$, remove both $p_i$ and $p_j$ from $L$ but add their union $p_i \cup p_j$ to $L$.

What is the algorithmically fastest way to reach a state where there is no further (or almost no further) transformation of $L$ possible?

One simple way to achieve that goal is to compare all elements with each other and perform a transformation if possible, repeat until no changes are possible.

Pseudo C++-code using std::vector:

vector<Set> L = ...; int n = L.size();  bool hasChanged = true; while (hasChanged) {     hasChanged = false;      for(int i = 0; i < n; ++i) {         for (int j = i + 1; j < n; ++j) {             if (!isDisjoint(L[i], L[j])) {                 // remove L[i] and L[j]                 swap(L.back(), L[i]);                 L.pop_back();                 swap(L.back(), L[j]);                 L.pop_back();                  // add their union                 L.push_back(getUnion(L[i], L[j]));                  hasChanged = true;                 i -= 1; // we don't want to skip the swapped element                 break;             }         }     } } 

This algorithm takes $\frac{n^2-n}{2}$ comparisions in the best case and $\sum_{k=1}^n \frac{k^2-k}{2}$ comparions in the worst case. Naturally the number of comparisons needed depends on the ordering of the elements in the vector.

To provide some less abstract context: I have a set of line segments which i want to reduce. So the idea is to connect lines which are "close" to each other. A connected line does not necessarily share start or endpoints with its original lines but it can be connected to other lines in the set. How do I connect all the lines until there is no further improvement possible in the fastest way possible?

This rewriting system is confluent, so if you just repeatedly pick a legal transformation and apply it, over and over again, until you can't any longer ... you will always end at the same state.

In other words, there is a unique end state from which no further transformations are possible, and you can reach it by any sequence of legal transformations. Define a state to be a final state if no further transformations are possible from it. Then there is a unique final state, so you can reach it by following any legal sequence of transformations and continuing until you can't continue any further. (This will terminate, since each step reduces the size of $L$. By definition, once you can't continue, you are in a final state. Since the final state is unique, this terminates at the unique final state.)

You can even find the shortest such sequence, relatively efficiently. Build an undirected graph with one vertex for each $p_i$, and with an edge $(p_i, p_j)$ if $p_i \cap p_j \ne \emptyset$. Compute the connected components of this graph. Then the final state will have one subset per connected component in the graph; it will be a subset containing the union of all of the vertices of the connected component. You can get to that connected component by repeatedly picking some pair of vertices $p_i,p_j$ that are joined by an edge, merging those two vertices into a new vertex $p'$, and then updating the edges ($p'$ gets an edge to each vertex that had an edge to either $p_i$ or $p_j$; but remove all self-loops). If there are $n$ vertices originally and $c$ connected components, this will terminate after $n-c+1$ steps.

In fact, looking at it a bit longer, you can discover that any legal sequence of valid transformations that ends at a final state is equally good -- all such sequences have the same length. You start with $L$ containing $n$ subsets; you end with $L$ containing $c$ subsets; each transformation reduces the length of $L$ by one; so any such sequence must contain exactly $n-c+1$ steps. All the legal sequences end at the same final state, so all sequences are equally good.

TL;DR: At each step, apply any valid transformation. Eventually you will terminate. It doesn't matter which pair of subsets you choose to merge at each step; you'll eventually end at the same ending point, and the total number of steps will be the same regardless of the choices you make at each step.

This is the most wonderful kind of algorithm: at each step, do anything that's not illegal, and you'll eventually end up a winner. You can't lose!