
[Solved]: Algorithm: Dimension increase in 1D representation of Square Matrix

Problem Detail: 

Consider a square matrix of dimension $m \times m$, here with $m = 3$:

$$ M = \begin{array}{ccc} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \\ \end{array} $$

Its 1-D (row-major) representation: $$ M^* = \begin{array}{ccccccccccc} 1 & 1 & 1 && 0 & 1 & 1 && 1 & 0 & 1 \end{array} $$

Now another dimension is added, so that the matrix becomes:

$$ N = \begin{array}{cccc} 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ \end{array} $$

$$ N^* = \begin{array}{ccccccccccccccccccc} 1 & 1 & 1 & 0 && 0 & 1 & 1 & 0 && 1 & 0 & 1 & 0 && 0 & 0 & 0 & 1 \end{array} $$

Update
The data structure can be any linear data structure (array, singly linked list, doubly linked list), not necessarily contiguous in memory. The goal is to dynamically add or remove dimensions from the matrix with the least number of operations.

The best I can think of is to repopulate $N^*$ in $\Theta(m^2)$ time every time a new dimension is added.
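
For concreteness, a minimal sketch of that naive rebuild (the function name and the row-major layout are illustrative assumptions; the new cells follow the pattern of $N$ above, zeros with a single one on the new diagonal cell):

    # Hypothetical sketch: rebuild the whole row-major 1-D representation
    # whenever a dimension is added -- Theta(m^2) work.
    def grow_row_major(flat, m):
        """flat holds an m x m matrix in row-major order; return the
        (m+1) x (m+1) representation, copying every old cell."""
        new_flat = []
        for i in range(m + 1):
            for j in range(m + 1):
                if i < m and j < m:
                    new_flat.append(flat[i * m + j])  # copy old cell
                elif i == m and j == m:
                    new_flat.append(1)                # new diagonal cell
                else:
                    new_flat.append(0)                # new row/column cell
        return new_flat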

Asked By : WeaklyTyped

Answered By : babou

There is a simple solution, but it requires a different 1D representation of your matrix $M$. Suppose it is stored in an arbitrarily long 1D array $A$, so that new elements can be added at the end.

Then the element $M[i,j]$ of the matrix $M$ is stored in $A[k]$ with $k=(j-1)^2+i$ if $i\leq j$, and $k=i^2-j+1$ otherwise.

Other similar formulae are also possible.
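
A minimal sketch of this indexing scheme (the function name is an assumption; indices are 1-based as in the formula):

    def index_in_A(i, j):
        """Position k in A of matrix element M[i, j] (all 1-based)."""
        if i <= j:
            return (j - 1) ** 2 + i   # rightmost column of the j x j submatrix
        return i ** 2 - j + 1         # last row of the i x i submatrix

    # Reproduces the 3 x 3 storage order shown below:
    # [[1, 2, 5], [4, 3, 6], [9, 8, 7]]
    print([[index_in_A(i, j) for j in (1, 2, 3)] for i in (1, 2, 3)])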

The idea is to store, in order, the successive layers of increasing matrix size, starting with the $1\times 1$ matrix.

The storage order for a $3\times3$ matrix is shown below, each element of the matrix being replaced by its index in $A$; the first index $i$ is the row index in the matrix $M$ representation:

1 2 5
4 3 6
9 8 7

When you increase the size $m$ of the matrix, you only have to add the new layer at the end, i.e. add the representation of the last row and last column at the end of the used part of array $A$. The cost is linear in $m$.
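
A sketch of that growth step under the layout above (names are mine): the new column and row with $\max(i,j)=m+1$ occupy positions $m^2+1$ through $(m+1)^2$ of $A$, stored top-down along the new column and then right-to-left along the new row.

    def add_layer(A, m, new_col, new_row):
        """Grow the stored matrix from m x m to (m+1) x (m+1).
        new_col = [M[1, m+1], ..., M[m+1, m+1]]   (column m+1, top-down)
        new_row = [M[m+1, m], ..., M[m+1, 1]]     (row m+1, right-to-left,
                                                   without the diagonal cell)
        Only 2m + 1 elements are appended; nothing already stored moves."""
        A.extend(new_col)   # positions m^2 + 1 .. m^2 + m + 1
        A.extend(new_row)   # positions m^2 + m + 2 .. (m+1)^2
        return m + 1        # the new size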

Explanation

The initial version of the question asked for a linear data structure, without explicitly allowing linked lists, which I do not consider more linear than anything else, since memory generally has a linear address space. Hence I assumed the idea was to use, as efficiently as possible, a one-dimensional array $A$, without trying to mimic pointers.

I started looking for an efficient way to store (and retrieve) the elements $M(i,j)$ of the matrix $M$ in $A$, so that extending the matrix with an extra dimensional layer for the highest values of the indices could be done cheaply; the cheapest is to add the new layer right at the end of the existing representation, so that the cost is no more than the number of elements to be copied.

Doing that while keeping a uniform indexing scheme requires having stored the matrix uniformly from the start, beginning with the $1\times 1$ matrix. The question being somewhat imprecise regarding constraints, I assumed that keeping each row in one contiguous piece is not really required.

The placement of the elements of matrix $M$ in the linear array $A$ is illustrated by the above $3\times 3$ matrix, where each element has the value of its index in $A$.

This organization is such that the index in $A$ of an element $M(i,j)$ of the matrix $M$ does not depend on the size of $M$, but only on $i$ and $j$. Taking $p=\max(i,j)$, the placement in $A$ of the upper-leftmost submatrix of $M$ of size $p$ is independent of the rest of the matrix $M$. The element $M(i,j)$ is either in the rightmost column of that submatrix if $i\leq j$, or in its last row otherwise. Furthermore, the first $(p-1)^2$ elements of array $A$ are already used to store the upper-left submatrix of size $p-1$. The next $2p-1$ elements of $A$ hold the $p^{th}$ row and $p^{th}$ column, which contain the element $M(i,j)$. This can be done in different ways, and the proposed formula stores these $2p-1$ elements in top-down and right-to-left order.

Note: the first version of the formulae used $p=\max(i,j)$ explicitly to compute the squares, a remnant of my initial search for a solution. But @WeaklyTyped rightly remarked that the test comparing $i$ and $j$ does the job, allowing $p$ to be replaced by $i$ or $j$ according to the test result.

Actual use of the representation

One characteristic of this representation is that it preserves random access (access time is constant, independent of the indices and of the matrix size), which is not usually the case with list representations. Moreover, it does not require multiplication for indexing the array $A$, since only addition and integer squares are needed. Integer squares can be kept in a precomputed linear array, or can be computed with addition only when successive square values are needed in a loop, for example to read a row or a column of the matrix, using the formula $(p+1)^2=p^2+2p+1$.
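
For example, a sketch of reading one row with that recurrence (the helper name is mine; the two starting squares could equally be looked up in a precomputed table):

    def read_row(A, i, m):
        """Return row i (1-based) of the m x m matrix stored in the list A,
        where position k of the text corresponds to A[k - 1]."""
        row = []
        sq_i = i * i                  # i^2; could come from a table of squares
        for j in range(1, i):         # i > j: k = i^2 - j + 1
            row.append(A[sq_i - j])   # A is 0-based, so A[k - 1]
        sq = (i - 1) * (i - 1)        # (j - 1)^2 for j = i
        for j in range(i, m + 1):     # i <= j: k = (j - 1)^2 + i
            row.append(A[sq + i - 1])
            sq += 2 * j - 1           # next square: j^2 = (j - 1)^2 + 2j - 1
        return row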

But, as usual, the usefulness of such a representation is highly dependent on the operations needed and their frequency.

Note: this representation was found independently, but it may have been used before. Pointers to the literature are welcome.

A better solution with pointers

(Added July 1st 2014, after the answer was accepted)

The layer approach described in the previous algorithm seems more effective than most pointer based representations for two reasons:

  • given $i$ and $j$, it allows access to element $M[i,j]$ in constant time;

  • it has no overhead in space for storing pointers (not to mention possibly greater garbage collection costs).

However, the last statement is not completely true. We chose to use an array $A$ of "sufficient" length because that initially (in the first version of the question) seemed a given constraint. But, if we relax that constraint, it is clear that it is a costly solution in space since the array $A$ must have a size sufficient to store the largest version of the matrix $M$ that may be used.

Actually, there is something absurd about the solution presented above, if it is to be more than an intellectual exercise (but the context and motivations of the question were not given). If the array $A$ has the maximum size that will ever be needed, why not simply code the maximum sized matrix in it, and use at any time only the indices that are actually needed, given the current matrix size $m$?

Since pointers are (now) allowed, this is a good motivation for attempting to get the benefits of the solution presented above, without its absurd drawback.

The idea is quite simple: keep the layer structure presented above, but store the layers separately, rather than contiguously in the same array.

So a matrix $M$ of size $m$ is implemented as $m$ arrays $L_p$ with $1\leq p\leq m$, each $L_p$ having $2p-1$ elements corresponding to the elements $M[i,j]$ of $M$ such that $\max(i,j)=p$.

In order to access $M[i,j]$, one only has to compute $p=\max(i,j)$ to locate $L_p$, and then index appropriately with $i$ or $j$ inside $L_p$ to get the right element. Actually, as in the first algorithm above, it may be faster to just distinguish two cases based on a comparison of $i$ and $j$. I do not go into the detailed formulae, which are straightforward to work out.
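
One possible way to write them down, assuming each $L_p$ stores column $p$ top-down followed by row $p$ right-to-left as before (the helper name and the way the layers are located are assumptions; finding $L_p$ is exactly the problem discussed next):

    def get_in_layers(layers, i, j):
        """Return M[i, j] (1-based), where layers[p - 1] is the layer L_p,
        a sequence of 2p - 1 cells: column p top-down (p cells),
        then row p right-to-left (p - 1 cells)."""
        p = max(i, j)
        L = layers[p - 1]
        if i <= j:
            return L[i - 1]           # cell (i, p) in the column part
        return L[2 * p - 1 - j]       # cell (p, j) in the row part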

However, this requires an extra structure to find a pointer to $L_p$ when $p$ is known.

Three obvious ideas come to mind:

  • The first idea is to use an array of links to the $L_p$ layers, indexed by $p$ (a short sketch follows this list). The drawback is that the size of this array must be the maximum possible value of $m$, but that is not as bad as the maximum value of $m^2$. The advantage is that accessing any element $M[i,j]$ is still done in constant time.

  • The second idea is to use a linked list, which avoids the constant-size array. However, the list has to be followed sequentially to get to the right layer $L_p$. Hence accessing any element $M[i,j]$ will on average take linear time with respect to the current size $m$. Note that, with this solution, it is better to access the list from the end corresponding to the largest value of $p$. Firstly, that is where layers are added or removed. Secondly, assuming that all elements of $M$ have an equal chance of being accessed, the probability of finding an element in layer $L_p$ is proportional to $(2p-1)/m^2$, hence a higher probability for higher values of $p$. Indeed, accessing the list from the higher values of $p$ is on average twice as fast as accessing it from the lower values of $p$ (proof left as an exercise). But access is linear in $m$ in both cases.

  • In order to reduce access time, the accessing structure can be a balanced tree. The size of the tree is linear in $m$, as it is with a simple list, but the time complexity of finding the right $L_p$ is now logarithmic in $m$, hence so is accessing an element $M[i,j]$. While building this balanced tree, it is better to give each $L_p$ a weight proportional to $2p-1$ to account for its probability of being the right layer. However, I have not checked how much of a difference it makes.
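
A sketch of the first idea, reusing get_in_layers from above (the class name is mine; a plain Python list stands in for the fixed-size array of links):

    class LayerList:
        """Array-of-links access structure: layers[p - 1] points to L_p."""
        def __init__(self):
            self.layers = []

        def get(self, i, j):
            return get_in_layers(self.layers, i, j)   # constant time

        def add_layer(self, new_col, new_row):
            """new_col: column m+1 top-down; new_row: row m+1 right-to-left
            without the diagonal cell (2m + 1 new cells in total)."""
            self.layers.append(list(new_col) + list(new_row))

        def remove_layer(self):
            self.layers.pop()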

But trying to build a sophisticated balanced tree is probably self-defeating, because it may have to be maintained, and it may require more computation to traverse. A better solution is probably a widely branching tree that is walked very simply by indexing.

Using a tree of indexing arrays

(I have no idea whether this has an official name in textbooks)

The nodes of the tree are arrays of length $2^d$, so that we can build a balanced tree with nodes of degree $2^d$. The leaf nodes are arrays of pointers to matrix layers, and the other nodes are arrays of pointers to other arrays (or to nothing). Without going into details, given a layer number $p$, the binary representation of $p$ is cut into chunks of $d$ bits, which are used as indices into the arrays while walking down the tree to find the layer $L_p$. With a value of 3 or 4 for $d$, this should give simple and fast access for matrices of significant size.
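
A minimal sketch of that walk, under assumptions of my own about the node layout (each node is a list of $2^d$ slots, the number of levels is known, and $p-1$ rather than $p$ is consumed so that indexing starts at 0):

    D = 3                                    # chunk size d; nodes have 2**D slots

    def find_layer(root, p, levels):
        """Walk a tree of indexing arrays down to layer L_p, consuming the
        binary representation of p - 1 in chunks of D bits, most significant
        chunk first; 'levels' is the number of chunks (tree height)."""
        node = root
        for level in range(levels - 1, -1, -1):
            chunk = (p - 1) >> (level * D) & ((1 << D) - 1)
            node = node[chunk]               # follow the pointer for this chunk
        return node                          # the leaf slot holds layer L_p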


Question Source : http://cs.stackexchange.com/questions/27627
