**Problem Detail:**

I was following David MacKay's textbook *Information Theory, Inference, and Learning Algorithms*.

I have a question on the asymptotic equipartition principle:

For an ensemble of $N$ i.i.d. random variables $X^N=(X_1,X_2,\ldots,X_N)$, with $N$ sufficiently large, the outcome $x=(x_1,x_2,\ldots,x_N)$ is almost certain to belong to a subset of $A_X^N$ having only $2^{NH(X)}$ members, with each member having probability "close to" $2^{-NH(X)}$.

The textbook also says that **the typical set does not necessarily contain the most probable elements.**

On the other hand, the "smallest sufficient subset" $S_{\delta}$ is defined to be:

the smallest subset of $A_X^N$ satisfying $P(x \in S_{\delta}) \ge 1-\delta$, for $0 \le \delta \le 1$. In other words, $S_{\delta}$ is constructed by taking the most probable element of $A_X^N$, then the second most probable, and so on, until the total probability is $\ge 1-\delta$.
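The greedy construction above can be sketched numerically. This is a minimal illustration, not from the book: it assumes a Bernoulli source with $p = 0.1$, $N = 10$, and $\delta = 0.05$, enumerates all $2^N$ outcomes, and keeps taking the most probable ones until their total probability reaches $1-\delta$.

```python
import itertools

# Assumed toy parameters (not from the source): Bernoulli(0.1) bits.
p, N, delta = 0.1, 10, 0.05

def prob(x):
    """Probability of a binary sequence x under the i.i.d. Bernoulli(p) source."""
    k = sum(x)                      # number of ones
    return (p ** k) * ((1 - p) ** (N - k))

# Sort all 2^N outcomes by decreasing probability, then greedily take
# the most probable ones until cumulative probability >= 1 - delta.
outcomes = sorted(itertools.product((0, 1), repeat=N), key=prob, reverse=True)
S, total = [], 0.0
for x in outcomes:
    S.append(x)
    total += prob(x)
    if total >= 1 - delta:
        break

print(f"|S_delta| = {len(S)} of {2**N} outcomes, covering P = {total:.4f}")
```

Note that the first element taken is the all-zeros sequence (the single most probable outcome), which is exactly the kind of element the typical set may exclude.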

My question is, **as $N$ increases, does $S_{\delta}$ approach the typical set, so that the two sets end up being equivalent?** If the size of the typical set is identical to $|S_{\delta}|$, then why do we even bother with $S_{\delta}$? Why can't we just take the typical set as our compression scheme instead?

###### Asked By : kuku

#### Answered By : Yuval Filmus

The elements in the typical set have typical probability, close to $2^{-NH(x)}$. An element with untypically large probability, say the one with maximal probability, may not satisfy this constraint. Same goes for the rest of $S_\delta$.
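This gap is easy to see numerically. The sketch below (my own illustration, assuming a Bernoulli source with $p = 0.1$ and $N = 100$) compares the log-probability of a typical sequence, roughly $-NH(X)$, with that of the single most probable sequence (all symbols equal to the likelier outcome), which is $N\log_2(1-p)$.

```python
import math

# Assumed toy parameters (not from the source): Bernoulli(0.1) bits.
p, N = 0.1, 100

# Binary entropy H(X) in bits per symbol.
H = -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# A typical sequence has probability roughly 2^{-N H(X)}.
log2_typical = -N * H

# The most probable sequence is all zeros; its probability is (1-p)^N.
log2_max = N * math.log2(1 - p)

print(f"H(X) = {H:.4f} bits")
print(f"log2 P(typical sequence)       ~ {log2_typical:.1f}")
print(f"log2 P(most probable sequence) = {log2_max:.1f}")
# The most probable sequence's probability is orders of magnitude larger
# than 2^{-N H(X)}, so it fails the "close to 2^{-NH}" condition and
# lies outside the typical set, even though it belongs to S_delta.
```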

The source coding theorem *does* take the typical set as an *encoding* scheme.


Question Source : http://cs.stackexchange.com/questions/30372

