**Problem Detail:**

I attended a workshop where someone from the audience asked the presenter how moments can improve mutual information. I am still learning about MI (mutual information), so I didn't have enough background to understand what that meant. I did some research afterwards but am still confused, and I'm hoping someone more knowledgeable can clarify things for me. Here are my questions:

Mutual information is usually calculated by binning the data to estimate the probability distributions of two random variables, which can be given as two vectors $X$ and $Y$. Is the moment generating function another way to estimate a probability distribution?
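For reference, the bin-based estimator mentioned here can be sketched as follows (a minimal illustration, assuming synthetic correlated samples and an arbitrary choice of 20 bins; real applications need care in choosing the bin count):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two correlated samples (made-up data, purely for illustration).
x = rng.normal(size=10_000)
y = x + 0.5 * rng.normal(size=10_000)

# Histogram-based (binned) estimate of the joint distribution.
counts, _, _ = np.histogram2d(x, y, bins=20)
pxy = counts / counts.sum()          # joint probabilities P(X=x, Y=y)
px = pxy.sum(axis=1, keepdims=True)  # marginal P(X=x)
py = pxy.sum(axis=0, keepdims=True)  # marginal P(Y=y)

# Sum only over non-empty bins (0 * log 0 is taken as 0).
nz = pxy > 0
mi = np.sum(pxy[nz] * np.log(pxy[nz] / (px * py)[nz]))
print(mi)  # positive, since x and y are dependent
```

Binned estimators like this are biased (the bias depends on the bin count and sample size), which is one reason alternative characterizations of the distributions are of interest.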

If moment generating functions can represent the probability distributions of $X$ and $Y$, how do we calculate them?

Does MI have a moment generating function?

If MI has a moment generating function, how can we express the MI of $X$ and $Y$ in terms of its moments?

###### Asked By : Cassie

#### Answered By : frafl

The moment generating function $M_X$ is a property of a random variable $X$. It is defined as the expected value of $e^{tX}$, where $t$ is the argument.

Since the exponential function $e^x = \sum_{n=0}^\infty \frac{x^n}{n!}$ contains every natural power of its argument as a summand, since the expected value of a sum is the sum of the expected values ($\mathbb{E}(\sum_i X_i)=\sum_i\mathbb{E}(X_i)$), and since the expected value of a natural power of $X$, $\mathbb{E}(X^n)$, is called its $n$-th moment, the $n$-th moment appears in the $n$-th summand:

$$M_X(t)=\mathbb{E}(e^{tX})=\sum_{i=0}^\infty \frac{t^i\mathbb{E}(X^i)}{i!} \quad .$$

If you now consider the $k$-th derivative of $M_X$:

$$M_X^{(k)}(t)=\mathbb{E}(X^k e^{tX})=\sum_{i=0}^\infty \frac{t^i\,\mathbb{E}(X^{i+k})}{i!} \quad ,$$

and use $0$ as an argument, you get $$M_X^{(k)}(0)=\mathbb{E}(X^k)\quad,$$

so the $k$-th moment was generated.
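As a small numerical sanity check of $M_X^{(k)}(0)=\mathbb{E}(X^k)$ (assuming, for illustration, that $X$ is a fair six-sided die), one can compute the MGF directly from its definition and approximate its derivatives at $0$ with finite differences:

```python
import math

# X is a fair six-sided die: P(X = x) = 1/6 for x in 1..6.
xs = [1, 2, 3, 4, 5, 6]

def mgf(t):
    """M_X(t) = E[e^{tX}], computed directly from the definition."""
    return sum(math.exp(t * x) for x in xs) / len(xs)

h = 1e-4
# Central finite differences approximate M_X'(0) and M_X''(0).
m1 = (mgf(h) - mgf(-h)) / (2 * h)            # ≈ E[X]   = 3.5
m2 = (mgf(h) - 2 * mgf(0) + mgf(-h)) / h**2  # ≈ E[X^2] = 91/6

print(m1, m2)
```

The two finite differences recover the first moment $\mathbb{E}(X)=3.5$ and the second moment $\mathbb{E}(X^2)=91/6$ up to discretization error.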

Now look at the mutual information:

$$I(X,Y) = \sum_{(x,y)}P(X=x,Y=y)\log\left(\frac{P(X=x,Y=y)}{P(X=x)\cdot P(Y=y)}\right) = \mathbb{E}(\mathrm{PMI}(X,Y)), $$

which is the expected value of the pointwise mutual information (it's likely that the workshop actually dealt with the continuous case, where $I$ and $\mathrm{PMI}$ are defined using integrals and densities, respectively). So mutual information is just a number: it does not itself have moments or a moment generating function, but it **is** the first moment of a random variable, namely $\mathrm{PMI}(X,Y)$, so:

$$I(X,Y) = M_{\mathrm{PMI}(X,Y)}'(0)\quad.$$
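This identity can be checked on a tiny discrete example (the joint table below is made up for illustration): computing $I(X,Y)$ as the expected value of the pointwise mutual information reproduces the usual double-sum formula term by term.

```python
import math

# Hypothetical joint distribution P(X=x, Y=y) on {0,1} x {0,1}.
pxy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}
px = {x: sum(p for (a, _), p in pxy.items() if a == x) for x in (0, 1)}
py = {y: sum(p for (_, b), p in pxy.items() if b == y) for y in (0, 1)}

# Pointwise mutual information for each outcome (x, y).
pmi = {(x, y): math.log(p / (px[x] * py[y])) for (x, y), p in pxy.items()}

# Mutual information as the first moment (expected value) of PMI.
mi = sum(pxy[xy] * pmi[xy] for xy in pxy)
print(mi)
```

The individual PMI terms can be negative, but their expected value (the mutual information) is always non-negative.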

###### Best Answer from Stack Exchange

Question Source : http://cs.stackexchange.com/questions/10513

