# Today I Learned

Some of the things I've learned every day since Oct 10, 2016

## 91: Categorical Distribution (Probability)

A categorical distribution is one describing a random variable that can take one of $k$ possible values, with the probability of each possible outcome known. For instance, the Bernoulli distribution is the special case of the categorical distribution with $k = 2$ (one outcome being ‘success’, the other ‘failure’). The distribution of the outcomes of rolling a (not necessarily fair) 6-sided die is another example, with $k = 6$.
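As a small sketch of the die example, here is how one might sample from a categorical distribution with numpy (the probabilities below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# A biased six-sided die: a categorical distribution with k = 6.
p = np.array([0.10, 0.10, 0.15, 0.15, 0.20, 0.30])  # must sum to 1
assert np.isclose(p.sum(), 1.0)

# Draw 100,000 rolls; faces are labelled 0..5.
rolls = rng.choice(6, size=100_000, p=p)
empirical = np.bincount(rolls, minlength=6) / rolls.size
# With this many samples the empirical frequencies track p closely.
```

With $10^5$ samples the empirical frequencies typically land within about $0.005$ of the true probabilities.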

## 90: The Standard (Unit) Simplex

The standard $n$-simplex, sometimes denoted $\Delta^n$, is the $n$-simplex in $\mathbb{R}^{n + 1}$ with its vertices at the $n + 1$ ‘unit’ points

$e_0 = (1, 0, 0, \dots, 0)$

$e_1 = (0, 1, 0, \dots, 0)$

$e_2 = (0, 0, 1, \dots, 0)$

$\dots$

$e_n = (0, 0, 0, \dots, 1)$.

For example, $\Delta ^0$ is the point $1$ in $\mathbb{R}^1$, $\Delta ^1$ is the line segment from $(0, 1)$ to $(1, 0)$ in $\mathbb{R}^2$, and so on. A formal description is

$\Delta ^n = \{(x_0, x_1, \dots, x_n) \in \mathbb{R}^{n+1} \mid \sum _{i = 0} ^{n} x_i = 1,\ x_i \geq 0 \text{ for all } i \}$.

One possible interpretation of $\Delta ^n$ is as the set of all categorical distributions on $n+1$ outcomes.
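That interpretation amounts to a membership test: a point lies in the standard simplex exactly when its coordinates are non-negative and sum to $1$. A minimal sketch (the function name and tolerance are my own choices):

```python
import numpy as np

def in_standard_simplex(x, tol=1e-12):
    """A point of the standard n-simplex is exactly a categorical
    distribution on n + 1 outcomes: non-negative coordinates summing to 1."""
    x = np.asarray(x, dtype=float)
    return bool(np.all(x >= -tol) and np.isclose(x.sum(), 1.0, atol=tol))

# A point of Delta^2, i.e. a distribution on 3 outcomes:
inside = in_standard_simplex([0.2, 0.5, 0.3])
# Sums to 1 but has a negative coordinate, so it lies outside the simplex:
outside = in_standard_simplex([0.7, 0.7, -0.4])
```

The vertices $e_0, \dots, e_n$ correspond to the degenerate distributions that put all their mass on a single outcome.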

## 84: Some Properties of Transition Matrices (Probability)

The following are some miscellaneous related properties of transition matrices describing stochastic processes such as Markov chains.

A ‘transition matrix’ is a square matrix $A$ with real entries in $[0, 1]$ such that the entries of any given column of $A$ sum to $1$, the interpretation being that $A_{i, j}$ is the probability of a stochastic process transitioning from state $j$ to state $i$. A transition matrix $A$ is ‘regular’ if $A^n$ contains only strictly positive entries for some $n \geq 1$.

### For any transition matrix:

1. Every transition matrix has $1$ as an eigenvalue.
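The reason is that the columns of $A$ sum to $1$, so the all-ones row vector $\mathbf{1}^T$ satisfies $\mathbf{1}^T A = \mathbf{1}^T$; hence $1$ is an eigenvalue of $A^T$, and $A$ and $A^T$ have the same eigenvalues. A quick numerical check (the matrix is an arbitrary example):

```python
import numpy as np

# Column-stochastic A (each column sums to 1), per the convention above.
A = np.array([[0.9, 0.2],
              [0.1, 0.8]])
assert np.allclose(A.sum(axis=0), 1.0)

# 1^T A = 1^T, so 1 is an eigenvalue of A^T and therefore of A.
eigenvalues = np.linalg.eigvals(A)
```

For this particular $A$ the eigenvalues are $1$ and $0.7$.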

### For regular transition matrices:

1. The algebraic multiplicity of $1$ as an eigenvalue of $A$ is $1$.
2. The columns of $\lim _{n \rightarrow \infty} A^n$ are identical, all being the unique $A$-invariant probability vector $v$ such that the sum of the entries of $v$ is $1$ (because it’s a probability vector) and $A v = v$.
3. With $v$ defined as above, for any probability vector $w$, $\lim _{n \rightarrow \infty} A^n w = v$. That is, the system or process described by $A$ eventually converges to the distribution described by $v$, regardless of its initial state.
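The properties above can be checked numerically by raising a regular transition matrix to a high power (the matrix here is an arbitrary regular example, already strictly positive):

```python
import numpy as np

# A regular column-stochastic matrix (all entries already positive).
A = np.array([[0.9, 0.2],
              [0.1, 0.8]])

# Property 2: the columns of A^n converge to the invariant vector v.
An = np.linalg.matrix_power(A, 100)
v = An[:, 0]
assert np.allclose(An[:, 1], v)   # both columns are (numerically) identical
assert np.allclose(A @ v, v)      # v is A-invariant
assert np.isclose(v.sum(), 1.0)   # v is a probability vector

# Property 3: A^n w -> v for any probability vector w.
w = np.array([0.3, 0.7])
assert np.allclose(An @ w, v)
```

For this $A$ the invariant vector is $v = (2/3, 1/3)$, and convergence is geometric at the rate of the second-largest eigenvalue ($0.7$ here).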

## 66: Regular Transition Matrices

The transition matrix $P$ of a Markov chain $X$ is called regular if some power $P^n$, $n \geq 1$, contains only positive entries.
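Regularity can be checked directly by taking successive powers; for a $k$-state chain it is enough (by Wielandt's bound for primitive matrices) to check powers up to $(k-1)^2 + 1$. A sketch, with made-up example matrices:

```python
import numpy as np

def is_regular(P, max_power=None):
    """Check whether some power of the transition matrix P is strictly
    positive. Wielandt's bound: for a k x k primitive matrix, checking
    powers up to (k-1)^2 + 1 suffices."""
    P = np.asarray(P, dtype=float)
    k = P.shape[0]
    if max_power is None:
        max_power = (k - 1) ** 2 + 1
    Q = np.eye(k)
    for _ in range(max_power):
        Q = Q @ P
        if np.all(Q > 0):
            return True
    return False

# A 2-state chain that alternates deterministically is NOT regular:
# its powers cycle between the identity and itself, each containing zeros.
flip = np.array([[0.0, 1.0],
                 [1.0, 0.0]])
# Adding a self-loop breaks the cycle and makes the chain regular.
lazy = np.array([[0.5, 0.5],
                 [0.5, 0.5]])
```

Note that regularity does not depend on whether the row or column convention is used, since $(P^T)^n = (P^n)^T$.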

## 60: Closure of MIN on Exponential Random Variables

The minimum of independent exponential random variables $E_1, E_2, \dots , E_n$ with corresponding rate parameters $\lambda _1, \lambda _2, \dots , \lambda _n$ is itself an exponential random variable with parameter $\lambda = \sum _{i = 1} ^n \lambda _i$.

It is not, however, the case that the maximum is also exponentially distributed.
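The closure under minimum can be checked by simulation (the rates below are arbitrary; note numpy parameterises exponentials by the scale $1/\lambda$):

```python
import numpy as np

rng = np.random.default_rng(42)
rates = np.array([1.0, 2.0, 3.0])  # lambda_1, lambda_2, lambda_3

# Simulate the minimum of three independent exponentials.
samples = rng.exponential(1.0 / rates, size=(200_000, 3)).min(axis=1)

# The minimum should be Exponential(1 + 2 + 3 = 6),
# so its sample mean should be close to 1/6.
sample_mean = samples.mean()
```

With $2 \times 10^5$ samples the sample mean lands within a few thousandths of $1/6 \approx 0.1667$.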

## 52: Ergodicity in Markov Chains

A state $s$ in a Markov chain $X$ is ergodic iff it is aperiodic and recurrent. If all states in a Markov chain are ergodic, then the chain itself is said to be ergodic. For a finite chain, this is equivalent to the condition that there is a number $n$ such that any state in $X$ can be reached from any other state in exactly $n$ steps.

## 47: Invariant Distributions as Eigenvectors

Since a stationary distribution $\pi$ of a finite Markov chain $X$ satisfies $\pi P = \pi$, where $P$ is the transition matrix of $X$, it can be seen as a left eigenvector of $P$ with eigenvalue $\lambda = 1$ (equivalently, an eigenvector of $P^T$). Specifically, $\pi$ lies in the intersection of the eigenspace $E_1$ with the hyperplane defined by the constraint $\sum _{i = 1} ^n \pi (i) = 1$.

(Here the vector space in question is $\mathbb{R}^n$, where $n$ is the number of states in $X$.)
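This gives a direct way to compute a stationary distribution: find the eigenvector of $P^T$ for eigenvalue $1$ and rescale it to sum to $1$. A sketch with an arbitrary two-state example:

```python
import numpy as np

# Row convention: P[i, j] is the probability of moving from state i to j,
# so rows sum to 1 and pi P = pi makes pi a LEFT eigenvector of P,
# i.e. a right eigenvector of P^T.
P = np.array([[0.9, 0.1],
              [0.2, 0.8]])

eigvals, eigvecs = np.linalg.eig(P.T)
# Pick the eigenvector for eigenvalue 1 and normalise it to sum to 1
# (the normalisation also fixes the sign).
i = np.argmin(np.abs(eigvals - 1.0))
pi = np.real(eigvecs[:, i])
pi = pi / pi.sum()

assert np.allclose(pi @ P, pi)  # pi is stationary
```

For this $P$ the stationary distribution is $\pi = (2/3, 1/3)$.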

## 46: Absorbing States of Markov Chains

A state $S$ in a Markov chain $X$ is absorbing iff it is impossible for the chain to leave $S$ once it’s there. If an absorbing state is accessible from every state in $X$, then $X$ is likewise said to be absorbing.
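In terms of the transition matrix, a state is absorbing exactly when its self-transition probability is $1$. A minimal check, with a made-up three-state chain:

```python
import numpy as np

# Row convention: P[i, j] is the probability of moving from state i to j.
# State i is absorbing exactly when P[i, i] == 1.
P = np.array([[0.5, 0.3, 0.2],
              [0.0, 0.6, 0.4],
              [0.0, 0.0, 1.0]])  # state 2 is absorbing

absorbing = [i for i in range(len(P)) if np.isclose(P[i, i], 1.0)]
```

Here `absorbing` is `[2]`, and since state 2 is reachable from states 0 and 1, the chain itself is absorbing.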

## 45: Transient and Recurrent States of Markov Chains

A state $S$ of a Markov chain $X$ is transient if, given we start from $S$, there is a non-zero probability that we will never return to $S$. A state is transient iff the expected number of visits to that state is finite.

Conversely, $S$ is recurrent if, given we start from $S$, the probability we will eventually return to $S$ is 1. A state is recurrent iff the expected number of visits to that state is infinite.

## 43: Aperiodicity of Markov Chains

A state $s$ in a Markov chain $X$ is aperiodic iff there exists an $n$ such that

$P(X_{n'} = s | X_0 = s) > 0 \quad \forall n' \geq n$.

That is, a state is aperiodic if the chain can loop back to it in any sufficiently large number of steps.

A Markov chain is aperiodic iff every state in the chain is aperiodic. If the chain is irreducible, then the existence of a single aperiodic state implies that the entire chain is aperiodic.
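Equivalently, a state is aperiodic iff the gcd of the possible return times is $1$. A hypothetical check for small chains (the cutoff `N` and examples are my own choices):

```python
import math
import numpy as np

def period(P, s, N=50):
    """Gcd of all return times n <= N with P^n[s, s] > 0.
    State s is aperiodic iff this is 1 (N must be large enough
    to see every short cycle through s; fine for small chains)."""
    g = 0
    Q = np.eye(len(P))
    for n in range(1, N + 1):
        Q = Q @ P
        if Q[s, s] > 1e-12:
            g = math.gcd(g, n)
    return g

# Deterministic two-state alternation: returns only at even n, period 2.
cycle = np.array([[0.0, 1.0],
                  [1.0, 0.0]])
# A self-loop at state 0 allows a return in 1 step, making it aperiodic.
lazy = np.array([[0.5, 0.5],
                 [1.0, 0.0]])
```

Here `period(cycle, 0)` is `2` while `period(lazy, 0)` is `1`, and since `lazy` is irreducible, its single aperiodic state makes the whole chain aperiodic.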