### Gaussian Mixture Model (GMM) and EM

The pdf of a K-component Gaussian mixture is

$$p(x \mid \theta) = \sum_{j=1}^{K} w_j\, \mathcal{N}(x \mid \mu_j, \sigma_j^2), \qquad \sum_{j=1}^{K} w_j = 1$$

It is widely used in all kinds of signal processing, such as image and audio processing. Jitter analysis, common in engineering, can also be handled with this model.
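As a concrete sketch, a 1-D mixture pdf can be evaluated with NumPy. The weights, means, and standard deviations below are made-up example values, not parameters from these notes:

```python
import numpy as np

def gmm_pdf(x, weights, means, stds):
    """Evaluate a 1-D Gaussian mixture pdf at the points x."""
    x = np.asarray(x, dtype=float)
    total = np.zeros_like(x)
    for w, mu, sigma in zip(weights, means, stds):
        # w_j * N(x | mu_j, sigma_j^2)
        total += w * np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return total

# Example: K = 2 components (illustrative parameters)
weights = [0.3, 0.7]
means = [-2.0, 1.5]
stds = [0.8, 1.2]

xs = np.linspace(-10, 10, 4001)
pdf = gmm_pdf(xs, weights, means, stds)
# A valid pdf: non-negative everywhere, numerically integrates to ~1
integral = pdf.sum() * (xs[1] - xs[0])
print(integral)
```

Because the weights sum to one and each component is a normalized Gaussian, the mixture itself integrates to one.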

# How to Estimate the GMM Parameters?

Given a set of observed data X = {x1, x2, .., xn}, estimate the weight, mean, and variance of each Gaussian component.

Advantage of EM: introducing hidden variables (the component assignments) simplifies the maximization of the likelihood.

Drawbacks of EM: (i) it can get stuck in a local optimum if the initial values are poor. Solution: run K-means first, or use another initialization method.

(ii) K is hard to choose. Solution: sweep K starting from 1 and compare the fitted models?

# How Does EM Work in a GMM?

E (Expectation) step: estimate the distribution of the hidden variable given the data and the current value of the parameters.

M (Maximization) step: maximize the expected complete (joint) log-likelihood of the data and hidden variable with respect to the parameters.

Assume the hidden variable is Z.

Therefore, the complete (joint) likelihood function $L_c$ of all X and Z is:

$$L_c(\theta) = \prod_{i=1}^{n} \prod_{j=1}^{K} \left[ w_j\, \mathcal{N}(x_i \mid \mu_j, \sigma_j^2) \right]^{z_{ij}}$$

where $z_{ij} = 1$ if sample $x_i$ belongs to component $j$, and $0$ otherwise.

Soft assignment: $r(i,j) = P(j \mid x_i, \theta_t) = \dfrac{w_j\, P(x_i \mid j, \theta_t)}{\sum_{k=1}^{K} w_k\, P(x_i \mid k, \theta_t)}$, for $j = 1, \dots, K$

Hard assignment: $P(j \mid x_i, \theta_t) = 1$ for the single $j$ that maximizes $w_j\, P(x_i \mid j, \theta_t)$, and $0$ for the others, $j = 1, \dots, K$
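The E-step (soft assignment) and M-step above can be sketched for a 1-D mixture with NumPy. The function name `em_gmm_1d`, the quantile-based initialization, and the synthetic data are all illustrative choices, not part of the original notes:

```python
import numpy as np

def em_gmm_1d(x, K, n_iter=100):
    """Minimal EM sketch for a 1-D, K-component GMM."""
    n = len(x)
    # Simple deterministic initialization: spread the means over data quantiles
    mu = np.quantile(x, [(j + 0.5) / K for j in range(K)])
    var = np.full(K, np.var(x))
    w = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        # E-step: responsibilities r[i, j] = P(j | x_i, theta_t)
        comp = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = comp / comp.sum(axis=1, keepdims=True)
        # M-step: closed-form ML updates weighted by the responsibilities
        Nj = r.sum(axis=0)
        w = Nj / n
        mu = (r * x[:, None]).sum(axis=0) / Nj
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / Nj
    return w, mu, var

# Synthetic data drawn from two known Gaussians
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-3.0, 1.0, 400), rng.normal(2.0, 0.5, 600)])
w, mu, var = em_gmm_1d(x, K=2)
print(np.round(np.sort(mu), 1))  # estimates should land near -3 and 2
```

Each iteration computes the responsibilities with the current parameters, then re-estimates the weights, means, and variances; this never decreases the likelihood.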

Initialization: EM is very sensitive to initial conditions: garbage in → garbage out.

Solution: typically run K-means first to get a good initialization (K-means makes hard decisions, EM makes soft decisions, so to speak).
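A 1-D K-means pass (Lloyd's algorithm) that could supply such an initialization can be sketched as follows; the function name, data, and iteration count are illustrative assumptions:

```python
import numpy as np

def kmeans_1d(x, K, n_iter=50):
    """Minimal 1-D K-means; the resulting centers can seed EM's means."""
    centers = np.quantile(x, [(j + 0.5) / K for j in range(K)])  # spread start
    for _ in range(n_iter):
        # Hard assignment: each point goes to its nearest center
        label = np.abs(x[:, None] - centers).argmin(axis=1)
        for j in range(K):
            pts = x[label == j]
            if len(pts):  # skip empty clusters to avoid NaN centers
                centers[j] = pts.mean()
    return centers

rng = np.random.default_rng(3)
x = np.concatenate([rng.normal(-3.0, 1.0, 300), rng.normal(2.0, 0.5, 700)])
centers = np.sort(kmeans_1d(x, K=2))
print(np.round(centers, 1))  # initial means to hand to EM
```

The per-cluster sample variances and relative cluster sizes can likewise seed EM's initial variances and weights.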

Number of Gaussians: use an information-theoretic criterion (e.g., BIC) to obtain the optimal K?
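One common information-theoretic criterion is BIC. The sketch below (with a hypothetical `fit_gmm_1d` helper and synthetic data, not anything defined in these notes) fits K = 1..4 and keeps the K with the lowest BIC; in 1-D a K-component GMM has 3K - 1 free parameters (K - 1 weights, K means, K variances):

```python
import numpy as np

def fit_gmm_1d(x, K, n_iter=200):
    """Fit a 1-D GMM with EM; return (weights, means, variances, log-likelihood)."""
    n = len(x)
    mu = np.quantile(x, [(j + 0.5) / K for j in range(K)])
    var = np.full(K, np.var(x))
    w = np.full(K, 1.0 / K)
    for _ in range(n_iter):
        comp = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = comp / comp.sum(axis=1, keepdims=True)
        Nj = r.sum(axis=0)
        w, mu = Nj / n, (r * x[:, None]).sum(axis=0) / Nj
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / Nj
    comp = w * np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    return w, mu, var, np.log(comp.sum(axis=1)).sum()

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-3.0, 1.0, 400), rng.normal(2.0, 0.5, 600)])

# BIC = p * ln(n) - 2 * lnL, with p = 3K - 1 free parameters in 1-D
best_K, best_bic = None, np.inf
for K in range(1, 5):
    *_, loglik = fit_gmm_1d(x, K)
    bic = (3 * K - 1) * np.log(len(x)) - 2 * loglik
    if bic < best_bic:
        best_K, best_bic = K, bic
print(best_K)  # the data here were generated from two components
```

The ln(n) penalty grows with each extra component, so a larger K only wins when it buys a substantial likelihood gain.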

Simplification of the covariance matrix (e.g., restricting it to be diagonal) is mainly useful for video or other high-dimensional applications.

# Simplified Version with Two Gaussians

With the EM (or K-means) method, we first assign each sample si to one of the Gaussians. Given the assignment, we can compute the ML estimates of the parameters (mean, variance, and weight). Using the newly estimated parameters, we then reassign each sample si to a Gaussian, and iterate.
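The hard-assignment loop just described can be sketched as follows; the synthetic data, the crude min/max initialization, and the iteration count are illustrative assumptions:

```python
import numpy as np

# Two well-separated synthetic clusters
rng = np.random.default_rng(2)
s = np.concatenate([rng.normal(0.0, 1.0, 500), rng.normal(5.0, 1.0, 500)])

mu = np.array([s.min(), s.max()])  # crude initial means
var = np.array([1.0, 1.0])
pi = np.array([0.5, 0.5])

for _ in range(20):
    # Hard assignment: pick the Gaussian with the larger weighted density
    dens = pi * np.exp(-0.5 * (s[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    label = dens.argmax(axis=1)
    # ML re-estimation of (mean, variance, weight) on each group
    for j in (0, 1):
        group = s[label == j]
        mu[j], var[j], pi[j] = group.mean(), group.var(), len(group) / len(s)

print(np.round(np.sort(mu), 1))  # should approach the true means 0 and 5
```

This is exactly K-means-style alternation, except that the decision boundary also accounts for each Gaussian's weight and variance rather than distance alone.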

The mixed PDF is:

$$p(s) = {\pi}_{1}\, \mathcal{N}(s \mid {\mu}_{1}, {\sigma}_{1}^{2}) + {\pi}_{2}\, \mathcal{N}(s \mid {\mu}_{2}, {\sigma}_{2}^{2})$$

where

${\pi }_{1}+{\pi }_{2} = 1$

Open notes:

- Quantum mechanics interpretation?
- Hard/soft → a "firm" assignment with an error vector for the responsibility?
- K-means for the initial values?