Expectation Maximization (EM) Examples

by allenlu2007

 

 

 

 

This post is mainly based on the following references.

 

Reference 1 is a Stanford tutorial paper.

 

Reference 2 is a Washington tutorial paper.

 

 

The EM algorithm uses an iterative method to compute ML or MAP parameter estimates. One of its main advantages is handling missing data, hidden data, or latent variables.

At least for me, though, the EM algorithm is not easy to grasp intuitively. Too good to be true? Below are a few examples.

Missing Data Estimation

Example 1a: Normal distribution with unknown mean (but known variance)

[Equation images not recovered: the Gaussian model, the likelihood of the observables, and the ML estimate of the mean, which equals the sample mean.]

 

For missing data, let x be the complete data, y the observables, and z the missing (hidden) data:

x = [y, z], where y = (x1, ..., xm) and z = (x(m+1), ..., xn)

[Equation images not recovered: the complete-data likelihood and its decomposition over y and z.]

 

Following the earlier post, we now apply the EM algorithm. The first step is the Expectation step:

[Equation images not recovered: the E-step Q function and the resulting M-step update for the mean.]

The derivation proceeds as follows:

[Equation image not recovered.]

As t → ∞, μ → (1/m)(x1 + ... + xm), the same as the ML estimate without missing data!
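As a reference, here is a hedged sketch of the standard derivation for this example (my own reconstruction; the original equation images are lost). With known variance σ², the E-step replaces each missing value by the current mean estimate, and the M-step averages:

\[ Q(\mu \mid \mu^{(t)}) = E\big[\log L_c(\mu) \mid y, \mu^{(t)}\big] = -\frac{1}{2\sigma^2}\Big[\sum_{i=1}^{m}(x_i-\mu)^2 + (n-m)\big((\mu^{(t)}-\mu)^2 + \sigma^2\big)\Big] + \text{const} \]

\[ \mu^{(t+1)} = \arg\max_{\mu} Q(\mu \mid \mu^{(t)}) = \frac{\sum_{i=1}^{m} x_i + (n-m)\,\mu^{(t)}}{n} \]

The fixed point of this recursion is exactly (1/m) Σ_{i=1}^{m} x_i, which matches the limit stated above.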

 

Example 1b: Normal distribution with unknown mean and variance

In fact, 1a is a special case of 1b. A more complete derivation is as follows:

[Equation image not recovered.]

Of course, 1a and 1b are trivial cases, since the mean and variance eventually converge to the ML estimator without missing data. Introducing missing data cannot squeeze out any extra information and is not particularly meaningful, because the missing data (x(m+1), ..., x(n)) and the observables (x(1), ..., x(m)) are independent. In general, can we prove that EM with missing i.i.d. data always converges to the MLE without the missing data?

The next example involves a different kind of hidden information.


Example 2: Fisher's Multinomial Distribution

The history is cited here.

[Equation image not recovered: the multinomial cell probabilities, implied by the steps below to be (1/2 + phi/4, (1 − phi)/4, (1 − phi)/4, phi/4).]

The pmf, where the counts (y1, y2, y3, y4) are non-negative integers:

[Equation image not recovered: the multinomial pmf of y.]

where y = (y1, y2, y3, y4) = (125, 18, 20, 34).

[Equation image not recovered.]

 

Let the complete data be x = (y11, y12, y2, y3, y4), where y11 + y12 = y1.

[Equation images not recovered: the complete-data pmf and the conditional distribution of y11 given y1.]

where Bin = Binomial with n = y1 and p = (1/2) / (1/2 + phi/4). Then E(Y11) = np, as follows:

[Equation image not recovered: y11(0) = y1 · (1/2) / (1/2 + phi(0)/4).]

y12(0) = y1 − y11(0). This completes the E part.

 

E(log Lc | y) = (y12(0) + y4) log(phi) + (y2 + y3) log(1 − phi)

Setting the derivative to 0 ==> phi(1) = (y12(0) + y4) / (y12(0) + y2 + y3 + y4) = (y12(0) + y4) / (n − y11(0))

Iterating the E and M steps converges to phi = 0.6268214.
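A short MATLAB sketch of this iteration (my own illustration, following the E and M steps above; variable names are mine):

y = [125 18 20 34];                          % observed counts (y1, y2, y3, y4)
phi = 0.5;                                   % initial guess
for t = 1:20
    y11 = y(1) * (1/2) / (1/2 + phi/4);      % E-step: expected split of y1
    y12 = y(1) - y11;
    phi = (y12 + y(4)) / (y12 + y(2) + y(3) + y(4));   % M-step
end
phi                                          % converges to about 0.6268214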


Mixture Estimation

Mixture estimation has practical statistical applications such as pattern recognition, data mining, etc. Even digital communication can be viewed as a mixture estimation problem.

Mixture estimation problems come in different levels of difficulty: (i) the number of pdfs and their parameters are known, and only the unknown weights are estimated; (ii) the number of partially specified pdfs is known, and the unknown parameters and weights are estimated; (iii) the hardest case, where even the number of partially specified pdfs is unknown, and the unknown parameters and weights must be estimated.

This post addresses (i) and (ii) by applying EM and Gibbs sampling schemes respectively. We will address task (iii) as an illustration of reversible jump MCMC later.

We consider several distributions: Binomial, Poisson, and Gaussian.

Binomial and Poisson mixtures are discrete distributions. The Poisson distribution is often used in optical sensor applications such as communication or imaging.

The Gaussian mixture is a continuous distribution mixture, with broad applications such as data mining, big data, etc.

Assume that a sample X1, X2, ..., Xn comes from the mixture

[Equation image not recovered: the mixture density f(x) = w1 f1(x) + w2 f2(x) + ... + wk fk(x).]

where the weights wi sum to 1 and wi ≥ 0. fi(x) could be Binomial, Poisson, or Gaussian.

 

Case 1: 

Consider the simplest case first: only the weights are estimated. The log-likelihood function is

[Equation image not recovered: l(w) = sum over i of log( w1 f1(xi) + ... + wk fk(xi) ).]

Taking the derivative with respect to wi in a direct ML approach gives no closed-form solution.

Using EM's latent-variable approach, in addition to the observables (x1, x2, ..., xn) we introduce an unobserved indicator matrix Z = (zij), i = 1, ..., n and j = 1, ..., k, where zij = 1 if observation xi comes from distribution fj and zij = 0 otherwise.

Therefore, the new complete-data likelihood Lc(w) and its log-likelihood are shown below:

[Equation images not recovered: the complete-data likelihood Lc(w) and log Lc(w).]

E-Step

[Equation images not recovered: the conditional expectation of zij given the data and the current weights, and the resulting Q function.]

The result holds regardless of the component distribution.

[Equation image not recovered: the M-step weight update.]
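For reference, a hedged sketch of the standard E and M steps for this weights-only case (my reconstruction; k denotes the number of components):

\[ \hat{z}_{ij}^{(t)} = E\big[z_{ij} \mid x_i, w^{(t)}\big] = \frac{w_j^{(t)} f_j(x_i)}{\sum_{l=1}^{k} w_l^{(t)} f_l(x_i)} \]

\[ w_j^{(t+1)} = \frac{1}{n}\sum_{i=1}^{n} \hat{z}_{ij}^{(t)} \]

The M-step has this simple averaging form no matter which family the fj belong to, which is why the result does not depend on the distribution.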

   

Example 3: Gaussian Mixture

 

Case 2: unknown mean and variance

First, with a common variance:

[Equation image not recovered.]

With independent variances:

[Equation image not recovered.]

[Equation images not recovered: the corresponding E-step and M-step updates.]
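For reference, the standard textbook EM updates for a Gaussian mixture with component-specific means and variances look as follows (my own sketch, not the original images):

\[ \hat{z}_{ij}^{(t)} = \frac{w_j^{(t)}\, N(x_i \mid \mu_j^{(t)}, \sigma_j^{2\,(t)})}{\sum_{l} w_l^{(t)}\, N(x_i \mid \mu_l^{(t)}, \sigma_l^{2\,(t)})} \]

\[ \mu_j^{(t+1)} = \frac{\sum_i \hat{z}_{ij}^{(t)} x_i}{\sum_i \hat{z}_{ij}^{(t)}}, \qquad \sigma_j^{2\,(t+1)} = \frac{\sum_i \hat{z}_{ij}^{(t)} \big(x_i - \mu_j^{(t+1)}\big)^2}{\sum_i \hat{z}_{ij}^{(t)}}, \qquad w_j^{(t+1)} = \frac{1}{n}\sum_i \hat{z}_{ij}^{(t)} \]

For the common-variance case, the variance update is instead pooled over all components.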

 

  

  


Special case of two Gaussians

[Equation images not recovered: the two-component Gaussian mixture density and its likelihood.]

Given the hidden variable sigma_i ∈ {0, 1},

 

[Equation image not recovered.]

 

E-step:

[Equation images not recovered: the conditional expectation of sigma_i given the data and current parameters, and the resulting updates.]
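Before the built-in fit below, here is a minimal from-scratch MATLAB sketch of these E and M steps for a 1-D two-Gaussian mixture (my own illustration; the data, initialization, and variable names are mine):

x  = [0 + 1*randn(600,1); 5 + 2*randn(400,1)];   % 1-D data from two Gaussians
n  = numel(x);
w  = [0.5 0.5];                                  % initial weights
mu = [min(x) max(x)];                            % initial means
s2 = [var(x) var(x)];                            % initial variances
for t = 1:100
    % E-step: posterior probability that each point belongs to component 1
    p1 = w(1) * normpdf(x, mu(1), sqrt(s2(1)));
    p2 = w(2) * normpdf(x, mu(2), sqrt(s2(2)));
    g  = p1 ./ (p1 + p2);
    % M-step: re-estimate weights, means, and variances
    w  = [mean(g) 1-mean(g)];
    mu = [sum(g.*x)/sum(g)  sum((1-g).*x)/sum(1-g)];
    s2 = [sum(g.*(x-mu(1)).^2)/sum(g)  sum((1-g).*(x-mu(2)).^2)/sum(1-g)];
end
[w; mu; s2]                                      % estimated weights, means, variances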

MATLAB code using the built-in gmdistribution fit (a 2-D example) is listed below:

% simulate 2000 points from two bivariate Gaussians
MU1 = [1 2];
SIGMA1 = [2 0; 0 .5];
MU2 = [-3 -5];
SIGMA2 = [1 0; 0 1];
X = [mvnrnd(MU1,SIGMA1,1000); mvnrnd(MU2,SIGMA2,1000)];

scatter(X(:,1),X(:,2),10,'.')
hold on

% fit a two-component Gaussian mixture by EM and overlay its contours
options = statset('Display','final');
obj = gmdistribution.fit(X,2,'Options',options);

h = ezcontour(@(x,y)pdf(obj,[x y]),[-8 6],[-8 6]);

ComponentMeans = obj.mu
ComponentCovariances = obj.Sigma
MixtureProportions = obj.PComponents

 

  
[Figure not recovered: scatter plot of the simulated data with contours of the fitted two-component Gaussian mixture.]


 

 

 

  

 

  

 

 

 

 

 

 

 

 

 

 
