Baum-Welch (EM) Algorithm for HMM

by allenlu2007

A previous post covered the EM algorithm for Kalman filter training.

Step 0: Initial guess

A, C, Q, R, pi (initial mean), v (initial covariance)

Step 1: Kalman filter forward pass

Step 2: Kalman smoothing (Rauch) backward pass

Step 3: Update A, C, Q, R to maximize the Q function!  

Step 4: Repeat from Step 1 until convergence.

 

The key is Step 3: first compute the joint pdf p(x1, …, xn, y1, …, yn).

The Q function derivatives are then taken in terms of this joint pdf.
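
For reference, in the linear-Gaussian state-space model the Q function is the expected complete-data log likelihood (the standard EM form; the expectation is under the smoothed posterior obtained from Steps 1 and 2):

Q(\theta, \theta^{old}) = E_{x|y, \theta^{old}}[\log p(x_1, \ldots, x_n, y_1, \ldots, y_n | \theta)]

\log p(x, y | \theta) = \log p(x_1) + \sum_{t=2}^{n} \log p(x_t | x_{t-1}) + \sum_{t=1}^{n} \log p(y_t | x_t)

Setting the derivatives with respect to A, C, Q, R to zero gives closed-form updates in terms of the smoothed means and covariances.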

 

For HMMs there is an analogous EM algorithm, namely the Baum-Welch algorithm, and it is even simpler:

(1) A and Q become the matrix A (transition probability matrix)

(2) C and R become the matrix B (emission probability matrix)

(3) pi and v become the initial state probability vector pi

 

In essence, the Gaussian pdfs (means and covariances) are replaced by probability matrices/vectors!

 

Step 0: Initial guess of A, B, pi

Step 1: Forward pass using HMM filtering

Step 2: Backward pass using HMM smoothing

 

Step 3: EM algorithm to maximize Q function

Step 4: Repeat from Step 1 and check for convergence.

Details:

Step 0: Initial guess of A, B, pi

A: NxN  (N states)

B: KxN  (K possible output values)

Pi:  Nx1 vector

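For reference, the standard definitions (with hidden states z_t \in \{1, \ldots, N\} and discrete observations x_t \in \{1, \ldots, K\}):

a_{ij} = P(z_{t+1} = j | z_t = i), \quad \sum_j a_{ij} = 1

b_j(k) = P(x_t = k | z_t = j), \quad \sum_k b_j(k) = 1

\pi_i = P(z_1 = i), \quad \sum_i \pi_i = 1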

Step 1: Forward pass using HMM filtering

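In the standard notation, the forward (filtering) variable for a length-T sequence is \alpha_t(i) = P(x_1, \ldots, x_t, z_t = i | \theta), computed recursively:

\alpha_1(i) = \pi_i \, b_i(x_1)

\alpha_{t+1}(j) = b_j(x_{t+1}) \sum_i \alpha_t(i) \, a_{ij}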

Step 2: Backward pass using HMM smoothing

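Similarly, the backward variable is \beta_t(i) = P(x_{t+1}, \ldots, x_T | z_t = i, \theta):

\beta_T(i) = 1

\beta_t(i) = \sum_j a_{ij} \, b_j(x_{t+1}) \, \beta_{t+1}(j)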

Step 3: Update A, B, pi using the EM algorithm to maximize the Q function

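The update combines the two passes. Define the smoothed posteriors

\gamma_t(i) = P(z_t = i | X, \theta) = \frac{\alpha_t(i)\beta_t(i)}{\sum_j \alpha_t(j)\beta_t(j)}

\xi_t(i,j) = P(z_t = i, z_{t+1} = j | X, \theta) \propto \alpha_t(i) \, a_{ij} \, b_j(x_{t+1}) \, \beta_{t+1}(j)

Then the maximizing parameters are ratios of expected counts:

\pi_i^* = \gamma_1(i), \quad a_{ij}^* = \frac{\sum_{t=1}^{T-1} \xi_t(i,j)}{\sum_{t=1}^{T-1} \gamma_t(i)}, \quad b_j(k)^* = \frac{\sum_{t: x_t = k} \gamma_t(j)}{\sum_{t=1}^{T} \gamma_t(j)}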

The detailed derivation is given at the end of this post.

 

Step 4: Repeat from Step 1 and check for convergence.

It is possible to over-fit a particular data set, i.e. P(Y|\theta_{final}) > P(Y|\theta_{true}). The algorithm also does not guarantee a global maximum.
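
To make the four steps concrete, here is a minimal NumPy sketch of the whole loop (my own illustrative code, not from any particular library; note B is stored states-by-symbols, i.e. NxK, transposed relative to the KxN convention above, purely for indexing convenience; it uses the standard per-step scaling of alpha and beta to avoid underflow and a simple log-likelihood threshold as the convergence check):

```python
import numpy as np

def baum_welch(x, N, K, n_iter=100, tol=1e-6, seed=0):
    """Fit an N-state HMM with K output symbols to the integer
    observation sequence x (values in 0..K-1) via Baum-Welch."""
    x = np.asarray(x)
    T = len(x)
    rng = np.random.default_rng(seed)

    # Step 0: random row-stochastic initial guesses for A, B, pi.
    A = rng.random((N, N)); A /= A.sum(axis=1, keepdims=True)
    B = rng.random((N, K)); B /= B.sum(axis=1, keepdims=True)
    pi = np.full(N, 1.0 / N)

    prev_ll = -np.inf
    for _ in range(n_iter):
        # Step 1: forward pass, scaled so each alpha[t] sums to 1;
        # the scaling constants c[t] give the log likelihood.
        alpha = np.zeros((T, N))
        c = np.zeros(T)
        alpha[0] = pi * B[:, x[0]]
        c[0] = alpha[0].sum(); alpha[0] /= c[0]
        for t in range(1, T):
            alpha[t] = (alpha[t - 1] @ A) * B[:, x[t]]
            c[t] = alpha[t].sum(); alpha[t] /= c[t]

        # Step 2: backward pass with the same scaling constants.
        beta = np.ones((T, N))
        for t in range(T - 2, -1, -1):
            beta[t] = (A @ (B[:, x[t + 1]] * beta[t + 1])) / c[t + 1]

        # Step 3: E-step posteriors (gamma, xi) and M-step updates.
        gamma = alpha * beta          # already normalized by scaling
        xi = np.zeros((N, N))         # xi summed over t = 1..T-1
        for t in range(T - 1):
            xi += (alpha[t][:, None] * A * B[:, x[t + 1]]
                   * beta[t + 1]) / c[t + 1]
        pi = gamma[0]
        A = xi / gamma[:-1].sum(axis=0)[:, None]
        B = np.stack([gamma[x == k].sum(axis=0) for k in range(K)],
                     axis=1)
        B /= gamma.sum(axis=0)[:, None]

        # Step 4: stop when the log likelihood stops improving.
        ll = np.log(c).sum()
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return A, B, pi
```

For example, baum_welch(np.array([0, 1, 0, 2, 1, 0]), N=2, K=3) fits a 2-state, 3-symbol HMM to a toy sequence.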

 

Baum-Welch Algorithm Derivation

Z is the latent state sequence, with states in {1, …, M}; X is the observation sequence x_1, …, x_N.

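With this notation, the complete-data likelihood factorizes along the chain; this is the standard starting point:

P(X, Z | \theta) = \pi_{z_1} \prod_{t=1}^{N-1} a_{z_t z_{t+1}} \prod_{t=1}^{N} b_{z_t}(x_t)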

Baum-Welch is an EM algorithm

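Each iteration computes the posterior over Z using the current parameters \theta^{old} (E-step), then maximizes the expected complete-data log likelihood (M-step):

\theta^{new} = \arg\max_\theta \, Q(\theta, \theta^{old})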

Q-function

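Substituting the factorized likelihood, the Q function splits into three independent terms, one per parameter group:

Q(\theta, \theta^{old}) = \sum_Z P(Z | X, \theta^{old}) \log P(X, Z | \theta)

= \sum_i \gamma_1(i) \log \pi_i + \sum_{t=1}^{N-1} \sum_{i,j} \xi_t(i,j) \log a_{ij} + \sum_{t=1}^{N} \sum_j \gamma_t(j) \log b_j(x_t)

with \gamma and \xi as defined in Step 3 above.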

Then use Lagrange multipliers to find the maximum:

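Each term is maximized subject to its own normalization constraint. For the transition row a_{i\cdot}, for example, the Lagrangian is

L = \sum_{t=1}^{N-1} \sum_j \xi_t(i,j) \log a_{ij} + \lambda \Big( \sum_j a_{ij} - 1 \Big)

Setting \partial L / \partial a_{ij} = 0 gives a_{ij} \propto \sum_t \xi_t(i,j), and normalizing recovers

a_{ij}^* = \frac{\sum_{t=1}^{N-1} \xi_t(i,j)}{\sum_{t=1}^{N-1} \gamma_t(i)}

The \pi_i and b_j(k) updates follow from the same argument applied to the other two terms.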
