Several Interesting Applications of HMMs

by allenlu2007

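Hidden Markov Models (HMMs) rest on a clear theoretical foundation and have many interesting applications. This post highlights a few common ones.

Neural networks are also often used for these applications. By comparison, the theory behind neural networks seems fuzzier.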


Application 1:  Optical Character Recognition (OCR)

This article can be used as a tutorial reference.


1. The image of a letter is formed from a sequence of thin vertical image slices (segments), as illustrated below:


1a. Feature extraction: the image itself is (i) high dimensional and (ii) continuous-valued (a distribution) rather than a set of discrete values.

So we need (i) feature extraction to reduce the dimension, and (ii) clustering by K-means or GMM, followed by discretisation (quantisation).

2. Each letter (a, b, c, ..., z) has its own character HMM (26 character HMMs, not counting upper and lower case).

3. Merge the character HMMs into a word HMM. The whole thing is one large HMM.

4. First, train each character's own HMM on character samples (Baum-Welch EM algorithm).

5. Given a new image of a word, use the word HMM to predict the most likely sequence of characters. How is the word model trained? Or does it need no training at all?

6. Use EM to determine the transition probability matrix and the emission probability matrix. Use the Viterbi algorithm to determine the most likely characters and word.
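The Viterbi decoding mentioned in step 6 can be sketched as follows. This is a minimal illustration for a discrete HMM, not the implementation of any particular OCR system; the function name `viterbi` and the toy two-state model with three output symbols are assumptions made up for the example.

```python
import math

def viterbi(obs, start_p, trans_p, emit_p):
    """Return (best_state_path, log_probability) for a discrete HMM.

    obs      : list of observation symbol indices
    start_p  : start_p[i]    = P(first state = i)
    trans_p  : trans_p[i][j] = P(next state = j | current state = i)
    emit_p   : emit_p[i][k]  = P(symbol k | state i)
    """
    n_states = len(start_p)

    def log(p):
        # Work in log space so long sequences do not underflow.
        return math.log(p) if p > 0 else float("-inf")

    # delta[i]: best log-probability of any path that ends in state i
    delta = [log(start_p[i]) + log(emit_p[i][obs[0]]) for i in range(n_states)]
    backptr = []
    for o in obs[1:]:
        new_delta, ptr = [], []
        for j in range(n_states):
            best_i = max(range(n_states),
                         key=lambda i: delta[i] + log(trans_p[i][j]))
            new_delta.append(delta[best_i] + log(trans_p[best_i][j])
                             + log(emit_p[j][o]))
            ptr.append(best_i)
        delta = new_delta
        backptr.append(ptr)

    # Trace back from the best final state.
    last = max(range(n_states), key=lambda i: delta[i])
    path = [last]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return path, delta[last]

# Toy model (hypothetical numbers): 2 hidden states, 3 quantised symbols.
start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
path, log_prob = viterbi([0, 1, 2], start, trans, emit)
print(path)  # [0, 0, 1]
```

In an OCR setting the states would belong to character HMMs, so the decoded state path can be read off as a character sequence.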



Two phases:








How is the word model trained? It is only a single HMM.
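One possible answer, offered here as an assumption rather than as the method of the tutorial, is that the word HMM needs no parameters of its own: it can be assembled by chaining already-trained character HMMs. The sketch below shows only that assembly step. The dict layout (`trans`, `exit`, `emit`), the function name `concat_hmms`, and the toy numbers are all hypothetical.

```python
def concat_hmms(models):
    """Chain left-to-right character HMMs into one word HMM.

    Each model is a dict with
      "trans": n x n within-model transition matrix
      "exit" : length-n list, probability of leaving the model from each state
      "emit" : n x K emission matrix
    so that sum(trans[i]) + exit[i] == 1 for every state i.
    """
    total = sum(len(m["trans"]) for m in models)
    trans = [[0.0] * total for _ in range(total)]
    emit = []
    offset = 0
    for idx, m in enumerate(models):
        n = len(m["trans"])
        for i in range(n):
            # Copy the character's own transitions into its diagonal block.
            for j in range(n):
                trans[offset + i][offset + j] = m["trans"][i][j]
            # The exit probability becomes the entry into the first state
            # of the next character. For the last character it is left out,
            # so its rows keep the "end of word" mass implicit.
            if idx + 1 < len(models):
                trans[offset + i][offset + n] = m["exit"][i]
        emit.extend(row[:] for row in m["emit"])
        offset += n
    return trans, emit

# Two toy character models (hypothetical numbers, 2 output symbols).
char_a = {"trans": [[0.6, 0.4], [0.0, 0.7]], "exit": [0.0, 0.3],
          "emit": [[0.9, 0.1], [0.2, 0.8]]}
char_b = {"trans": [[0.5]], "exit": [0.5], "emit": [[0.5, 0.5]]}

word_trans, word_emit = concat_hmms([char_a, char_b])
for row in word_trans:
    print(row)
```

The resulting matrices describe one large HMM, which matches the remark that the word model is a single HMM even though it is built from per-character pieces.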



Application 2:  Speech Recognition


The basic idea is similar to OCR:

1. Feature extraction

2. Assign the (high-dimensional) feature vectors to HMM states

3. Baum-Welch forward-backward training estimates the transition and emission probabilities of each HMM state

4. Viterbi training assigns each feature vector to a particular state
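Step 3 can be illustrated with a single Baum-Welch re-estimation pass for a discrete HMM. This is a minimal sketch that assumes one short observation sequence of already-quantised symbols and uses unscaled probabilities, so it would underflow on long sequences. The function names and toy numbers are assumptions for the example.

```python
def forward(obs, pi, A, B):
    """alpha[t][i] = P(obs[0..t], state_t = i)."""
    n = len(pi)
    alpha = [[pi[i] * B[i][obs[0]] for i in range(n)]]
    for o in obs[1:]:
        prev = alpha[-1]
        alpha.append([sum(prev[i] * A[i][j] for i in range(n)) * B[j][o]
                      for j in range(n)])
    return alpha

def backward(obs, A, B):
    """beta[t][i] = P(obs[t+1..] | state_t = i)."""
    n = len(A)
    beta = [[1.0] * n]
    for o in reversed(obs[1:]):
        nxt = beta[0]
        beta.insert(0, [sum(A[i][j] * B[j][o] * nxt[j] for j in range(n))
                        for i in range(n)])
    return beta

def baum_welch_step(obs, pi, A, B):
    """One EM update; returns (new_pi, new_A, new_B, likelihood of the inputs)."""
    n, K, T = len(pi), len(B[0]), len(obs)
    alpha, beta = forward(obs, pi, A, B), backward(obs, A, B)
    likelihood = sum(alpha[-1])
    # gamma[t][i] = P(state_t = i | obs)
    gamma = [[alpha[t][i] * beta[t][i] / likelihood for i in range(n)]
             for t in range(T)]
    # xi[t][i][j] = P(state_t = i, state_{t+1} = j | obs)
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / likelihood
            for j in range(n)] for i in range(n)] for t in range(T - 1)]
    new_pi = gamma[0][:]
    new_A = [[sum(xi[t][i][j] for t in range(T - 1)) /
              sum(gamma[t][i] for t in range(T - 1))
              for j in range(n)] for i in range(n)]
    new_B = [[sum(gamma[t][i] for t in range(T) if obs[t] == k) /
              sum(gamma[t][i] for t in range(T))
              for k in range(K)] for i in range(n)]
    return new_pi, new_A, new_B, likelihood

# Toy run (hypothetical numbers): 2 states, 3 symbols.
obs = [0, 1, 2, 2, 1, 0, 0, 2]
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]
for _ in range(5):
    pi, A, B, lik = baum_welch_step(obs, pi, A, B)
    print(lik)  # likelihood of the parameters before each update
```

Each pass should not decrease the likelihood of the training sequence, which is the usual EM property. A practical trainer would add scaling or log-space arithmetic and pool statistics over many sequences.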


Like the character in OCR, the basic unit of speech recognition is the phoneme. Each phoneme has its own HMM.



Word-level and sentence-level HMMs are built from phoneme-level units as follows:



Application 3:  Gesture Recognition




(1) Segment the image or voice input

(2) feature extraction (may be high dimensional)

(3) clustering and discretisation (quantisation) to finite output values

(4) Build the HMM and use the Baum-Welch EM algorithm to train the transition and emission probabilities

(5) Feed in real data and use Viterbi to find the best state sequence
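Step (3) can be sketched as vector quantisation: cluster the feature vectors with K-means, then replace each vector by the index of its nearest centroid, which gives the discrete symbols a discrete HMM expects. The code below is a minimal pure-Python illustration. The 2-D feature vectors, the deterministic initialisation, and the function names are assumptions for the example, and real features would be higher dimensional.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def quantise(vec, centroids):
    """Return the index of the centroid nearest to vec (the discrete symbol)."""
    return min(range(len(centroids)), key=lambda c: sq_dist(vec, centroids[c]))

def kmeans(points, k, iters=20):
    """Plain K-means with a simple deterministic initialisation."""
    # Pick k evenly spaced input points as starting centroids (an assumption;
    # random or k-means++ initialisation is more common in practice).
    centroids = [list(points[i * len(points) // k]) for i in range(k)]
    dim = len(points[0])
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[quantise(p, centroids)].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(m[d] for m in members) / len(members)
                                for d in range(dim)]
    return centroids

# Hypothetical 2-D feature vectors forming two well-separated groups.
features = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
codebook = kmeans(features, k=2)
symbols = [quantise(f, codebook) for f in features]
print(symbols)  # [0, 0, 0, 1, 1, 1]
```

The resulting symbol sequence is what steps (4) and (5) would consume as HMM observations.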





How should mixed Chinese-English speech recognition, or English speech recognition for Chinese speakers, be handled?




Application 4:  Biology – Gene Prediction, Protein Unfold