
Deep Learning for Object Detection

Reference:

1. Joyce Xu, Medium, "Deep Learning for Object Detection: A Comprehensive Review"

2. Faster R-CNN 論文筆記——FR (Faster R-CNN paper notes)

 

Introduction

The most prominent applications of deep learning are in computer vision, smart audio, and natural language processing.

AlexNet, GoogLeNet, and friends first made their mark in the ImageNet object classification competition.

 

The next step is (multiple) object detection: finding the locations (bounding boxes) of multiple objects while also classifying each one.

(Multiple) Object Detection = Object localization (bounding box) + Object classification (feature extraction)

 

Note that localization and classification pull the algorithm in opposite directions: classification should be location invariant (a shifted or rescaled apple is still an apple);

but the bounding box for localization must be location and scale variant, and the more sensitive the better, so that the tightest possible box can be found.

 

Conflicts of this kind are common in computer vision.

For example, a face detector should generalize as broadly as possible, yet it must also discriminate between different faces: men, women, children, etc. These two requirements likewise conflict.

 

Intuitively, a cascaded approach seems natural: localize the object first, then classify it.

In face recognition, for example, one can first detect the face (face bounding box) and then identify the person.

But this works only for faces, because the object is already known to be a face! Prior knowledge (facial features) can drive the detection (bounding box).

In the general case we do not know in advance what the object is (no prior knowledge), so localization and classification cannot be cleanly split.

An iterative process between bounding-box estimation and feature extraction is needed.

 

Region Proposal (候選區域)

The localization task can be formulated as a region proposal (RP) problem:

from the many candidate bounding boxes, find the most accurate bounding box (the region proposal).


 

IoU (Intersection-over-Union)

How do we define an accurate bounding box? Naturally, by comparing against the ground-truth bounding box.

Region proposals pre-select locations where objects are likely to appear, using texture, edge, and color cues in the image, so that a relatively small number of windows (a few hundred to a few thousand) still achieves high recall.

Treat the ground-truth bounding box as A and the region proposal as B. IoU = area(A∩B) / area(A∪B) is the quantitative score, and the goal is to maximize IoU.

Region proposal methods produce higher-quality candidates than the traditional sliding-window approach. Commonly used methods include Selective Search (SS) and Edge Boxes (EB).
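For concreteness, here is a minimal IoU computation for two axis-aligned boxes. The (x1, y1, x2, y2) corner format is an assumption made for this example, not something specified above.

def iou(a, b):
    # a, b: boxes as (x1, y1, x2, y2) with x1 < x2 and y1 < y2
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ~= 0.143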

 

R-CNN, Fast R-CNN, Faster R-CNN


 

R-CNN  (Region-based Convolutional Neural Networks)


1. Selective Search (SS) extracts region proposals;

2. a CNN extracts features (every one of the ~2k region proposals gets its own CNN forward pass);

3. SVMs classify;

4. bounding-box regression refines the boxes.

 

Drawbacks:

1. cumbersome training pipeline (fine-tune the network + train SVMs + train the bbox regressor);

2. slow in both training and testing;

3. training consumes a lot of disk space.

 

Contributions:

1. mAP jumps from 34.3% (DPM HSC) straight to 66%;

2. introduces the RP + CNN pipeline.

 

Fast R-CNN  (Fast Region-based Convolutional Neural Networks)


Source: https://read01.com/RxO842.html

1. Selective Search (SS) extracts region proposals;

2. a CNN extracts features;

3. softmax classifies;

4. a multi-task loss trains the bounding-box regression.

 

Drawbacks:

1. still uses SS for proposals (2-3 s per image, versus 0.32 s for feature extraction);

2. cannot meet real-time requirements, and the external proposal step means training and testing are not yet truly end-to-end;

3. the CNN runs on the GPU, but the region-proposal method runs on the CPU.

 

Improvements:

1. mAP rises from 66.9% to 70%;

2. about 3 s per image overall.

 

Note: Fast R-CNN extracts region-of-interest features after the shared feature map has been computed, so the regions no longer need separate CNN forward passes.

But why does this work? Does a region proposal still correspond to the same location in the final layer's feature map??? (What is the relationship between the RoI projection and the RoI pooling layer in the framework figure?)
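My reading of the usual answer (not spelled out in the referenced posts): convolution and pooling preserve spatial layout up to the cumulative stride, so an image-space box can be projected onto the feature map simply by dividing its coordinates by that stride; RoI pooling then grids the projected box into a fixed-size output. A sketch, assuming a VGG16-style backbone with stride 16:

def project_roi(box, stride=16):
    # map an image-space box (x1, y1, x2, y2) to feature-map coordinates;
    # stride=16 corresponds to four 2x poolings in a VGG16 backbone (an assumption)
    x1, y1, x2, y2 = box
    return (x1 // stride, y1 // stride, x2 // stride, y2 // stride)

print(project_roi((160, 320, 480, 640)))  # (10, 20, 30, 40) on the conv feature map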

 

The Fast R-CNN framework:

 


Fast R-CNN differs from R-CNN in two places:

(1) an RoI pooling layer is added after the last convolutional layer;

(2) a multi-task loss is used, so bounding-box regression is trained directly inside the CNN, and classification uses softmax in place of R-CNN's SVMs.

Apart from the proposal step, Fast R-CNN is therefore end-to-end.

 

Faster R-CNN  (Faster Region-based Convolutional Neural Networks)

1. An RPN extracts region proposals (introducing a region-proposal network!);

2. a CNN extracts features;

3. softmax classifies;

4. a multi-task loss trains the bounding-box regression.

 

Drawbacks:

1. still short of real-time detection;

2. extracting proposals and then classifying every proposal is still computationally heavy.

 

Improvements:

1. better detection accuracy and speed;

2. a truly end-to-end object-detection framework;

3. generating proposals takes only about 10 ms.

 

3.1 The idea behind Faster R-CNN

Faster R-CNN can be viewed simply as the system "region proposal network (RPN) + Fast R-CNN",

replacing the Selective Search method in Fast R-CNN with a region proposal network. The paper focuses on three problems in this system:
1. how to design the region proposal network;
2. how to train the region proposal network;
3. how to share the feature-extraction network between the region proposal network and the Fast R-CNN network.

Three scales appear in the overall Faster R-CNN algorithm:
1. Original image scale: the size of the raw input. It is unrestricted and does not affect performance.

2. Normalized scale: the input size of the feature-extraction network, set at test time (opts.test_scale=600 in the source code). Anchors are defined at this scale; this parameter and the relative anchor sizes together determine the range of object sizes that can be detected.
3. Network input scale: the input size of the detection network, set at training time (224×224 in the source code).
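For reference, the anchors mentioned above are typically generated as 3 scales × 3 aspect ratios per feature-map cell. A sketch using the commonly quoted default numbers (the exact values here are an assumption, not taken from the text above):

import numpy as np

def make_anchors(base=16, ratios=(0.5, 1.0, 2.0), scales=(8, 16, 32)):
    # nine anchors centered on one feature-map cell, in image coordinates;
    # base=16 matches a VGG16 feature stride (an assumption)
    cx = cy = base / 2.0
    anchors = []
    for r in ratios:
        for s in scales:
            w = base * s * np.sqrt(1.0 / r)  # sqrt keeps the anchor area fixed per scale
            h = base * s * np.sqrt(r)
            anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors

print(len(make_anchors()))  # 9 anchors per feature-map location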

 

3.2 Faster R-CNN框架介紹


 

The Faster R-CNN algorithm consists of two modules:

1. the RPN candidate-box extraction module;

2. the Fast R-CNN detection module.

The RPN is a fully convolutional network used to extract candidate boxes; Fast R-CNN detects and recognizes the objects in the proposals extracted by the RPN.

Same question as before: how do the RoIs of the region proposal network relate to locations in the image?

 

3.3 The RPN

3.3.1 Background

State-of-the-art object-detection networks first hypothesize object locations with a region-proposal algorithm. Networks such as SPPnet and Fast R-CNN have already reduced the running time of the detection network itself, but computing the region proposals remains costly. Facing this bottleneck, RBG, Kaiming He, and colleagues handed the region-proposal step to a CNN as well, proposing the Region Proposal Network (RPN) to extract detection regions; it shares full-image convolutional features with the detection network, making region proposals nearly free.

R-CNN answers the question: "why not use a CNN for classification?"

Fast R-CNN answers: "why not output the bounding box and the label together?"

Faster R-CNN answers: "why still use Selective Search?"


Face Recognition: FaceNet

Reference:

1. A survey of deep-learning-based face recognition techniques (基于深度学习的人脸识别技术综述)

2. FaceNet: A Unified Embedding for Face Recognition and Clustering 

3. An analysis of Google's FaceNet face recognition system (谷歌人脸识别系统FaceNet解析)

4. DeepFace: Closing the Gap to Human-Level Performance in Face Verification

5. Recover Canonical-View Faces in the Wild with Deep Neural Networks

  

Introduction

The previous posts covered face detection: Viola-Jones, Cascade CNN, and MTCNN. The next step is face recognition.

This matches everyday intuition: first notice the face, then recall who the person is.

The intuitive recipe for face recognition: (1) find the facial landmarks (e.g. eyes, nose, mouth); (2) convert the relative positions of the landmarks into a high-dimensional vector.

From there, two approaches: (3a) supervised learning: classify using labeled faces (a shallow or deep classifier);

(3b) unsupervised learning: cluster the high-dimensional vectors (e.g. KNN), then classify using a small number of labeled faces.

(3b) unsupervised learning has a practical advantage over (3a) supervised learning: there is no need to prepare a large set of labeled faces in advance.

That is the rough line of thought.

 

Deep learning for face recognition

First, deep learning can merge steps (1) and (2), mapping the face image directly to a high-dimensional vector. This is Google FaceNet's approach.

FaceNet (reference 2) uses a deep neural network to map the input face directly to a 128-D embedding vector.


The authors developed a new face-recognition system, FaceNet, which maps face images directly into a Euclidean space where distance encodes facial similarity.

Once that embedding space exists, face recognition, verification, and clustering all become straightforward.

The method is CNN-based; accuracy is 0.9963 on the LFW dataset and 0.9512 on the YouTube Faces DB dataset.

The core of FaceNet is millions of training images plus the triplet loss.
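As a minimal sketch of how such an embedding is used for verification (the distance threshold below is illustrative, not FaceNet's published number):

import numpy as np

def same_person(emb_a, emb_b, threshold=1.1):
    # embeddings are L2-normalized 128-D vectors; two faces are declared the
    # same identity when their embedding distance falls below a tuned threshold
    return np.linalg.norm(emb_a - emb_b) < threshold

a = np.random.randn(128); a /= np.linalg.norm(a)
b = np.random.randn(128); b /= np.linalg.norm(b)
print(same_person(a, b))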

 

Loss Function

The triplet loss is the heart of the paper. The model embeds an image x into a d-dimensional Euclidean space, f(x) \in \mathbb{R}^d. We want an individual's anchor image x_i^a to stay close to that individual's other images x_i^p (positives) and far from other individuals' images x_i^n (negatives), as shown in Figure 5-1 (triplet-loss schematic).

Triplet selection matters a great deal for convergence. For each anchor x_i^a we want the hard positive, the same individual's image given by \arg\max_{x_i^p} \| f(x_i^a) - f(x_i^p) \|_2^2, and the hard negative, the other individuals' image given by \arg\min_{x_i^n} \| f(x_i^a) - f(x_i^n) \|_2^2.
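A numpy sketch of the triplet loss itself, using the margin α = 0.2 quoted later in this post; the batch shapes are assumptions:

import numpy as np

def triplet_loss(anchor, positive, negative, alpha=0.2):
    # each argument: (batch, 128) array of L2-normalized embeddings
    pos_d = np.sum((anchor - positive) ** 2, axis=1)  # squared distance to the positive
    neg_d = np.sum((anchor - negative) ** 2, axis=1)  # squared distance to the negative
    return np.mean(np.maximum(pos_d - neg_d + alpha, 0.0))  # hinge on the margin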

 

The CNN comes in several architectures: NN1, NN2, NN3, NNS1, NNS2

NN1:


 

NN2 (note that the parameter count drops sharply compared with NN1)


Training uses the AdaGrad optimizer with stochastic gradient descent, running on a CPU cluster for 1,000-2,000 hours. The margin \alpha is set to 0.2.


Second, deep learning can be used for 2D and even 3D face alignment and frontalization,

which raises the recognition rate; this is DeepFace's approach.

2.1 Overview

The conventional face-recognition pipeline is detection → alignment → representation → classification. In this paper, the alignment step is improved with an auxiliary 3D model, and face representations are then learned by a 9-layer neural network trained on 4 million face images of 4,000 individuals. The model achieves 0.9735 accuracy on the LFW dataset. The paper's highlights: first, 3D-model-based face alignment; second, a neural network trained on large data.

2.2 Face alignment

The alignment method in the paper has the following steps: 1. detect the face via 6 fiducial points; 2. crop; 3. build a Delaunay triangulation; 4. bring in a reference 3D face model; 5. fit the 3D model to the image; 6. apply the affine warp; 7. generate the final frontal view.

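To illustrate the 2D essence of steps 5-7, here is a hedged OpenCV sketch that warps a crop so that three landmarks land on canonical positions. All coordinates are made-up example values; DeepFace's actual warp is piecewise affine, driven by the fitted 3D model.

import cv2
import numpy as np

# landmarks detected in the crop: left eye, right eye, mouth center (illustrative)
src = np.float32([[58, 70], [102, 68], [80, 120]])
# canonical landmark positions in a 160x160 aligned output (illustrative)
dst = np.float32([[54, 64], [106, 64], [80, 118]])

img = np.zeros((128, 128, 3), np.uint8)       # stand-in for the detected face crop
M = cv2.getAffineTransform(src, dst)          # 2x3 affine from three point pairs
aligned = cv2.warpAffine(img, M, (160, 160))  # roughly frontalized crop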

2.3 Deep neural network

Figure 2-2: the deep neural network.

2.4 Results

The model achieves 0.9735 accuracy on LFW and also performs well on other datasets such as the Social Face Classification (SFC) dataset and the YouTube Faces (YTF) dataset; see the paper for details.

 

 

Third, deep learning can be used to generate canonical-view faces, again raising the recognition rate; this is the FR+FCN approach.

 

3.1 Overview

In unconstrained conditions, pose, lighting, occlusion, low resolution, and other factors make images of one person vary enormously, which has limited the wide deployment of face recognition. This paper proposes a new deep model that learns the unseen, canonical view of a face. It can therefore greatly reduce the variance among a single person's images (same person, different photos) while preserving the differences between people.

Unlike DeepFace, which reconstructs faces using 2D alignment or 3D information, FR+FCN learns the canonical view (a standard frontal face image) directly from face images. The authors developed a method that automatically selects or synthesizes the canonical view from an individual's photos. The recovered faces have been applied to face verification, and the method achieved the best result to date on the LFW dataset.

Highlights: first, a new method for detecting/selecting the canonical view; second, a deep neural network trained to reconstruct the canonical frontal image.

 

3.2 Canonical-view selection

The authors designed a frontal-face detection measure based on matrix rank and symmetry. As shown in Figure 3-1, each person's face images are ranked by three criteria: first, facial symmetry (the difference between the left and right halves), in ascending order; second, image sharpness, in descending order; third, a combination of the two.

Figure 3-1: frontal face image selection.

Let the matrix Y_i \in R^{64 \times 64} be a face image of the i-th person, and D_i the set of all face images of the i-th person, Y_i \in D_i. The frontal-face measure is:

M(Y_i) = \| Y_i P - Y_i Q \|_F^2 - \lambda \| Y_i \|_*

 

3.3 Face reconstruction

A deep neural network is trained to reconstruct the face. The loss function is:

E(\{ X_{ik}^{0} \}; W) = \sum_{i} \sum_{k} \| Y_i - f(X_{ik}^{0}; W) \|_F^2

where i indexes the i-th person and k the k-th sample of that person; X^0 are the training images and Y the target images.

As shown in Figure 3-2, the deep network has three layers; the first two are followed by max pooling, and the last by a fully connected layer. Unlike a conventional CNN, the filters do not share weights (the assumption is that different regions of the face carry different types of features). The l-th convolutional layer can be written as:

X_{q,uv}^{l+1} = \sigma\Big( \sum_{p=1}^{I} x_{pq,uv}^{l} \circ (X_{p}^{l})_{uv} + x_{q}^{l} \Big)

Figure 3-2: the deep neural network.

 

Finally, Figure 3-3 shows canonical-view face images generated by the trained network.

Figure 3-3: canonical-view faces.

 

Could GANs be used for face recognition (combined with video)?




The FDDB Face Detection Dataset and Its Evaluation Metrics

Reference: 

1. An introduction to the FDDB face detection benchmark (FDDB人脸检测测评数据集介绍)

2. FDDB: A Benchmark for Face Detection in Unconstrained Settings


Introduction

FDDB is one of the world's most authoritative face-detection benchmarks: its test set contains 2,845 images with 5,171 annotated faces.

Image sources: Yahoo News, AP, and Reuters news photos, with duplicate images removed.


1. It emphasizes face detection in everyday scenes.

The test set covers different poses, different resolutions, rotation, occlusion, and so on, and includes both grayscale and color images.

 

2. More precise annotation: FDDB marks each face with an elliptical region. The annotation does not appear to record which way the face points, but it does record the tilt angle of the face.


Elliptical regions

Each face region is represented as a 6-tuple (ra, rb, θ, cx, cy, s),

where ra and rb are the half-lengths of the major and minor axes; θ is the angle of the major axis with the horizontal axis; cx and cy are the x and y coordinates of the center; and s ∈ (−∞, ∞) is the confidence score associated with the detection of this elliptical region.


 

3. There appear to be three pose classes: frontal, profile, and tilted back/front (looking up or down).


2. Face annotation

2.1 Annotation procedure

Define an annotation protocol → have multiple annotators label the images independently → take the average of the annotations as the final value.

2.2 Exclusion of non-face regions

1. Face regions less than 20 pixels in height or width are marked as non-face.
2. Very low-resolution regions (below a set threshold) are marked as non-face.
3. Face regions far from the camera are marked as non-face.
4. Occluded faces with neither eye inside the region are marked as non-face.


2.3 Elliptical annotation

The figures above show, respectively, a frontal face, a profile face, and a tilted (up/down) face.

2.4 Annotation results

The final annotations are elliptical regions, as illustrated.

Each annotated elliptical face consists of six elements. (Why no rectangular regions?)

(ra, rb, Θ, cx, cy, s)
ra, rb: semi-major and semi-minor axes
cx, cy: coordinates of the ellipse center
Θ: angle between the major axis and the horizontal (positive when the head tilts left, negative when it tilts right)
s: confidence score

Θ conversion:
π rad = 180°
1.553739 rad = (1.553739 / π) × 180 ≈ 89.022°
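A small sketch for reading one such annotation and converting Θ to degrees; the single-line, space-separated field order below is an assumption about the ellipse-file format:

import math

line = "123.5 85.2 1.553739 340.8 210.1 1.0"  # ra rb theta cx cy s (illustrative values)
ra, rb, theta, cx, cy, s = map(float, line.split())
print(math.degrees(theta))                     # 1.553739 rad ~= 89.022 degrees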


3. Evaluation

To quantify how well a detected region d_i matches an annotated region l_j, the ratio of their intersection to their union is used:

match ≥ 0.5: face region, counted as a correct detection;
match < 0.5: non-face region, counted as a false positive.


Precision-recall curves are also commonly used to benchmark different algorithms.

 


 

True positive rate = recall = (correctly predicted positive, TP) / (actual positive, TP+FN)

False positive rate = (incorrectly predicted positive, FP) / (actual negative, FP+TN)

Precision = (correctly predicted positive, TP) / (predicted positive, TP + FP)
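The same three definitions in code form, computed from raw confusion-matrix counts (a trivial sketch):

def detection_rates(tp, fp, tn, fn):
    recall = tp / (tp + fn)   # true positive rate
    fpr = fp / (fp + tn)      # false positive rate
    precision = tp / (tp + fp)
    return recall, fpr, precision

print(detection_rates(tp=90, fp=15, tn=880, fn=10))  # illustrative counts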

Clearly, the Viola-Jones algorithm leaves a lot of room for improvement on FDDB.


Face Detection Algorithms: Cascade CNN and MTCNN

Reference:

1. Face detection algorithms (人臉偵測 Face Detection 算法)

2. Real-time face detection and recognition with MTCNN and FaceNet (基于mtcnn和facenet的实时人脸检测与识别系统开发)

3. Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks

4. MTCNN face detection and landmark localization in mxnet (mxnet 实现 mtcnn 人脸检测和特征点定位)

5. A Convolutional Neural Network Cascade for Face Detection

 

The previous post (reference 1) discussed feature-based face detection.

OpenCV's face detector, for example, is the Haar feature-based cascade classifier for object detection.

Viola-Jones proposed Haar feature classifiers (with scaling) plus a cascade detector.

The Haar classifiers look for edge, line, and center-surround features; in short, high-frequency features.

The scaling is done by the classifier itself rather than by rescaling the image.

Most important is the introduction of the cascade architecture; that is what makes real-time detection achievable.


 

But how many classifiers should the cascade contain? That depends on the target false positive rate and detection rate. The false positive rate here is the probability of declaring a non-face window to be a face, while the detection rate is the probability of correctly finding a true face. There is usually a trade-off between the two: aiming for a higher detection rate pushes the false positive rate up, while aiming for a lower false positive rate inevitably costs some detection rate.

The algorithm for selecting the cascade classifiers is given in Algorithm 3. First decide the per-stage false positive rate and detection rate for each layer, then decide an overall target false positive rate and target detection rate; the procedure stops only when the cumulative false positive rate and detection rate reach those targets. Each stage therefore selects enough features to meet its per-stage false positive rate and detection rate.
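A pseudocode-style sketch of that stage-selection loop; the train_stage callback and the default rates are mine, not from the Viola-Jones paper:

def build_cascade(train_stage, f=0.5, d=0.99, F_target=1e-6):
    # train_stage(max_fpr, min_det) -> a stage classifier that adds features
    # until it meets the per-stage false positive rate f and detection rate d
    F, stages = 1.0, []
    while F > F_target:
        stages.append(train_stage(max_fpr=f, min_det=d))
        F *= f  # overall rates multiply across stages: F = f**n, D = d**n
    return stages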


 

Cascade CNN

Li's paper (reference 5) adopts a cascade structure + CNNs to raise detection accuracy.

Li uses a three-level architecture: 12-net, 24-net, and a final 48-net.

The 12-net (12×12 input) does the coarse pass and rejects obvious non-faces. The 24-net (24×24), with the 12-net embedded, rejects more non-faces;

the 48-net, with the 24-net embedded, rejects still more. The whole design is nested.


The 12-net has just 1 convolutional layer + 1 fully connected layer. It is a shallow network, yet it rejects 90% of non-faces, saving a great deal of computation.

The 24-net also has just 1 convolutional layer + 1 fully connected layer, and rejects a further 90%.

Li's paper uses 6 CNNs in total; 3 of them (12-net, 24-net, 48-net) perform binary face / non-face classification.
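The inference-time idea in a few lines; the stage classifiers and thresholds here are hypothetical stand-ins, not the paper's code:

def is_face(window, stages):
    # stages: [(net12, t12), (net24, t24), (net48, t48)], each net returning a
    # face score for the window; early rejection is what makes the cascade cheap
    for net, threshold in stages:
        if net(window) < threshold:
            return False  # rejected early; the larger nets never run
    return True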

 

The other 3 CNNs perform bounding-box calibration, cast as a multi-class classification over a discretized set of displacement patterns.


 

The results are quite good, much better than Viola-Jones (V-J stays below 70%).


 

MTCNN

Zhang's paper (reference 3) likewise adopts a cascade structure to reach real-time face detection,

but replaces Haar feature detection with CNNs.

The cascade structure appears unrelated to the Viola-Jones cascade, however. In the authors' own words:

The proposed CNNs consist of three stages.

Stage 1 (P-Net) produces candidate windows quickly through a shallow CNN.

Stage 2 (R-Net) refines the windows by rejecting a large number of non-face windows through a more complex CNN.

Stage 3 (O-Net) uses a more powerful CNN to refine the result again and output five facial landmark positions.

The biggest difference between Zhang and Li is that no CNNs are used for the bounding box, saving 3 CNNs!

The bounding box is obtained mainly by exploiting the correlation between facial-landmark localization and the box.


The CNN architecture is very similar to Li's paper.


The authors' stated contributions:

1. We propose a new cascaded CNNs based framework for joint face detection and alignment, and carefully design lightweight CNN architecture for real time performance. (the P-Net, R-Net, O-Net described above)

2. We propose an effective method to conduct online hard sample mining to improve the performance.

3. Extensive experiments are conducted on challenging benchmarks, to show significant performance improvement of the proposed approach compared to the state-of-the-art techniques in both face detection and face alignment tasks.

 

Performance (true positive rate ≈ 95%)


 

--------------------------------------------------------------------------

Haar feature + cascade detection (Viola-Jones) - real time

dlib

Faster R-CNN - slow (uses an RPN)

Cascade CNN - better, but slower than MTCNN

MTCNN - tailored for real time





Installing VNC and the MATE Desktop on Ubuntu 16.04

Reference: 

1. Installing VNC and the MATE desktop on Ubuntu 16.04 (Ubuntu 16.04 安裝 VNC 及 Mate 桌面環境)

2. How to Setup A Ubuntu Remote Desktop

 

VNC is essential software, but installation problems are common. Installing the VNC server itself is easy; there are only the vncserver and vnc4server variants.

The server-side desktop, however, comes in many flavors (GNOME, KDE, MATE, etc.). The references above are the smoothest setup I have used.

 

$ sudo apt update

$ sudo apt upgrade -y

$ sudo apt install ubuntu-mate-core ubuntu-mate-desktop

$ sudo apt install vnc4server -y

 

The most important step comes next: edit ~/.vnc/xstartup.

 

#!/bin/sh
# Uncomment the following two lines for normal desktop:
# unset SESSION_MANAGER
unset DBUS_SESSION_BUS_ADDRESS
# exec /etc/X11/xinit/xinitrc
[ -x /etc/vnc/xstartup ] && exec /etc/vnc/xstartup
[ -r $HOME/.Xresources ] && xrdb $HOME/.Xresources
xsetroot -solid grey
vncconfig -iconic &
x-terminal-emulator -geometry 80x24+10+10 -ls -title "$VNCDESKTOP Desktop" &
x-window-manager &
mate-session &

 

VNC client side

Use VNC Viewer on the Mac.

 


 

An even better approach is to use Ubuntu's default desktop sharing; see reference 2.


How To Share Your Ubuntu Desktop

Click the icon at the top of the Unity Launcher, the bar down the left side of the screen.

When the Unity Dash appears, start entering the word "Desktop".

An icon will appear with the words "Desktop Sharing" underneath. Click on this icon.

 

Setting Up Desktop Sharing


The desktop sharing interface is broken down into three sections:

  1. Sharing
  2. Security
  3. Show notification area icon

Sharing

The sharing section has two available options:

  1. Allow other users to view your desktop
  2. Allow other users to control your desktop

If you wish to show another person something on your computer but you don’t want them to be able to make changes then just tick the “allow other users to view your desktop” option.

If you know the person who is going to be connecting to your computer or indeed it is going to be you from another location tick both boxes.

Warning: Do not allow somebody you do not know to have control over your desktop as they can damage your system and delete your files. 

Security

The security section has three available options:

  1. You must confirm each access to this machine
  2. Require the user to enter this password
  3. Automatically configure UPnP router to open and forward ports

If you are setting up the desktop sharing so that other people can connect to your computer to show them your screen then you should check the box for “you must confirm each access to this machine”. This means you know exactly how many people are connecting to your computer. 

If you intend to connect to the computer from another location yourself, make sure the "you must confirm each access to this machine" box does not have a tick in it. If you are elsewhere, you won't be around to confirm the connection.

Whatever your reason for setting up desktop sharing, you should definitely set a password. Place a tick in the "Require the user to enter this password" box and then enter the best password you can think of into the space provided.

The third option deals with accessing the computer from outside your network. By default, your home router only lets the computers and devices connected to it see one another. To connect from the outside world, your router needs to open and forward a port so that the outside computer can join the network and reach the machine you are trying to connect to.

Some routers allow you to configure this from within Ubuntu; if you intend to connect from outside your network, it is worth placing a tick in "Automatically configure UPnP router to open and forward ports".

Show Notifications Area Icon

The notification area is in the top right corner of your Ubuntu desktop. You can configure desktop sharing to display an icon there while it is running.

The options available are as follows:

  1. Always
  2. Only when someone is connected
  3. Never

If you choose "Always", the icon is shown until you turn desktop sharing off. If you choose "Only when someone is connected", the icon appears only while someone is connected to the computer. The final choice, "Never", never shows the icon.


After setting up the VNC server, just close the utility.

3.) Disable encryption.

Due to this bug, the commonly used TigerVNC and TightVNC viewers do not support vino's security type; you will get a "security type not supported" error when you try to connect.

A workaround is to disable the encryption requirement. To do so, install dconf Editor from Ubuntu Software (or via the sudo apt install dconf-editor command in a terminal) and launch it.

When it opens, navigate to org -> gnome -> desktop -> remote-access, and uncheck the value of "require-encryption" on the right.

Finally, connect to this desktop from the remote machine by typing the IP and password into a VNC client!


 

 

 

Deep Learning Machine on Ubuntu LTS 16.04 with GTX 1080

This post focuses on the deep-learning software setup for Ubuntu LTS 16.04 with a GTX 1080 GPU.

Reference: 

1. tflearn: http://tflearn.org/examples/

2. https://standbymesss.blogspot.tw/2016/09/ubuntu-1404-caffe-cuda-75-opencv-31.html -> good reference

 

Step 0: Install emacs and vnc

* sudo apt install emacs

* sudo apt install git

* For VNC, see the post below. ==> Switched to the following reference, which uses Ubuntu's default screen sharing!

http://ubuntuhandbook.org/index.php/2016/07/remote-access-ubuntu-16-04/

 

P.S. Edit /etc/default/locale to change the date format from lzh_TW to en_US.UTF-8.

 

Step 1: Install Ubuntu LTS 16.04 and Nvidia driver

Reference: 

1. CUDA: https://developer.nvidia.com/cuda-downloads

2. cuDNN: http://docs.nvidia.com/deeplearning/sdk/cudnn-install/index.html

 

* Install the Ubuntu desktop version. You need to work around the X-window problem when installing the Nvidia driver.

* Settings -> Software Update -> Additional Drivers -> GTX 1080 -> choose Nvidia driver 375.88 to install the Nvidia driver.

 

Step 2: Install Nvidia CUDA and cuDNN

Reference:

* CUDA 8.0 

  1. sudo dpkg -i cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
  2. sudo apt-get update
  3. sudo apt-get install cuda
  4. sudo dpkg -i cuda-repo-ubuntu1604-8-0-local-cublas-performance-update_8.0.61-1_amd64.deb 

$ nvcc --version   (you may first need to run: $ sudo apt install nvidia-cuda-toolkit)

$ nvidia-smi


Setup environment variables

export CUDA_HOME=/usr/local/cuda-8.0 
export LD_LIBRARY_PATH=${CUDA_HOME}/lib64 
 
PATH=${CUDA_HOME}/bin:${PATH} 
export PATH

  

* cuDNN 6.0 (the commands below are Nvidia's examples and name libcudnn7 / CUDA 9.0 packages; substitute the cuDNN 6 / CUDA 8 filenames you actually downloaded)

  • Navigate to your <cudnnpath> directory containing cuDNN Debian file.
  • Install the runtime library, for example:
    sudo dpkg -i libcudnn7_7.0.2.43-1+cuda9.0_amd64.deb
  • Install the developer library, for example:
    sudo dpkg -i libcudnn7-dev_7.0.2.43-1+cuda9.0_amd64.deb
  • Install the code samples and the cuDNN Library User Guide, for example:
    sudo dpkg -i libcudnn7-doc_7.0.2.43-1+cuda9.0_amd64.deb

Note: after installing cuDNN, I followed the Nvidia cuDNN installation guide (reference 3) and compiled the mnistCUDNN example,

which hit the following error:

Can not use cuDNN on context None: cannot compile with cuDNN. We got this error: In file included from /usr/local/cuda-8.0/include/channel_descriptor.h:62:0, from /usr/local/cuda-8.0/include/cuda_runtime.h:90, from /usr/include/cudnn.h:64, from /tmp/try_flags_F2eFMF.c:4: /usr/local/cuda-8.0/include/cuda_runtime_api.h:1628:101: error: use of enum 'cudaDeviceP2PAttr' without previous declaration extern __host__ __cudart_builtin__ cudaError_t CUDARTAPI cudaDeviceGetP2PAttribute(int *value, enum cudaDeviceP2PAttr attr, int srcDevice, int dstDevice); ^

This was resolved by following https://github.com/Theano/Theano/issues/5856:

Open the file:
/usr/include/cudnn.h

and change the line:
#include "driver_types.h"

to:
#include <driver_types.h>

-------------------------------- 

Section 2.3.1, installing from a tar file:

Copy the following files into the CUDA Toolkit directory.

$ sudo cp cuda/include/cudnn.h /usr/local/cuda/include

$ sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64

$ sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*

However, installing from the Debian file (2.3.2, which is what I did) leaves no cudnn/libcudnn files under /usr/local/cuda/include and lib;

instead they end up under /usr/include and /usr/lib/x86_64-linux-gnu. Why?


 

Step 3: Install Anaconda python, OpenCV, and Caffe and Caffe2

https://standbymesss.blogspot.tw/2016/09/ubuntu-1404-caffe-cuda-75-opencv-31.html

$ sudo apt-get install libprotobuf-dev libleveldb-dev libsnappy-dev libopencv-dev libhdf5-serial-dev protobuf-compiler
$ sudo apt-get install --no-install-recommends libboost-all-dev
$ sudo apt-get install libgflags-dev libgoogle-glog-dev liblmdb-dev
Install ATLAS:
$ sudo apt-get install libatlas-base-dev

Step 3.1 Install Anaconda Python

Download Anaconda2; use Python 2.7.13 for Caffe compatibility.

$ bash Anaconda2-4.0.0-Linux-x86_64.sh

Step 3.2 Install OpenCV

Following https://standbymesss.blogspot.tw/2016/09/ubuntu-1404-caffe-cuda-75-opencv-31.html:

$ sudo apt-get install cmake git libgtk2.0-dev pkg-config libavcodec-dev libavformat-dev libswscale-dev
$ unzip opencv-3.3.0.zip
$ cd opencv-3.3.0/
$ mkdir release
$ cd release
$ cmake -DBUILD_TIFF=ON -DENABLE_AVX=ON -DWITH_OPENGL=ON -DWITH_OPENCL=ON -DWITH_IPP=ON -DWITH_TBB=ON -DWITH_EIGEN=ON -DWITH_V4L=ON -DCMAKE_BUILD_TYPE=RELEASE -DCMAKE_INSTALL_PREFIX=$(python -c "import sys; print(sys.prefix)") -DPYTHON_EXECUTABLE=$(which python) -DPYTHON_INCLUDE_DIR=$(python -c "from distutils.sysconfig import get_python_inc; print(get_python_inc())") -DPYTHON_PACKAGES_PATH=$(python -c "from distutils.sysconfig import get_python_lib; print(get_python_lib())") ..
$ make -j8

Test OpenCV:

a. run python and do "import cv2"

b. run a demo via ./python/python demo.py

The problem I ran into was:

GLib-GIO-Message: Using the 'memory' GSettings backend.

which was fixed with:

export GIO_EXTRA_MODULES=/usr/lib/x86_64-linux-gnu/gio/modules/

 

Step 3.3 Install Caffe

Download Caffe from git and cd into the caffe directory.

$ cd python

$ for req in $(cat requirements.txt); do pip install $req; done

$ cd ..

$ cp Makefile.config.example Makefile.config

Modify Makefile.config based on the reference.

$ make all -j8

$ make test -j8

All OK.

$ make runtest -j8

This ran into a libhdf5_hl.so.8 library problem!!

Adding the ${ANACONDA2}/lib path to LD_LIBRARY_PATH solves it.
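For example (assuming Anaconda2 sits in $HOME/anaconda2; adjust the path to your install):

export LD_LIBRARY_PATH=$HOME/anaconda2/lib:$LD_LIBRARY_PATH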


 

pycaffe:

$ make pycaffe

 

Step 3.4 Install Caffe2

# for Ubuntu 16.04
sudo apt-get install -y --no-install-recommends libgflags-dev

 

# for both Ubuntu 14.04 and 16.04
sudo apt-get install -y --no-install-recommends \
      libgtest-dev \
      libiomp-dev \
      libleveldb-dev \
      liblmdb-dev \
      libopencv-dev \
      libopenmpi-dev \
      libsnappy-dev \
      openmpi-bin \
      openmpi-doc \
      python-pydot
sudo pip install \
      flask \
      future \
      graphviz \
      hypothesis \
      jupyter \
      matplotlib \
      pydot python-nvd3 \
      pyyaml \
      requests \
      scikit-image \
      scipy \
      setuptools \
      six \
      tornado

$ git clone --recursive https://github.com/caffe2/caffe2.git && cd caffe2

$ make && cd build && sudo make install

$ python -c 'from caffe2.python import core' 2>/dev/null && echo "Success" || echo "Failure"

 

python -m caffe2.python.operator_test.relu_op_test

 

 

Step 4: Install PyTorch

conda install pytorch torchvision cuda80 -c soumith


 

Step 5: Install TensorFlow

 

sudo apt-get install libcupti-dev

..



Step 6: Install MXNET

pip install mxnet-cu80==0.11.0

This is mainly used to run MTCNN for face detection.

 

Step 7: Install tflearn

pip install tflearn (or conda install tflearn?)


Step 8: Install caffe/caffe2? to run Faster R-CNN

Reference:

  • caffe Ubuntu 16.04 or 15.10 Installation Guide
  • Needs to be compiled from source code. It may be worth walking through it once on GCP without a GPU first.
  • Also, Caffe works best with Python 2.7:
  • > conda create -n y27 python=2.7 anaconda
  • pip install --ignore-installed --upgrade
    https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.3.0-cp27-none-linux_x86_64.whl





Deep Learning Machine on Windows 10

This post focuses on the deep-learning software setup for Windows 10 with a GTX 1080 GPU.

Reference:

tflearn: http://tflearn.org/examples/

Step 1: Install Nvidia driver on win 10.

Step 2: Install Nvidia CUDA and cuDNN

… (to be added later)

Step 3: Install Anaconda (4.4.0)

Step 4: Use anaconda shell prompt to install tensorflow

Step 5: install tflearn

Ref: https://github.com/tflearn/tflearn/issues/539

5.1: Go to http://www.lfd.uci.edu/~gohlke/pythonlibs/#curses, download the curses-2.2-cp36-none-win_amd64.whl file, and run pip install curses-2.2-cp36-none-win_amd64.whl in the download folder.

5.2: pip install tflearn



Deep Learning Machine Software Setup

This post focuses on the deep-learning software setup for a CPU-only (no GPU) machine.

Step 0:  Create GCP VM

Recommended OS: Ubuntu LTS 16.04 (or 14.04). Recommended persistent-disk HDD: 60 GB. Recommended RAM: 6.5 GB (the GCP maximum for this machine type).


Step 1:  Install TensorFlow

Let's work backwards from Google's TensorFlow installation guide:

https://www.tensorflow.org/versions/r0.12/get_started/os_setup

We support different ways to install TensorFlow:

  • Pip install: Install TensorFlow on your machine, possibly upgrading previously installed Python packages. May impact existing Python programs on your machine.
  • Virtualenv install: Install TensorFlow in its own directory, not impacting any existing Python programs on your machine.
  • Anaconda install: Install TensorFlow in its own environment for those running the Anaconda Python distribution. Does not impact existing Python programs on your machine.
  • Docker install: Run TensorFlow in a Docker container isolated from all other programs on your machine.
  • Installing from sources: Install TensorFlow by building a pip wheel that you then install using pip.

In my experience, the Anaconda install is the least problematic route; it also handles the Python and TensorFlow installations in one shot.

$ sudo apt-get update

$ sudo apt-get upgrade

Install Anaconda (Python 3.6 recommended to start):

Follow the instructions on the Anaconda download site.

Create a conda environment called tensorflow:

> mkdir downloads

> cd downloads

> wget http://repo.continuum.io/archive/Anaconda3-4.4.0-Linux-x86_64.sh

> bash Anaconda3-4.4.0-Linux-x86_64.sh

# Python 3.6

$ conda create -n tensorflow python=3.6

$ source activate tensorflow

(tensorflow)$  # Your prompt should change

# Linux/Mac OS X, Python 2.7/3.4/3.5/3.6, CPU only:

(tensorflow)$ conda install -c conda-forge tensorflow

The next step is to verify that TensorFlow works:

>>> import tensorflow as tf

>>> hello = tf.constant('Hello, TensorFlow!')

>>> sess = tf.Session()

>>> print(sess.run(hello))

2017-09-03 13:10:50.101098: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn’t compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.

2017-09-03 13:10:50.101130: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn’t compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.

2017-09-03 13:10:50.101137: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn’t compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.

Step 2:  Install TFlearn and TensorLayer and Keras

pip install tflearn

pip install tensorlayer

pip install keras

DIY Deep Learning Computer


I have recently been using GCP to run TensorFlow/Theano/Keras machine-learning and deep-learning examples.

It is very convenient, but the biggest problem is that training runs far too slowly; sometimes, because of RAM/storage limits, jobs cannot run at all.

So I decided to build a DIY deep-learning computer, mainly following these references:

1. The $1700 great Deep Learning box: Assembly, setup and benchmarks 

2. Building your own deep learning box  !!!! Excellent!!!!

 


Hardware first, taken straight from the coolpc (原價屋) parts list; about NTD 51K in total.

CPU: i7

MB: Z270

DDR4: 32GB

SSD: 480GB

HDD: 3TB

Graphics card: GTX 1080


 

The next key decision is the OS.

Since TensorFlow has started supporting Windows, supporting Windows 10 (Home or Pro?) as a benchmark platform is worthwhile.

The unspoken secret is that Windows also runs games, plus HTC/Oculus VR applications.

 

Deep learning on Linux basically means Ubuntu plus the corresponding software stack:

OS: Ubuntu LTS 14.04 or LTS 16.04 (LTS 14.04 seems more stable)

CUDA:  8.x  (on both LTS 14.04 or 16.04)

Python: 2.x or 3.x  

Tensorflow: 1.x.x  (on both LTS 14.04 and 16.04; support both Python 2.x and 3.x)

Theano: (only Python 2.x)

Keras

PyTorch

 

OS summary:

Windows 10, Ubuntu LTS 14.04, Ubuntu LTS 16.04.   Maybe CentOS or Xubuntu later.

Ubuntu comes in desktop and server editions. The desktop edition includes a desktop UI (X, KDE, GNOME); the server edition does not,

but it includes server packages (e.g. Apache2, mail, print server). If you hit X/UI problems during installation, consider the server edition!

 

 

Windows:

Disable fast boot!!!

Disable secure boot!!!!

 

 

==> nouveau driver problem with the GTX graphics card!!!

 

The problem turned out to be the built-in graphics card on the MSI motherboard (the GTX 1080 was still on my coffee table). It wasn’t compatible with the Ubuntu GUI! This was classic chicken-and-egg. Without Ubuntu I couldn’t install the Nvidia drivers, but without the drivers I couldn’t launch Ubuntu! Enter GRUB.

 

OS conclusion:

Windows 10 (Pro or Home?), Ubuntu LTS 14.04 server edition, and Ubuntu LTS 16.04 server edition.


PixelRNN vs. GAN for Probabilistic Generative Model

Reference: 

1. How do PixelCNN and DCGAN compare as image-generation methods? (如何比较PixelCNN与DCGAN两种Image generation方法?)

The previous post compared several kinds of unsupervised learning (including self-supervised learning such as autoencoders), roughly divided into probabilistic (generative) models and non-probabilistic models.

 


 

Subdividing further, (unsupervised) probabilistic generative models break into several families (figure omitted).


 

Probabilistic generative models have been very hot in recent years.

GAN's main promoter is Goodfellow.

PixelRNN's main promoters are A. van den Oord and the WaveNet team at Google.

There are also variational methods … and belief nets … This post mainly compares GAN vs. PixelRNN.

 

 

First, from the PixelRNN/CNN point of view (reference 1):

PixelRNN's probability model is as follows:

 

p(\mathbf{x}) = \prod_{i=1}^{n^{2}} p(x_{i} \mid x_{1}, \ldots, x_{i-1})    (1)

Three observations:

1. The main goal of a probabilistic generative model is the n×n joint pdf p(x). Trained under the maximum-likelihood principle, the model can then generate the n×n pixels.

2. p(x) can be expanded with conditional probabilities as in (1), without prior knowledge.

3. PixelRNN is not Markovian, but it is causal: the conditional pdf p(x_i | x_1, ...) depends on all previous pixels but never on future pixels. For images this is not necessarily natural (beyond local correlation), but for audio it should fit even better.
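To make the causality concrete, here is a small numpy sketch of a PixelCNN-style mask that zeroes out the current and future pixels of a convolution kernel (a "type A" mask; the construction follows the papers' idea, but the details here are my own):

import numpy as np

def causal_mask(k):
    # k x k mask: 1 for pixels above the center, or left of it in the same row
    m = np.zeros((k, k))
    m[:k // 2, :] = 1          # all rows above the center row
    m[k // 2, :k // 2] = 1     # pixels left of center in the center row
    return m

print(causal_mask(3))
# [[1. 1. 1.]
#  [1. 0. 0.]
#  [0. 0. 0.]]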

 

PixelCNN vs. DCGAN comparison (reference 1)

 
Compared with GANs, PixelCNN/RNN has the following advantages:

1. It provides the exact data likelihood via the chain rule:

p(\mathbf{x}) = \prod_{i=1}^{n^{2}} p(x_{i} \mid x_{1}, \ldots, x_{i-1})

Likelihood is not a perfect evaluation metric [12], but it still offers plenty of insight into a generative model (especially for detecting missing modes). GANs not only have no way to give the exact likelihood, the approximated likelihoods also appear to be quite poor [9]. Moreover, PixelCNN is state of the art on the likelihood metric, far ahead of other methods; I suspect that is the main reason it took the ICML best paper award.

2. Because the objective function is the likelihood itself, PixelCNN training is much more stable.

A typical PixelCNN training curve [11] and a typical GAN training curve [10] are contrasted in the original answer (figures omitted here).

3. PixelCNN's data space can be continuous or discrete (e.g. WaveNet; the discrete version performs slightly better), whereas GANs currently only really work on continuous data; how to make GANs work on discrete data is still a non-trivial open problem.

Of course, beyond the two drawbacks the question mentions (dependence on an arbitrary pixel order, and slow sampling), PixelCNN has more:

1. Training is also slow: OpenAI's PixelCNN++ [8] needs 5 days on 8 Titans to converge, and that is just the CIFAR dataset.

2. Sample quality is clearly worse than GANs. State-of-the-art GANs already generate fairly reasonable CIFAR samples [5], whereas the PixelCNN++ samples [8] barely show any recognizable object.

3. No paper has yet used PixelCNN successfully for unsupervised/semi-supervised feature learning, an area where GANs have many successes [1,2,3,4].

Finally, PixelCNN and GAN need not be an either/or choice; they may well be combined in the future. A generative model with both strengths, giving an exact likelihood while matching GAN sample quality, would be very interesting work. The recent wave of model combinations (VAE+GAN [6], VAE+PixelCNN [7]) suggests such a combination may be feasible.

[1] Salimans et al., Improved Techniques for Training GANs, 2016
[2] Dumoulin et al., Adversarially Learned Inference, 2016
[3] Donahue et al., Adversarial Feature Learning, 2016
[4] Denton et al., Semi-Supervised Learning with Context-Conditional Generative Adversarial Networks, 2016
[5] Huang et. al., Stacked Generative Adversarial Networks, 2016
[6] Larsen et al., Autoencoding beyond pixels using a learned similarity metric, 2016
[7] Gulrajani et al., PixelVAE: A Latent Variable Model for Natural Images, 2016
[8] Salimans et al., PixelCNN++: A PixelCNN Implementation with Discretized Logistic Mixture Likelihood and Other Modifications, 2016
[9] Wu et al., On the Quantitative Analysis of Decoder-Based Generative Models, 2016
[10] torch/torch.github.io
[11] carpedm20/pixel-rnn-tensorflow
[12] Theis et al., A note on the evaluation of generative models, 2016

 
 
The conclusion from reference 4:
 

The pros of PixelCNN compared to GAN:

  • Provides a way to calculate likelihood. (A reasonable metric)

  • the training is more stable than GANs.

  • Works for both discrete and continuous data.

The cons:

  • The model assumes the order of generation: top to down, left to right. (Doesn’t really make sense)

  • Slow in sampling.

  • Slow in training: PixelCNN++ (from OpenAI) converges in 5 days on 8 Titans for the CIFAR dataset (though this is faster than PixelRNN).

  • Worse sample quality.

  • Hasn't been applied to feature learning yet.

(The pros and cons are translated from "How to compare PixelCNN and DCGAN", credit to Xun Huang.)