Author (English): Li-Lin Chen
Thesis title (English): A Bidirectional Recurrent Neural Network for Offline Connected and Overlapped Handwritten Numeral Recognition
Advisor (English): Cheng-Chin Chiang
Oral defense committee (English): Jun-Wei Hsieh, Shi-Jim Yen
Keywords (English): Handwritten Numeral Recognition; Digit Recognition; Histogram of vertical projection; Neural Network; Bidirectional Recurrent Neural Network
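In recent years, handwritten numeral recognition technology has matured, and isolated digits can be recognized with very high accuracy. In many practical applications, however, numeral recognition remains very challenging: when several digits are written together they may touch (connect) or even overlap, and digit segmentation becomes a major difficulty. If the number of digits is known or fixed, segmentation and recognition can perhaps still rely on relatively simple assumptions or heuristics. When the number of digits is unknown or the length varies, segmentation and recognition become considerably more complex, and recognition accuracy often drops severely as a result.
To address connected numeral strings of variable length, this thesis first uses the histogram of vertical projection to cut out the clearly separable sections, producing multiple fragments of connected digits, and then applies a Bidirectional Recurrent Neural Network, formulated as Sequence Labeling, to segment and recognize all fragments simultaneously. To give the network enough training samples, we use a data augmentation method that automatically synthesizes connected numeral strings from isolated digits to generate a large number of training samples, which effectively addresses the shortage of training samples and the time-consuming nature of manual writing. On the NSTRING SD19 database we measured an overall recognition rate of 97.6%, and the recognition rate on the difficult connected and overlapped numeral strings also reaches 95.9%; these results are good and better than those of previously proposed methods. We implemented a prototype system with this technique to demonstrate and evaluate the concrete results of this research.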
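The vertical-projection step described above can be illustrated with a minimal sketch. It assumes a binary image stored as rows of 0/1 pixels and splits wherever at least `min_gap` consecutive columns are empty; the function names and the `min_gap` parameter are hypothetical and are not taken from the thesis, whose actual cutting rule may differ.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of 0/1 pixels (assumed representation)


def vertical_projection(image: Image) -> List[int]:
    """Count the foreground (non-zero) pixels in each column."""
    if not image:
        return []
    width = len(image[0])
    return [sum(1 for row in image if row[x]) for x in range(width)]


def split_on_empty_columns(image: Image, min_gap: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, of the fragments
    separated by runs of at least `min_gap` empty columns."""
    hist = vertical_projection(image)
    segments: List[Tuple[int, int]] = []
    start = None  # first column of the fragment being scanned
    gap = 0       # length of the current run of empty columns
    for x, count in enumerate(hist):
        if count > 0:
            if start is None:
                start = x
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                # The fragment ended where the run of empty columns began.
                segments.append((start, x - gap + 1))
                start = None
                gap = 0
    if start is not None:
        # Close the last fragment, dropping any trailing empty columns.
        segments.append((start, len(hist) - gap))
    return segments


example = [
    [1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 1],
]
fragments = split_on_empty_columns(example)
```

Digits that touch or overlap produce no empty column between them, so they stay in one fragment; that is the case the abstract leaves to the recurrent network.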
In recent years, handwritten numeral recognition technology has matured, and isolated digits that have been successfully segmented can be recognized with very high accuracy. Numeral recognition nevertheless remains very challenging in many situations. The input to be recognized is often not an isolated digit but a numeral string of unknown length, which makes it harder to segment the string into the correct number of digits. Digits in such a string may also be connected or even overlapping. These factors cause segmentation or recognition errors and lower the overall recognition rate.
To address these challenges, this thesis first uses the histogram of vertical projection to cut the clearly separable parts of the input into digit fragments, and then uses a Bidirectional Recurrent Neural Network, by way of Sequence Labeling, to segment and recognize all fragments simultaneously. Because the network is trained for strings of unknown length, and such training samples are not easy to obtain, we provide a data augmentation method that synthesizes isolated digits into new multi-digit string samples. We also use a receptive field so that the neural network learns better, and we design a verification method that processes the network output to determine the final digit result.
The final recognition rate on the NSTRING SD19 database is 97.6%, and the recognition rate on the difficult connected and overlapped numeral strings reaches 95.9%; these results are good and surpass those of previously proposed methods. We also built a prototype system to examine our results in practice.
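As a rough illustration of the data augmentation idea, the sketch below pastes isolated binary digit images side by side and lets neighbouring digits share `overlap` columns, so that the result imitates connected or overlapped writing. The function name, the `overlap` parameter, and the pixel-wise OR merge are assumptions made for this example; the abstract does not specify how the thesis composes its synthetic strings.

```python
from typing import List

Image = List[List[int]]  # rows of 0/1 pixels (assumed representation)


def synthesize_string(digits: List[Image], overlap: int = 0) -> Image:
    """Paste equally tall digit images left to right. Each digit starts
    `overlap` columns before the previous one ends; shared pixels are OR-ed."""
    if not digits:
        return []
    height = len(digits[0])
    canvas: Image = [[] for _ in range(height)]
    for glyph in digits:
        if len(glyph) != height:
            raise ValueError("all digit images must have the same height")
        width_so_far = len(canvas[0])
        # A digit cannot overlap more columns than already exist.
        x0 = width_so_far - min(overlap, width_so_far)
        glyph_width = len(glyph[0])
        for y in range(height):
            # Extend the row so that the whole glyph fits on the canvas.
            missing = x0 + glyph_width - len(canvas[y])
            canvas[y].extend([0] * max(0, missing))
            for x in range(glyph_width):
                canvas[y][x0 + x] |= glyph[y][x]
    return canvas


a = [[1, 1], [1, 0]]
b = [[0, 1], [1, 1]]
separated = synthesize_string([a, b])            # digits placed side by side
touching = synthesize_string([a, b], overlap=1)  # digits sharing one column
```

Because the source digits carry known labels, each synthetic string comes with its ground truth at no extra annotation cost, which matches the motivation given in the abstract.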
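Acknowledgments
Abstract (Chinese)
Abstract
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research motivation
1.2 Related techniques and background
1.2.1 Handwriting recognition
1.2.2 Continuous trajectories, connection, and overlap
1.2.3 Recurrent Neural Network (RNN)
1.3 System workflow
Chapter 2 Image preprocessing
2.1 Vertical histogram segmentation
2.2 Boundary cropping and noise removal
2.2.1 Boundary cropping
2.2.2 Noise removal
2.3 Sequence alignment
2.3.1 Segment width and height normalization
2.3.2 Sequence zero padding
2.4 Receptive field processing
Chapter 3 Recognition system
3.1 Training methods
3.1.1 Sequence Labeling
3.1.2 Reference training methods
3.1.3 Training method: junction-point labeling
3.1.4 Training method: within-character labeling
3.2 Bidirectional Recurrent Neural Network (BRNN)
3.2.1 Training methods as applied to the RNN
3.2.2 Effect of the receptive field
3.2.3 Bidirectional Long Short-Term Memory Recurrent Neural Network (BLSTM RNN)
3.3 Verification method
3.4 Data augmentation
Chapter 4 Experimental results and discussion
4.1 Experimental databases
4.1.1 Test data
4.1.2 Training data
4.2 Experiments
4.2.1 Receptive field and cutting lines
4.2.2 Height value
4.2.3 Neural network architecture
4.2.4 Data augmentation with vertical jitter in horizontal arrangement
4.2.5 Training sample selection
4.3 Comparison with experimental results of previous methods
4.4 Prototype system design
Chapter 5 Conclusions and future research directions
References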
