Author: Tzu-Rong Cheng
Thesis title: A Gesture Recognition System with Deep Learning
Advisor: Cheng-Chin Chiang
Oral defense committee: Hsin-Feng Lin, Chun-Wei Hsieh
Keywords: Gesture recognition; Deep learning; Convolutional Neural Network; Optical flow; Data augmentation
In recent years, as technology has advanced, consumer technology products have become increasingly diverse. Beyond common hand-held devices such as smartphones and tablet computers, remote-control interfaces that require no physical contact are attracting growing attention, for example infrared sensing, voice-recognition control, and image-recognition control. Within image recognition, gesture recognition has long been a popular topic with a wide range of applications. This thesis designs a gesture recognition system that uses the Cambridge Hand Gesture database for training and testing. Two types of input data, optical flow and grayscale images, are fed to separate convolutional neural networks (CNNs), each of which extracts features and trains its own model. The user needs only a single camera of any kind and performs gestures that vary in hand shape and movement direction. The movement direction and the initial hand shape of the gesture serve as inputs; the pre-trained models score them separately, and the two probability values are then combined to determine the gesture class. Unlike earlier machine learning approaches that require large amounts of data, this thesis uses a relatively small dataset together with data augmentation, and relies on two network models with different features that complement each other, so that the trained models can handle a variety of conditions without a large amount of data. On the nine gesture classes of the database, the system reaches a recognition rate of 93.9%.
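The abstract says the scores for initial hand shape and movement direction are combined to pick the gesture class, but does not state the combination rule. The following is a minimal sketch of one plausible rule, assuming the two streams are treated as independent so that the joint score is the product of the two probabilities. The label names follow the Cambridge Hand Gesture database, whose nine classes are three hand shapes crossed with three motions; the function names and the sample probabilities are hypothetical and are not taken from the thesis.

```python
from itertools import product

# Label names assumed from the Cambridge Hand Gesture database
# (3 hand shapes x 3 motions = 9 gesture classes).
SHAPES = ("flat", "spread", "v-shape")
MOTIONS = ("leftward", "rightward", "contract")


def fuse_scores(shape_probs, motion_probs):
    """Combine per-stream probabilities into a joint score per gesture.

    Assumption: the streams are independent, so the joint score is the
    product of the hand-shape and motion probabilities.
    """
    if len(shape_probs) != len(SHAPES) or len(motion_probs) != len(MOTIONS):
        raise ValueError("expected one probability per shape and per motion")
    return {
        (shape, motion): p_shape * p_motion
        for (shape, p_shape), (motion, p_motion) in product(
            zip(SHAPES, shape_probs), zip(MOTIONS, motion_probs)
        )
    }


def classify(shape_probs, motion_probs):
    """Return the (shape, motion) pair with the highest joint score."""
    joint = fuse_scores(shape_probs, motion_probs)
    return max(joint, key=joint.get)


if __name__ == "__main__":
    # Made-up softmax outputs from the two streams, for illustration only.
    shape_probs = [0.1, 0.7, 0.2]
    motion_probs = [0.6, 0.3, 0.1]
    print(classify(shape_probs, motion_probs))  # ('spread', 'leftward')
```

A product rule lets a confident stream outweigh an uncertain one; a weighted sum of the two scores would be an equally reasonable alternative, and the thesis may use either or something else.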
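Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Motivation and Objectives
1.2 System Workflow
1.3 Thesis Organization
Chapter 2 Literature Review
2.1 Single-Frame Hand Shape Recognition
2.2 Continuous Dynamic Gesture Recognition
2.3 Convolutional Neural Networks (CNN)
Chapter 3 Convolutional Neural Networks for Dynamic Gestures and Static Hand Shapes
3.1 Preprocessing
3.2 Data Augmentation
3.3 Two-Stream Convolutional Neural Network
3.3.1 3D Convolutional Neural Network for Dynamic Gesture Recognition
3.3.2 Convolutional Neural Network for Static Hand Shape Recognition
3.4 Combining the Two Recognition Results
Chapter 4 Experimental Results and Discussion
4.1 Cambridge Hand Gesture Database
4.2 Optical-Flow Movement Direction Recognition Experiment
4.3 Data Augmentation Experiments
4.3.1 Augmentation by Simulating Different Lighting Conditions
4.3.2 Translation and Rotation Augmentation
4.4 Combined Dynamic Gesture and Static Hand Shape Recognition Experiment
4.5 Analysis and Comparison of Experimental Results
Chapter 5 Conclusions and Future Work
References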
