A Sequential Skeleton Pose Prediction Method Based on ST-GCN__National Dong Hwa University Theses and Dissertations Full-Text System

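Author:	徐維彬
Author (English):	Wei-Bin Hsu
Thesis title (original):	一種基於ST-GCN的序列化骨架姿勢預測方法
Thesis title (English):	A sequential skeleton pose prediction method based on ST-GCN
Advisor:	陳偉銘
Advisor (English):	Wei-Ming Chen
Committee members:	張耀中、簡暐哲
Committee members (English):	Yao-Chung Chang, Wei-Che Chien
Degree:	Master's
Institution:	National Dong Hwa University
Department:	Department of Information Management
Student ID:	610935103
Publication year (ROC calendar):	111 (2022)
Graduation academic year:	110
Language:	Chinese
Number of pages:	41
Keywords (Chinese):	圖神經網路、圖卷積神經網路、人體姿勢預測
Keywords (English):	Graph Neural Network, Graph Convolutional Neural Network, Human Posture Prediction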

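In recent years, as computing power has grown substantially, many novel deep learning algorithms have emerged and are widely applied in fields such as image recognition and natural language processing. Human pose recognition is one important research topic among them, with numerous studies both domestically and abroad. While methods for classifying actions have become increasingly mature, predicting future human poses remains quite challenging. One reason is that image recognition has traditionally relied on CNNs to extract image features, whereas human poses are better suited to GNNs, which have emerged in recent years. This thesis uses ST-GCN to capture a person's past motion history and then uses a GRU to predict future poses.
As demographic structure and economic patterns change, the proportion of people living alone will gradually increase. According to government statistics, fatal falls account for a large share of deaths from accidental injuries, so when an accident such as a fall occurs, it is crucial to raise an alert immediately so that emergency responders can arrive sooner. Beyond predicting falls, the method can be applied to other scenarios. For example, when improper operation of machinery on a construction site is about to cause an accident, people can be warned at once that they may be in danger and must stop the hazardous action. Likewise, infants often roll over unconsciously while asleep, and if the parents do not notice, the infant's mouth and nose may be covered, preventing breathing and leading to an accident.
Accidents are everywhere in daily life. Thanks to technological progress, people can use new technology to help address problems that were previously unsolvable. If the situations above can be handled or flagged more promptly, accidents can be prevented.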

In recent years, the dramatic increase in computing power has given rise to many novel deep learning algorithms, which are often applied in areas such as image recognition and natural language processing. Human posture recognition is one important research topic among them, studied widely both domestically and abroad. Whereas methods for classifying actions are maturing, predicting future human postures remains challenging. One reason is that CNNs are commonly used in image recognition to extract image features, whereas GNNs are better suited to human postures. In this thesis, we use ST-GCN to capture a person's past motion history and a GRU to predict future postures.
As demographic structure and economic patterns change, more and more people are living alone. According to government statistics, falls account for a large share of accidental deaths. It is therefore important to notify emergency responders immediately when an accident such as a fall occurs, so that they can arrive sooner.
In addition to predicting falls, future posture prediction can be applied in various situations. For example, when machinery on a construction site is operated improperly, people can be warned immediately of a possible accident. Likewise, a baby may turn over unconsciously while sleeping, and if the parents do not notice, an accident may occur because the baby's mouth and nose are covered and it cannot breathe.
Thanks to technological advances, people can use new technologies to help deal with problems that were previously unsolvable. Accidents in the situations above could be prevented if they were handled or flagged in a more timely manner.

Chinese Abstract
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1. Introduction
1.1. Research Background
1.2. Research Motivation
1.3. Research Objectives
Chapter 2. Literature Review
2.1. NN (Neural Network)
2.2. CNN (Convolutional Neural Network)
2.2.1. Convolution Layer
2.2.2. Pooling Layer
2.3. VGG Net
2.4. RNN (Recurrent Neural Network)
2.5. LSTM (Long Short-term Memory)
2.6. ConvLSTM (Convolutional LSTM)
2.7. GCN (Graph Convolutional Network)
2.8. Converting 2D Human Poses to 3D Human Poses
2.9. Seq2Seq
2.10. Residual supervised
Chapter 3. Methodology
3.1. Problem Definition
3.2. System Architecture
3.3. ST-GCN
3.4. Decoder
Chapter 4. Experimental Design and Results
4.1. Datasets
4.2. Evaluation Metrics
4.3. Experimental Environment and Setup
4.4. Experimental Results
4.4.1. Future Pose Prediction
4.4.2. Classification
Chapter 5. Conclusion and Future Work
References

(The full text will be open to external access after 2027-09-25.)