
Detailed Record

Author: 羅子涵
Author (English): Zih-Han Luo
Thesis Title: 基於人體姿態序列辨識之健身動作追蹤系統
Thesis Title (English): A Fitness Action Tracking System Based on Pose Sequence Recognition
Advisor: 張意政
Advisor (English): I-Cheng Chang
Committee Members: 王元凱、施皇嘉、陳以錚
Committee Members (English): Yuan-Kai Wang, Huang-Chia Shih, Yi-Cheng Chen
Degree: Master
Institution: National Dong Hwa University
Department: Department of Computer Science and Information Engineering
Student ID: 610621231
Year of Publication (ROC): 110 (2021)
Academic Year of Graduation: 109
Language: English
Number of Pages: 66
Keywords: fitness exercise, human pose estimation, action recognition
In recent years, fitness exercise has become a popular trend in Taiwan, and the number of people who work out regularly keeps growing. Regular exercise not only helps people maintain a healthy physique and strengthen bones and muscles, but also improves sleep quality, reduces negative emotions, and enhances brain function. Meanwhile, deep learning applications in computer vision have grown rapidly: because deep learning can learn rich features from large amounts of data, it outperforms earlier techniques in both performance and accuracy. Consequently, many recent studies have applied deep learning to exercise action recognition.
Existing research on skeleton-based action recognition falls into two types, using either 2D or 3D human skeleton sequences. 2D skeleton sequences can be extracted directly from video, so the data are easy to acquire and the available datasets are large and cover many actions, but recognition accuracy is affected by the camera angle. 3D skeleton sequences carry full three-dimensional motion information and can represent complex movements, but the data are relatively costly to acquire and hard to apply in everyday environments.
This thesis develops a fitness action tracking system based on pose sequence recognition, consisting of four main techniques: human detection, human tracking, human pose estimation, and action recognition. The system tracks and recognizes multi-person fitness exercise and analyzes the tracked activity, recording the time each person spends on each exercise, the body parts exercised, and the calories burned, to provide users with exercise feedback. We construct a multi-view fitness action database so that the trained model can recognize actions regardless of viewpoint. We also propose a new pose estimation network, the Residual Attention Heatmap Prediction Network (RAHPNet), to generate pose sequences from fitness exercise videos; the sequences are then classified with a spatial-temporal graph convolutional network. In the experiments, RAHPNet is evaluated on the MPII Multi-Person Dataset and the MSCOCO Keypoints Challenge, achieving 74.5 mAP and 63.5 mAP respectively and outperforming the compared methods. On the fitness action dataset constructed in this study, action recognition accuracy reaches 90.2%, and the system can also track each exerciser's activities in a multi-person environment and record information about their workouts.
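The abstract describes a four-stage pipeline (detection, tracking, pose estimation, action recognition) whose per-person feedback includes exercise time and calories burned. Below is a minimal sketch of only the final feedback-aggregation step, assuming recognized action labels are already available per frame. All names, the MET values, and the one-frame-per-second simplification are illustrative assumptions, not the thesis's actual implementation; the calorie formula is the standard MET-based estimate rather than one taken from the thesis.

```python
# Illustrative MET values for a few fitness actions (assumed, not from the thesis).
MET_TABLE = {"squat": 5.0, "push_up": 8.0, "jumping_jack": 8.0}

def estimate_calories(action: str, weight_kg: float, seconds: float) -> float:
    """Standard MET estimate: kcal = MET * body weight (kg) * duration (hours)."""
    met = MET_TABLE.get(action, 3.0)  # fall back to a light-activity MET
    return met * weight_kg * (seconds / 3600.0)

def track_session(frames, weight_kg=70.0):
    """Aggregate per-person action durations and calories over a clip.

    `frames` is a list of dicts mapping person-id -> recognized action label
    for that frame (here one frame is treated as one second, for simplicity).
    """
    durations = {}  # (person_id, action) -> seconds
    for frame in frames:
        for pid, action in frame.items():
            key = (pid, action)
            durations[key] = durations.get(key, 0) + 1
    report = {}
    for (pid, action), secs in durations.items():
        report.setdefault(pid, []).append({
            "action": action,
            "seconds": secs,
            "kcal": round(estimate_calories(action, weight_kg, secs), 3),
        })
    return report
```

In a full system, `frames` would be produced upstream by the detector, tracker, pose estimator, and ST-GCN classifier; only the bookkeeping shown here is needed to turn per-frame labels into the time/calorie feedback the abstract mentions.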
Abstract (Chinese)
Abstract
Contents
List of Tables
List of Figures
Chapter 1 Introduction
1.1 Background and Motivation
1.2 System Overview
1.3 Thesis Organization
Chapter 2 Related Work
2.1 Human Pose Estimation
2.1.1 Single-Person Pose Estimation
2.1.2 Multi-Person Pose Estimation
2.2 Human Action Recognition
2.2.1 Image-based Human Action Recognition
2.2.2 Skeleton-based Human Action Recognition
Chapter 3 Multi-Person Pose Estimation
3.1 Human Detection
3.2 Human Tracking
3.3 Pose Sequence Generation
3.3.1 Residual Attention Heatmap Prediction Network
3.3.2 Human Block Augmentation and Pose NMS
Chapter 4 Multi-View Action Recognition
4.1 Multi-View Fitness Action Dataset
4.2 ST-GCN (Spatial-Temporal Graph Convolutional Networks)
Chapter 5 Experimental Results
5.1 Performance of Pose Estimation
5.1.1 MSCOCO Keypoints Challenge
5.1.2 MPII Multi-Person Dataset
5.3 Performance of Action Recognition
5.3.1 Kinetics Dataset
5.3.2 Multi-View Fitness Action Dataset
5.4 System Performance
Chapter 6 Conclusion
References
(Full text available for external access after 2024-10-24)