
Detailed Record

Author: 張廷宇
Author (English): Ting-Yu Chang
Title: 基於 YOLO 偵測器之降低遮蔽影響的多物件追蹤
Title (English): Multiple Object Tracking with Occlusion Effect Reduction using YOLO-based Detector
Advisor: 林信鋒
Advisor (English): Shin-Feng Lin
Committee Members: 劉國成, 張意政
Committee Members (English): Kuo-Cheng Liu, I-Cheng Chang
Degree: Master's
Institution: 國立東華大學 (National Dong Hwa University)
Department: 資訊工程學系 (Department of Computer Science and Information Engineering)
Student ID: 610821239
Year of Publication (ROC calendar): 112 (2023)
Academic Year of Graduation: 111
Language: English
Number of Pages: 36
Keywords: 多物件追蹤, YOLO, 重新識別, 運動預測, 遮蔽
Keywords (English): Multiple object tracking, YOLO, Re-identification, Motion prediction, Occlusion
Abstract:
In computer vision, multiple object tracking (MOT) plays an important role in solving many significant problems, such as autonomous driving, crowd behavior analysis, and human-computer interaction. MOT also presents challenges that must be overcome, such as ID re-identification and the handling of occluded objects. Tracking-by-detection is the most common approach in MOT: a detector first locates the objects, and the resulting detections are used to carry out tracking, object identification, re-identification, and motion prediction. A set of detections extracted from the video guides the tracking process, and the detections are associated so that bounding boxes containing the same target are assigned the same identity. In this thesis, MOT uses YOLO in place of a conventional detector, yielding better detections from the outset and thereby improving the subsequent tracking stage.
This thesis proposes multiple object tracking with occlusion effect reduction based on a YOLO detector. The goal is to achieve good results using only a few target features across a variety of scenes, including crowded squares, night scenes, moving cameras in shopping malls, and crowded indoor train stations; these scenes appear in the 2DMOT15, MOT16, and MOT20 sequences. The method consists of two stages: a YOLO detection stage and an occlusion handling stage. In the detection stage, we replace the public detectors with the more advanced YOLOv4, YOLOv5, and YOLOv7 to obtain better results than conventional methods. The aim is to develop a system that can re-identify objects even under occlusion while using only a few target features; these strong detectors substantially increase the number of objects detected in the video, allowing the tracker to form better associations and improving the experimental results. For occlusion handling, we extract and store the features of all objects, so that an object can be re-identified after it has been occluded. Experimental results show that in challenging scenes such as crowded squares, night scenes, and crowded train stations, the proposed method outperforms many state-of-the-art methods.
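As an illustration of the occlusion handling idea described above (saving appearance features of all objects so that an occluded object can later be re-identified), the following is a minimal Python sketch, not the thesis's actual implementation; the class and parameter names (TrackGallery, sim_threshold) are hypothetical. Each track keeps a small gallery of L2-normalized appearance features, and a detection that reappears after occlusion is matched back to a stored track by cosine similarity.

# Illustrative sketch (not the thesis's exact code): each track keeps a small
# gallery of appearance features; a detection that reappears after occlusion
# is matched to the track whose stored features are most similar, recovering
# the object's original ID.
import numpy as np

class TrackGallery:
    def __init__(self, max_features=10, sim_threshold=0.6):
        self.galleries = {}              # track_id -> list of normalized feature vectors
        self.max_features = max_features # hypothetical gallery size bound
        self.sim_threshold = sim_threshold

    def update(self, track_id, feature):
        """Store an L2-normalized appearance feature for a visible track."""
        feat = feature / (np.linalg.norm(feature) + 1e-12)
        gallery = self.galleries.setdefault(track_id, [])
        gallery.append(feat)
        if len(gallery) > self.max_features:
            gallery.pop(0)               # keep only the most recent features

    def reidentify(self, feature):
        """Return the best-matching stored track ID for a new detection, or None."""
        feat = feature / (np.linalg.norm(feature) + 1e-12)
        best_id, best_sim = None, self.sim_threshold
        for track_id, gallery in self.galleries.items():
            sim = max(float(feat @ g) for g in gallery)  # cosine similarity
            if sim > best_sim:
                best_id, best_sim = track_id, sim
        return best_id

Bounding the gallery size keeps memory constant and lets each track's stored appearance adapt to gradual changes in lighting and pose.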
Abstract (English):
Multiple Object Tracking (MOT) in computer vision plays a crucial role in applications such as autonomous vehicles, crowd behavior analysis, and human-computer interaction. Despite its significance, MOT faces several challenges, including ID re-identification and the handling of occluded objects. Tracking-by-detection is the most common approach in MOT, incorporating object re-identification and motion prediction. A set of detections extracted from the video frames guides the tracking process; the detections are then associated so that bounding boxes containing the same target are assigned the same identity. This thesis employs YOLO for object proposals and utilizes bounding box regression and association to predict object positions.
This thesis proposes Multiple Object Tracking with Occlusion Effect Reduction using a YOLO-based Detector. The objective is to achieve high accuracy in MOT in challenging scenes such as crowded squares, night scenes, moving cameras in shopping malls, and crowded indoor train stations, as found in the 2DMOT15, MOT16, and MOT20 sequences.
The proposed system has two stages: YOLO detection and occlusion reduction. The detection stage uses the advanced YOLOv4, YOLOv5, and YOLOv7 detectors to obtain better results than conventional methods. The goal is to develop a system that can re-identify objects even under occlusion while using only a few target features.
The proposed method is compared with state-of-the-art techniques through experiments, demonstrating its robustness against various challenges. It performs well in challenging scenes such as crowded squares, night scenes, and crowded train stations.
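To make the tracking-by-detection pipeline described above concrete, here is a minimal, generic sketch of the IoU-based association step such pipelines rely on: detections in the current frame inherit the identity of the track whose predicted bounding box overlaps them the most. This is an illustrative Python sketch, not the thesis's actual algorithm; the function names and the greedy matching strategy are assumptions.

# Illustrative sketch of the association step in tracking-by-detection
# (a generic greedy IoU matcher, not the thesis's exact algorithm).

def iou(box_a, box_b):
    """Intersection over Union of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-12)

def associate(tracks, detections, iou_threshold=0.3):
    """Greedily match track boxes to detection boxes by descending IoU.

    tracks: dict of track_id -> predicted box; detections: list of boxes.
    Returns (matches, unmatched_detection_indices).
    """
    pairs = sorted(
        ((iou(t_box, d_box), t_id, d_idx)
         for t_id, t_box in tracks.items()
         for d_idx, d_box in enumerate(detections)),
        reverse=True,
    )
    matches, used_tracks, used_dets = {}, set(), set()
    for score, t_id, d_idx in pairs:
        if score < iou_threshold:
            break                         # remaining pairs overlap too little
        if t_id in used_tracks or d_idx in used_dets:
            continue                      # each track/detection matched once
        matches[t_id] = d_idx
        used_tracks.add(t_id)
        used_dets.add(d_idx)
    unmatched = [d for d in range(len(detections)) if d not in used_dets]
    return matches, unmatched

A full tracker would typically combine this geometric cue with the appearance features sketched earlier and with motion prediction, spawning new tracks from the unmatched detections.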
Chapter 1 Introduction 1
1.1 Motivation 1
1.2 Thesis Organization 4
Chapter 2 Background 5
2.1 Deep Learning in Object Detection 5
2.2 Image Classification and Tracking as a Graph Problem 6
2.3 Appearance Models and Re-identification 6
2.4 Intersection over Union 6
2.5 Non-Maximum Suppression 7
2.6 Detection with Transformers 8
Chapter 3 Related Work 9
3.1 Eliminating Exposure Bias and Metric Mismatch in Multiple Object Tracking 9
3.2 Tracking without Bells and Whistles 10
3.3 MOTR: End-to-End Multiple-Object Tracking with Transformer 12
Chapter 4 The Proposed Method 14
4.1 Feature Extraction 15
4.2 Object Detection 16
4.3 Bounding Box Regression and Association 18
4.4 Occlusion Effect Reduction 18
Chapter 5 Experimental Results 21
5.1 Metrics of MOT 21
5.1.1 CLEAR MOT metrics 21
5.1.2 ID scores 23
5.1.3 Classical metrics 24
5.2 Experiment Databases 25
5.3 Comparison with Other Methods 26
Chapter 6 Conclusions 32
References 33
1. Wang, X. (2013). Intelligent multi-camera video surveillance: A review. Pattern Recognition Letters, 34(1), 3-19.
2. Candamo, J., Shreve, M., Goldgof, D. B., Sapper, D. B., & Kasturi, R. (2009). Understanding transit scenes: A survey on human behavior-recognition algorithms. IEEE Transactions on Intelligent Transportation Systems, 11(1), 206-224.
3. Chen, L., Ai, H., Zhuang, Z., & Shang, C. (2018, July). Real-time multiple people tracking with deeply learned candidate selection and person re-identification. In 2018 IEEE International Conference on Multimedia and Expo (ICME) (pp. 1-6). IEEE.
4. Henschel, R., Leal-Taixé, L., Cremers, D., & Rosenhahn, B. (2018). Fusion of head and full-body detectors for multi-object tracking. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (pp. 1428-1437).
5. Keuper, M., Tang, S., Andres, B., Brox, T., & Schiele, B. (2018). Motion segmentation & multiple object tracking by correlation co-clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(1), 140-153.
6. Bergmann, P., Meinhardt, T., & Leal-Taixé, L. (2019). Tracking without bells and whistles. In Proceedings of the IEEE/CVF International Conference on Computer Vision (pp. 941-951).
7. Maksai, A., & Fua, P. (2019). Eliminating exposure bias and metric mismatch in multiple object tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 4639-4648).
8. Zeng, F., Dong, B., Zhang, Y., Wang, T., Zhang, X., & Wei, Y. (2021). MOTR: End-to-end multiple-object tracking with Transformer. arXiv preprint. https://doi.org/10.48550/arXiv.2105.03247
9. Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., & Zagoruyko, S. (2020). End-to-end object detection with transformers. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part I (pp. 213-229). Springer International Publishing.
10. Ren, S., He, K., Girshick, R., & Sun, J. (2015). Faster R-CNN: Towards real-time object detection with region proposal networks. Advances in Neural Information Processing Systems, 28.
11. Liu, W., Anguelov, D., Erhan, D., Szegedy, C., Reed, S., Fu, C. Y., & Berg, A. C. (2016). SSD: Single shot multibox detector. In Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11–14, 2016, Proceedings, Part I (pp. 21-37). Springer International Publishing.
12. Redmon, J., & Farhadi, A. (2017). YOLO9000: Better, faster, stronger. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 7263-7271).
13. Lin, T. Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., ... & Zitnick, C. L. (2014). Microsoft COCO: Common objects in context. In Computer Vision–ECCV 2014: 13th European Conference, Zurich, Switzerland, September 6-12, 2014, Proceedings, Part V (pp. 740-755). Springer International Publishing.
14. Everingham, M., Van Gool, L., Williams, C. K., Winn, J., & Zisserman, A. (2009). The PASCAL Visual Object Classes (VOC) challenge. International Journal of Computer Vision, 88, 303-308.
15. Carion, N., Massa, F., Synnaeve, G., Usunier, N., Kirillov, A., & Zagoruyko, S. (2020). End-to-end object detection with transformers. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part I (pp. 213-229). Springer International Publishing.
16. Kim, C., Li, F., Ciptadi, A., & Rehg, J. M. (2015). Multiple hypothesis tracking revisited. In Proceedings of the IEEE International Conference on Computer Vision (pp. 4696-4704).
17. Lowe, D. G. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60, 91-110.
18. Rosenfeld, A., & Thurston, M. (1971). Edge and curve detection for visual scene analysis. IEEE Transactions on Computers, 100(5), 562-569.
19. Rothe, R., Guillaumin, M., & Van Gool, L. (2015). Non-maximum suppression for object detection by passing messages between windows. In Computer Vision–ACCV 2014: 12th Asian Conference on Computer Vision, Singapore, November 1-5, 2014, Revised Selected Papers, Part I (pp. 290-306). Springer International Publishing.
20. Hosang, J., Benenson, R., & Schiele, B. (2017). Learning non-maximum suppression. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 4507-4515).
21. Redmon, J., & Farhadi, A. (2018). YOLOv3: An incremental improvement. arXiv preprint arXiv:1804.02767.
22. Bernardin, K., & Stiefelhagen, R. (2008). Evaluating multiple object tracking performance: The CLEAR MOT metrics. EURASIP Journal on Image and Video Processing, 2008, 1-10.
23. Ristani, E., Solera, F., Zou, R., Cucchiara, R., & Tomasi, C. (2016, November). Performance measures and a data set for multi-target, multi-camera tracking. In Computer Vision–ECCV 2016 Workshops: Amsterdam, The Netherlands, October 8-10 and 15-16, 2016, Proceedings, Part II (pp. 17-35). Cham: Springer International Publishing.
24. Wu, B., & Nevatia, R. (2006, June). Tracking of multiple, partially occluded humans based on static body part detection. In 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06) (Vol. 1, pp. 951-958). IEEE.
25. Leal-Taixé, L., Milan, A., Reid, I., Roth, S., & Schindler, K. (2015). MOTChallenge 2015: Towards a benchmark for multi-target tracking. arXiv preprint arXiv:1504.01942.
26. Milan, A., Leal-Taixé, L., Reid, I., Roth, S., & Schindler, K. (2016). MOT16: A benchmark for multi-object tracking. arXiv preprint arXiv:1603.00831.
27. Dendorfer, P., Osep, A., Milan, A., Schindler, K., Cremers, D., Reid, I., ... & Leal-Taixé, L. (2021). MOTChallenge: A benchmark for single-camera multiple target tracking. International Journal of Computer Vision, 129, 845-881.
28. Dendorfer, P., Rezatofighi, H., Milan, A., Shi, J., Cremers, D., Reid, I., Roth, S., & Schindler, K. (2020). MOT20: A benchmark for multi object tracking in crowded scenes. arXiv preprint. https://doi.org/10.48550/arXiv.2003.09003
29. Bochkovskiy, A., Wang, C., & Liao, H. (2020). YOLOv4: Optimal speed and accuracy of object detection. arXiv preprint. https://doi.org/10.48550/arXiv.2004.10934
30. Zeng, F., Dong, B., Zhang, Y., Wang, T., Zhang, X., & Wei, Y. (2021). MOTR: End-to-end multiple-object tracking with Transformer. arXiv preprint. https://doi.org/10.48550/arXiv.2105.03247v2
31. Bewley, A., Ge, Z., Ott, L., Ramos, F., & Upcroft, B. (2016, September). Simple online and realtime tracking. In 2016 IEEE International Conference on Image Processing (ICIP) (pp. 3464-3468). IEEE.
32. Peng, J., Wang, C., Wan, F., Wu, Y., Wang, Y., Tai, Y., ... & Fu, Y. (2020). Chained-Tracker: Chaining paired attentive regression results for end-to-end joint multiple-object detection and tracking. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part IV (pp. 145-161). Springer International Publishing.
33. Henschel, R., Leal-Taixé, L., Cremers, D., & Rosenhahn, B. (2017). Improvements to Frank-Wolfe optimization for multi-detector multi-object tracking. arXiv preprint arXiv:1705.08314.
34. Zhang, Y., Wang, C., Wang, X., Zeng, W., & Liu, W. (2021). FairMOT: On the fairness of detection and re-identification in multiple object tracking. International Journal of Computer Vision, 129, 3069-3087.
35. Wang, Y., Kitani, K., & Weng, X. (2021, May). Joint object detection and multi-object tracking with graph neural networks. In 2021 IEEE International Conference on Robotics and Automation (ICRA) (pp. 13708-13715). IEEE.
36. Xu, Y., Ban, Y., Delorme, G., Gan, C., Rus, D., & Alameda-Pineda, X. (2021). TransCenter: Transformers with dense queries for multiple-object tracking. arXiv e-prints, arXiv-2103.
37. Huang, Y. C. Multiple Object Tracking using Sparse R-CNN with Spatial Uncertainty. NTHU. https://hdl.handle.net/11296/4p8u95
38. Chen, L., Ai, H., Shang, C., Zhuang, Z., & Bai, B. (2017, September). Online multi-object tracking with convolutional neural networks. In 2017 IEEE International Conference on Image Processing (ICIP) (pp. 645-649). IEEE.
39. Sadeghian, A., Alahi, A., & Savarese, S. (2017). Tracking the untrackable: Learning to track multiple cues with long-term dependencies. In Proceedings of the IEEE International Conference on Computer Vision (pp. 300-311).
40. Keuper, M., Tang, S., Andres, B., Brox, T., & Schiele, B. (2018). Motion segmentation & multiple object tracking by correlation co-clustering. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42(1), 140-153.
41. Fang, K., Xiang, Y., Li, X., & Savarese, S. (2018, March). Recurrent autoregressive networks for online multi-object tracking. In 2018 IEEE Winter Conference on Applications of Computer Vision (WACV) (pp. 466-475). IEEE.
42. Sun, S., Akhtar, N., Song, H., Mian, A., & Shah, M. (2019). Deep affinity network for multiple object tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43(1), 104-119.
43. Wu, J., Cao, J., Song, L., Wang, Y., Yang, M., & Yuan, J. (2021). Track to detect and segment: An online multi-object tracker. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 12352-12361).
44. Pang, B., Li, Y., Zhang, Y., Li, M., & Lu, C. (2020). TubeTK: Adopting tubes to track multi-object in a one-step training model. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 6308-6318).
45. Lin, S. D., Chang, T., & Chen, W. (2021). Multiple object tracking using YOLO-based detector. Journal of Imaging Science and Technology, 65(4), 40401-1.
(Full text available for external access after 2025/02/05)