
Detailed Record

Author: 徐嫚庭
Author (English): Man-Ting Xu
Thesis Title: 使用分解式雙線性層仲裁網路之多階段藥品自動識別法
Thesis Title (English): Multi-Stage Automatic Medication Identification Method Based on a Decomposable Bilinear Layer Arbitration Network
Advisor: 江政欽
Advisor (English): Cheng-Chin Chiang
Committee Members: 魏德樂, 林信鋒
Committee Members (English): Der-Lor Way, Shin-Feng Lin
Degree: Master's
University: National Dong Hwa University
Department: Department of Computer Science and Information Engineering
Student ID: 611021202
Publication Year (ROC calendar): 112 (2023)
Academic Year of Graduation: 111
Language: Chinese
Pages: 62
Keywords: Deep Learning, Medication Detection, Medication Recognition, YOLO, DenseNet, Decomposed Bilinear Layer Arbitration Network
According to reports from the Taiwan Food and Drug Administration, medication use among the Taiwanese population has been rising in recent years, with annual medication expenditure reaching hundreds of billions of NT dollars, and pharmacists face a correspondingly growing volume of prescriptions to dispense. Verifying medications is not only time-consuming and labor-intensive but also carries the risk of errors that endanger patients, making medication error a latent and important issue. The prolonged pressure of medication checking places an enormous physical and mental burden on pharmacists. Moreover, if patients have insufficient knowledge of the medications they take, they may at best fail to receive appropriate treatment and at worst face life-threatening risks. This study therefore uses deep learning, a now widely adopted technology, to develop a camera-assisted medication recognition system that helps pharmacists verify medications quickly and accurately, reducing the risk of medication errors and easing the verification burden on pharmacists.
This study presents the design of an automatic medication identification system. The hardware consists of a webcam, a server, and a stand that fixes the camera height. On the software side, the medication detection and recognition models are implemented in PyTorch. The dataset, provided by the Mennonite Christian Hospital, covers 215 types of capsules and tablets in transparent packaging used within the hospital.
We propose a multi-stage medication identification method. In the first stage, a highly accurate YOLO model detects and coarsely classifies the pills in the image, achieving a recognition accuracy of 98.6%. In the second stage, a DenseNet model performs fine-grained classification: for capsules, the Top-1 accuracy reaches 99.27% and the Top-3 accuracy 100%; for tablets, the Top-1, Top-3, and Top-5 accuracies reach 97.65%, 99.55%, and 99.66%, respectively. Finally, in the third stage, our self-designed DBL (Decomposed Bilinear Layer) module arbitrates among the Top-N candidate classes: it extracts pairwise arbitration features between the candidate classes and integrates them into a single comprehensive arbitration feature, and our DBLAN (Decomposed Bilinear Layer Arbitration Network) then performs the Top-N arbitration and outputs the final first-candidate class. Experimental results show that DBLAN indeed improves the second-stage DenseNet Top-1 accuracy, reaching 100% for capsules and 98.3% for tablets. A final experiment confirms that the multi-stage identification method achieves higher recognition accuracy than a single-stage method.
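This record does not include the thesis's actual DBL/DBLAN definitions, so the following is only an illustrative PyTorch sketch of one common way to realize a decomposed (low-rank) bilinear interaction and use it to arbitrate among Top-N candidates. Every class name, dimension, and the class-embedding design below is an assumption for illustration, not the author's implementation.

```python
import torch
import torch.nn as nn


class DecomposedBilinearLayer(nn.Module):
    """Low-rank (decomposed) bilinear interaction between two feature
    vectors: z = P(U(x) * V(y)), where * is the Hadamard product.
    The factorization avoids materializing the full bilinear weight
    tensor that torch.nn.Bilinear would use."""

    def __init__(self, in_dim: int, rank: int, out_dim: int):
        super().__init__()
        self.U = nn.Linear(in_dim, rank, bias=False)
        self.V = nn.Linear(in_dim, rank, bias=False)
        self.P = nn.Linear(rank, out_dim)

    def forward(self, x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
        return self.P(self.U(x) * self.V(y))


class ArbitrationNetwork(nn.Module):
    """Arbitrates among the Top-N candidate classes from an earlier
    classification stage: a DBL feature is computed between the query
    image embedding and each candidate's learned class embedding, each
    feature is scored, and the highest-scoring candidate wins."""

    def __init__(self, feat_dim: int, num_classes: int,
                 rank: int = 64, arb_dim: int = 128):
        super().__init__()
        self.class_embed = nn.Embedding(num_classes, feat_dim)
        self.dbl = DecomposedBilinearLayer(feat_dim, rank, arb_dim)
        self.score = nn.Linear(arb_dim, 1)

    def forward(self, query_feat: torch.Tensor,
                topn_ids: torch.Tensor) -> torch.Tensor:
        # query_feat: (B, feat_dim); topn_ids: (B, N) candidate class ids
        cand = self.class_embed(topn_ids)            # (B, N, feat_dim)
        q = query_feat.unsqueeze(1).expand_as(cand)  # (B, N, feat_dim)
        arb = self.dbl(q, cand)                      # (B, N, arb_dim)
        scores = self.score(arb).squeeze(-1)         # (B, N)
        best = scores.argmax(dim=-1)                 # index into the Top-N
        # Map the winning position back to its class id.
        return topn_ids.gather(1, best.unsqueeze(1)).squeeze(1)
```

In this sketch the arbiter only reorders the Top-N list produced by the second stage, so it can never do worse than the Top-N accuracy of that stage — which matches the abstract's observation that arbitration lifts Top-1 accuracy toward the Top-3/Top-5 ceiling.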
Acknowledgments
Abstract (Chinese)
Abstract
Table of Contents
List of Figures
List of Tables
Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
1.3 Thesis Organization
Chapter 2 Literature Review
2.1 Related Work
2.2 Techniques Used
2.2.1 Deep Learning
2.2.2 Object Detection
2.2.3 Convolutional Neural Networks
2.2.4 Siamese Neural Networks
Chapter 3 Research Method and System Design
3.1 Research Method
3.1.1 YOLO Medication Detection and First-Stage Coarse Classification
3.1.2 DenseNet Second-Stage Fine Classification
3.1.3 Decomposed Bilinear Layer Arbitration Network Third-Stage Top-N Arbitration
3.2 System Design
3.2.1 System Environment
3.2.2 System Workflow
Chapter 4 Experimental Results and Discussion
4.1 Medication Detection and First-Stage Coarse Classification
4.1.1 Medication Data Collection
4.1.2 Detection and Coarse Classification Performance
4.2 Second-Stage Fine Classification
4.2.1 Limitations of the Medication Recognition System
4.2.2 Choice of Recognition Model
4.2.3 Data Augmentation and Pixel Normalization
4.2.4 Second-Stage Fine Classification of Capsules
4.2.5 Second-Stage Fine Classification of Tablets
4.3 Third-Stage Top-N Candidate Arbitration
4.3.1 DBLAN Training Data Augmentation
4.3.2 Third-Stage Arbitration for Capsules
4.3.3 Third-Stage Arbitration for Tablets
4.4 Comparison of Single-Stage and Multi-Stage Medication Recognition
4.5 Validation of the Decomposed Bilinear Layer
Chapter 5 Conclusions and Future Work
References
(Full text available for external access after 2026-08-01)