
Detailed Record

Author (Chinese): 黎椿棟
Author (English): Chun-Tung Li
Title (Chinese): U形孿生變分自編碼器於破損中文字填補修復應用
Title (English): U-shaped Siamese Variational Autoencoder: A Network for Repairing Damaged Chinese Characters
Advisor (Chinese): 江政欽
Advisor (English): Cheng-Chin Chiang
Oral Defense Committee (Chinese): 魏德樂, 林信鋒
Oral Defense Committee (English): Der-Lor Way, Shin-Feng Lin
Degree: Master's
Institution: National Dong Hwa University
Department: Department of Computer Science and Information Engineering
Student ID: 610621501
Year of Publication (ROC calendar): 112 (2023)
Academic Year of Graduation: 111
Language: Chinese
Number of Pages: 54
Keywords (Chinese): 深度學習; 變分自動編碼器; 中文字填補修復
Keywords (English): deep learning; Variational Autoencoder; generative models
Before documents were computerized, classical texts and other records existed only in print or handwriting on paper. Over long periods, text on paper often suffers damage or loss due to improper preservation. When the damaged content is important, restoration becomes necessary, yet fully manual restoration is time-consuming and labor-intensive, and its results are not always satisfactory. With today's advances in image processing, and particularly with the aid of artificial intelligence, restoration quality can rival manual work. Using computers as an aid can greatly improve restoration efficiency and reduce cost, and applying character recognition to the restored text further improves recognition accuracy, which in turn benefits the digital archiving of books and documents.

In addition, modern typeset documents use many carefully designed fonts, and designing such fonts also requires considerable labor and time. If character restoration techniques could be extended into tools that automatically generate fonts in different styles, ordinary users might one day be able to design typefaces in their own style quickly.

This study applies deep learning to propose a method for restoring damaged characters, recovering the missing parts of a damaged character to its original form, and, building on this technique, experiments with automatically generating new Chinese typefaces through learning. In recent years the two dominant families of deep generative models have been Generative Adversarial Networks (GANs) and Autoencoders (AEs). This research therefore takes the Variational Autoencoder (VAE) as its core and integrates the Siamese Neural Network and the U-Net structure into the VAE to develop a new network, using skip connections to improve the quality of the reconstructed images.

The study applies this VAE-based network to filling in both intact and damaged outline fonts of Chinese characters. Once the contour of an outline font is damaged, restoration is more challenging than for solid fonts. In the experiments, the outline fonts of part of a Chinese character set were used to train the network to perform filling, and the outline fonts of the remaining, unseen characters were used to test the filling quality. With the proposed architecture, training time was shortened and the restoration results were very good.
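The full thesis is not included in this record, and the abstract does not fix the exact network configuration. As a rough illustration of the kind of architecture described above (a VAE whose encoder and decoder are linked by U-Net-style skip connections, with the same weights shared across a Siamese pair of damaged and clean glyphs), the following minimal PyTorch sketch may help. The image size (64x64 grayscale), layer widths, module names, and loss weighting are all illustrative assumptions, not the author's actual design.

# Minimal sketch; assumes 64x64 grayscale glyph images normalized to [0, 1].
import torch
import torch.nn as nn
import torch.nn.functional as F

class UShapedVAE(nn.Module):
    """U-Net-style VAE: encoder features reach the decoder via skip connections."""
    def __init__(self, latent_dim=128):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(1, 32, 4, 2, 1), nn.ReLU())    # 64 -> 32
        self.enc2 = nn.Sequential(nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU())   # 32 -> 16
        self.enc3 = nn.Sequential(nn.Conv2d(64, 128, 4, 2, 1), nn.ReLU())  # 16 -> 8
        self.fc_mu = nn.Linear(128 * 8 * 8, latent_dim)
        self.fc_logvar = nn.Linear(128 * 8 * 8, latent_dim)
        self.fc_dec = nn.Linear(latent_dim, 128 * 8 * 8)
        self.dec3 = nn.Sequential(nn.ConvTranspose2d(128 + 128, 64, 4, 2, 1), nn.ReLU())  # 8 -> 16
        self.dec2 = nn.Sequential(nn.ConvTranspose2d(64 + 64, 32, 4, 2, 1), nn.ReLU())    # 16 -> 32
        self.dec1 = nn.ConvTranspose2d(32 + 32, 1, 4, 2, 1)                               # 32 -> 64

    def forward(self, x):
        e1 = self.enc1(x)
        e2 = self.enc2(e1)
        e3 = self.enc3(e2)
        flat = e3.flatten(1)
        mu, logvar = self.fc_mu(flat), self.fc_logvar(flat)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)    # reparameterization trick
        d3 = self.fc_dec(z).view(-1, 128, 8, 8)
        d2 = self.dec3(torch.cat([d3, e3], dim=1))                 # skip connection from enc3
        d1 = self.dec2(torch.cat([d2, e2], dim=1))                 # skip connection from enc2
        out = torch.sigmoid(self.dec1(torch.cat([d1, e1], dim=1))) # skip connection from enc1
        return out, mu, logvar

def siamese_vae_loss(model, damaged, clean, kl_weight=1e-3):
    """Siamese use of one shared network: the damaged branch is trained to
    reconstruct the clean glyph, and the two latent codes are pulled together."""
    out_d, mu_d, logvar_d = model(damaged)
    out_c, mu_c, logvar_c = model(clean)
    recon = F.binary_cross_entropy(out_d, clean) + F.binary_cross_entropy(out_c, clean)
    kl = -0.5 * torch.mean(1 + logvar_d - mu_d.pow(2) - logvar_d.exp()) \
         - 0.5 * torch.mean(1 + logvar_c - mu_c.pow(2) - logvar_c.exp())
    latent_match = F.mse_loss(mu_d, mu_c)  # encourage damaged and clean codes to coincide
    return recon + kl_weight * kl + latent_match

Under these assumptions a training step would compute loss = siamese_vae_loss(model, damaged_batch, clean_batch), call loss.backward(), and step an optimizer. The loss terms and weights actually used in the thesis are treated in its Section 4.2.5 and are not reproduced here.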
Approval Sheet
Acknowledgements
Abstract (Chinese)
Abstract (English)
Table of Contents
List of Figures
List of Tables
Chapter 1: Introduction
1.1 Research Motivation
1.2 Research Objectives
Chapter 2: Related Work and Literature Review
2.1 Related Research
2.2 Deep Learning Techniques
2.2.1 Convolutional Neural Networks
2.2.2 Variational Autoencoder (VAE)
2.2.3 Siamese Neural Network
2.2.4 U-Net
2.3 Summary
Chapter 3: Damaged Glyph Repair with the U-shaped Siamese Variational Autoencoder
3.1 Requirements Analysis
3.2 Training Architecture
3.3 Design of the U-shaped Siamese Variational Autoencoder
3.3.1 Siamese VAE Glyph Repair Network
3.3.2 U-shaped Siamese VAE Glyph Repair Network
3.3.3 Training and Test Datasets
Chapter 4: Experimental Results and Discussion
4.1 Experimental Procedure
4.1.1 Experimental Methods
4.1.2 Evaluation Metrics
4.2 Experimental Results
4.2.1 Tests on Damaged Fonts
4.2.2 Tests on Skewed Fonts
4.2.3 Tests on Damaged Skewed Fonts
4.2.4 Tests on Damaged Characters Segmented from Images
4.2.5 Loss Functions
4.3 Summary of Results
Chapter 5: Conclusions and Future Directions
5.1 Conclusions
5.2 Future Work
References

(The full text will be open to external access after 2026-08-09.)