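Deep Learning Based Vector Quantization for Image Compression (深度學習應用於向量量化影像壓縮之研究) | National Dong Hwa University Electronic Theses and Dissertations Full-Text System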


Author:	張泓亮 (Hong-Liang Zhang)
Thesis title:	深度學習應用於向量量化影像壓縮之研究 (Deep Learning Based Vector Quantization for Image Compression)
Advisor:	陳偉銘 (Wei-Ming Chen)
Oral examination committee:	張耀中 (Yao-Chung Chang), 簡暐哲 (Wei-Che Chien)
Degree:	Master's
Institution:	國立東華大學 (National Dong Hwa University)
Department:	Department of Information Management
Student ID:	610935102
Year of publication (ROC calendar):	111 (2022)
Academic year of graduation:	110
Language:	Chinese
Number of pages:	52
Keywords:	Image Compression, Vector Quantization, Codebook, Deep Learning

In this era of information explosion, countless images are transmitted every second. Because these images consume large amounts of storage space and network bandwidth, image compression plays an important role. Vector quantization (VQ) is a lossy image compression method with a high compression ratio, and the codebook in its architecture largely determines the quality of the reconstructed image. Increasing the number of codebooks or the number of codewords effectively improves the quality of the reconstructed image, but it also increases the space required after compression and therefore lowers the compression ratio. Most of the VQ literature focuses on improving codebook training algorithms or on applying VQ to data hiding.
Deep learning has risen in recent years and is now widely applied across many fields. It has also been combined with vector quantization in a number of applications, such as generative models and neural network compression, and the experimental results reported in the literature are good. Combinations of VQ with deep learning classification models, however, remain scarce in the literature.
This study therefore proposes a VQ-based architecture combined with a deep learning classification model. Image classification is used to create a large number of clustered and classified codebooks, which improves the quality of the reconstructed image. Because these codebooks increase the amount of information that must be passed to the decompression side, the study further uses a deep learning model to predict the image classification, which reduces the transmitted information and yields a better compression ratio.
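The trade-off described in the abstract can be illustrated with a small sketch. It is not the thesis's implementation: the helper names (`index_bits`, `nearest`, `encode`, `decode`, `compression_ratio`), the toy codebooks, the one-class-label-per-block granularity, and the 512x512 image with 4x4 blocks are all assumptions chosen for illustration. The sketch shows plain nearest-codeword VQ with one codebook per class, and how the compression ratio changes depending on whether class labels are transmitted or predicted on the decompression side.

```python
def index_bits(n):
    """Bits needed to address one of n entries (0 when n == 1)."""
    return (n - 1).bit_length()


def nearest(codebook, block):
    """Index of the codeword with the smallest squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((c - b) ** 2 for c, b in zip(codebook[i], block)),
    )


def encode(blocks, labels, codebooks):
    """Replace each block by the index of its nearest codeword in its class codebook."""
    return [nearest(codebooks[k], b) for b, k in zip(blocks, labels)]


def decode(indices, labels, codebooks):
    """Look each index up in the codebook of the block's class."""
    return [codebooks[k][i] for i, k in zip(indices, labels)]


def compression_ratio(num_blocks, block_pixels, codebook_size, num_classes,
                      send_labels, bits_per_pixel=8):
    """Raw bits divided by compressed bits.

    Assumption: the codebooks are shared by both sides and are not counted,
    and a class label, when sent, costs index_bits(num_classes) per block.
    """
    raw = num_blocks * block_pixels * bits_per_pixel
    per_block = index_bits(codebook_size)
    if send_labels:
        per_block += index_bits(num_classes)
    return raw / (num_blocks * per_block)


# Toy data (hypothetical): two classes, two-pixel "blocks".
codebooks = {
    "flat": [(10, 10), (200, 200)],
    "edge": [(0, 255), (255, 0)],
}
blocks = [(12, 9), (190, 205), (5, 250)]
labels = ["flat", "flat", "edge"]

indices = encode(blocks, labels, codebooks)
restored = decode(indices, labels, codebooks)

# Hypothetical 512x512 grayscale image, 4x4 blocks, 256 codewords, 16 classes.
with_labels = compression_ratio(16384, 16, 256, 16, send_labels=True)
predicted_labels = compression_ratio(16384, 16, 256, 16, send_labels=False)
```

Under these assumed sizes, sending a 4-bit label with every 8-bit index lowers the ratio from 16 to about 10.7. A decoder that can infer the label itself recovers the higher ratio, which is the motivation the abstract gives for predicting the classification with a deep learning model.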

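Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
Chapter 2 Literature Review
2.1 Image Compression
2.1.1 Lossless Compression
2.1.2 Lossy Compression
2.2 Vector Quantization (VQ)
2.2.1 Codebook
2.2.2 LBG (Linde-Buzo-Gray) Algorithm
2.3 Image Analysis
2.3.1 Histogram
2.3.2 Image Variance
2.3.3 Canny Edge Detection
2.4 Deep Learning
2.4.1 Convolutional Neural Network (CNN)
2.4.2 ResNet
2.5 Vector Quantization Combined with Deep Learning
2.6 Evaluation of Image Compression
2.6.1 Peak Signal-to-Noise Ratio (PSNR)
2.6.2 Structural Similarity (SSIM)
2.7 Summary of Techniques
Chapter 3 Research Method
3.1 Dataset
3.2 Codebook Design
3.2.1 Clustering
3.2.2 Classification
3.2.3 Summary of Codebook Design
3.3 Vector Quantization Combined with Deep Learning
3.3.1 Compression
3.3.2 Decompression
3.3.3 ResNet
3.4 Flowchart of the Vector Quantization and Deep Learning Architecture
3.5 Image Compression Metrics
Chapter 4 Experimental Results
4.1 Experimental Environment
4.1.1 Hardware Environment
4.1.2 Software Environment
4.2 Experimental Results
4.2.1 Results of Clustering and Classification by Grayscale Histogram and Variance
4.2.2 Results of Clustering and Classification by Grayscale Histogram and Canny Edge Detection
4.2.3 Results of Increasing the Number of Groups for Clustering and Classification by Grayscale Histogram and Variance
4.3 Discussion of Experimental Results
4.3.1 Comparison of Results from the Variance and Canny Edge Detection Methods
4.3.2 Comparison of Results for 16-Group and 32-Group Codebooks Using the Histogram and Variance Method
4.3.3 Compression Ratio
Chapter 5 Conclusion and Future Work
References
Chinese-Language References
English-Language References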


(Full text available for external browsing after 2025-09-25)
01.pdf
