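Author: 張泓亮 (Hong-Liang Zhang)
Thesis title: 深度學習應用於向量量化影像壓縮之研究
Thesis title (English): Deep Learning Based Vector Quantization For Image Compression
Advisor: 陳偉銘 (Wei-Ming Chen)
Committee members: 張耀中 (Yao-Chung Chang), 簡暐哲 (Wei-Che Chien)
Degree: Master's
Institution: 國立東華大學 (National Dong Hwa University)
Department: 資訊管理學系 (Department of Information Management)
Student ID: 610935102
Year of publication (ROC calendar): 111 (2022)
Graduation academic year: 110
Language: Chinese
Pages: 52
Keywords: Image Compression; Vector Quantization; Codebook; Deep Learning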
In this era of information explosion, countless images are transmitted every second. Because these images consume large amounts of storage space and network bandwidth, image compression plays an important role. Vector quantization (VQ) is a lossy image compression method with a high compression ratio, and the codebook at the core of its architecture largely determines the quality of the reconstructed image. Increasing the number of codebooks or the number of codewords effectively improves reconstruction quality, but it also increases the space required after compression and thus lowers the compression ratio. Most of the vector quantization literature focuses on improving codebook training algorithms or on applying VQ to data hiding.
In recent years, deep learning has risen rapidly and been applied widely across many fields. Deep learning has often been combined with vector quantization, for example in generative models and in neural network compression, and the experimental results reported in the literature are good. Combinations of vector quantization with deep learning classification models, however, remain scarce in the literature.
Therefore, this study proposes a method built on the vector quantization framework and combined with a deep learning classification model. Image classification is used to create a large number of clustered and classified codebooks, which improves the quality of the reconstructed images. These codebooks, however, increase the amount of information that must be passed to the decompression side, so this study further uses a deep learning model to predict the image class, which reduces the transmitted information and yields a better compression ratio.
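The trade-off described in the abstract can be illustrated with a minimal sketch of block-based vector quantization. This is not the thesis's implementation: the 4x4 block size, the function names, and the bit-accounting helper `compressed_bits` (including its `ids_sent` parameter for how many codebook ids are transmitted) are assumptions made for illustration, and the codebook would normally come from a training algorithm such as LBG rather than being passed in by hand.

```python
import math
import numpy as np

def split_blocks(img, b=4):
    """Split an HxW grayscale image into non-overlapping b x b blocks (one row vector each)."""
    h, w = img.shape
    assert h % b == 0 and w % b == 0, "image size must be a multiple of the block size"
    return (img.reshape(h // b, b, w // b, b)
               .swapaxes(1, 2)
               .reshape(-1, b * b)
               .astype(np.float64))

def merge_blocks(vecs, shape, b=4):
    """Inverse of split_blocks: reassemble block vectors into an HxW image."""
    h, w = shape
    return vecs.reshape(h // b, w // b, b, b).swapaxes(1, 2).reshape(h, w)

def vq_encode(vecs, codebook):
    """Return, for each block vector, the index of the nearest codeword (squared Euclidean)."""
    dist = ((vecs[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)

def vq_decode(indices, codebook):
    """Look each index up in the codebook to rebuild the block vectors."""
    return codebook[indices]

def psnr(original, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images."""
    diff = original.astype(np.float64) - restored.astype(np.float64)
    mse = np.mean(diff ** 2)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)

def compressed_bits(num_blocks, codebook_size, num_codebooks=1, ids_sent=0):
    """Hypothetical bit count: one index per block, plus `ids_sent` codebook ids as side info."""
    index_bits = num_blocks * math.ceil(math.log2(codebook_size))
    id_bits = math.ceil(math.log2(num_codebooks)) if num_codebooks > 1 else 0
    return index_bits + ids_sent * id_bits

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    img = rng.integers(0, 256, size=(16, 16)).astype(np.uint8)
    vecs = split_blocks(img)
    codebook = vecs[:8].copy()  # toy stand-in for a trained codebook
    idx = vq_encode(vecs, codebook)
    restored = merge_blocks(vq_decode(idx, codebook), img.shape)
    print("PSNR (dB):", psnr(img, restored))
    print("bits, codebook id sent:", compressed_bits(len(vecs), 8, num_codebooks=16, ids_sent=1))
    print("bits, codebook id predicted:", compressed_bits(len(vecs), 8, num_codebooks=16, ids_sent=0))
```

In this sketch, setting `ids_sent=0` stands for the case where the decompression side predicts the codebook with a classifier instead of receiving its id, which is the saving the abstract refers to. Whether the id is needed per image or per block is not stated in this record, so `ids_sent` is left as a free parameter.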
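Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Objectives
Chapter 2 Literature Review
2.1 Image Compression
2.1.1 Lossless Compression
2.1.2 Lossy Compression
2.2 Vector Quantization (VQ)
2.2.1 Codebook
2.2.2 LBG (Linde-Buzo-Gray) Algorithm
2.3 Image Analysis
2.3.1 Histogram
2.3.2 Image Variance
2.3.3 Canny Edge Detection
2.4 Deep Learning
2.4.1 Convolutional Neural Network (CNN)
2.4.2 ResNet
2.5 Vector Quantization Combined with Deep Learning
2.6 Evaluation of Image Compression
2.6.1 Peak Signal-to-Noise Ratio (PSNR)
2.6.2 Structural Similarity (SSIM)
2.7 Technical Summary
Chapter 3 Research Method
3.1 Dataset
3.2 Codebook Design
3.2.1 Clustering
3.2.2 Classification
3.2.3 Codebook Design Summary
3.3 Vector Quantization Combined with Deep Learning
3.3.1 Compression
3.3.2 Decompression
3.3.3 ResNet
3.4 Flowchart of the Vector Quantization Combined with Deep Learning Architecture
3.5 Image Compression Metrics
Chapter 4 Experimental Results
4.1 Experimental Environment
4.1.1 Hardware Environment
4.1.2 Software Environment
4.2 Experimental Results
4.2.1 Results of Clustering and Classification by Grayscale Histogram and Variance
4.2.2 Results of Clustering and Classification by Grayscale Histogram and Canny Edge Detection
4.2.3 Results of Clustering and Classification with Additional Grayscale Histogram and Variance Groups
4.3 Discussion of Experimental Results
4.3.1 Comparison of Results for the Variance and Canny Edge Detection Methods
4.3.2 Comparison of Results for 16-Group and 32-Group Codebooks Obtained by Histogram and Variance
4.3.3 Compression Ratio
Chapter 5 Conclusion and Future Work
References
Chinese-Language References
English-Language References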

(The full text will be open to external access after 2025-09-25.)