Detailed Record

Author: Hsin-Yi Huang (黃馨儀)
Title: Composer-specific Music Generation and Verification (特定作曲家的音樂生成與驗證)
Advisor: Jiann-Ming Wu (吳建銘)
Committee Members: Chen-Hai Tsao (曹振海), Dong-Hwa Lu (盧東華)
Degree: Master's
Institution: National Dong Hwa University
Department: Department of Applied Mathematics
Student ID: 610911001
Year of Publication: 2023 (ROC year 112)
Graduation Academic Year: 111 (2022-2023)
Language: English
Pages: 59
Keywords: music generation, composer classification, Markov chain, long short-term memory, convolutional neural network, MIDI
Abstract:
This work explores composer-specific music classification and generation based on classical piano MIDI files of several famous composers. One goal is to construct a feasible classifier that discriminates among the works of different composers via supervised learning with convolutional neural networks (CNNs); the other is to devise composer-specific music generative models and verify their effectiveness. In recent years, with advances in computer technology, research on applying mathematical models and deep learning to automatic symbolic music generation has gained popularity, and model-based music composition has become an interesting topic in artificial intelligence. In this work, we propose a hybrid generative model combining long short-term memory (LSTM) networks and first-order Markov chains for symbolic music generation, and we employ a composer classification model, obtained by training deep CNNs, as an objective mechanism for evaluating the generated music. To generate music, we use the classical piano MIDI files of a single composer as training data and decompose the extracted chord information into three separately trained parts: root-based classes, chords, and durations. Music is then generated hierarchically: separate LSTMs first predict the root sequence and the duration sequence, the generated root sequence is treated as the root progression of the piece, and the chord belonging to the next root-based class is then predicted from the Markov transition probabilities of the current root and chord. LSTMs were chosen over first-order Markov models for generating the root and duration sequences because they capture longer-range sequential structure. Experimental results evaluated by the composer classification model confirm that music generated by the hybrid model is more consistent with the compositional characteristics of the target composer.
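The hierarchical generation step lends itself to a short illustration. The following Python sketch is a toy rendering under stated assumptions, not the thesis implementation: the chord vocabulary (CHORDS_BY_ROOT) and the transition counts are hypothetical placeholders (in the thesis they would be estimated from chord sequences extracted from one composer's MIDI files), the trained root LSTM is stubbed out by a random sampler (generate_roots), duration generation is omitted, and the Markov step is simplified to condition on the current chord alone rather than the full (root, chord) pair.

import random
from collections import defaultdict

# Toy vocabulary: each root-based class groups the chords built on that root.
# (Hypothetical placeholder; the thesis derives these classes from the data.)
CHORDS_BY_ROOT = {
    "C": ["C", "Cmaj7"],
    "F": ["F", "Fmaj7"],
    "G": ["G", "G7"],
}

# First-order transition counts (current chord -> next chord), hand-filled
# here; in practice they would be counted from a composer's chord sequences.
TRANSITIONS = defaultdict(lambda: defaultdict(int))
for prev, nxt in [("C", "F"), ("C", "G7"), ("F", "G7"), ("G7", "C"),
                  ("G7", "Cmaj7"), ("Cmaj7", "Fmaj7"), ("Fmaj7", "G7")]:
    TRANSITIONS[prev][nxt] += 1

def generate_roots(length: int) -> list[str]:
    # Stand-in for the trained root LSTM: the thesis would sample a
    # sequence model autoregressively instead of choosing uniformly.
    return random.choices(list(CHORDS_BY_ROOT), k=length)

def sample_next_chord(current_chord: str, next_root: str) -> str:
    # Restrict candidates to the class of the predicted next root, then
    # weight them by the Markov transition probabilities out of the
    # current chord (small constant smooths unseen transitions).
    candidates = CHORDS_BY_ROOT[next_root]
    weights = [TRANSITIONS[current_chord][c] + 1e-6 for c in candidates]
    return random.choices(candidates, weights=weights, k=1)[0]

def generate_progression(length: int = 8) -> list[str]:
    roots = generate_roots(length)          # step 1: root progression (LSTM)
    chords = [random.choice(CHORDS_BY_ROOT[roots[0]])]
    for root in roots[1:]:                  # step 2: chords per root class (Markov)
        chords.append(sample_next_chord(chords[-1], root))
    return chords

if __name__ == "__main__":
    print(generate_progression())

Sampling chords only from the class of the predicted root is what makes the scheme hierarchical: the LSTM fixes the long-range root progression, while the Markov chain resolves only the local choice of chord within each root-based class.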
1. Introduction 1

2. Data Representation 7
2.1 Dataset Introduction 7
2.2 MIDI 8
2.3 Musical Terminology 10
2.4 Data Representation 12

3. CNN Composer Classification 17
3.1 Software Design 18
3.2 Experiments and Results 19

4. Music Generation with First-order Markov Chains 23
4.1 Root-based Classification of Chords 24
4.2 Brief Introduction to Markov Chains 27
4.3 Markov Transition Matrices 28
4.4 Music Generation with First-order Markov Transition Probabilities 30

5. Music Generation Using a Hybrid Generative Model of LSTMs and Markov Chains 35
5.1 Brief Introduction to LSTMs 36
5.2 Training LSTMs for Root and Duration Generation Only 38
5.3 Generation with a Hybrid LSTM-Markov Model 42

6. Numerical Evaluation of Music Generation 47

7. Conclusions and Future Work 55

References 57