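Answering Elementary School Social Studies Term Exam Questions with Neural Networks | National Dong Hwa University Theses and Dissertations Full-Text System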

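Author:	王柏皓 (Bo-Hao Wang)
Thesis title:	藉由神經網路回答國小社會科段考題 (Answering Elementary School Social Studies Term Exam Questions with Neural Networks)
Advisor:	吳建銘 (Jiann-Ming Wu)
Oral defense committee:	魏澤人 (Tzer-Jen Wei), 黃延安 (Yan-An Hwang)
Degree:	Master's
Institution:	National Dong Hwa University
Department:	Department of Applied Mathematics
Student ID:	610811001
Publication year (ROC calendar):	110 (2021)
Graduation academic year:	109
Language:	Chinese
Number of pages:	38
Keywords:	neural network, transfer learning, question answering system, BERT, elementary school, social studies, true-false question, multiple-choice question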

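As neural networks have developed, they have brought major gains in image recognition, speech recognition, and many other fields, and natural language processing (NLP) has kept pace. Teaching computers to understand natural language has opened up applications such as question answering, semantic analysis, and natural language generation, with impressive results.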

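Natural language processing covers many different tasks, and its models are usually large, so training a language model from scratch for one's own specific task is very time-consuming. Transfer learning lets us take a model originally trained for another task and apply it to our own, which makes the model reusable to a certain degree. By starting from a pretrained model and fine-tuning it until it performs well on our task, we both reduce the environmental cost and shorten training time. The best-known models of this kind are BERT, GPT-2, and GPT-3.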

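In this study, we collected social studies term exam papers for elementary school grades 3 through 6 from the national exam bank website 全國中小學題庫網. The papers differ widely in format and layout, so we extracted the plain text of their true-false and multiple-choice questions, used regular expressions to split and normalize that text, and saved it as JSON files. The result is a ready-to-use dataset, which we have also released to the public.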
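The splitting and normalization step can be sketched roughly as follows. This is a minimal illustration, not the thesis's actual code: the line layouts `( O ) 1. ...` and `( 2 ) 3. ... (1) ... (2) ...`, the field names, and the function `parse_line` are all assumptions made for the example, and the real exam papers vary far more in format.

```python
import json
import re

# Hypothetical line layouts (assumptions, not the real exam formats):
#   true-false:      "( O ) 1. statement"   with O = true, X = false
#   multiple choice: "( 2 ) 3. stem (1) a (2) b (3) c (4) d"
TF_PATTERN = re.compile(r"^\(\s*([OX])\s*\)\s*\d+\.\s*(.+)$")
MC_PATTERN = re.compile(r"^\(\s*([1-4])\s*\)\s*\d+\.\s*(.+)$")
OPTION_SPLIT = re.compile(r"\s*\(([1-4])\)\s*")


def parse_line(line):
    """Return a normalized record for one question line, or None if it is not a question."""
    line = line.strip()
    tf_match = TF_PATTERN.match(line)
    if tf_match:
        return {
            "type": "true_false",
            "text": tf_match.group(2),
            "answer": tf_match.group(1) == "O",
        }
    mc_match = MC_PATTERN.match(line)
    if mc_match:
        # Splitting on a capturing group yields [stem, "1", opt1, "2", opt2, ...].
        parts = OPTION_SPLIT.split(mc_match.group(2))
        options = [part.strip() for part in parts[2::2]]
        if len(options) != 4:
            return None
        return {
            "type": "multiple_choice",
            "stem": parts[0].strip(),
            "options": options,
            "answer": int(mc_match.group(1)) - 1,  # zero-based index of the correct option
        }
    return None


sample_lines = [
    "Name: ____  Class: ____",
    "( O ) 1. Taiwan is an island.",
    "( 2 ) 3. Where does the sun rise? (1) west (2) east (3) north (4) south",
]
records = [rec for rec in map(parse_line, sample_lines) if rec is not None]
dataset_json = json.dumps(records, ensure_ascii=False, indent=2)
```

Passing `ensure_ascii=False` keeps Chinese question text readable in the saved JSON instead of escaping it.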

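After building the dataset, we built a GRU-based model and BERT-based models to answer elementary school social studies true-false and multiple-choice questions. With the BERT-based model we tried three different framings for multiple-choice questions: answering them as a four-way choice, answering them as a two-way choice, and treating them as true-false questions. We also built a simple model with Facebook's fastText as a baseline for these four models, and we compare all five models on accuracy, training time, resource consumption, and parameter storage size.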
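One plausible way to turn a single multiple-choice record into training instances for the three framings is sketched below. The abstract does not say how the thesis constructs these instances, so the record fields (`stem`, `options`, `answer`), the function names, and in particular the pairing rule used for the two-way framing are assumptions made for illustration.

```python
from itertools import combinations


def as_four_way(question):
    """Four-way choice: one instance with four (stem, option) pairs and the correct index."""
    stem = question["stem"]
    return {
        "pairs": [(stem, option) for option in question["options"]],
        "label": question["answer"],
    }


def as_two_way(question):
    """Two-way choice (assumed rule): pair the correct option with each wrong option."""
    stem, options, answer = question["stem"], question["options"], question["answer"]
    instances = []
    for i, j in combinations(range(len(options)), 2):
        if answer in (i, j):
            instances.append({
                "pairs": [(stem, options[i]), (stem, options[j])],
                "label": 0 if answer == i else 1,  # position of the correct option in the pair
            })
    return instances


def as_true_false(question):
    """True-false: each (stem, option) pair becomes its own statement labeled 1 or 0."""
    stem = question["stem"]
    return [
        {"pair": (stem, option), "label": int(index == question["answer"])}
        for index, option in enumerate(question["options"])
    ]


example = {
    "stem": "Where does the sun rise?",
    "options": ["west", "east", "north", "south"],
    "answer": 1,
}
four_way = as_four_way(example)
two_way = as_two_way(example)
true_false = as_true_false(example)
```

Under these assumptions, one four-option question yields one four-way instance, three two-way instances, and four true-false instances, so the framings differ in how many training examples each question produces.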

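Finally, we analyzed the questions the models answered incorrectly. We found that the models are more likely to err on multiple-choice questions whose options contain a phrase of the form "以上皆◯" (such as "all of the above" or "none of the above"), and that model accuracy is related to the presence of rare vocabulary in the question text and to the size of the question bank used for training. We use these findings as a feasibility assessment for improving model accuracy in future work.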
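An error analysis of this kind can be sketched by splitting questions into those whose options contain "以上皆" and those that do not, then comparing accuracy between the two groups. The function name, record fields, and the toy questions and predictions below are invented for illustration only and are not results from the thesis.

```python
def accuracy_by_marker(questions, predictions, marker="以上皆"):
    """Return accuracy for questions with and without `marker` in any option."""
    counts = {True: [0, 0], False: [0, 0]}  # has_marker -> [correct, total]
    for question, predicted in zip(questions, predictions):
        has_marker = any(marker in option for option in question["options"])
        counts[has_marker][0] += int(predicted == question["answer"])
        counts[has_marker][1] += 1
    return {
        has_marker: (correct / total if total else None)
        for has_marker, (correct, total) in counts.items()
    }


# Toy data (made up): two questions with an "以上皆" option, one without.
toy_questions = [
    {"options": ["甲", "乙", "丙", "以上皆是"], "answer": 3},
    {"options": ["甲", "乙", "丙", "以上皆非"], "answer": 0},
    {"options": ["甲", "乙", "丙", "丁"], "answer": 2},
]
toy_predictions = [0, 0, 2]
result = accuracy_by_marker(toy_questions, toy_predictions)
```

The same grouping idea extends to other suspected factors, for example bucketing questions by whether they contain words that never appear in the training set.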

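1 Introduction
1.1 Motivation
1.2 Contributions
1.3 Results
2 Background
2.1 Word Segmentation
2.2 Deep Neural Networks
2.3 LSTM and GRU
2.4 Attention
2.5 Transformer
2.6 BERT
2.7 AdamW
3 Related Work
4 Problem and Methods
4.1 Problem: Answering Elementary School Social Studies True-False and Multiple-Choice Questions
4.2 Methods
4.2.1 Dataset
4.2.2 Text Classification with the fastText Library
4.2.3 GRU-Based Model
4.2.4 BERT-Based Model
5 Experiments
5.1 Data Collection (True-False and Multiple-Choice Questions)
5.2 Dataset Format Used by the Models
5.3 Data Fed to the Models During Training
5.4 Models
5.4.1 GRU-Based Model
5.4.2 BERT-Based Model
5.5 Loss Function
5.6 Hardware, Training, and Storage
6 Results
6.1 Accuracy
6.2 Training Time
7 Conclusion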

