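Author: Ho Chang (張鶴)
Thesis title: A Study of Sentiment Analysis of Short Chinese Comments with Attribute Classification (具屬性分類之短文評論情緒分析研究)
Advisor: Guan-Ling Lee (李官陵)
Committee members: Shou-Chih Lo (羅壽之), Yao-Chung Chang (張耀中)
Degree: Master's
Institution: National Dong Hwa University (國立東華大學)
Department: Department of Computer Science and Information Engineering (資訊工程學系)
Student ID: 610521213
Year of publication (ROC calendar): 107 (2018)
Academic year of graduation: 106
Language: Chinese
Number of pages: 32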
Keywords: Sentiment Analysis (情緒分析); Short Chinese Comments (短文評論); Attribute Classification (屬性分類); Lexicon-based (字典法); Machine Learning (機器學習法)
Technology advances daily and techniques change constantly; we live in an age of information explosion. Communication between people has shifted from letters to social networking software and is no longer limited by distance. As a result, the Internet is filled with writing of every kind, from product reviews to journal papers, all available to browse and use online. People can no longer do without the Internet, yet finding useful knowledge in a sea of noisy data is not easy. For this reason, big data, data mining, and machine learning have drawn growing attention in recent years.
In light of this, this thesis aims to establish a multi-attribute sentiment analysis model for short comments. We first build a lexicon from the NTUSD and HowNet dictionaries. We then collect a large amount of training data, have its attributes labeled manually, and segment the short comments into words with the CKIP Chinese word segmentation system. The machine then analyzes the keywords of each attribute and the polarity of sentiment words; the results are used to build a keyword database and to expand the polarity entries of the positive and negative sentiment lexicons. To verify the proposed method, we compare manually assigned scores with the scores computed by our method. We then remove the naive Bayes classifier from the system, perform sentiment analysis with the original positive and negative lexicons only, and compare the results with those of the proposed method. The experimental results show that the proposed method achieves a certain degree of accuracy.
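The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the record gives no formulas, so the smoothed log-odds estimate, the threshold, the "most recent keyword" rule for assigning sentiment words to attributes, and every name and sample entry below (`SEED_LEXICON`, `ATTRIBUTE_KEYWORDS`, `TRAINING`, the three functions) are assumptions made for illustration. The sketch also assumes the comments are already segmented into words (the thesis uses CKIP for that step) and that each training comment carries a polarity label.

```python
import math
from collections import Counter

# Hypothetical seed polarity lexicon (stand-in for NTUSD / HowNet entries).
SEED_LEXICON = {"好吃": 1.0, "難吃": -1.0}

# Hypothetical attribute keyword table (stand-in for the keyword database).
ATTRIBUTE_KEYWORDS = {"餐點": "food", "服務": "service", "店員": "service"}


def word_log_odds(labeled_comments, alpha=1.0):
    """Naive-Bayes-style estimate: log P(w|pos) - log P(w|neg), Laplace-smoothed."""
    pos, neg = Counter(), Counter()
    for tokens, label in labeled_comments:
        (pos if label > 0 else neg).update(tokens)
    vocab = set(pos) | set(neg)
    pos_total = sum(pos.values()) + alpha * len(vocab)
    neg_total = sum(neg.values()) + alpha * len(vocab)
    return {
        w: math.log((pos[w] + alpha) / pos_total)
        - math.log((neg[w] + alpha) / neg_total)
        for w in vocab
    }


def expand_lexicon(seed, log_odds, exclude=(), threshold=0.5):
    """Add unseen words whose log-odds magnitude clears the threshold.

    Seed entries are never overwritten; words in `exclude` are skipped.
    """
    lexicon = dict(seed)
    for word, value in log_odds.items():
        if word in lexicon or word in exclude:
            continue
        if abs(value) >= threshold:
            lexicon[word] = 1.0 if value > 0 else -1.0
    return lexicon


def score_by_attribute(tokens, lexicon, attribute_keywords):
    """Sum sentiment-word polarities under the most recently seen attribute."""
    scores = {}
    current = "general"  # bucket for sentiment words seen before any keyword
    for token in tokens:
        if token in attribute_keywords:
            current = attribute_keywords[token]
            scores.setdefault(current, 0.0)
        elif token in lexicon:
            scores[current] = scores.get(current, 0.0) + lexicon[token]
    return scores


# Hypothetical pre-segmented training comments with polarity labels (+1 / -1).
TRAINING = [
    (["餐點", "好吃", "服務", "親切"], 1),
    (["餐點", "難吃", "店員", "冷淡"], -1),
    (["服務", "親切"], 1),
    (["店員", "冷淡"], -1),
]

lexicon = expand_lexicon(
    SEED_LEXICON, word_log_odds(TRAINING), exclude=ATTRIBUTE_KEYWORDS
)
print(score_by_attribute(["餐點", "好吃", "但", "店員", "冷淡"], lexicon, ATTRIBUTE_KEYWORDS))
# prints {'food': 1.0, 'service': -1.0}
```

In this toy data, 親切 and 冷淡 are absent from the seed lexicon and are added by the expansion step, which is what lets the second clause of the test comment receive a score. Skipping the expansion and scoring with `SEED_LEXICON` alone mirrors, in spirit, the comparison the abstract describes between the full system and the original lexicons.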
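Chapter 1 Introduction
Chapter 2 Related Work
Section 1 Natural Language Processing
Section 2 Sentiment Lexicons
Section 3 Chinese Word Segmentation Systems
Chapter 3 Methodology
Section 1 Data Collection
Section 2 Sentence Splitting
Section 3 Feature Word Extraction
Section 4 Building the Sentiment Lexicon
Section 5 Computing Sentiment Scores
Section 6 Converting Sentiment Scores
Chapter 4 Experimental Results
Section 1 Experimental Evaluation
Section 2 Experimental Results
Section 3 Comparison of Manual Scoring and System Output
Section 4 System Comparison Experiment
Section 5 Experimental Conclusions
Chapter 5 Future Work
Chapter 6 References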