Author (English): Zi-Hui Lin
Thesis title (English): Detection of Mood State of People with Depression: Analyzing PTT Bulletin Board System Articles by Machine Learning
Advisor (English): Shih-Kuang Chiang
Committee members (English): Wan-Lan Chen, Shiau-Hua Liu
Keywords (English): Depression; Mood; Suicide ideation; Social media; Machine learning
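The results show that the mood state detection model reached an AUC (Area Under the Receiver Operating Characteristic Curve) of .889. The suicide ideation detection model reached an AUC of .964 but an AUPRC (Area Under the Precision-Recall Curve) of only .315, indicating that its ability to detect users with suicide ideation is low. Nevertheless, because no comparable research has yet been conducted in Taiwan, this preliminary exploration can serve as a reference. The analysis of textual features showed clear differences between people with and without depressive tendencies in posting time, frequency of personal pronoun use, and word count. These differences were larger in the happy mood state and were not evident when suicide ideation was expressed. In the future, the results may be applied to early detection, to collecting information for psychological assessment, and to tracking mood changes after treatment, in support of clinical decision-making. However, because appropriate and sufficient suicide-related text data were difficult to obtain, the related findings await further examination and improvement in follow-up research.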
The purpose of this study was to better understand people with depressive tendencies by analyzing social media data, particularly data likely to come from users who have never been in contact with psychiatric services. A further aim was to use machine learning techniques to improve the efficiency of psychiatric clinical practice.
The data comprised 92,273 articles from the internet forum “PTT”, of which 365 were annotated as expressing suicide ideation. Machine learning techniques were used to build a “Mood state detection model” (detecting happy versus sad mood) and a “Suicide ideation detection model” for people with depressive tendencies. In addition, independent t-tests and chi-square tests were used to compare posting time, frequency of personal pronoun use, and word count between people with and without depressive tendencies in the general, happy, sad, and suicide-ideation states.
The mood state detection model achieved an AUC (Area Under the Receiver Operating Characteristic Curve) of .889. The suicide ideation detection model achieved an AUC of .964 but an AUPRC (Area Under the Precision-Recall Curve) of only .315, which means the model has low predictive ability for suicide ideation among people with depressive tendencies. Because no related research has yet been conducted in Taiwan, however, this preliminary exploration can serve as a reference for future work. The statistical results showed significant differences between users with and without depressive tendencies in posting time, frequency of personal pronoun use, and word count. The differences were largest when users expressed happy feelings and were not significant when users expressed suicide ideation. These results could support early detection, help collect information for psychological assessment, and help track mood trends after treatment, thereby aiding clinical decision-making. However, this study could not obtain enough representative data on suicide ideation, so the related results need to be examined further and improved by follow-up research.
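The gap between a high AUC and a low AUPRC is typical when the positive class is rare (here, 365 annotated suicide-ideation articles in a corpus of 92,273, although the abstract does not state the class ratio of the evaluation set). The sketch below is a minimal illustration on synthetic rankings, not the thesis data or the author's code. It assumes average precision as the AUPRC estimate, which the abstract does not specify, and the names `roc_auc`, `average_precision`, and `positive_ranks` are hypothetical.

```python
def roc_auc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Average precision: mean of precision@k over the ranks k of the positives.

    Used here as a step-wise estimate of the area under the PR curve.
    """
    ranked = sorted(zip(scores, labels), key=lambda t: t[0], reverse=True)
    hits = 0
    total = 0.0
    for k, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / k
    return total / hits


# Synthetic ranking (an assumption for illustration): 1,000 items, 10 positives
# (1% prevalence). The positives sit at ranks 5, 15, ..., 95, i.e. all within
# the top 10% of the list, but each is surrounded by negatives.
n_items = 1000
positive_ranks = set(range(5, 100, 10))
labels = [1 if r in positive_ranks else 0 for r in range(1, n_items + 1)]
scores = [1.0 - r / n_items for r in range(1, n_items + 1)]  # rank 1 scores highest

auroc = roc_auc(labels, scores)
auprc = average_precision(labels, scores)
print(f"AUROC = {auroc:.3f}, AUPRC (average precision) = {auprc:.3f}")
```

In this toy ranking almost every negative scores below every positive, so the AUROC is high, yet most items flagged near the top are still negatives, so precision and therefore the AUPRC stay low. This is the same qualitative pattern as the reported .964 versus .315.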
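Chapter 1 Introduction
1-1 Research Background and Motivation
1-2 Research Purpose and Hypotheses
1-3 Definition of Terms
Chapter 2 Literature Review
2-1 Applying Machine Learning to Social Media Data in Clinical Psychology
2-2 Detecting Mood State from Text Data with Machine Learning
2-3 Detecting Suicide Ideation from Text Data with Machine Learning
2-4 Cross-National Comparison of Textual Feature Differences between People with and without Depressive Tendencies
Chapter 3 Methods
3-1 Text Data Description and Preprocessing
3-2 Steps for Building the “Mood State Detection Model” and the “Suicide Ideation Detection Model”
3-3 Analysis of Textual Feature Differences between People with and without Depressive Tendencies
Chapter 4 Results
4-1 Mood State Detection Model
4-2 Textual Feature Differences between People with and without Depressive Tendencies
4-3 Textual Feature Differences between People with and without Depressive Tendencies When Expressing Happy and Sad Mood States
4-4 Suicide Ideation Detection Model
4-5 Textual Feature Differences between People with and without Depressive Tendencies When Expressing Suicide Ideation
Chapter 5 Discussion
5-1 Mood State Detection Model and Related Textual Feature Analysis
5-2 Suicide Ideation Detection Model and Related Textual Feature Analysis
5-3 Research Contributions and Future Applications
5-4 Research Limitations and Future Research Directions
Chapter 6 Conclusion
References
Appendix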
