Author (English): Chia-Yu Kuo
Thesis Title (English): Using Compound Attributes to Identify Sockpuppets in a Forum-based Social Media Website
Advisor (English): Ying-Ho Liu
Oral Defense Committee (English): Yao-Tang Lin
Jia-Li Hou
Keywords (English): Forum-based Social Media Website; Sockpuppets; Account Identification
As the Internet has flourished, the use of social media has become an integral part of our daily life. It is observed that a user may own multiple accounts (known as sockpuppets) in a social media website (especially a forum-based one) to advertise products, spread junk information, arouse controversy, etc. Identifying multiple accounts registered by a single user is a key step in solving this problem. However, most existing studies focus on identifying multiple accounts owned by a single user across several social media websites, instead of within a single website. To address this research gap, we propose the SiMAIM identification method to find the sockpuppets of a user in a forum-based social media website. We verify SiMAIM on data collected from Mobile01, which is the largest forum-based social media website in Taiwan. By collecting basic account information and posting content, and by constructing the social network of accounts, SiMAIM effectively identifies sockpuppets created by a single user.
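Chapter 1 Introduction
1.1 Research Background and Motivation
1.2 Research Problem and Objectives
Chapter 2 Literature Review
2.1 User Identification across Social Media
2.2 Detecting Spammers in Social Media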
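Chapter 3 Research Method
3.1 Research Framework
3.2 Data Collection
3.3 Attributes Related to Basic Account Information
3.4 Attributes Related to Post Content
3.4.1 Stop Word Removal
3.4.2 Chinese Word Segmentation
3.4.3 Term Frequency Inverse Document Frequency
3.4.4 LDA
3.5 Attributes Related to the Social Network
3.6 Identifying Sockpuppets
3.6.1 Linear Summation
3.6.2 Classification Model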
Chapter 4 Experimental Results
4.1 Experimental Environment
4.2 Experimental Data
4.3 Data Processing
4.4 Evaluation Metrics
4.5 Experimental Results
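4.5.1 Results of Linear Summation
Results of Linear Summation on the First Test Dataset
Results of Linear Summation on the Second Test Dataset
Results of Linear Summation on the Third Test Dataset
4.5.2 Results of the Classification Model
Results of the Classification Model on the First Test Dataset
Results of the Classification Model on the Second Test Dataset
Results of the Classification Model on the Third Test Dataset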
4.6 Comparison Experiments
4.6.1 Results of KLD on the First Test Dataset
4.6.2 Results of KLD on the Second Test Dataset
4.6.3 Results of KLD on the Third Test Dataset
Chapter 5 Conclusion
References
