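Author: Wan-Qing Hsu (許琬晴)
Thesis title: A Web Visitor Behavior Analysis of a University Website Based on Big Data Analytics Techniques (以大數據分析技術發展大學網站到訪者之行為分析)
Advisor: Chung Yung (雍忠)
Committee members: Yu-Lan Yuan (原友蘭), Guan-Ling Lee (李官陵)
Degree: Master's
Institution: National Dong Hwa University (國立東華大學)
Department: Department of Computer Science and Information Engineering (資訊工程學系)
Student ID: 610821209
Year of publication: 2021 (ROC year 110)
Graduation academic year: 109 (ROC calendar)
Language: English
Pages: 79
Keywords: Big data (大數據); Web mining (網路探勘); Discriminant analysis (區別分析)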
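Abstract (translated from the Chinese):
The goal of this thesis is to use big data analytics techniques to find the relationship between the number of university applications and the web logs of browsing behavior on a university department website. To find this relationship, we divide the countries into three categories and apply discriminant analysis. We use data from the previous four application periods to verify the application results for the 2021 spring semester, with an accuracy of 79.6%. Finally, we draw conclusions based on feasible promotion strategies.

In this thesis, we use the analysis results to divide the countries of the website's visitors into three categories: 1) number of applications = 0, 2) number of applications = 1, and 3) number of applications > 1. Our method has three phases. First, we use Incidental and Frequent User (IFU) analysis to divide all visitors into two groups: frequent visitors and incidental visitors. Second, we define seven influential factors, combine them with the IFU results, and compute their impact in the Visitor Browsing Behaviors (VBB) analysis. Finally, we perform discriminant analysis on the results of the VBB analysis and identify the factors with the larger impact.

We analyze the behavior patterns of the spring and fall semesters as well as the overall behavior pattern. We run our experiments on data from the 2019 spring semester through the 2020 fall semester, 11,182,613 records and 2,246,882 visitors in total. We use these data to predict the application-count category for the 2021 spring semester. Finally, we use the overall visitor behavior analysis results of the four semesters to predict each country's application-count category, obtaining an accuracy of 79.6%. When only the 2019 and 2020 spring semester data are used to predict the 2021 spring semester results, the accuracy drops to 77.7%.

We hope that once the prediction is available, we can know right away the likely application outcome for each country at the end of the application period. If visits from a particular country fall short of expectations, we can use the remaining two months of the application period to promote in that country through online advertisements or videos. In the future, we hope to add data from other universities or departments and use results from different schools or from the same department to observe differences in browsing conditions and promotion strategies.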
Abstract (English):
The goal of this thesis is to use big data analytics techniques to analyze the relationship between the number of applications from various countries and the web logs of browsing behavior on a university department website. To explore this relationship, we divide the countries into three groups and apply discriminant analysis. We use data from the previous four semesters to build a model and verify it against the results of the 2021 spring semester; the experimental accuracy is 79.6%. Based on these results, we propose feasible promotion strategies and draw conclusions.

In this thesis, we use the analysis results to classify the countries from which the website's visitors come into three categories: 1) countries from which we receive no applications (= 0), 2) countries from which we receive exactly one application (= 1), and 3) countries from which we receive more than one application (> 1). Our methodology has three phases. First, we use Incidental and Frequent User (IFU) analysis to classify all visitors into two groups, namely frequent visitors and incidental visitors. Second, we define seven influential factors, combine them with the IFU results, and measure their impact in the Visitor Browsing Behaviors (VBB) analysis. Finally, we perform discriminant analysis on the influential factors that prove to be the best indicators in the VBB analysis.
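The two groupings described above can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the record does not state how IFU analysis separates frequent from incidental visitors, so the `min_visits` threshold, the function names, and all sample data below are assumptions made purely for illustration.

```python
from collections import Counter


def application_category(n_applications: int) -> int:
    """Map a country's application count to category 1 (= 0), 2 (= 1), or 3 (> 1)."""
    if n_applications < 0:
        raise ValueError("application count cannot be negative")
    if n_applications == 0:
        return 1
    if n_applications == 1:
        return 2
    return 3


def split_visitors(visit_log, min_visits=2):
    """Split visitor IDs into (frequent, incidental) sets.

    ASSUMPTION: a visitor counts as frequent after `min_visits` log entries.
    This threshold is hypothetical and is not the thesis's IFU definition.
    """
    counts = Counter(visit_log)
    frequent = {visitor for visitor, n in counts.items() if n >= min_visits}
    incidental = set(counts) - frequent
    return frequent, incidental


# Hypothetical sample data, not taken from the thesis.
applications = {"Country A": 3, "Country B": 1, "Country C": 0}
categories = {name: application_category(n) for name, n in applications.items()}

visit_log = ["v1", "v2", "v1", "v3", "v1", "v2"]  # one entry per logged visit
frequent, incidental = split_visitors(visit_log)

print(categories)
print(sorted(frequent), sorted(incidental))
```

In this sketch the country categories serve as the class labels for the later discriminant analysis, while the frequent/incidental split is one input to the visitor behavior features.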

We analyze the behavior patterns of the spring semesters and the fall semesters as well as the overall behavior pattern. Our experiments use data from the 2019 spring semester through the 2020 fall semester, a total of 11,182,613 records and 2,246,882 visitors. We use these data to predict the application-count category of each country for the 2021 spring semester. Finally, we use the overall visitor behavior analysis of the four semesters to predict the classification of countries by their number of applications. This yields an accuracy of 79.6%, whereas the accuracy drops to 77.7% when only the 2019 and 2020 spring semester data are used.
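As a rough illustration of how an accuracy figure of this kind can be computed, the sketch below compares predicted and actual country categories. The label lists are hypothetical and do not reproduce the thesis's data or its 79.6% result; the function names are likewise assumptions.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Fraction of countries whose predicted category equals the actual one."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("label lists must be non-empty and of equal length")
    hits = sum(a == p for a, p in zip(actual, predicted))
    return hits / len(actual)


def confusion_counts(actual, predicted):
    """Count (actual, predicted) category pairs to show where errors fall."""
    return Counter(zip(actual, predicted))


# Hypothetical categories for ten countries (1: = 0, 2: = 1, 3: > 1 applications).
actual = [1, 1, 2, 3, 3, 1, 2, 3, 1, 1]
predicted = [1, 1, 2, 3, 1, 1, 3, 3, 1, 1]

print(f"accuracy = {accuracy(actual, predicted):.1%}")
for (a, p), n in sorted(confusion_counts(actual, predicted).items()):
    print(f"actual {a}, predicted {p}: {n}")
```

The pair counts make it visible which categories are confused with each other, which a single accuracy figure hides.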

We hope that once the prediction results are available, we can immediately know the likely application status of each country at the end of the application period. If the visits from a certain country during the semester fall short of expectations, we can use online advertisements and videos to promote in that country during the remaining two months of the application period. In the future, we hope to add data from other universities or departments and to compare results across different universities, or for the same department, in order to observe differences in browsing conditions and publicity strategies.
1 Introduction
2 Background
2.1 Big Data
2.2 Big Data Analytics Architecture
2.3 Web Mining
2.4 Discriminant Analysis
3 Visitor Classification
3.1 Overall Analysis Framework
3.2 Incidental and Frequent User Analysis
3.2.1 Definition
3.2.2 Analysis Algorithm
3.3 Seven Influential Factors
3.4 Visitor Browsing Behavior Analysis: Based on Seven Influential Factors
4 Visitor Behavior Analysis
4.1 Combinations of Independent Variables
4.2 Partial Least Squares Regression Analysis
4.3 Discussion on PLS Regression
4.4 Preparation for the Discriminant Analysis
4.5 Discriminant Analysis in SPSS
5 Strategy Based on Analysis Results
5.1 Strategy of Pattern B
5.1.1 Case 1: Indonesia
5.1.2 Case 2: Kyrgyzstan
5.1.3 Case 3: Malawi
5.1.4 Case 4: Mongolia
5.1.5 Case 5: Pakistan
5.1.6 Case 6: South Africa
5.2 Strategy of Pattern C
5.2.1 Case 1: Gambia
5.2.2 Case 2: Haiti
5.2.3 Case 3: Nigeria
6 Discussion and Conclusion
6.1 Program Running
6.2 Analysis Result
6.3 Conclusion
(Full text open to external access after 2024-10-24)