

Author (English): Meng-Yuan Wu
Thesis title (English): A New Web Visitor Behavior Analysis Based on Big Data Analytics Techniques
Advisor (English): Chung Yung
Committee members (English): Yu-Lan Yuan, Wuu Yang
Keywords (English): Big Data Analysis; Web Visitor Analysis
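Much research and discussion on big data has proposed various analysis methods for big data from sources in different domains. A paper published by Professor Yung formally proves that web log data is big data. This thesis goes further by applying big data analysis techniques to web log data and developing a new behavior analysis for web visitors.

According to the discussion of Professors Li, Yuan, and Yung, web visitors can be classified into frequent visitors and incidental visitors. Event promotion can bring more new visitors to a website, and many of these new visitors become actual attendees of the event. This classification into frequent and incidental visitors motivated our research. Using big data analysis techniques, this thesis develops a new analysis method, called IFU, that classifies web visitors into two categories: frequent and incidental.

We propose a complete framework for the work that IFU analysis requires, and we develop the IFU analysis program according to it. The framework has five phases: (1) definition; (2) design of the intermediate (meta) data; (3) design of algorithm PM, which computes the intermediate data; (4) design of algorithm PC according to the classification criteria; and (5) execution of PC to classify the web visitors.

In this thesis, we experiment with web log data from January 1, 2016 to December 31, 2017. These two years of web log data contain 1,092,370,562 records. Over the two full years there are 7,185,433 visitors in total. Our IFU analysis program successfully classifies them into 551,614 frequent visitors and 6,633,819 incidental visitors.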
Many studies and discussions of big data have proposed analysis methods for big data from various sources. A paper published by Professor Yung formally shows that web log data is big data. In this thesis, we apply big data analysis techniques to web log data and develop a new behavior analysis for web visitors.

According to the discussion of Professors Li, Yuan, and Yung, web visitors can be classified into frequent visitors and incidental visitors; event promotion can bring more new incidental visitors to a website, and some of these new visitors may become actual attendees. This classification of frequent and incidental visitors motivates our research. This thesis uses big data analysis techniques to develop a new analysis method, called IFU, which classifies web visitors into two categories: frequent visitors and incidental visitors.

For the work required to develop the IFU program, we present a five-phase framework: 1. Definition phase; 2. Meta data design; 3. Design of algorithm PM to compute the meta data; 4. Design of algorithm PC according to the classification criteria; 5. Execution of PC to classify the web visitors.

In this thesis, we use web log data from January 1, 2016 to December 31, 2017 for our experiments. Over these two full years, there are 1,092,370,562 records in total, from 7,185,433 distinct web visitors. Our IFU analysis successfully classifies these visitors into 551,614 frequent visitors and 6,633,819 incidental visitors.
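The abstract does not state the classification criterion or the layout of the meta data, so the following Python sketch only illustrates the shape of phases 3 to 5 under assumptions. The record layout `(visitor_id, visit_date)`, the function names `compute_meta_data` and `classify_visitors`, and the threshold `min_active_days` are all hypothetical stand-ins for PM and PC, not the thesis's actual definitions. The last lines check that the visitor counts reported in the abstract are mutually consistent.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sample records as (visitor_id, visit_date) pairs.
# The real log schema is not given in the abstract.
SAMPLE_RECORDS = [
    ("a", date(2016, 1, 1)),
    ("a", date(2016, 1, 1)),   # a second record on the same day
    ("a", date(2016, 3, 5)),
    ("a", date(2017, 2, 11)),
    ("b", date(2016, 6, 9)),
    ("c", date(2017, 2, 11)),
    ("c", date(2017, 2, 12)),
]


def compute_meta_data(records):
    """Stand-in for PM: map each visitor to the set of distinct days seen."""
    days_by_visitor = defaultdict(set)
    for visitor_id, visit_date in records:
        days_by_visitor[visitor_id].add(visit_date)
    return dict(days_by_visitor)


def classify_visitors(meta_data, min_active_days=3):
    """Stand-in for PC: 'frequent' if active on at least min_active_days days.

    The threshold is an assumed criterion chosen for illustration only.
    """
    return {
        visitor_id: "frequent" if len(days) >= min_active_days else "incidental"
        for visitor_id, days in meta_data.items()
    }


meta_data = compute_meta_data(SAMPLE_RECORDS)
labels = classify_visitors(meta_data)

# Figures reported in the abstract.
REPORTED_FREQUENT = 551_614
REPORTED_INCIDENTAL = 6_633_819
REPORTED_VISITORS = 7_185_433

# The two categories should partition the set of distinct visitors.
totals_consistent = REPORTED_FREQUENT + REPORTED_INCIDENTAL == REPORTED_VISITORS
frequent_share = REPORTED_FREQUENT / REPORTED_VISITORS
```

On the sample data, visitor `a` is labeled frequent and visitors `b` and `c` incidental. The reported counts do sum to the reported total, and frequent visitors make up roughly 7.7% of all distinct visitors.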
Abstract
Chapter 1 Introduction and Motivation
Chapter 2 Background and Literature Review
2.1 Big Data Analysis
2.2 Web-Log-Based Big Data Analysis in Tourism
2.3 Web Mining
2.4 Web-Log-Based Analysis
Chapter 3 Big Data Analysis for Classifying Visitors into Categories
3.1 Meta Data Design
3.2 Classification Program Design
Chapter 4 Experiments and Implementation
4.1 Implementation
4.2 Discussion
Chapter 5 Case Study: Attendee Prediction
5.1 Application 1: Visitor Statistics by Category for the 2017 Taiwan Lantern Festival
5.2 Application 2: Visitor Statistics by Category for the 2018 Taiwan Lantern Festival
5.3 Linear Analysis
5.4 Discussion
Chapter 6 Conclusion
References
[1] Chung Yung. Mining massive web log data of an official tourism web site as a step towards big data analysis in tourism. In Proceedings of the ASE BigData & SocialInformatics 2015, ASE BD&SI ’15, pages 62:1–62:4, New York, NY, USA, 2015. ACM.
[2] Chung Yung, Ching Li, and Yu-Lan Yuan. IFU method of website visitor analysis using big data analytics techniques. An early draft, to be published soon, May 2018.
[3] Fatima-Zahra Benjelloun, Ayoub Ait Lahcen, and Samir Belfkih. An overview of big data opportunities applications and tools.
[4] Data Science and Big Data Analytics: Discovering, Analyzing, Visualizing and Presenting Data. John Wiley and Sons, Inc., 2015.
[5] Waleed Albattah. The role of sampling in big data analysis. In Proceedings of the International Conference on Big Data and Advanced Wireless Technologies, BDAW ’16, pages 28:1–28:5, New York, NY, USA, 2016. ACM.
[6] N. Cassavia, P. Dicosta, E. Masciari, and D. Saccà. Improving tourist experience by big data tools. In 2015 International Conference on High Performance Computing & Simulation (HPCS), pages 553–556, July 2015.
[7] Seebode C, Ort M, Laat C, Regenbrecht C, and Peuker M. Big data infrastructures for pharmaceutical research. Big Data-2013 IEEE International Conference in Silicon Valley, 2013.
[8] Sanaz Shafiee and Ali Rajabzadeh Ghatari. Big data in tourism industry.
[9] Raymond Kosala and Hendrik Blockeel. Web mining research: A survey. SIGKDD Explor. Newsl., 2(1):1–15, June 2000.
[10] T. Hussain, S. Asghar, and N. Masood. Web usage mining: A survey on preprocessing of web log file. In 2010 International Conference on Information and Emerging Technologies, pages 1–6, June 2010.
[11] R. Cooley, B. Mobasher, and J. Srivastava. Web mining: information and pattern discovery on the world wide web. In Proceedings Ninth IEEE International Conference on Tools with Artificial Intelligence, pages 558–567, Nov 1997.
[12] D. S. Sisodia and S. Verma. Web usage pattern analysis through web logs: A review. In 2012 Ninth International Conference on Computer Science and Software Engineering (JCSSE), pages 49–53, May 2012.
[13] Jaideep Srivastava, Robert Cooley, Mukund Deshpande, and Pang-Ning Tan. Web usage mining: Discovery and applications of usage patterns from web data. SIGKDD Explor. Newsl., 1(2):12–23, January 2000.
[14] Chia-Ching Chen. Big data analysis for extract frequent webpage combination from weblog using apriori algorithm. Master’s thesis, National Dong Hwa University, July 2018.
[15] G. Chareyron, J. Da-Rugna, and T. Raimbault. Big data: A new challenge for tourism. In 2014 IEEE International Conference on Big Data (Big Data), pages 5–7, Oct 2014.
[16] H. Yin and Y. Zhu. The influence of big data and informatization on tourism industry. In 2017 International Conference on Behavioral, Economic, Socio-cultural Computing (BESC), pages 1–5, Oct 2017.
[17] John Bertot and Heeyoon Choi. Big data and e-government: Issues, policies, and recommendations. In ACM International Conference Proceeding Series, pages 1–10, June 2013.