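Author: 楊宗翰
Author (English): Zong-Han Yang
Title: 普及化串流大數據樣式偵測與分析服務及動態分享機制研究
Title (English): Streaming Big Data Pattern Analytics Service Popularization and Dynamic Query Sharing
Advisor: 吳秀陽
Advisor (English): Shiow-Yang Wu
Oral examination committee: 孫宗瀛, 謝鴻琳
Oral examination committee (English): Tsung-Ying Sun, Horng-Lin Shieh
Degree: Master's
Institution: National Dong Hwa University
Department: Department of Computer Science and Information Engineering
Student ID: 610821245
Year of publication (ROC calendar): 112 (2023)
Academic year of graduation: 111
Language: Chinese
Number of pages: 75
Keywords: Internet of Things; streaming big data; pattern detection; service popularization; dynamic sharing
Keywords (English): IoT; Streaming big data; Pattern detection; Service popularization; Dynamic sharing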
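As new IoT devices and network service apps continue to appear, streaming data and its applications are flourishing. The vast majority of these applications need to dynamically analyze and recognize the changing patterns and future trends of streaming data in order to trigger appropriate responses or services in real time. However, current streaming pattern detection tools require a certain level of expertise and are hard for ordinary users to operate. Many tools also stop at numerical detection and classification over a small, limited number of sources. For complex trends that combine multiple sources, customized detection and analysis is usually possible only for users who can program.
The first goal of this thesis is to propose a multi-source combined streaming pattern detection and analysis service that the general public can use very easily, without expertise in streaming data processing or programming ability. We extend SIFTTT [16], proposed by Henry Gunawan, a senior member of our laboratory, to provide a variety of commonly used streaming pattern analysis and combination recognition services.
When the service demands of a large number of users involve the same streaming data sources and identical or compatible detection conditions, there is an opportunity to share the analysis service, which avoids duplicate computation and improves detection efficiency. Most existing multi-query sharing methods still rely on batch processing, or perform matching and computation to look for results only when a request arrives. These methods are not suitable for dynamic streaming data processing and query sharing.
The second goal of this thesis is to design a high-performance dynamic multi-query sharing mechanism for streaming data. Drawing on the A-Tree method proposed by Shuping Ji and Hans-Arno Jacobsen [33], we propose a new query processing and sharing network architecture, called Query Sharing Forest (QS-Forest), in which the result of each condition check automatically spreads to all related nodes, avoiding repeated detection. We also propose a Switch mechanism: a value from a given source needs to be checked only once to determine whether every query condition on that source holds. Combined with the dynamic sharing mechanism, the satisfied conditions are spread efficiently so that every query can be evaluated in real time.
Implementation and experiments from multiple perspectives confirm that the proposed architecture and methods successfully popularize streaming big data pattern analysis services while achieving dynamic query sharing and effectively improving performance.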


With the proliferation of IoT devices and web apps, streaming data technologies and applications are booming. Most of these applications need to dynamically analyze and identify changing patterns and future trends in streaming data so that appropriate responses or services can be triggered in real time. However, current streaming pattern detection tools require a certain degree of professional knowledge to master and are difficult for the general public to use. Moreover, many tools handle only numerical detection and classification over a small, limited number of sources. For complex trends that combine multiple sources, customized detection and analysis can usually be performed only by users with programming skills.
The first goal of this thesis is to propose a multi-source combined streaming pattern detection and analysis service that the general public can use easily, without streaming data processing expertise or programming skills. We extend SIFTTT [16], proposed by Henry Gunawan, to provide a variety of commonly used stream pattern analysis and combination identification services.
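To make the idea of a multi-source combined pattern concrete, the following is a minimal Python sketch. It is not the thesis's implementation: the class `TrendDetector`, the function `combined_rule`, the window size, and the example rule "temperature rising while humidity falling" are all hypothetical names and choices made only for illustration.

```python
from collections import deque


class TrendDetector:
    """Tracks the last `window` readings of one source and reports a simple trend.

    Hypothetical helper for illustration; not taken from the thesis.
    """

    def __init__(self, window=3):
        self.values = deque(maxlen=window)

    def update(self, value):
        """Add a reading and return 'rising', 'falling', or 'none'."""
        self.values.append(value)
        if len(self.values) < self.values.maxlen:
            return "none"  # not enough readings to judge a trend yet
        vals = list(self.values)
        pairs = list(zip(vals, vals[1:]))
        if all(a < b for a, b in pairs):
            return "rising"
        if all(a > b for a, b in pairs):
            return "falling"
        return "none"


def combined_rule(readings, window=3):
    """Return the indices at which temperature is rising while humidity is falling.

    `readings` is a sequence of (temperature, humidity) pairs, one per time step.
    """
    temp, hum = TrendDetector(window), TrendDetector(window)
    hits = []
    for i, (t, h) in enumerate(readings):
        t_trend, h_trend = temp.update(t), hum.update(h)
        if t_trend == "rising" and h_trend == "falling":
            hits.append(i)
    return hits
```

In a service aimed at non-programmers, a rule of this kind would presumably be assembled from menu choices (source, pattern, combination) rather than written by hand; the sketch only shows what such a rule computes once it has been assembled.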
When serving a large number of users, queries whose stream data sources are the same and whose detection conditions are identical or compatible offer an opportunity to share analysis work, which avoids duplicate computation and improves detection efficiency. Most existing multi-query sharing methods still rely on batch processing, or perform comparison and computation to check for matching results only on demand. These methods are not suitable for dynamic streaming data processing and query sharing.
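The sharing opportunity described here can be illustrated with a small Python sketch in which identical conditions registered by different queries are stored once and evaluated once per incoming reading. This is only an assumption-laden illustration: `SharedConditionPool`, its methods, and the `(source, op, threshold)` condition format are hypothetical and do not come from the thesis.

```python
import operator

# Comparison operators supported by this illustrative condition format.
OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge, "<=": operator.le}


class SharedConditionPool:
    """Evaluates each distinct (source, op, threshold) condition once per event.

    Hypothetical structure for illustration; not the thesis's data structure.
    """

    def __init__(self):
        self.subscribers = {}  # condition -> set of query ids that use it
        self.evaluations = 0   # counts how many comparisons were performed

    def register(self, query_id, source, op, threshold):
        """Attach a query to a condition, reusing the entry if it already exists."""
        key = (source, op, threshold)
        self.subscribers.setdefault(key, set()).add(query_id)

    def on_event(self, source, value):
        """Return the ids of all queries whose condition holds for this reading."""
        satisfied = set()
        for (src, op, threshold), queries in self.subscribers.items():
            if src != source:
                continue
            self.evaluations += 1
            if OPS[op](value, threshold):
                satisfied |= queries
        return satisfied
```

If two users both register `temp > 30`, the pool keeps one entry with two subscribers, so a new temperature reading triggers one comparison for that condition instead of two.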
The second goal of this thesis is to design a high-performance dynamic multi-query sharing mechanism for streaming data. Drawing on the A-Tree proposed by Shuping Ji and Hans-Arno Jacobsen [33], we propose a new query sharing network architecture, called Query Sharing Forest (QS-Forest), in which the result of each condition evaluation automatically spreads to all relevant nodes, avoiding repeated detection and condition evaluation. We also propose a Switch mechanism, which checks each value from a source only once to obtain the Boolean results of all query conditions on that source, and then uses the dynamic sharing mechanism to spread the satisfied conditions so that every relevant query can be evaluated in real time.
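One way to picture the Switch idea is the Python sketch below. It assumes, purely for illustration, that every condition has the form `value > threshold` and that every query is a conjunction of such conditions; under that assumption, a single binary search over the sorted thresholds of a source decides all of its conditions at once. The classes `Switch` and `ConjunctiveQueries` are hypothetical, and the flat set used for propagation is a stand-in for, not a reproduction of, the QS-Forest.

```python
from bisect import bisect_left


class Switch:
    """Holds all 'value > threshold' conditions of one source, sorted by threshold.

    Hypothetical simplification; the thesis's Switch may support other operators.
    """

    def __init__(self, thresholds):
        self.thresholds = sorted(set(thresholds))

    def evaluate(self, value):
        """Return every threshold t with value > t, using one binary search."""
        cut = bisect_left(self.thresholds, value)  # thresholds[:cut] are < value
        return set(self.thresholds[:cut])


class ConjunctiveQueries:
    """Queries that are ANDs of (source, threshold) conditions sharing the switches."""

    def __init__(self, queries):
        self.queries = queries  # query id -> set of (source, threshold) pairs
        by_source = {}
        for conds in queries.values():
            for source, threshold in conds:
                by_source.setdefault(source, []).append(threshold)
        self.switches = {s: Switch(ts) for s, ts in by_source.items()}
        self.true_conditions = set()  # conditions that currently hold

    def on_event(self, source, value):
        """Refresh all conditions on `source`, then report the satisfied queries."""
        switch = self.switches.get(source)
        if switch is not None:
            satisfied = switch.evaluate(value)
            # Drop stale results for this source, then spread the new ones.
            self.true_conditions = {
                c for c in self.true_conditions if c[0] != source
            }
            self.true_conditions |= {(source, t) for t in satisfied}
        return {
            q for q, conds in self.queries.items()
            if conds <= self.true_conditions
        }
```

The design point is that the cost of handling one reading depends on a binary search over the distinct thresholds of its source rather than on the number of queries that mention that source.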
Implementation and comprehensive experiments show that the proposed framework and methods successfully popularize streaming big data pattern analysis services while achieving dynamic query sharing, which effectively improves performance.
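Chapter 1 Introduction
Section 1 Research Background and Motivation
Section 2 Research Objectives and Methods
Section 3 Research Results
Section 4 Thesis Organization

Chapter 2 Related Work
Section 1 Internet of Things (IoT)
Section 2 Streaming Data
Section 3 Streaming Pattern Detection
Section 4 Streaming Data Sharing
Section 5 Popularization and SIFTTT
Section 6 Related Systems, Tools, and Languages
Section 6.1 Kafka
Section 6.2 Spark
Section 6.3 Play Framework
Section 6.4 IFTTT
Section 6.5 Scala

Chapter 3 Popularized Streaming Big Data Pattern Detection Service and Sharing Architecture
Section 1 Problem Description
Section 2 Streaming Pattern Extension and Processing
Section 3 Streaming Big Data Pattern Service Processing Architecture
Section 4 Introduction to Query Sharing Forest (QS-Forest) and Its Construction
Section 4.1 Introduction to Query Sharing Forest
Section 4.2 QS-Forest Construction Method
Section 5 Introduction to Switch and Its Construction
Section 5.1 Introduction to Switch
Section 5.2 Switch Construction Method and Mechanism

Chapter 4 Real-Time Streaming Pattern Analysis and Dynamic Sharing of Query Services
Section 1 Real-Time Streaming Pattern Analysis Service
Section 2 Dynamic Sharing Mechanism
Section 3 Pattern Detection Mechanism

Chapter 5 System Implementation Results and Performance Evaluation
Section 1 Experimental Environment
Section 2 Experimental Data
Section 3 Experimental Results
Section 3.1 QS-Forest Construction Results
Section 3.2 Notification Correctness
Section 3.3 Switch Correctness
Section 3.4 Pattern Capture Correctness
Section 3.5 Construction Time and Memory Usage for Different Numbers of Queries
Section 3.6 Effect of Query Length and Number of Queries on Construction Time
Section 3.7 Processing Time for Different Numbers of Queries
Section 3.8 Average Processing Time for Different Numbers of Queries and Degrees of Sharing

Chapter 6 Conclusions and Future Work
Section 1 Conclusions
Section 2 Future Work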


[1] Internet of Things: Definition, Applications, and Risks – NordVPN (in Chinese).
https://nordvpn.com/zh-tw/blog/wulianwang/ (accessed 2023/07/16)
[2] What Is the Internet of Things (IoT)? – TIBCO (in Chinese).
https://www.tibco.com/zh-hant/reference-center/what-is-industrial-internet-of-things-iiot (accessed 2023/07/16)
[3] What Is Streaming Data? – AWS (in Chinese).
https://aws.amazon.com/tw/streaming-data/ (accessed 2023/07/16)
[4] 王宜倫, A Brief Analysis of the Application of Streaming Data Technology to Big Data (in Chinese).
https://www.syscom.com.tw/ePaper_New_Content.aspx?id=456&EPID=209&TableName=sgEPArticle (accessed 2023/07/16)
[5] A. K. Gupta and R. Johari, “IOT based Electrical Device Surveillance and Control System,” 2019 4th International Conference on Internet of Things: Smart Innovation and Usages (IoT-SIU), pp. 1-5, Ghaziabad, India, 2019.
[6] Apache Kafka - Introduction
https://www.tutorialspoint.com/apache_kafka/apache_kafka_introduction.htm (accessed 2023/07/16)
[7] Apache Kafka
https://kafka.apache.org/ (accessed 2023/07/26)
[8] Apache Spark
https://spark.apache.org/ (accessed 2023/07/26)
[9] AWS - What is Spark
https://aws.amazon.com/tw/big-data/what-is-spark/ (accessed 2023/07/16)
[10] B. Yu, H. Wang, and Q. Wang, “Research and Application of Complex Event Processing Method Based on RDF Stream,” 2021 33rd Chinese Control and Decision Conference (CCDC), pp. 6303-6308, 2021.
[11] Bosch - IFTTT
https://www.bosch-easycontrol.com/gb/en/easycontrol/ifttt/ (accessed 2023/07/16)
[12] Daniel Cooley, 2022 Marks an Inflection Point for the Internet of Things (in Chinese).
https://www.eettaiwan.com/20220413nt71-2022-marks-an-inflection-point-for-the-internet-of-things/ (accessed 2023/07/16)
[13] Ferrari L., Valtolina S., and Mesiti M., “Developing IoT Spark-Streaming Applications by Means of Stream Loader,” in Ubiquitous Networking. UNet 2018. Lecture Notes in Computer Science, Boudriga N., Alouini MS., Rekhis S., Sabir E., Pollin S., Eds. Hammamet: Springer, Cham, vol. 11277, 2018.
[14] H. Isah, T. Abughofa, S. Mahfuz, D. Ajerla, F. Zulkernine, and S. Khan, “A Survey of Distributed Data Stream Processing Frameworks,” in IEEE Access, vol. 7, pp. 154300-154316, 2019.
[15] H. Röger, S. Bhowmik, and T. Linn, “A Framework for Decentralized Parallel Complex Event Processing on Heterogeneous Infrastructures,” 2021 IEEE International Conference on Big Data (Big Data), pp. 190-196, 2021.
[16] Henry Gunawan, SIFTTT-Highly Customizable and Accessible Streaming Services with IFTTT and Spark Streaming, Master's thesis, Department of Computer Science and Information Engineering, National Dong Hwa University, 2020.
[17] IFTTT
https://ifttt.com/explore (accessed 2023/07/26)
[18] Introduction to Play Framework in Scala
https://blog.knoldus.com/introduction-to-play-framework-in-scala/ (accessed 2023/07/16)
[19] J. Cao, H. Huang, and S. Qian, “CLOSED: A Cloud-Edge Dynamic Collaborative Strategy for Complex Event Detection,” 2022 IEEE International Conference on Web Services (ICWS), pp. 73-78, 2022.
[20] Medhabi Ray, Chuan Lei, and Elke A. Rundensteiner, “Scalable Pattern Sharing on Event Streams,” in Proceedings of the 2016 International Conference on Management of Data (SIGMOD '16), Association for Computing Machinery, New York, NY, USA, pp. 495–510, 2016.
[21] Mohammad Hossein Namaki, Keyvan Sasani, Yinghui Wu, and Tingjian Ge, “BEAMS: Bounded Event Detection in Graph Streams,” 2017 IEEE 33rd International Conference on Data Engineering (ICDE), 2017.
[22] nexocode – What is Apache Spark
https://nexocode.com/blog/posts/what-is-apache-spark/ (accessed 2023/07/16)
[23] O. Poppe, A. Rozet, C. Lei, E. A. Rundensteiner, and D. Maier, “Sharon: Shared Online Event Sequence Aggregation,” 2018 IEEE 34th International Conference on Data Engineering (ICDE), Paris, France, pp. 737-748, 2018.
[24] Olga Poppe, Chuan Lei, Lei Ma, Allison Rozet, and Elke A. Rundensteiner, “To Share, or not to Share Online Event Trend Aggregation Over Bursty Event Streams,” in Proceedings of the 2021 International Conference on Management of Data (SIGMOD '21), Association for Computing Machinery, New York, NY, USA, pp. 1452–1464, 2021.
[25] Panagiotis Liakos, Katia Papakonstantinopoulou, Alexandros Ntoulas, and Alex Delis, “Rapid Detection of Local Communities in Graph Streams,” in IEEE Transactions on Knowledge and Data Engineering, vol. 34, pp. 2375–2386, 2022.
[26] Play framework - Introduction to Play
https://www.playframework.com/documentation/2.8.x/HelloWorldTutorial (accessed 2023/07/16)
[27] S. Mahfuz, H. Isah, F. Zulkernine, and P. Nicholls, “Detecting Irregular Patterns in IoT Streaming Data for Fall Detection,” 2018 IEEE 9th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON), Vancouver, BC, pp. 588-594, 2018.
[28] S. Zhang, H. T. Vo, D. Dahlmeier, and B. He, “Multi-Query Optimization for Complex Event Processing in SAP ESP,” 2017 IEEE 33rd International Conference on Data Engineering (ICDE), San Diego, CA, USA, pp. 1213-1224, 2017.
[29] Sailesh Krishnamurthy, Chung Wu, and Michael Franklin, “On-the-fly sharing for streamed aggregation,” in Proceedings of the 2006 ACM SIGMOD International Conference on Management of Data (SIGMOD '06), pp. 623–634, 2006.
[30] Scala Play Framework
https://www.playframework.com/ (accessed 2023/07/26)
[31] Scala
https://www.scala-lang.org/ (accessed 2023/07/26)
[32] Scala - Tour of Scala
https://docs.scala-lang.org/tour/tour-of-scala.html (accessed 2023/07/16)
[33] Shuping Ji and Hans-Arno Jacobsen, “A-Tree: A Dynamic Data Structure for Efficiently Indexing Arbitrary Boolean Expressions,” in Proceedings of the 2021 International Conference on Management of Data (SIGMOD '21), 2021.
[34] Streaming Data: How it Works, Benefits, and Use Cases – Confluent.
https://www.confluent.io/learn/data-streaming/ (accessed 2023/07/16)
[35] T. T. Nguyen, T. T. Nguyen, T. C. Phan, Q. D. Nguyen, and Q. V. H. Nguyen, “Realtime Bushfire Detection with Spatial-based Complex Event Processing,” 2021 15th International Conference on Advanced Computing and Applications (ACOMP), pp. 1-8, 2021.
[36] Tatsuki Matsuda, Yuki Uchida, and Satoru Fujita, “Method of Complex Event Processing over XML Streams,” in Proceedings of the Second International Workshop on Exploratory Search in Databases and the Web (ExploreDB '15), pp. 21–26, 2015.
[37] Tiao Qian, Shiming Sun, Xin Shan, Xueyun Wei, Chunliang Tai, and Chao Liu, “Distributed-Swarm: A Real-Time Pattern Detection Model Based on Density Clustering,” in IEEE Access, vol. 10, pp. 59832–59842, 2022.
[38] Wang J., Ji B., Lin F., Lu S., Lan Y., and Cheng L., “A multiple pattern complex event detection scheme based on decomposition and merge sharing for massive event streams,” International Journal of Distributed Sensor Networks, 2020.
[39] Wang, Shihan, and Takao Terano, “Detecting rumor patterns in streaming social media,” 2015 IEEE International Conference on Big Data (Big Data), pp. 2709-2715, 2015.
[40] What is IFTTT

 
 
 
 