Process-Oriented Stream Classification Pipeline: A Literature Review

Clever, Lena; Pohl, Janina Susanne; Bossek, Jakob; Kerschke, Pascal; Trautmann, Heike

Research article (journal) | Peer reviewed

Abstract

Due to the rise of continuous data-generating applications, analyzing data streams has gained increasing attention over the past decades. A core research area in stream data is stream classification, which categorizes or detects data points within an evolving stream of observations. Application areas of stream classification are diverse, ranging, e.g., from monitoring sensor data to analyzing a wide range of (social) media applications. Research in stream classification is concerned with developing methods that adapt to the changing and potentially volatile data stream. It focuses on individual aspects of the stream classification pipeline, e.g., designing suitable algorithm architectures, efficient train-and-test procedures, or detecting so-called concept drifts. As a result of the many different research questions and strands, the field is challenging to grasp, especially for beginners. This survey explores, summarizes, and categorizes work within the domain of stream classification and identifies core research threads over the past years. It is structured based on the stream classification process to facilitate coordination within this complex topic, including common application scenarios and benchmarking data sets. Thus, both newcomers to the field and experts who want to widen their scope can gain (additional) insight into this research area and find starting points and pointers to more in-depth literature on specific issues and research directions in the field.

Details about the publication

Journal: Applied Sciences
Volume: 12
Issue: 18
Page range: 1-44
Article number: 9094
Status: Published
Release year: 2022
Language in which the publication is written: English
DOI: 10.3390/app12189094
Link to the full text: https://www.mdpi.com/2076-3417/12/18/9094
Keywords: Data mining; Big Data; Stream Classification; Data Stream Analysis; Supervised Learning; Machine Learning

Authors from the University of Münster

Clever, Lena
Data Science: Statistics and Optimization (Statistik)
Lütke-Stockdiek, Janina Susanne
Data Science: Statistics and Optimization (Statistik)
Research Group Computational Social Science and Systems Analysis (CSSSA)
Trautmann, Heike
Data Science: Statistics and Optimization (Statistik)