We are delighted to announce that our colleague Wang Yue successfully defended his Ph.D. dissertation, titled 'Efficient Window Aggregation for Decentralized Stream Processing Systems', on December 19th, 2025.
Abstract:
The growing scale of Internet-of-Things (IoT) applications has led to the rapid expansion of decentralized networks that continuously generate unbounded data streams. Many stream processing systems have been developed to process these streams. They aim to support large numbers of concurrent queries while maintaining low latency, high throughput, and scalability. However, existing systems still struggle to meet these requirements. Centralized approaches collect all data at a central node, which creates computation bottlenecks and heavy network traffic. To address these issues, decentralized approaches push computation down to edge devices close to the data sources. While this improves efficiency, these approaches support only time-based windows with decomposable functions and cannot handle a large number of queries. This thesis investigates how to enable efficient and scalable window aggregation in large-scale decentralized networks, supporting diverse query semantics and aggregation functions with minimal redundancy and network utilization.
To address this challenge, we develop a system with three approaches: Desis, Deco, and Dema. Each approach solves a different part of the problem. Desis supports multi-query processing. Deco focuses on count-based windows. Dema enables non-decomposable aggregation. Together, these approaches improve the efficiency and scalability of stream processing in decentralized networks.
Desis is a stream processing system that efficiently processes multiple stream aggregation queries. We propose an aggregation engine that shares partial results between queries with different window types, measures, and aggregation functions. In decentralized networks, Desis moves computation to the data sources and shares overlapping computation between queries as early as possible. Desis outperforms existing solutions by orders of magnitude in throughput when processing multiple queries and can scale to millions of queries. In a decentralized setup, Desis can save up to 99% of network traffic and scale performance linearly.
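To illustrate what sharing partial results between queries can look like, here is a minimal sketch, not the Desis engine itself. It assumes integer timestamps, tumbling windows only, and a (sum, count) partial aggregate. The function name and the choice of the greatest common divisor as slice length are illustrative assumptions.

```python
from collections import defaultdict
from functools import reduce
from math import gcd


def shared_tumbling_aggregates(events, window_lengths):
    """Answer several tumbling-window queries from one set of shared slices.

    events: iterable of (timestamp, value) pairs with integer timestamps.
    window_lengths: one tumbling-window length per query.
    Returns {length: {window_start: (sum, count)}}.
    """
    # Assumption: slices are as long as the GCD of all window lengths,
    # so every window is an exact union of whole slices.
    slice_len = reduce(gcd, window_lengths)

    # Single pass over the raw events: each event updates exactly one slice.
    slices = defaultdict(lambda: [0, 0])
    for ts, value in events:
        start = ts - ts % slice_len
        slices[start][0] += value
        slices[start][1] += 1

    # Every query combines the shared slice partials; raw events are not re-read.
    results = {}
    for length in window_lengths:
        windows = defaultdict(lambda: [0, 0])
        for start, (total, count) in slices.items():
            window_start = start - start % length
            windows[window_start][0] += total
            windows[window_start][1] += count
        results[length] = {w: tuple(v) for w, v in windows.items()}
    return results
```

With (sum, count) as the partial, the same slices can serve sum, count, and average queries. Adding a query costs only the slice-merging step.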
Deco focuses on count-based window aggregation in decentralized networks. We propose a lightweight prediction method that derives local window sizes based on the previously observed event rates and performs corrections when necessary to ensure accurate and fast query results. Windows are processed in a decentralized manner on local nodes, verified for correctness, and then aggregated on a root node. Our evaluation shows that Deco clearly outperforms centralized approaches. It significantly reduces network traffic and scales linearly with the number of nodes.
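As a rough illustration of deriving local window sizes from observed event rates, the sketch below splits a global count-based window across nodes in proportion to the number of events each node produced recently. It is a simplification under stated assumptions, not Deco's actual prediction method: the function name and the largest-remainder rounding are hypothetical, and the correction step mentioned above is omitted.

```python
def predict_local_sizes(window_size, observed_counts):
    """Split a global count-based window across nodes.

    window_size: number of events in the global window.
    observed_counts: {node: events seen on that node in the previous period}.
    Returns {node: predicted local window size}; the sizes sum to window_size.
    """
    total = sum(observed_counts.values())
    if total <= 0:
        raise ValueError("no events observed yet")

    # Proportional share per node, rounded down; keep the remainders.
    sizes, remainders = {}, {}
    for node, count in observed_counts.items():
        sizes[node], remainders[node] = divmod(window_size * count, total)

    # Hand the events lost to rounding to the nodes with the largest remainders.
    leftover = window_size - sum(sizes.values())
    for node in sorted(remainders, key=remainders.get, reverse=True)[:leftover]:
        sizes[node] += 1
    return sizes
```

If the real event rates drift from the observed ones, the local windows no longer add up to the correct global window. This is the case in which a correction step would be needed.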
Dema addresses the challenge of supporting non-decomposable aggregation in decentralized networks. Dema reduces network traffic and computational load by performing localized sorting and transmitting statistical summaries rather than raw data. Our approach efficiently calculates median and quantile values, achieving up to a 99% reduction in network traffic compared to state-of-the-art methods. Our evaluation results show that Dema significantly outperforms existing approaches in terms of throughput and scalability, while ensuring accurate results.
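The sketch below shows one way local sorting and summaries can yield an exact median while moving little raw data. It is an illustration under stated assumptions, not necessarily Dema's protocol: each node sorts its values, cuts them into slices, and reports only (min, max, count) per slice. The root then fetches raw values only from slices that could contain the requested rank. All names are hypothetical.

```python
def summarize(values, slice_size):
    """Run on a local node: sort, cut into slices, summarize each slice."""
    ordered = sorted(values)
    slices = [ordered[i:i + slice_size] for i in range(0, len(ordered), slice_size)]
    summaries = [(s[0], s[-1], len(s)) for s in slices]  # (min, max, count)
    return slices, summaries


def kth_from_summaries(nodes, k):
    """Run on the root: exact k-th smallest value (0-indexed) over all nodes.

    nodes: list of (slices, summaries) pairs as returned by summarize().
    Returns (value, number_of_raw_values_fetched).
    """
    entries = [(lo, hi, cnt, raw)
               for slices, summaries in nodes
               for raw, (lo, hi, cnt) in zip(slices, summaries)]
    n = sum(cnt for _, _, cnt, _ in entries)

    below = 0      # values known to rank before k without being fetched
    fetched = []   # raw values that actually cross the network
    for lo, hi, cnt, raw in entries:
        smaller = sum(c for _, h, c, _ in entries if h < lo)  # surely before this slice
        larger = sum(c for l, _, c, _ in entries if l > hi)   # surely after this slice
        if k > n - 1 - larger:
            below += cnt          # whole slice ranks before k
        elif k >= smaller:
            fetched.extend(raw)   # slice may contain rank k: fetch it
        # otherwise the whole slice ranks after k and is ignored
    return sorted(fetched)[k - below], len(fetched)
```

For two nodes holding 1..8 and 9..16 with a slice size of 4, rank 8 is resolved by fetching a single slice of four values. Heavily overlapping value ranges across nodes make more slices candidates and reduce the savings.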
In summary, this thesis provides a general framework for efficient window aggregation in decentralized networks. By supporting multi-query processing, count-based window aggregation, and non-decomposable functions such as median and quantiles, our system demonstrates the feasibility of efficient stream processing in large-scale decentralized networks.