Hasso-Plattner-InstitutSDG am HPI
Hasso-Plattner-InstitutDSG am HPI

Desis: Optimizing Decentralized Window Aggregation

Wang Yue

Chair for Data Engineering Systems

Hasso Plattner Institute

Office: F-1.06
Tel .: + 49- (0) 331 5509-280
Email: Wang.Yue (at) hpi.de
Links: Homepage

Supervisor: Prof. Dr. Tilmann Rabl



We are focusing on the efficient window aggregation in distributed networks regarding to stream processing. In particular, we are interested in finding the approach that shares multiple queries among different nodes and slicing windows with multiple queries in an effective and efficient way.


In the past decade, we are witnessing an explosion of Internet of Things (IoT) applications in many domains, from environmental condition monitoring to industrial process control and health care. These applications are usually organized with a huge amount of IoT devices. By 2025, the number of IoT devices is expected to be over 75 billion[1]. The massive amounts of devices consist of decentralized networks and continuously generate data streams. Current stream processing engines (SPEs) such as Apache Flink, Apache Spark Streaming, and Apache Storm were developed for processing large-scale and high-velocity data streams. They perform efficient data aggregation that allows processing millions of events per second and are widely used in industry and research.

An example of a decentralized network use case is a smart driving system. The smart driving system aims to assist with traffic management and traffic estimation. With that system, drivers can be alerted in case of dangerous situations or simply avoid traffic congestion. In figure below, Sensors are deployed on cars and generate data event continuously (eg speed, location, and temperature). The events then are collected and sent to data centers. Also, users input queries, such as calculating average speed of all cars. Smart driving systems perform queries based on data events and then output results. There are two features of this example, multiple data streams that are from IoT devices are distributed in large networks, and there are multiple long-running queries that need to be performed simultaneously.


Research Problem

To efficiently process data from decentralized networks, current SPEs [2,3,4] are deployed in data centers and centralize raw data from distributed IoT devices to that centers. Then, SPEs process those data and output results. Therefore, all data produced by IoT devices are required to be collected via networks to data centers. That leads to high network costs especially when large-scale data is transferred between devices and SPEs. In addition, due to all computation centralized in data centers, SPEs have to scale up their clusters to deal with large-scale data. Otherwise, data centers can become bottlenecks and be harmful to throughput. To reduce network overhead and improve performance, state-of-the-art [5,6] off-loads part of the data processing that would be performed in the data center to devices closer to the data streams. Instead of sending every event, thoes devices produce and transfer partial results. The data center then aggregates partial results to output final results. However, when processing multiple queries, current solutions have to create multiple instances against queries even if they perform similar queries and process the same data streams. 

The data streams produced from decentralized networks are unbounded. The most common way for current SPEs to process an unbounded stream is to window the input data into fixed-size windows and then process each window separately. Depending on different queries, SPEs define windows with different window types, aggregation functions, and window measures, which are processed by different approaches. Data analysis often involves many simultaneous queries and this generates many concurrent windows. Due to those windows processing the same data there are overlaps between windows. So reusing intermediate results from window overlaps can avoid redundant computation that reduces resource consumption and improves performance. To share intermediate results between windows, state-of-the-arts solution cut windows into slices. Those slices can be shared between windows that have the same window types and aggregation functions. But for windows with different aggregation functions and different window types, even different window measures, the intermediate results cannot be fully reused.


In this work, we present Desis, an SPE designed to work on edge and fog computing environments by offloading part of the processing on machines closer to data streams. Desis is optimized to process multiple queries with decentralized aggregation and to share the computation between windows. Our approaches mainly consist of three features: First, Desis is capable of decentralized aggregations with tumbling windows, sliding windows, session windows, and user-defined windows. Instead of collecting data to a center and aggregating windows there, Desis is able to calculate windows with decentralized aggregation by pushing down window aggregation to other machines in the system. Second, we propose an optimizer that first performs window slicing to slice concurrent windows, and then, with window merging, shares the intermediate results of overlaps between original windows, which have different window types, window measures, and aggregation functions. Third, Desis can process real-time data based on event-time and deal with disorder data with ensuring accuracy. Desis exhibits good scaling in terms of processing high amounts of queries and throughput.

Related Work

We now present some related work with stream processing in distributed network and window aggregation among multiple queries. Madden et al. [10] propose the tiny aggregation that is a tree-structured distributed data aggregation. And it works a lot on how to build the network route between different nodes. In Cougar [11] and LEACH [12], the authors make use of a class-structured network, which selected cluster node among other nodes that aggregate windows from other nodes. The PEGASIS [13] introduces a chain-structured approach that all the nodes are a priori and conscious of all other nodes. SlideSide [14] proposed an approach for incremental processing in multi-queries scenarios that maintains a running prefix / suffix scan over the input stream. FlatFAT [15] uses a tree structure to store the partial result.

In summary, there are so many stream processing techniques that focus on sensor networks and multiple queries. However, none of them try to combine those together, which is a general problem met by many realistic scenarios. Also, they lack the solution of counting window in stream processing and only concentrate on the sliding or tumbling windows.


We compared our solution with the centralized system that we discussed, also we have the same network structure. The only query we test is ' output average speed of every 1000 tuples '. We measure three metrics of both systems, latency is how much time system output results, throughput is how many events are processed per second, and network overhead is network traffic used by each system. The experiments show our system outperforms baseline by 100 times while reducing 99% network overhead.


[1] Prasanna, Srinivasa, and Srinivasa Rao. "An overview of wireless sensor networks applications and security." International Journal of Soft Computing and Engineering (IJSCE) 2.2 (2012): 2231-2307.
[2] Carbone, Paris, et al. "Apache nimble: Stream and batch processing in a single engine." Bulletin of the IEEE Computer Society Technical Committee on Data Engineering 36.4 (2015).
[3] Zaharia, Matei, et al. "Discretized streams: Fault-tolerant streaming computation at scale." Proceedings of the twenty-fourth ACM symposium on operating systems principles. 2013.
[4] Toshniwal, Ankit, et al. "Storm @ twitter." Proceedings of the 2014 ACM SIGMOD international conference on Management of data. 2014.
[5] Benson, Lawrence, et al. "Disco: Efficient Distributed Window Aggregation." EDBT. Vol. 20. 2020.
[6] Liu, Guojin, et al. "Volcanic earthquake timing using wireless sensor networks." 2013 ACM/IEEE International Conference on Information Processing in Sensor Networks (IPSN). IEEE, 2013.
[7] Akidau, Tyler, Slava Chernyak, and Reuven Lax. Streaming systems: the what, where, when, and how of large-scale data processing. " O'Reilly Media, Inc.", 2018.
[8] Traub, Jonas, et al. "Scotty: General and Efficient Open-source Window Aggregation for Stream Processing Systems." ACM Transactions on Database Systems (TODS) 46.1 (2021): 1-46.
[9] Carbone, Paris, et al. "Cutty: Aggregate sharing for user-defined windows." Proceedings of the 25th ACM International on Conference on Information and Knowledge Management. 2016.
[10] Samuel Madden, Michael Franklin, Joseph Hellerstein, and Wei Hong. TAG: a Tiny Aggregation Service for Ad-Hoc Sensor Networks. In OSDI, pages 131-146. ACM, 2002.
[11] Yong Yao and Johannes Gehrke. The cougar approach to in-network query processing in sensor networks. ACM SIGMOD Record, 31 (3): 9-18, 2002.
[12] Wendi Heinzelman, Anantha Chandrakasan, and Hari Balakrishnan. An application-specific protocol architecture for wireless microsensor networks. IEEE Transactions on Wireless Communications, 1 (4): 660-670, 2002.
[13] Stephanie Lindsey, Cauligi Raghavendra, and Krishna Sivalingam. Data gathering algorithms in sensor networks using energy metrics. TPDS, 13(9): 924-935, 2002.
[14] Theodorakis, Georgios, Peter Pietzuch, and Holger Pirk. "SlideSide: a fast incremental stream processing algorithm for multiple queries." (2020). 
[15] Tangwongsan, Kanat, et al. "General incremental sliding-window aggregation." Proceedings of the VLDB Endowment 8.7 (2015): 702-713.