Energy-Efficient Aggregation

EEF and LEF have considered the energy- and the lifetime-efficiency of wireless sensor networks and focused on the network's delivery performance as well as its energy consumption. As the radio transceiver is one of the major energy consumer and it can be expected that most of the sensor nodes only carry a limited and irreplaceable power supply, such energy efficient algorithms are essential to provide an infrastructure for any kind of application. A main challenge thereby occurred is to reach the common goal of the whole network but keep energy costs to a minimum in order to prolong the lifetime a single nodes.

Extending the network lifetime can be achieved in many ways and several protocols have been proposed in the last few years, considering different layers in the network stack. One way for minimizing the energy consumption is the use of data aggregation during the forwarding process. Taking advantage of aggregation may reduce the communication overhead significantly, especially if it is taken into account during the establishment of energy-efficient forwarding paths. Since the energy cost are much lower if packets are sent over paths with high aggregation ratios, the nodes' as well as the network's energy efficiency will be increased. However, only if the forwarding strategy is aware of the cost reduction due to aggregation.

The potential for data aggregation can be motivated as follows. Usually, a senor node operates in one of the following modes: Sensing, processing, or communicating. For typical applications like habitat and environmental monitoring, disaster detection, or military surveillance applications, the detection of an event or a particular stimulus is in general first processed by a node and then forwarded to one or several sink nodes in the network. Since most nodes might be scattered over a wide area, a direct communication with a sink is not always possible. Data is therefore sent over intermediate nodes along a forwarding path until a sink node is reached. As it is likely that several nodes report the same event, information received from different nodes may be combined such that only one packet need to be forwarded towards the network sink. Especially, if the source nodes are close together, most of the data might be redundant since the correlation of data readings is probably high.

In addition to event-based aggregation, data queries issued by the sink are another example where data aggregation might be useful if the sink is not interested in each reading separately captured by a node. Common examples of query-based aggregation are, e.g., temperature or humidity queries. But every other query that affects correlated data reports desires the possibility of data aggregation as the number of packets transmissions may be significantly reduced.

By in-network processing, data gathering is then processed as follows: Based on the aggregation tree, which can be considered as a reverse multicast tree that is rooted at the sink, inner nodes summarize the readings from their child nodes before they forward the data to their parent nodes in the tree. Therefore, special aggregation function are used. Aggregation functions are for example SUM, COUNT, AVG and MIN/MAX. While these functions require few memory and low processing capabilities, they are limited to relatively simple types of queries. But even more advanced queries like approximations for the median, the most frequent data values, a data distribution histogram or range queries are possible.

In contrast to classic address-centric routing where packets are routed based on unique destination addresses and the data is not changed, routing along aggregation trees, where inner nodes perform any kind of in-network processing, is termed data-centric routing. Although data-centric routing in conjunction with data aggregation was already proposed several years ago for the first time, the way how aggregation trees are built is still an open research problem. Concerning this issue, our aim is to provide an energy-efficient aggregation forwarding scheme that tries to find the best energy-efficient spanning tree in order to optimize both the packet delivery and energy cost in the network.

Single-Link and Multi-Link Energy-Efficient Aggregation Forwarding

Concerning the energy efficiency of a source or forwarding node, the ability of aggregation along the path might have a big impact. While the end-to-end delivery ratio does not change through aggregation since the probability that the delivered data packet at the sink contains the information issued by the appropriate node is the same, independent if aggregation is used or not, the energy required to finally deliver the packet is surely smaller than in the case where no aggregation is employed. But how does a node know a priori if it is an aggregation node? And if so, how does it know how much data it is able to aggregate? The problem of constructing an aggregation tree is very similar to the Steiner tree problem which is NP-hard. A Steiner minimum tree (SMT) is a tree in a graph consisting of N nodes that connects n source nodes with minimum path costs. In general, n is smaller than N, otherwise, the Steiner tree would be the same as the MST covering all N nodes. The high complexity in constructing the SMT is therefore due to the problem of finding the optimal aggregation nodes.

Since the Steiner tree problem is NP-hard, we simplify the problem by considering the case where only source nodes are able to aggregate data packets. However, we only need this assumption during the construction of the forwarding aggregation tree. Afterwards, all nodes are of course allowed to aggregate as much data as possible.

The simplification allows us to construct a MST approximation on the overlay graph consisting of the source nodes only, with the link costs between all sources being the end-to-end path costs in the underlaying complete graph. The path costs between two sources can be calculated based on the energy efficiency metric used in the last section. However, while the end-to-end delivery ratio Er remains the same, the required energy Ee has to take into account that another source node along the path might aggregate the information issued by a node into one single data packet. That is, in order to construct an energy-efficient aggregation tree, a node must propagate Er, as in the SEEF and MEEF strategy, and a changed expected required energy value �e that is equal to Ee if the node is a source node and zero otherwise. Due to these changes, we name this extended forwarding strategy single-link and multi-link energy-efficient aggregation forwarding (SEEAF and MEEAF), respectively.

However, a problem that arises from the ability of aggregation (and the changed energy costs in such a case) is that the energy efficiency is no longer a monotonic decreasing function. But this is mandatory in order to prevent cycles during the forwarding tree construction. Otherwise, a node might select one of its child nodes as its new parent in the tree since the child node might have lower energy costs due to aggregation. A solution to this problem is the usage of sequence numbers that indicate how up-to-date the used information is. The general idea is that the sink node periodically generates an increasing sequence number that is propagated throughout the network, however, only from a parent node to all its children. Each time the forwarding variables of a node change, the node sets a timestamp which is equal to the current node's sequence number. The next time the node calculates its energy efficiency, it only considers neighbors that have a higher sequence number than the one set at the last node's change. However, we still have to consider the case where a neighbor j previously received a higher sequence number over another path and later became a child of a node i. Therefore, each node does not use its current sequence number for the timestamp but the highest sequence number the node has previously received from any parent node.