Cornell Cougar Project

Cougar ::. Introduction

The widespread distribution and availability of small-scale sensors, actuators, and embedded processors is transforming the physical world into a computing platform. Sensor networks that combine physical sensing capabilities such as temperature, light, or seismic sensors with networking and computation capabilities will soon become ubiquitous. Applications range from environmental control, warehouse inventory, and health care to scientific and military scenarios.

Existing sensor networks assume that the sensors are preprogrammed and send data to a central frontend, where the data is aggregated and stored for offline querying and analysis. This approach has two major drawbacks. First, the user cannot change the behavior of the system dynamically. Second, communication in today's networks is orders of magnitude more expensive than local computation, so shipping all raw data to the frontend wastes resources; in-network storage and processing can vastly reduce resource usage and extend the lifetime of a sensor network.
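To make the communication argument concrete, the following minimal sketch counts radio messages for one round of data collection under both designs. The routing tree, the node numbering, and the one-message-per-hop cost model are illustrative assumptions, not measurements from a real deployment.

```python
# Hypothetical routing tree (child -> parent); node 0 is the frontend/gateway.
PARENT = {1: 0, 2: 0, 3: 1, 4: 1, 5: 3, 6: 3}


def depth(node, parent=PARENT):
    """Number of hops from a node to the gateway."""
    hops = 0
    while node in parent:
        node = parent[node]
        hops += 1
    return hops


def centralized_messages(parent=PARENT):
    """Every raw reading is forwarded hop by hop to the gateway."""
    return sum(depth(node, parent) for node in parent)


def in_network_messages(parent=PARENT):
    """Each node merges its children's partial results with its own
    reading and sends exactly one message to its parent."""
    return len(parent)


print(centralized_messages())  # 12 messages in this example tree
print(in_network_messages())   # 6 messages in this example tree
```

Under these assumptions the gap widens as the tree gets deeper, because the centralized cost grows with the sum of node depths while the in-network cost grows only with the number of nodes.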

Cougar ::. Constraints

Sensor nodes come in a variety of hardware configurations, from nodes connected to the local LAN and attached to permanent power sources to nodes communicating via wireless multi-hop RF radio and powered by small batteries. The types of sensor nodes considered here have the following resource constraints:
  • Communication: The wireless network connecting the sensor nodes usually provides only a very limited quality of service, has latency with high variance, has limited bandwidth, and frequently drops packets.
  • Power consumption: Sensor nodes have a limited supply of energy, and thus energy conservation needs to be one of the main system design considerations of any sensor network application.
  • Computation: Sensor nodes have limited computing power and memory sizes. This restricts the types of data processing algorithms on a sensor node, and it restricts the sizes of intermediate results that can be stored on the sensor nodes.
  • Uncertainty in sensor readings: Signals detected at physical sensors have inherent uncertainty, and they may contain noise from the environment. Sensor malfunction might generate inaccurate data, and unfortunate sensor placement (such as a temperature sensor directly next to the air conditioner) might bias individual readings.

Cougar ::. Research

We investigate two unique approaches to sensor networks. First, we will use a database approach to unite the seemingly conflicting requirements of scalability and flexibility in monitoring the physical world. The objective of this research is to build a new distributed data management layer that scales with the growth of sensor interconnectivity and computational power on the sensors over the next decades. Our system will reside directly on the sensor nodes and create the abstraction of a single processing node without centralizing data or computation. The system will provide scalable, fault-tolerant, flexible data access and intelligent data reduction, and its design involves a confluence of novel research in database query processing, networking, algorithms, and distributed systems.

Second, we believe that in the heavily resource-constrained environment of sensor networks, cross-layer optimizations offer interesting opportunities for conserving resources. Because query processing patterns are regular, we believe that we can design query-layer specific routing algorithms that are optimized not for general point-to-point communication but for the more regular types of communication patterns that are generated by a query layer. Investigation of such cross-layer optimizations is the second major goal of this research.

From a research standpoint, the central issues are the following:
  1. Sensor networks as a distributed database system. What is the impact of storing data in the network for later querying? How do we optimize and process declarative queries involving sensor data? Sensors have limited battery power and the wireless network has limited bandwidth and quality of service; query execution has to take these constraints into account.
  2. Cross-layer optimizations. How can we expose structure that originates in the data management layer to the routing layer? Should we design a data management layer that is optimized for a given routing layer, or should we optimize the routing layer given the data management layer? What are suitable interfaces that enable this tight coupling?
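As an illustration of the first issue, a declarative request such as "average temperature over all sensors" can be evaluated inside the network if the aggregate is split into a small, mergeable partial state. The sketch below uses the generic (sum, count) decomposition of an average; the `PartialAvg` class, the `evaluate` function, and the example tree are hypothetical names chosen for this sketch and are not taken from the Cougar system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartialAvg:
    """Mergeable partial state for an average: running sum and count."""
    total: float = 0.0
    count: int = 0

    def merge(self, other):
        return PartialAvg(self.total + other.total, self.count + other.count)

    def result(self):
        return self.total / self.count if self.count else None


def evaluate(node, children, readings):
    """Compute the partial state for the subtree rooted at `node`.

    Each node combines its own reading (if it has one) with the states
    received from its children, so only one small record travels per link.
    """
    state = PartialAvg(readings[node], 1) if node in readings else PartialAvg()
    for child in children.get(node, []):
        state = state.merge(evaluate(child, children, readings))
    return state


# Hypothetical topology: node 0 is the gateway and has no sensor of its own.
children = {0: [1, 2], 1: [3, 4]}
readings = {1: 20.0, 2: 22.0, 3: 24.0, 4: 26.0}

print(evaluate(0, children, readings).result())  # 23.0
```

The same pattern applies to any aggregate whose partial states can be merged in any order, such as count, sum, minimum, and maximum; aggregates like the median do not decompose this simply.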

Cougar ::. Illustrative Example

Consider a sensor network equipped with temperature, pressure, humidity and smoke detection sensors deployed in a forest for dealing with fire emergencies. On a regular daily basis, archival queries are sent to the network to extract summarized information about the status of the forest. The regular communication patterns of these long-running queries could be exploited to configure underlying communication protocols in an energy-efficient way. A significant amount of energy would be saved if, instead of shipping all relevant data to the gateway, we stored the data in the network, processed it locally, and sent only the summarized query results. Similarly, polling the sensor network regularly to detect hazardous events incurs unnecessary communication overhead; an event-based approach would instead enable us to respond to a fire emergency in a timely and energy-efficient manner.
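The difference between polling and an event-based approach can be sketched with a toy trace from a single node. The temperature values, the threshold of 60, and the cost model (one request plus one reply per poll, one notification per event) are assumptions made for illustration only.

```python
def polling_messages(readings, poll_every):
    """Gateway polls the node every `poll_every` time steps.

    Each poll costs two messages: the request and the reply.
    """
    polls = len(range(0, len(readings), poll_every))
    return 2 * polls


def event_messages(readings, threshold):
    """Node evaluates the condition locally and sends one message each
    time the reading rises from below the threshold to at or above it."""
    messages = 0
    above = False
    for value in readings:
        if value >= threshold and not above:
            messages += 1
        above = value >= threshold
    return messages


# Hypothetical temperature trace: a quiet period followed by a fire.
temps = [21, 22, 21, 23, 22, 24, 61, 75, 80, 82]

print(polling_messages(temps, poll_every=2))  # 10
print(event_messages(temps, threshold=60))    # 1
```

In this trace the event-based node also reports at the first step where the threshold is crossed, whereas a polling gateway only notices at its next poll; with a longer polling interval the delay grows while the message count shrinks.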

In the event of a fire, the responders would need current information about the spread of the fire and the weather conditions that led to it. This information should be accessible from multiple points within the network, depending on the current locations of the firefighters. Furthermore, the access patterns during such emergency scenarios will exhibit geographical locality, so shipping the data to the gateway and processing these internal queries there is inefficient. On the other hand, storing the raw data only at the nodes where it is generated will make it difficult to access status information. Therefore, in-network storage mechanisms need to be designed so that the queries are answered efficiently with the most current information.

Information about current fire patterns could in turn be used to tune the functionality of routing and MAC layers, making them adaptive to the dynamics of the environment. Areas close to the fire should provide a very reliable communication protocol, since we would like accurate and timely information to prevent firefighters from being trapped. Furthermore, since many nodes might be destroyed in the presence of a fire, providing a fault-tolerant storage abstraction is an important consideration.

Cougar ::. Acknowledgements

The COUGAR Project is supported by the Defense Advanced Research Projects Agency, the Cornell Information Assurance Institute, and by a gift from Intel. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the sponsors.