Why use EmStar?

This text was written by Lew Girod from CENS/LECS. It's not done yet, but is a start at describing the basics of why we are investing in a new programming model.

What is a Programming Model?

In the most general terms, a programming model is really a style of implementation. To help define this style, the ``model'' is often accompanied by languages, libraries, templates, ``wizards'', canonical forms, and rules of thumb. These accoutrements guide implementors in the pursuit of certain solutions to certain problems.

The object of a programming model is to make a certain class of problems easier to solve, perhaps at the risk of making others more difficult. As a rule, some problems are solved naturally in a given model, while others are not. This echoes the age-old wisdom of selecting the right computer language for the task, and the choice becomes even more critical as the tool is made more specific.

A common mode of failure in building a programming model is to succumb to the desire to invent the perfectly general model. Since the whole point of a programming model is to solve a specific class of problems more easily, absolute generality is often counterproductive; it is usually better to specialize, even if that design choice rules out some possible applications.

Some rules of thumb in designing a programming model:

There are many preexisting programming models that are designed to solve problems in other application domains. For example, ``.net'' and Java are popular programming models for implementing Internet-enabled applications. Programming models for desktop software environments include the Microsoft Foundation Classes and Win32 API, as well as the KDE and Gnome environments for UNIX.

However, none of these are particularly well-suited to the needs of CENS applications.

The CENS Application Domain

Work in CENS is primarily concerned with developing robust, large-scale, wireless distributed embedded systems. The intent is to build large systems composed of independent wireless sensors that cooperatively and autonomously perform sensing and actuation tasks.

Strictly from a numerical perspective, the large number of independent devices would make manual configuration difficult. However, the presence of significant system dynamics makes manual configuration impossible. Values set during an initial manual configuration phase would be unlikely to hold for the lifetime of the system. Some typical examples of dynamics in embedded sensor systems are:

The presence of these system dynamics forces applications to be adaptive. Often in cases like this the simplest solution would be to implement the adaptive behavior in a centralized fashion. Unfortunately, for many CENS applications, configuration of the system cannot be centrally managed. A centralized implementation might be feasible where a high-bandwidth backbone is available. However, in low-power multi-hop networks, centralized control is infeasible because of its communications requirements, in terms of both bandwidth and power.

These two fundamental requirements, namely adaptation to environmental dynamics and the need for a distributed and autonomous implementation, lead to a very specific application area that is not well served by existing programming models. In many respects this kind of system can be considered a ``distributed robot'': the distributed sensor system senses its environment and performs a task autonomously. Perhaps the closest models come from the research community in autonomous robotics, especially research into ``reactive'' and ``behavior-based'' robotics. However, the focus of most robotics research is on sensing and actuation rather than on exploiting digital communications networks.

The existing programming models for Internet-based services are clearly insufficient for sensor applications, for a number of reasons.

First, dynamic configuration is rarely attempted in Internet-based services. Rather, Internet systems typically rely on a combination of existing naming and configuration infrastructure, such as Jini or the DNS, and server-side maintenance and configuration. In the cases where no infrastructure is required, configuration is usually achieved by search techniques that do not scale to a large multi-hop network, for example SMB server advertisements.

Second, many innovations required to build CENS applications must be implemented at the lower layers in the network stack. Often these innovations are incompatible with the most basic assumptions made in the Internet. For example, naming in a sensor network may be quite different: the names of nodes may be large unique identifiers with no routing semantics -- or names may be applied to data rather than to nodes. Furthermore, data may be processed hop-by-hop rather than in an ``end-to-end'' transaction. Programming models and frameworks that are implemented above a TCP/IP stack would make these kinds of innovations difficult, if possible at all.

Third, Internet-based programming models generally make assumptions about the underlying network that may not be valid for wireless sensor networks, and in the process develop abstractions that mask important feedback channels. For example, an abstract RPC mechanism based on TCP and XML is not appropriate to a system formed of low-bandwidth wireless links with significant node and network dynamics. Many of the cases that are assumed by RPC to be unusual, such as transient failures in network connectivity, are in fact the common case in a wireless sensor network.

These deficiencies led us in the direction of building a new programming model that borrows ideas from UNIX system design, Internet protocol design, and reactive robotics.

Introduction to the CENS Programming Model

The CENS model is an amalgam of good design ideas culled from many areas of systems engineering:

Lots of Little Processes

One of the tenets of the UNIX design philosophy is the idea of solving large problems by composing many smaller, reusable programs, each running as its own process. This idea is applied in the CENS model by decomposing the system running on a node into a collection of reusable services. These services communicate using a standard set of IPC mechanisms that clearly define the modular dependencies and interfaces.

By making each service its own process, the CENS model directly exposes the modularity of the system at the process level, and also makes the system more robust by taking full advantage of memory protection across process boundaries. This approach is also language agnostic; individual processes may be written in any language. As the IPC channels are implemented by POSIX file descriptors, they are accessible from any language, although notification features require access to the select() system call.
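The role of file descriptors and select() can be illustrated with a minimal sketch. It assumes a POSIX system and uses an ordinary pipe as a stand-in for an IPC channel between a service and a client; the function name wait_for_message and the message contents are hypothetical and are not part of EmStar.

```python
import os
import select

def wait_for_message(read_fd, timeout_s):
    """Wait until read_fd is readable or the timeout expires.

    Returns the bytes read, or None if nothing arrived in time.
    """
    readable, _, _ = select.select([read_fd], [], [], timeout_s)
    if not readable:
        return None
    return os.read(read_fd, 4096)

# Stand-in for an IPC channel between two processes: an ordinary pipe.
read_fd, write_fd = os.pipe()

# Nothing has been written yet, so the wait times out.
idle = wait_for_message(read_fd, 0.05)

# The "service" end publishes an update; the "client" end is notified.
os.write(write_fd, b"neighbors: 3\n")
update = wait_for_message(read_fd, 1.0)

os.close(read_fd)
os.close(write_fd)
```

Because the client only needs a descriptor and a way to wait on it, the same pattern is available in any language that exposes select() or an equivalent.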

Interactivity and Transparency

Another powerful idea culled from UNIX is support for interactive usage from the shell and the use of human-readable configuration formats. Similar to the /proc file system, many components in the CENS model enable interactive access to internal state variables. The programming model simplifies the creation of these interfaces with the intent that they can become the common case.

These attributes greatly enhance transparency of the system. It's possible to argue that this feature is less efficient, and unnecessary in a deployed system. However, in practice a great deal of time is spent debugging and troubleshooting the system, and at those times interactive access to the system is invaluable. A further benefit of this kind of approach is the ability to easily write shell scripts that interact with the system components. Since the cost of computation is typically low compared with the cost of external communication, the decrease in efficiency resulting from these mechanisms is typically negligible.
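As a rough illustration of why human-readable state is convenient, the sketch below has a component publish its internal variables as plain ``name: value'' lines in a file, which a script then reads back. The file name, the state variables, and both helper functions are hypothetical; this only mimics the flavor of a /proc-style interface and is not the mechanism the CENS model actually provides.

```python
import os
import tempfile

def render_status(state):
    """Render internal state as human-readable 'name: value' lines."""
    return "".join(f"{name}: {value}\n" for name, value in sorted(state.items()))

def parse_status(text):
    """Parse 'name: value' lines back into a dict of strings."""
    parsed = {}
    for line in text.splitlines():
        name, _, value = line.partition(": ")
        parsed[name] = value
    return parsed

# Hypothetical internal state of a link-monitoring component.
state = {"neighbors": 3, "packets_dropped": 17, "radio": "on"}

with tempfile.TemporaryDirectory() as status_dir:
    status_path = os.path.join(status_dir, "link_status")
    with open(status_path, "w") as f:
        f.write(render_status(state))
    # Anything that can read a text file (cat, grep, a shell script)
    # could inspect the component at this point.
    with open(status_path) as f:
        status_text = f.read()

observed = parse_status(status_text)
```

The text format costs a little formatting and parsing work, but it means no special tool is needed to see what the component believes.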

Soft State

Soft state is a common technique used to simplify the design of robust distributed systems. The fundamental idea is to employ a periodic state refresh mechanism to eliminate long-lived assumptions about the state of external modules, by ageing out old data and replacing or refreshing the existing state when new data arrives. These designs are most appropriate in cases where dynamics are the common case. However, they also can be used to trade efficiency (in terms of message traffic) for robustness and simplicity, because an enormous class of transient inconsistencies and errors is simply corrected by the next successful refresh.
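The refresh-and-age-out idea can be sketched as a small table whose entries disappear unless they are refreshed. The class name, method names, and the 10-second maximum age are assumptions chosen for illustration, and time is passed in explicitly so the behavior is easy to follow.

```python
class SoftStateTable:
    """Entries expire unless refreshed within max_age_s seconds."""

    def __init__(self, max_age_s):
        self.max_age_s = max_age_s
        self._entries = {}  # key -> (value, time of last refresh)

    def refresh(self, key, value, now):
        """Install or replace an entry and restart its age timer."""
        self._entries[key] = (value, now)

    def expire(self, now):
        """Age out every entry not refreshed recently; return their keys."""
        stale = [key for key, (_, seen) in self._entries.items()
                 if now - seen > self.max_age_s]
        for key in stale:
            del self._entries[key]
        return stale

    def snapshot(self):
        """Return the current key -> value view without timestamps."""
        return {key: value for key, (value, _) in self._entries.items()}

table = SoftStateTable(max_age_s=10.0)
table.refresh("node-a", "link ok", now=0.0)
table.refresh("node-b", "link ok", now=0.0)
table.refresh("node-a", "link weak", now=8.0)  # node-a refreshed, node-b silent
expired = table.expire(now=12.0)               # node-b is now 12 s old
```

No explicit ``node-b went away'' message is ever needed: silence alone is enough for the stale entry to be removed.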

Soft state is an important part of many of the most successful Internet routing protocols, in designs where the routing message traffic generated by soft state refreshes can be assumed to be negligible compared with the data traffic. In low-bandwidth multi-hop wireless networks, these techniques are often quite important, but cannot be applied indiscriminately. However, we have also found considerable application of soft state techniques for improving robustness within the system, across IPC channels where communications cost is less important. By using soft state design liberally within the system, it is possible to control the complexity of individual modules while substantially increasing the sophistication of the system's adaptive behavior.

For example, a service that reports connectivity to one-hop neighbors might send its complete neighbor list (through IPC) to its clients whenever a change in connectivity is observed. Reporting the complete state in each refresh rather than reporting deltas substantially simplifies the protocol while eliminating many opportunities for infrequent but persistent state mismatches.

In addition, making the refresh periodic even in the absence of changes eliminates two failure modes at once: a bug in the neighbor discovery service that prevents notification of clients, and a bug in a client that permanently drops an update or otherwise gets out of sync. Because of the relatively long time constants of dynamics in wireless sensor networks, refresh intervals do not need to be rapid to be effective: intervals on the order of seconds are generally sufficient.
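The difference between deltas and periodic full-state refreshes can be shown with a small simulation. It is a sketch under stated assumptions: the neighbor names, the update sequence, and the choice of which update is lost are all invented for illustration.

```python
def apply_delta(view, delta):
    """Client update under a delta protocol: ('add' | 'remove', neighbor)."""
    op, neighbor = delta
    if op == "add":
        view.add(neighbor)
    else:
        view.discard(neighbor)

def apply_full_refresh(view, neighbor_list):
    """Client update under a full-state protocol: replace everything."""
    view.clear()
    view.update(neighbor_list)

# Ground truth at the service over time, and the deltas between steps.
truth_steps = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}]
deltas = [("add", "c"), ("remove", "b")]

# Both clients start in sync with the service.
delta_view = set(truth_steps[0])
full_view = set(truth_steps[0])

lost = 1  # index of the update that never reaches either client
for i, delta in enumerate(deltas):
    if i == lost:
        continue
    apply_delta(delta_view, delta)
    apply_full_refresh(full_view, truth_steps[i + 1])

# Connectivity is now stable, so a delta protocol sends nothing more.
# A periodic full refresh still arrives and repairs the stale view.
apply_full_refresh(full_view, truth_steps[-1])
```

After the lost update, the delta-based client still believes ``b'' is a neighbor and has no way to find out otherwise, while the full-state client is corrected by the next periodic refresh.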

Reactive Robotics

Achievement of complex adaptive behavior from interaction of simple rules.

document not done yet...