CS 213: Advanced Topics in 

Distributed Embedded Systems

Assignment 4: Building Aggregation Trees in TinyOS

Due Date: Wednesday, Feb 11th, 11:59pm PST

This assignment will follow along the lines of the Em* assignment that you recently completed, and will require you to build an aggregation tree. The questions and approach differ a little.
The flooding solution that you used in the first assignment is simple, but clearly doesnt scale to large networks, especially in terms of the number of transmissions (which translates to energy usage). Your task in this assignment will be to fix this problem by propagating counts over an aggregation tree.

2. Building an Push-based Aggregation Tree

Push and Pull paradigms are popular techniques used in distributed systems in general, and these techniques apply to sensor network algorithms as well (one-phase pull diffusion is an example). The distinction between push and pull methodologies is best described in the context of a client-server model where a web browser on a user machine is loading data from a server. In a push approach, the browser has an open connection to the server and the server pushes data to the browser as and when required. In the pull approach, the client reloads the page periodically, say, every minute.

For the Em* assignment, most of you would have implemented a version of Pull-counting. In this technique, when the sink needs a count, it requests the count explicitly by initiating a new flood, followed by the nodes responding up the tree. Each node sends its data up only once in this case (although it might transmit multiple times as a relay for data from downstream nodes).

In the TOS assignment, you will use a push-based approach to solve the counting problem. Here, the sink floods a count request only once in the beginning of the simulation, and each node periodically pushes data along the aggregation tree. I will refer to this period as an epoch, using terminology introduced in TinyDB. The sink receives a count of the nodes in the network every such period and can use this information to determine if nodes have failed.

The following core components are required for this implementation:

2.1 Reverse path tree-construction

This is the same procedure that you followed for your Em* assignment. Flooding allows you to setup an instance of a reverse path tree, where each node chooses the first node from which it receives the flood packet as its parent on the tree. Note that significant work has gone into construct stable and robust trees (Diffusion [UCLA], Bless [UCB], etc), and reverse-flooding trees are not generally considered the best. However, we will adopt this technique for its simplicity.

Building the reverse-flood aggregation tree consists of two steps. At the start of the simulation, the sink floods a count request packet to the network to initiate the tree construction procedure (you already have this code from the last assignment). When each node gets the flood packet, it can query the Flood module to get the previous hop from which the packet was received. This previous hop node is chosen as the parent on the aggregation tree.

The previous hop can be obtained by using the GetLastHop() command - look at lib/FloodData.nc.

2.2 Periodic reverse-tree forwarding

Each node periodically sends one unicast packet to its parent every EPOCH_PERIOD (set to 2000 ms). This packet contains both a sequence number to identify duplicates, and a count of the number of children that the node has heard from in the previous EPOCH_PERIOD. After a few initial EPOCH_PERIODs during which information percolates from up from the leaves, the root (sink) can be expected to have an estimate count of the number of nodes in the network. After an initial period of instability, the sink should continuously get an estimate of the count.

In the first assignment, each node was sending broadcast packets through the flooding module, but in this case, each node needs to send a unicast packet to its parent, therefore, it cannot use the flooding module. You will need to write two pieces of code:

  1. Component Wiring: In your Node.nc file, you need to add wiring to the GenericComm component, specifically to the Send and Receive interfaces that it exports. Examples of how this can be done can be found in the tinyos tutorial, apps/Surge application and in Flood.nc in lib/ directory.
  2. Send unicast packets: To send unicast packets from each node to its parent, you need to write code that uses the SendMsg and ReceiveMsg interface that you obtained after the wiring. To send unicast instead of broadcast packet, just place the address of the intended recipient as the first field in your SendMsg.send( ) function call instead of TOS_BCAST_ADDR.
The message structure that I used for my packets is below.

// Count message structure
typedef struct _CountMsg {
uint16_t child_id;
uint16_t parent_id;
uint16_t subtree_count;
uint8_t seqno;
} __attribute__ ((packed)) CountMsg;

typedef CountMsg *CountMsgPtr;

#define FLOOD_CACHE_SIZE 25
#define MAX_RESENDS 2
#define EPOCH_PERIOD 2000
#define FLOOD_PERIOD 10000

#define SINK_ID 0

For most of the testing in Steps 1 and 2, you can start off with the non-lossy models and then proceed to the lossy models. Note that in the non-lossy cases, you will not see multihop behavior since all nodes are in a single cell. Also, you will need to use randomization for the same reasons specified in Assignment 1, i.e. to minimize the impact of collisions.


Submit code that sends data up the tree (5 marks - 5 for wiring correctly, 5 for correctly sending packet to parent)

2.3 Dealing with Packet-loss

In the previous assignments, you had to deal with the split-phase functionality in TinyOS i.e. the possibility that the Flood module was busy and the need to post a timer to attempt a re-send.

In this case, you will have to deal with per-hop packet-loss. This information is available in the ack field in the tinyos message, which is set to 1 if a positive ack was received and 0 if not. Look at /doc/stack.pdf (page 4, para 2) for more information.

If no acknowledgement is received, you should re-send the packet. Cap the maximum number of resends by MAX_RETRANSMISSIONS. To prevent double-counting (for instance, when a parent receives multiple retransmissions from its child), each node needs to maintain a packet cache that contains the information received during each epoch. The cache should be flushed at the start of each period since each node only transmits the count for each interval.

The structure that I used for my cache is shown below for reference:
typedef struct _Cache {
uint16_t addr;
uint8_t seqno;
uint8_t epoch;
uint8_t subtree_count;
} Cache;
Each retransmission involves an actual packet transmission over the air, therefore, for energy reasons, it is best to minimize the number of retransmsisions. Keep your retransmissions low (between 0-5) for your testing.


Implementation of re-transmissions (10 marks- 5 for re-transmission based on the ack field and 5 for working caching scheme)

Use the following choice of variables for the graph below.

Flood_Period = 10000
Epoch_Period = 2000
MAX_RETRANSMISSIONS = 1

Topology = 5x5-5.nss


Construct a graph of Count received at Sink Vs Number of Epochs. (10 marks based on quality of result)

2.4 Sink-level filtering

Once you are done with steps 1,2 and 3, you will see that if the number of retransmissions is kept low (0-3), the sink sees a time-varying count of the number of nodes in the network. If the sink believed the count that it received during each epoch, it would think that the number of nodes fluctuates widely, while this is not actually the case. While a larger number of retransmissions might solve the problem, it incurs significantly more overhead.

Lets say that the sink needs to react to the loss of nodes in the network within a period of 5 epochs and only cares if more than 10% nodes (3 nodes in the case of the 25 node topology) seem to have failed.

Implement a simple filtering (also called damping or hysteresis) scheme that the sink can use to figure out whether or not the number of nodes in the network has actually changed or whether the changes are the effect of packet-loss. For instance, if the sink receives counts 25,25,24,25,24,25, averaging across a window of previous 5 responses might be a good option.

For the results, you will need to use the 5x5-5.nss topology within which you will introduce artificial node failure. Kill nodes 5, 10, 15 and 20 between epochs 10-20, 40-50 and 70-80.

Unlike in Em*, there is no clean way to introduce node failure. You can fake node failure by inserting code that makes the above numbered nodes ignore all messages for the specified epochs.

Use the same choice of parameters as in Section 2.3 for this study.


Plot a graph of Expected Count (after Damping) Vs Epoch, where the Expected Count is the output of the filtering algorithm at the sink and the number of epochs is 100. (10 marks)

3. Extra Credit: Choosing parameters for different design goals

In the course of the Em* and TOS assignments, you have learnt about different techniques and parameters that impact the counting problem. Here, you will be evaluated on your understanding of each parameter and its impact, as well as the joint impact of multiple parameters on system design goals.

The parameters that you can choose from are methodology (push/pull), randomization interval (short/long), number of retransmissions (small/large), choice of damping filter at the sink (max/median/mode/min) and history length in the damping filter (short/long).

Which combination of the above parameters (only one choice of each parameter) would you pick and why for the following design criteria? Please answer briefly in bullets or a couple of lines.(1 mark each)
  1. Minimize latency of detecting node failure.
  2. Minimize latency of recovering from new node grafting (dead node wakes up).
  3. Minimize energy in terms of number of transmissions and receptions.
  4. Minimize false positives (the percentage of time the sink detects failure when there is actually no failure).
  5. Minimize false negatives (the percentage of time the sink does not detect failure when there actually is one).



Submission Instructions

To spare further confusion regarding submission instructions, I will stick to the same scheme that you used for the previous assignment. You should submit ONE copy of the code. If you have multiple behaviors, please invoke them via command line options and document them.

Please submit your homework as a .tgz file using the following scheme:
  1. Submit your written answers as a .doc, .txt, or .pdf file.  Be sure to include your name and student ID at the top of the file.
  2. Name your file in the source tree, as emstar/cs213-W04/ps4-answers.XXX
  3. Tar up your portion of the source tree using these commands:
    cd emstar
    tar -czvf Last_First_213-W04-4.tgz cs213-W04
  4. Place the file in a web-accessible place and mail me the URL, or mail me for alternative arrangements.