CS 213: Advanced Topics in
Distributed Embedded Systems
Assignment 1: Getting Started with Em*
Due Date: January 19th, 11:59pm PST
This assignment will allow you to gain familiarity with Em*, the system
framework that many of you will use to build your projects. It will require
some setup, some implementation, and some testing to complete it.
As we have discussed in class, Em* requires Linux 2.4 with DevFS installed.
The Em* source code is freely available via anonymous CVS, and we are also
providing you a supplemental tar file of additional files to get you started
on the assignment. If your home machine runs Linux with DevFS, you can use
it for this assignment. This would have its advantages, but beware that
the installation might be time-consuming. If you want to give it a try,
see this page.
For people who are not going to be using their own equipment, we have several
machines available for student use. Each student has an account on the system
with plenty of disk space. Your account name is the same as the one you
supplied as your email address.
NOTE: If you are using
the class machines for simulations, be warned that there may be contention
for their use as the deadine approaches. So START EARLY!!
Part I: Setup
Step 1:
Log in to your account on the simulation
machines. There are three machines available for use, "tako",
"tobiko" and "dragon". To get to them, you must first log in through
the firewall gateway machine, "toro". Access is permitted via
ssh. See this page
for more detailed information. Only tako, tobiko and dragon have the
correct configuration to build and run the Em* system.
It's
best to log in from a machine that can support X-windows, so that
you can export the viewer program, and multiple xterm windows.
Step 2:
Log in to tako, tobiko or dragon, and
download the source code via CVS (see this page for a detailed
explanation). Then, download the additional files, install
them:
cvs -d :pserver:anoncvs@www.cens.ucla.edu:/home/cvs/cvsroot
login
(When prompted for a password, type anoncvs)
cvs -z9 -d :pserver:anoncvs@www.cens.ucla.edu:/home/cvs/cvsroot co
emstar
wget http://www.cens.ucla.edu/213/213-ps1.tgz
tar -xzvf 213-ps1.tgz
NOTE: *** The CVS server is called "www.cens.ucla.edu" when INSIDE the
firewall (e.g. from tobiko), but it is called "cvs.cens.ucla.edu" when OUTSIDE
the firewall. Sorry for the confusion. ***
Once you have the files downloaded, build the system. The first "export"
line is required each time you log in before building (you can add it to
your .bashrc file if you prefer).
export PKG_CONFIG_PATH=/usr/lib/pkgconfig
cd emstar
make
The 'make' will do a lot of work, and should complete with no errors. All
executables, object files, and libraries are stored in an architecture-specific
directory: emstar/obj.i686-linux. Within this directory, subdirectories
contain the executables you compile.
The next step is to try running something.
Step 3:
When people run Em* software on a shared simulation machine, they must
share the same device filesystem and other resources. This sharing is accomplished
through "groups". Each user should select a unique group number. To select
one, ls /dev/sim and choose a number that is not already present. You can
then use this group number for all of your future work.
Once you have your group number, you can run the Em* viewer and a small
test program. To run the viewer, start an xterm, and run the following command
in it:
cd obj.i686-linux/emview
./emview -g <group number> -a localhost
The viewer will start in a window with several menus. Now, let's start
something to view. You can run the Assignment 1 framework code that we supplied.
In a new xterm:
cd obj.i686-linux
emrun/emsim ../cs213-W04/testtabs/count-test.conf 25 20
After the simulation starts up, nodes should appear in the window. Select
Layout/Reported to show the nodes in their "real" locations. Select Modules/Examples/Count
to enable the "count" visualization scheme. This scheme was designed to
visualize the state of the code you will write in this assignment.
Now, trigger activity and watch the visualization respond:
echo hops=20 > /dev/sim/group<#>/node1/count/status
You should see a flood tree form, and little dots flicker on and off.
Part II: Implementation
For this part of the assignment, you will need to implement an application
that counts the number of nodes in the system. To help you get started,
the tar file we provided you contains a skeleton program with a few missing
pieces, in cs213-W04/ps1.
The algorithm we use in this assignment makes use of the Em* flooding component.
Through a command to a device file, one of the nodes is designated the "sink".
The sink floods out a request message some number of hops to reach all other
nodes in the system. Each node, upon receiving the request message, will
flood a reply message. All nodes receiving flooded replies count the number
of unique senders and report this number through a status device file.
(Yes, flooding back is not very efficient, but we're trying to start out
simple! In Assignment 2 we will implement a more efficient algorithm.)
IMPORTANT: We are using a simulation environment in this assignment.
However, the object of Em* development (and of this assignment) is to develop
code that works outside of simulation. While the simulation and the
visualizer both have an omnicient view of the system, and the nodes are physically
on the same machine, you must NOT leverage any of those artefacts of simulation
to solve your problem. When you are writing your code, think of it as
running on a real node, separate from the others.. the point of Em* is that
it could be deployed unchanged.
The lines written in red indicate questions to answer in your solution.
Step 1:
Familiarize yourself with the code provided:
- count_i.h: Implementation header file, contains message formats and
internal data structures that are not a part of the device interface.
- skeletons/include/count.h: Interface header file, contains device
file names and binary formats of reports on the device file.
- count.c: Main application file, contains the main() function, event
initialization, device interface code, and utility functions.
- count_net.c: Contains the packet processing functions, and the counting
algorithm.
There are three places in the code (marked with '$$$') where you will need
to add code to complete the program.
Step 2: (20 points)
Complete the missing code segments:
- count_send_reply() needs to send a flooded reply.
- count_handle_reply_packet() needs to process a flooded reply packet
- main() needs to add another event constructor to listen for flooded
reply messages.
In order to speed the process, we have made sure that the existing framework
runs, so you can see how it works. In addition, the emview visualization
will show the progress of your program. Each node icon displays information
from that node's status devices. The number above the node icon tells the
number of nodes counted so far. The number inside the node icon tells the
number of hops remaining when the flooded message arrives at that node. The
blue indicator tells that the node has triggered a send. The red indicator
tells that the node has recently transmitted data, and the green tells that
the node has recently received data. The arrows represent the reverse of
the paths taken by the flooded request message.
To test and debug your program, run emsim with the count-test.conf config
file, and small numbers of nodes (e.g. 5 or 10). The first argument is the
number of nodes, the second argument is the width (NOT the area) of the
square field the nodes lie in.
emrun/emsim ../cs213-W04/testtabs/count-test.conf 10 10
Note: If you specify too large a field, some nodes may be out of range
of the sink.
Helpful hints and facts about Em*:
- Debug using smaller numbers of nodes
- Use the elog() function to emit log messages at various loglevels.
- Cat /dev/sim/groupX/nodeY/emlog/count/all to view recent log messages
emitted by the count program. Cat all-f to see a streaming log output.
- Cat /dev/sim/groupX/nodeY/count/status to see the current status
of a count process. Augment the status report with additional information
by modifying the status callbacks in count.c
- Use "echocat -w" to 'watch' a device. Echocat appears in obj.i686-linux/libdev
- Use the CENS LXR site
to quickly locate symbols, structure definitions, etc.
- See the Em* Documentation Pages here for more information.
- If your code is crashing, try attaching to it with gdb. Run
gdb and instead of entering the "run" command, enter "attach <pid>".
- If your code is crashing, you will lose log messages after it crashes,
unless you configure it to be type "daemon" in the emruntab count.run.
You can also set up a streaming log (cat all-f) before it crashes, which will
report the messages up to the crash.
Submit your final source code as part of your
solution.
Step 3: (5 points)
Now that you have your system correctly computing node counts, you will
need to characterize its performance. We will use latency as the metric
of performance, as a function of number of nodes. But we will first examine
the effect of collisions on the node count system.
The count-test.conf file you used in step 2 does not simulate contention
in the radio channel (i.e. packet collisions). Collisions can be enabled
by adding a -c option in the command line for sim_mote in count-test.conf.
(We have done this for you in the count.conf file.) Try running your system
with collisions enabled.
What happens when collisions are present?
Step 4: (10 points)
How can you adjust the timing of the system to fix it? Implement a fix.
Can you implement a fix that is independent of node count and density?
Note: A delay parameter can be submitted in the user's command to /dev/count/status.
In the code, this parameter is ignored, but you can use it to tune your algorithm
without recompiling.
Note: We are looking for an adaptive
solution that is independent of node count. It should also not depend
on tuning from the user to scale, i.e. your code should be run with the same
parameters for all runs, e.g. delay, max_hops, and any others. It should
also be independent of any artefacts of the simulator, for instance the fact
that node IDs are small contiguous numbers, or the fact that all nodes are
in fact running on the same CPU. Basically, we want a solution that
requires no tuning of any kind, gives an accurate count in a range of conditions,
e.g. varying density, network size, topology, and would run in a real distributed
system of nodes rather than a simulator.
Describe how your timing scheme works. What assumptions
(if any) does your solution make? When would it break? Is there a tradeoff
between latency and accuracy?
Step 5: (15 points)
Measure the time required to acquire a complete count as a function of
the number of nodes. Test with node counts 5, 10, 15, 20, and 25. Be sure
to run your tests at a constant density, by adjusting the field size.
Remember that the field size is the width of a square field, so you want to
set it to sqrt(nodes / density) where density is nodes/unit area.
You may want to make several runs and average the results. You may
also want to try runs at several different densities.
Can you formulate an approximate model for the latency?
Submit the latency data, any relevant parameters
for each test, and your model.
Extra credit:
- How do you know when to stop counting? (1/2 point)
- What are the ramifications of new nodes appearing and existing nodes
disappearing? (1/2 point)
- Discuss how these issues influence the design of systems in this
space. (1 point)
- How does density affect system performance? (2 points)