
File: [CENS] / tos-contrib / sympathy / TODO (download)
Revision: 1.33, Sat Mar 26 22:42:33 2005 UTC (4 years, 8 months ago) by nithya
Branch: MAIN
CVS Tags: rdd_alpha_version_1, pregeonet, acoustic-05-18-06, PRE_TOSNIC_FIX, PRE_64BIT, MOTENIC_PRE_BUGFIX_20050415, LAURA_CALIBRATION_EXPERIMENTS, HEAD, ESS_RELEASE_3_5, ESS_RELEASE_3_4, ESS_RELEASE_3_3, ESS_RELEASE_3_2, ESS_RELEASE_3_1, ESS_RELEASE_3_0, ESS_RELEASE_2_0, ESS_CONNECTIVITY, ESS_CENTROUTE_TESTING, ESS2-CMS-V1_5_pretest, ESS2-CMS-V1_4cMergeSympathy_2, ESS2-CMS-V1_4c, ESS2-CMS-V1_4b, ESS2-CMS-V1_4a, ESS2-CMS-V1_3, ESS2-CMS-V1_2, ESS2-CMS-V1_1, EMSTAR_RELEASE_2_5, CYCLOPS_RELEASE_CANDIDATE_2_0, CYCLOPS_PRERELEASE_STABLE, CENTROUTE_EMSTAR_SOCKETS, BG_1_0, BANGLADESH_ARSENIC_1_2, BANGLADESH_ARSENIC_1_1, AMARSS_JR_DEPLOYMENT_6_05_07
Changes since 1.32: +4 -0 lines
Working version - node reboot/congestion detected,
node dying not detected because packet-dropping is not working yet.

for paper:
- explain iterations (epochs) in order to ensure some state synchronization
- explain using time-awake to differentiate between reboot/rollover/stale packets
- failure scenarios:
    *** excessive congestion and 2-hop nodes result in insufficient data
	delivered at the sink - correlate unexpected events: when a node
	has a next-hop with a fault, mention that!

Use #times tried to tx as additional indication of congestion.

Failure Scenarios:
- don't receive data from node - but it's because we're not receiving data from its
	next-hop 
	- also nobody claims that node as a neighbor!

INITIAL TESTED:
- when jitter is disabled, congestion is detected; when it is enabled,
   congestion goes away (I think?)
=> add: no nodes claim node as a neighbor as a fault?
	- results in a fault if nothing else

DONE - add to narrative/design file
too much congestion:
	- check if multiple nodes sharing the same next-hop
	(i.e. all nodes w/ the same next-hop) have lots
	of errors: then it's probably a congestion issue
	- this is a fault - can correlate with not enough data?
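
Rough sketch of the shared-next-hop check above, in Python for illustration
only (the real code is TinyOS/EmStar). The error threshold, the "2 or more
nodes" rule and the input layout are assumptions, not from the design:

```python
from collections import defaultdict

# Hypothetical threshold: a node has "lots of errors" above this count.
ERROR_THRESHOLD = 10


def congested_next_hops(nodes, error_threshold=ERROR_THRESHOLD):
    """Return next-hops whose users (2 or more) ALL show lots of errors.

    `nodes` maps node_id -> (next_hop, error_count).
    """
    errors_by_next_hop = defaultdict(list)
    for node_id, (next_hop, errors) in nodes.items():
        errors_by_next_hop[next_hop].append(errors)
    return sorted(
        next_hop
        for next_hop, errs in errors_by_next_hop.items()
        if len(errs) >= 2 and all(e > error_threshold for e in errs)
    )


report = {
    3: (1, 25),  # nodes 3 and 4 share next-hop 1, both lossy
    4: (1, 31),
    5: (2, 40),  # nodes 5 and 6 share next-hop 2, only one is lossy
    6: (2, 0),
    7: (8, 50),  # only user of next-hop 8: not enough to blame congestion
}
print(congested_next_hops(report))  # [1]
```

A single lossy node under a next-hop is left alone here, since that looks
more like a link problem than congestion.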

Faults detected:
- faults can be detected from 1-hop away
   * too much congestion
- faults can be detected from anywhere
  * all others

add conditions: too much congestion, route-flapping
async notification of route-flapping


Testing Plan:
Node rebooting
Errors induced by Packet-Loss:
INSUFF DATA: Excessive packet loss due to channel/hidden terminal/unsynchronized
radios - this is addressed by the first option (telling a single node to
drop x% of packets from another node).

** NO DATA: node doesn't have next-hop so can't send data!! but neighbors have heard
	it so we know it exists!
NO DATA: A node that doesn't have data to send - this should be done at the
sender: have the sender not send data out (this could happen at the comm
layer, or even at the application layer - may be more interesting at the
application layer - that way we could have the situation where one type of
packets is being sent - but another type is NOT being sent).

NODE DIE: A node that has died - this should also be done at the sender: have the
sender not receive/transmit any packets.

	- dont care much about point failures - care if a failure
		is persistent - how do we track this??
For tests:
add asynchronous events:
- too many errors all w same next hop
- routing a pkt for a node that's not a neighbor
- route flapping

if haven't heard from a node then:
  - see how long it takes if we wait for nodes to drop it off their neighbor lists
  - flood request to get info on node (nodes that have routed pkts for it, the
  	node itself, and neighbors should respond with:
	num-pkts routed for node, last time it heard pkt from node) - involves
	tos-code that snoops lowest layer and understands multihop packets as
	well as app-layer pkts to determine original source
  - if no nodes report it as a neighbor then record that!
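
Possible way to fold the flood responses together at the sink (Python sketch;
the response tuple layout and field names are made up for illustration):

```python
def summarize_responses(responses):
    """Combine flood responses about one silent node.

    `responses` is a list of tuples (hypothetical layout):
        (responder_id, pkts_routed, last_heard, is_neighbor)
    where last_heard is a timestamp, or None if the responder never heard it.
    """
    heard_times = [r[2] for r in responses if r[2] is not None]
    return {
        "pkts_routed": sum(r[1] for r in responses),
        "last_heard": max(heard_times) if heard_times else None,
        # False here is the "no nodes report it as a neighbor" case to record.
        "claimed_as_neighbor": any(r[3] for r in responses),
    }


summary = summarize_responses([
    (3, 12, 100.0, False),
    (4, 0, None, False),
])
print(summary["claimed_as_neighbor"])  # False
```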

command-logistics:
add ability to trigger a check to status-device
ability to interactively "ping" a node - basically send a request for
	metrics! command: ping=<node_id>
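
Sketch of parsing the ping command string (Python, illustration only). Assumes
node_id is a plain decimal integer, which the notes don't actually say:

```python
def parse_ping(command):
    """Parse 'ping=<node_id>' and return the node id as an int.

    Raises ValueError for any other command or a non-integer id.
    ASSUMPTION: node ids are written as decimal integers.
    """
    name, sep, value = command.strip().partition("=")
    if name != "ping" or not sep:
        raise ValueError("expected ping=<node_id>, got %r" % command)
    return int(value)


print(parse_ping("ping=12"))  # 12
```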

Trigger self-tests: each node should determine if it's getting what it needs
	for each module (i.e. ts pkts for ts, beacon pkts for routing, 
	dse-queries for application layer) - each module can export an
	interface: trigger test and respond with yes/no.
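
One shape the per-module "trigger test, respond yes/no" interface could take
(Python sketch; class names, the packet-count criterion and the minimum of 1
are placeholders, the real interface would be nesC/C):

```python
class SelfTestable:
    """Hypothetical per-module interface: trigger a test, answer yes/no."""

    name = "module"

    def self_test(self):
        raise NotImplementedError


class PacketCountTest(SelfTestable):
    """Passes if the module has seen at least `minimum` of its packets."""

    def __init__(self, name, received, minimum=1):
        self.name = name
        self.received = received
        self.minimum = minimum

    def self_test(self):
        return self.received >= self.minimum


def run_self_tests(modules):
    """Trigger every module's test and collect the yes/no answers."""
    return {m.name: m.self_test() for m in modules}


modules = [
    PacketCountTest("ts", received=4),       # ts pkts for ts
    PacketCountTest("routing", received=0),  # beacon pkts for routing
    PacketCountTest("dse", received=2),      # dse-queries for the app layer
]
print(run_self_tests(modules))
```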

record children (you know implicitly) as well as parent (next-hop)

Add set_next_hop() primitive that checks if data-loss is due to next-hop
	selection - only do this if all nodes using a next-hop are losing
	data
- check more info on surrounding nodes using same nh

Later Design
upon reboot - save last 5 events, and line number of segfault, 
	and number of seg-faults
	- flood help message after seg-faulting > 1 time w/events
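
Sketch of the bookkeeping for this (Python, illustration only). Keeping the
state across a reboot is not shown; on a real node it would have to live in
non-volatile storage. The help-message layout is made up:

```python
from collections import deque


class CrashLog:
    """Keeps the last few events plus seg-fault bookkeeping."""

    def __init__(self, max_events=5):
        self.events = deque(maxlen=max_events)  # older events fall off
        self.segfault_count = 0
        self.last_segfault_line = None

    def record_event(self, event):
        self.events.append(event)

    def record_segfault(self, line):
        self.segfault_count += 1
        self.last_segfault_line = line

    def help_message(self):
        """Message to flood once we have seg-faulted > 1 time, else None."""
        if self.segfault_count > 1:
            return {
                "segfaults": self.segfault_count,
                "line": self.last_segfault_line,
                "events": list(self.events),
            }
        return None


log = CrashLog()
for i in range(7):
    log.record_event("event-%d" % i)
log.record_segfault(line=120)
first = log.help_message()   # only one seg-fault so far: nothing to flood
log.record_segfault(line=120)
second = log.help_message()  # second seg-fault: help message w/ events
```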

Experiments
-----------
- ping-quality:
   are pings an accurate representation of link-quality - couldn't there be
   a link that is allowing pings through but not larger data-packets through?
   add a test where i compare the 'ground-truth' with measured link-quality to
   number of packets making it through.
- Are there situations where a sink does not get data from a node, but its neighbors
can still talk to it? what are these situations?

Abstract Issues
- Don't make thresholding static: number-msgs needed before declaring
  a problem - instead, measure average time you are hearing from all
   nodes, and if one node is some sigma away from this then it is a
   problem. or if a node is irregular.

- Figure out a way to determine if number beacons corresponds with
  actual long data packets that are getting through.
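
Sketch of the non-static threshold from the first point above (Python,
illustration only). The 2-sigma default and the "average seconds between
messages" input are assumptions; the "irregular node" case is not covered:

```python
from statistics import mean, pstdev


def outlier_nodes(intervals, n_sigma=2.0):
    """Return nodes whose average hearing interval is far from the rest.

    `intervals` maps node_id -> average seconds between messages heard.
    A node is flagged if it is more than n_sigma standard deviations
    away from the mean over all nodes.
    """
    values = list(intervals.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # every node looks identical: nothing stands out
    return sorted(
        node for node, v in intervals.items()
        if abs(v - mu) > n_sigma * sigma
    )


heard = {1: 10, 2: 11, 3: 9, 4: 10, 5: 10, 6: 60}
print(outlier_nodes(heard))  # [6]
```

Note the outlier itself pulls the mean and sigma up, so with few nodes a
median-based measure may behave better.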

What I know:
------------
mh-hdr = 8B
when sink calls flood, each node's NodeI.recvFlood interface gets called -
     len of data is len of data sent (so nodes don't get the multihop_hdr)

7th byte = type (5: DSE, 6: SYMPATHY)
mh data starts on 9th byte (bytes are counted from 1)
packets passed up to dse/sympathy are TOS type 4 (TOS[4,125]),
beacons are TOS-1, and the debugging packets sent by dse are type TOS:2

DSE packets:
byte 9,10 = seqNo
byte 11,12 = srcAddr
byte 13 = type (1:TOPO_DATA)
byte 14 = pkt-length
if TOPO:
byte 15 = len of next-hop data
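
Sketch of decoding the DSE fields listed above (Python, illustration only).
Byte positions come from these notes; little-endian byte order for the 2-byte
fields is an ASSUMPTION, as are the function and key names:

```python
import struct

MH_HDR_LEN = 8          # mh-hdr = 8B, so 1-based byte 9 is offset 8
DSE_TYPE_TOPO_DATA = 1  # byte 13 = type (1:TOPO_DATA)


def parse_dse(packet):
    """Decode DSE fields from a full multihop packet (mh-hdr included)."""
    if len(packet) < MH_HDR_LEN + 6:
        raise ValueError("packet too short for DSE fields")
    # 1-based bytes 9-10, 11-12, 13, 14 -> offsets 8-9, 10-11, 12, 13.
    # "<" = little-endian: ASSUMED, not stated in the notes.
    seq_no, src_addr, dse_type, pkt_len = struct.unpack_from(
        "<HHBB", packet, MH_HDR_LEN
    )
    fields = {
        "seq_no": seq_no,
        "src_addr": src_addr,
        "type": dse_type,
        "pkt_len": pkt_len,
    }
    if dse_type == DSE_TYPE_TOPO_DATA and len(packet) >= 15:
        fields["next_hop_data_len"] = packet[14]  # 1-based byte 15
    return fields


# Made-up packet: 8 zero header bytes, then the DSE fields.
sample = bytes(8) + bytes([0x34, 0x12, 0x02, 0x00, 0x01, 0x07, 0x03])
print(parse_dse(sample))
```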

ext_type = 1 for Beacon
ext_type = 2:
  struct is: uint16_t node_id, 
	     uint8_t egress quality
sympathy-req:
>>  src=0.0.0.2 dst=255.255.255.255 type=TOS[4,125] data_len=9 rssi=93 time=...
0000: 02 00 FF FF 01 00 06 03 03   
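
Checking the dump above against the layout under "What I know" (Python sketch,
illustration only; function name is made up). Only the type byte and the data
offset are taken from the notes, the other header bytes are left undecoded:

```python
MH_HDR_LEN = 8
MH_TYPES = {5: "DSE", 6: "SYMPATHY"}  # 7th byte = type


def split_multihop(packet):
    """Return (type name, mh data) for a packet that includes the mh-hdr."""
    if len(packet) < MH_HDR_LEN:
        raise ValueError("packet shorter than the 8-byte mh-hdr")
    mh_type = packet[6]  # 7th byte, counting from 1
    return MH_TYPES.get(mh_type, "UNKNOWN"), packet[MH_HDR_LEN:]


dump = bytes.fromhex("02 00 FF FF 01 00 06 03 03")
print(len(dump))             # 9, matches data_len=9
print(split_multihop(dump))  # ('SYMPATHY', b'\x03')
```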
