Ahas:
1) Knowing the number of requests vs how many it tried to send vs... helpful!
2) Knowing how many of each type of packet is very helpful - often just a
   certain type of packet is not making it through!
3) Having events is not very helpful so far - only metrics are.
4) Giving the positives is helpful as well as the negatives - maybe have
   a file where I summarize positives? But TOO MUCH DATA - how do I make
   it a useful summary - what goes in a useful summary?
5) Setting check period to 2x request period.
6) BIST - even for sympathy - by running troubled_node checks!
   - just looking at metrics reported about each app

Solutions:
1) Start/stop topo-data

Problem:
1) Not getting sympathy responses, even from one-hop neighbors, but getting
   beacons from them.
   Measured:
   - number of requests they are getting (they were getting requests,
     they just weren't sending responses).
   - then added the number of times they tried to send but failed; also very few.

   Things that help:
   - nice to have space to put some external variables during
     pre-deployment debugging
   - should have sympathy periodically send metric info with these
     vars - can use this to see if sympathy is just not getting tasked or
     something else is going wrong.
   - use sympathy request/response as an indicator of whether motes can
     be reached and be tasked, because sympathy uses the same routing overlay

2) Not getting sympathy responses
   - see reported number of sympathy requests received - this needs to
     be periodically broadcast without being requested.

3) Not getting sympathy responses
   - periodically have metrics sent without request

4) Not getting sympathy requests:
   - answer questions:
     - are we getting regular automated metric returns from node?
       - if no: goto check_routing
       - if yes: is node transmitting requested metrics?
         - if yes: is the node transmitting sympathy events?
           - if no: is the node receiving sympathy requests?
             - if no:
             - if yes: are they continuing to increase even though the number
               of events transmitted is not?
               - if yes:
               - if no:

   - check_routing:

5) Fault detection: identify nodes that are not receiving sympathy
   requests, or not sending as many event updates as they have
   received - is this because their last event hours are not the
   same as others'?
   ** Separate number of request msgs rcvd into: number of requested
      events rcvd vs number of requested metrics rcvd, and
      specify number of requests sent out!
   ** for nodes that haven't sent many events:
      - check how many events were transmitted compared to
        how many were received!
      - for these nodes, find nodes with same next-hop
      - else check how many requests they received
      - else see if last-event time is just < time-awake (so maybe
        they just don't have events to send)

6) Too many errors (caught that with num errors) on nodes with node-2 as a next-hop,
   so added jitter to response to flooded response

7) Wasn't getting metrics or data from a node, BUT I was getting
   other data packets (beacons/neighbor-lists etc) - so added this to
   the test.

8) Observed performance degrade (i.e. stopped getting data), and then
   the node would reboot

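A minimal sketch of the question chain in problem 4, written as a function so the order of checks is explicit. The nesting is one reading of the notes, all field and function names are hypothetical, and the branches the notes leave blank return a label saying so rather than a guessed diagnosis.

```python
from dataclasses import dataclass


@dataclass
class NodeObservations:
    """Hypothetical yes/no observations about one node, as seen at the sink."""
    automated_metrics_arriving: bool       # regular unsolicited metric returns?
    requested_metrics_arriving: bool       # node transmitting requested metrics?
    events_arriving: bool                  # node transmitting sympathy events?
    receiving_requests: bool               # node reports receiving requests?
    requests_rising_but_events_flat: bool  # request count grows, event count does not?


def triage(obs: NodeObservations) -> str:
    """Walk the questions in order and return the first outcome reached."""
    if not obs.automated_metrics_arriving:
        return "check_routing"
    if not obs.requested_metrics_arriving:
        return "unspecified in notes: requested metrics missing"
    if obs.events_arriving:
        return "unspecified in notes: events arriving"
    if not obs.receiving_requests:
        return "unspecified in notes: node not receiving requests"
    if obs.requests_rising_but_events_flat:
        return "unspecified in notes: requests rising, events flat"
    return "unspecified in notes: requests not rising"
```

For example, `triage(NodeObservations(False, False, False, False, False))` stops at the first question and returns `"check_routing"`.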
Design:
start/stop symp-requests - command-line/status-device
check for reboot! (ensure it's not just a roll-over, but a reboot!)

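One possible way to tell a roll-over from a reboot, sketched under assumptions: the node reports a time-awake counter of fixed width (16 bits here, purely an assumption), and `max_step` is the largest increase expected between two consecutive reports. The names `COUNTER_BITS`, `classify_counter_change` and `max_step` are hypothetical.

```python
COUNTER_BITS = 16                 # assumed width of the time-awake counter
COUNTER_MODULUS = 2 ** COUNTER_BITS


def classify_counter_change(prev: int, curr: int, max_step: int) -> str:
    """Classify the change in time-awake between two consecutive reports."""
    if curr >= prev:
        return "running"          # counter advanced normally
    # Counter went backwards. If it merely wrapped, the distance travelled
    # modulo the counter size is a small, plausible step.
    wrapped_advance = (curr - prev) % COUNTER_MODULUS
    if wrapped_advance <= max_step:
        return "roll-over"
    return "reboot"               # backwards jump too large to be a wrap
```

With `max_step=100`, going from 65530 to 4 is classified as a roll-over, while going from 30000 to 5 is classified as a reboot. A reboot that happens just before the counter would have wrapped cannot be told apart this way, so a separate boot counter would be needed to cover that case.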
Why is WSN different:
- energy debugging, performance/network usage assertions,
  functionality debugging all collide -
  how do we express those needs/rules - even dynamically, how do you assess

X graph time awake
X graph its fault-category
Classify nodes:
- we are not getting data from node
- nodes have no data to send
- data is not getting through
- performance: nodes sending data, but we aren't getting a lot of it
- nodes that are fine

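A rough sketch of how the node classes above might be assigned from two counters. It assumes the node's own metrics report how many data packets it sent (`reported_sent`) and the sink counts how many it actually received (`sink_received`). Both names, and the `min_delivery` threshold of 0.8, are hypothetical.

```python
def classify_node(reported_sent: int, sink_received: int,
                  min_delivery: float = 0.8) -> str:
    """Assign one fault category to a node from its sent/received counts."""
    if sink_received == 0:
        # We are not getting data from the node: either it has nothing
        # to send, or what it sends is lost on the way.
        if reported_sent == 0:
            return "no data to send"
        return "data not getting through"
    if reported_sent > 0 and sink_received / reported_sent < min_delivery:
        # Node is sending, but only a small share arrives.
        return "performance"
    return "fine"
```

For example, a node that reports 10 packets sent while the sink saw 5 lands in "performance", and one that reports 10 sent with none received lands in "data not getting through".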