Optimizing Message Parsing in the Depth Controller Web UI

While preparing for the next revision of the Depth Controller firmware, I found that the web interface would start to show significant lag when a large number of zones were changing.  Some profiling narrowed this down to the interface between the depth camera backend (written in C) and the web frontend (written in Ruby):

Results (267.6715s elapsed):
Zone.new count: 22162 time: 110.78 each: 0.005 overall: 41.39% 
ovh_plot count: 693 time: 61.45 each: 0.089 overall: 22.96% 
ovh_png count: 693 time: 53.60 each: 0.077 overall: 20.03% 
kvp count: 22015 time: 48.95 each: 0.002 overall: 18.29% 
Zone.normalize! count: 22162 time: 28.21 each: 0.001 overall: 10.54% 
get_zones count: 123 time: 13.55 each: 0.110 overall: 5.06%

analysis

The Depth Controller web interface uses the same text-based protocol on port 14308 that is made available to users for custom integration.  As such, every time a zone's contents change, the web frontend has to parse a line of key-value pairs describing the change.  The Zone.new line measures the total amount of time spent parsing zone information from the backend.  This total includes time spent in kvp, which is the step of parsing a line of key-value pairs into a Ruby hash, and Zone.normalize!, which makes sure all of a zone's attributes have the right data type.
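For concreteness, here is a rough Ruby sketch of that parsing path.  Only the names kvp, Zone, and normalize! (and the fact that the original kvp was regex-based) come from this post; the wire format, field names, and coercion rules below are assumptions for illustration.

# Assumed wire format: one space-separated line of key=value pairs per change.
# kvp turns the line into a Hash of strings; normalize! then coerces the types.
def kvp(line)
  pairs = {}
  line.scan(/(\w+)=(\S+)/) { |key, value| pairs[key] = value }
  pairs
end

class Zone
  attr_reader :attrs

  def initialize(line)
    @attrs = kvp(line)
    normalize!
  end

  # Every value arrives as a String, so a second pass fixes the types.
  def normalize!
    @attrs["id"]    = @attrs["id"].to_i    if @attrs["id"]
    @attrs["depth"] = @attrs["depth"].to_f if @attrs["depth"]
  end
end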

The numbers show that, even with the web interface open and watching the overhead view, zone parsing takes more time than either of the image-processing steps that generate the overhead view (41.39% for zone parsing vs. 22.96% and 20.03% for overhead plotting and PNG compression)!  This code had already been through a couple of rounds of optimization before it ever reached the public, but it has become a bottleneck once again.  Attacking both kvp and Zone.normalize! should knock out almost 29 of those 41 percentage points (18.29% + 10.54% ≈ 28.8%).  Additionally, the get_zones task, which executes rarely but takes 110ms per call, includes some zone processing, so speeding up Zone.new will also shorten get_zones and cut down on intermittent jitter.

To understand how to remedy the situation, I benchmarked every approach I could find for speeding up kvp.  In addition to different kvp implementations, I tested JSON, YAML, and Ruby's eval, as well as a standalone Zone class that didn't rely on hashes at all.  After some experimentation, I wrote a simple state machine-based version of kvp in C that avoids regular expressions entirely and parses integers, floating-point numbers, and strings directly, eliminating the need for Zone.normalize! (a sketch of the approach follows the table).  These are the results on my desktop i7 (ARM is slower, but the relative results hold):

Implementation                           Parse full         Parse partial       Update zone
                                         zone line          zone line           with new data
Baseline kvp (Ruby regex, C unescape)    12804/s            23485/s             935222/s
Hash-free Zone class                     N/A                N/A                 83390/s (0.09x)
kvp 2 (C regex, C unescape)              7191/s (0.56x)     18003/s (0.77x)     N/A
kvp 3 (C state machine, C unescape)      65832/s (5.14x)    152892/s (6.51x)    N/A
JSON.parse                               46910/s (3.66x)    N/A                 N/A
YAML.load                                11707/s (0.91x)    N/A                 N/A
Ruby eval()                              24014/s (1.88x)    N/A                 N/A

(Speedups in parentheses are relative to the baseline kvp for the same task.)
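To show the idea behind the winner, here is a rough sketch of the state-machine approach.  It is written in Ruby for readability; the production version is a C extension, and the real grammar, quoting, and escape handling aren't shown in this post, so the details below (including the field names in the example) are assumptions.

# Single pass over the line: split keys from values on '=' and ' ', and decide
# each value's type while scanning, so no separate normalize! pass is needed.
def kvp_state_machine(line)
  zone   = {}
  key    = +""
  value  = +""
  digits = true    # the value scanned so far looks numeric
  dot    = false   # a decimal point has been seen
  state  = :key

  emit = lambda do
    zone[key] = if digits && !value.empty?
                  dot ? value.to_f : value.to_i
                else
                  value
                end
    key, value, digits, dot = +"", +"", true, false
  end

  line.chomp.each_char do |c|
    case state
    when :key
      if c == "="
        state = :value
      else
        key << c
      end
    when :value
      if c == " "
        emit.call
        state = :key
      else
        if c == "."
          digits = false if dot   # a second '.' means it is not a number
          dot = true
        elsif !c.between?("0", "9") && !(c == "-" && value.empty?)
          digits = false
        end
        value << c
      end
    end
  end
  emit.call unless key.empty?
  zone
end

With the assumptions above, kvp_state_machine("id=7 depth=1.25 label=door") returns {"id"=>7, "depth"=>1.25, "label"=>"door"} with the types already in place.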

conclusion

The state machine-based C version of kvp is the clear winner, delivering as much as a 6.5x improvement in raw parsing speed.  With all overhead accounted for, the new key-value parser processes zone updates 2.4 times faster than the original code.  The final word, though, comes from the results on the controller itself:

Results (274.1677s elapsed):
ovh_plot count: 1044 time: 74.07 each: 0.071 overall: 27.02%
ovh_png count: 1044 time: 60.15 each: 0.058 overall: 21.94%
Zone.new count: 20687 time: 38.20 each: 0.002 overall: 13.93%
unpack count: 1049 time: 19.26 each: 0.018 overall: 7.03%
kvp count: 21324 time: 11.68 each: 0.001 overall: 4.26%
get_zones count: 131 time: 5.89 each: 0.045 overall: 2.15%

There you have it.  With the new key-value pair parser, Zone.new drops from 41.39% of the run to 13.93%, taking its rightful place below the CPU-intensive image-processing tasks, and the web UI feels much snappier.  That is nearly all of the ~29-point savings we expected, accomplished without changing the backend protocol.  There are still optimizations to be made in the new kvp code, but if zone processing becomes a bottleneck again, I will switch to something like MessagePack (initial testing shows a further 2x speedup over the new kvp).
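Should that day come, the frontend side would be simple.  Here is a rough sketch using the msgpack gem; the zone fields are made up for illustration, and in practice the C backend would do the packing with its own MessagePack library while the Ruby frontend only unpacks:

require "msgpack"

# Hypothetical zone update, already typed by the sender, so the frontend
# needs neither kvp parsing nor a normalize! pass.
update = { "id" => 7, "depth" => 1.25, "label" => "door" }

packed = MessagePack.pack(update)    # compact binary string on the wire
zone   = MessagePack.unpack(packed)  # => {"id"=>7, "depth"=>1.25, "label"=>"door"}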