Tuesday, May 27, 2014

It's the little things that matter

Little things called bugs...

Silent crash on specific data

I started the past week by investigating a bug. It can be reproduced on the current dashboard:

Updating to the latest versions of d3 and nvd3 didn't do the trick, so I spent some time looking at the data and the call stack. This plot uses a library called nvd3 that is built on top of d3. The plot calls the voronoi method (voronoi layouts). This method returns an array of polygons, one for each input vertex in the specified data array. If any vertices are coincident or have NaN positions, the behavior of this method is undefined: most likely, invalid polygons will be returned.

In our case we've got points that are coincident, so... we get undefined behavior. The fix is to add a tiny amount of random noise to the data so that no two points coincide exactly.
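A minimal sketch of that kind of jitter, assuming the points are plain `{x, y}` objects (the real nvd3 data has a different shape, and `addJitter` is a name I made up for illustration):

```javascript
// Sketch: nudge every point by a tiny random offset so that no two points
// are exactly coincident before they reach the voronoi layout.
// `points` is assumed to be an array of {x, y} objects (hypothetical shape).
// `random` is optional and defaults to Math.random.
function addJitter(points, epsilon, random) {
  random = random || Math.random;
  return points.map(function (p) {
    // Each offset lies in [-epsilon / 2, epsilon / 2).
    return {
      x: p.x + (random() - 0.5) * epsilon,
      y: p.y + (random() - 0.5) * epsilon
    };
  });
}

// Two of these points are coincident before the jitter is applied.
var data = [{x: 1, y: 2}, {x: 1, y: 2}, {x: 3, y: 4}];
var jittered = addJitter(data, 1e-6);
```

The epsilon has to be picked relative to the scale of the data: if the coordinates are very large numbers (timestamps, for example), an offset that is too small is lost to floating point rounding and the points stay coincident.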

 Tooltip undesired behavior

Sometimes, on the current dashboard, the tooltip doesn't show up. After testing for a while I realized that if you refresh the page (and so get an empty svg) it works, but if you don't, the tooltip might not show up. Emptying the svg resolves this problem.
Until now we were preprocessing the data every time a filter was set, whether or not we had already processed the same data (x_x). It was about time to add some cache support. I added caching in both dashboard.js and telemetry.js: dashboard.js caches all the data that we prepared for plotting, and telemetry.js caches all the requests made (patch here).
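The request side of that caching can be sketched roughly like this. It is not the actual patch: `fetchJson(url, callback)` is a hypothetical stand-in for whatever function really performs the request, and `makeCachedFetcher` is a name made up for the example.

```javascript
// Sketch of a request cache. `fetchJson(url, callback)` is a hypothetical
// stand-in for the function that actually performs the request.
function makeCachedFetcher(fetchJson) {
  var cache = Object.create(null);    // url -> result that has already arrived
  var pending = Object.create(null);  // url -> callbacks waiting on a request

  return function cachedFetch(url, callback) {
    if (url in cache) {
      // Already fetched: answer from the cache, no new request.
      callback(cache[url]);
      return;
    }
    if (url in pending) {
      // A request for this url is in flight: just queue the callback.
      pending[url].push(callback);
      return;
    }
    pending[url] = [callback];
    fetchJson(url, function (result) {
      cache[url] = result;
      var waiting = pending[url];
      delete pending[url];
      waiting.forEach(function (cb) { cb(result); });
    });
  };
}
```

Keeping a list of pending callbacks means that two filters asking for the same file at the same moment still trigger only one request.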
Because we sync all the events on the page with the URL, the URL may get really long, so a "tiny url" is a nice-to-have feature. I looked at the free options and picked bitly for its JSONP call support. Because of the same-origin policy you can't make normal calls to a different domain, so either the server has to support CORS or you stick to JSONP.
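For anyone who hasn't met JSONP before, the trick is to load the other domain's response as a script that calls a function you registered. Here is a rough sketch of the mechanism; the `callback` query parameter name is an assumption (each API documents its own), `jsonp` is a made-up helper name, and this is not the bitly client code:

```javascript
// Sketch of a JSONP call. The name of the query parameter ("callback") is an
// assumption; check the documentation of the API you are calling.
// `doc` and `root` default to the browser's document and window and are
// parameters only so the helper can be exercised outside a browser.
function jsonp(url, onData, doc, root) {
  doc = doc || document;
  root = root || window;
  var name = "jsonp_cb_" + Date.now() + "_" + Math.floor(Math.random() * 1e6);
  var script = doc.createElement("script");

  // The server answers with a script that calls this global function.
  root[name] = function (data) {
    delete root[name];  // one-shot callback
    if (script.parentNode) {
      script.parentNode.removeChild(script);
    }
    onData(data);
  };

  script.src = url + (url.indexOf("?") === -1 ? "?" : "&") +
               "callback=" + encodeURIComponent(name);
  doc.head.appendChild(script);
  return name;
}
```

Since the response is executed as a script, JSONP only makes sense with a service you trust.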
New legend
A nice clean legend for several series was on my list last week. I tried to implement it with d3 and didn't get the result I wanted, so I switched to jQuery and added some d3 formatting.

Last but not least... 

Fresh air (or steam) comes with fresh ideas. Check out Yellowstone :)


Tuesday, May 20, 2014

A step backward, after making a wrong turn, is a step in the right direction..

The version* feature

As a nice-to-have feature we could aggregate all the data in a single series for all nightly/aurora/etc.
I spent one day on this matter and... it can be done... but let me explain why not:
At this point, when we make a request for a specific version/series/etc., we get three JSON files:
* filter-tree.json
* histograms.json
* [measure]-[by-build-date/-by-submission-date].json

histograms.json holds all the measures, while filter-tree.json holds a tree of filters.
After traversing the filter tree and selecting the filter ids, we take the list of ids and look at each entry in [measure]-[by-build-date/-by-submission-date].json. Whenever we find a corresponding id, we append its data, which is plotted afterwards.
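To give an idea of the work involved, here is a sketch of that lookup. The JSON shapes in it are simplified and hypothetical (the real storage format is less intuitive than this), and `collectFilterIds` and `selectEntries` are names invented for the example:

```javascript
// Hypothetical, simplified shapes (NOT the real storage format):
//   filter tree node: { id: Number, children: { optionName: node, ... } }
//   measure entries:  [ { filterId: Number, date: String, values: [Number] } ]

// Walk the tree along the selected options, one option per level, and
// collect the ids of the nodes reached. "*" selects every option at a level.
function collectFilterIds(node, selections, depth, out) {
  depth = depth || 0;
  out = out || [];
  if (depth === selections.length) {
    out.push(node.id);
    return out;
  }
  var wanted = selections[depth];
  Object.keys(node.children || {}).forEach(function (option) {
    if (wanted === "*" || wanted === option) {
      collectFilterIds(node.children[option], selections, depth + 1, out);
    }
  });
  return out;
}

// Keep only the entries whose filter id was selected.
function selectEntries(entries, ids) {
  var idSet = {};
  ids.forEach(function (id) { idSet[id] = true; });
  return entries.filter(function (entry) {
    return idSet[entry.filterId] === true;
  });
}
```

Every entry of the measure file gets checked against the selected ids, and that has to happen once per version on the plot.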

Imagine doing that for all versions and a large quantity of data. Remember that the processing is done on the client side (maybe on a slow device?)... it might end up in flames :(...
End of version* story for now...(it could always be done on server side)

Several series. Synchronize with top filter

If you want to preserve all your selections for another version you can do that by synchronizing with the first filter and the selection will get propagated on all other filters. 

Because I added "sync with hash" now you can land on the dashboard in two ways:

* by specific state (example: /index.html#nightly/32/CYCLE_COLLECTOR&nightly/25/CYCLE_COLLECTOR)

In this case the filters are not synchronized. If you want to synchronize them you just click the lock button.

* by a default landing

On the default page the sync with first is enabled. If you want to disable the sync you just push the lock button once.
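A sketch of how a state like the one in the example URL can be read from and written back to the hash. The field names `channel`, `version` and `measure` are my reading of the three segments, and `parseHash`/`buildHash` are illustrative names rather than the dashboard's actual functions:

```javascript
// Sketch: turn "#nightly/32/CYCLE_COLLECTOR&nightly/25/CYCLE_COLLECTOR"
// into a list of series descriptions and back again.
// The field names (channel, version, measure) are assumptions.
function parseHash(hash) {
  var body = hash.replace(/^#/, "");
  if (body === "") {
    return [];  // default landing: nothing encoded in the URL
  }
  return body.split("&").map(function (part) {
    var fields = part.split("/");
    return {
      channel: fields[0],
      version: fields[1],
      measure: fields.slice(2).join("/")
    };
  });
}

function buildHash(series) {
  return "#" + series.map(function (s) {
    return [s.channel, s.version, s.measure].join("/");
  }).join("&");
}
```

In a browser the state would be read from `window.location.hash` on load, and an empty hash corresponds to the default landing.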

Sync with state

In this version you can preserve your options for another time by saving the url from the dashboard and opening it in a browser afterwards. This week's version will come with a refactored and more detailed one using some jQuery magic. It will get a tinyUrl option too so it can be easily dropped on irc.

After adding multiple versions, at some point my dashboard started dying... I began profiling and I got lucky.

In the previous version we used a library called Bootstrap: we had a default implementation for a simple selector that was extended with some shiny bootstrap magic. All was awesome till we started adding several selectors...
Multiple things might be the cause:

 * big overhead for computing and aggregating the data
 * d3/nvd3 lib 
 * selectors
 * bad synchronization 
 * me writing horrible and unoptimized code

The last three were a quick fix :D
After switching to a different library, modifying the selector and resolving some bugs I ended up with this comparison:

If you take the same mix of series on load and look at the time spent in the selector you get interesting findings:

Select2 FTW!

Temporary fixes and D3

On the production dashboard the plot comes with a very nice legend that turns unusable when you add several versions. In this week's version I had to fix that. I removed the legend and put it aside in a scrollable svg. It turns out that even so it is cluttered. I removed the information that doesn't make sense for multiple versions: Summary & Histogram. The functionality is still there, but it is visible only when there is a single filter or when you land on the default page.

This week

Today I had an interesting discussion with one of the users and it turns out that the Summary and Histogram would be nice to have for multiple versions too. A zoom in the histogram would be nice :D.
I think I figured out why D3 crashes when it gets some particular type of data... (production bug)

I went for a quick fix, but that is not a long term solution. Still looking for some other options. My guess is that this bug triggers the tooltip undesired behavior too. Getting to the bottom of this might fix more than one problem.

Monday, May 12, 2014

The Good, the Bad and the Ugly

I just finished my second week as "the intern" that works on Telemetry Dashboard.

After my first blog post I got some feedback that made me switch between tasks.
Although dygraphs sounds like an option and d3 can be a pain in the ass, I switched back to the latter for flexibility reasons (d3 can be pretty powerful, you know :p).

I spent a day or two trying to figure out a way to make the bootstrap selectors for versions/measure take multiple versions. The selector was written "one version/measure at a time" and because all the filters were tightly linked together the code got uglier and uglier... till my eyes almost started bleeding, so I said... enough!... let's find a fast workaround... and maybe on the way we get things done even better :p

The "right in the face" option was to add the same kind of selector for each new version dynamically. 
I needed some new elements and functionality: add/remove buttons.

This way we comply with the feature request: "add multiple versions on the same plot" and also we can add different measures for each specific version (if that might be the case).
Next step was to add a new multiple selector so that we can automatically select the submission/percentile/etc over several versions:

Next steps this week will be trying to add "the * feature", adding support in telemetry.js for aggregating all data over all versions, and reworking some other details that make sense at this point but don't in the context of multiple versions (histogram/summary).
If I get the time I would also like to investigate why in some cases (maybe because of the data) d3 crashes.

Monday, May 5, 2014

Teh foxy telemetry dashboard...a first week at Mozilla

My story starts with the Telemetry Dashboard at Mozilla...

I just started working as an intern at Mozilla for the Performance team and my first task was to improve and add some features to the Telemetry Dashboard.

Keeping the story short

Telemetry submissions are gathered into large files and then uploaded to S3. The analysis framework then downloads them, unpacks them and runs an aggregation over the submissions, one file at a time.
When a set of compressed files from S3 containing telemetry submissions has been aggregated, the aggregates are merged into a public S3 bucket.
The telemetry dashboard fetches the aggregated results from the public S3 bucket. To make it easy to consume this in a browser, everything is stored in JSON, but for storage efficiency reasons (and bandwidth) the storage format is not very intuitive.

To allow people to access the data without having to read a complicated JSON format, one can use telemetry.js. The idea is that people who write dashboards and present the results of the aggregated data should access it through this library. That way, if we ever want to, we can change the server-side storage format (this is something we will be doing) and still have a working dashboard.

Usability, use cases and feature requests


Feature request one:
To see aggregated results one has to specify channel, version and measure.
There are two plots on the side; the top one shows the evolution of the measure over "build dates" or "calendar dates". It does so by plotting mean, submissions, and percentiles over time.
However, there is currently no way to show the evolution over multiple versions.
Feature request two:
It would be useful to have the latest build version as a default one so there wouldn't be an additional step of selecting from all the nightly versions to get the latest updates on the current version.
Feature request three:
Get all the selected options as parameters in the url.
Feature request four:
Add checkAll/uncheckAll buttons so you don't need to select all the features one by one.

First week updates


As I started looking at the code from dashboard.js, and after talking to my mentor, we came to the conclusion that some major changes might be needed, as the dashboard itself is designed to support only one version at a time.
I started looking for some open source libraries (a nice post on this topic can be found here) and played with JavaScript as it is new to me. I looked at the dygraphs library and it seemed like a reasonable one to use.
I started by changing the histogram evolution diagram.
 I also removed the range selector as the new diagram provides zoom in by selecting range.

Next step was to dynamically add checkboxes for selecting the specific features that we need to plot ..(that made me learn about how damn neat jquery can be :p)..it is not that pretty yet but it works.


I added two buttons so that checking/unchecking all the features is just a click away.

By Friday I added data (test data) for several versions on the graph. (Because the same duplicated series is given as input, the graph looks as if it draws the measures for only one version, but if you look at the labels, two identical series are present.)

Bottom line: I started doing some neat stuff last week at Mozilla aaaand got a photo with the Fox at the Firefox 29 release :D

 This week I will be refactoring the code and I will implement the missing parts. I am pretty sure I've got a loooooot to work/learn this week :D.
BTW my live playground is hosted here.