Some Notes on Optimizing a D3 Visualization

After noticing that a visualization had been getting a burst of traffic over the last week or so, I decided to make an optimization pass on it.

The visualization uses d3 and requires several large-ish data files. The updated site is viewable here or here. The pre-optimized site can be viewed here.

Details are further below (as well as a list of tools used), but a comparison of the "before" and "after" for page loading is shown in the short video below (it might take a moment for the iframe to load). This was built easily via the awesome http://www.webpagetest.org.

Comparison of Page Load Before and After Optimization Pass
You can also view this directly on webpagetest.org
(both tests done from Miami-FL with Chrome using http://www.webpagetest.org)

And, to put that in perspective, here's a comparison of page-loading for news.google.com to my updated site:

Comparison of Page Load of news.google.com to Updated Site
You can also view this directly on webpagetest.org
(both tests done from Miami-FL with Chrome using http://www.webpagetest.org)

And so we start...

The visualization requires that the user download several megabytes of data files, which are then processed a bit before starting the simulation. Much of the lag associated with this is unavoidable, but with this pass I wanted both to reduce it where I could and to let the user know as soon as possible that something is actually happening while these steps run. In the previous version of the visualization, it takes a while for a "Loading" message to appear, and the page can seem to just freeze.

One of the first things I used to help diagnose problems was Google's PageSpeed Insights tool (there is also a Chrome plugin that integrates with devtools). Seeing the initial results was a bit painful, but highly informative. I had heard of the recommendations it made before, but seeing them applied to my own page meant that maybe this time the rules stuck a bit.

There were a number of howlers right away:

  • Having blocking javascript downloads, which delayed when rendering could begin
  • Several of the data files were not being compressed by the server (in this case, googledrive, where some of the sites are hosted)
  • Too many separate javascript files, which were not minified

Data-wise, there were almost 8MB of data files that had to be downloaded before the visualization could begin, and the existing "Loading blah..." messages took a while to appear. Further, as noted above, some of the files were not getting compressed prior to being served.

Optimization Pass on the Data Files

The data files are either json, csv, or tab-delimited. Json can be a verbose format, and since one of the json files was for a flat structure, it was a candidate for conversion to csv. This was done with one of the node utilities available - json2csv (there are several others that do this as well).
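
For illustration, here is a minimal sketch of what that conversion amounts to, written without json2csv so that it stands alone. The function names and the sample records are made up for this example (they are not the visualization's real code or data), and it assumes every record has the same keys as the first one.

```javascript
// Quote a single CSV field when needed.
function escapeCsvField(value) {
  const text = value === null || value === undefined ? '' : String(value);
  // Fields containing commas, quotes, or line breaks are wrapped in quotes,
  // and embedded quotes are doubled.
  return /[",\r\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
}

// Convert an array of flat objects to CSV text.
// Assumption: the keys of the first record define the columns.
function flatJsonToCsv(records) {
  if (records.length === 0) return '';
  const columns = Object.keys(records[0]);
  const header = columns.map(escapeCsvField).join(',');
  const rows = records.map(r => columns.map(c => escapeCsvField(r[c])).join(','));
  return [header, ...rows].join('\n');
}

// Hypothetical sample records.
const sample = [
  { id: 1, name: 'Miami, FL', value: 3.5 },
  { id: 2, name: 'Tampa', value: 1.25 },
];
const csv = flatJsonToCsv(sample);
console.log(csv);
console.log('json chars:', JSON.stringify(sample).length, 'csv chars:', csv.length);
```

The saving comes from the key names: json repeats them in every record, while csv states them once in the header row.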

A quirk of the server behind googledrive (which I have no control over) is that it seems to use the file extension to decide whether to gzip the content or not. In my case, there was at least one large json file that was not for a flat structure, and so I couldn't just convert it to csv. But duh - just renaming the json file to have a "csv" extension worked, and googledrive would compress it then.

This reduced the total size of the required data files, AND resulted in all of them being compressed by the googledrive web server.
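
To get a rough feel for what gzip buys on this kind of data, something like the following can be run in node. It is only a sketch: the payload is a made-up, repetitive array standing in for one of the data files, and compressionReport is a name invented for this example. The only library call is zlib.gzipSync from node's built-in zlib module.

```javascript
const zlib = require('zlib');

// Return the raw and gzipped byte sizes of a text payload.
function compressionReport(text) {
  const raw = Buffer.from(text, 'utf8');
  const gzipped = zlib.gzipSync(raw);
  return { rawBytes: raw.length, gzipBytes: gzipped.length };
}

// Hypothetical payload: 2000 small records with repeated keys and values.
const rows = [];
for (let i = 0; i < 2000; i++) {
  rows.push({ id: i, region: 'region-' + (i % 10), value: i * 0.5 });
}

const report = compressionReport(JSON.stringify(rows));
console.log(report);
```

Repetitive text such as json or csv generally compresses well, which is why a server that skips gzip for some file extensions hurts so much here.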

Javascript

I had about a dozen separate javascript files being downloaded. While most were being loaded at the bottom of the body of the html document, in reviewing these I realized that I didn't need four or five of them at all anymore - they had been used in initial versions of the visualization. One of them was almost 300kb.

The answer, of course, was to make sure all of the javascript was actually needed, and use grunt (which I hadn't done yet - shame on me) to combine and minify the javascript, resulting in a single javascript file to be downloaded.
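
A Gruntfile for this kind of combine-and-minify step might look roughly like the sketch below. The choice of the grunt-contrib-concat and grunt-contrib-uglify plugins and all of the file paths are assumptions made for this example, not a record of the actual setup.

```javascript
// Gruntfile.js (sketch). All paths are hypothetical placeholders.
function configureGrunt(grunt) {
  grunt.initConfig({
    // Combine the separate source files into one file.
    concat: {
      dist: {
        src: ['js/lib/*.js', 'js/app/*.js'],
        dest: 'build/app.js',
      },
    },
    // Minify the combined file.
    uglify: {
      dist: {
        files: { 'build/app.min.js': ['build/app.js'] },
      },
    },
  });

  grunt.loadNpmTasks('grunt-contrib-concat');
  grunt.loadNpmTasks('grunt-contrib-uglify');

  // Running `grunt` with no arguments concatenates first, then minifies.
  grunt.registerTask('default', ['concat', 'uglify']);
}

module.exports = configureGrunt;
```

With both plugins installed, the page then only needs a single script tag pointing at the minified file.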

CSS

I only had a few css files, but I had also peppered some <style> elements in the body of the page, which can impact performance. All of the css was moved to a single file, including the css that was previously downloaded from mailchimp, and this is now minified via the node module clean-css.

Showing Content as Soon as Possible

While it is somewhat unavoidable that some large files will have to be downloaded and processed, it's important to make sure the user knows the page is not stuck or something. For this reason, a high priority is given to showing the barebones html "Loading - please be patient", with an indeterminate progress bar. The progress bar is an animated gif created via the nice online tool at http://www.ajaxload.info/. Further, rather than require an additional trip back to the server to get the image file, it is done with inline base64, which was generated with another nice online tool at http://www.base64-image.de/.
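
The same inline encoding can also be produced locally with node. This is a sketch rather than what was actually used for the page: toDataUri is a name invented for this example, and the six-byte GIF signature stands in for a real image file.

```javascript
// Build a data: URI from raw image bytes so the image needs no extra request.
function toDataUri(bytes, mimeType) {
  return 'data:' + mimeType + ';base64,' + Buffer.from(bytes).toString('base64');
}

// The "GIF89a" signature stands in for a real file here. In practice the bytes
// would come from something like fs.readFileSync('spinner.gif'), where
// 'spinner.gif' is a hypothetical file name.
const gifSignature = Buffer.from('GIF89a', 'ascii');
const uri = toDataUri(gifSignature, 'image/gif');
console.log(uri); // prints data:image/gif;base64,R0lGODlh
```

The resulting string goes straight into the src attribute of the img element, or into a css url(...), in place of a file path.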

With the changes above, the initial rendering of the page occurs fairly quickly now.

Mobile

While I did not make any specific changes for mobile here, I did notice that the iPad has a hard time - I don't know at this point if it's the calculations or trying to render all of the svg. The interface for the iPhone doesn't show a map at all, and the lists (which are all that is shown) seem to update fine.

To IE or Not to IE

ugh.

IE9-11 support svg, but their implementation of the spec can be spotty. I get the feeling that IE12 will be a significant improvement, but this might not be out for a while.

Anyway, there is obviously something going on with IE, in both the old and new versions of this, as the map extends on top of the header bar and over the list on the right. I am guessing that it is another unimplemented feature of the svg spec at play here. For a visualization someone might look at for just a minute or so to get the gist of the processes involved, it kind of still works for that. However, I need to fix it, and will leave addressing this to another day. Note that if you don't have a box with IE on it (as I don't anymore), I highly recommend BrowserStack for testing in it - they make it so easy to run different browsers virtually in your own browser and step through the browser's debug tools. You can even get a heck of a lot done with just the 30-minute free trial.

It is interesting to me that less than 10% of the users hitting this visualization are even using Internet Explorer at all (compared to IE's market share of about 20%, according to http://gs.statcounter.com/). At moments like these, all I can say is "thank goodness".

Further Optimization

There are a number of things I can still look at for improving the performance for this visualization:

  • Do more of the processing offline, and have the user download the generated processed datafiles - I can probably set up a grunt task for that. I think that this would primarily reduce the delay that occurs after all of the data files are downloaded.
  • When the map isn't even going to be used (such as for the iPhone interface), don't even download the map files - this could have a significant impact.
  • Reduce the size of the barebones html and css for the initial page load - there's a lot of stuff in there right now that isn't needed until the visualization is fully loaded.

Summary of Tools Mentioned/Used

  • webpagetest.org - I can't believe I hadn't heard of this. This is an open source project that is primarily being developed and supported by Google. It is an awesome free online tool for debugging/inspecting your site's loading performance.
    • Run tests against your site from locations around the world, with various browsers and devices
    • Obtain detailed information on page load speed, how long it takes to load the various resources
    • Get specific tips on improving your page load
    • Save previous tests
    • Do visual/video comparisons across sites (including to any other publicly available site), and easily share these comparisons
    • View how your page loaded
    • And the list goes on...
  • Google's PageSpeed Insights tool - online and Chrome plug-in. Provides tips on improving the performance of your page/site.
  • BrowserStack - "Live, Web-Based Browser Testing" - so easy to run different browsers virtually in your own browser and step through the browser's debug tools.
  • json2csv, a node module for converting flat json structures to csv
  • clean-css, a node module for minifying css
  • http://www.ajaxload.info/, an online tool for creating animated gif progress bars/spinners
  • http://www.base64-image.de/, an online tool for getting the base-64 encoding of png/jpg/gif files
