Saturday, December 1, 2012

Simulating US Births/Deaths in Real-Time - a D3 Visualization

Note: I have extended this to a visualization for the entire world, which is hosted on Google Drive here.

Recently, I was wondering what it would look like to "watch" births and deaths as they occur across the globe, and I thought it might be interesting to put together a little app to do just that.  I began exploring a 3D visualization with OpenGL/shaders, but during that initial exploration I realized that a geographic data file for US counties was readily available from Wikipedia. I decided to start there, using the project as a good way to continue learning the excellent D3 javascript library, which serves as the core visualization engine.  Since the implementation is just html/css/javascript, it could be incorporated into a mobile app via something like PhoneGap, but that is still down the road.

You can check out the visualization here or here (hosted for the moment on Google Drive).  The interface design is by Bill Snebold of Bill Snebold Design.


The visualization requires an svg-compatible browser (Safari, Chrome, Firefox, IE9). I've successfully tested with Safari, Chrome and Firefox on an iMac, and with Chrome and Firefox on Windows 7.  I've heard that it works in Chrome and Firefox on tablets running Android Ice Cream Sandwich. It is flaky on my iPad for some reason, which I am still tracking down.  Some initial changes have been made for iPad/iPhone, but the animations on those devices may be "jerky".  Please feel free to send me feedback about what you see on your device.

Here are a few notes on interacting with the visualization:
  • At the moment, the (json) data files total about 6MB, so they may take a while to download, depending on your connection
  • Once a simulation starts, it may take a few seconds for the first simulated(!) birth/death to occur
  • Mousing over a county in the map will show a popup with some basic info on the county; clicking the county will open a Google search on the county in a new window
  • Mousing over a birth/death in the list on the left or right will cause the associated county to briefly "pop out" on the map; clicking in the list will open a Google search on that city or county

Predicting Births/Deaths

Currently, the visualization is for a statistical simulation using average birth and death rates for the entire US, as estimated in July 2012 in the CIA World Factbook.  These are 13.7 and 8.4 per thousand people per year, respectively (in contrast to the values of 13.5 and 8.1 used in the initial implementation, which were from the U.S. Census Bureau, Statistical Abstract of the United States: 2012).  The (simple!) approach to simulating births/deaths is as follows:
  • for a given time interval of length dt (in years, since the rates are per year) and total US population of N (~300 million), the expected number of births or deaths λ is calculated as
    • λ = (birth or death rate per year per 1000) * (N/1000) * dt
  • the total number of births or deaths for the entire US predicted for the given time interval is estimated by sampling from a Poisson distribution with mean λ; this is done by obtaining a random number between 0 and 1 and then comparing this to the cumulative distribution function for the specified Poisson distribution
  • for each birth or death obtained this way, the county is determined by obtaining a random number and then choosing the county based on its fraction of the total population (e.g., if a county has 0.001% of the total population, then there is a 0.001% chance that this county would be the one where it occurred). Once the county is determined, a similar process is used to determine which city/town in the county the event occurred in (or whether it occurred outside of any city/town in the county, which can happen)
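
The sampling steps above can be sketched in JavaScript roughly as follows. This is a minimal sketch, not the code used in the visualization: the function names (expectedEvents, samplePoisson, pickWeighted), the weightOf accessor, and the injectable random parameter are all assumptions made for illustration, and dt is assumed to be expressed in years.

```javascript
// Expected number of events (lambda) in an interval of dtYears.
// ratePerThousandPerYear: e.g. 13.7 for births, 8.4 for deaths.
function expectedEvents(ratePerThousandPerYear, population, dtYears) {
  return ratePerThousandPerYear * (population / 1000) * dtYears;
}

// Sample a Poisson count by comparing one uniform random number
// against the cumulative distribution function (inverse-CDF method).
// Assumption: lambda is small (short time steps); for lambda in the
// hundreds, Math.exp(-lambda) underflows and another method is needed.
function samplePoisson(lambda, random = Math.random) {
  const u = random();            // uniform in [0, 1)
  let k = 0;
  let p = Math.exp(-lambda);     // P(X = 0)
  let cdf = p;                   // P(X <= 0)
  // Walk up the CDF until it reaches u; stop if p underflows to 0.
  while (u > cdf && p > 0) {
    k += 1;
    p *= lambda / k;             // P(X = k) from P(X = k - 1)
    cdf += p;
  }
  return k;
}

// Pick one item with probability proportional to weightOf(item),
// e.g. a county weighted by its population.
function pickWeighted(items, weightOf, random = Math.random) {
  const total = items.reduce((sum, item) => sum + weightOf(item), 0);
  let threshold = random() * total;
  for (const item of items) {
    threshold -= weightOf(item);
    if (threshold < 0) return item;
  }
  return items[items.length - 1]; // guard against rounding error
}

// Example: simulate one second of births for ~300 million people.
const SECONDS_PER_YEAR = 365 * 24 * 3600;
const lambda = expectedEvents(13.7, 300e6, 1 / SECONDS_PER_YEAR);
const births = samplePoisson(lambda);
```

The same pickWeighted helper could plausibly be reused for the city/town step by adding one extra entry whose weight is the county population living outside any city/town.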

There are far fancier ways of approaching this simulation; for example, one could use state/county/demographic-specific rates, estimate the cause of death for each occurrence based on data from the US Census or other freely available mortality tables, and/or use information on how the rates may change during the day/week/month.  Replacing this component of the visualization shouldn't be difficult - certainly nothing compared to the potential effort in gathering/curating/cleaning/formatting the data.

N.B. The birth and death rates above should correspond to about 11,500 births and 6,800 deaths per day, on average. I accidentally left a Safari window open last night, and a half-day later it is at roughly half of these values (~5,500 births and ~3,400 deaths).
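
As a rough check on these daily figures, the expected-count formula from the previous section can be evaluated for a single day. This assumes N is exactly 300 million and a 365-day year, neither of which is stated precisely in this post:

```latex
\begin{align*}
\lambda_{\text{births/day}} &= 13.7 \times \frac{300{,}000{,}000}{1000} \times \frac{1}{365} \approx 11{,}260 \\
\lambda_{\text{deaths/day}} &= 8.4 \times \frac{300{,}000{,}000}{1000} \times \frac{1}{365} \approx 6{,}900
\end{align*}
```

Both values land within a few percent of the rounded figures quoted above; a slightly different population estimate would shift them proportionally.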

Data

A number of sources of data were brought together for this. However, I think that I have just skimmed the surface in terms of exploiting the data available.  In all cases the final form of the data is in json, although there was some preprocessing required for some aspects.  In particular, getting the relevant county for each of the ~30,000 places/locations from their latitude/longitude was more of a challenge than expected, as this association was not included in the original files that I found.

  • The county-level svg map data are from USA Counties.svg on Wikipedia; note that adjustments were made to address a handful of counties, primarily in Alaska, that have changed since this map was released

Implementation with D3

D3 is a (free) javascript library by Michael Bostock.  While it may take a little while to get your head around the way it works (which I am still working on), the glorious visualizations that people continue to implement are powerfully motivating.  For the current visualization, while there are not that many lines of "D3-specific" code in the custom javascript, this amazing library is the core for everything.  It makes loading and rendering a snap, and its animation capabilities (via what are referred to as "transitions") are straightforward to use.

The animations for this visualization consist of a county "popping out" when a birth/death occurs (and subsequently "nestling back" to its original location on the map), and a slight "popping out" whenever a prior birth/death is hovered over in the list.  Both of these animations use standard svg scale/translate transformations specified via D3 methods, coupled with D3's transitions.  The timing of the transitions takes one small extra step: a queue is used to ensure that only one birth or death is brought to the front at a time.
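
The queueing idea can be sketched independently of D3. This is an illustrative sketch rather than the visualization's actual code: createAnimationQueue and the animate(item, done) callback shape are hypothetical names chosen here. With D3, done would typically be called from a transition's end event (.each("end", ...) in D3 v3, .on("end", ...) in v4 and later).

```javascript
// Hypothetical helper: runs queued animations strictly one at a time.
// animate(item, done) must start the animation for item and call
// done() once it has finished (e.g. from a transition's end event).
function createAnimationQueue(animate) {
  const pending = [];   // events waiting for their turn
  let running = false;  // true while an animation is in progress

  function next() {
    if (running || pending.length === 0) return;
    running = true;
    const item = pending.shift();
    animate(item, function done() {
      running = false;
      next();           // start the following animation, if any
    });
  }

  return {
    // Add an event; it starts immediately if nothing is running.
    push(item) {
      pending.push(item);
      next();
    },
    // Number of events still waiting (excludes the running one).
    size() {
      return pending.length;
    },
  };
}
```

Each simulated birth or death would be pushed onto the queue as it occurs, so a burst of events plays out as a sequence of "pop outs" instead of several counties competing for the front at once.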

Note: After getting this visualization close to completion, I came across an interesting Flash-based 3D visualization for real-time births/deaths across the globe, originally implemented in 2008.  Someone will likely implement a real-time world birth/death simulation using WebGL in a browser such as Google's Chrome before long.


And finally...

There might be some quirks on different browsers, especially on mobile devices, that I have yet to fully investigate.  I know that on my iPad3 the map kept moving down as the animations moved up, but I believe that this was fixed by properly setting the preserveAspectRatio property of the relevant svg element (I wrote a short note about it here).


