Exploration of the Lognormal Distribution - a D3/MathJax/jStat Interactive Visualization

This is a short note on an interactive visualization I have been playing with. It is embedded below from this site on Google Drive. Depending on your browser and device, there may be rendering quirks with MathJax and/or the embedding iframe - please let me know if you see any oddities. It looks fine for me in the latest version of Chrome on a desktop (and I've updated some things to improve responsive design).

What is This?

The project is a visualization to let you explore the lognormal distribution. It is built with D3, MathJax, and jStat. It's a somewhat niche thing: it assumes some familiarity with the lognormal distribution (and continuous distributions, etc.), so it is not intended as an introduction. However, it might help someone get a better feel for how the lognormal distribution works. In particular, it shows interactively how the shape, mean, and standard deviation respond to various changes, including changes in the parameters of the underlying normal distribution. (By "underlying normal distribution" I mean the distribution of \(\ln(X)\), which is normal when \(X\) is a lognormally distributed random variable.)

A short video on some of the features
of this interactive visualization

The lognormal distribution has a simple parameterization because it is defined in terms of the normal distribution, but it can still behave in surprising ways. Since a lognormal variable is the exponential of a normally distributed one, the size of the effect of a parameter change can be hard to predict: we are generally not good at appreciating the magnitude of exponentiating anything.
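To make the exponentiation effect concrete, here is a minimal sketch using the standard closed-form expressions for the lognormal median, mean, and standard deviation. The function name `lognormalSummary` is made up for this note and is not taken from the visualization's code.

```javascript
// Summary statistics of a lognormal X, where ln(X) ~ Normal(muN, sigmaN).
// lognormalSummary is a hypothetical helper for illustration only.
function lognormalSummary(muN, sigmaN) {
  const variance = sigmaN * sigmaN;
  const median = Math.exp(muN);                        // exp(mu_N)
  const mean = Math.exp(muN + variance / 2);           // exp(mu_N + sigma_N^2 / 2)
  const sd = mean * Math.sqrt(Math.exp(variance) - 1); // mean * sqrt(exp(sigma_N^2) - 1)
  return { median, mean, sd };
}

// Hold mu_N at 0 and widen the underlying normal.
for (const sigmaN of [0.5, 1, 2]) {
  const { median, mean, sd } = lognormalSummary(0, sigmaN);
  console.log(
    `sigmaN=${sigmaN}: median=${median.toFixed(2)}, mean=${mean.toFixed(2)}, sd=${sd.toFixed(2)}`
  );
}
```

With \(\mu_N = 0\) the median stays at 1 for every \(\sigma_N\), while the mean climbs from roughly 1.13 at \(\sigma_N = 0.5\) to roughly 7.39 at \(\sigma_N = 2\), and the standard deviation grows faster still.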

As noted on the Wikipedia page for the lognormal distribution, it is applied in a great many areas. Chances are, if you have ever analyzed data with positive values and a long-ish upper tail, you have run into the lognormal distribution and either considered using it or actually did so.

Summary of Features

The visualization allows you to play with the various parameters defining the distribution or the underlying normal distribution. You can dynamically see the impact of changing

  • the upper or lower 95th percentiles of either the lognormal or its underlying normal distribution
  • the median of the lognormal
  • the mean of the lognormal or the underlying normal distribution
  • the standard deviation of the lognormal or the underlying normal distribution

Note that there are an infinite number of lognormal distributions that have a specified percentile; for the purpose of this visualization, I fix \(\mu_N\) (the mean of the underlying normal distribution) and solve for \(\sigma_N\) (the standard deviation of the underlying normal distribution).
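The step of fixing \(\mu_N\) and solving for \(\sigma_N\) can be sketched as follows. It relies on the fact that the \(p\)-th quantile of a lognormal is \(x_p = e^{\mu_N + z_p \sigma_N}\), where \(z_p\) is the standard normal quantile, so \(\sigma_N = (\ln x_p - \mu_N)/z_p\). The name `sigmaFromPercentile` and the hard-coded constant `Z_95` are assumptions for this illustration; the actual visualization may do this differently (for instance, by getting \(z_p\) from jStat).

```javascript
// Standard normal quantile at p = 0.95, hard-coded so the sketch has no dependencies.
const Z_95 = 1.6448536269514722;

// Given a target percentile value xp of the lognormal, a fixed muN, and the
// standard normal quantile z for that percentile, solve for sigmaN.
// sigmaFromPercentile is a hypothetical helper, not the visualization's code.
function sigmaFromPercentile(xp, muN, z) {
  const sigmaN = (Math.log(xp) - muN) / z;
  if (!(sigmaN > 0)) {
    // The requested percentile sits on the wrong side of the median exp(muN).
    throw new RangeError("no positive sigmaN gives this percentile for the fixed muN");
  }
  return sigmaN;
}

// Example: keep muN = 0 and ask for a 95th percentile of 10.
console.log(sigmaFromPercentile(10, 0, Z_95));
```

For a lower percentile, pass the matching negative quantile (for example `-Z_95` for the 5th percentile); the same formula applies.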

One main goal I have is to highlight connections in context. For example, when you are tweaking the mean or standard deviation of the underlying normal distribution, the visualization highlights where these appear in the pdf equations themselves.

As you change the parameters, and the graph changes accordingly, a dashed line is maintained from the pdf equations themselves to the rendered curves.

When you move your mouse over the curve, a little dashed line shows the value on the \(x\) axis and the associated cdf value at that \(x\).
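The values behind such a hover readout can be sketched without any library. The visualization itself uses jStat for its statistics; the dependency-free version below is only an illustration, with made-up function names and the Abramowitz and Stegun 7.1.26 approximation of the error function (absolute error around \(1.5 \times 10^{-7}\)) standing in for a library call.

```javascript
// Approximate error function (Abramowitz & Stegun formula 7.1.26).
function erf(x) {
  const sign = x < 0 ? -1 : 1;
  const ax = Math.abs(x);
  const t = 1 / (1 + 0.3275911 * ax);
  const poly =
    ((((1.061405429 * t - 1.453152027) * t + 1.421413741) * t - 0.284496736) * t +
      0.254829592) * t;
  return sign * (1 - poly * Math.exp(-ax * ax));
}

// pdf of a lognormal X with ln(X) ~ Normal(muN, sigmaN); zero for x <= 0.
function lognormalPdf(x, muN, sigmaN) {
  if (x <= 0) return 0;
  const z = (Math.log(x) - muN) / sigmaN;
  return Math.exp(-0.5 * z * z) / (x * sigmaN * Math.sqrt(2 * Math.PI));
}

// cdf of the same distribution: Phi((ln x - muN) / sigmaN).
function lognormalCdf(x, muN, sigmaN) {
  if (x <= 0) return 0;
  const z = (Math.log(x) - muN) / sigmaN;
  return 0.5 * (1 + erf(z / Math.SQRT2));
}

// What a hover at x = 2 might report for muN = 0, sigmaN = 1.
console.log({ x: 2, pdf: lognormalPdf(2, 0, 1), cdf: lognormalCdf(2, 0, 1) });
```

At the median \(x = e^{\mu_N}\) the cdf is one half, which makes a handy sanity check when wiring up a readout like this.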

I'm using standard jQuery tooltips everywhere to try to make clear what the different parts are. This initially caused some issues with tooltips lingering while you are trying to do something else, but I think I've dealt with this now by hiding and showing them on mousedown/mouseup events.

Beautiful Mathematics Typesetting: MathJax

I wanted to include the formulas for the pdf of the lognormal distribution and the underlying normal distribution on the visualization. Further, I wanted to be able to highlight parts of them based on which parameter the user might be tweaking, with the purpose being to help show the connections between the various entities. This kind of thing always helps me at least. I considered just munging through this with html-encoded values, and maybe some table layouts, which would make it easy to highlight the subparts I cared about, but these looked awful. I decided to go ahead and check out MathJax again.

Using the notation of Donald Knuth's TeX, this pure JavaScript solution renders absolutely beautiful mathematical typesetting, and to top it off it turns out that - since the equations are generated as SVG/HTML - you can use HTML element IDs for additional styling and interaction.

$$ \cssId{equationlognormal}{\text{pdf }} f(x) = {1 \over {x \ \cssId{sigmanormal_1}{\sigma_N} \sqrt{2\pi}}} e^{- \large{ {{ { ({\ln x} - \cssId{meannormal_1}{\mu_N})^2 } } \over {2 {\cssId{sigmanormal_2}{\sigma_N}}^2} }} } $$
Beautiful mathematics typesetting with MathJax,
integrated right into the DOM as svg/html.
Here, when you "hover" over \(\mu_N\) or \(\sigma_N\), their color changes to red.
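A rough sketch of how IDs like these can be attached and used: the TeX source is built with MathJax's `\cssId` macro, and hover handlers are attached once the equation has been typeset. The helper names `cssId` and `highlightOnHover` and the `highlight` class are hypothetical, and the element IDs are the ones from the equation above. How the class should be styled (for example `color` versus `fill`) depends on the MathJax output mode, so treat that part as an assumption.

```javascript
// Wrap a TeX fragment in MathJax's \cssId macro so the rendered piece gets an element id.
function cssId(id, tex) {
  return `\\cssId{${id}}{${tex}}`;
}

// TeX source for the lognormal pdf with the tweakable parameters tagged.
const lognormalPdfTex =
  `f(x) = \\frac{1}{x \\, ${cssId("sigmanormal_1", "\\sigma_N")} \\sqrt{2\\pi}} ` +
  `e^{-\\frac{(\\ln x - ${cssId("meannormal_1", "\\mu_N")})^2}` +
  `{2 ${cssId("sigmanormal_2", "\\sigma_N")}^2}}`;

// Browser-only: toggle a (hypothetical) "highlight" class on hover.
// Call this after MathJax has finished typesetting, otherwise the ids do not exist yet.
function highlightOnHover(ids) {
  for (const id of ids) {
    const el = document.getElementById(id);
    if (!el) continue; // equation not rendered (yet) on this device
    el.addEventListener("mouseenter", () => el.classList.add("highlight"));
    el.addEventListener("mouseleave", () => el.classList.remove("highlight"));
  }
}

console.log(lognormalPdfTex);
```

In a page, something like `highlightOnHover(["meannormal_1", "sigmanormal_1", "sigmanormal_2"])` would run once typesetting is done, with a stylesheet rule for `.highlight` supplying the red.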

Further, it looks like it would be straightforward to use MathJax inside the jQuery tooltips as well.

Note that the equations sometimes fail to render on a tablet, for reasons I have not yet pinned down.

I had first been generating and labeling the axes manually, but decided to take the few minutes required to see how to make better use of D3's built-in functionality for this. With the help of AlignedLeft's excellent tutorial and this Stack Overflow post, so much drudgery was removed, and I also got the beautiful number formatting and tick selection that D3 handles under the hood.

Every time I have been tempted to knock something out quickly myself, it has been well worth checking first what D3 can already do.

Still to Do...
Responsive design for this visualization
- slowly getting there

I am still wrestling with responsive design for this thing. I have gotten it to work better in the last day or two (12/28/2013), and am perhaps getting closer to figuring out once and for all how to make it work reliably and in a general way for D3 visualizations.

I have been able to improve performance and usability on a Nexus 7 and iPad, as well as get the equation to show up (helped by using my own copy of MathJax).

When I embark on these little explorations, I usually only have a vague and intuitive idea of what I want the interface to look like, and it grows and wanders organically based on what I think would be cool as I get to each new location in its evolution. Sometimes I feel like I'm on a trip as described in Chapter V of Poe's Julius Rodman, where

No sooner had I examined one region than I was possessed with an irresistible desire to push forward and explore another.... I could not help being aware that some civilized footsteps ... had preceded me in my journey - that some eyes before mine own had been enraptured with the scenes around me.... I was anxious to go on - to get, if possible, beyond the extreme bounds of civilization - to gaze, if I could, upon those gigantic mountains...
