EmojiViz - Experiments with Visualization of Real-time Emoji Twitter Trends

I am currently playing with visualizing tweet activity of emojis. The first steps involved hooking into Matthew Rothenberg's emojitracker streaming api, which itself is based on the twitter streaming api.

This first draft experimental viz (not yet tweaked for small screens) is available at http://learnforeverlearn.com/emojiviz/

Viewing dynamic emoji tweet activity
Experiment #1 - Archimedean Spiral
Interactive viz is at http://learnforeverlearn.com/emojiviz/

Note that my mentions of Rothenberg's work here barely touch the surface of what he has done with his emojitracker project, and he has several interesting satellite projects.

How I Got Here
A German Rebus from 1620
i.e., emojis of the 17th Century

The other week, I was playing around with writing a twitterbot/sentiment-analysis app with node.js. I can run this app and watch (and listen to) tweets stream by from the twitter streaming api. The sentiment analysis libraries I've evaluated are not particularly satisfying: it's a hard problem.

As part of reading more about tweet sentiment analysis, I came across a recent paper from Sept 2015 that explored the "Sentiment of Emojis" (authors are P. Kralj Novak, J. Smailović, B. Sluban, and I. Mozetič). Not surprisingly, they concluded that "The sentiment distribution of the tweets with and without emojis is significantly different." They also noted that "emojis tend to occur at the end of the tweets", which is interesting because my wife had independently observed that, for text messages at least, emojis can provide a subtle note of finality to a conversation that lets the receiver know that no further message back is necessarily expected.

Novak et al also have a nicely condensed "sentiment bar" visual for summarizing the sentiment of an emoji:

The "Sentiment Bar" by P. Novak et al in Sentiment of Emojis (Sept 2015)
The colored bar extends from −1 to +1, the range of the sentiment score. The grey bar is centered at the mean score and extended for ±1.96 times the standard error of the mean. Colored parts are proportional to negativity (p−, red), neutrality (p0, yellow), and positivity (p+, green).

I've incorporated something like this (although I use a light grey for "neutral") into a custom world tweet map that consumes the twitter streaming api and it is fun to watch.

In their paper, they refer to and make use of the site emojitracker.com by Matthew Rothenberg, "a web site which monitors the use of emojis on Twitter in real-time." I had never heard of it, and was pretty impressed when I took a look.

Screenshot from Matthew Rothenberg's emojitracker.com
emojitracker includes an app that consumes the twitter streaming api, tracking the real-time use of 845 different emojis.

What started as a "quick weekend hack" for Rothenberg morphed into a fairly substantial set of services and architecture that comprise emojitracker (code details here). He has a long-and-well-worth-the-read writeup about it on medium.com (note that he has indicated that this is out of date with regard to much of the architecture). The popularity of emojitracker first spiked in July 2013, and he includes his experience with that as well in the writeup.

Rothenberg also exposes a public streaming api that lets you get the raw updates as well, via server-sent events (which themselves are very very cool).

Towards the end of Rothenberg's delightful "brain dump", he wonders about alternate visualizations, and he tells the internet that "I’d love to see what ... folks ... can come up with for ways to show the data in interesting ways. I’d be happy to work with anyone directly who has a cool idea that requires additional access."

I was certainly interested in playing with the curated emoji data he was sending out to the world. After contacting him to confirm things were still a go on that, I started down the path on my own little side project of emoji tweet visualization, the first fruits of which are described here.

EmojiViz

In thinking about what I wanted to "see", several things came to mind:

  • All emojis should be on the same screen
  • There should be some indication of the current trends

I initially didn't know if I would use d3.js or webgl. Based on the initial amorphous ideas I had with regard to animation, it seemed that webgl would be necessary for performance reasons. However, as things crystallized, I settled into d3.js for the first tests.

So, how do you fit all of the emojis on the screen and still have them be identifiable? Well, the displayed numbers will need to go for now (I want to bring them back as I can). But how should they be arranged? I settled on a spiral for the first steps into the forest. Rounded curves are nice. This forced the implementation of many pieces that should be largely independent of display. While ranking is a little confusing for the more rarely used emojis near the center, it looks kind of nice.

I had first looked into using the logarithmic spiral, as its occurrence in nature has been noted widely, which adds a pointless but interesting tinct to it. However, the logarithmic spiral gets too large too fast, and the Archimedean spiral has a nice steady increase of the internal radii. On a side note, Jacob Bernoulli was fascinated with the logarithmic spiral, and even wanted one engraved on his headstone. But alas, an Archimedean spiral was erroneously put there instead (wikipedia). It makes me wonder if there was a similar error when it came to the words on Shakespeare's headstone. Shakespeare died in 1616, and Bernoulli died in 1705. While a bit of a stretch, it could have been the same mistake-prone tombstone engraver.

Anyway.

I based the core routines hitting the api on Rothenberg's implementation. He used very clean coffeescript, which I started to emulate for a while, but it was too frustrating to ease that into the toolchain given other priorities, so I converted what I needed to plain js. And isn't coffeescript going to fade away after its important influence on the recently approved ES6?

Server Sent Events

The data is pushed from Rothenberg's server via server sent events (SSE). I had never used these, although they've been around a while. They seem to work pretty well.
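As a rough illustration of what consuming such a stream can look like in plain js, here is a minimal sketch built on the browser's standard EventSource interface. It assumes (this is not spelled out above) that each message body is a JSON object mapping an emoji id to the number of new tweets since the previous message. The function names `applyUpdate` and `subscribe` are made up for this example, and the URL is left as a parameter rather than guessed.

```javascript
// Merge one update into a running tally.
// ASSUMPTION: jsonText is a JSON object such as {"1F602": 2, "2764": 1},
// i.e. emoji id -> number of new tweets since the last message.
function applyUpdate(counts, jsonText) {
  const increments = JSON.parse(jsonText);
  for (const [id, n] of Object.entries(increments)) {
    counts[id] = (counts[id] || 0) + n;
  }
  return counts;
}

// Browser only: open the stream and keep the tally current.
// `url` is a placeholder for whatever endpoint the service documents.
function subscribe(url, counts) {
  const source = new EventSource(url);
  source.onmessage = (event) => applyUpdate(counts, event.data);
  source.onerror = () => console.warn("SSE connection problem; the browser may retry");
  return source; // call source.close() to stop listening
}
```

Keeping the merge step separate from the connection code means the tallying logic can be exercised without a live stream.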

emojis and svg

On the main screen, I decided to use images for the emojis, with the thinking that scaling could be handled better.

You can incorporate images into svg directly via an image tag with a data uri, like so:

<svg ...>
    <image xlink:href="data:image/png;base64,iVB...(rest of data)"
           height="blah"
           width="blahwhatever"
           x="x-for-upper-left-corner"
           y="y-for-upper-left-corner">
    </image>
</svg>

I got the data URIs into a form usable in the svg by creating a js file defining a hashtable (key being the emoji id) by parsing the css files from Rothenberg's https://github.com/mroth/emojistatic (yet-another-emojitracker-satellite) project.
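The parsing step could be sketched roughly as follows. The rule format is an assumption made purely for illustration: one class per emoji, named `.emoji-<ID>`, with a `url(...)` holding the base64 PNG. The real css files may be laid out differently, in which case the regular expression would need adjusting. The name `parseEmojiCss` is also hypothetical.

```javascript
// Build { emojiId: dataUri } from css text.
// ASSUMPTION: rules look like
//   .emoji-1F602 { background-image: url("data:image/png;base64,AAAA"); }
function parseEmojiCss(cssText) {
  const table = {};
  const rule = /\.emoji-([0-9A-Fa-f-]+)\s*\{[^}]*?url\(\s*["']?(data:image\/png;base64,[A-Za-z0-9+\/=]+)["']?\s*\)/g;
  let match;
  while ((match = rule.exec(cssText)) !== null) {
    table[match[1].toUpperCase()] = match[2]; // key: emoji id, value: data URI
  }
  return table;
}
```

The resulting object can be written out once as a js file, so the page never has to parse css at load time.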

Spacing the emojis along an Archimedean Spiral

In polar coordinates $(r, \theta)$, where $(x,y) = (r \cos\theta, r \sin\theta)$, the relationship between $r$ and $\theta$ for an Archimedean spiral is given by $$ r = a + b \theta$$ The parameter $a$ sets the initial $r$ at $\theta=0$, and $b$ determines the distance between successive turns of the spiral (each full turn adds $2\pi b$ to $r$). Picking values for these was a bit of an empirical process, as I needed enough turns (i.e., a large enough maximum $\theta$) to include all of the emojis. How far $\theta$ has to go is actually determined by the spacing between emojis along the curve, and calculating a point at an arbitrary distance along the spiral turned out to be more tedious than I expected.
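To see where the tedium comes from, here is a sketch of the arc length along the spiral, assuming $a \ge 0$ and $b > 0$. In polar coordinates the length element is $ds = \sqrt{r^2 + (dr/d\theta)^2}\, d\theta$, so with $r = a + b\theta$ and the substitution $u = a + b t$:

$$ s(\theta) = \int_0^{\theta} \sqrt{(a + b t)^2 + b^2}\, dt = \frac{1}{2b}\Big[\, u\sqrt{u^2 + b^2} + b^2 \ln\big(u + \sqrt{u^2 + b^2}\big) \Big]_{u=a}^{u=a+b\theta} $$

Placing an emoji at a given distance $s_k$ along the curve means solving $s(\theta) = s_k$ for $\theta$. That equation has no simple closed-form solution, so it would need a numerical root finder for every emoji.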

It turns out that svg has some built-in methods for this. In particular, if I define an svg path for the spiral with a large number of points (and simply not display it), then I can use the getPointAtLength method to find the arbitrary point I seek. The spacing I used depended roughly on the basic size of each emoji image.
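A minimal sketch of that approach, under a few assumptions: the path is approximated by straight segments between sampled points, and the path element is attached to an existing svg with no fill or stroke so that it is measured but not seen. The helper names (`spiralPoints`, `toPathData`, `evenlySpaced`) are invented for this example. The only SVG DOM calls used are `getTotalLength` and `getPointAtLength`.

```javascript
// Sample the Archimedean spiral r = a + b*theta as [x, y] pairs.
function spiralPoints(a, b, thetaMax, steps) {
  const points = [];
  for (let i = 0; i <= steps; i++) {
    const theta = (i / steps) * thetaMax;
    const r = a + b * theta;
    points.push([r * Math.cos(theta), r * Math.sin(theta)]);
  }
  return points;
}

// Turn the samples into SVG path data: "M x,y L x,y L x,y ...".
function toPathData(points) {
  return points
    .map(([x, y], i) => (i === 0 ? "M" : "L") + x.toFixed(2) + "," + y.toFixed(2))
    .join(" ");
}

// Browser only: add an invisible path to `svg` and read off points
// every `spacing` units of length along it.
function evenlySpaced(svg, pathData, count, spacing) {
  const path = document.createElementNS("http://www.w3.org/2000/svg", "path");
  path.setAttribute("d", pathData);
  path.setAttribute("fill", "none");   // nothing is painted,
  path.setAttribute("stroke", "none"); // but the geometry is still measurable
  svg.appendChild(path);
  const total = path.getTotalLength();
  const result = [];
  for (let i = 0; i < count && i * spacing <= total; i++) {
    const p = path.getPointAtLength(i * spacing);
    result.push({ x: p.x, y: p.y });
  }
  return result;
}
```

In a page this might be called as `evenlySpaced(svgElement, toPathData(spiralPoints(a, b, thetaMax, 2000)), 845, 20)`, where 2000 samples and a spacing of 20 are made-up values. If the function returns fewer than `count` points, the spiral is too short and `thetaMax` needs to grow.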

I insert the svg for each emoji in reverse order of rank, so that (initially at least) the ones with the highest overall rank will be "on top of" the lower ranked ones when they get larger based on activity. There is no other notion of "z-order" in svg as far as I know. This manual z-order-ish process breaks down when I rearrange the emojis based on recent activity, but I might be able to address it with better spacing.
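SVG paints elements in document order, so the insertion order described above amounts to a sort before appending. A small sketch, assuming each emoji is an object with a numeric `rank` where 1 is the top rank (the field name and `paintOrder` are illustrative, not taken from the actual code):

```javascript
// Return a copy of `emojis` in the order they should be appended:
// worst rank first, best rank (1) last, so the best is painted on top.
function paintOrder(emojis) {
  return emojis.slice().sort((p, q) => q.rank - p.rank);
}

// Hypothetical d3 usage:
// svg.selectAll("image").data(paintOrder(emojis)).enter().append("image");
```

Sorting a copy leaves the original ranking array untouched for the rest of the code.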

This process seems fairly generalizable to other curves, which I look forward to experimenting with.

Updating Viz Based on Recent Tweet Activity

By default, every 10 seconds or so the ranks of the emojis are recalculated based on the activity since the page was loaded. Emojis that have the same number of tweets are sorted by their previous rank, which minimizes unnecessary movement, especially for the more rarely used ones.
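The re-ranking rule can be sketched as a two-key sort. This assumes each emoji is an object with a `count` (tweets since page load) and a `rank` (its previous rank, 1 being the top). The field names, `rerank`, and the `redraw` call in the comment are hypothetical.

```javascript
// Recompute ranks: more tweets first, ties broken by previous rank.
function rerank(emojis) {
  const sorted = emojis.slice().sort((p, q) =>
    q.count - p.count || // higher count comes first
    p.rank - q.rank      // equal counts keep their previous relative order
  );
  // Assign fresh ranks 1..n without mutating the input objects.
  return sorted.map((e, i) => ({ ...e, rank: i + 1 }));
}

// Hypothetical wiring, roughly every 10 seconds:
// setInterval(() => { emojis = rerank(emojis); redraw(emojis); }, 10000);
```

Because ties fall back to the previous rank, emojis with no new activity keep their positions between updates.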

The updating process is animated with almost 900 simultaneous d3 transitions and tweens. A desktop is really the best platform at the moment.

Seeing Details for a Particular Emoji

If you click on an emoji, a popup is shown (this is a condensed and tweaked version of the one that is shown on emojitracker.com).

Popup when you click on an emoji.
This one is for the consistently most popular emoji.
"Face with Tears of Joy"
Popup is a condensed and tweaked version
of the one originally in emojitracker

For the popular emojis, it can be a challenge to follow the stream of tweet text. For now, this is partially addressed by pausing the list when you mouseover the list, but it raises an interesting question about what additional techniques could be used to help a person preattentively and comfortably comprehend the content of fast text stream displays.

Next Steps

The basic implementation of placing the emojis along a predetermined path seems fairly generalizable to using other curves that might be better suited for this type of display, which I hope to explore next.

Clearly, there is a need for some high-level summary information as well, such as top 10/rising 10, etc.
