Nowhere Near Ithaca: A Searchable US Tweet Map (Node.js/D3/OAuth)

The app discussed in the post is available here: https://ustweetmap-bflnni.rhcloud.com/main.html. I have also put together a searchable World Tweet Map using similar tools that is available at https://worldtweetmap-bflnni.rhcloud.com/.

Last year, I put together a searchable tweet map of the United States, mainly using d3.js. This used the anonymous 1.0 twitter search api, so that it could be run as a static little javascript app in the browser. The interface was based on the US births/deaths visualization I had done with Bill Snebold.

Later in 2013, twitter officially removed that api, requiring all calls to any of their apis to be authenticated, either for the app or for the app+user using OAuth. This broke many apps (including mine) relying on the previous 1.0 api.

Over the past few days, I reimplemented the searchable tweet map to use the new 1.1 twitter api. This required moving some parts of the app to an app server to make the calls, for which I used node.js and a bunch of node modules (express, mongoose, passport, request, etc.) on the free tier of RedHat's OpenShift platform.

Per twitter's requirements for the new api, the calls are made using OAuth, which in turn requires the user to sign into twitter and grant the app limited access.

Here's a screenshot right after Tennessee lost to Florida in the SEC tournament today (I've since added the ability to pause the app):

Tweets for "Tennessee", up to and a little after the Tennessee basketball team lost to Florida today in the SEC Tournament.
This app is available here (requires sign-in via twitter)

On the client side, the changes were minimal: mainly just changing the api url from twitter's to the one I put up with the node.js app.

As for the new app in node.js, this was surprisingly not much work. It was a lot of fun to finally dig into node and learn a lot of things. The ecosystem of helper node modules is astounding, handling just about every aspect of the process. And there are countless useful node tutorials across the web.

Here are a few of the resources that stood out in helping me along the way:

- This blog post by Dillon Buchanan on putting together a node.js/twitter streaming api/socket.io app
  While I did not want to use the streaming api, as I need user-specific queries and don't want to have to store a ton of stuff myself, his little app (available on github) really opened my eyes to the easy power of node.
- NodeJS 5 Key Things I Wish I Knew Earlier by rm2kdev
  I love these kinds of articles, because they can typically save you a lot of time. His number one thing was the IDE: he said WebStorm, and so I used that, which made putting this together pretty fun - and debugging on the server side seems straightforward. He also stressed the breadth and depth of the node module ecosystem: if you need some grunt stuff done, chances are there is a node module that is gonna do it for you.
- Passport.js by Jared Hanson
  I find the OAuth documentation across the web fairly confusing. Jared Hanson's module seems to be the de facto solution for handling OAuth authentication, allowing you to write just a few lines of code for basic functionality.
- The request node module by Mikeal Rogers
  Once I was able to get a person to sign in via twitter with passport.js, I still needed to be able to make authenticated calls to the twitter api. It took me a little while to realize that passport didn't do this - in fact, I finally saw that passport's Jared Hanson recommended Mikeal Rogers' module for that. Once again, a few lines of code, and I was getting search results like I did before, plus I can see what's in the headers coming back from twitter so that I can report back to the client where the user is against their rate limits.
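
As a rough sketch of what an authenticated search call through the request module can look like: the helper below only builds the options object (request's `oauth` option takes `consumer_key`, `consumer_secret`, `token`, and `token_secret`). The function name, the placeholder credentials, the coordinates, and the choice of search parameters are assumptions made for illustration, not the app's actual code. The token and token secret would be the ones passport hands back after the user signs in.

```javascript
// Hypothetical helper: builds options for request.get() against the 1.1 search api.
// All credential values are placeholders supplied by the caller.
function buildSearchOptions(query, area, creds) {
  return {
    url: 'https://api.twitter.com/1.1/search/tweets.json',
    qs: {
      q: query,
      // twitter's geocode parameter has the form "latitude,longitude,radius"
      geocode: [area.lat, area.lon, area.radiusMiles + 'mi'].join(','),
      count: 100
    },
    oauth: {
      consumer_key: creds.consumerKey,
      consumer_secret: creds.consumerSecret,
      token: creds.token,
      token_secret: creds.tokenSecret
    },
    json: true
  };
}

// Illustrative values only: one localized search area and fake credentials.
var options = buildSearchOptions(
  'Tennessee',
  { lat: 35.96, lon: -83.92, radiusMiles: 150 },
  {
    consumerKey: 'APP_KEY',
    consumerSecret: 'APP_SECRET',
    token: 'USER_TOKEN',
    tokenSecret: 'USER_TOKEN_SECRET'
  }
);
console.log(options.qs.geocode);
```

With the request module installed, something like `request.get(options, function (err, response, body) { ... })` would make the call, and `response.headers` is where the rate limit values show up.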

If you're using Rogers' request module, then the specific header values related to the rate limits are given below. Note that the case is different than that listed on the twitter dev site about rate limits.
- x-rate-limit-limit
  This is the rate limit ceiling for this particular type of api call; this is 180 for the search api
- x-rate-limit-remaining
  The number of requests left for the current 15 minute window
- x-rate-limit-reset
  This is a time, in UTC epoch seconds, for when the rate limit will be reset (to 180 for the search api); I initially thought there was a rolling window, but in looking at the results, this does not seem to be the case. You can calculate the current time in epoch seconds with
  var epochSecondsForNow = (new Date()).getTime() / 1000;

  and compare that to the value returned in the header to see when it is safe to make calls again, should you exceed the rate limit and need to hold off until the reset occurs.
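
Putting those three headers together, a minimal sketch of the comparison might look like the following. The function names and the sample header values are made up for illustration; the only thing assumed about the input is that it is a headers object with lowercase keys, as described above.

```javascript
// Hypothetical helper: pulls the three rate limit values out of a headers
// object like the one request exposes as response.headers (lowercase keys).
function parseRateLimit(headers) {
  return {
    limit: parseInt(headers['x-rate-limit-limit'], 10),
    remaining: parseInt(headers['x-rate-limit-remaining'], 10),
    resetEpochSeconds: parseInt(headers['x-rate-limit-reset'], 10)
  };
}

// Seconds to wait before searching again; 0 if calls are still available
// or the reset time has already passed.
function secondsToWait(rate, nowMs) {
  if (rate.remaining > 0) { return 0; }
  var nowSeconds = Math.floor(nowMs / 1000);
  return Math.max(0, rate.resetEpochSeconds - nowSeconds);
}

// Made-up header values, for illustration only.
var sampleHeaders = {
  'x-rate-limit-limit': '180',
  'x-rate-limit-remaining': '0',
  'x-rate-limit-reset': '1394900000'
};
var rate = parseRateLimit(sampleHeaders);
console.log(secondsToWait(rate, 1394899700 * 1000)); // 300 seconds until the reset
```

In a running app, `nowMs` would simply be `(new Date()).getTime()`.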

A Few Practical Notes

Avoiding the Rate Limit for Searches

As with the previous app, the tweets are found across the country by doing several localized searches centered at various locations around the country. This requires more searches, but I've found that you get more results that way. This set of about ten areas is searched in a staggered fashion every 90 seconds (the staggering seems to make it feel a little more responsive). That works out to about 100 hits against the api every 15 minutes (ten areas, each searched ten times per window), which seems comfortably below the 180 hits that are allowed for a user+app in every 15 minute window. In watching it run, I didn't see it get very close, but I went ahead and passed back both the number of searches left in the current interval and the time when the counter will be reset back to 180 (as returned from twitter). When the remaining count gets close to zero, the user is alerted that the searching has stopped until the reset time is reached.
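
The arithmetic and the staggering can be sketched as follows. Spreading the start times evenly across the 90 second period is an assumption made for the example; the function names are hypothetical, and the app may space its searches differently.

```javascript
// Number of api calls made in one rate limit window when every area is
// searched once per period.
function callsPerWindow(areaCount, periodSeconds, windowSeconds) {
  return areaCount * (windowSeconds / periodSeconds);
}

// Start offsets (in ms) that spread the areas evenly across one period,
// so the searches do not all fire at once. Even spacing is an assumption.
function staggerOffsets(areaCount, periodSeconds) {
  var stepMs = (periodSeconds * 1000) / areaCount;
  var offsets = [];
  for (var i = 0; i < areaCount; i++) {
    offsets.push(Math.round(i * stepMs));
  }
  return offsets;
}

console.log(callsPerWindow(10, 90, 15 * 60)); // 100 calls per 15 minute window
console.log(staggerOffsets(10, 90)); // one search starts every 9 seconds
```

Each offset could then be handed to `setTimeout` to kick off that area's first search, with `setInterval` repeating it every 90 seconds.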

If we get close to the limit of searches allowed by twitter in a 15 minute interval, searching is paused until after twitter says the counter will be reset.

In watching the values returned from twitter for both the reset time and number of searches left in the interval, I noticed some slight differences between consecutive calls. For example, one call might say the reset would occur at a certain time, and the next call would return a reset time that was several seconds different. I guess this is based on different twitter servers being hit behind the scenes.

Kludginess to Remove

The current svg map is not explicitly tied to latitude/longitude, and as before I am determining the county for a given latitude/longitude by having the client call the fcc's block api, which tells you the county a given coordinate is in. This is really crude, and I need to move on from this. If I migrate the app to cover the whole world, I think that this will actually be simpler to do.
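
For reference, the county lookup amounts to something like the sketch below. The endpoint URL, the query parameter names, and the `County.FIPS` / `County.name` response fields are assumptions based on how the fcc block api has been documented and should be verified against the current docs; the function names and the sample response are made up for illustration.

```javascript
// Assumed endpoint of the fcc block api; verify before relying on it.
var ASSUMED_BLOCK_API = 'https://data.fcc.gov/api/block/find';

// Builds the lookup url for one coordinate. baseUrl is optional so the
// endpoint can be swapped out if the assumed one is wrong or has moved.
function blockApiUrl(lat, lon, baseUrl) {
  return (baseUrl || ASSUMED_BLOCK_API) +
    '?format=json&latitude=' + encodeURIComponent(lat) +
    '&longitude=' + encodeURIComponent(lon);
}

// Returns { fips, name } for the county, or null if the response has none
// (for example, a coordinate outside the US).
function countyFromResponse(body) {
  var county = body && body.County;
  if (!county || !county.FIPS) { return null; }
  return { fips: county.FIPS, name: county.name };
}

// Made-up response in the assumed shape.
var sample = { County: { FIPS: '47093', name: 'Knox' }, status: 'OK' };
console.log(blockApiUrl(35.96, -83.92));
console.log(countyFromResponse(sample));
```

The county FIPS code is the useful part here, since it can be matched against whatever county ids the svg paths carry.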

Working with OpenShift

Working with RedHat's OpenShift platform is really simple. They have a command line client, "rhc", that you can use for just about anything. Deploying an updated version of this test app is done via a simple "git push". I did notice that the server returned a 503 while a deployment was in progress - I assume this would only happen on the free tier.

A Pleasant Path

I don't know what it is, but working with javascript on both the client and server felt really nice. Doing this little project has really opened my eyes to even more exploratory projects I am looking forward to.

I've read that Elvis had a favorite saying: To be happy, you just need something to do, something to look forward to, and someone to love. Certainly the possibilities of node have the first two covered for me here.
