Graphing Total Daily Tweets

Last year I started tracking the total number of tweets posted to Twitter every day. I wrote a Ruby script that uses the Twitter API to post a new tweet to a private Twitter account and log the ID number of that tweet. That script runs once a day, at the same time each day. Tweet IDs appear to be auto-incrementing integers, so we can track the total number of tweets posted in the past 24 hours by subtracting the previous day’s tweet ID from today’s. Another Ruby script processes the log file so I can feed it to Excel to produce pretty graphs.
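Here is a minimal sketch of the differencing step. It is not the original script: the log format (one date and one tweet ID per line) and the names SAMPLE_LOG and daily_counts are assumptions made up for illustration.

```ruby
# Hypothetical log format: one "YYYY-MM-DD tweet_id" line per daily sample.
SAMPLE_LOG = <<~TEXT
  2008-04-01 1000
  2008-04-02 1750
  2008-04-03 2900
TEXT

# Returns [[date, tweets_since_previous_sample], ...].
# The first sample has no predecessor, so it produces no row.
def daily_counts(log_text)
  samples = log_text.each_line.map do |line|
    date, id = line.split
    [date, Integer(id)]
  end
  samples.each_cons(2).map do |(_, prev_id), (date, id)|
    [date, id - prev_id]
  end
end

# Print comma-separated rows that a spreadsheet can import.
daily_counts(SAMPLE_LOG).each { |date, count| puts "#{date},#{count}" }
```

The count is only as good as the assumption that IDs increase by exactly one per tweet.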

I started the script in April 2008 and promptly forgot about it before checking out the results today.

Daily Tweets

This graph shows the number of tweets posted each day for most of the last year. I had to smooth the data in a couple of places when Twitter was down… in those cases I just interpolated between the closest two points.
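One way to do that kind of smoothing is plain linear interpolation. The sketch below assumes the daily counts sit in an array with nil for each missing day; fill_gaps is a hypothetical helper, not code from the original scripts.

```ruby
# Fills nil entries by linear interpolation between the nearest known
# neighbours on either side. Leading or trailing nils are left as-is
# because they have only one neighbour.
def fill_gaps(values)
  filled = values.dup
  known = values.each_index.select { |i| values[i] }
  known.each_cons(2) do |left, right|
    step = (values[right] - values[left]).to_f / (right - left)
    (left + 1...right).each do |i|
      filled[i] = values[left] + step * (i - left)
    end
  end
  filled
end

p fill_gaps([100, nil, nil, 160])  # => [100, 120.0, 140.0, 160]
```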

A few quick takeaways:

  • As noted by others, Twitter is busier on weekdays than on weekends
  • Notice the traffic surge during the November 2008 elections
  • Notice the decrease over the 2008 Christmas holidays

Inside the LinkRiver Favicon Server – Ruby + Nginx + Thin + Rack

Favicons on LinkRiver

LinkRiver displays favicons next to most links to help users recognize link targets. Those favicons are served separately from the main LinkRiver server. This post describes some of the design decisions and approaches I took when building the favicon server.

My most important requirement for the favicon server (FI) was that it be loosely coupled to the LinkRiver (LR) server and reusable for other applications. The LR server could link to a favicon for *any* page without worrying about whether the icon exists on the FI server. If the FI server already had the icon, great – it would serve it up. If not, it would send back a default icon. This requirement ruled out Amazon’s S3 service because it won’t allow you to return a default image/page in response to “404 Not Found” errors.

When LR wants to display the favicon for a site like Twitter, it generates a URL like this:

http://favicons.linkriver.com/f1/25/twitter.com.ico

LR knows how to “map” host names to the directory structure (f1/25 in the example above). Keeping icons in a two-tiered directory system like this makes it easier to manage the large number of cached files (it’s bad to have zillions of files in one directory). It also serves as a minor obstacle to others hotlinking to these favicons.
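The actual mapping isn’t described here, so the following is only a sketch of one common approach: take the first four hex digits of a digest of the host name and use them as the two directory levels. The choice of MD5 and the name favicon_path are assumptions.

```ruby
require 'digest/md5'

# Assumption: the two directory levels are the first four hex digits of
# an MD5 digest of the lower-cased host name. The real LinkRiver mapping
# may differ.
def favicon_path(host)
  host = host.downcase
  digest = Digest::MD5.hexdigest(host)
  File.join(digest[0, 2], digest[2, 2], "#{host}.ico")
end

puts favicon_path('twitter.com')
```

Two hex digits per level gives 256 x 256 directories, and because the mapping is deterministic, both servers can compute the same path without talking to each other.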

Behind the scenes it would work like this. A fast/lightweight web server like lighttpd or Nginx would sit in front of all requests to serve already-cached static files. When an uncached icon is requested, the FI server queues it up for later download. I have a lightweight non-persistent message queuing class built on memcached and Ruby that would be perfect for this. All the FI server has to do is push the request values onto memcached and then tell the web server to send back the default icon.
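To give a feel for how a queue can sit on top of a key/value cache, here is a rough sketch. It is not the queuing class mentioned above. FakeStore is an in-memory stand-in for a memcached client, SketchQueue and its key layout are hypothetical, and the pop logic assumes a single consumer.

```ruby
# In-memory stand-in for a memcached client, with only the three calls
# the queue needs. Unlike real memcached, incr here creates missing keys.
class FakeStore
  def initialize
    @data = {}
  end

  def get(key)
    @data[key]
  end

  def set(key, value)
    @data[key] = value
  end

  # Bumps a counter and returns the new value.
  def incr(key)
    @data[key] = @data.fetch(key, 0) + 1
  end
end

# Hypothetical queue: items live under numbered keys, and two counters
# track the last slot written (tail) and the last slot read (head).
class SketchQueue
  def initialize(store, namespace)
    @store = store
    @ns = namespace
  end

  def push(value)
    slot = @store.incr("#{@ns}:tail")
    @store.set("#{@ns}:item:#{slot}", value)
  end

  # Returns the oldest unread item, or nil when the queue is empty.
  def pop
    head = @store.get("#{@ns}:head") || 0
    tail = @store.get("#{@ns}:tail") || 0
    return nil if head >= tail
    slot = @store.incr("#{@ns}:head")
    @store.get("#{@ns}:item:#{slot}")
  end
end

queue = SketchQueue.new(FakeStore.new, 'favicons')
queue.push('twitter.com')
puts queue.pop
```

Losing the queue when the cache restarts is acceptable in this design: an icon that falls out of the queue simply gets requested again later.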

First Attempt — PHP via Lighttpd and FastCGI

LinkRiver is written in Ruby on Rails using Nginx as a load balancer and static page server with mongrel as the Ruby app server. I love working in Ruby, but for this app, Rails would have been overkill. I wasn’t familiar with ways to run Ruby using a faster/lighter server so I dusted off my trusty/rusty PHP skills. Remember: the only thing PHP had to do was push request values to memcached and tell the web server to return the default icon. Something like this:


The X-LIGHTTPD-Send-File header tells lighttpd to return a static file to the browser, which is much faster than having PHP do it. I banged this out in about an hour and it worked great.

Second Attempt — Ruby Via Nginx and Thin/Rack

My PHP+Lighttpd version of the FI server worked just fine but I didn’t like supporting both Nginx and lighttpd. I also prefer coding in Ruby whenever possible. Was there a lightweight way to run Ruby on a web server? That’s where Thin and Rack come in.

Thin is a wicked-fast Ruby web server that’s perfect for what I was trying to do — run a fairly simple Ruby script on a web server. Thin is the web server itself – Rack is an interface that defines how Ruby interacts with the server.

Thin runs a Rack config file that looks something like this:

require 'favicon'
require 'mcqueue'
q = MCQueue.new(QUEUE_SERVER, QUEUE_NAMESPACE)
map '/' do
  run FaviconAdapter.new(q)
end

For all requests that make it to Thin (remember: all cached icons are served by Nginx directly and never reach Thin), Thin “runs” the FaviconAdapter instance created in the config file, which means it calls the adapter’s “call” method and passes in information about the request. Our call method parses out some request information (the hostname for the favicon), pushes it to memcached, and returns an HTTP status code, headers and body, just like the PHP version.

require 'rubygems'
require 'thin'

DEFAULT_HEADERS = {
  'Content-Type' => 'image/x-icon',
  'X-Accel-Redirect' => '/protected/default.png'
}

class FaviconAdapter
  def initialize(queue)
    @queue = queue
  end

  def call(env)
    req = Rack::Request.new(env)
    #
    # A couple of lines removed to parse the request and
    # push it to memcached...
    #
    [200, DEFAULT_HEADERS, ['']]
  end
end

The X-Accel-Redirect header does the same thing for Nginx that the X-LIGHTTPD-Send-File header does for lighttpd: it tells the web server to return the file directly instead of streaming it through our Ruby or PHP code.
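For readers who want to see the whole request path in one place, here is a self-contained sketch of what the elided lines might do. It is a guess, not the original code: SketchFaviconAdapter and ArrayQueue are made-up names, the host name is pulled from PATH_INFO with a regular expression, and an array stands in for the memcached queue so the example runs without Thin, Rack or memcached installed.

```ruby
# Stand-in queue so the sketch runs without memcached.
class ArrayQueue
  attr_reader :items

  def initialize
    @items = []
  end

  def push(value)
    @items << value
  end
end

class SketchFaviconAdapter
  HEADERS = {
    'Content-Type' => 'image/x-icon',
    'X-Accel-Redirect' => '/protected/default.png'
  }.freeze

  def initialize(queue)
    @queue = queue
  end

  # A Rack-style server calls this with the request environment hash.
  def call(env)
    # "/f1/25/twitter.com.ico" -> "twitter.com"; nil for any other path.
    host = env['PATH_INFO'].to_s[%r{/([^/]+)\.ico\z}, 1]
    @queue.push(host) if host
    # Empty body: the X-Accel-Redirect header asks Nginx to send the file.
    [200, HEADERS.dup, ['']]
  end
end

queue = ArrayQueue.new
app = SketchFaviconAdapter.new(queue)
status, = app.call('PATH_INFO' => '/f1/25/twitter.com.ico')
puts status
puts queue.items.inspect
```

Because call only needs a hash in and a three-element array out, the adapter can be exercised like this without starting a server.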

The new Ruby FI version has been solid and stable like the PHP version before it. The new version should scale better too: in my tests, Nginx handles high load better and serves static files at the same high speed as lighttpd. My Ruby code is outperforming the PHP code by about 30%, but that’s not quite a fair comparison. The Ruby version caches its connection to memcached while the PHP version must reconnect for each request.

That’s all for now.

Refactoring with the Pickaxe

Programming Ruby, better known as the Pickaxe, is the most valuable "programming book" on my shelf. Actually, it doesn’t live on my shelf, it lives on my desk. I’ve learned and used lots of languages over the years but I don’t remember ever using a reference book like this before. Ruby is an incredibly rich language and I find myself "mining" the Pickaxe looking for cool ways to fix problems and write shorter, more readable code. Here’s an example of refactoring a simple method.

Start with a hash object called params. A hash object is just a list of name, value pairs: this1 = that1, this2 = that2, etc. I have a hash object and want to build a URL query string out of it. We need to string our variables together to get something that looks like this: this1=that1&this2=that2&this3=that3. The keys are joined to values with an equals sign (=), and the combined values are joined with an ampersand (&).

Here was my first crack at the code.

query = ""
params.each do | key, value |
    query += "&" if query != ""
    query += key + "=" + value
end

This works, but it seems longer than necessary. So here’s take two.

query = ""
params.each { | key, value | query += (("&" if (query != "")) || "") + key + "=" + value }

This code works, but it’s not very readable. I wonder if there isn’t some cool Ruby method that can help us clean this up. Thumbing through the Pickaxe, we find that the Hash object mixes in Enumerable, which has a method called collect that should make things easier. Hash::collect iterates over a hash, processes each item according to a block, and returns a new array – perfect. Let’s use collect to create a new array of keys joined to their values with the equals sign. After that, we can use Array::join to join the tuples with the ampersand.

tuples = params.collect do | key, value |
  key + "=" + value
end
return tuples.join("&")

This can be shortened…

tuples = params.collect{ | key, value | key + "=" + value }
return tuples.join("&")

… or even shorter, and I think just as readable …

params.collect{ | key, value | key + "=" + value }.join("&")
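One caveat: none of the versions above escape reserved characters, so a value containing an ampersand or a space would produce a broken query string. Ruby’s standard library has URI.encode_www_form for this. The sketch below compares the two on a made-up params hash.

```ruby
require 'uri'

# Sample data invented for illustration.
params = { 'q' => 'ruby & rails', 'page' => '2' }

# The hand-rolled version leaves reserved characters untouched...
naive = params.collect { |key, value| key + '=' + value }.join('&')

# ...while the standard library escapes them.
safe = URI.encode_www_form(params)

puts naive  # q=ruby & rails&page=2
puts safe   # q=ruby+%26+rails&page=2
```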


Spoiled by Ruby

I’ve been doing a lot of Ruby work lately and love the concise yet readable syntax. But it’s also tainting my appreciation for other languages. Instead of writing this in Ruby…

return if (!something)

… PHP/Delphi/C++ makes me write something like this:

if (!something) {
    return false;
}

Of course you can pull off this one-liner in PHP/C++, but it’s nowhere near as readable as the Ruby version, especially when something is an expression of any complexity.

if (!something) return false;

Ah well.

Hacking Tour de France Updates with Ruby

I’m going to miss the Tour de France coverage tomorrow because I’ll be on a plane up to Portland for the day. Normally I would catch the race on OLN.tv or on the "Live Tour" page at the official tour web site. The "Live Tour" page is pretty slick and provides updates every few minutes during the race, but that doesn’t help me when I don’t have web access. Hmmm….

I threw together a Ruby script last week when I took a similar trip. It scrapes the Live Tour page for a given stage, looks for new news updates, and sends them to my cell phone. Use at your own risk – you’ll get LOTS of text messages during the race. This is definitely a one-off implementation.

Download letour.rb.txt

Printing the Beta Ruby on Rails Book

I was one of the over 1000 people who bought the beta version of the new Agile Web Development with Rails book by Dave Thomas and David Heinemeier Hansson. I bought the combo pack, which means I got a PDF of the beta version now and I’ll get a print version when it ships in July.

I’ve been developing web apps for almost 10 years (WinCGI, Cold Fusion back in the DBML days, ASP, ASP.NET, PHP, etc), and Rails is easily the most intriguing and exciting technology I’ve come across in a long time. The Rails framework takes away much of the repetitive drudgery you often run into in web app development. The learning curve doesn’t seem too steep (though I’m still in the Ruby-is-a-weird-looking-language stage) and the Rails book should help in that area too.

I didn’t want to read a 500+ page book on the computer though, and also didn’t want to burn out my laser printer, so I used the Kinko’s online ordering system. It’s slick: you upload the PDF to them and pick up your print copy a couple of hours later at your local store. I wish all my books could be spiral bound!