HOW TO: Automate Facebook Interaction using Ruby and WWW::Mechanize

Posted on September 05, 2007

I told those fudgepackers that I liked Michael Bolton’s music. – Michael Bolton, Office Space

Overview

By the end of this tutorial, you will have learned how to use Ruby and WWW::Mechanize to log in to your Facebook account and post a random Office Space quote on a friend’s wall!

Origins of this Tutorial

I first got interested in automating Facebook actions because of limitations in the Fantasy Stock Exchange app on Facebook: I wanted a way to automate stop-loss orders like a real brokerage firm might offer.

See this post on FSX Trader for more on that project, which implements the same techniques in this tutorial and more. You can download its source code here on rubyforge. Now onto the tutorial:

Autobots: Mechanize!

First you’ll need the mechanize ruby gem:

sudo gem install mechanize

Next create a new file called mechanize_facebook.rb. First we’ll need to load the mechanize library:

require 'rubygems'
require 'mechanize'

For this brief example, let’s create a MechanizeFacebook class that will log in to Facebook. It will take two parameters: an auth Hash and a boolean telling it whether or not to be verbose in its output:

class MechanizeFacebook
  def initialize(auth, verbose = false)
    @auth = auth
    @verbose = verbose
    @agent = WWW::Mechanize.new
    @agent.user_agent_alias = 'Mac FireFox'
    @agent.redirect_ok = true
  end

  def login
  end
end

auth = {'email' => 'my-fb-email@gmail.com', 'pass' => 'my-fb-password'}
mf = MechanizeFacebook.new(auth,  true)
mf.login

Above you can see we’ve instantiated a new WWW::Mechanize object, set its user agent alias, and set redirect_ok to true so that it follows redirects. For more on user agent aliases available to Mechanize (it can masquerade as IE on Windows, Mac Safari, etc.) see its documentation here on rubyforge.

Change the ‘auth’ hash credentials to your own if you’d like to tinker with this on your machine.
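If you would rather not leave your password sitting in the script, the auth hash can be built from environment variables instead. This is only a minimal sketch: the helper name auth_from_env and the variable names FB_EMAIL and FB_PASS are made up for this example and are not part of the original script.

```ruby
# Build the auth hash from environment variables instead of hard-coding it.
# FB_EMAIL and FB_PASS are hypothetical variable names chosen for this sketch.
def auth_from_env(env = ENV)
  {
    'email' => env.fetch('FB_EMAIL'), # fetch raises KeyError if the variable is unset
    'pass'  => env.fetch('FB_PASS')
  }
end

# Usage, with both variables exported in your shell:
#   mf = MechanizeFacebook.new(auth_from_env, true)
```

Using fetch rather than [] means a missing variable fails loudly up front instead of sending an empty password to the login form.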

Now let’s add the login functionality.

To get a page via WWW::Mechanize and print its HTML/form/link/etc. data, we can now add this to our ‘login’ stub above:

page = @agent.get('http://www.facebook.com/')
pp page if @verbose

If @verbose is true, this will output something like:

#<WWW::Mechanize::Page
 {url #<URI::HTTP:0xa629c6 URL:http://www.facebook.com/>}
 {meta}
 {title "Facebook | Welcome to Facebook!"}
    ..
 {links
  #<WWW::Mechanize::Link " " "http://www.facebook.com">
    #<WWW::Mechanize::Link
   "Forgot Password?" 
   "http://www.facebook.com/reset.php">
  #<WWW::Mechanize::Link "Login" "http://www.facebook.com/login.php">
    ...

Next let’s grab the login form. In this particular case, it happens to be the first form:

login_form = page.forms.first
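Picking a form by position stops working as soon as the page gains or loses a form. As an alternative, here is a rough sketch that looks a form up by its name attribute using a plain Array#find. It assumes the login form has a name such as "loginform"; the helper find_form_named and the FakeForm stand-in are hypothetical and exist only so the snippet runs without a network connection.

```ruby
# Pick a form by its name attribute rather than by its position on the page.
# Works on any list of objects that respond to #name, as Mechanize forms do.
def find_form_named(forms, name)
  forms.find { |form| form.name == name }
end

# Hypothetical stand-in objects so the sketch runs offline.
FakeForm = Struct.new(:name)
forms = [FakeForm.new('search'), FakeForm.new('loginform')]

login_form = find_form_named(forms, 'loginform')
puts login_form.name
```

With a real page you would call find_form_named(page.forms, 'loginform') and check for nil before using the result.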

If it were the second form on the page, we could do:

page.forms[1]

Time to set some form variables:

login_form.email = @auth['email']
login_form.pass = @auth['pass']
pp login_form if @verbose

If everything is peachy at this point, the output here should be:

#<WWW::Mechanize::Form
 {name "loginform"}
 {method "POST"}
 {action "https://login.facebook.com/login.php"}
 {fields
    ...
    #<WWW::Mechanize::Field:0x1433590
   @name="email",
   @value="your-fb-email@gmail.com">
  #<WWW::Mechanize::Field:0x1432dd4 @name="noerror", @value="1">
  #<WWW::Mechanize::Field:0x1432618 @name="pass", @value="your-fb-pass">}

Now let’s submit this bad boy:

page = @agent.submit(login_form)
pp page if @verbose

The output from this should yield your main personal Facebook homepage (activity feed links, personal app links, etc).
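Put together, the login steps might look like the sketch below. To keep it runnable without Facebook or the mechanize gem, this version takes the agent as a constructor argument, which is a change from the class above made only for this example. StubForm, StubPage and StubAgent are hypothetical stand-ins that mimic just the get, submit, forms, email= and pass= calls used in this tutorial.

```ruby
require 'pp'

class MechanizeFacebook
  # agent is any object with Mechanize-style #get and #submit. Passing it in
  # (rather than building it here) is a choice made for this sketch only.
  def initialize(auth, agent, verbose = false)
    @auth = auth
    @agent = agent
    @verbose = verbose
  end

  # Fetch the home page, fill in the first form, submit it,
  # and return the resulting page.
  def login
    page = @agent.get('http://www.facebook.com/')
    login_form = page.forms.first
    raise 'No login form found' if login_form.nil?

    login_form.email = @auth['email']
    login_form.pass = @auth['pass']
    pp login_form if @verbose
    @agent.submit(login_form)
  end
end

# Hypothetical stand-ins so the flow can be exercised offline.
StubForm = Struct.new(:email, :pass)
StubPage = Struct.new(:forms)

class StubAgent
  attr_reader :submitted

  def get(_url)
    StubPage.new([StubForm.new])
  end

  def submit(form)
    @submitted = form
    StubPage.new([])
  end
end

agent = StubAgent.new
auth = { 'email' => 'me@example.com', 'pass' => 'secret' }
MechanizeFacebook.new(auth, agent).login
puts agent.submitted.email
```

With the real library you would pass in a WWW::Mechanize object configured as in the constructor shown earlier.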

Getting Sneaky

What we’ve done so far has been an interesting learning tool but relatively straightforward.

Let’s combine a few steps and post a random quote from Office Space to one of your friends’ walls (or your own wall, but that’s not nearly as fun, eh?).

[Image: example of the resulting wall post]

First, see this article on how to parse Office Space quotes off of WikiQuote.org. The “randomquote.rb” file from that project is included in the randposter.zip file. Let’s break down what our new post method in mechanize_facebook.rb will do. First it loads your friend’s profile page (FRIEND_PROFILE_URL is a constant holding that page’s URL):

friends_profile_url = FRIEND_PROFILE_URL
page = @agent.get(friends_profile_url)
pp page if @verbose

Now let’s grab some quotes and instantiate a quote factory:

qg = QuotationGrabber.new('Office_Space', 10)
rq = RandomQuote.new('Office_Space')
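QuotationGrabber and RandomQuote come from the linked article and are not reproduced here. If you only want to try the wall-posting step, any object that responds to random_quote will do. The class below is a hypothetical stand-in, not the real RandomQuote, and the quotes are placeholders.

```ruby
# Hypothetical stand-in for RandomQuote: hands back one of a fixed list of strings.
class StaticRandomQuote
  def initialize(quotes)
    raise ArgumentError, 'need at least one quote' if quotes.empty?
    @quotes = quotes
  end

  # Return one quote chosen at random from the list.
  def random_quote
    @quotes.sample
  end
end

rq = StaticRandomQuote.new(['First placeholder quote', 'Second placeholder quote'])
puts "Random Office Space quote of the day:\n\n#{rq.random_quote}"
```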

The following assumes your friend’s Wall form is the last form on their profile page. This is normally true unless they’ve heavily customized their profile:

# Get the post to Wall form
wall_form = page.forms.last
# Set the wall message post value to the following:
wall_form.text = "Random Office Space quote of the day:\n\n#{rq.random_quote}"
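Since the "last form" assumption can fail on customized profiles, a slightly more defensive sketch is to look for the last form that actually has a field named text, which is the field set above. Mechanize forms respond to has_field?; the helper find_wall_form and the FakeWallForm class are hypothetical and only there so the snippet runs offline.

```ruby
# Return the last form that has a field with the given name, or nil if none does.
def find_wall_form(forms, field_name = 'text')
  forms.reverse.find { |form| form.has_field?(field_name) }
end

# Hypothetical stand-in that mimics the one Form method used here.
class FakeWallForm
  def initialize(*field_names)
    @field_names = field_names
  end

  def has_field?(name)
    @field_names.include?(name)
  end
end

forms = [FakeWallForm.new('q'), FakeWallForm.new('text', 'id'), FakeWallForm.new('email')]
wall_form = find_wall_form(forms)
puts wall_form.nil? ? 'No wall form found' : 'Found a form with a text field'
```

With a real page you would call find_wall_form(page.forms) and skip the post if it returns nil, rather than submitting whichever form happens to be last.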

Next we inspect the output of the form to make sure it’s doing what we expect, then submit the form!

pp wall_form if @verbose
# Submit the form!
@agent.submit(wall_form)

Now if you log in to Facebook and look at your buddy’s wall, your random Office Space quote should be listed at the top.

Comments
  1. Doug Bromley, September 05, 2007 @ 02:58 PM
    Aaaah - two of my favourite things all rolled into one. I've been using Facebook far too much lately - you're giving me all sorts of ideas. Have you had a look at a library called scRubyt? It's a very high level library for web scraping. Uses Mechanize, Hpricot, etc but makes it all very very simple and clean to use. Great library and very actively developed.
  2. Shanti Braford, September 06, 2007 @ 05:41 PM
    Doug - I have taken a look at scRubyt. It seems very powerful. One thing I couldn't figure out how to do with it -- let's say you have nested data within a page. I.e. -- Quotation Author - quote, quote, quote Quotation Author #2 - quote, quote, quote etc.. and you want to extract both the author name and their list of quotes. That's a tricky problem I know, but would love to be made easier somehow. =)
  3. Peter Szinek, September 29, 2007 @ 06:21 AM
    Hey Shanti, Could you elaborate some more on this? I believe it's actually doable in scRUBYt!. If you are interested, I can help you to make it work.
  4. Shanti Braford, October 02, 2007 @ 02:55 PM
    Peter - sorry, comments on my blog don't seem to respect line breaks!
    So, you see this page: http://en.wikiquote.org/wiki/Office_Space
    Is there a way, in a single pass, to build a data structure of say: Author => [Quote1, Quote2, Quote3, ...] Author2 => [Quote1, Quote2, Quote3, ...]