March 04, 2009

Posted by John

Tagged clogging, net, and redirects

Following Redirects with Net/HTTP

The web is full of redirects. Following them with Ruby isn’t hard, but examples always help when you are learning. Not too long ago I was hacking on some feed auto-discovery code and made a little class that, given a URL, follows any redirects to the final endpoint and returns the response from it.

I figured, in the old spirit of clogging, I would post it here until I have time to package the full feed auto-discovery library and release it on GitHub.

require 'logger'
require 'net/http'

class RedirectFollower
  class TooManyRedirects < StandardError; end
  
  attr_accessor :url, :body, :redirect_limit, :response
  
  def initialize(url, limit=5)
    @url, @redirect_limit = url, limit
    logger.level = Logger::INFO
  end
  
  def logger
    @logger ||= Logger.new(STDOUT)
  end

  def resolve
    raise TooManyRedirects if redirect_limit < 0
    
    self.response = Net::HTTP.get_response(URI.parse(url))

    logger.info "redirect limit: #{redirect_limit}"
    logger.info "response code: #{response.code}"
    logger.debug "response body: #{response.body}"

    if response.kind_of?(Net::HTTPRedirection)      
      self.url = redirect_url
      self.redirect_limit -= 1

      logger.info "redirect found, headed to #{url}"
      resolve
    end
    
    self.body = response.body
    self
  end

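  def redirect_url
    # Prefer the Location header. If it is missing, fall back to the first
    # link in the body, since some servers only describe the redirect in HTML.
    if response['location'].nil?
      response.body.match(/<a href="([^>]+)">/i)[1]
    else
      response['location']
    end
  end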
end

You can then follow redirects as easily as this:

google = RedirectFollower.new('http://google.com').resolve
puts google.body

When run, it outputs something like this:

I, [2009-03-04T17:16:58.879672 #69272]  INFO -- : redirect limit: 5
I, [2009-03-04T17:16:58.880669 #69272]  INFO -- : redirect found, headed to http://www.google.com/
I, [2009-03-04T17:16:58.987963 #69272]  INFO -- : redirect limit: 4

That is followed by the HTML from www.google.com, which I did not include. The logger comes in ridiculously handy when following redirects, as it can be hard to figure out what is going on otherwise. You can also optionally pass in a limit for how many redirects you would like to follow.

RedirectFollower.new('http://foobar.com', 3).resolve

This sets the number of redirects to follow to 3 instead of the default 5. Always put some kind of limit on the number of redirects to follow, or you could end up in an infinite redirect loop. Once the limit is used up, resolve raises TooManyRedirects.
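To see why the limit matters, here is a minimal sketch of the same idea written as a loop. It is not the class above: `follow`, `Reply`, and the `fetch` argument are hypothetical names made up for illustration. `fetch` is assumed to be any callable that returns a reply for a URL, so canned replies can stand in for the network.

```ruby
class TooManyRedirects < StandardError; end

# Hypothetical stand-in for an HTTP response: location is nil for a final page.
Reply = Struct.new(:location, :body)

# Follows redirects until a reply has no location, giving up once `limit`
# redirects have been followed. `fetch` is an assumed callable (url -> Reply),
# injected so the sketch needs no network.
def follow(url, fetch, limit = 5)
  loop do
    reply = fetch.call(url)
    return reply if reply.location.nil?
    raise TooManyRedirects, "gave up at #{url}" if limit.zero?

    limit -= 1
    url = reply.location
  end
end

# Two made-up pages that redirect to each other forever.
pages = {
  'http://a.example/' => Reply.new('http://b.example/', ''),
  'http://b.example/' => Reply.new('http://a.example/', '')
}

begin
  follow('http://a.example/', pages.method(:fetch), 3)
rescue TooManyRedirects => e
  puts "stopped: #{e.message}" # prints "stopped: gave up at http://b.example/"
end
```

Without the limit check, that loop would bounce between the two pages forever. With it, the caller gets an exception it can rescue and report.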

Nothing fancy but it gets the job done. Maybe if I get a chance I’ll post on how to test this by stubbing responses.
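In the meantime, here is a rough sketch of what stubbing could look like without any mocking library. Everything beyond the class itself is an assumption for illustration: `FakeRedirect`, `FakePage`, and `with_canned_responses` are hypothetical helpers, and the example.test URLs are made up. The class is a trimmed copy (no logging, Location header only) so the snippet runs on its own.

```ruby
require 'net/http'

# Trimmed copy of the class from this post so the sketch is self-contained.
class RedirectFollower
  class TooManyRedirects < StandardError; end

  attr_accessor :url, :body, :redirect_limit, :response

  def initialize(url, limit = 5)
    @url, @redirect_limit = url, limit
  end

  def resolve
    raise TooManyRedirects if redirect_limit < 0

    self.response = Net::HTTP.get_response(URI.parse(url))
    if response.kind_of?(Net::HTTPRedirection)
      self.url = response['location']
      self.redirect_limit -= 1
      resolve
    end

    self.body = response.body
    self
  end
end

# Hypothetical fakes: real Net::HTTP response classes with `body` overridden,
# because a hand-built response has no socket to read a body from.
class FakeRedirect < Net::HTTPFound
  def initialize(location)
    super('1.1', '302', 'Found')
    self['location'] = location
  end

  def body
    ''
  end
end

class FakePage < Net::HTTPOK
  def initialize(html)
    super('1.1', '200', 'OK')
    @html = html
  end

  def body
    @html
  end
end

# Runs the block with Net::HTTP.get_response answering from `canned`
# (a hash of URL string => response), then restores the real method.
def with_canned_responses(canned)
  real = Net::HTTP.method(:get_response)
  Net::HTTP.define_singleton_method(:get_response) { |uri, *_rest| canned.fetch(uri.to_s) }
  yield
ensure
  Net::HTTP.define_singleton_method(:get_response, real)
end

canned = {
  'http://example.test/'     => FakeRedirect.new('http://www.example.test/'),
  'http://www.example.test/' => FakePage.new('<h1>hello</h1>')
}

with_canned_responses(canned) do
  page = RedirectFollower.new('http://example.test/').resolve
  puts page.url   # http://www.example.test/
  puts page.body  # <h1>hello</h1>
end
```

Because `canned.fetch` raises a KeyError for any URL that is not in the hash, a test written this way also fails loudly if the follower wanders somewhere unexpected.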

2 Comments

  1. Nemanja Čorlija

    Mar 06, 2009

    Nice piece of code. But, isn’t there supposed to be a check for redirect_limit? I didn’t test it, but I don’t think redirect_limit is honored right now.

    Thanks for sharing.

  2. @Nemanja Good catch. I extracted this from some other code and must have accidentally removed the limit code. I added something back in that should work. TooManyRedirects is now raised once the limit is met.

About

Authored by John Nunemaker (Noo-neh-maker), a programmer who has fallen deeply in love with Ruby.