March 04, 2009
Following Redirects with Net/HTTP
The web is full of redirects. It isn’t that hard to figure out how to follow them using Ruby, but it always helps to have examples when you are learning. Not too long ago I was hacking on some feed auto-discovery code and made a little class that, given a URL, will find the endpoint and return the response from that endpoint.
I figured in the old spirit of clogging, I would post it here until I have time to package the full feed auto-discovery library and release it on GitHub.
require 'logger'
require 'net/http'
require 'uri'

class RedirectFollower
  class TooManyRedirects < StandardError; end

  attr_accessor :url, :body, :redirect_limit, :response

  def initialize(url, limit=5)
    @url, @redirect_limit = url, limit
    logger.level = Logger::INFO
  end

  def logger
    @logger ||= Logger.new(STDOUT)
  end

  # Fetches url and keeps following redirects until a non-redirect
  # response arrives or the redirect limit is used up.
  def resolve
    raise TooManyRedirects if redirect_limit < 0

    self.response = Net::HTTP.get_response(URI.parse(url))

    logger.info "redirect limit: #{redirect_limit}"
    logger.info "response code: #{response.code}"
    logger.debug "response body: #{response.body}"

    if response.kind_of?(Net::HTTPRedirection)
      self.url = redirect_url
      self.redirect_limit -= 1

      logger.info "redirect found, headed to #{url}"
      resolve
    end

    self.body = response.body
    self
  end

  # Uses the Location header, falling back to the first link in the body
  # when the header is missing.
  def redirect_url
    location = if response['location'].nil?
      response.body.match(/<a href=\"([^>]+)\">/i)[1]
    else
      response['location']
    end

    # Location may be relative, so resolve it against the current url.
    URI.join(url, location).to_s
  end
end
You can then follow redirects as easily as this:
google = RedirectFollower.new('http://google.com').resolve
puts google.body
Which when run will output something like this:
I, [2009-03-04T17:16:58.879672 #69272] INFO -- : redirect limit: 5
I, [2009-03-04T17:16:58.880669 #69272] INFO -- : redirect found, headed to http://www.google.com/
I, [2009-03-04T17:16:58.987963 #69272] INFO -- : redirect limit: 4
This is followed by the HTML from www.google.com, which I did not include. The logger method comes in ridiculously handy when following redirects, as it is sometimes hard to figure out what is going on. You can also optionally pass in a limit for how many redirects you would like to follow:
RedirectFollower.new('http://foobar.com', 3).resolve
This would set the number of redirects to follow to 3, instead of the default 5. You always want to put some kind of limit on the number of redirects to follow or you could end up in infinite redirection.
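To see why the limit matters without touching the network, here is a minimal sketch of the same counting logic. The REDIRECTS table and the endpoint_for method are made up for illustration and are not part of RedirectFollower; the table stands in for the web, with two URLs that redirect to each other. The check mirrors the `redirect_limit < 0` test in resolve, so a limit of N allows N hops.

```ruby
class TooManyRedirects < StandardError; end

# Hypothetical stand-in for the network: each URL maps to the URL it
# redirects to, or to nil when it is a final endpoint.
REDIRECTS = {
  'http://a.test/' => 'http://b.test/',
  'http://b.test/' => 'http://a.test/', # a and b point at each other
  'http://c.test/' => 'http://d.test/',
  'http://d.test/' => nil
}

# Follows redirects starting at url and returns the final URL.
# Raises TooManyRedirects once more than `limit` hops would be needed.
def endpoint_for(url, limit = 5)
  raise TooManyRedirects, "gave up at #{url}" if limit < 0

  target = REDIRECTS.fetch(url)
  target ? endpoint_for(target, limit - 1) : url
end

# One hop, well under the default limit.
puts endpoint_for('http://c.test/')

# A redirect loop: without the limit this would recurse forever.
begin
  endpoint_for('http://a.test/', 3)
rescue TooManyRedirects => e
  puts "stopped: #{e.message}"
end
```

With the real class the pattern is the same: wrap the call to resolve in a begin/rescue for RedirectFollower::TooManyRedirects and decide there whether to give up or report the URL.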
Nothing fancy but it gets the job done. Maybe if I get a chance I’ll post on how to test this by stubbing responses.
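In the meantime, here is a rough sketch of one way the stubbing could look, using nothing but the standard library. The helper names (canned_response, with_stubbed_http) and the example.test URLs are invented for this sketch. It also builds Net::HTTPResponse objects by hand, which Net::HTTP treats as internal, so take it as an assumption that works today rather than a supported API.

```ruby
require 'net/http'
require 'uri'

# Builds a real Net::HTTPResponse without touching the network. Overriding
# #body on the instance avoids reading from a socket that does not exist.
def canned_response(klass, code, message, location: nil, body: '')
  response = klass.new('1.1', code, message)
  response['location'] = location if location
  response.define_singleton_method(:body) { body }
  response
end

# Swaps Net::HTTP.get_response for a lookup in `responses` (URL string =>
# canned response) while the block runs, then restores the original.
def with_stubbed_http(responses)
  original = Net::HTTP.method(:get_response)
  Net::HTTP.define_singleton_method(:get_response) do |uri, *|
    responses.fetch(uri.to_s)
  end
  yield
ensure
  Net::HTTP.define_singleton_method(:get_response, original)
end

# Hypothetical fixture: one redirect followed by a normal page.
canned = {
  'http://example.test/' => canned_response(
    Net::HTTPMovedPermanently, '301', 'Moved Permanently',
    location: 'http://www.example.test/'
  ),
  'http://www.example.test/' => canned_response(
    Net::HTTPOK, '200', 'OK', body: '<html>home</html>'
  )
}

with_stubbed_http(canned) do
  first = Net::HTTP.get_response(URI.parse('http://example.test/'))
  puts first.code
  puts first['location']
end
```

Inside the block, RedirectFollower.new('http://example.test/').resolve should walk the canned chain instead of hitting the network, so a test can assert on the final url and body.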
2 Comments
Mar 06, 2009
Nice piece of code. But, isn’t there supposed to be a check for redirect_limit? I didn’t test it, but I don’t think redirect_limit is honored right now.
Thanks for sharing.
Mar 08, 2009
@Nemanja Good catch. I extracted this from some other code and must have accidentally removed the limit code. I added something back in that should work. TooManyRedirects is now raised once the limit is met.