This isn’t Rails-specific, but wow! Scraping just got a whole lot easier with Labnotes’ new library, scrAPI. Here’s an example from the post showing how to use it…

ebay_auction = Scraper.define do
    process "h3.ens>a", :description=>:text,
                        :url=>"@href" 
    process "td.ebcPr>span", :price=>:text
    process "div.ebPicture >a>img", :image=>"@src" 

    result :description, :url, :price, :image
end

ebay = Scraper.define do
    array :auctions

    process "table.ebItemlist tr.single",
            :auctions => ebay_auction

    result :auctions
end

# html holds the source of the page to scrape
auctions = ebay.scrape(html)

Yeah. It’s that easy and that cool. The code is stable and is currently used in co.mments, a production app that does a lot of scraping (it keeps track of comments you leave on other people’s sites).
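
For a rough idea of what you might do with the scraped auctions, here is a small sketch in plain Ruby. It does not use the library at all: it assumes each auction comes back as a struct-like object with one reader per name passed to `result`, and it stands in for that with a hand-built `Struct`. The `Auction` name, the `price_to_f` helper, and the sample data are all made up for illustration.

```ruby
# Stand-in for one scraped auction. Assumption: the real result exposes
# description, url, price and image readers, as listed in `result` above.
Auction = Struct.new(:description, :url, :price, :image)

# Made-up sample data in place of ebay.scrape(html).
auctions = [
  Auction.new("Vintage lamp", "http://example.com/item/1", "$12.50", "http://example.com/1.jpg"),
  Auction.new("Old radio", "http://example.com/item/2", "$40.00", "http://example.com/2.jpg")
]

# Hypothetical helper: strip everything except digits and the decimal
# point from a scraped price string, then convert it to a Float.
def price_to_f(text)
  text.delete("^0-9.").to_f
end

# Pick the cheapest auction and build a one-line summary for each item.
cheapest = auctions.min_by { |a| price_to_f(a.price) }
summary = auctions.map { |a| "#{a.description}: #{a.price} (#{a.url})" }

puts summary
puts "Cheapest: #{cheapest.description}"
```

Prices are scraped as text, so anything numeric (sorting, filtering by budget) needs a conversion step like the one above.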

You can grab the code through svn right now, and it will soon be available as a gem.
