A scraper is a program that takes content from a web page or other data source and turns it into a more useful format, such as an RSS feed or a database. Building a scraper can be tricky because each site is different; ScraperWiki aims to fix this by creating a repository of useful scraper scripts.
An example use of a scraper: let’s say a government entity releases daily information regarding finances, and you want to graph or otherwise track this data for personal or business use. Going to the website each day and entering the data manually is certainly one labour-intensive way to do it, but any good hacker will tell you that if you have to do anything more than once, it is better to automate it.
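To make the idea concrete, here is a minimal sketch of such a scraper in Python, using only the standard library. It is not taken from ScraperWiki: the sample HTML, the table layout, and the names `TableParser`, `scrape`, `store` and the `spending` table are all assumptions made up for illustration, and a real scraper would download the page rather than read it from a string.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical sample page; a real scraper would fetch this HTML from the site.
SAMPLE_HTML = """
<table>
  <tr><th>Date</th><th>Spending</th></tr>
  <tr><td>2010-03-01</td><td>1200.50</td></tr>
  <tr><td>2010-03-02</td><td>980.00</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None       # cells of the row currently being read
        self._in_cell = False  # True while inside a <td> element

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr":
            # Header rows contain only <th> cells, so they stay empty and are skipped.
            if self._row:
                self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell and self._row is not None:
            self._row[-1] += data.strip()


def scrape(html):
    """Turn the HTML table into a list of (date, amount) records."""
    parser = TableParser()
    parser.feed(html)
    return [(date, float(amount)) for date, amount in parser.rows]


def store(records, conn):
    """Save records to SQLite; re-running on the same day overwrites, not duplicates."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS spending (date TEXT PRIMARY KEY, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO spending VALUES (?, ?)", records)
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    store(scrape(SAMPLE_HTML), conn)
    for row in conn.execute("SELECT date, amount FROM spending ORDER BY date"):
        print(row)
```

Run once a day, a script along these lines would replace the manual copying described above. The parsing step is the part that differs from site to site, which is why a shared collection of ready-made scrapers is appealing.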
ScraperWiki is a centralised location for these custom-built scrapers. Instead of writing your own from scratch, you can search its database to see whether a scraper has already been written for a given source.
Scrapers are categorised by language, with PHP, Python and Ruby on offer. The site is currently in beta, but it seems like it could be a useful tool.