Ebot
http://www.redaelli.org/matteo-blog/projects/ebot/
Erlang Bot (Ebot) is an open-source web crawler built on top of Erlang, a NoSQL database (Apache CouchDB or Riak), RabbitMQ, Webmachine (Mochiweb), RRDtool, and other components. Because it uses a NoSQL database instead of a relational one, Ebot can grow easily and cheaply. Ebot is a solid, highly scalable, distributed, and customizable web crawler.
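To make the architecture a little more concrete, here is a minimal sketch of the general pattern (a queue of URLs to visit feeding a store of per-URL documents). It is not Ebot's code: it is written in Python rather than Erlang, the in-memory `deque` and `dict` are only stand-ins for RabbitMQ and CouchDB/Riak, and the `crawl` function, the `fetch` callback and the `links_count` field are hypothetical names chosen for illustration.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collects the href values of <a> tags found in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_pages=10):
    """Breadth-first crawl starting at `seed`.

    `fetch(url)` is a caller-supplied (hypothetical) function returning HTML text.
    Returns a dict mapping each visited URL to a small document about it.
    """
    queue = deque([seed])  # stand-in for a message queue of URLs to visit
    store = {}             # stand-in for a NoSQL store keyed by URL
    while queue and len(store) < max_pages:
        url = queue.popleft()
        if url in store:
            continue  # already crawled
        parser = LinkParser()
        parser.feed(fetch(url))
        # Resolve relative links against the page URL.
        links = [urljoin(url, href) for href in parser.links]
        # Illustrative document shape only; not Ebot's actual schema.
        store[url] = {"url": url, "links_count": len(links)}
        queue.extend(link for link in links if link not in store)
    return store
```

Passing `fetch` in as a parameter keeps the sketch runnable offline; a real crawler would plug in an HTTP client and run many workers consuming from the queue in parallel.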
The Ebot crawler project is hosted at http://github.com/matteoredaelli/ebot
Thanks to the Ebot crawler, I've been improving my knowledge of Erlang, the AMQP protocol (RabbitMQ), and NoSQL databases (Apache CouchDB and Riak) with their distributed map/reduce queries.
Below is an example of a URL document generated by the Ebot crawler (with the Apache CouchDB backend).
Below is a sample image of the statistics generated by the Ebot web crawler using RRDtool.