I am planning to write a web crawler for an NLP project that reads the thread structure of a forum at a fixed interval and parses each thread that has new content. Vi
I am also evaluating Erlang for use as a web crawler, and it looks good so far.
There are lots of helpful existing modules: an HTML parser, an HTTP client, XPath, regex, and caching.
Other people are interested in the same use case, so you can learn from them.
However, if this is just a one-off project, I recommend Python, Ruby, or Perl because it will be easier to get started.
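
If you go the scripting route, the polling loop described in the question could look roughly like this in Python. This is only a sketch using the standard library: the markup it expects (`<a class="thread" href="..." data-replies="N">`) is a made-up assumption about the forum, and the names `ThreadIndexParser`, `parse_index`, `changed_threads`, `crawl`, `fetch_index` and `handle_thread` are hypothetical. You would plug in your own HTTP fetch and thread parsing.

```python
import time
from html.parser import HTMLParser


class ThreadIndexParser(HTMLParser):
    """Collects thread links and reply counts from a forum index page.

    Assumption: each thread appears as
    <a class="thread" href="/thread/ID" data-replies="N">title</a>.
    A real forum will use different markup.
    """

    def __init__(self):
        super().__init__()
        self.threads = {}  # href -> reply count

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        href = attrs.get("href")
        if tag == "a" and attrs.get("class") == "thread" and href:
            self.threads[href] = int(attrs.get("data-replies") or 0)


def parse_index(html):
    """Return {href: reply_count} for every thread link in the page."""
    parser = ThreadIndexParser()
    parser.feed(html)
    parser.close()
    return parser.threads


def changed_threads(previous, current):
    """Return threads that are new or whose reply count differs."""
    return sorted(
        href for href, replies in current.items()
        if previous.get(href) != replies
    )


def crawl(fetch_index, handle_thread, interval_seconds, rounds):
    """Poll the index `rounds` times and handle only changed threads.

    fetch_index: callable returning the index page HTML as a string.
    handle_thread: callable invoked with the href of each changed thread.
    """
    seen = {}
    for i in range(rounds):
        current = parse_index(fetch_index())
        for href in changed_threads(seen, current):
            handle_thread(href)
        seen = current
        if i < rounds - 1:
            time.sleep(interval_seconds)
```

On the first poll every thread counts as new, so all of them get parsed once. After that, only threads whose reply count changed are revisited. For a long-running crawler you would probably persist `seen` between runs and compare on something sturdier than a reply count, such as a last-post timestamp if the forum exposes one.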