I'm writing a web crawler and want to ignore URLs that link to binary files:
$exclude = %w(flv swf png jpg gif asx zip rar tar 7z gz jar js css dtd xsd ico)
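Use URI#path, which returns only the path component of the URL, so a query string or fragment cannot hide the extension: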
unless URI.parse(url).path =~ /\.(\w+)$/ && $exclude.include?($1)
  puts "downloading #{url}..."
end
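
If you want something a bit more defensive, here is a rough sketch of the same idea wrapped in a helper. The method name `excluded_url?`, the constant `EXCLUDED_EXTENSIONS`, the shortened extension list, and the example.com URLs are all made up for illustration. It swaps the regex for `File.extname`, lowercases the extension, and treats unparsable URLs as "not excluded", which is an assumption you may want to change.

```ruby
require 'uri'

# Hypothetical, shortened exclusion list used only for this sketch.
EXCLUDED_EXTENSIONS = %w(flv swf png jpg gif zip rar tar 7z gz js css ico).freeze

# Returns true when the URL's path ends in one of the excluded extensions.
def excluded_url?(url, excluded = EXCLUDED_EXTENSIONS)
  # URI#path can be nil for opaque URIs such as mailto:, hence to_s.
  path = URI.parse(url).path.to_s
  # File.extname("/movie.FLV") => ".FLV"; strip the dot and ignore case.
  ext = File.extname(path).delete_prefix('.').downcase
  excluded.include?(ext)
rescue URI::InvalidURIError
  # Assumption: an unparsable URL is not treated as a binary link here.
  false
end

%w(
  http://example.com/movie.FLV?autoplay=1
  http://example.com/index.html
  http://example.com/archive.tar.gz
).each do |url|
  puts "downloading #{url}..." unless excluded_url?(url)
end
```

Of the three sample URLs, only the index.html one gets past the filter. Note that this only looks at the URL; a server can still return binary content from a path with no extension, so checking the Content-Type of the response may be worth adding on top.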