I have the following site: http://www.asd.com.tr. I want to download all PDF files into one directory. I've tried a couple of commands but am not having much luck.
First, verify that the site's terms of service permit crawling it. Then one solution is:
# list all links on the page, keep those ending in .pdf,
# percent-encode spaces, then fetch each one
mech-dump --links 'http://domain.com' |
grep -i '\.pdf$' |
sed 's/ /%20/g' |
xargs -I% wget 'http://domain.com/%'
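If you would rather avoid the extra tool, wget alone can often do the job. This is a sketch, not part of the recipe above: it assumes the PDFs are linked directly from the page being crawled, and the target directory name pdfs is arbitrary.

# recurse one level, keep everything in a single directory,
# and accept only files ending in .pdf
wget -r -l1 -nd -A pdf -P pdfs 'http://domain.com/'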
The mech-dump command ships with Perl's WWW::Mechanize module (the libwww-mechanize-perl package on Debian and Debian-like distros).
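On such systems it can typically be installed like this (assuming apt, or a working CPAN setup for the second option):

# Debian/Ubuntu package that provides mech-dump
sudo apt-get install libwww-mechanize-perl

# or install the Perl module from CPAN (mech-dump comes with it)
cpan WWW::Mechanize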