This is a VERY strange wget behavior. I'm on Debian 7.2.
wget -r -O - www.blankwebsite.com
hangs forever. And I mean it really hangs: it isn't crawling the internet, which I can verify with strace.
If I do this:
while read R
do
wget -r -O - www.blankwebsite.com
done < smallfile
with smallfile containing a single line, the command exits in a few seconds.
I also tried with
wget -r -O - localhost/test.html
with an empty test.html file, and got the same results. To me, it sounds like a bug.
Everything runs fine if I replace -O - with -O myfile, or if I remove -r.
I used -O - because I was passing the output to grep.
Could anyone explain that? Have you seen anything similar?
Of course:
wget -r -O file www.blankwebsite.com
works, but the BUG is that:
wget -r -O - www.blankwebsite.com
hangs!
The same problem is if you create a FIFO
mkfifo /tmp/myfifo
wget -r -O /tmp/myfifo www.blankwebsite.com
When called with the -r option, wget looks for HTML "a href=..." tags by reading the output file back. Since the output file is a FIFO or stdout (i.e. the hyphen character '-'), it cannot find any tags and waits for input. You then end up with a wget process waiting forever on a read system call.
To resolve this you can:
1) Patch wget to handle this case.
2) Patch wget to disallow the "-r -O -" combination (just check that the argument of '-O' is a regular file).
3) Use a workaround like:
TMPFILE=$(mktemp /tmp/wget.XXXXXX)
wget -r -O "$TMPFILE" www.blankwebsite.com
grep STRING "$TMPFILE"
rm -f "$TMPFILE"
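The same workaround can be wrapped in a small function so that the temporary file is always removed and grep's exit status is preserved. This is only a sketch: grep_site is a hypothetical helper name (it is not part of wget), and the -q flag is added here just to keep wget's own messages out of the way.

```shell
#!/bin/sh
# Sketch of the temp-file workaround as a reusable function.
# grep_site is a hypothetical name chosen for this example.
grep_site() {
    pattern=$1
    url=$2
    tmpfile=$(mktemp /tmp/wget.XXXXXX) || return 1
    # -O points at a regular file, so wget can read it back for links.
    wget -r -q -O "$tmpfile" "$url"
    # Search the downloaded content and remember grep's exit status.
    grep -- "$pattern" "$tmpfile"
    status=$?
    rm -f -- "$tmpfile"
    return "$status"
}
```

It would be called as grep_site STRING www.blankwebsite.com, and its exit status is the one grep returned.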
@tonjo: Can you please try using the following code:
wget -r -O file www.blankwebsite.com
instead of using
wget -r -O - www.blankwebsite.com
as stated in the documentation:
Similarly, using '-r' or '-p' with '-O' may not work as you expect:
Wget won't just download the first file to FILE and then download
the rest to their normal names: _all_ downloaded content will be
placed in FILE. This was disabled in version 1.11, but has been
reinstated (with a warning) in 1.11.2, as there are some cases
where this behavior can actually have some use.
That is a known problem, and it is also documented to some extent: using -r and -O with non-seekable files doesn't work with the way wget serializes the data directly to the file.
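Until wget itself rejects such combinations, the check suggested in an earlier answer (only accept a regular file as the argument of -O when -r is used) can be approximated in a wrapper script. The following is a minimal sketch under that assumption. safe_recursive_target is a hypothetical helper name, not a wget feature, and it treats a path that does not exist yet as acceptable because wget would create it as a regular file.

```shell
#!/bin/sh
# Sketch: decide whether a path is a safe -O target for "wget -r".
# safe_recursive_target is a hypothetical name chosen for this example.
safe_recursive_target() {
    target=$1
    # "-" means stdout, which cannot be read back.
    if [ "$target" = "-" ]; then
        return 1
    fi
    # An existing path must be a regular file (rejects FIFOs, devices, directories).
    if [ -e "$target" ] && [ ! -f "$target" ]; then
        return 1
    fi
    return 0
}
```

A caller could then run something like safe_recursive_target "$out" && wget -r -O "$out" www.blankwebsite.com, and fall back to a temporary file otherwise.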
Source: https://stackoverflow.com/questions/19681316/wget-hangs-with-r-and-o