Getting the jpg images from an HTML file

I'm trying to use grep to get the full URL addresses of jpg images in an HTML file. One problem is that there aren't many newlines in it, so when I use grep it gets the path,

1 answer

春和景丽

2021-02-11 02:05

One single sed command

sed -n '/<img/s/.*src="\([^"]*\)".*/\1/p' yourfile.html

or, using ERE (extended regular expressions) to avoid the backslashes in the expression above:

sed -E -n '/<img/s/.*src="([^"]*)".*/\1/p' yourfile.html
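One caveat for files with few newlines, as in the question: sed applies the substitution once per line and `.*` is greedy, so either sed form prints only the last src on each line. A minimal sketch of one possible workaround is to break the input after every `>` first, so each tag sits on its own line. The sample HTML and the helper name `extract_src` are hypothetical, and the approach assumes no attribute value contains a literal `>`.

```shell
# Hypothetical sample: two <img> tags on one line, as described in the question.
html='<p><img src="http://example.com/a.jpg"><img src="http://example.com/b.png"></p>'

# extract_src is a hypothetical helper wrapping the sed command from above.
extract_src() {
  sed -n '/<img/s/.*src="\([^"]*\)".*/\1/p'
}

# Whole line at once: only the last src on the line is printed.
printf '%s\n' "$html" | extract_src

# One tag per line: every src is printed.
printf '%s\n' "$html" | tr '>' '\n' | extract_src
```

With this sample, the first pipeline prints only the b.png URL, while the second prints both URLs, one per line.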

One basic grep command

grep -o '<img[^>]*src="[^"]*"' yourfile.html

Two successive basic grep commands

grep -o '<img[^>]*src="[^"]*"' yourfile.html | grep -o '"[^"]*"'
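The two-step pipeline above keeps the surrounding quotes and also prints any other quoted attribute that appears before src. Since the question asks for jpg images specifically, a possible variant is sketched below: it isolates the src attribute, strips the quotes, and keeps only URLs ending in .jpg or .jpeg. The helper name `list_jpg_urls` and the sample input are hypothetical, and URLs with a query string after the extension would be dropped by the final filter.

```shell
# list_jpg_urls is a hypothetical helper; it reads HTML on standard input.
list_jpg_urls() {
  grep -o '<img[^>]*src="[^"]*"' |   # each <img ... src="..." fragment
    grep -o 'src="[^"]*"' |          # only the src attribute itself
    sed 's/^src="//; s/"$//' |       # drop the attribute name and the quotes
    grep -Ei '\.jpe?g$'              # keep URLs ending in .jpg or .jpeg
}

# Hypothetical sample with three images on a single line.
printf '%s\n' '<img class="thumb" src="http://example.com/a.jpg"><img src="b.png"><img src="/img/c.JPEG">' |
  list_jpg_urls
```

For a real file, the same helper could be called as `list_jpg_urls < yourfile.html`. Relative paths are printed as found; they are not resolved to full URLs.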

One single grep command using Perl-compatible regular expressions (PCRE)

grep -Po '<img[^>]*src="\K[^"]*(?=")' yourfile.html

Using ack as a grep-like replacement

sudo apt install ack
ack -o '<img[^>]*src="\K[^"]*(?=")' yourfile.html

Downloading a web page as proposed by s-hunter

curl -s example.com/a.html | sed -En '/<img/s/.*src="([^"]*)".*/\1/p'
