Using wildcards in wget or curl queries

Asked by 庸人自扰 on 2020-12-02 13:34

Is it possible to use wildcards in wget (or curl) queries when downloading from directories? Basically, I have a site like www.download.example.com/dir/version/package.rpm, and I'd like to fetch several versions or packages without spelling out every URL.

3 Answers
  • 2020-12-02 14:01

    If you can find a pattern in your URLs, you can use bash brace expansion to do this. The expansion is performed by the shell, so wget simply receives the resulting list of URLs.

    For example, in your case, you may use something like:

    wget www.download.example.com/dir/{version,old}/package{00..99}.rpm
    

    Also, you may combine this with wget's -A (accept) and -R (reject) options to filter the results, as in the sketch below.
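    A minimal sketch of both variants; the host, directory names, and package names are placeholders carried over from the example above:

    # Brace expansion happens in the shell, so wget receives plain URLs;
    # with -r -l1 -np, the -A filter keeps only matching files:
    wget -r -l1 -np -A '*.rpm' www.download.example.com/dir/{version,old}/

    # curl has URL globbing built in, so no shell expansion is needed:
    curl -O "http://www.download.example.com/dir/version/package[00-99].rpm"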

  • 2020-12-02 14:04

    You can't use wildcards in wget for HTTP URLs, but the -A flag should work. From the wget manpage:

    You want to download all the GIFs from a directory on an HTTP server. You tried wget http://www.server.com/dir/*.gif, but that didn't work because HTTP retrieval does not support globbing. In that case, use:

    wget -r -l1 --no-parent -A.gif http://www.server.com/dir/

    Edit: found a related question

    Regarding directories:

    There's a utility called lftp, which has some support for globbing. Take a look at its manpage. There's another question on Unix & Linux Stack Exchange that covers its usage in a scenario similar to yours.
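    For illustration, a rough sketch of lftp globbing over HTTP; this assumes the server publishes an index listing that lftp can parse, and the host and pattern are placeholders from the manpage example above:

    # Download every GIF in the directory via lftp's mget globbing
    lftp -c 'open http://www.server.com/dir/; mget *.gif'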

  • 2020-12-02 14:10

    Although the above solution mostly works, it fails when you want to download only certain directories, not all of them. For example, if you have:

    http://site.io/like/
    http://site.io/like2/
    http://site.io/nolike/
    

    Instead, put the directory names you want in a text file, e.g. dirs.txt:

    like/
    like2/
    

    Then run wget with -i dirs.txt (read URLs from the file) and -B <base-URL> (prepended to each entry), like so:

    # -np stays inside each listed directory; -R "index.html*" skips the listing pages
    wget -nH -nc -np -r -e robots=off -R "index.html*" -i dirs.txt -B http://site.io/
    

    This is needed because, as far as I can tell, you can't use directory names in the -A and -R lists.
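    If the directory names share a recognisable prefix, you could even generate dirs.txt. This is only a sketch, assuming the server exposes an auto-generated index page with relative href links; the "like" prefix is taken from the example above:

    # Hypothetical helper: extract matching directory links from the index page
    curl -s http://site.io/ | grep -o 'href="like[^"]*/"' | cut -d'"' -f2 > dirs.txt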
