I cannot get wget to mirror a section of a website (a folder path below root) - it only seems to work from the website homepage.
I've tried many options - here is o
Use the --mirror (-m) and --no-parent (-np) options, plus a few other useful ones, as in this example:
wget --mirror --page-requisites --adjust-extension --no-parent --convert-links \
    --directory-prefix=sousers http://stackoverflow.com/users
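One detail that often causes the "only works from the homepage" symptom described in the question: wget treats everything after the last slash as a filename, so --no-parent only confines the crawl to the folder when the starting URL ends in a trailing slash. A minimal sketch of the same command with that adjusted (the path is just the example from above):
# Trailing slash tells wget that /users/ is a directory, so --no-parent
# keeps the crawl inside it instead of wandering back up to the site root.
wget --mirror --page-requisites --adjust-extension --no-parent --convert-links \
    --directory-prefix=sousers http://stackoverflow.com/users/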
I usually use:
wget -m -np -p $url
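For reference, those short options expand to the long forms used in the other answer; here is a sketch with each flag spelled out in comments (the URL is a placeholder):
# -m  = --mirror          : recursive download with timestamping and infinite depth
# -np = --no-parent       : never ascend above the starting directory
# -p  = --page-requisites : also fetch the images/CSS/JS needed to render each page
wget -m -np -p http://www.example.com/some_directory/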
Check out archivebox.io; it's an open-source, self-hosted tool that creates a local, static, browsable HTML clone of websites (it saves HTML, JS, media files, PDFs, screenshots, static assets and more).
By default it only archives the URL you specify, but we're adding a --depth=n flag soon that will let you recursively archive links from the given URL.
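As a rough sketch of how that looks in practice (command names as I recall them from the ArchiveBox docs, so treat them as assumptions; the URL is a placeholder):
pip install archivebox                                     # install the CLI
archivebox init                                            # set up a new archive in the current (empty) directory
archivebox add 'http://www.example.com/some_directory/'    # snapshot that single URL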
I use pavuk for mirroring, as it has always seemed much better suited to this purpose. You can use something like this:
/usr/bin/pavuk -enable_js -fnrules F '*.php?*' '%o.php' -tr_str_str '?' '_questionmark_' \
-norobots -dont_limit_inlines -dont_leave_dir \
http://www.example.com/some_directory/ >OUT 2>ERR