How to save HTML pages as one file?

谎友^ 2021-02-12 12:21

I want to be able to save / archive HTML pages as one file (without those pesky external folders).

I want the resulting file to contain all styles, images, and links (vid

4 Answers
  • 2021-02-12 13:01

    Current versions of Google Chrome can view and create MHTML files once you enable the "Save Page as MHTML" option on the chrome://flags page.

    To open that page, type chrome://flags in the address bar.

    However, enabling this experimental option disables saving pages as HTML-only or HTML Complete files. From the chrome://flags page:

  • 2021-02-12 13:07

    Extending upon zTrix's answer, I would suggest avoiding the Chrome extension (which did not work for me at all) and instead going with one of these options:

    • Node.js: remy's inliner
      • Easy to install using npm
      • Many options, including flags for disabling minification/compression, maintaining external images, skipping videos, and more.
      • Caveat: (22 September 2017) fails to maintain styling and JavaScript functionality when compiling Slate builds. This won't affect most people directly, but it means that inliner will probably have issues with other pages. See this issue
      • Caveat: no options to "leave things alone": will either minify/uglify CSS/JS or beautify, but will not simply embed original source into HTML.
    • Python 2: zTrix's webpage2html
      • More conservative than inliner; works well for most cases.
      • zTrix fixed a bug (that inliner also seems to have) which ensures JavaScript/CSS functionality when compiling Slate builds. See this issue. (updated 29 September 2017)
      • Can be converted to Python 3 relatively painlessly
      • Caveat: cannot handle CSS @import
  • 2021-02-12 13:10

    It is usually possible to create a single HTML file that contains all of its dependent files (CSS, JPG, JS, SVG, ...).
    To do so, rewrite the HTML file: replace the values of "src" attributes and the arguments of "url()" functions with inline data, and embed external files in HTML tags: "<script></script>" for JavaScript files, "<style></style>" for CSS files, and "<svg></svg>" for SVG images.
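
    As a rough illustration of the tag replacement described above, here is a minimal Python sketch. It assumes the external CSS and JavaScript files have already been downloaded into a dictionary keyed by URL; the function name `inline_text_assets` and the `assets` dictionary are made up for this example. It uses simple regular expressions rather than a real HTML parser, so it will miss unusual markup, and it does not handle JavaScript that itself contains the text "</script>".

```python
import re
from typing import Dict

# Matches <link ... href="..."> tags (the rel="stylesheet" check happens below).
LINK_RE = re.compile(r'<link\b[^>]*\bhref=["\']([^"\']+)["\'][^>]*>', re.IGNORECASE)
# Matches empty <script src="..."></script> tags.
SCRIPT_RE = re.compile(
    r'<script\b[^>]*\bsrc=["\']([^"\']+)["\'][^>]*>\s*</script>', re.IGNORECASE
)


def inline_text_assets(html: str, assets: Dict[str, str]) -> str:
    """Replace external stylesheet and script references with inline tags.

    `assets` maps a URL, exactly as written in the HTML, to the text of the
    already-downloaded file (hypothetical input for this sketch).
    """

    def css_repl(match: "re.Match[str]") -> str:
        tag, url = match.group(0), match.group(1)
        # Only touch stylesheet links whose content we actually have.
        if "stylesheet" in tag.lower() and url in assets:
            return "<style>\n" + assets[url] + "\n</style>"
        return tag

    def js_repl(match: "re.Match[str]") -> str:
        url = match.group(1)
        if url in assets:
            return "<script>\n" + assets[url] + "\n</script>"
        return match.group(0)

    html = LINK_RE.sub(css_repl, html)
    return SCRIPT_RE.sub(js_repl, html)
```

    References whose URL is not in `assets` are left untouched, so the page still loads them externally.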

    For example, take a GIF image file referenced in CSS by the "url()" function:

    1. Download the image from its URL.
    2. Encode the image in Base64.
    3. Replace "url('https://en.wikipedia.org/wiki/File:TPB_Magnet_Icon.gif')" with "url('data:image/gif;base64,R0lGODlhDAAMALMPAOXl5ewvErW1tebm5oocDkVFRePj47a2ts0WAOTk5MwVAIkcDesuEs0VAEZGRv///yH5BAEAAA8ALAAAAAAMAAwAAARB8MnnqpuzroZYzQvSNMroUeFIjornbK1mVkRzUgQSyPfbFi/dBRdzCAyJoTFhcBQOiYHyAABUDsiCxAFNWj6UbwQAOw')", that is, the Base64-encoded GIF image prefixed by "data:image/gif;base64,".

    You can do the same for the value of a "src" attribute. This approach also works for other binary files, as long as you adapt the "data:" prefix to the type of the encoded object.
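
    The encoding and replacement steps can be sketched in Python roughly as follows. This is only an illustration, not the code of any of the tools mentioned in this thread: the names `to_data_uri` and `inline_css_urls` are made up, and it assumes the image bytes have already been downloaded and that you know each file's MIME type.

```python
import base64
import re
from typing import Dict, Tuple

# Matches url(...) with optional single or double quotes around the address.
CSS_URL_RE = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""")


def to_data_uri(data: bytes, mime_type: str) -> str:
    """Encode raw bytes in Base64 and add the matching "data:" prefix."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def inline_css_urls(css: str, resources: Dict[str, Tuple[bytes, str]]) -> str:
    """Replace url(...) references in CSS with data URIs.

    `resources` maps a URL, exactly as written in the CSS, to a
    (file bytes, MIME type) pair (hypothetical input for this sketch).
    """

    def repl(match: "re.Match[str]") -> str:
        url = match.group(2)
        if url not in resources:
            return match.group(0)  # leave unknown references alone
        data, mime_type = resources[url]
        return f"url('{to_data_uri(data, mime_type)}')"

    return CSS_URL_RE.sub(repl, css)
```

    For a PNG you would pass "image/png" instead of "image/gif", and so on for other file types.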

  • 2021-02-12 13:22

    The SingleFile Chrome extension is a good solution.

    I have also written my own Python tool to solve this problem, which I recommend trying: https://github.com/zTrix/webpage2html
