How to Archive a Dynamic (PHP) Website as Static HTML? [closed]

Posted by 谁都会走 on 2019-12-05 22:13:23

Question


We're in the process of shutting down The Conversations Network (including the IT Conversations podcast). The plan is to render a static-HTML version of our websites for permanent hosting at the Internet Archive.

What's the easiest way to generate static HTML from the roughly 5,000 pages currently generated dynamically by PHP?

I know we could tweak the code to cache the PHP output, write it to files, then walk the sitemaps to generate every page. But I wonder if there are other options we should consider. Are there any tools for doing this by scraping the HTML as-is? (Something other than Acrobat Pro?)
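The "walk the sitemaps" idea could be sketched roughly as follows. This is only a minimal illustration, assuming the sitemap follows the standard sitemaps.org schema and that every page can be saved as `<path>/index.html`; the function names (`extract_urls`, `fetch`, `archive`) are hypothetical, and query strings are ignored here.

```python
import urllib.request
import xml.etree.ElementTree as ET
from pathlib import Path
from urllib.parse import urlparse

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def extract_urls(sitemap_xml):
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.iter(SITEMAP_NS + "loc") if loc.text]


def fetch(url):
    """Download one page and return the raw bytes the server sent."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def archive(urls, out_dir, fetch_page=fetch):
    """Save each URL under out_dir as <path>/index.html; return the paths written."""
    written = []
    for url in urls:
        # Assumption: the URL path alone identifies the page.
        rel = urlparse(url).path.strip("/")
        target = Path(out_dir) / rel / "index.html"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(fetch_page(url))
        written.append(target)
    return written
```

Calling `archive(extract_urls(sitemap_text), "static_site")` would write one file per sitemap entry. Because the pages are fetched over HTTP, the PHP code itself does not need to change, but anything loaded later by Ajax would not be captured.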

Unfortunately, we also have a fair number of Ajax calls, which are going to make this more difficult. I imagine we'll have to un-Ajax them first.


Answer 1:


There is a great piece of software called "Teleport Pro" (payware, unfortunately) that can create browsable, duplicated copies of a website. Once uploaded to a server, the copy should work exactly the same as the original site.

Things to keep in mind when you're creating static HTML from dynamic pages:

  • Your current Ajax calls need to be un-Ajaxed (as you said yourself).
  • .htaccess settings (mod_rewrite rules, for example) can make your static files worthless, because links might no longer work.
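To illustrate the link problem, here is a rough sketch of how dynamic or rewritten URLs might be mapped to file names that a plain file server can deliver without any rewrite rules. The naming scheme and the `static_path` helper are assumptions made for illustration, not something Teleport Pro does; links inside the saved pages would still have to be rewritten to match.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse


def static_path(url):
    """Map a dynamic URL to a relative file path usable without mod_rewrite."""
    parts = urlparse(url)
    path = PurePosixPath(parts.path.lstrip("/"))
    if parts.path.endswith("/") or path.suffix == "":
        # "Pretty" rewritten URL or directory: store it as <dir>/index.html.
        path = path / "index.html"
    elif path.suffix == ".php":
        # The output is plain HTML, so give it an extension servers recognise.
        path = path.with_suffix(".html")
    if parts.query:
        # Fold the query string into the name so ?id=1 and ?id=2 do not collide.
        safe = re.sub(r"[^A-Za-z0-9]+", "-", parts.query).strip("-")
        path = path.with_name(f"{path.stem}_{safe}{path.suffix}")
    return path.as_posix()
```

For example, `static_path("http://example.org/show.php?id=42")` gives `show_id-42.html`, while static assets such as `css/site.css` keep their original path.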

But "Teleport Pro" is a really solid program that has been around for quite some time. I have used it in the past and will probably use it again.


Another approach might be the PHP module "php-apc", which creates a cache. In this case you would need to crawl the whole site before a complete cache is created. I'm not too familiar with it, but it is easy to install, and you could see whether the generated files are of any use. Keep in mind that APC is primarily an opcode cache held in shared memory, so it may not produce ready-made HTML files at all.




Answer 2:


It might not be what you are looking for, but HTTrack will crawl your website by following its links and save an HTML version of it. The mirror will include all linked static content, such as images, CSS and JavaScript.

The only problem I can think of is if your AJAX script is pulling vital data from the server, but perhaps HTTrack has a setting for that.
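One way to find out which mirrored pages still depend on such calls is to scan the saved HTML for Ajax-looking code. The sketch below is a crude heuristic under stated assumptions: the patterns only cover `XMLHttpRequest`, a few jQuery helpers and `fetch(`, the helper names are hypothetical, and scripts kept in separate .js files are not inspected.

```python
import re
from pathlib import Path

# Heuristic patterns only; they can miss calls or flag harmless code.
AJAX_HINTS = re.compile(
    r"XMLHttpRequest|\$\.(?:ajax|get|post|getJSON)\s*\(|\bfetch\s*\("
)


def find_ajax_hints(html):
    """Return the sorted set of Ajax-looking snippets found in a page's markup."""
    return sorted({match.group(0) for match in AJAX_HINTS.finditer(html)})


def scan_mirror(root):
    """Map each mirrored .html file (relative path) to the hints it contains."""
    report = {}
    for page in sorted(Path(root).rglob("*.html")):
        text = page.read_text(encoding="utf-8", errors="replace")
        hints = find_ajax_hints(text)
        if hints:
            report[page.relative_to(root).as_posix()] = hints
    return report
```

Running `scan_mirror` on the mirror directory gives a list of pages to check by hand before deciding which ones need to be un-Ajaxed.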



Source: https://stackoverflow.com/questions/12608622/how-to-archive-a-dynamic-php-website-as-static-html
