What's the least redundant way to make a site with JavaScript-generated HTML crawlable?

半阙折子戏 2021-01-30 09:30

After reading Google's policy on making Ajax-generated content crawlable, along with many developers' blog posts and Stack Overflow Q&A threads on the subject, I'm left wi

5 answers
  •  清酒与你
    2021-01-30 09:36

    Why didn't I think of this before! Just use http://phantomjs.org. It's a headless WebKit browser. You'd just build a set of actions to crawl the UI and capture the HTML at every state you'd like. Phantom can turn the captured HTML into .html files for you and save them to your web server.

    The whole thing would be automated on every build/commit (PhantomJS is command-line driven). The JS code you write to crawl the UI would break as you change the UI, but it shouldn't be any worse than automated UI testing, and it's just JavaScript, so you can use jQuery selectors to grab buttons and click them.
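
    Roughly, the crawl-and-save script might look like the sketch below. It's untested against any real app: BASE_URL, OUT_DIR, STATES, the 500 ms render delay, and the helper names snapshotPathFor and captureNext are all made-up placeholders, and the file naming scheme is just one possible choice. The PhantomJS calls it relies on are require('webpage').create(), page.open, page.content, fs.makeTree, fs.write, and phantom.exit.

```javascript
// Sketch only: snapshot a fixed list of UI states to static .html files.
// BASE_URL, OUT_DIR and STATES are hypothetical; substitute your own.
var BASE_URL = 'http://localhost:8080/';
var OUT_DIR = 'snapshots';
var STATES = ['#!/home', '#!/products/42'];

// Map a hash-bang state such as "#!/products/42" to a file path
// such as "snapshots/products_42.html". An empty state becomes "index".
function snapshotPathFor(state, outDir) {
  var name = state.replace(/^#!?\/?/, '').replace(/[^A-Za-z0-9_-]+/g, '_');
  return outDir + '/' + (name || 'index') + '.html';
}

// Capture the states in `queue` one at a time, then exit PhantomJS.
function captureNext(fs, queue) {
  if (queue.length === 0) {
    phantom.exit(0);
    return;
  }
  var state = queue.shift();
  // A fresh page per state, so every capture starts from a full page load.
  var page = require('webpage').create();
  page.open(BASE_URL + state, function (status) {
    if (status !== 'success') {
      console.log('Failed to load ' + state);
      page.close();
      captureNext(fs, queue);
      return;
    }
    // Assumption: a fixed delay is enough for client-side rendering to finish.
    setTimeout(function () {
      fs.write(snapshotPathFor(state, OUT_DIR), page.content, 'w');
      page.close();
      captureNext(fs, queue);
    }, 500);
  });
}

// Only start crawling when actually running under PhantomJS.
if (typeof phantom !== 'undefined') {
  var fs = require('fs');
  fs.makeTree(OUT_DIR); // create the output directory if it is missing
  captureNext(fs, STATES.slice());
}
```

    Saved as, say, snapshot.js, it would run from a build step as `phantomjs snapshot.js`. To reach states that only appear after a click, you could add a page.evaluate call before the write that finds the button and clicks it.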

    If I had to solve the SEO problem, this is definitely the first approach I'd prototype. Crawl and save, baby. Yessir.
