reactjs - fetch as google displays blank page only

Submitted by 主宰稳场 on 2019-11-30 17:47:36

Question


I've just coded my first website using React, but when I check how Google sees my website ("Fetch as Google" in Search Console), all I get back is a blank page.

My HTML file looks like this:

<!DOCTYPE html>
<html>
<head>
    <title>MySite</title>
</head>
<body>
    <div id="root"></div>
    <script async type="text/javascript" src="index.browser.js"></script>
</body>
</html>

I've deactivated all AJAX calls for testing, and ReactDOM.render is executed as soon as its JS file has loaded. The JS file itself is compiled, compressed and less than 300 KB in size (including all libraries, React itself among them).
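For context, the entry point boils down to something like this (a simplified sketch; App stands in for the actual root component):

import React from 'react';
import ReactDOM from 'react-dom';
import App from './App'; // placeholder for the real root component

// Mounts the app into the <div id="root"> from the HTML above
ReactDOM.render(<App />, document.getElementById('root'));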

At this point, I don't understand what changes I can make so that Google renders my page correctly. As far as I understand, Google rendering issues with React commonly stem from AJAX calls or other long-running work done in application code before the site itself is rendered and the DOM is updated. But after removing big libraries (apart from i18next and React itself) and minifying and compressing the code, I don't see what else I could do to significantly improve performance or rendering time. PageSpeed Insights gives me 99/100 points (desktop; it only complains that I could minify the HTML to save 110 bytes).

Any ideas where my mistake could be? Server-side rendering is not really a suitable option for me.

You can inspect the demo page here: http://comparo.com.mx

As you can see, there is not much there - but the displayed HTML content is rendered right after index.browser.js loads, and since that file is under 300 KB, it shouldn't keep Google Search Console from rendering the page correctly.

EDIT: My server is located in Europe, and as far as I know the Google servers crawl from the US. Could that be an issue in any way?


Answer 1:


Add babel polyfill to your project:

npm install --save babel-polyfill

And then import it in your index.js (entry point):

import 'babel-polyfill';
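Alternatively, if webpack is your bundler (a sketch; the entry path below is a placeholder), babel-polyfill can be listed as the first item of the entry array so it loads before any application code:

// webpack.config.js (illustrative)
module.exports = {
  entry: ['babel-polyfill', './src/index.js'], // polyfill first, then your entry file
  // ...rest of your existing configuration
};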

Hopefully, this will solve your problem.




Answer 2:


I would not be sure that is exactly how Google sees your website, as most simulators just strip out JavaScript.

Did you use https://www.google.com/webmasters/tools/googlebot-fetch ?

In general, JavaScript support is limited for search engines, so if you really want crawlers to index your site, you would have to implement server-side rendering for React.

I've used https://github.com/kriasoft/react-starter-kit to generate http://gifhub.net It was a bit complicated experience but it worked at the end.

There are also frameworks like NextJS https://github.com/zeit/next.js/ that you can leverage to ensure you have server rendered content.
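To give a rough idea (an illustrative sketch, not from the original answer), a Next.js page is just a React component exported from the pages directory, and it is rendered on the server by default:

// pages/index.js in a Next.js project (illustrative)
import React from 'react';

export default function Home() {
  // Rendered on the server, so crawlers receive real HTML markup
  return <h1>Server-rendered content</h1>;
}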

A third option is to use Google's headless Chrome browser to generate content for crawlers: https://github.com/GoogleChrome/puppeteer
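A rough sketch of that third approach with Puppeteer (the output file name is a placeholder; the URL is the page from the question):

// prerender.js (illustrative)
const fs = require('fs');
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch();
  const page = await browser.newPage();
  // Wait until the network is idle so client-side rendering has finished
  await page.goto('http://comparo.com.mx', { waitUntil: 'networkidle0' });
  const html = await page.content(); // the fully rendered HTML
  fs.writeFileSync('prerendered.html', html);
  await browser.close();
})();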

Having one of the options above implemented makes sure crawlers see everything you want them to. Relying on JavaScript rendering will not give you the expected results.




Answer 3:


In one of my legacy projects I use Angular.js to insert dynamic content into a backend-rendered page. The Google crawler is smart enough to render the dynamic JavaScript content and index it (e.g. a table that is rendered entirely dynamically from Ajax data).

So I strongly doubt that it is related to server-side rendering issues.

I wouldn't suggest spending time on SSR as @AlexGvozden suggested - it's quite tedious, especially the webpack setup, and probably still is even with Next.js and Create React App.




Answer 4:


This appears to be a known issue with Googlebot's JS engine. I'm still trying to understand what exactly the problem is, but it seems that adding 'babel-polyfill' to your app solves it.

Medium post detailing a fix




Answer 5:


I had the same issue with blank pages in "Fetch as Google". The babel-polyfill advice above didn't solve it on its own, so I did more digging:

  1. I spent hours searching for a portable Google Chrome v41 (which is claimed to be the rendering engine of the Google Search bot) to see which error was halting the Google crawler. Just in case: https://rutracker.org/forum/viewtopic.php?t=4817317
  2. Chrome v41 refused to run on Windows 10, so I had to find a Windows 7 VM, and I finally discovered there were 2 APIs that babel-polyfill didn't cover: URLSearchParams and fetch() (see the quick console check after this list).
  3. I accidentally discovered that the exact same errors halted IE11 (part of Windows 10), so I could have saved a couple of hours by debugging the site in IE11 right away instead of searching for and troubleshooting Chrome v41.
  4. I found and added all the required polyfills, and the app then rendered under "Fetch as Google".
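As a quick sanity check (an illustrative snippet, not from the original answer), you can paste this into the console of IE11 or Chrome 41 to see which globals are missing before adding polyfills:

// Each of these should print "function"; "undefined" means a polyfill is needed
console.log(typeof fetch);           // whatwg-fetch if undefined
console.log(typeof URLSearchParams); // url-search-params-polyfill if undefined
console.log(typeof Promise);         // covered by babel-polyfill if undefined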

Long story short, here's the fix that worked for me:

  1. Install the 3 polyfills:
npm install --save babel-polyfill
npm install --save url-search-params-polyfill
npm install --save whatwg-fetch
  2. Import those 3 at the top of my entry point JS file (index.js):
import 'babel-polyfill';
import 'url-search-params-polyfill';
import 'whatwg-fetch';

import React from 'react';
import ReactDOM from 'react-dom';
...




Answer 6:


For Google to see your page as it is, you should implement server-side rendering. Looking at your code, you are using client-side rendering: the browser uses JavaScript to build the DOM.
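As a minimal sketch of what server-side rendering could look like (assuming an Express server and an App root component, both placeholders here; the file would need to be transpiled because of the JSX):

// server.js (illustrative sketch)
import express from 'express';
import React from 'react';
import { renderToString } from 'react-dom/server';
import App from './App'; // placeholder for the real root component

const app = express();

app.get('*', (req, res) => {
  // Render the React tree to an HTML string on the server,
  // so crawlers receive the markup without executing any JavaScript
  const markup = renderToString(<App />);
  res.send(`<!DOCTYPE html>
<html>
<head><title>MySite</title></head>
<body>
  <div id="root">${markup}</div>
  <script async src="index.browser.js"></script>
</body>
</html>`);
});

app.listen(3000);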




Answer 7:


I don't know if it's still an issue, but...

For each project there could be different reasons. First of all, I would recommend you run your project in dev mode (with console logging enabled) and test it with PhantomJS v2.1.1. The result can show you some useful errors.

Below is my PhantomJS sample (called website.js):

var system = require('system');
var page = require("webpage").create();
var homePage = "http://<link to your localhost>"; // page to test
var captureName = "result.png";                   // screenshot output file

// Forward the page's console output so application logs are visible
page.onConsoleMessage = function(msg) {
  system.stderr.writeLine('console: ' + msg);
};

page.onError = function(msg, trace) {
  var msgStack = ['PHANTOM ERROR: ' + msg];
  if (trace && trace.length) {
    msgStack.push('TRACE:');
    trace.forEach(function(t) {
      msgStack.push(' -> ' + (t.file || t.sourceURL) + ': ' + t.line + (t.function ? ' (in function ' + t.function +')' : ''));
    });
  }
  console.log(msgStack.join('\n'));
  phantom.exit(1);
};

// After the page loads, wait 5 seconds so client-side rendering can finish,
// then capture a screenshot of what PhantomJS actually "sees"
page.onLoadFinished = function(status) {
  var url = page.url;
  console.log("Status:  " + status);
  console.log("Loaded:  " + url);
  window.setTimeout(function () {
    page.render(captureName);
    phantom.exit();
  }, 5000);
};

page.open(homePage);

By the way, as a result you will get a result.png snapshot in the same directory where website.js is located (run the script with phantomjs website.js).




Answer 8:


Try adding browser shims. Note that it doesn't matter whether you use Babel to compile your code; you still need polyfills for older browsers and for headless browsers such as Googlebot or PhantomJS.

npm install --save es5-shim es6-shim

// in your frontend/index.js, as early as possible
import 'es5-shim';
import 'es6-shim';

You can read more here



Source: https://stackoverflow.com/questions/48902026/reactjs-fetch-as-google-displays-blank-page-only
