Question
For months, our IIS/ColdFusion server has been throwing 404 errors during Google crawler scans. Normally these are easy to track down, but in this case Google is trying to crawl our CFC files. The files do exist, but they are not exposed to the Internet, only to the ColdFusion server. Nevertheless, Google is seeing links to the CFCs somewhere on our site and is trying to follow them.
Below is a dump of our CGI structure during one of the 404s. baseCFC is a CF Mapping to D:\Domains\[domain]\cfc. All references to baseCFC in our source code are either in a <cfajaxproxy> tag or in a CreateObject() call in Application.CFC (examples below).
Perhaps this is an important clue: baseCFC refers to D:\Domains\[domain]\cfc, but Google is trying to reach D:\Domains\[domain]\www\baseCFC, which would sit under our site's home directory. Apparently Google sees baseCFC as a normal (unmapped) directory on the server and wants to scan it.
Here are examples of the two types of baseCFC references in our code:
<cfajaxproxy>:
<cfajaxproxy cfc="baseCFC.Misc" jsclassname="ajxMisc">
CreateObject():
<cfscript>
request.Misc = CreateObject( "component", "baseCFC.Misc" );
</cfscript>
How do we troubleshoot these CFC-related 404 errors? Thank you!
Answer 1:
The JavaScript generated by cfajaxproxy includes the location of the CFC. If you view the source of your page, you should be able to find a string such as '/baseCFC/Statement.cfc'. That is how Google is finding them.
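If checking every page by hand is tedious, the search can be scripted. The following is a minimal sketch in Python, not part of the original answer: it scans saved page source for quoted strings ending in .cfc. The function name find_cfc_paths and the sample markup are hypothetical and only for illustration; the exact JavaScript your pages emit may differ.

```python
import re

# Matches quoted string literals that end in ".cfc", e.g. '/baseCFC/Misc.cfc'.
CFC_PATH = re.compile(r"""["']([^"'\s]+\.cfc)["']""", re.IGNORECASE)


def find_cfc_paths(page_source):
    """Return unique .cfc paths found as quoted strings, in order of first appearance."""
    seen = []
    for match in CFC_PATH.finditer(page_source):
        path = match.group(1)
        if path not in seen:
            seen.append(path)
    return seen


# Hypothetical sample of generated page source, for illustration only.
sample = """
<script type="text/javascript">
var proxy = init('/baseCFC/Misc.cfc', 'ajxMisc');
</script>
"""

print(find_cfc_paths(sample))  # ['/baseCFC/Misc.cfc']
```

Running this against the "view source" output of a few pages should show which CFC URLs a crawler can discover.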
A quick way to get Google to ignore them is to modify your robots.txt file to exclude the baseCFC "directory":
User-Agent: *
Disallow: /baseCFC/
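Before deploying the rule, it can be sanity-checked locally. This is a small sketch using Python's standard urllib.robotparser module; the helper is_allowed, the "Googlebot" agent string, and the test paths are assumptions chosen for illustration, and real crawlers may interpret robots.txt slightly differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# The rules from the answer above, parsed from a string instead of a live URL.
rules = """\
User-Agent: *
Disallow: /baseCFC/
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())


def is_allowed(path, agent="Googlebot"):
    """Return True if the given agent may fetch the path under these rules."""
    return parser.can_fetch(agent, path)


print(is_allowed("/baseCFC/Misc.cfc"))  # False
print(is_allowed("/index.cfm"))         # True
```

Path matching in robots.txt is case-sensitive, so the Disallow line should use the same casing as the URLs that appear in the generated JavaScript.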
Source: https://stackoverflow.com/questions/12899796/404-error-google-attempting-to-index-coldfusion-cfc