Question
I am trying to block all external content loaded by TEmbeddedWB or TWebBrowser (or TCppWebBrowser). I want to block anything loaded from the Internet: images, JavaScript, external CSS, external [embed], [object], [applet], [frame] or [iframe] content, JavaScript that can itself load external content, and so on.
The problem has two parts:
- putting the web browser into a "restrict all" mode (basic HTML only, without images) and detecting whether external content exists
- if no external content is present, nothing more is needed; if it is, showing a "download bar" which, when clicked, switches the web browser into "download all" mode and fetches all content.
The first part has issues. In TEmbeddedWB you can block almost anything using the DownloadOptions switches, the most important being ForceOffline, but even with all of that turned off some things still get through, such as [object] or [iframe] tags. I know this because I implemented the OnBeforeNavigate2 event: it fires for the URLs contained in these tags, and the requests also show up in the log of my local server. Setting OfflineMode and ForceOfflineMode in TEmbeddedWB doesn't help for these items.
So how can I really block everything? The page needs to start as basic HTML with all external elements blocked, including scripts and CSS. Is there a way to trigger an event every time the browser wants to download anything, so that the download can be blocked, or to avoid such an event altogether by blocking all external downloads up front? Do I need to fiddle with Internet Explorer zones and security settings? Any pointer in the right direction would be helpful.
The second part is also tricky because I need to detect whether problematic tags are present (such as "applet", "script", "link" etc.). This detection doesn't need to be perfect, but it must be good enough to cover most such tags. I've done it like this:
//----------------------------------------------------------------------
// Check for external content (images, scripts, ActiveX, frames...)
//----------------------------------------------------------------------
try
{
bool HasExternalContent = false;
DelphiInterface<IHTMLDocument2> diDoc; // Smart pointer wrapper - should automatically call release() and do reference counting
diDoc = TEmbeddedWB->Document;
DelphiInterface<IHTMLElementCollection> diColApplets; DelphiInterface<IDispatch> diDispApplets; DelphiInterface<IHTMLObjectElement> diObj;
DelphiInterface<IHTMLElementCollection> diColEmbeds; DelphiInterface<IDispatch> diDispEmbeds;
DelphiInterface<IHTMLFramesCollection2> diColFrames; DelphiInterface<IDispatch> diDispFrames;
DelphiInterface<IHTMLElementCollection> diColImages; DelphiInterface<IDispatch> diDispImages; DelphiInterface<IHTMLImgElement> diImg;
DelphiInterface<IHTMLElementCollection> diColLinks; DelphiInterface<IDispatch> diDispLinks;
DelphiInterface<IHTMLElementCollection> diColPlugins; DelphiInterface<IDispatch> diDispPlugins;
DelphiInterface<IHTMLElementCollection> diColScripts; DelphiInterface<IDispatch> diDispScripts;
DelphiInterface<IHTMLStyleSheetsCollection> diColStyleSheets; DelphiInterface<IDispatch> diDispStyleSheets;
OleCheck(diDoc->Get_applets (diColApplets));
OleCheck(diDoc->Get_embeds (diColEmbeds));
OleCheck(diDoc->Get_frames (diColFrames));
OleCheck(diDoc->Get_images (diColImages));
OleCheck(diDoc->Get_links (diColLinks));
OleCheck(diDoc->Get_plugins (diColPlugins));
OleCheck(diDoc->Get_scripts (diColScripts));
OleCheck(diDoc->Get_styleSheets (diColStyleSheets));
// Scan for applets external links
for (int i = 0; i < diColApplets->length; i++)
{
OleCheck(diColApplets->item(i,i,diDispApplets));
if (diDispApplets != NULL)
{
diDispApplets->QueryInterface(IID_IHTMLObjectElement, (void**)&diObj);
if (diObj != NULL)
{
UnicodeString s1 = Sysutils::Trim(diObj->data),
s2 = Sysutils::Trim(diObj->codeBase),
s3 = Sysutils::Trim(diObj->classid);
if (StartsText("http", s1) || StartsText("http", s2) || StartsText("http", s3))
{
HasExternalContent = true;
break; // At least 1 found, bar will be shown, no further search needed
}
}
}
}
// Scan for images external links
for (int i = 0; !HasExternalContent && i < diColImages->length; i++) // Skip the scan if an earlier loop already found something
{
OleCheck(diColImages->item(i,i,diDispImages));
if (diDispImages != NULL) // Unnecessary? OleCheck throws exception if this applies?
{
diDispImages->QueryInterface(IID_IHTMLImgElement, (void**)&diImg);
if (diImg != NULL)
{
UnicodeString s1 = Sysutils::Trim(diImg->src);
// Case insensitive check
if (StartsText("http", s1))
{
HasExternalContent = true;
break; // At least 1 found, bar will be shown, no further search needed
}
}
}
}
}
catch (Exception &e)
{
// triggered by OleCheck
ShowMessage(e.Message);
}
Is there an easier way to scan for this, or is the only option to run several loops using other interface functions such as Get_applets, Get_embeds, Get_styleSheets etc., similar to the code above? So far I have found that I'd have to call the following functions to cover all of this:
OleCheck(diDoc->Get_applets (diColApplets));
OleCheck(diDoc->Get_embeds (diColEmbeds));
OleCheck(diDoc->Get_frames (diColFrames));
OleCheck(diDoc->Get_images (diColImages));
OleCheck(diDoc->Get_links (diColLinks));
OleCheck(diDoc->Get_plugins (diColPlugins));
OleCheck(diDoc->Get_scripts (diColScripts));
OleCheck(diDoc->Get_styleSheets (diColStyleSheets));
But I'd rather not implement that many loops if this can be handled more easily. Can it?
Answer 1:
I suggest this solution:
#include "html.h"
THTMLDocument doc;
void __fastcall TForm1::CppWebBrowser1DocumentComplete(TObject *Sender, LPDISPATCH pDisp,
Variant *URL)
{
doc.documentFromVariant(CppWebBrowser1->Document);
bool HasExternalContent = false;
for (int i=0; i<doc.images.length; i++) {
if(doc.images[i].src.SubString(1, 4) == "http")
{
HasExternalContent = true;
break;
}
}
for (int i=0; !HasExternalContent && i<doc.applets.length; i++) {
THTMLObjectElement obj = doc.applets[i];
// Any one remote attribute is enough to flag the document
if(obj.data.SubString(1, 4) == "http" ||
obj.codeBase.SubString(1, 4) == "http" ||
obj.classid.SubString(1, 4) == "http")
HasExternalContent = true;
}
}
These great wrapper classes can be downloaded from here.
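Note that SubString(1, 4) == "http" is case-sensitive and also matches relative paths that merely begin with "http". A minimal sketch of a stricter prefix test, written in standard C++ so it can be tried outside C++Builder: IsExternalUrl is a hypothetical helper name (it is not part of TEmbeddedWB or the wrapper classes), and the list of schemes it treats as external is an assumption you may want to adjust.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper, not part of TEmbeddedWB or the wrapper classes.
// Returns true if `url` looks like a reference to a remote resource.
bool IsExternalUrl(const std::string& url)
{
    // Skip leading whitespace, similar to the Trim() call in the question.
    std::size_t start = 0;
    while (start < url.size() &&
           std::isspace(static_cast<unsigned char>(url[start])))
        ++start;

    // Lower-case a short prefix so the comparison is case-insensitive.
    std::string head = url.substr(start, 8);
    std::transform(head.begin(), head.end(), head.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    // Assumed set of "external" prefixes; protocol-relative URLs
    // ("//host/path") are included because they also hit the network.
    return head.compare(0, 7, "http://") == 0
        || head.compare(0, 8, "https://") == 0
        || head.compare(0, 6, "ftp://") == 0
        || head.compare(0, 2, "//") == 0;
}
```

Each attribute read from the DOM (src, data, codeBase and so on) would be converted to a std::string and passed to this function in place of the SubString comparison.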
Source: https://stackoverflow.com/questions/10637550/detecting-external-content-with-tembeddedwb-or-twebbrowser