Question
I'm trying to capture an image of a web page, but some pages come out as a blank white page.
In the Registry Editor, I browsed to HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_BROWSER_EMULATION\ and added these values:
WindowsFormsApp1.exe with decimal value 11000
WindowsFormsApp1.vshost.exe with decimal value 11000
Here is my code:
using System;
using System.Collections.Generic;
using System.Windows.Forms;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;
namespace WindowsFormsApp1
{
public partial class Form1 : Form
{
Dictionary<Uri, Bitmap> browserShots = new Dictionary<Uri, Bitmap>();
WebBrowser browser = new WebBrowser();
public Form1()
{
InitializeComponent();
browser.DocumentCompleted += new WebBrowserDocumentCompletedEventHandler(browser_DocumentCompleted);
}
//=========================================MADE BY JIMY====================================
private void browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
var browser = sender as WebBrowser;
if (browser.ReadyState != WebBrowserReadyState.Complete) return;
var bitmap = WebBrowserExtender.DrawContent(browser);
if (bitmap != null)
{
if (!browserShots.ContainsKey(browser.Url))
browserShots.Add(browser.Url, bitmap);
else
{
browserShots[browser.Url]?.Dispose();
browserShots[browser.Url] = bitmap;
}
// Show the Bitmap in a PictureBox control, eventually
pictureBox1.Image = browserShots[browser.Url];
}
}
public class WebBrowserExtender
{
public static Bitmap DrawContent(WebBrowser browser)
{
if (browser.Document == null) return null;
Size docSize = Size.Empty;
Graphics g = null;
var hDc = IntPtr.Zero;
try
{
docSize.Height = (int)((dynamic)browser.Document.DomDocument).documentElement.scrollHeight;
docSize.Width = (int)((dynamic)browser.Document.DomDocument).documentElement.scrollWidth;
docSize.Height = Math.Max(Math.Min(docSize.Height, 32750), 1);
docSize.Width = Math.Max(Math.Min(docSize.Width, 32750), 1);
var previousSize = browser.ClientSize;
browser.ClientSize = new Size(docSize.Width, docSize.Height);
var bitmap = new Bitmap(docSize.Width, docSize.Height, PixelFormat.Format32bppArgb);
g = Graphics.FromImage(bitmap);
var rect = new RECT(0, 0, bitmap.Width, bitmap.Height);
hDc = g.GetHdc();
var view = browser.ActiveXInstance as IViewObject;
view.Draw(1, -1, IntPtr.Zero, IntPtr.Zero, IntPtr.Zero, hDc, ref rect, IntPtr.Zero, IntPtr.Zero, 0);
browser.ClientSize = previousSize;
return bitmap;
}
catch
{
// This catch block is like this on purpose: nothing to do here
return null;
}
finally
{
if (hDc != IntPtr.Zero) g?.ReleaseHdc(hDc);
g?.Dispose();
}
}
[ComImport]
[Guid("0000010D-0000-0000-C000-000000000046")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
interface IViewObject
{
void Draw(uint dwAspect, int lindex, IntPtr pvAspect, [In] IntPtr ptd,
IntPtr hdcTargetDev, IntPtr hdcDraw, ref RECT lprcBounds,
[In] IntPtr lprcWBounds, IntPtr pfnContinue, uint dwContinue);
}
[StructLayout(LayoutKind.Sequential, Pack = 4)]
struct RECT
{
public int Left;
public int Top;
public int Right;
public int Bottom;
public RECT(int left, int top, int width, int height)
{
Left = left; Top = top; Right = width; Bottom = height;
}
}
}
//=========================================MADE BY JIMY====================================}
private void button1_Click(object sender, EventArgs e)
{
browser.Navigate(textBox1.Text, null, null, "User-Agent: User agent");
}
}
}
Answer 1:
To print the Html content of a WebBrowser Control, a few points need to be considered:
- We need to use the WebBrowser's DocumentCompleted event to determine when the current Document is loaded and rendered. A single Document may (will) contain more than one sub-Document, usually contained inside Frames/IFrames. Each IFrame contains its own Document: when a Document contained in an IFrame is loaded, the DocumentCompleted event is raised. This means that the event can and will be raised multiple times when the WebBrowser navigates to a URL. The notes here explain more: How to get an HtmlElement value inside Frames/IFrames?
- The managed properties of the WebBrowser don't always reflect the DOM's real values. For example, the actual dimensions of the Html Document, when the rendering is completed, are not reflected anywhere, so we need to get those measures from the DOM ourselves. The current DOM rendered dimensions are referenced by:
[WebBrowser].Document.DomDocument.documentElement.scrollHeight;
[WebBrowser].Document.DomDocument.documentElement.scrollWidth;
See: Measuring Element Dimension and Location with CSSOM in Windows Internet Explorer
- The WebBrowser Control's DrawToBitmap() method is inherited from Control, but it's not actually implemented as we could expect. The same applies to other Controls: the RichTextBox is known to print blank content when this method is used.
- An Html Document may be larger than the maximum Size supported by a Bitmap. There is also a more subtle memory limit: the Bitmap object needs to store its content in a contiguous memory space, so the Size limit of a Bitmap is hard to determine in advance and may cause exceptions when we don't expect them.
- The WebBrowser control's Emulation Feature must be set to Internet Explorer 11. See:
How can I get the WebBrowser control to show modern contents?
Web browser control emulation issue (FEATURE_BROWSER_EMULATION)
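The emulation value can also be written from code instead of through the Registry Editor. The following is only a minimal sketch: the class and method names (BrowserEmulation, SetIE11Emulation) are hypothetical, the value 11000 is the one used in the question, and it assumes the value is written before the first WebBrowser instance is created.

```csharp
using System;
using Microsoft.Win32;

public static class BrowserEmulation
{
    // Registry path quoted in the question, relative to HKEY_CURRENT_USER.
    private const string FeatureKey =
        @"Software\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_BROWSER_EMULATION";

    // Hypothetical helper: stores the IE11 emulation value (11000) for the
    // given executable name, e.g. "WindowsFormsApp1.exe".
    public static void SetIE11Emulation(string exeName)
    {
        if (string.IsNullOrEmpty(exeName))
            throw new ArgumentException("Executable name is required.", nameof(exeName));

        // CreateSubKey opens the key for writing, creating it if it is missing.
        using (RegistryKey key = Registry.CurrentUser.CreateSubKey(FeatureKey))
        {
            key.SetValue(exeName, 11000, RegistryValueKind.DWord);
        }
    }
}
```

A possible call site is the start of Main(), before Application.Run(): BrowserEmulation.SetIE11Emulation("WindowsFormsApp1.exe");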
To proceed, first subscribe to the DocumentCompleted event of the WebBrowser Control.
A Dictionary<Uri, Bitmap> is used here to store the Bitmap representing the Html content of URLs visited in a session.
When the DocumentCompleted event is raised, we add a new element to the Dictionary when the current URL has never been visited before. If the Uri is already stored, we update the related Bitmap object, so only the most recent snapshot of an Html Document is present in the collection.
I'm using a support class to handle the Bitmap creation and to declare the native COM Interface used to generate the Bitmap from the current ISurfacePresenter.
Since the WebBrowser control is forced to use VIEW_OBJECT_COMPOSITION_MODE_LEGACY as the CompositionMode for all sites, the internal GetPrintBitmap method calls the IViewObject Interface Draw() method in this situation, so do we.
To print the content (all the content) of the current Html Document, call the DrawContent(WebBrowser browser) static method of the WebBrowserExtender class:
Dictionary<Uri, Bitmap> browserShots = new Dictionary<Uri, Bitmap>();
private void browser_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
{
var browser = sender as WebBrowser;
if (browser.ReadyState != WebBrowserReadyState.Complete) return;
var bitmap = WebBrowserExtender.DrawContent(browser);
if (bitmap != null) {
if (!browserShots.ContainsKey(browser.Url)) {
browserShots.Add(browser.Url, bitmap);
}
else {
browserShots[browser.Url]?.Dispose();
browserShots[browser.Url] = bitmap;
}
// Show the Bitmap in a PictureBox control, eventually
[PictureBox].Image = browserShots[browser.Url];
}
}
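Because DocumentCompleted is also raised for Documents loaded inside Frames/IFrames, the handler above may run several times per navigation. The ReadyState check is the filter this answer relies on. As an optional extra guard, one commonly used idiom compares the URL reported by the event with the URL of the control itself. The helper below is a hypothetical sketch of that comparison, not part of the original code, and it may not hold for every redirect scenario.

```csharp
using System;

public static class DocumentEvents
{
    // Hypothetical helper: returns true when the URL reported by the
    // DocumentCompleted event (e.Url) is the same as the URL of the
    // WebBrowser control (browser.Url), which suggests the event belongs
    // to the top-level Document and not to a Frame/IFrame.
    public static bool IsTopLevelDocument(Uri eventUrl, Uri browserUrl)
    {
        if (eventUrl == null || browserUrl == null) return false;

        // Uri.Equals compares the canonical form and ignores the fragment.
        return eventUrl.Equals(browserUrl);
    }
}
```

Inside the handler this could be used as: if (!DocumentEvents.IsTopLevelDocument(e.Url, browser.Url)) return;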
The WebBrowserExtender support class:
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;
using System.Windows.Forms;
public class WebBrowserExtender
{
public static Bitmap DrawContent(WebBrowser browser)
{
if (browser.Document == null) return null;
Size docSize = Size.Empty;
Graphics g = null;
var hDc = IntPtr.Zero;
try {
docSize.Height = (int)((dynamic)browser.Document.DomDocument).documentElement.scrollHeight;
docSize.Width = (int)((dynamic)browser.Document.DomDocument).documentElement.scrollWidth;
var screenWidth = Screen.FromHandle(browser.Handle).Bounds.Width;
docSize.Width = Math.Max(Math.Min(docSize.Width, screenWidth), 1);
docSize.Height = Math.Max(Math.Min(docSize.Height, 32750), 1);
var previousSize = browser.ClientSize;
browser.ClientSize = new Size(docSize.Width, docSize.Height);
var bitmap = new Bitmap(docSize.Width, docSize.Height, PixelFormat.Format32bppArgb);
g = Graphics.FromImage(bitmap);
var rect = new RECT(0, 0, bitmap.Width, bitmap.Height);
hDc = g.GetHdc();
var view = browser.ActiveXInstance as IViewObject;
view.Draw(1, -1, IntPtr.Zero, IntPtr.Zero, IntPtr.Zero, hDc, ref rect, IntPtr.Zero, IntPtr.Zero, 0);
browser.ClientSize = previousSize;
return bitmap;
}
catch {
// This catch block is like this on purpose: nothing to do here
return null;
}
finally {
if (hDc != IntPtr.Zero) g?.ReleaseHdc(hDc);
g?.Dispose();
}
}
[ComImport]
[Guid("0000010D-0000-0000-C000-000000000046")]
[InterfaceType(ComInterfaceType.InterfaceIsIUnknown)]
interface IViewObject
{
void Draw(uint dwAspect, int lindex, IntPtr pvAspect, [In] IntPtr ptd,
IntPtr hdcTargetDev, IntPtr hdcDraw, ref RECT lprcBounds,
[In] IntPtr lprcWBounds, IntPtr pfnContinue, uint dwContinue);
}
[StructLayout(LayoutKind.Sequential, Pack = 4)]
struct RECT
{
public int Left;
public int Top;
public int Right;
public int Bottom;
public RECT(int left, int top, int width, int height)
{
Left = left; Top = top; Right = width; Bottom = height;
}
}
}
This is how it works:
The full Document is captured. Of course, the Bitmap can also be limited to a specific maximum/minimum size, to capture just a section of the Html Document.
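Limiting the captured area comes down to clamping the measured Document size before the Bitmap is created, which is what the Math.Min/Math.Max lines in DrawContent already do. The sketch below isolates that step; the class name CaptureSize and the sizes in the usage line are hypothetical.

```csharp
using System;
using System.Drawing;

public static class CaptureSize
{
    // Hypothetical helper: restricts the measured Document size to maxSize,
    // and never returns a dimension smaller than 1 pixel, since a Bitmap
    // cannot be created with a zero or negative width or height.
    public static Size Clamp(Size documentSize, Size maxSize)
    {
        int width = Math.Max(Math.Min(documentSize.Width, maxSize.Width), 1);
        int height = Math.Max(Math.Min(documentSize.Height, maxSize.Height), 1);
        return new Size(width, height);
    }
}
```

For example, docSize = CaptureSize.Clamp(docSize, new Size(1280, 2000)); would capture at most the top 1280x2000 pixels of the Document.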
Sample WinForms Project on Google Drive.
Answer 2:
Try setting the User-Agent header like this:
browser.Navigate(url, null, null, "User-Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:74.0) Gecko/20100101 Firefox/74.0");
Source: https://stackoverflow.com/questions/60722714/webbrowser-html-document-to-image