InternetExplorer.Application COM object and Windows Server 2012 in PowerShell

喜欢而已 submitted on 2020-01-05 12:54:10

Question


I am trying to access the document of an Internet Explorer COM object on Windows Server 2012. The code works great on Windows Server 2008, but as soon as I run it on Windows Server 2012 (fresh install, tried on more than one server), the same code stops working: $ie.document.documentHtml returns null.

Below is the code:

$ie = new-object -com "InternetExplorer.Application"
$ie.navigate2("http://www.example.com/") 
while($ie.busy) {start-sleep 1}
$ie.document.documentHtml.innerhtml

Has the Internet Explorer COM object changed in Windows Server 2012? If so, how do I retrieve the document contents there?

Thanks in advance

Edit: Added a bounty to sweeten things up. Invoke-WebRequest is nice, but it works only on Windows Server 2012, and I need to use Internet Explorer and have the script work on both Windows Server 2008 and Windows Server 2012. I have read somewhere that installing Microsoft Office solves the issue, but that is not an option either.

Edit 2: As I need to invoke the script remotely on multiple Windows servers (both 2008 and 2012), I would prefer not to copy files manually.


Answer 1:


It's a known bug: http://connect.microsoft.com/PowerShell/feedback/details/764756/powershell-v3-internetexplorer-application-issue

An extract from the workaround:

So, here's a workaround:

  1. Copy Microsoft.mshtml.dll from a location where it is installed (e.g. from C:\Program Files (x86)\Microsoft.NET\Primary Interop Assemblies) to your script's location (can be a network drive).
  2. Use the Load-Assembly.ps1 script (code provided below and at: http://sdrv.ms/U6j7Wn) to load the assembly types in memory, e.g.: .\Load-Assembly.ps1 -Path .\microsoft.mshtml.dll

Then proceed as usual to create the IE object etc. Warning: when dealing with the write() and writeln() methods use the backward compatible methods: IHTMLDocument2_write() and IHTMLDocument2_writeln().




Answer 2:


    $ie.document.documentHtml.innerhtml

The bigger question is how this could ever have worked. The Document property returns a reference to the IHTMLDocument interface, which does not have a "documentHtml" property. It is never that clear what you might get back when you use late binding, as was done in this code. There is an old documentHtml property supported by the DHTML Editing control, which has been firmly put out to pasture. Admittedly, that is rather a wild guess.

Anyhoo, correct syntax is to use, say, the body property:

  $ie = new-object -com "InternetExplorer.Application"
  $ie.navigate2("http://www.example.com/") 
  while($ie.busy) {start-sleep 1}
  $txt = $ie.document.body.innerhtml
  Write-Output $txt
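
The wait loop above polls only the Busy property, and Busy can briefly read false before the document has finished loading. As a rough sketch (not part of the original answer), the loop could also check ReadyState and give up after a timeout. The function name Wait-IEReady and its parameters are hypothetical; the value 4 corresponds to READYSTATE_COMPLETE.

```powershell
# Hypothetical helper: waits until the browser object reports that it is
# idle and fully loaded, or gives up once the timeout has elapsed.
function Wait-IEReady {
    param(
        $Browser,                      # the InternetExplorer.Application object
        [int]$TimeoutSeconds = 30,     # assumed default, adjust as needed
        [int]$PollMilliseconds = 200   # assumed default, adjust as needed
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    # ReadyState 4 is READYSTATE_COMPLETE
    while ($Browser.Busy -or $Browser.ReadyState -ne 4) {
        if ((Get-Date) -gt $deadline) { return $false }
        Start-Sleep -Milliseconds $PollMilliseconds
    }
    return $true
}
```

After calling navigate2, something like `if (Wait-IEReady $ie) { $ie.document.body.innerhtml }` would then read the body only once the page reports completion, and a false return value at least tells you the page never finished loading rather than silently yielding null.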

If you still have problems (PowerShell does treat null references rather undiagnosably), try running this C# code on the machine. It ought to give you a better message:

using System;

class Program {
    static void Main(string[] args) {
        try {
            var comType = Type.GetTypeFromProgID("InternetExplorer.Application");
            dynamic browser = Activator.CreateInstance(comType);
            browser.Navigate2("http://example.com");
            while (browser.Busy) System.Threading.Thread.Sleep(1);
            dynamic doc = browser.Document;
            Console.WriteLine(doc.Body.InnerHtml);
        }
        catch (Exception ex) {
            Console.WriteLine(ex.ToString());
        }
        Console.ReadLine();
    }
}



Answer 3:


As far as I can tell, on Windows Server 2012 this gets the full HTML of a page:

$ie.document.documentElement.outerhtml

There is also an innerhtml property on the documentElement, which strips off the root <html> element.

Of course, if all you want to do is get the raw markup, consider using Invoke-WebRequest:

$doc = Invoke-WebRequest 'http://www.example.com'
$doc.Content
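
Invoke-WebRequest was introduced in PowerShell 3.0, so it is not available on older installations. If the raw markup is genuinely all that is needed and Internet Explorer's rendering is not, a minimal sketch using the .NET System.Net.WebClient class, which is also usable from PowerShell 2.0, might look like this. The function name Get-RawHtml is hypothetical, and note that this returns the server's response as-is, not the DOM after scripts have run.

```powershell
# Hypothetical helper: downloads the body at a URL as a string without
# involving Internet Explorer or Invoke-WebRequest.
function Get-RawHtml {
    param([string]$Url)
    $client = New-Object System.Net.WebClient
    try {
        # DownloadString fetches the resource and decodes it to text
        return $client.DownloadString($Url)
    }
    finally {
        # Release the underlying network resources even if the call fails
        $client.Dispose()
    }
}
```

It could be called as `Get-RawHtml 'http://www.example.com'`.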


Source: https://stackoverflow.com/questions/21197141/internetexplorer-application-com-object-and-windows-2012-in-powershell
