Question
I have the following HTML table (link to the HTML).
I want to parse it and convert it to XML, CSV, or a PowerShell object. I tried to do this with HtmlAgilityPack.dll, but without success. Can anybody point me in the right direction?
I want to convert the table to a PSObject and export it to CSV. So far I only have the beginning of the code: I can reach the rows, but I can't get at the values inside them.
Add-Type -Path C:\Windows\system32\HtmlAgilityPack.dll
$HTML = New-Object HtmlAgilityPack.HtmlDocument
$res = $HTML.Load("C:\Test\Test.html")
$table = $HTML.DocumentNode.SelectNodes("//table/tr/td/nobr")
When I access $table[0..47].InnerHtml I get only the first column of the file; I can't reach the second column and so on.
Thanks, Ohad
Answer 1:
You can try this to get all the HTML inside the <nobr> tags. I'll let you work out the logic to output what you want.
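$ie = New-Object -ComObject "InternetExplorer.Application"
$ie.Navigate("http://urltoyourfile.html")
# Wait until the page has finished loading before reading the document
while ($ie.Busy -or $ie.ReadyState -ne 4) { Start-Sleep -Milliseconds 100 }
$doc = $ie.Document
# Print the inner HTML of every <nobr> element
$doc.getElementsByTagName("nobr") | ForEach-Object { $_.innerHTML }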
Output:
Lead User
Accesses
Last Accessed
Average
Max
Min
Total
amirt</NO br>
2
01/20/2013 09:40:47
04:18:17
06:19:26
02:17:09
08:36:35
andream
1
01/20/2013 10:33:01
02:34:37
02:34:37
02:34:37
02:34:37
avnerm
1
01/17/2013 11:34:16
00:30:44
00:30:44
00:30:44
00:30:44
brouria
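A way to parse it:
$cpt = 0
$doc.getElementsByTagName("nobr") | ForEach-Object {
    # Print each cell followed by a semicolon, without a line break
    Write-Host -NoNewline "$($_.innerHTML);"
    $cpt++
    # The output above shows 7 columns, so end the line after every 7th cell
    if ($cpt % 7 -eq 0) { Write-Host "" }
}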
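Since the question asks for PSObjects and a CSV file, here is a minimal sketch of one way to get there from the flat list of cell values. It assumes seven columns per row, as the header row in the output above suggests, and that the first row holds the column names. The sample values are taken from the output above (with the stray markup after "amirt" left out), and the file name table.csv is only a placeholder.

```powershell
# Flat list of cell texts, in the order the <nobr> elements appear.
# Sample values taken from the output shown above.
$cells = @(
    'Lead User', 'Accesses', 'Last Accessed', 'Average', 'Max', 'Min', 'Total',
    'amirt',   '2', '01/20/2013 09:40:47', '04:18:17', '06:19:26', '02:17:09', '08:36:35',
    'andream', '1', '01/20/2013 10:33:01', '02:34:37', '02:34:37', '02:34:37', '02:34:37',
    'avnerm',  '1', '01/17/2013 11:34:16', '00:30:44', '00:30:44', '00:30:44', '00:30:44'
)

$columnCount = 7                              # assumption: seven columns per row
$headers = $cells[0..($columnCount - 1)]      # first row holds the column names

# Walk the remaining cells in steps of $columnCount and build one object per row
$rows = for ($i = $columnCount; $i -lt $cells.Count; $i += $columnCount) {
    $props = [ordered]@{}
    for ($c = 0; $c -lt $columnCount; $c++) {
        $props[$headers[$c]] = $cells[$i + $c]
    }
    [pscustomobject]$props
}

# Placeholder output path; adjust as needed
$rows | Export-Csv -Path .\table.csv -NoTypeInformation
```

With a live page, $cells could be filled from the document instead, for example with $doc.getElementsByTagName("nobr") | ForEach-Object { $_.innerText }, provided every cell of the table is wrapped in a <nobr> element.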
Source: https://stackoverflow.com/questions/14496951/parse-html-table-in-powershell-v3