I'm using the following PowerShell script to open a few thousand HTML files and "save as..." Word documents.
param([string]$htmpath,[string]$docpath = $d
I was having problems just converting the filename from .html to .docx. I took your code above and changed it to this:
function Convert-HTMLtoDocx {
    param([string]$htmpath)
    # Collect every .htm/.html file in the source folder
    $srcfiles = Get-ChildItem $htmpath -Filter "*.htm*"
    $saveFormat = [Microsoft.Office.Interop.Word.WdSaveFormat]::wdFormatXMLDocument
    $word = New-Object -ComObject Word.Application
    $word.Visible = $false
    ForEach ($doc in $srcfiles) {
        Write-Host "Processing :" $doc.FullName
        # Same folder and base name, with a .docx extension
        $name = Join-Path -Path $doc.DirectoryName -ChildPath ($doc.BaseName + ".docx")
        $opendoc = $word.Documents.Open($doc.FullName)
        # Pass the path string itself; a string has no .Value property
        $opendoc.SaveAs([ref]$name, [ref]$saveFormat)
        $opendoc.Close()
        $doc = $null
    } #End ForEach
    $word.Quit()
} #End Function
The problem was the save format. For whatever reason, to save a document as a .docx you need to specify the format as wdFormatXMLDocument, not wdFormatDocument.
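As a rough illustration of the two pieces that mattered here, the sketch below derives the .docx path from the source file name and lists the numeric values behind the two save formats. The helper name Get-DocxPath and the sample path are made up for this example and are not part of the script above. The values 0 and 12 are the WdSaveFormat constants for wdFormatDocument and wdFormatXMLDocument, and they can be passed to SaveAs as plain integers if the interop enum type is not loaded in your session.

```powershell
# Hypothetical helper (not in the original script): replace only the final
# extension of an .htm/.html path with .docx, leaving the rest of the name alone.
function Get-DocxPath {
    param([string]$Path)
    [System.IO.Path]::ChangeExtension($Path, ".docx")
}

# WdSaveFormat values as plain integers
$wdFormatDocument    = 0    # legacy binary .doc format
$wdFormatXMLDocument = 12   # Open XML .docx format

# Sample path, assumed for illustration
Get-DocxPath "C:\docs\report.final.html"   # C:\docs\report.final.docx
```

Changing only the last extension avoids accidentally rewriting "html" where it appears elsewhere in a folder or file name.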