I am trying to figure out how to write a script that goes through a folder and grabs all word documents in the folder to search for a hyperlink and change the link to anothe
Hadn't done this before, so it was nice to figure it out. We both get to learn today! You were very close. Just needed a few adjustments and a loop for handling multiple files. I'm sure someone more knowledgeable will drop in but this should get you the desired result.
# Replacement domain names, kept in variables so they are easy to change later
$NewDomain1 = "google"
$NewDomain2 = "hij"
# "*.doc*" matches .doc and .docx (and other .doc* extensions such as .docm)
$OurDocuments = Get-ChildItem -Path "C:\Apps\testing" -Filter "*.doc*" -Recurse
# Start Word through COM and keep its window hidden
$Word = New-Object -ComObject Word.Application
$Word.Visible = $false
$OurDocuments | ForEach-Object {
    $Document = $Word.Documents.Open($_.FullName)
    "Processing file: {0}" -f $Document.FullName
    $Document.Hyperlinks | ForEach-Object {
        if ($_.Address -like "https://www.yahoo.com/*") {
            $NewAddress = $_.Address -replace "yahoo",$NewDomain1
            "Updating {0} to {1}" -f $_.Address,$NewAddress
            # Set the link target and the visible link text together
            $_.Address = $_.TextToDisplay = $NewAddress
        } elseif ($_.Address -like "http://def.com/*") {
            $NewAddress = $_.Address -replace "def",$NewDomain2
            "Updating {0} to {1}" -f $_.Address,$NewAddress
            $_.Address = $_.TextToDisplay = $NewAddress
        }
    }
    "Saving changes to {0}" -f $Document.FullName
    $Document.Save()
    # -replace takes a regex, so escape the extension and anchor it to the end of the path
    $Pdf = $Document.FullName -replace ([regex]::Escape($_.Extension) + '$'), '.pdf'
    "Saving document {0} as PDF {1}" -f $Document.FullName,$Pdf
    # 17 is Word's wdExportFormatPDF constant
    $Document.ExportAsFixedFormat($Pdf,17)
    "Completed processing {0} `r`n" -f $Document.FullName
    $Document.Close()
}
$Word.Quit()
Let's walk through it...
We'll first move your new domains into a couple of variables for ease of referencing and changing in the future. You can also add the addresses that you're looking for here, replacing the hard-coded strings as needed. The third line uses a filter to grab all .DOC and .DOCX files in the directory, which we'll use to iterate over. Personally, I would be careful using the -Recurse switch, as you run the risk of making unintended changes to a file deeper in the directory structure.
$NewDomain1 = "google"
$NewDomain2 = "hij"
$OurDocuments = Get-ChildItem -Path "C:\Apps\testing" -Filter "*.doc*" -Recurse
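If you want to act on that caution about -Recurse, one option is to list what would be touched before Word opens anything. This is only a sketch: Get-WordDocument is a hypothetical helper name, and the exact-extension check is my assumption about which files you want (the "*.doc*" filter on its own also picks up extensions such as .docm).

```powershell
# Hypothetical helper: returns only .doc/.docx files from a folder.
function Get-WordDocument {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [switch] $Recurse   # off by default, so subfolders are opt-in
    )
    Get-ChildItem -Path $Path -Filter "*.doc*" -File -Recurse:$Recurse |
        Where-Object { $_.Extension -in '.doc', '.docx' }
}

# Dry run: print the files that would be processed, without opening Word.
# Get-WordDocument -Path "C:\Apps\testing" | Select-Object -ExpandProperty FullName
```

Once the list looks right, the result can be assigned to $OurDocuments in place of the plain Get-ChildItem call.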
Instantiate our Word COM object and keep it hidden from view.
$Word = New-Object -ComObject word.application
$Word.Visible = $false
Stepping into our ForEach-Object loop...
For each document that we gathered in $OurDocuments, we open it and pipe any hyperlinks into another ForEach-Object, where we check the value of the Address property. If there's a match that we want, we update the property with the new value. You'll notice that we're also updating the TextToDisplay property. This is the text that you see in the document, as opposed to Address, which controls where the hyperlink actually goes.
This... $_.Address = $_.TextToDisplay = $NewAddress
...is an example of chained assignment, where one value is assigned to several targets in a single statement. Since Address and TextToDisplay will be set to the same value, we'll assign them at the same time.
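To see that assignment in isolation, here's a small sketch that uses a plain object as a stand-in for a Word hyperlink. The stand-in and its values are made up for illustration, so it runs without Word.

```powershell
# Stand-in for a Word Hyperlink object; only the two properties we care about.
$Link = [pscustomobject]@{
    Address       = "https://www.yahoo.com/news"
    TextToDisplay = "https://www.yahoo.com/news"
}
$NewAddress = $Link.Address -replace "yahoo","google"

# The right-most assignment runs first and its value is handed on to the left,
# so both properties end up holding $NewAddress.
$Link.Address = $Link.TextToDisplay = $NewAddress

$Link
```

One thing to keep in mind: if some links in your documents use custom text (something like "click here") rather than the URL itself, this overwrites that text too. In that case you may prefer to set only Address.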
$Document = $Word.Documents.Open($_.FullName)
"Processing file: {0}" -f $Document.FullName
$Document.Hyperlinks | ForEach-Object {
    if ($_.Address -like "https://www.yahoo.com/*") {
        $NewAddress = $_.Address -replace "yahoo",$NewDomain1
        "Updating {0} to {1}" -f $_.Address,$NewAddress
        $_.Address = $_.TextToDisplay = $NewAddress
    } elseif ($_.Address -like "http://def.com/*") {
        $NewAddress = $_.Address -replace "def",$NewDomain2
        "Updating {0} to {1}" -f $_.Address,$NewAddress
        $_.Address = $_.TextToDisplay = $NewAddress
    }
}
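If the list of addresses grows, the if/elseif chain can get long. A possible alternative, sketched below, keeps the rules in a small table and moves the matching into a function that can be tried without Word. Get-UpdatedAddress, $Rules, and the key names Like, Find, and ReplaceWith are all hypothetical names of my own; the values simply mirror the ones used above.

```powershell
# Hypothetical rule table: wildcard pattern for the old address, then the
# text to find and its replacement.
$Rules = @(
    @{ Like = "https://www.yahoo.com/*"; Find = "yahoo"; ReplaceWith = "google" }
    @{ Like = "http://def.com/*";        Find = "def";   ReplaceWith = "hij" }
)

# Hypothetical helper: returns the rewritten address, or $null when no rule matches.
function Get-UpdatedAddress {
    param(
        [string] $Address,
        [hashtable[]] $Rules
    )
    foreach ($Rule in $Rules) {
        if ($Address -like $Rule.Like) {
            # Escape the search text because -replace treats it as a regex
            return $Address -replace [regex]::Escape($Rule.Find), $Rule.ReplaceWith
        }
    }
    return $null
}

# Inside the hyperlink loop this could be used as:
# $NewAddress = Get-UpdatedAddress -Address $_.Address -Rules $Rules
# if ($NewAddress) { $_.Address = $_.TextToDisplay = $NewAddress }
```

The first matching rule wins, which is the same behavior as the if/elseif version.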
Save any changes made...
"Saving changes to {0}" -f $Document.Fullname
$Document.Save()
Here we create the new filename for when we save as a PDF. Notice $_.Extension in our first line. We switch to using the pipeline object for referencing the file extension since the current pipeline object is still the file info object from our Get-ChildItem. Since the $Document object doesn't have an extension property, you'd have to do some slicing of the file name to achieve the same result. Because -replace treats its pattern as a regular expression, we escape the extension and anchor it with $ so that only the end of the path is changed. The 17 passed to ExportAsFixedFormat is Word's constant for PDF output (wdExportFormatPDF).
$Pdf = $Document.FullName -replace ([regex]::Escape($_.Extension) + '$'), '.pdf'
"Saving document {0} as PDF {1}" -f $Document.Fullname,$Pdf
$Document.ExportAsFixedFormat($Pdf,17)
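Since -replace works with regular expressions, it's worth seeing what happens when an extension goes in unescaped. The sketch below compares three ways of building the PDF path; the path itself is made up so that the folder name also contains ".docx".

```powershell
# A made-up path where the folder name also contains ".docx".
$FullName = 'C:\Apps\v1.docx.old\report.docx'

# Unescaped: the "." is a regex wildcard and every match is replaced,
# so the folder name gets rewritten as well.
$Naive = $FullName -replace '.docx', '.pdf'

# Escaped and anchored with "$": only the extension at the end of the path changes.
$Anchored = $FullName -replace ([regex]::Escape('.docx') + '$'), '.pdf'

# .NET alternative that avoids regex altogether.
$Changed = [System.IO.Path]::ChangeExtension($FullName, '.pdf')

$Naive, $Anchored, $Changed
```

In the loop, [System.IO.Path]::ChangeExtension($_.FullName, '.pdf') would be a drop-in way to get the same PDF path without thinking about regex at all.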
Close the document up and the loop will move to the next file in $OurDocuments.
"Completed processing {0} `r`n" -f $Document.Fullname
$Document.Close()
Once we run through all documents, we close Word.
$Word.Quit()
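One gap in the script as written: if something throws partway through (a locked or corrupt file, for example), $Word.Quit() is never reached and the hidden Word process can be left running. A possible way around that is try/finally. The wrapper below is only a sketch; Invoke-WithWord is a hypothetical name, and the optional -Word parameter is there just so the pattern can be tried with a stand-in object.

```powershell
# Hypothetical wrapper: runs $Action with a Word instance and always calls Quit().
function Invoke-WithWord {
    param(
        [Parameter(Mandatory)] [scriptblock] $Action,
        # Defaults to a real Word instance; a stand-in object can be supplied for testing.
        $Word = (New-Object -ComObject Word.Application)
    )
    try {
        $Word.Visible = $false
        & $Action $Word
    }
    finally {
        # Runs even when $Action throws, so Word is still asked to quit.
        $Word.Quit()
    }
}

# Usage idea: put the document loop inside the script block.
# Invoke-WithWord -Action {
#     param($Word)
#     $OurDocuments | ForEach-Object { $Document = $Word.Documents.Open($_.FullName) }
# }
```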
I hope that all makes sense!