I have the script below, which modifies href
attributes in an HTML file (in the future, it will be a list of HTML files in a directory). Using BeautifulSoup I m
newlink = link['href']
# .. make replacements
link['href'] = newlink # store it back
Now print(soup.prettify()) will show the changed links. To save the changes to a file:
htmlDoc.close()
html = soup.prettify("utf-8")
with open("output.html", "wb") as file:
    file.write(html)
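To preserve the original character encoding of the document, you could use soup.original_encoding
instead of "utf-8". Note that soup.original_encoding is None when the document was parsed from a Unicode string rather than bytes, so fall back to an explicit encoding in that case. See Encodings.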
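Putting the pieces together, here is a minimal end-to-end sketch. The helper name rewrite_links, its old/new parameters, the sample markup, and the example.* hosts are all made up for illustration; only find_all, prettify, and original_encoding come from BeautifulSoup itself. It assumes the replacement is a plain substring swap, which may not match what your script actually does to each href.

```python
import os
import tempfile

from bs4 import BeautifulSoup


def rewrite_links(html_bytes, old, new):
    """Replace `old` with `new` in every href; return (encoded_html, encoding).

    Hypothetical helper: the substring replacement stands in for whatever
    changes the real script makes to each link.
    """
    soup = BeautifulSoup(html_bytes, "html.parser")
    for link in soup.find_all("a", href=True):
        newlink = link["href"]
        newlink = newlink.replace(old, new)  # .. make replacements
        link["href"] = newlink               # store it back
    # original_encoding is None if the input was already a str, so fall back to UTF-8.
    encoding = soup.original_encoding or "utf-8"
    # prettify() returns bytes when an encoding is given.
    return soup.prettify(encoding), encoding


# Made-up sample document, passed as bytes so the encoding can be detected.
sample = (
    b'<html><head><meta charset="utf-8"></head>'
    b'<body><a href="http://old.example/page">x</a></body></html>'
)
output, encoding = rewrite_links(sample, "old.example", "new.example")

# Write the result in binary mode, since `output` is already encoded.
out_path = os.path.join(tempfile.mkdtemp(), "output.html")
with open(out_path, "wb") as f:
    f.write(output)
```

For a directory of files, the same helper could be called in a loop over the file paths, reading each one in "rb" mode so BeautifulSoup can detect its encoding.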