In Ruby, you can parse the HTML with Nokogiri, which lets you check for errors, and then have it write the HTML back out, which repairs problems such as missing closing tags. In the following HTML, the title and p tags are not closed, but Nokogiri adds the closing tags.
require 'nokogiri'
html = '<html><head><title>the title</head><body><p>a paragraph</body></html>'
doc = Nokogiri::HTML(html)
puts "Errors found" if doc.errors.any?
puts doc.to_html
# >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
# >> <html>
# >> <head>
# >> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
# >> <title>the title</title>
# >> </head>
# >> <body><p>a paragraph</p></body>
# >> </html>
Alternatively, you can open a pipe to /usr/bin/tidy and let it do the dirty work:
require 'open3'
html = '<html><head><title>the title</head><body><p>a paragraph</body></html>'
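# Start tidy (-q: quiet, -i: indent the output) with pipes on its standard streams
stdin, stdout, stderr = Open3.popen3('/usr/bin/tidy -qi')
stdin.puts html
stdin.close # signal end of input to tidy
puts stdout.read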
# >> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 3.2//EN">
# >>
# >> <html>
# >> <head>
# >> <meta name="generator" content=
# >> "HTML Tidy for Mac OS X (vers 31 October 2006 - Apple Inc. build 15.3.6), see www.w3.org">
# >>
# >> <title>the title</title>
# >> </head>
# >>
# >> <body>
# >> <p>a paragraph</p>
# >> </body>
# >> </html>
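For anything beyond a quick script, `Open3.capture3` may be a more convenient way to do the same thing, because it feeds stdin, closes it, and collects stdout and stderr in one call. The following is only a sketch: `tidy_html` is a hypothetical helper name, the `/usr/bin/tidy` path is carried over from the example above and may differ on your system, and the helper returns nil when no executable is found there.

```ruby
require 'open3'

# Hypothetical helper (not from the original example): run tidy on the
# given HTML and return [cleaned_html, diagnostics, exit_status],
# or nil if there is no executable at tidy_path.
def tidy_html(html, tidy_path: '/usr/bin/tidy')
  return nil unless File.executable?(tidy_path)

  # capture3 writes stdin_data to the child, closes its stdin, and reads
  # stdout and stderr to the end before returning.
  cleaned, diagnostics, status = Open3.capture3(tidy_path, '-qi', stdin_data: html)
  [cleaned, diagnostics, status.exitstatus]
end

html = '<html><head><title>the title</head><body><p>a paragraph</body></html>'
result = tidy_html(html)
if result
  cleaned, diagnostics, exit_status = result
  puts cleaned
  # tidy documents exit status 1 for warnings and 2 for errors
  warn diagnostics if exit_status && exit_status > 0
else
  warn 'tidy not found at /usr/bin/tidy'
end
```

Passing the command and its flags as separate arguments avoids going through a shell, which matters if the path ever comes from user input.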