How to strip ALL HTML tags using MSHTML Parser in VB6?

Submitted by 夏天 on 2020-01-14 14:49:22

Question


How to strip ALL HTML tags using MSHTML Parser in VB6?


Answer 1:


This is adapted from code over at CodeGuru. Many thanks to the original author: http://www.codeguru.com/vb/vb_internet/html/article.php/c4815

Check the original source if you need to download your HTML from the web. E.g.:

Set objDocument = objMSHTML.createDocumentFromUrl("http://google.com", vbNullString)

I didn't need to download the HTML from the web because I already had it in memory, so the original source didn't quite apply to me. My main goal is just to have a real DOM parser strip the HTML from user-generated content for me. Some would say, "Why not just use a regex to strip the HTML?" Good luck with that!

Add a reference to: Microsoft HTML Object Library

This is the same HTML parser that Internet Explorer (IE) uses. Let the heckling begin.

Here's the code I used:

Dim objDocument As MSHTML.HTMLDocument
Set objDocument = New MSHTML.HTMLDocument

'NOTE: txtSource is an instance of a simple TextBox object
objDocument.body.innerHTML = "<p>Hello World!</p> <p>Hello Jason!</p> <br/>Hello Bob!"
txtSource.Text = objDocument.body.innerText

The resulting text in txtSource.Text is my user's content stripped of all HTML. Clean and maintainable, and no Cthulhu Way for me.
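If the stripping is needed in more than one place, the same two lines can be wrapped in a small helper. This is only a minimal sketch of the approach above: the function name StripHtmlTags and its parameter are hypothetical (not from the original answer), and it assumes the project already references the Microsoft HTML Object Library.

```vb
' Sketch only: StripHtmlTags is a hypothetical helper name.
' Assumes a project reference to "Microsoft HTML Object Library".
Public Function StripHtmlTags(ByVal sHtml As String) As String
    Dim objDocument As MSHTML.HTMLDocument
    Set objDocument = New MSHTML.HTMLDocument

    ' Let the MSHTML parser build a DOM from the fragment...
    objDocument.body.innerHTML = sHtml
    ' ...then read back only the text content.
    StripHtmlTags = objDocument.body.innerText
End Function
```

It would be called the same way as the inline version, for example txtSource.Text = StripHtmlTags(sUserContent), where sUserContent stands for whatever string holds the markup.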




Answer 2:


One way:

Function strip(html As String) As String
    With CreateObject("htmlfile")
        .Open
        .write html
        .Close
        strip = .body.outerText
    End With
End Function

For example, in the Immediate window:

?strip("<strong>hello <i>wor<u>ld</u>!</strong><foo> 1234")
hello world! 1234



Answer 3:


Public Function ParseHtml(ByVal str As String) As String
    Dim Ret As String, TagOpen As Boolean
    Dim n As Long, sChar As String
    ' Walk the string one character at a time.
    For n = 1 To Len(str)
        sChar = Mid$(str, n, 1)
        Select Case sChar
            Case "<"
                ' Start of a tag: stop copying characters.
                TagOpen = True
            Case ">"
                ' End of a tag: resume copying.
                TagOpen = False
            Case Else
                ' Keep only characters that are outside a tag.
                If Not TagOpen Then
                    Ret = Ret & sChar
                End If
        End Select
    Next
    ParseHtml = Ret
End Function

This is a simple function I made for my own use. To try it, use the Debug (Immediate) window:

?ParseHtml( "< div >test< /div >" )

test

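I hope this helps. It works without any external libraries, but it only drops whatever sits between "<" and ">", so character entities such as &amp; are left as they are.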
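Since a character scanner like ParseHtml does not decode entities, the output may still contain sequences such as &amp; or &lt;. One way to handle the most common ones is sketched below. The helper name DecodeBasicEntities is hypothetical, and the list of entities is deliberately short and an assumption about what the input contains; it is not a complete entity decoder.

```vb
' Sketch only: DecodeBasicEntities is a hypothetical helper that
' covers a handful of common named entities, nothing more.
Public Function DecodeBasicEntities(ByVal s As String) As String
    s = Replace(s, "&lt;", "<")
    s = Replace(s, "&gt;", ">")
    s = Replace(s, "&quot;", """")
    s = Replace(s, "&nbsp;", " ")
    ' Decode &amp; last, so that "&amp;lt;" ends up as the
    ' literal text "&lt;" rather than "<".
    s = Replace(s, "&amp;", "&")
    DecodeBasicEntities = s
End Function
```

It would be applied after the tags are removed, for example DecodeBasicEntities(ParseHtml(sHtml)), so that a decoded "<" is not mistaken for the start of a tag.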



Source: https://stackoverflow.com/questions/5707709/how-to-strip-all-html-tags-using-mshtml-parser-in-vb6
