Getting the headings from a Word document

挽巷 2020-11-30 04:11

How do I get a list of all the headings in a Word document using VBA?

7 answers
  • 2020-11-30 04:30

    Do you mean something like this CreateOutline procedure, which copies all headings from a source Word document into a new Word document?

    (I believe the call astrHeadings = docSource.GetCrossReferenceItems(wdRefTypeHeading) is the key to this program, and it should let you retrieve what you are asking for.)

    Public Sub CreateOutline()
        Dim docOutline As Word.Document
        Dim docSource As Word.Document
        Dim rng As Word.Range
    
        Dim astrHeadings As Variant
        Dim strText As String
        Dim intLevel As Integer
        Dim intItem As Integer
    
        Set docSource = ActiveDocument
        Set docOutline = Documents.Add
    
        ' Content returns only the main body of the document, not the headers/footers.
        Set rng = docOutline.Content
        ' GetCrossReferenceItems(wdRefTypeHeading) returns an array of strings, one per heading in the document
        astrHeadings = docSource.GetCrossReferenceItems(wdRefTypeHeading)
    
        For intItem = LBound(astrHeadings) To UBound(astrHeadings)
            ' Get the text and the level.
            strText = Trim$(astrHeadings(intItem))
            intLevel = GetLevel(CStr(astrHeadings(intItem)))
    
            ' Add the text to the document.
            rng.InsertAfter strText & vbNewLine
    
            ' Set the style of the selected range and
            ' then collapse the range for the next entry.
            rng.Style = "Heading " & intLevel
            rng.Collapse wdCollapseEnd
        Next intItem
    End Sub
    
    Private Function GetLevel(strItem As String) As Integer
        ' Return the heading level of a header from the
        ' array returned by Word.
    
        ' The number of leading spaces indicates the
        ' outline level (2 spaces per level: H1 has
        ' 0 spaces, H2 has 2 spaces, H3 has 4 spaces).
    
        Dim strTemp As String
        Dim strOriginal As String
        Dim intDiff As Integer
    
        ' Get rid of all trailing spaces.
        strOriginal = RTrim$(strItem)
    
        ' Trim leading spaces, and then compare with
        ' the original.
        strTemp = LTrim$(strOriginal)
    
        ' Subtract to find the number of
        ' leading spaces in the original string.
        intDiff = Len(strOriginal) - Len(strTemp)
        GetLevel = (intDiff \ 2) + 1
    End Function
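
    If you only need the list itself rather than a new document, the same call can be used on its own. This is a minimal sketch, not part of the original answer: ListHeadings is a hypothetical name, it relies on the two-spaces-per-level indentation described in GetLevel above, and it does not handle a document that contains no headings.

    Public Sub ListHeadings()
        Dim varHeadings As Variant
        Dim strItem As String
        Dim lngLevel As Long
        Dim i As Long

        ' One string per heading; deeper levels arrive indented with spaces.
        varHeadings = ActiveDocument.GetCrossReferenceItems(wdRefTypeHeading)

        For i = LBound(varHeadings) To UBound(varHeadings)
            strItem = RTrim$(CStr(varHeadings(i)))
            ' Assumption: 2 leading spaces per level below Heading 1.
            lngLevel = (Len(strItem) - Len(LTrim$(strItem))) \ 2 + 1
            ' Print "level <tab> text" to the Immediate window.
            Debug.Print lngLevel & vbTab & LTrim$(strItem)
        Next i
    End Sub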
    

    UPDATE by @kol on March 6, 2018

    Although astrHeadings is an array (IsArray returns True, and TypeName returns String()) I get a type mismatch error when I try to access its elements in VBScript (v5.8.16384 on Windows 10 Pro 1709 16299.248). This must be a VBScript-specific problem, because I can access the elements if I run the same code in Word's VBA editor. I ended up iterating the lines of the TOC, because it works even from VBScript:

    For Each Paragraph In Doc.TablesOfContents(1).Range.Paragraphs
      WScript.Echo Paragraph.Range.Text
    Next
    
  • 2020-11-30 04:32

    Fastest method for extracting all headings (down to level 5):

    Sub EXTRACT_HDNGS()
        Dim WDApp As Word.Application    ' Word application
        Dim WDDoc As Word.Document       ' Word document
        Dim Head_n As Long, Head As String
        Dim Heading_txt, Heading_lvl, Heading_lne, Heading_pge

        Set WDApp = Word.Application
        Set WDDoc = WDApp.ActiveDocument

        For Head_n = 1 To 5
            Head = "Heading " & Head_n
            ' Start each pass from the top of the document.
            WDApp.Selection.HomeKey wdStory, wdMove

            Do
                With WDApp.Selection
                    .MoveStart Unit:=wdLine, Count:=1
                    .Collapse Direction:=wdCollapseEnd
                End With
                ' Find the next paragraph formatted with the current heading style.
                With WDApp.Selection.Find
                    .ClearFormatting
                    .Text = ""
                    .MatchWildcards = False
                    .Forward = True
                    .Style = WDDoc.Styles(Head)
                    If .Execute = False Then GoTo Level_exit
                    .ClearFormatting
                End With

                ' RemoveSpecialChar is a helper that is not shown in this answer.
                Heading_txt = RemoveSpecialChar(WDApp.Selection.Range.Text, 1)
                Debug.Print Heading_txt
                Heading_lvl = WDApp.Selection.Range.ListFormat.ListLevelNumber
                Debug.Print Heading_lvl
                Heading_lne = WDDoc.Range(0, WDApp.Selection.Range.End).Paragraphs.Count
                Debug.Print Heading_lne
                Heading_pge = WDApp.Selection.Information(wdActiveEndPageNumber)
                Debug.Print Heading_pge

                If WDApp.Selection.Style = "Heading 1" Then GoTo Level_exit
                WDApp.Selection.Collapse Direction:=wdCollapseStart
            Loop
    Level_exit:
        Next Head_n

    End Sub
    
  • 2020-11-30 04:39

    Following Wikis's comment on VonC's answer, here is the code that worked for me. It makes the function faster.

    Public Sub CopyHeadingsInNewDoc()
        Dim docOutline As Word.Document
        Dim docSource As Word.Document
        Dim rng As Word.Range
    
        Dim astrHeadings As Variant
        Dim strText As String
        Dim intLevel As Integer
        Dim intItem As Integer
    
        Set docSource = ActiveDocument
        Set docOutline = Documents.Add
    
        ' Content returns only the
        ' main body of the document, not
        ' the headers and footer.
        Set rng = docOutline.Content
        astrHeadings = _
         docSource.GetCrossReferenceItems(wdRefTypeHeading)
    
        For intItem = LBound(astrHeadings) To UBound(astrHeadings)
            ' Get the text and the level.
            strText = Trim$(astrHeadings(intItem))
            intLevel = GetLevel(CStr(astrHeadings(intItem)))
    
            ' Add the text to the document.
            rng.InsertAfter strText & vbNewLine
    
            ' Set the style of the selected range and
            ' then collapse the range for the next entry.
            rng.Style = "Heading " & intLevel
            rng.Collapse wdCollapseEnd
        Next intItem
    End Sub
    
    Private Function GetLevel(strItem As String) As Integer
        ' Return the heading level of a header from the
        ' array returned by Word.
    
        ' The number of leading spaces indicates the
        ' outline level (2 spaces per level: H1 has
        ' 0 spaces, H2 has 2 spaces, H3 has 4 spaces).
    
        Dim strTemp As String
        Dim strOriginal As String
        Dim longDiff As Integer
    
        ' Get rid of all trailing spaces.
        strOriginal = RTrim$(strItem)
    
        ' Trim leading spaces, and then compare with
        ' the original.
        strTemp = LTrim$(strOriginal)
    
        ' Subtract to find the number of
        ' leading spaces in the original string.
        longDiff = Len(strOriginal) - Len(strTemp)
        GetLevel = (longDiff \ 2) + 1
    End Function
    
  • 2020-11-30 04:46

    You can also create a table of contents in the document and copy that. This separates the paragraph reference from the title, which is handy if you need to present it in another context. If you do not want the ToC in your document, just delete it after the copy and paste. JK.
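
    The table of contents can also be read from VBA instead of being copied by hand. This is a rough sketch under two assumptions: the document already contains at least one table of contents, and its entries use tab characters between the number, the title, and the page number. ListTocEntries is a hypothetical name.

    Public Sub ListTocEntries()
        Dim para As Word.Paragraph
        Dim strLine As String
        Dim varParts As Variant

        ' Walk the paragraphs of the first table of contents.
        For Each para In ActiveDocument.TablesOfContents(1).Range.Paragraphs
            strLine = para.Range.Text
            ' Drop the trailing paragraph mark.
            If Right$(strLine, 1) = vbCr Then strLine = Left$(strLine, Len(strLine) - 1)
            If Len(strLine) > 0 Then
                ' Assumption: fields are tab-separated.
                varParts = Split(strLine, vbTab)
                Debug.Print Join(varParts, " | ")
            End If
        Next para
    End Sub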

  • 2020-11-30 04:50

    The easiest way to get a list of headings is to loop through the paragraphs in the document, for example:

    Sub ReadPara()
        Dim DocPara As Paragraph

        For Each DocPara In ActiveDocument.Paragraphs
            If Left(DocPara.Range.Style, Len("Heading")) = "Heading" Then
                Debug.Print DocPara.Range.Text
            End If
        Next
    End Sub
    

    By the way, I find it is a good idea to remove the final character of the paragraph range. Otherwise, if you send the string to a message box or a document, Word displays an extra control character. For example:

    Left(DocPara.Range.Text, Len(DocPara.Range.Text) - 1)
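
    The two ideas can be combined into one routine. The sketch below is an illustration only, and its names (StripParagraphMark, ListHeadingParagraphs) are hypothetical. Instead of matching the style name, which only works when the styles are called "Heading ...", it checks the paragraph's OutlineLevel. That also picks up paragraphs in custom styles that have an outline level, which may or may not be what you want.

    Public Function StripParagraphMark(ByVal strText As String) As String
        ' A paragraph's Range.Text ends with a carriage return (vbCr).
        If Right$(strText, 1) = vbCr Then
            StripParagraphMark = Left$(strText, Len(strText) - 1)
        Else
            StripParagraphMark = strText
        End If
    End Function

    Public Sub ListHeadingParagraphs()
        Dim para As Word.Paragraph

        For Each para In ActiveDocument.Paragraphs
            ' Anything that is not body text has an outline level of 1 to 9.
            If para.OutlineLevel <> wdOutlineLevelBodyText Then
                Debug.Print para.OutlineLevel & vbTab & StripParagraphMark(para.Range.Text)
            End If
        Next para
    End Sub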
    
  • 2020-11-30 04:50

    This macro worked beautifully for me (Word 2010). I've extended the functionality slightly: it now prompts the user for the lowest heading level to include, and suppresses subheadings below that level.

    Public Sub CreateOutline()
    ' from http://stackoverflow.com/questions/274814/getting-the-headings-from-a-word-document
        Dim docOutline As Word.Document
        Dim docSource As Word.Document
        Dim rng As Word.Range
    
        Dim astrHeadings As Variant
        Dim strText As String
        Dim intLevel As Integer
        Dim intItem As Integer
        Dim minLevel As Integer
    
        Set docSource = ActiveDocument
        Set docOutline = Documents.Add
    
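        minLevel = 1  ' headings deeper than this level (a larger number) won't be copied.
        ' The third InputBox argument is the default value; the second (the title) is left empty.
        minLevel = CInt(InputBox("This macro will generate a new document that contains only the headings from the existing document. What is the lowest level heading you want?", , "2"))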
    
        ' Content returns only the
        ' main body of the document, not
        ' the headers and footer.
        Set rng = docOutline.Content
        astrHeadings = _
         docSource.GetCrossReferenceItems(wdRefTypeHeading)
    
        For intItem = LBound(astrHeadings) To UBound(astrHeadings)
            ' Get the text and the level.
            strText = Trim$(astrHeadings(intItem))
            intLevel = GetLevel(CStr(astrHeadings(intItem)))
    
            If intLevel <= minLevel Then
    
                ' Add the text to the document.
                rng.InsertAfter strText & vbNewLine
    
                ' Set the style of the selected range and
                ' then collapse the range for the next entry.
                rng.Style = "Heading " & intLevel
                rng.Collapse wdCollapseEnd
            End If
        Next intItem
    End Sub
    
    Private Function GetLevel(strItem As String) As Integer
        ' from http://stackoverflow.com/questions/274814/getting-the-headings-from-a-word-document
        ' Return the heading level of a header from the
        ' array returned by Word.
    
        ' The number of leading spaces indicates the
        ' outline level (2 spaces per level: H1 has
        ' 0 spaces, H2 has 2 spaces, H3 has 4 spaces).
    
        Dim strTemp As String
        Dim strOriginal As String
        Dim intDiff As Integer
    
        ' Get rid of all trailing spaces.
        strOriginal = RTrim$(strItem)
    
        ' Trim leading spaces, and then compare with
        ' the original.
        strTemp = LTrim$(strOriginal)
    
        ' Subtract to find the number of
        ' leading spaces in the original string.
        intDiff = Len(strOriginal) - Len(strTemp)
        GetLevel = (intDiff \ 2) + 1
    End Function
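
    One caveat with the prompt: CInt raises a type mismatch error if the user cancels the box or types something that is not a number. A small guard along the following lines could be used instead. This is a sketch, and ParseHeadingLevel is a hypothetical helper that is not part of the macro above. It accepts a single digit from 1 to 9 and otherwise falls back to a default.

    Public Function ParseHeadingLevel(ByVal strInput As String, ByVal lngDefault As Long) As Long
        Dim strTrimmed As String

        strTrimmed = Trim$(strInput)
        ' Accept exactly one digit in the range 1-9; anything else uses the default.
        If strTrimmed Like "[1-9]" Then
            ParseHeadingLevel = CLng(strTrimmed)
        Else
            ParseHeadingLevel = lngDefault
        End If
    End Function

    In CreateOutline, the result of InputBox would then be passed through it, for example minLevel = ParseHeadingLevel(InputBox("What is the lowest level heading you want?", , "2"), 2).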
    