Generate Word Documents (in Excel VBA) from a series of Document Templates

后端 未结 4 1005
闹比i
闹比i 2020-11-30 21:26

Hey all. I\'ll try to make this brief and simple. :)

I have

  1. 40 or so boilerplate word documents with a series of fields (Name, address
相关标签:
4条回答
  • 2020-11-30 22:07

    http://www.computorcompanion.com/LPMArticle.asp?ID=224 Describes the use of Word bookmarks

    A section of text in a document can be bookmarked, and given a variable name. Using VBA, this variable can be accessed and the content in the document can be replaced with alternate content. This is a solution to having placeholders such as Name and Address in the document.

    Furthermore, using bookmarks, documents can be modified to reference bookmarked text. If a name appears several times throughout a document, the first instance can be bookmarked, and additional instances can reference the bookmark. Now when the first instance is programatically changed, all other instances of the variable throughout the document are also automatically changed.

    Now all that's needed is to update all the documents by bookmarking the placeholder text and using a consistent naming convention throughout the documents, then iterate through each documents replacing the bookmark if it exists:

    document.Bookmarks("myBookmark").Range.Text = "Inserted Text"
    

    I can probably solve the problem of variables that don't appear in a given document using the on error resume next clause before attempting each replacement.

    Thanks to Doug Glancy for mentioning the existance of bookmarks in his comment. I had no knowledge of their existence beforehand. I will keep this topic posted on whether this solution suffices.

    0 讨论(0)
  • 2020-11-30 22:14

    It's been a long time since I asked this question, and my solution has undergone more and more refinement. I've had to deal with all sorts of special cases, such as values that come directly from the workbook, sections that need to be specially generated based on lists, and the need to do replacements in headers and footers.

    As it turns out, it did not suffice to use bookmarks, as it was possible for users to later edit documents to change, add, and remove placeholder values from the documents. The solution was in fact to use keywords such as this:

    enter image description here

    This is just a page from a sample document which uses some of the possible values that can get automatically inserted into a document. Over 50 documents exist with completely different structures and layouts, and using different parameters. The only common knowledge shared by the word documents and the excel spreadsheet is a knowledge of what these placeholder values are meant to represent. In excel, this is stored in a list of document generation keywords, which contain the keyword, followed by a reference to the range that actually contains this value:

    enter image description here

    These were the key two ingredients required. Now with some clever code, all I had to do was iterate over each document to be generated, and then iterate over the range of all known keywords, and do a search and replace for each keyword in each document.


    First, I have the wrapper method, which takes care of maintaining an instance of microsoft word iterating over all documents selected for generation, numbering the documents, and doing the user interface stuff (like handling errors, displaying the folder to the user, etc.)

    ' Purpose: Iterates over and generates all documents in the list of forms to generate
    '          Improves speed by creating a persistant Word application used for all generated documents
    Public Sub GeneratePolicy()
        Dim oWrd As New Word.Application
        Dim srcPath As String
        Dim cel As Range
    
        If ERROR_HANDLING Then On Error GoTo errmsg
        If Forms.Cells(2, FormsToGenerateCol) = vbNullString Then _
            Err.Raise 1, , "There are no forms selected for document generation."
        'Get the path of the document repository where the forms will be found.
        srcPath = FindConstant("Document Repository")
        'Each form generated will be numbered sequentially by calling a static counter function. This resets it.
        GetNextEndorsementNumber reset:=True
        'Iterate over each form, calling a function to replace the keywords and save a copy to the output folder
        For Each cel In Forms.Range(Forms.Cells(2, FormsToGenerateCol), Forms.Cells(1, FormsToGenerateCol).End(xlDown))
            RunReplacements cel.value, CreateDocGenPath(cel.Offset(0, 1).value), oWrd
        Next cel
        oWrd.Quit
        On Error Resume Next
        'Display the folder containing the generated documents
        Call Shell("explorer.exe " & CreateDocGenPath, vbNormalFocus)
        oWrd.Quit False
        Application.StatusBar = False
        If MsgBox("Policy generation complete. The reserving information will now be recorded.", vbOKCancel, _
                  "Policy Generated. OK to store reserving info?") = vbOK Then Push_Reserving_Requirements
        Exit Sub
    errmsg:
        MsgBox Err.Description, , "Error generating Policy Documents"
    End Sub
    

    That routine calls RunReplacements which takes care of opening the document, prepping the environment for a fast replacement, updating links once done, handling errors, etc:

    ' Purpose: Opens up a document and replaces all instances of special keywords with their respective values.
    '          Creates an instance of Word if an existing one is not passed as a parameter.
    '          Saves a document to the target path once the template has been filled in.
    '
    '          Replacements are done using two helper functions, one for doing simple keyword replacements,
    '          and one for the more complex replacements like conditional statements and schedules.
    Private Sub RunReplacements(ByVal DocumentPath As String, ByVal SaveAsPath As String, _
                                Optional ByRef oWrd As Word.Application = Nothing)
        Dim oDoc As Word.Document
        Dim oWrdGiven As Boolean
        If oWrd Is Nothing Then Set oWrd = New Word.Application Else oWrdGiven = True
    
        If ERROR_HANDLING Then On Error GoTo docGenError
        oWrd.Visible = False
        oWrd.DisplayAlerts = wdAlertsNone
    
        Application.StatusBar = "Opening " & Mid(DocumentPath, InStrRev(DocumentPath, "\") + 1)
        Set oDoc = oWrd.Documents.Open(Filename:=DocumentPath, Visible:=False)
        RunAdvancedReplacements oDoc
        RunSimpleReplacements oDoc
        UpdateLinks oDoc 'Routine which will update calculated statements in Word (like current date)
        Application.StatusBar = "Saving " & Mid(DocumentPath, InStrRev(DocumentPath, "\") + 1)
        oDoc.SaveAs SaveAsPath
    
        GoTo Finally
    docGenError:
        MsgBox "Un unknown error occurred while generating document: " & DocumentPath & vbNewLine _
                & vbNewLine & Err.Description, vbCritical, "Document Generation"
    Finally:
        If Not oDoc Is Nothing Then oDoc.Close False: Set oDoc = Nothing
        If Not oWrdGiven Then oWrd.Quit False
    End Sub
    

    That routine then invokes RunSimpleReplacements. and RunAdvancedReplacements. In the former, we iterate over the set of Document Generation Keywords and call WordDocReplace if the document contains our keyword. Note that it's much faster to try and Find a bunch of words to figure out that they don't exist, then to call replace indiscriminately, so we always check if a keyword exists before attempting to replace it.

    ' Purpose: While short, this short module does most of the work with the help of the generation keywords
    '          range on the lists sheet. It loops through every simple keyword that might appear in a document
    '          and calls a function to have it replaced with the corresponding data from pricing.
    Private Sub RunSimpleReplacements(ByRef oDoc As Word.Document)
        Dim DocGenKeys As Range, valueSrc As Range
        Dim value As String
        Dim i As Integer
    
        Set DocGenKeys = Lists.Range("DocumentGenerationKeywords")
        For i = 1 To DocGenKeys.Rows.Count
            If WordDocContains(oDoc, "#" & DocGenKeys.Cells(i, 1).Text & "#") Then
                'Find the text that we will be replacing the placeholder keyword with
                Set valueSrc = Range(Mid(DocGenKeys.Cells(i, 2).Formula, 2))
                If valueSrc.MergeCells Then value = valueSrc.MergeArea.Cells(1, 1).Text Else value = valueSrc.Text
                'Perform the replacement
                WordDocReplace oDoc, "#" & DocGenKeys.Cells(i, 1).Text & "#", value
            End If
        Next i
    End Sub
    

    This is the function used to detect whether a keyword exists in the document:

    ' Purpose: Function called for each replacement to first determine as quickly as possible whether
    '          the document contains the keyword, and thus whether replacement actions must be taken.
    Public Function WordDocContains(ByRef oDoc As Word.Document, ByVal searchFor As String) As Boolean
        Application.StatusBar = "Checking for keyword: " & searchFor
        WordDocContains = False
        Dim storyRange As Word.Range
        For Each storyRange In oDoc.StoryRanges
            With storyRange.Find
                .Text = searchFor
                WordDocContains = WordDocContains Or .Execute
            End With
            If WordDocContains Then Exit For
        Next
    End Function
    

    And this is where the rubber meets the road - the code that executes the replacement. This routine got more complicated as I encountered difficulties. Here are the lessons you will only learn from experience:

    1. You can set the replacement text directly, or you can use the clipboard. I found out the hard way that if you are doing a VBA replace in word using a string longer than 255 characters, the text will get truncated if you try to place it in the Find.Replacement.Text, but you can use "^c" as your replacement text, and it will get it directly from the clipboard. This was the workaround I got to use.

    2. Simply calling replace will miss keywords in some text areas like headers and footers. Because of this, you actually need to iterate over the document.StoryRanges and run the search and replace on each one to ensure that you catch all instances of the word you want to replace.

    3. If you're setting the Replacement.Text directly, you need to convert Excel line breaks (vbNewLine and Chr(10)) with a simple vbCr for them to appear properly in word. Otherwise, anywhere your replacement text has line breaks coming from an excel cell will end up inserting strange symbols into word. If you use the clipboard method however, you do not need to do this, as the line breaks get converted automatically when put in the clipboard.

    That explains everything. Comments should be pretty clear too. Here's the golden routine that executes the magic:

    ' Purpose: This function actually performs replacements using the Microsoft Word API
    Public Sub WordDocReplace(ByRef oDoc As Word.Document, ByVal replaceMe As String, ByVal replaceWith As String)
        Dim clipBoard As New MSForms.DataObject
        Dim storyRange As Word.Range
        Dim tooLong As Boolean
    
        Application.StatusBar = "Replacing instances of keyword: " & replaceMe
    
        'We want to use regular search and replace if we can. It's faster and preserves the formatting that
        'the keyword being replaced held (like bold).  If the string is longer than 255 chars though, the
        'standard replace method doesn't work, and so we must use the clipboard method (^c special character),
        'which does not preserve formatting. This is alright for schedules though, which are always plain text.
        If Len(replaceWith) > 255 Then tooLong = True
        If tooLong Then
            clipBoard.SetText IIf(replaceWith = vbNullString, "", replaceWith)
            clipBoard.PutInClipboard
        Else
            'Convert excel in-cell line breaks to word line breaks. (Not necessary if using clipboard)
            replaceWith = Replace(replaceWith, vbNewLine, vbCr)
            replaceWith = Replace(replaceWith, Chr(10), vbCr)
        End If
        'Replacement must be done on multiple 'StoryRanges'. Unfortunately, simply calling replace will miss
        'keywords in some text areas like headers and footers.
        For Each storyRange In oDoc.StoryRanges
            Do
                With storyRange.Find
                    .MatchWildcards = True
                    .Text = replaceMe
                    .Replacement.Text = IIf(tooLong, "^c", replaceWith)
                    .Wrap = wdFindContinue
                    .Execute Replace:=wdReplaceAll
                End With
                On Error Resume Next
                Set storyRange = storyRange.NextStoryRange
                On Error GoTo 0
            Loop While Not storyRange Is Nothing
        Next
        If tooLong Then clipBoard.SetText ""
        If tooLong Then clipBoard.PutInClipboard
    End Sub
    

    When the dust settles, we're left with a beautiful version of the initial document with production values in place of those hash marked keywords. I'd love to show an example, but of course every filled in document contain all-proprietary information.


    The only think left to mention I guess would be that RunAdvancedReplacements section. It does something extremely similar - it ends up calling the same WordDocReplace function, but what's special about the keywords used here is that they don't link to a single cell in the original workbook, they get generated in the code-behind from lists in the workbook. So for instance, one of the advanced replacements would look like this:

    'Generate the schedule of vessels
    If WordDocContains(oDoc, "#VESSELSCHEDULE#") Then _
        WordDocReplace oDoc, "#VESSELSCHEDULE#", GenerateVesselSchedule()
    

    And then there will be a corresponding routine which puts together a string containing all the vessel information as configured by the user:

    ' Purpose: Generates the list of vessels from the "Vessels" sheet based on the user's configuration
    '          in the booking tab. The user has the option to generate one or both of Owned Vessels
    '          and Chartered Vessels, as well as what fields to display. Uses a helper function.
    Public Function GenerateVesselSchedule() As String
        Dim value As String
    
        Application.StatusBar = "Generating Schedule of Vessels."
        If Booking.Range("ListVessels").value = "Yes" Then
            Dim VesselCount As Long
    
            If Booking.Range("ListVessels").Offset(1).value = "Yes" Then _
                value = value & GenerateVesselScheduleHelper("Vessels", VesselCount)
            If Booking.Range("ListVessels").Offset(1).value = "Yes" And _
               Booking.Range("ListVessels").Offset(2).value = "Yes" Then _
                value = value & "(Chartered Vessels)" & vbNewLine
            If Booking.Range("ListVessels").Offset(2).value = "Yes" Then _
                value = value & GenerateVesselScheduleHelper("CharteredVessels", VesselCount)
            If Len(value) > 2 Then value = Left(value, Len(value) - 2) 'Remove the trailing line break
        Else
            GenerateVesselSchedule = Booking.Range("VesselSchedAlternateText").Text
        End If
        GenerateVesselSchedule = value
    End Function
    
    ' Purpose: Helper function for the Vessel Schedule generation routine. Generates either the Owned or
    '          Chartered vessels based on the schedule parameter passed. The list is numbered and contains
    '          the information selected by the user on the Booking sheet.
    ' SENSITIVE: Note that this routine is sensitive to the layout of the Vessel Schedule tab and the
    '            parameters on the Configure Quotes tab. If either changes, it should be revisited.
    Public Function GenerateVesselScheduleHelper(ByVal schedule As String, ByRef VesselCount As Long) As String
        Dim value As String, nextline As String
        Dim numInfo As Long, iRow As Long, iCol As Long
        Dim Inclusions() As Boolean, Columns() As Long
    
        'Gather info about vessel info to display in the schedule
        With Booking.Range("VesselInfoToInclude")
            numInfo = Booking.Range(.Cells(1, 1), .End(xlToRight)).Columns.Count - 1
            ReDim Inclusions(1 To numInfo)
            ReDim Columns(1 To numInfo)
            On Error Resume Next 'Some columns won't be identified
            For iCol = 1 To numInfo
                Inclusions(iCol) = .Offset(0, iCol) = "Yes"
                Columns(iCol) = sumSchedVessels.Range(schedule).Cells(1).EntireRow.Find(.Offset(-1, iCol)).Column
            Next iCol
            On Error GoTo 0
        End With
    
        'Build the schedule
        With sumSchedVessels.Range(schedule)
            For iRow = .row + 1 To .row + .Rows.Count - 1
                If Len(sumSchedVessels.Cells(iRow, Columns(1)).value) > 0 Then
                    VesselCount = VesselCount + 1
                    value = value & VesselCount & "." & vbTab
                    nextline = vbNullString
                    'Add each property that was included to the description string
                    If Inclusions(1) Then nextline = nextline & sumSchedVessels.Cells(iRow, Columns(1)) & vbTab
                    If Inclusions(2) Then nextline = nextline & "Built: " & sumSchedVessels.Cells(iRow, Columns(2)) & vbTab
                    If Inclusions(3) Then nextline = nextline & "Length: " & _
                                          Format(sumSchedVessels.Cells(iRow, Columns(3)), "#'") & vbTab
                    If Inclusions(4) Then nextline = nextline & "" & sumSchedVessels.Cells(iRow, Columns(4)) & vbTab
                    If Inclusions(5) Then nextline = nextline & "Hull Value: " & _
                                          Format(sumSchedVessels.Cells(iRow, Columns(5)), "$#,##0") & vbTab
                    If Inclusions(6) Then nextline = nextline & "IV: " & _
                                          Format(sumSchedVessels.Cells(iRow, Columns(6)), "$#,##0") & vbTab
                    If Inclusions(7) Then nextline = nextline & "TIV: " & _
                                          Format(sumSchedVessels.Cells(iRow, Columns(7)), "$#,##0") & vbTab
                    If Inclusions(8) And schedule = "CharteredVessels" Then _
                        nextline = nextline & "Deductible: " & Format(bmCharterers.Range(schedule).Cells( _
                                   iRow - .row, 9), "$#,##0") & vbTab
                    nextline = Left(nextline, Len(nextline) - 1) 'Remove the trailing tab
                    'If more than 4 properties were included insert a new line after the 4th one
                    Dim tabloc As Long: tabloc = 0
                    Dim counter As Long: counter = 0
                    Do
                        tabloc = tabloc + 1
                        tabloc = InStr(tabloc, nextline, vbTab)
                        If tabloc > 0 Then counter = counter + 1
                    Loop While tabloc > 0 And counter < 4
                    If counter = 4 Then nextline = Left(nextline, tabloc - 1) & vbNewLine & Mid(nextline, tabloc)
                    value = value & nextline & vbNewLine
                End If
            Next iRow
        End With
    
        GenerateVesselScheduleHelper = value
    End Function
    

    the resulting string can be used just like the contents of any excel cell, and passed to the replacement function, which will appropriately use the clipboard method if it exceeds 255 characters.

    So this template:

    enter image description here

    Plus this spreadsheet data:

    enter image description here

    Becomes this document:

    enter image description here


    I sincerely hope that this helps someone out some day. It was definitely a huge undertaking and a complex wheel to have to re-invent. The application is huge, with over 50,000 lines of VBA code, so if I've referenced a crucial method in my code somewhere that someone needs, please leave a comment and I'll add it in here.

    0 讨论(0)
  • 2020-11-30 22:19

    You might consider an XML based approach.

    Word has a feature called Custom XML data-binding, or data-bound content controls. A content control is essentially a point in the document which can contain content. A "data-bound" content control gets its content from an XML document you include in the docx zip file. An XPath expression is used to say which bit of XML. So all you need to do is include your XML file, and Word will do the rest.

    Excel has ways to get data out of it as XML, so the whole solution should work nicely.

    There is plenty of information on content control data-binding on MSDN (some of which has been referenced in earlier SO questions) so I won't bother including them here.

    But you do need a way of setting up the bindings. You can either use the Content Control Toolkit, or if you want to do it from within Word, my OpenDoPE add-in.

    0 讨论(0)
  • 2020-11-30 22:24

    Having done a similar task I found that inserting values into tables was much quicker than searching for named tags - the data can then be inserted like this:

        With oDoc.Tables(5)
        For i = 0 To Data.InvoiceDictionary.Count - 1
            If i > 0 Then
                oDoc.Tables(5).rows.Add
            End If
             Set invoice = Data.InvoiceDictionary.Items(i)
            .Cell(i + 2, 1).Range.Text = invoice.InvoiceCCNumber
            .Cell(i + 2, 2).Range.Text = invoice.InvoiceDate
            .Cell(i + 2, 3).Range.Text = invoice.TransactionType
            .Cell(i + 2, 4).Range.Text = invoice.Description
            .Cell(i + 2, 5).Range.Text = invoice.SumOfValue
    
        Next i
    

    .Cell(i + 1, 4).Range.Text = "Total:" End With in this case row 1 of the table was the headers; row 2 was empty and there were no further rows - thus the rows.add applies once more than one row was attached. The tables can be very detailed documents and by hiding the borders and cell borders can be made to look like ordinary text. Tables are numbered sequentially following the document flow. (i.e. Doc.Tables(1) is the first table...

    0 讨论(0)
提交回复
热议问题