Passing strings from VBA to C++ DLL

后端 未结 5 711
太阳男子
太阳男子 2021-02-05 11:31

I\'m really confused about passing strings from VBA to C++. Here\'s the VBA code:

Private Declare Sub passBSTRVal Lib \"         


        
相关标签:
5条回答
  • 2021-02-05 11:52

    Here are some old reference articles, it's worth reading because it explains the root causes of all our problems:

    • How to: Access DLLs in Excel
    • And two older VB ones (VBA is an updated VB runtime): Passing Strings to a DLL Procedure Declaring a DLL Procedure

    To sum up:

    • VBA internal storage is BSTR with unicode characters in it.
    • VBA also uses BSTR for talking with the external world, but you don't have to use BSTR if you don't want to because from C/C++, you may choose to use only the pointer part of the BSTR (a BSTR is a LPWSTR, an LPWSTR is not a BSTR).
    • The content of the BSTR that VBA uses to communicate outside its world is not unicode but ANSI (VBA is still living in the 90s and thinks that, regarding the String data type, the outside world is always ANSI, ASCIIZ, CodePage, etc. ). So, even if it still uses a BSTR, that BSTR contains the ANSI equivalent of the internal Unicode storage, modulo the current locale (BSTR is like an envelope that can contain anything, including ANSI, including zero characters anywhere, provided the length matches the data).

    So when you use use Declare with argument of type String, the final binary layout will always match C's ANSI 'char *' (or LPSTR in windows macro parlance). Officially, you're still supposed to use VARIANTs if you want to pass full unicode string over interop barriers (read the links for more on this).

    But, not all is lost, as VBA (not VB) has been a bit improved over the years, mainly to support Office 64-bit versions.

    The LongPtr data type has been introduced. It's a type that will be a signed 32 bit integer on a 32-bit system and a signed 64 bit integer on a 64-bit system.

    Note it's the exact equivalent of .NET's IntPtr (VBA also still thinks a Long is 32-bit and an Integer is 16-bit, while .NET uses Long for 64-bit and Int for 32-bit...).

    Now, LongPtr would be useless w/o the help of VB's all-time undocumented function StrPtr that takes a string and returns a LongPtr. It's undocumented because officially VB doesn't know what a pointer is (actually, be cautious as this can crash your program at runtime if not used properly).

    So, let's suppose this C code:

      STDAPI ToUpperLPWSTR(LPCWSTR in, LPWSTR out, int cch)
      {
        // unicode version
        LCMapStringW(LOCALE_USER_DEFAULT, LCMAP_LINGUISTIC_CASING | LCMAP_UPPERCASE, in, lstrlenW(in), out, cch);
        return S_OK;
      }
    
      STDAPI ToUpperBSTR(BSTR in, BSTR out, int cch)
      {
        // unicode version
        // note the usage SysStringLen here. I can do it because it's a BSTR
        // and it's slightly faster than calling lstrlen...
        LCMapStringW(LOCALE_USER_DEFAULT, LCMAP_LINGUISTIC_CASING | LCMAP_UPPERCASE, in, SysStringLen(in), out, cch);
        return S_OK;
      }
    
      STDAPI ToUpperLPSTR(LPCSTR in, LPSTR out, int cch)
      {
        // ansi version
        LCMapStringA(LOCALE_USER_DEFAULT, LCMAP_LINGUISTIC_CASING | LCMAP_UPPERCASE, in, lstrlenA(in), out, cch);
        return S_OK;
      }
    

    Then you can call it with these VBA declares (note this code is 32 and 64-bit compatible):

      Private Declare PtrSafe Function ToUpperLPWSTR Lib "foo.dll" (ByVal ins As LongPtr, ByVal out As LongPtr, ByVal cch As Long) As Long
      Private Declare PtrSafe Function ToUpperBSTR Lib "foo.dll" (ByVal ins As LongPtr, ByVal out As LongPtr, ByVal cch As Long) As Long
      Private Declare PtrSafe Function ToUpperLPSTR Lib "foo.dll" (ByVal ins As String, ByVal out As String, ByVal cch As Long) As Long
    
      Sub Button1_Click()
    
          Dim result As String
          result = String(256, 0)
    
          // note I use a special character 'é' to make sure it works
          // I can't use any unicode character because VBA's IDE has not been updated and does not suppport the
          // whole unicode range (internally it does, but you'll have to store the texts elsewhere, and load it as an opaque thing w/o the IDE involved)
    
          ToUpperLPWSTR StrPtr("héllo world"), StrPtr(result), 256
          MsgBox result
          ToUpperBSTR StrPtr("héllo world"), StrPtr(result), 256
          MsgBox result
          ToUpperLPSTR "héllo world", result, 256
          MsgBox result
      End Sub
    

    They all work, however

    • the ToUpperLPSTR is an ANSI fonction, so it will not support the unicode range that most people use nowadays. It works for me because the special non ASCII 'é' character coded in the IDE will find a correspondance when I run it in my machine with my ANSI codepage. But it may not work depending on where it runs. With unicode, you don't have that kind of problems.
    • the ToUpperBSTR is kinda specialized for VBA (COM automation) clients. If this function is called from a C/C++ client, the C/C++ coder will have to create a BSTR to use it, so it may look funny and add more work. Note however it will support strings that contains the zero character in it, thanks to the way BSTR work. It can sometimes be useful for example to pass array of bytes or special strings.
    0 讨论(0)
  • 2021-02-05 11:52

    Ok, so I know I set bounty to draw out a fuller response on the IDL idea but I have had a go at this myself. So I opened a ATL project, changed the idl to the following

    // IDLForModules.idl : IDL source for IDLForModules
    //
    
    // This file will be processed by the MIDL tool to
    // produce the type library (IDLForModules.tlb) and marshalling code.
    
    import "oaidl.idl";
    import "ocidl.idl";
    
    [
        helpstring("Idl For Modules"),
        uuid(EA8C8803-2E90-45B1-8B87-2674A9E41DF1),
        version(1.0),
    ]
    library IDLForModulesLib
    {
        importlib("stdole2.tlb");
    
        [
            /* dllname attribute https://msdn.microsoft.com/en-us/library/windows/desktop/aa367099(v=vs.85).aspx */
            dllname("IdlForModules.dll"),
            uuid(4C1884B3-9C24-4B4E-BDF8-C6B2E0D8B695)
        ]
        module Math{
            /* entry attribute https://msdn.microsoft.com/en-us/library/windows/desktop/aa366815(v=vs.85).aspx */
            [entry(656)] /* map function by entry point ordinal */
            Long _stdcall Abs([in] Long Number);
        }
        module Strings{
            [entry("pUpper")] /* map function by entry point name */
            BSTR _stdcall Upper([in] BSTR Number);
        }
    };
    

    Then in the main cpp file I added

    #include <string>
    #include <algorithm>
    
    INT32 __stdcall _MyAbs(INT32 Number) {
        return abs(Number);
    }
    
    BSTR __stdcall pUpper(BSTR sBstr)
    {
        // Get the BSTR into the wonderful world of std::wstrings immediately
        std::wstring sStd(sBstr);
    
        // Do some "Mordern C++" iterator style op on the string
        std::transform(sStd.begin(), sStd.end(), sStd.begin(), ::toupper);
    
        // Dig out the char* and pass to create a return BSTR
        return SysAllocString(sStd.c_str());
    }
    

    And in the DEF file I edited it to be

    ; MidlForModules.def : Declares the module parameters.
    
    LIBRARY
    
    EXPORTS
        DllCanUnloadNow     PRIVATE
        DllGetClassObject   PRIVATE
        DllRegisterServer   PRIVATE
        DllUnregisterServer PRIVATE
        DllInstall      PRIVATE
        _MyAbs @656
        pUpper
    

    In a macro-enmabled workbook called TestClient.xlsm placed in same directory as the Debug output Dll I write the following in the ThisWorkbook module

    Option Explicit
    
    Private Declare Function LoadLibrary Lib "kernel32" Alias "LoadLibraryA" (ByVal lpLibFileName As String) As Long
    
    Private Sub Workbook_Open()
        '* next line establishes relative position of Dll
        Debug.Assert Dir(ThisWorkbook.Path & "\IDLForModules.dll") = "IDLForModules.dll"
    
        '* next line loads the Dll so we can avoid very long Lib "c:\foo\bar\baz\barry.dll"
        LoadLibrary ThisWorkbook.Path & "\IDLForModules.dll"
    
        '* next go to  Tools References are check "Idl For Modules"
        '* "Idl For Modules" Iis set in the IDL with helpstring("Idl For Modules")
    
    End Sub
    

    I then add a Tools Reference to the newly created Type library and now I can complete by adding a standard module and adding the following

    Option Explicit
    
    Sub TestAbs()
        Debug.Print IDLForModulesLib.Math.Abs(-5)
    End Sub
    
    Sub TestUpper()
        Debug.Print IDLForModulesLib.Strings.Upper("foobar")
    End Sub
    

    This works for me on Windows 8.1 Professional 64 bit, VS2013, Excel 15. Fuller instructions for C++ newbies can found here Throw away Declare Function with IDL for Modules.

    0 讨论(0)
  • 2021-02-05 11:52

    This is just an illustration to Simon's answer. It shows how to call a native DLL that has parameters of type LPWSTR. As a simple example, I use GetWindowsDirectoryW. As Simon pointed out, always use the "W" version of native DLLs.

    Declare PtrSafe Function GetWindowsDirectoryW Lib "kernel32" _ 
       (ByVal lpBuffer As LongPtr, ByVal nSize As Long) As Long
    
    Sub TestGetWindowsDirectoryW()
      Dim WindowsDir As String
      WindowsDir = Space$(256)
      GetWindowsDirectoryW StrPtr(WindowsDir), 256
      MsgBox WindowsDir
    End Sub
    
    0 讨论(0)
  • 2021-02-05 11:58

    BIG HUGE NOTE: I'm not a programmer, I just really enjoy programming, so please be kind to me. I want to improve, so suggestions and comments from people more skilled than me (basically, everyone) are VERY welcomed!

    Ben, if you're reading this, I think you opened my eyes to what's happening. MIDL sounds like the proper way of doing this, and I intend on learning it, but this seemed like a good learning opportunity, and I never let those pass me by!

    I think what's happening is that narrow characters are getting marshalled into a wide character storage. For example, the string "hello" stored with narrow characters looks like:

    |h |e |l |l |o |\0 |
    

    and stored with wide characters, looks like:

    |h   |e   |l   |l   |o   |\0   |
    

    But when you pass a string from VBA to C++, something really strange happens. You get narrow characters marshalled into a wide character, like this:

    |h e |l l |o \0 |    |    |    |
    

    This is why using LPCSTR / LPCSTR* works. Yes, BSTR uses a string of wchar_t, but this marshalling makes it look like a string of char. Accessing with char* alternately points to the first and second characters in each half of the wchar_t (h, then e. l, then l. o, then \0). Even though the pointer arithmetic for char* and wchar_t* is different, it works because of the funny way the characters are marshalled. In fact, we're passed a pointer to the data string, but if you wanted to access the length of the BSTR, 4 bytes before the data string, you can play games with pointer arithmetic to get where you want to go. Assuming the BSTR is passed in as LPCSTR s,

    char* ptrToChar;      // 1 byte
    wchar_t* ptrToWChar;  // 2 bytes
    int* ptrToInt;        // 4 bytes
    size_t strlen;
    
    ptrToChar = (char *) s;
    strlen = ptrToChar[-4];
    
    ptrToWChar = (wchar_t *) s;
    strlen = ptrToWChar[-2];
    
    ptrToInt = (int *) s;
    strlen = ptrToInt[-1];
    

    Of course, if the string got passed in as LPCSTR* s, then of course you need to dereference s first by accessing via something like:

    ptrToChar = (char *)(*s);
    

    and so on.

    If one wants to use LPCWSTR or BSTR to receive the VBA string, you have to dance around this marshalling. So for example, to create a C++ DLL that converts a VBA string to uppercase, I did the following:

    BSTR __stdcall pUpper( LPCWSTR* s )
    {
        // Get String Length (see previous discussion)
        int strlen = (*s)[-2];
    
        // Allocate space for the new string (+1 for the NUL character).
        char *dest = new char[strlen + 1];
    
        // Accessing the *LPCWSTR s using a (char *) changes what we mean by ptr arithmetic,
        // e.g. p[1] hops forward 1 byte.  s[1] hops forward 2 bytes.
        char *p = (char *)(*s);
    
        // Copy the string data
        for( int i = 0; i < strlen; ++i )
            dest[i] = toupper(p[i]);
    
        // And we're done!
        dest[strlen] = '\0';
    
        // Create a new BSTR using our mallocated string.
        BSTR bstr = SysAllocStringByteLen(dest, strlen);
    
        // dest needs to be garbage collected by us.  COM will take care of bstr.
        delete dest;
        return bstr;
    }
    

    As far as I can tell, receiving the BSTR as a BSTR is equivalent to receiving it as a LPCWSTR, and receiving it as a BSTR* is equivalent to receiving it as a LPCWSTR*.

    OK, I am 100% certain there are a ton of mistakes here, but I believe the underlying ideas are correct. If there are mistakes or even better ways of thinking of something, I will gladly accept corrections / explanations, and fix them for Google, posterity, and future programmers.

    It sounds like the BEST way to do this is with Ben's MIDL suggestion (and maybe MIDL will make Safearrays and Variants less complicated?), and after I hit enter, I'm going to start learning that method. But this method works too and was an excellent learning opportunity for me.

    0 讨论(0)
  • 2021-02-05 12:09

    This form of external function call exists to be compatible with earlier versions of Visual Basic, and inherits their semantics. In particular, VB3 ran on 16-bit windows and dealt only with ANSI (i.e. MBCS) strings.

    The Declare syntax has the same restriction. VBA converts your string on the assumption that it is converting it from UTF-16 to ASCII. This allows code written in VB3 to work unchanged in VB4, VB5 and VB6.

    So for example "AZ" begins as \u0041\u005A, is converted to ANSI and becomes \x41\x5A which is reinterpreted as \u5A41 which is "婁".

    (With VB4, Microsoft merged WordBasic, Excel Basic and Visual basic into a single language, VBA.)

    The "new" way to call functions from VBA, is to create a type library for the external functions you need to use, using MIDL, and add it as a reference to the project. Type libraries can describe the exact signature of the function, (e.g. BSTR, LPCSTR, LPCWSTR, [out]BSTR*, etc.) In particular it is not necessary to wrap the functions in a COM object to call them from VBA (though it is if you wish to call them from VBScript).

    • A set of DLL functions is described as a MIDL module: https://msdn.microsoft.com/en-us/library/windows/desktop/aa367099(v=vs.85).aspx

    Alternatively you can't be bothered to fire up midl for a single function, you can use the VarPtr/StrPtr/CopyMemory hack. This is pretty much equivalent to PEEK and POKE.

    0 讨论(0)
提交回复
热议问题