I\'m really confused about passing strings from VBA to C++. Here\'s the VBA code:
Private Declare Sub passBSTRVal Lib \"
Here are some old reference articles, it's worth reading because it explains the root causes of all our problems:
To sum up:
String
data type, the outside world is always ANSI, ASCIIZ, CodePage, etc. ). So, even if it still uses a BSTR, that BSTR contains the ANSI equivalent of the internal Unicode storage, modulo the current locale (BSTR is like an envelope that can contain anything, including ANSI, including zero characters anywhere, provided the length matches the data).So when you use use Declare
with argument of type String
, the final binary layout will always match C's ANSI 'char *' (or LPSTR in windows macro parlance). Officially, you're still supposed to use VARIANTs if you want to pass full unicode string over interop barriers (read the links for more on this).
But, not all is lost, as VBA (not VB) has been a bit improved over the years, mainly to support Office 64-bit versions.
The LongPtr data type has been introduced. It's a type that will be a signed 32 bit integer on a 32-bit system and a signed 64 bit integer on a 64-bit system.
Note it's the exact equivalent of .NET's IntPtr (VBA also still thinks a Long is 32-bit and an Integer is 16-bit, while .NET uses Long for 64-bit and Int for 32-bit...).
Now, LongPtr
would be useless w/o the help of VB's all-time undocumented function StrPtr
that takes a string and returns a LongPtr
. It's undocumented because officially VB doesn't know what a pointer is (actually, be cautious as this can crash your program at runtime if not used properly).
So, let's suppose this C code:
STDAPI ToUpperLPWSTR(LPCWSTR in, LPWSTR out, int cch)
{
// unicode version
LCMapStringW(LOCALE_USER_DEFAULT, LCMAP_LINGUISTIC_CASING | LCMAP_UPPERCASE, in, lstrlenW(in), out, cch);
return S_OK;
}
STDAPI ToUpperBSTR(BSTR in, BSTR out, int cch)
{
// unicode version
// note the usage SysStringLen here. I can do it because it's a BSTR
// and it's slightly faster than calling lstrlen...
LCMapStringW(LOCALE_USER_DEFAULT, LCMAP_LINGUISTIC_CASING | LCMAP_UPPERCASE, in, SysStringLen(in), out, cch);
return S_OK;
}
STDAPI ToUpperLPSTR(LPCSTR in, LPSTR out, int cch)
{
// ansi version
LCMapStringA(LOCALE_USER_DEFAULT, LCMAP_LINGUISTIC_CASING | LCMAP_UPPERCASE, in, lstrlenA(in), out, cch);
return S_OK;
}
Then you can call it with these VBA declares (note this code is 32 and 64-bit compatible):
Private Declare PtrSafe Function ToUpperLPWSTR Lib "foo.dll" (ByVal ins As LongPtr, ByVal out As LongPtr, ByVal cch As Long) As Long
Private Declare PtrSafe Function ToUpperBSTR Lib "foo.dll" (ByVal ins As LongPtr, ByVal out As LongPtr, ByVal cch As Long) As Long
Private Declare PtrSafe Function ToUpperLPSTR Lib "foo.dll" (ByVal ins As String, ByVal out As String, ByVal cch As Long) As Long
Sub Button1_Click()
Dim result As String
result = String(256, 0)
// note I use a special character 'é' to make sure it works
// I can't use any unicode character because VBA's IDE has not been updated and does not suppport the
// whole unicode range (internally it does, but you'll have to store the texts elsewhere, and load it as an opaque thing w/o the IDE involved)
ToUpperLPWSTR StrPtr("héllo world"), StrPtr(result), 256
MsgBox result
ToUpperBSTR StrPtr("héllo world"), StrPtr(result), 256
MsgBox result
ToUpperLPSTR "héllo world", result, 256
MsgBox result
End Sub
They all work, however
Ok, so I know I set bounty to draw out a fuller response on the IDL idea but I have had a go at this myself. So I opened a ATL project, changed the idl to the following
// IDLForModules.idl : IDL source for IDLForModules
//
// This file will be processed by the MIDL tool to
// produce the type library (IDLForModules.tlb) and marshalling code.
import "oaidl.idl";
import "ocidl.idl";
[
helpstring("Idl For Modules"),
uuid(EA8C8803-2E90-45B1-8B87-2674A9E41DF1),
version(1.0),
]
library IDLForModulesLib
{
importlib("stdole2.tlb");
[
/* dllname attribute https://msdn.microsoft.com/en-us/library/windows/desktop/aa367099(v=vs.85).aspx */
dllname("IdlForModules.dll"),
uuid(4C1884B3-9C24-4B4E-BDF8-C6B2E0D8B695)
]
module Math{
/* entry attribute https://msdn.microsoft.com/en-us/library/windows/desktop/aa366815(v=vs.85).aspx */
[entry(656)] /* map function by entry point ordinal */
Long _stdcall Abs([in] Long Number);
}
module Strings{
[entry("pUpper")] /* map function by entry point name */
BSTR _stdcall Upper([in] BSTR Number);
}
};
Then in the main cpp file I added
#include <string>
#include <algorithm>
INT32 __stdcall _MyAbs(INT32 Number) {
return abs(Number);
}
BSTR __stdcall pUpper(BSTR sBstr)
{
// Get the BSTR into the wonderful world of std::wstrings immediately
std::wstring sStd(sBstr);
// Do some "Mordern C++" iterator style op on the string
std::transform(sStd.begin(), sStd.end(), sStd.begin(), ::toupper);
// Dig out the char* and pass to create a return BSTR
return SysAllocString(sStd.c_str());
}
And in the DEF file I edited it to be
; MidlForModules.def : Declares the module parameters.
LIBRARY
EXPORTS
DllCanUnloadNow PRIVATE
DllGetClassObject PRIVATE
DllRegisterServer PRIVATE
DllUnregisterServer PRIVATE
DllInstall PRIVATE
_MyAbs @656
pUpper
In a macro-enmabled workbook called TestClient.xlsm placed in same directory as the Debug output Dll I write the following in the ThisWorkbook module
Option Explicit
Private Declare Function LoadLibrary Lib "kernel32" Alias "LoadLibraryA" (ByVal lpLibFileName As String) As Long
Private Sub Workbook_Open()
'* next line establishes relative position of Dll
Debug.Assert Dir(ThisWorkbook.Path & "\IDLForModules.dll") = "IDLForModules.dll"
'* next line loads the Dll so we can avoid very long Lib "c:\foo\bar\baz\barry.dll"
LoadLibrary ThisWorkbook.Path & "\IDLForModules.dll"
'* next go to Tools References are check "Idl For Modules"
'* "Idl For Modules" Iis set in the IDL with helpstring("Idl For Modules")
End Sub
I then add a Tools Reference to the newly created Type library and now I can complete by adding a standard module and adding the following
Option Explicit
Sub TestAbs()
Debug.Print IDLForModulesLib.Math.Abs(-5)
End Sub
Sub TestUpper()
Debug.Print IDLForModulesLib.Strings.Upper("foobar")
End Sub
This works for me on Windows 8.1 Professional 64 bit, VS2013, Excel 15. Fuller instructions for C++ newbies can found here Throw away Declare Function with IDL for Modules.
This is just an illustration to Simon's answer. It shows how to call a native DLL that has parameters of type LPWSTR
. As a simple example, I use GetWindowsDirectoryW
. As Simon pointed out, always use the "W" version of native DLLs.
Declare PtrSafe Function GetWindowsDirectoryW Lib "kernel32" _
(ByVal lpBuffer As LongPtr, ByVal nSize As Long) As Long
Sub TestGetWindowsDirectoryW()
Dim WindowsDir As String
WindowsDir = Space$(256)
GetWindowsDirectoryW StrPtr(WindowsDir), 256
MsgBox WindowsDir
End Sub
BIG HUGE NOTE: I'm not a programmer, I just really enjoy programming, so please be kind to me. I want to improve, so suggestions and comments from people more skilled than me (basically, everyone) are VERY welcomed!
Ben, if you're reading this, I think you opened my eyes to what's happening. MIDL sounds like the proper way of doing this, and I intend on learning it, but this seemed like a good learning opportunity, and I never let those pass me by!
I think what's happening is that narrow characters are getting marshalled into a wide character storage. For example, the string "hello" stored with narrow characters looks like:
|h |e |l |l |o |\0 |
and stored with wide characters, looks like:
|h |e |l |l |o |\0 |
But when you pass a string from VBA to C++, something really strange happens. You get narrow characters marshalled into a wide character, like this:
|h e |l l |o \0 | | | |
This is why using LPCSTR / LPCSTR* works. Yes, BSTR uses a string of wchar_t, but this marshalling makes it look like a string of char. Accessing with char* alternately points to the first and second characters in each half of the wchar_t (h, then e. l, then l. o, then \0). Even though the pointer arithmetic for char* and wchar_t* is different, it works because of the funny way the characters are marshalled. In fact, we're passed a pointer to the data string, but if you wanted to access the length of the BSTR, 4 bytes before the data string, you can play games with pointer arithmetic to get where you want to go. Assuming the BSTR is passed in as LPCSTR s,
char* ptrToChar; // 1 byte
wchar_t* ptrToWChar; // 2 bytes
int* ptrToInt; // 4 bytes
size_t strlen;
ptrToChar = (char *) s;
strlen = ptrToChar[-4];
ptrToWChar = (wchar_t *) s;
strlen = ptrToWChar[-2];
ptrToInt = (int *) s;
strlen = ptrToInt[-1];
Of course, if the string got passed in as LPCSTR* s, then of course you need to dereference s first by accessing via something like:
ptrToChar = (char *)(*s);
and so on.
If one wants to use LPCWSTR or BSTR to receive the VBA string, you have to dance around this marshalling. So for example, to create a C++ DLL that converts a VBA string to uppercase, I did the following:
BSTR __stdcall pUpper( LPCWSTR* s )
{
// Get String Length (see previous discussion)
int strlen = (*s)[-2];
// Allocate space for the new string (+1 for the NUL character).
char *dest = new char[strlen + 1];
// Accessing the *LPCWSTR s using a (char *) changes what we mean by ptr arithmetic,
// e.g. p[1] hops forward 1 byte. s[1] hops forward 2 bytes.
char *p = (char *)(*s);
// Copy the string data
for( int i = 0; i < strlen; ++i )
dest[i] = toupper(p[i]);
// And we're done!
dest[strlen] = '\0';
// Create a new BSTR using our mallocated string.
BSTR bstr = SysAllocStringByteLen(dest, strlen);
// dest needs to be garbage collected by us. COM will take care of bstr.
delete dest;
return bstr;
}
As far as I can tell, receiving the BSTR as a BSTR is equivalent to receiving it as a LPCWSTR, and receiving it as a BSTR* is equivalent to receiving it as a LPCWSTR*.
OK, I am 100% certain there are a ton of mistakes here, but I believe the underlying ideas are correct. If there are mistakes or even better ways of thinking of something, I will gladly accept corrections / explanations, and fix them for Google, posterity, and future programmers.
It sounds like the BEST way to do this is with Ben's MIDL suggestion (and maybe MIDL will make Safearrays and Variants less complicated?), and after I hit enter, I'm going to start learning that method. But this method works too and was an excellent learning opportunity for me.
This form of external function call exists to be compatible with earlier versions of Visual Basic, and inherits their semantics. In particular, VB3 ran on 16-bit windows and dealt only with ANSI (i.e. MBCS) strings.
The Declare
syntax has the same restriction. VBA converts your string on the assumption that it is converting it from UTF-16 to ASCII. This allows code written in VB3 to work unchanged in VB4, VB5 and VB6.
So for example "AZ" begins as \u0041\u005A
, is converted to ANSI and becomes \x41\x5A
which is reinterpreted as \u5A41
which is "婁".
(With VB4, Microsoft merged WordBasic, Excel Basic and Visual basic into a single language, VBA.)
The "new" way to call functions from VBA, is to create a type library for the external functions you need to use, using MIDL, and add it as a reference to the project. Type libraries can describe the exact signature of the function, (e.g. BSTR
, LPCSTR
, LPCWSTR
, [out]BSTR*
, etc.) In particular it is not necessary to wrap the functions in a COM object to call them from VBA (though it is if you wish to call them from VBScript).
module
: https://msdn.microsoft.com/en-us/library/windows/desktop/aa367099(v=vs.85).aspxAlternatively you can't be bothered to fire up midl
for a single function, you can use the VarPtr
/StrPtr
/CopyMemory
hack. This is pretty much equivalent to PEEK
and POKE
.