I'm really confused about passing strings from VBA to C++. Here's the VBA code:
Private Declare Sub passBSTRVal Lib "
BIG HUGE NOTE: I'm not a programmer, I just really enjoy programming, so please be kind to me. I want to improve, so suggestions and comments from people more skilled than me (basically, everyone) are VERY welcome!
Ben, if you're reading this, I think you opened my eyes to what's happening. MIDL sounds like the proper way of doing this, and I intend to learn it, but this seemed like a good learning opportunity, and I never let those pass me by!
I think what's happening is that narrow characters are getting marshalled into wide-character storage. For example, the string "hello" stored with narrow characters (one byte each) looks like:
|h |e |l |l |o |\0|
and stored with wide characters (two bytes each), looks like:
|h   |e   |l   |l   |o   |\0  |
But when you pass a string from VBA to C++, something really strange happens. You get narrow characters packed two at a time into each wide character, like this:
|h e |l l |o \0|    |    |    |
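To make that last picture concrete, here is a minimal portable sketch of the packing. It is not VBA's actual marshalling code; `packNarrowIntoWide` is a hypothetical helper of mine, and it assumes little-endian byte order (low byte first), which is what Windows on x86 uses.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Pack narrow (1-byte) characters two per 16-bit unit, low byte first,
// mimicking how the bytes would sit in little-endian wchar_t storage.
// The terminating NUL is included; an odd tail is padded with a zero byte.
std::vector<std::uint16_t> packNarrowIntoWide(const std::string& narrow)
{
    std::string bytes = narrow;
    bytes.push_back('\0');                   // the terminator travels with the data
    if (bytes.size() % 2 != 0)
        bytes.push_back('\0');               // pad to a whole 16-bit unit

    std::vector<std::uint16_t> units;
    for (std::size_t i = 0; i < bytes.size(); i += 2) {
        const std::uint16_t lo = static_cast<unsigned char>(bytes[i]);
        const std::uint16_t hi = static_cast<unsigned char>(bytes[i + 1]);
        units.push_back(static_cast<std::uint16_t>(lo | (hi << 8)));
    }
    return units;
}
```

For "hello" this yields three 16-bit units holding (h, e), (l, l) and (o, \0), so code that reads the buffer as wchar_t sees three odd-looking characters rather than five letters.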
This is why using LPCSTR / LPCSTR* works. Yes, a BSTR is nominally a string of wchar_t, but this marshalling makes it look like a string of char. Stepping through it with a char* visits the two halves of each wchar_t in turn (h, then e; l, then l; o, then \0). Even though the pointer arithmetic for char* and wchar_t* is different, it works because of the funny way the characters are marshalled. We're passed a pointer to the character data, and the length of the BSTR (a byte count) is stored in the 4 bytes just before that data, so you can play games with pointer arithmetic to get where you want to go. Assuming the BSTR is passed in as LPCSTR s,
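char* ptrToChar;     // elements are 1 byte
wchar_t* ptrToWChar; // elements are 2 bytes (on Windows)
int* ptrToInt;       // elements are 4 bytes
size_t len;          // not named strlen, which would shadow the C library function

ptrToChar = (char *) s;
len = (unsigned char) ptrToChar[-4]; // reads only the lowest byte: right only for lengths < 256
ptrToWChar = (wchar_t *) s;
len = ptrToWChar[-2];                // reads only the low 16 bits: right only for lengths < 65536
ptrToInt = (int *) s;
len = ptrToInt[-1];                  // reads the whole 4-byte length prefix: use this one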
Of course, if the string got passed in as LPCSTR* s, then you need to dereference s first by accessing it via something like:
ptrToChar = (char *)(*s);
and so on.
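Here is a small portable sketch of why the width of the read matters. It does not touch a real BSTR. `makeFakeBstr`, `fullLength` and `lowByteOnly` are hypothetical helpers that simulate the layout described above (a 4-byte little-endian byte count, then the data, then a NUL), so it can run anywhere.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Build a stand-in for the BSTR memory layout: a 4-byte little-endian
// byte count, the character data, then a NUL. This is only a simulation;
// a real BSTR must come from the SysAlloc* family of functions.
std::vector<unsigned char> makeFakeBstr(const std::string& data)
{
    const std::uint32_t len = static_cast<std::uint32_t>(data.size());
    std::vector<unsigned char> buf;
    for (int shift = 0; shift < 32; shift += 8)
        buf.push_back(static_cast<unsigned char>((len >> shift) & 0xFF));
    buf.insert(buf.end(), data.begin(), data.end());
    buf.push_back('\0');
    return buf;
}

// Read the whole 4-byte prefix that sits just before the character data.
std::uint32_t fullLength(const unsigned char* dataPtr)
{
    std::uint32_t len = 0;
    for (int i = 0; i < 4; ++i)
        len |= static_cast<std::uint32_t>(dataPtr[i - 4]) << (8 * i);
    return len;
}

// Read only the byte at offset -4, as the char* access does.
std::uint32_t lowByteOnly(const unsigned char* dataPtr)
{
    return dataPtr[-4];
}
```

With a 5-byte string both functions agree, but with a 300-byte string `fullLength` gives 300 while `lowByteOnly` gives 44 (300 modulo 256). In real Windows code, I believe `SysStringByteLen` is the supported way to get this byte count without any pointer games.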
If one wants to use LPCWSTR or BSTR to receive the VBA string, you have to dance around this marshalling. So for example, to create a C++ DLL that converts a VBA string to uppercase, I did the following:
#include <windows.h>
#include <oleauto.h>   // SysAllocStringByteLen
#include <cctype>      // toupper

BSTR __stdcall pUpper( LPCWSTR* s )
{
    // Get the string length in bytes (see previous discussion). Read the whole
    // 4-byte prefix; (*s)[-2] would only pick up its low 16 bits.
    int len = ((const int *)(*s))[-1];
    // Allocate space for the new string (+1 for the NUL character).
    char *dest = new char[len + 1];
    // Accessing *s through a (char *) changes what we mean by pointer arithmetic,
    // e.g. p[1] hops forward 1 byte. (*s)[1] hops forward 2 bytes.
    const char *p = (const char *)(*s);
    // Copy the string data, uppercasing each byte. The unsigned char cast
    // keeps toupper well defined for bytes above 127.
    for( int i = 0; i < len; ++i )
        dest[i] = (char) toupper((unsigned char) p[i]);
    // And we're done!
    dest[len] = '\0';
    // Create a new BSTR from our heap-allocated buffer.
    BSTR bstr = SysAllocStringByteLen(dest, len);
    // dest came from new[], so we free it ourselves with delete[]. The caller (VBA) frees bstr.
    delete[] dest;
    return bstr;
}
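As a possible alternative for the copy-and-uppercase step, here is a minimal portable sketch that lets std::string own the temporary buffer, so there is no new[] / delete[] pair to get wrong. `upperBytes` is a hypothetical helper name, not part of any Windows API, and it only covers the byte manipulation, not the BSTR allocation.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Uppercase each byte of a narrow string of known length. The cast to
// unsigned char matters: passing a negative char value to std::toupper
// is undefined behaviour.
std::string upperBytes(const char* data, std::size_t byteLen)
{
    std::string out(data, byteLen);          // copies exactly byteLen bytes
    for (char& c : out)
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    return out;
}
```

Inside the DLL, the result could then be handed over with something like SysAllocStringByteLen(out.data(), (UINT) out.size()), and the std::string cleans up after itself when the function returns.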
As far as I can tell, receiving the BSTR as a BSTR is equivalent to receiving it as an LPCWSTR, and receiving it as a BSTR* is equivalent to receiving it as an LPCWSTR*. That makes sense, since BSTR is just a typedef for a pointer to wide characters.
OK, I am 100% certain there are a ton of mistakes here, but I believe the underlying ideas are correct. If there are mistakes or even better ways of thinking of something, I will gladly accept corrections / explanations, and fix them for Google, posterity, and future programmers.
It sounds like the BEST way to do this is with Ben's MIDL suggestion (and maybe MIDL will make Safearrays and Variants less complicated?), and after I hit enter, I'm going to start learning that method. But this method works too and was an excellent learning opportunity for me.