If I run my C++ application with the following main() method everything is OK:
int main(int argc, char *argv[])
{
cout << "There are " << a
OK, the question seems to have been answered fairly well: the UNICODE overload should take a wide character array as its second parameter. So if the command line parameter is "Hello", it would probably end up in memory as "H\0e\0l\0l\0o\0\0\0", and your program would only print the 'H' before it sees what it thinks is a null terminator.
So now you may wonder why it even compiles and links.
Well, it compiles because you are allowed to define an overload of a function.
Linking is a slightly more complex issue. In C, there is no decorated symbol information, so the linker just finds a function called main. The argc and argv arguments are probably always placed on the call stack just in case, even if your function is defined without parameters or happens to ignore them.
Even though C++ does have decorated symbols, it almost certainly uses C linkage for main, rather than a clever linker that looks for each signature in turn. So it found your wmain and put the parameters onto the call stack in case it is the int wmain(int, wchar_t*[]) version.
_tmain is a macro that gets redefined depending on whether you compile with Unicode or ASCII. It is a Microsoft extension and isn't guaranteed to work with any other compiler.
The correct declaration is
int _tmain(int argc, _TCHAR *argv[])
If the macro _UNICODE is defined (Visual Studio's Unicode character set option defines it together with UNICODE), that expands to
int wmain(int argc, wchar_t *argv[])
Otherwise it expands to
int main(int argc, char *argv[])
Your definition goes for a bit of each, and (if you have UNICODE defined) will expand to
int wmain(int argc, char *argv[])
which is just plain wrong.
std::cout works with narrow (char) characters. You need std::wcout if you are using wide characters.
Try something like this:
#include <iostream>
#include <tchar.h>

// <tchar.h> switches _tmain, _TCHAR and _T on _UNICODE, so pick the stream the same way.
#if defined(_UNICODE)
#define _tcout std::wcout
#else
#define _tcout std::cout
#endif

int _tmain(int argc, _TCHAR *argv[])
{
    _tcout << _T("There are ") << argc << _T(" arguments:") << std::endl;
    // Loop through each argument and print its number and value
    for (int i = 0; i < argc; i++)
        _tcout << i << _T(" ") << argv[i] << std::endl;
    return 0;
}
Or you could just decide in advance whether to use wide or narrow characters. :-)
Updated 12 Nov 2013:
Changed the traditional "TCHAR" to "_TCHAR" which seems to be the latest fashion. Both work fine.
End Update
The _T convention is used to indicate that the program should use the character set defined for the application (Unicode, ASCII, MBCS, etc.). You can surround your strings with _T( ) to have them stored in the correct format.
cout << _T( "There are " ) << argc << _T( " arguments:" ) << endl;
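To see what the _T( ) wrapper does, here is a minimal portable sketch of the mechanism. It does not use the real <tchar.h>: DEMO_UNICODE, demo_tchar, DEMO_T, demo_tstring, greeting and check_greeting are made-up names for illustration, standing in for _UNICODE, _TCHAR and _T.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for _UNICODE, _TCHAR and _T(); not the real <tchar.h> names.
// Define DEMO_UNICODE before this point to select the wide variants.
#if defined(DEMO_UNICODE)
typedef wchar_t demo_tchar;
#define DEMO_T(text) L##text   // pastes an L prefix onto the literal
#else
typedef char demo_tchar;
#define DEMO_T(text) text      // leaves the literal untouched
#endif

// A string type that follows whichever character type was selected.
typedef std::basic_string<demo_tchar> demo_tstring;

demo_tstring greeting()
{
    return DEMO_T("There are ");
}

// Holds in both configurations: the literal and the string type always agree.
void check_greeting()
{
    assert(greeting().size() == 10);
    assert(greeting()[0] == DEMO_T('T'));
}
```

Calling check_greeting() from main passes with or without DEMO_UNICODE defined, which is the point of the convention: the source stays the same and only the macro changes.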
_tmain does not exist in C++. main does.
_tmain is a Microsoft extension.
main is, according to the C++ standard, the program's entry point.
It has one of these two signatures:
int main();
int main(int argc, char* argv[]);
Microsoft has added a wmain which replaces the second signature with this:
int wmain(int argc, wchar_t* argv[]);
And then, to make it easier to switch between Unicode (UTF-16) and their multibyte character set, they've defined _tmain which, if Unicode is enabled, is compiled as wmain, and otherwise as main.
As for the second part of your question, the first part of the puzzle is that your main function is wrong. wmain should take a wchar_t argument, not char. Since the compiler doesn't enforce this for the main function, you get a program where an array of wchar_t strings is passed to the main function, which interprets them as char strings.
Now, in UTF-16, the character set used by Windows when Unicode is enabled, all the ASCII characters are represented as the pair of bytes \0 followed by the ASCII value.
And since the x86 CPU is little-endian, the order of these bytes is swapped, so that the ASCII value comes first, followed by a null byte.
And in a char string, how is the string usually terminated? Yep, by a null byte. So your program sees a bunch of strings, each one byte long.
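The effect can be reproduced portably. This is a minimal sketch, not the asker's program: it uses char16_t as a stand-in for the 16-bit wchar_t used on Windows, and is_little_endian, narrow_view and check_hello are hypothetical helper names. narrow_view copies the raw UTF-16 bytes and reads them back the way a char string would be read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// True when the low byte of a 16-bit value is stored first (x86, most ARM setups).
bool is_little_endian()
{
    const char16_t probe = 1;
    unsigned char first_byte;
    std::memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

// Hypothetical helper: copy the raw bytes of a UTF-16 string (terminator included)
// and read them back as a null-terminated char string.
std::string narrow_view(const std::u16string& wide)
{
    const std::size_t byte_count = (wide.size() + 1) * sizeof(char16_t);
    std::vector<char> bytes(byte_count);
    std::memcpy(bytes.data(), wide.c_str(), byte_count);
    return std::string(bytes.data()); // stops at the first zero byte
}

void check_hello()
{
    // Little-endian: only the 'H' survives. Big-endian: the first byte is already zero.
    assert(narrow_view(u"Hello") == (is_little_endian() ? "H" : ""));
}
```

Calling check_hello() from main passes on either byte order, which matches the one-character strings described above for x86.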
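In general, you have three options when doing Windows programming:
- Explicitly use Unicode: use wmain, and for every Windows API function that takes char-related arguments, call the -W version of the function (instead of CreateWindow, call CreateWindowW). And instead of using char use wchar_t, and so on.
- Explicitly disable Unicode: use main and CreateWindowA, and use char for strings.
- Allow both: use _tmain and CreateWindow, which resolve to main/wmain and CreateWindowA/CreateWindowW, and use TCHAR instead of char/wchar_t.
The same applies to the string types defined by windows.h: LPCTSTR resolves to either LPCSTR or LPCWSTR, and for every other type that includes char or wchar_t, a -T- version always exists which can be used instead.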
Note that all of this is Microsoft-specific. TCHAR is not a standard C++ type; it is defined in windows.h. wmain and _tmain are also defined by Microsoft only.
With a little effort of templatizing this, it would work with any list of objects.
#include <iostream>
#include <string>
#include <vector>

// Returns the first character that occurs only once, or -1 if every character repeats.
char non_repeating_char(std::string str){
    while(str.size() >= 2){
        // Collect the positions of every later duplicate of the first character.
        std::vector<size_t> rmlist;
        for(size_t i = 1; i < str.size(); i++){
            if(str[0] == str[i]) {
                rmlist.push_back(i);
            }
        }
        if(rmlist.size()){
            size_t s = 0; // Needed for iterator position adjustment
            str.erase(str.begin() + 0);
            ++s;
            for (size_t j : rmlist){
                str.erase(str.begin() + (j-s));
                ++s;
            }
            continue;
        }
        return str[0];
    }
    if(str.size() == 1) return str[0];
    else return -1;
}

int main(int argc, char ** args)
{
    std::string test = "FabaccdbefafFG";
    if (argc > 1) test = args[1]; // Use the command-line argument when one is given
    char non_repeating = non_repeating_char(test);
    std::cout << non_repeating << '\n';
}
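A minimal sketch of what such a templatized version might look like. It swaps the erase loop for a simpler quadratic count, so it is a different technique with the same result; first_non_repeating and check_first_non_repeating are names chosen here for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Returns an iterator to the first element that occurs exactly once in [first, last),
// or last when every element repeats. Quadratic time; elements only need operator==.
template <typename ForwardIt>
ForwardIt first_non_repeating(ForwardIt first, ForwardIt last)
{
    for (ForwardIt it = first; it != last; ++it) {
        if (std::count(first, last, *it) == 1) {
            return it;
        }
    }
    return last;
}

// The same function applied to two different containers.
void check_first_non_repeating()
{
    const std::string text = "FabaccdbefafFG";
    assert(*first_non_repeating(text.begin(), text.end()) == 'd');

    const std::vector<int> numbers = {1, 2, 1, 3, 2};
    assert(*first_non_repeating(numbers.begin(), numbers.end()) == 3);
}
```

Returning an iterator instead of -1 avoids reserving a sentinel value, so callers compare the result against the end iterator before dereferencing it.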