Question
I'm once again experimenting with the Java Native Interface, and I've run into another interesting problem. I'm passing a file path to C via JNI and then doing some I/O. The characters I most often have trouble with are 'äåö'. Here is a short demo program that shows exactly the same problem:
Java:
public class java {
    private static native void printBytes(String text);

    static {
        System.loadLibrary("dll");
    }

    public static void main(String[] args) {
        printBytes("C:/Users/ä-å-ö/Documents/Bla.txt");
    }
}
C:
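#include "java.h"
#include <jni.h>
#include <stdio.h>

JNIEXPORT void JNICALL Java_java_printBytes(JNIEnv *env, jclass class, jstring text){
    /* GetStringUTFChars returns the string as modified UTF-8, or NULL on failure. */
    const char *text_input = (*env)->GetStringUTFChars(env, text, NULL);
    /* Length in bytes of the modified UTF-8 form (not used further in this demo). */
    jsize size = (*env)->GetStringUTFLength(env, text);
    (void)size;
    if (text_input == NULL) {
        return;
    }
    printf("%s\n", text_input);
    (*env)->ReleaseStringUTFChars(env, text, text_input);
}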
Output: C:/Users/├ñ-├Ñ-├Â/Documents/Bla.txt
This is NOT my desired result; I would like it to print the same string as in Java.
Answer 1:
You are dealing with a platform-specific character encoding issue. printf with %s simply writes out the bytes it is given, so on a platform whose terminal expects UTF-8 (most non-Windows systems) I would expect your code to work. The string coming from Java is UTF-8 (strictly speaking JNI's "modified UTF-8", which is identical for these characters), and in UTF-8 each of 'äåö' takes two bytes. The Windows console does not treat those bytes as UTF-8. It interprets them using its active single-byte code page, so every byte is shown as a separate character. Your output is consistent with code page 850, where the two bytes C3 A4 that encode 'ä' display as '├ñ'. This works for ASCII characters because they have the same single-byte value in UTF-8 as in that code page. It does not work for characters outside ASCII.
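To make the difference between bytes and characters concrete, here is a minimal sketch in C. The helper name utf8_code_points is made up for this illustration; it is not part of JNI or the C runtime. It counts characters by skipping UTF-8 continuation bytes, so the result can be compared with the byte count.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: counts the Unicode code points in a NUL-terminated
   UTF-8 string. Continuation bytes have the bit pattern 10xxxxxx and are
   skipped, so only the first byte of each character is counted. */
size_t utf8_code_points(const char *s)
{
    size_t count = 0;

    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}
```

For "\xC3\xA4" (the UTF-8 encoding of 'ä') the string is two bytes long, while utf8_code_points reports one character. A console using a single-byte code page draws those two bytes as two separate glyphs.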
Basically, you need to either convert your string to wide characters in Java (text.getBytes(Charset.forName("UTF-16LE"))) and pass it to C as a byte array, or convert the multibyte string to wide characters in C after receiving it (MultiByteToWideChar(CP_UTF8, ...)). Then you can use printf("%S") or wprintf("%s") to output it; with the Microsoft C runtime, both of those format specifiers expect a wide string.
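The C-side conversion could look roughly like the sketch below. It is Windows-only, the helper name utf8_to_wide is hypothetical, and it assumes the input is standard UTF-8. That holds for characters such as 'äåö', but JNI's modified UTF-8 encodes embedded NULs and characters outside the Basic Multilingual Plane differently, and those are not handled here.

```c
#ifdef _WIN32
#include <windows.h>
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: converts a NUL-terminated UTF-8 string into a newly
   allocated, NUL-terminated wide (UTF-16) string.
   Returns NULL on failure. The caller releases the result with free(). */
wchar_t *utf8_to_wide(const char *utf8)
{
    int needed;
    wchar_t *wide;

    assert(utf8 != NULL);

    /* First call: a length of -1 means "up to and including the NUL", and
       an output size of 0 asks only for the required number of wchar_t. */
    needed = MultiByteToWideChar(CP_UTF8, 0, utf8, -1, NULL, 0);
    if (needed <= 0) {
        return NULL;
    }

    wide = malloc((size_t)needed * sizeof *wide);
    if (wide == NULL) {
        return NULL;
    }

    /* Second call: perform the actual conversion into the buffer. */
    if (MultiByteToWideChar(CP_UTF8, 0, utf8, -1, wide, needed) == 0) {
        free(wide);
        return NULL;
    }
    return wide;
}
#endif
```

In the native method, the pointer returned by GetStringUTFChars would be passed to this helper, and the result could then go to wprintf or to a wide-character file function such as _wfopen before being freed.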
See "Printing UTF-8 strings with printf - wide vs. multibyte string literals" for more information. Also note that the answer there says you have to set Unicode output mode with _setmode if you want Unicode output on the Windows console.
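As a rough illustration of the _setmode step, the following Windows-only sketch switches stdout to UTF-16 text mode before printing a wide string. The function name print_wide_line is made up for this example. The format uses %ls, which denotes a wide string in wprintf both in standard C and in the Microsoft runtime.

```c
#ifdef _WIN32
#include <assert.h>
#include <stdio.h>
#include <wchar.h>
#include <io.h>     /* _setmode */
#include <fcntl.h>  /* _O_U16TEXT */

/* Hypothetical helper: puts stdout into UTF-16 text mode and prints one
   wide string followed by a newline. Returns a negative value on failure. */
int print_wide_line(const wchar_t *text)
{
    assert(text != NULL);

    /* Flush anything written in the previous mode before switching. */
    fflush(stdout);
    if (_setmode(_fileno(stdout), _O_U16TEXT) == -1) {
        return -1;
    }
    return wprintf(L"%ls\n", text);
}
#endif
```

Once stdout is in this mode, only wide-character functions such as wprintf should be used on it; mixing in narrow printf calls afterwards is not supported by the Microsoft runtime.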
Also note that I don't believe the string returned by GetStringUTFChars is guaranteed to be NUL-terminated (GetStringUTFLength gives its length in bytes), but it's been too long since I checked.
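If you would rather not depend on the terminator at all, one option is to copy the bytes using the length reported by GetStringUTFLength and add the NUL yourself. The sketch below shows only the copying part in plain C; copy_with_terminator is a hypothetical helper, and the JNI calls that would supply bytes and len are left out.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copies exactly len bytes and appends a NUL, so the
   result is a valid C string even if the source was not terminated.
   Returns NULL if allocation fails. The caller releases it with free(). */
char *copy_with_terminator(const char *bytes, size_t len)
{
    char *copy;

    assert(bytes != NULL);
    copy = malloc(len + 1);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, bytes, len);
    copy[len] = '\0';
    return copy;
}
```

The copy can then be handed to printf or to a conversion routine that expects a NUL-terminated string.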
Source: https://stackoverflow.com/questions/22054617/java-jni-passing-multibyte-characters-from-java-to-c