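This article discusses garbled Chinese text when C# communicates with UE4. The same approach applies to other languages.
Two cases are covered.
1. C# sends a string to UE4
When C# sends a string, first determine whether the string contains Chinese characters. The check is simple (strictly speaking, it tests whether every character is in the 7-bit ASCII range):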
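    private bool IsPureAnsi(string str)
    {
        // Any UTF-16 code unit above 127 is outside 7-bit ASCII.
        for (int i = 0; i < str.Length; ++i)
        {
            if ((int)str[i] > 127)
            {
                return false;
            }
        }
        return true;
    }

When the string contains no Chinese, each character takes 1 byte. The string is preceded by its length, which takes 4 bytes, and it ends with a '\0' terminator (the C convention). In the sending code that follows, the length is str.Length + 1, so it counts the terminator.
When the string contains Chinese, each character takes 2 bytes and the format is otherwise the same. Note three differences: the length is still a character count and is not multiplied by 2, the length is negated to mark the 2-byte encoding, and the terminator is two '\0' bytes.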
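To make the layout concrete, here is a minimal standalone C++ sketch of the two cases described above. It does not use any UE4 types; `AppendInt32LE` and `EncodeWireString` are hypothetical helper names, and the sketch assumes a little-endian length prefix and UTF-16 code units as input.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper (not a UE4 API): append a 32-bit value in little-endian order.
static void AppendInt32LE(std::vector<std::uint8_t>& out, std::int32_t value)
{
    const std::uint32_t u = static_cast<std::uint32_t>(value);
    for (int shift = 0; shift < 32; shift += 8)
    {
        out.push_back(static_cast<std::uint8_t>((u >> shift) & 0xFF));
    }
}

// Hypothetical helper: lay out a string as length prefix + characters + terminator.
std::vector<std::uint8_t> EncodeWireString(const std::u16string& str)
{
    // Same test as IsPureAnsi: every code unit must be <= 127.
    bool pureAnsi = true;
    for (char16_t c : str)
    {
        if (c > 127) { pureAnsi = false; break; }
    }

    // Character count including the terminator; negated for the 2-byte case.
    const std::int32_t count = static_cast<std::int32_t>(str.size()) + 1;
    std::vector<std::uint8_t> out;
    AppendInt32LE(out, pureAnsi ? count : -count);

    for (char16_t c : str)
    {
        out.push_back(static_cast<std::uint8_t>(c & 0xFF));                // low byte
        if (!pureAnsi) out.push_back(static_cast<std::uint8_t>(c >> 8));   // high byte
    }

    // One '\0' byte for ASCII, two for the 2-byte encoding.
    out.push_back(0);
    if (!pureAnsi) out.push_back(0);
    return out;
}
```

With this sketch, u"Hi" becomes the 7 bytes 03 00 00 00 48 69 00, and the single character 中 (U+4E2D) becomes the 8 bytes FE FF FF FF 2D 4E 00 00, where FE FF FF FF is -2 in little-endian order.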
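The code is as follows (it needs using System.IO and using System.Text):

    public byte[] StringToBytes(string str)
    {
        using (MemoryStream stream = new MemoryStream())
        {
            // Character count including the terminator.
            int strLen = str.Length + 1;
            if (this.IsPureAnsi(str))
            {
                // ASCII: positive length, 1 byte per character, 1-byte terminator.
                byte[] strLenBytes = System.BitConverter.GetBytes(strLen);
                stream.Write(strLenBytes, 0, strLenBytes.Length);
                byte[] strBytes = Encoding.ASCII.GetBytes(str);
                stream.Write(strBytes, 0, strBytes.Length);
                stream.WriteByte(0);
            }
            else
            {
                // UTF-16: negated length, 2 bytes per character, 2-byte terminator.
                byte[] strLenBytes = System.BitConverter.GetBytes(-strLen);
                stream.Write(strLenBytes, 0, strLenBytes.Length);
                byte[] strBytes = Encoding.Unicode.GetBytes(str);
                stream.Write(strBytes, 0, strBytes.Length);
                stream.WriteByte(0);
                stream.WriteByte(0);
            }
            // MemoryStream.ToArray stands in for the original StreamToBytes helper, which was not shown.
            return stream.ToArray();
        }
    }

2. UE4 sends a string to C#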
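In this case the UE4 side sends as usual, and the C# side has to do some handling when parsing.
When C# receives string data, it first reads the length. If the length is >= 0, the received string is ASCII encoded (1 byte per character). If the length is less than 0, the received string is Unicode encoded (UTF-16, 2 bytes per character), and the character count is the absolute value of the length.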
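The sign rule above can be sketched as a standalone C++ decoder. This is only an illustration under stated assumptions, not UE4 code: `DecodeWireString` is a hypothetical name, the data is assumed to be little-endian, and the bounds checks are an addition that the rule itself does not mention.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper (not a UE4 API): decode one length-prefixed string that starts
// at `index`, and advance `index` past it.
std::u16string DecodeWireString(const std::vector<std::uint8_t>& data, std::size_t& index)
{
    if (data.size() < index || data.size() - index < 4)
    {
        throw std::runtime_error("truncated length prefix");
    }

    // Read the 4-byte little-endian length.
    std::uint32_t raw = 0;
    for (int i = 0; i < 4; ++i)
    {
        raw |= static_cast<std::uint32_t>(data[index + i]) << (8 * i);
    }
    const std::int64_t signedLen = static_cast<std::int32_t>(raw);

    // Negative length: 2 bytes per character. Otherwise 1 byte per character.
    const bool wide = signedLen < 0;
    const std::size_t count = static_cast<std::size_t>(wide ? -signedLen : signedLen);
    const std::size_t width = wide ? 2 : 1;
    const std::size_t begin = index + 4;
    if (count > (data.size() - begin) / width)
    {
        throw std::runtime_error("truncated payload");
    }

    std::u16string result;
    // The last element is the terminator, so it is skipped.
    for (std::size_t i = 0; i + 1 < count; ++i)
    {
        const std::size_t p = begin + i * width;
        char16_t c = data[p];
        if (wide) c = static_cast<char16_t>(c | (data[p + 1] << 8));
        result.push_back(c);
    }

    index = begin + count * width;
    return result;
}
```

Called twice on a buffer that holds "Hi" followed by 中, the first call returns u"Hi" and leaves the index at 7, and the second returns the single character U+4E2D and leaves the index at 15.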
The code is as follows (it needs using System and using System.Text):

    public String GetString(byte[] data, int index, out int outIndex)
    {
        int strLen = System.BitConverter.ToInt32(data, index);
        int begin = index + 4;
        if (strLen >= 0)
        {
            // ASCII: strLen bytes, the last one being the '\0' terminator.
            outIndex = begin + strLen;
            // An empty string arrives as length 0 with no payload.
            int count = Math.Max(strLen - 1, 0);
            // Encoding.ASCII instead of Encoding.Default, whose meaning varies by platform;
            // the sender only takes this path for pure ASCII text.
            return Encoding.ASCII.GetString(data, begin, count);
        }
        else
        {
            // Unicode: -strLen characters of 2 bytes each, the last one being the terminator.
            strLen = -strLen;
            outIndex = begin + strLen * 2;
            return Encoding.Unicode.GetString(data, begin, strLen * 2 - 2);
        }
    }

The above is mainly based on the UE4 source code, which is shown below:
    FArchive& operator<<( FArchive& Ar, FString& A )
    {
        // > 0 for ANSICHAR, < 0 for UCS2CHAR serialization

        if (Ar.IsLoading())
        {
            int32 SaveNum;
            Ar << SaveNum;

            bool LoadUCS2Char = SaveNum < 0;
            if (LoadUCS2Char)
            {
                SaveNum = -SaveNum;
            }

            // If SaveNum is still less than 0, they must have passed in MIN_INT. Archive is corrupted.
            if (SaveNum < 0)
            {
                Ar.ArIsError = 1;
                Ar.ArIsCriticalError = 1;
                UE_LOG(LogNetSerialization, Error, TEXT("Archive is corrupted"));
                return Ar;
            }

            auto MaxSerializeSize = Ar.GetMaxSerializeSize();
            // Protect against network packets allocating too much memory
            if ((MaxSerializeSize > 0) && (SaveNum > MaxSerializeSize))
            {
                Ar.ArIsError = 1;
                Ar.ArIsCriticalError = 1;
                UE_LOG( LogNetSerialization, Error, TEXT( "String is too large" ) );
                return Ar;
            }

            // Resize the array only if it passes the above tests to prevent rogue packets from crashing
            A.Data.Empty (SaveNum);
            A.Data.AddUninitialized(SaveNum);

            if (SaveNum)
            {
                if (LoadUCS2Char)
                {
                    // read in the unicode string and byteswap it, etc
                    auto Passthru = StringMemoryPassthru<UCS2CHAR>(A.Data.GetData(), SaveNum, SaveNum);
                    Ar.Serialize(Passthru.Get(), SaveNum * sizeof(UCS2CHAR));
                    // Ensure the string has a null terminator
                    Passthru.Get()[SaveNum-1] = '\0';
                    Passthru.Apply();

                    INTEL_ORDER_TCHARARRAY(A.Data.GetData())

                    // Since Microsoft's vsnwprintf implementation raises an invalid parameter warning
                    // with a character of 0xffff, scan for it and terminate the string there.
                    // 0xffff isn't an actual Unicode character anyway.
                    int Index = 0;
                    if(A.FindChar(0xffff, Index))
                    {
                        A[Index] = '\0';
                        A.TrimToNullTerminator();
                    }
                }
                else
                {
                    auto Passthru = StringMemoryPassthru<ANSICHAR>(A.Data.GetData(), SaveNum, SaveNum);
                    Ar.Serialize(Passthru.Get(), SaveNum * sizeof(ANSICHAR));
                    // Ensure the string has a null terminator
                    Passthru.Get()[SaveNum-1] = '\0';
                    Passthru.Apply();
                }

                // Throw away empty string.
                if (SaveNum == 1)
                {
                    A.Data.Empty();
                }
            }
        }
        else
        {
            bool SaveUCS2Char = Ar.IsForcingUnicode() || !FCString::IsPureAnsi(*A);
            int32 Num = A.Data.Num();
            int32 SaveNum = SaveUCS2Char ? -Num : Num;

            Ar << SaveNum;

            A.Data.CountBytes( Ar );

            if (SaveNum)
            {
                if (SaveUCS2Char)
                {
                    // TODO - This is creating a temporary in order to byte-swap. Need to think about how to make this not necessary.
                    #if !PLATFORM_LITTLE_ENDIAN
                        FString ATemp = A;
                        FString& A = ATemp;
                        INTEL_ORDER_TCHARARRAY(A.Data.GetData());
                    #endif

                    Ar.Serialize((void*)StringCast<UCS2CHAR>(A.Data.GetData(), Num).Get(), sizeof(UCS2CHAR)* Num);
                }
                else
                {
                    Ar.Serialize((void*)StringCast<ANSICHAR>(A.Data.GetData(), Num).Get(), sizeof(ANSICHAR)* Num);
                }
            }
        }

        return Ar;
    }

Note that UE4 uses this same code for both sending and receiving data: when Ar.IsLoading() is true the data is being received, otherwise it is being sent.
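The loading branch of the UE4 source validates the length prefix before it allocates anything, and a receiver written in another language may want the same guards. A minimal standalone sketch follows. `LengthCheck` and `CheckSaveNum` are hypothetical names, and unlike the UE4 code, which negates first and then tests for a value that is still negative, this version tests for the minimum 32-bit value before negating, because negating it overflows in standard C++.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical result type for the sketch; not part of UE4.
enum class LengthCheck { Ok, Corrupted, TooLarge };

// Hypothetical helper: applies the two guards from the loading branch to a raw
// length prefix. A maxSerializeSize of 0 or less means "no limit", as in the source.
LengthCheck CheckSaveNum(std::int32_t saveNum, std::int64_t maxSerializeSize)
{
    // The minimum int32 has no positive counterpart, so it cannot be a valid negated count.
    if (saveNum == std::numeric_limits<std::int32_t>::min())
    {
        return LengthCheck::Corrupted;
    }

    // The sign only selects the encoding; the magnitude is the character count.
    const std::int32_t count = saveNum < 0 ? -saveNum : saveNum;

    // Reject counts above the configured limit before allocating a buffer.
    if (maxSerializeSize > 0 && count > maxSerializeSize)
    {
        return LengthCheck::TooLarge;
    }
    return LengthCheck::Ok;
}
```

A receiver could run this check right after reading the 4-byte length and stop parsing the packet on anything other than `LengthCheck::Ok`.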
Source: The Chinese garbled-text problem when UE4 communicates with C#