In this convert function:
public static byte[] GetBytes(string str)
{
    // Allocates 2 bytes per char (char is a 16-bit UTF-16 code unit)
    byte[] bytes = new byte[str.Length * sizeof(char)];
    System.Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);
    return bytes;
}
In reality, .NET (at least in 4.0) does not store a fixed size per char when serializing with BinaryWriter: the characters are encoded (UTF-8 by default), so their byte size varies.
UTF-8 characters have variable length (they might not be 1 byte each), while ASCII characters are always 1 byte:
'ē' = 2 bytes
'e' = 1 byte
This must be kept in mind when using
BinaryReader.ReadChars(count)
because the number of chars read does not equal the number of bytes consumed. For example, the word "ēvalds" takes 7 bytes, while "evalds" takes only 6 bytes.
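As a minimal sketch of this behavior (my own illustration, not part of the original answer), you can count the bytes a BinaryWriter backed by a MemoryStream produces for each word; the helper name CountBytes is hypothetical:

using System;
using System.IO;

class ByteCountDemo
{
    // Hypothetical helper: writes the chars of a string through a
    // BinaryWriter (UTF-8 by default) and returns how many bytes it produced.
    static long CountBytes(string s)
    {
        using (var stream = new MemoryStream())
        using (var writer = new BinaryWriter(stream))
        {
            writer.Write(s.ToCharArray()); // chars are encoded, not dumped as raw 2-byte values
            writer.Flush();
            return stream.Length;
        }
    }

    static void Main()
    {
        Console.WriteLine(CountBytes("evalds"));  // 6
        Console.WriteLine(CountBytes("ēvalds"));  // 7 -- 'ē' needs 2 bytes in UTF-8
    }
}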
Try specifying the Encoding explicitly. You can use the following code to convert a string to bytes with a specified encoding:
byte[] bytes = System.Text.Encoding.ASCII.GetBytes("abc");
If you print the contents of bytes, you will get { 97, 98, 99 }, which, unlike your example, doesn't contain zeros.
In your example, the default encoding uses 16 bits per symbol. This can be observed by printing the result of
System.Text.Encoding.Unicode.GetBytes("abc"); // { 97, 0, 98, 0, 99, 0 }
Then, when converting it back, you should select the appropriate encoding:
string str = System.Text.Encoding.ASCII.GetString(bytes);
Console.WriteLine(str);
This prints "abc", as you might expect.
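As a small sketch of why the encodings must match (my own illustration, not from the original answer):

byte[] unicodeBytes = System.Text.Encoding.Unicode.GetBytes("abc"); // { 97, 0, 98, 0, 99, 0 }

// Decoding with the matching encoding round-trips cleanly:
Console.WriteLine(System.Text.Encoding.Unicode.GetString(unicodeBytes)); // abc

// Decoding the same bytes as ASCII keeps the zero bytes as embedded '\0' characters:
string mismatched = System.Text.Encoding.ASCII.GetString(unicodeBytes);
Console.WriteLine(mismatched.Length); // 6, not 3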
First let's look at what your code does wrong. char is 16-bit (2 bytes) in the .NET Framework, which means sizeof(char) returns 2. str.Length is 1, so your code allocates byte[] bytes = new byte[2]. When you then call Buffer.BlockCopy(), you copy 2 bytes from the source array to the destination array. That is why your GetBytes() method returns bytes[0] = 32 and bytes[1] = 0 when your string is " ".
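A quick sketch to verify those numbers (my own illustration, mirroring the code from the question):

string str = " ";
Console.WriteLine(sizeof(char));                    // 2

byte[] bytes = new byte[str.Length * sizeof(char)]; // new byte[2]
System.Buffer.BlockCopy(str.ToCharArray(), 0, bytes, 0, bytes.Length);

Console.WriteLine(bytes[0]);                        // 32 (low byte of ' ')
Console.WriteLine(bytes[1]);                        // 0  (high byte of ' ')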
Try using Encoding.ASCII.GetBytes() instead.
When overridden in a derived class, encodes all the characters in the specified string into a sequence of bytes.
const string input = "Soner Gonul";
byte[] array = Encoding.ASCII.GetBytes(input);
foreach (byte element in array)
{
    Console.WriteLine("{0} = {1}", element, (char)element);
}
Output:
83 = S
111 = o
110 = n
101 = e
114 = r
32 =
71 = G
111 = o
110 = n
117 = u
108 = l
(97, 0) is the Unicode representation of 'a'. Unicode represents each character in two bytes, so you cannot simply remove the zeros, but you can change the Encoding to ASCII. Try the following for converting a string to byte[]:
byte[] array = Encoding.ASCII.GetBytes(input);
Just to clear up the confusion about your answer: the char type in C# takes 2 bytes, so string.ToCharArray() returns an array in which each item takes 2 bytes of storage. When that array is copied into a byte array, where each item takes only 1 byte of storage, each character ends up spread across two bytes; for ASCII characters the high byte is simply zero, hence the zeros showing up in the result. As suggested, Encoding.ASCII.GetBytes is a safer option to use.
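For completeness, here is a side-by-side sketch (my own illustration; it assumes the little-endian byte order .NET uses on common platforms):

string input = "abc";

// Raw copy of the UTF-16 char data: every other byte is zero.
char[] chars = input.ToCharArray();
byte[] raw = new byte[chars.Length * sizeof(char)];
System.Buffer.BlockCopy(chars, 0, raw, 0, raw.Length);
Console.WriteLine(string.Join(",", raw));    // 97,0,98,0,99,0

// ASCII encoding: one byte per character, no zeros.
byte[] ascii = System.Text.Encoding.ASCII.GetBytes(input);
Console.WriteLine(string.Join(",", ascii));  // 97,98,99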