Setting a buffer of char* with intermediate casting to int*

前端未结


I could not fully understand the consequences of what I read here: Casting an int pointer to a char ptr and vice versa

In short, would this work?

set


                      
6 Answers

感情败类, 2021-01-11 17:42

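In addition to the endianness issue, which has already been mentioned here:

CHAR_BIT, the number of bits per char, should also be considered.

It is 8 on most platforms, where for (int i=0; i<4; i++) should work fine.

A safer way of doing it would be for (size_t i=0; i<sizeof(uint32_t); i++), using size_t for the index so that the comparison with sizeof does not mix signed and unsigned types.

Alternatively, you can include <limits.h> and use for (int i=0; i<32/CHAR_BIT; i++). The two bounds are equal wherever uint32_t exists, because that type is exactly 32 bits wide with no padding bits.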
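
A minimal sketch of the byte loop described above. The helper name fill_u32_bytes is hypothetical and does not come from the original post; the static assertion only documents the relationship between sizeof(uint32_t) and CHAR_BIT and requires a C11 compiler.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* uint32_t has exactly 32 value bits and no padding, so this holds on
   every platform that provides the type at all. */
_Static_assert(sizeof(uint32_t) * CHAR_BIT == 32,
               "uint32_t must occupy exactly 32 bits");

/* Hypothetical helper: set every byte covered by one uint32_t to all ones. */
void fill_u32_bytes(unsigned char *buffer) {
    assert(buffer != NULL);
    /* size_t index: no signed/unsigned mix in the comparison with sizeof */
    for (size_t i = 0; i < sizeof(uint32_t); i++) {
        buffer[i] = UCHAR_MAX;  /* 0xff when CHAR_BIT == 8 */
    }
}
```

Because the loop writes one byte at a time, it has no alignment requirement and can be called on any offset into a buffer.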
                                                                        
                                                        
            
            
              
                
孤街浪徒, 2021-01-11 17:46

Use reinterpret_cast<>() if you want to ensure the underlying data does not "change shape".

As Learner has mentioned, when you store data in machine memory, endianness becomes a factor. If you know the data is already stored in memory with the correct endianness and you specifically want to examine its layout through an alternate representation, then you would use reinterpret_cast<>() to view that memory as a specific type without modifying the original storage.

Below, I've modified your example to use reinterpret_cast<>():

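#include <cstdint>

void set4Bytes(unsigned char* buffer) {
  const std::uint32_t MASK = 0xffffffff;
  // Test the address itself, not the value stored at that address.
  if (reinterpret_cast<std::uintptr_t>(buffer) % alignof(std::uint32_t)) { // misaligned
    for (int i = 0; i < 4; i++) {
      buffer[i] = 0xff;
    }
  } else { // aligned for a 4-byte store
    // Caveat: writing through this pointer relies on the compiler tolerating
    // the type pun; the strict aliasing rules do not guarantee it.
    *reinterpret_cast<std::uint32_t *>(buffer) = MASK;
  }
}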


It should also be noted that your function appears to set the buffer (32 bits, i.e. 4 bytes, of contiguous memory) to 0xFFFFFFFF regardless of which branch it takes.
                                                                        
                                                        
            
            
              
                
青春惊慌失措, 2021-01-11 18:00

As far as byte order is concerned, this conversion is safe as long as you fill all 4 bytes with the same value. If the bytes differ and their order matters, it is not safe: storing an integer writes all 4 bytes at once, but the order in which they land in memory depends on the endianness of the machine.
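
To make the point concrete, here is a small sketch that copies a 32-bit value into a byte array in the host's native order. Both helper names (store_native_u32, host_is_little_endian) are hypothetical, and the byte values mentioned below assume CHAR_BIT is 8.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the object representation of `value`
   into the first sizeof(uint32_t) bytes of `out`. */
void store_native_u32(unsigned char *out, uint32_t value) {
    assert(out != NULL);
    memcpy(out, &value, sizeof value);
}

/* Hypothetical helper: returns 1 if the host stores the least
   significant byte first, 0 otherwise. */
int host_is_little_endian(void) {
    unsigned char bytes[sizeof(uint32_t)];
    store_native_u32(bytes, UINT32_C(1));
    return bytes[0] == 1;
}
```

Storing 0x11223344 this way yields the bytes 44 33 22 11 on a little-endian host and 11 22 33 44 on a big-endian one, whereas storing 0xffffffff yields ff ff ff ff on both, which is why the question's function is unaffected.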
                                                                        
                                                        
            
            
              
                
佛祖请我去吃肉, 2021-01-11 18:02

Your code works fine on any architecture that is 32-bit or wider. There is no issue with byte ordering, since all your source bytes are 0xFF.

On x86 and x64 machines, the extra work needed to handle a possibly unaligned access to RAM is done by the CPU and is transparent to the programmer (since Pentium II), at some performance cost on each such access. So, if you are just setting the first four bytes of a buffer a few times, you can simplify your function:

void set4Bytes(unsigned char* buffer) {
  const uint32_t MASK = 0xffffffff;
  *((uint32_t *)buffer) = MASK;
}
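
For targets where an unaligned store is a concern, the same effect can be had without the pointer cast. The following is only a sketch under that assumption; the name set4Bytes_memcpy is hypothetical. Compilers commonly reduce a fixed-size memcpy like this to a single store where the target permits it, but that is an optimization and not something the language guarantees.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical variant of set4Bytes: memcpy copies byte by byte as far as
   the language is concerned, so it imposes no alignment or effective-type
   requirement on `buffer`. */
void set4Bytes_memcpy(unsigned char *buffer) {
    const uint32_t MASK = UINT32_C(0xffffffff);
    assert(buffer != NULL);
    memcpy(buffer, &MASK, sizeof MASK);
}
```

Calling it on buffer + 1 is as valid as calling it on buffer, so no alignment branch is needed.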


Some readings:


A Linux kernel doc about UNALIGNED MEMORY ACCESSES
Intel Architecture Optimization Manual, section 3.4
Windows Data Alignment on IPF, x86, and x64
A Practical 'Aligned vs. unaligned memory access', by Alexander Sandler

                                                                        
                                                        
            
            
              
                
灰色年华, 2021-01-11 18:05

No, it won't work in every case. Aside from endianness, which may or may not be an issue, you assume that the alignment of uint32_t is 4. But this quantity is implementation-defined (C11 Draft N1570 Section 6.2.8). You can use the _Alignof operator to get the alignment in a portable way.

Second, the effective type (ibid. Sec. 6.5) of the location pointed to by buffer may not be compatible with uint32_t (e.g. if buffer points to an unsigned char array). In that case you break the strict aliasing rules once you try reading through the array itself or through a pointer of a different type.

Assuming that the pointer actually points to an array of unsigned char, the following code will work:

#include <stddef.h>  /* size_t */
#include <stdint.h>  /* uint32_t, uintptr_t */

/* Union overlaying a uint32_t with its individual bytes. */
typedef union { unsigned char chr[sizeof(uint32_t)]; uint32_t u32; } conv_t;

void set4Bytes(unsigned char* buffer) {
  const uint32_t MASK = 0xffffffffU;
  if ((uintptr_t)buffer % _Alignof(uint32_t)) {// misaligned
    for (size_t i = 0; i < sizeof(uint32_t); i++) {
      buffer[i] = 0xffU;
    }
  } else { // correct alignment
    conv_t *cnv = (conv_t *) buffer;
    cnv->u32 = MASK;
  }
}

                                                                        
                                                        
            
            
              
                
暖寄归人, 2021-01-11 18:07

This code might be of help to you. It shows a 32-bit number being built by assigning its contents a byte at a time, forcing misalignment. It compiles and works on my machine.

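#include <stdint.h>
#include <stdio.h>
#include <inttypes.h>
#include <stdlib.h>

int main(void) {
    /* 8 bytes, so a 4-byte value still fits at an offset of 1 */
    uint32_t *data = (uint32_t*)malloc(sizeof(uint32_t)*2);
    if (data == NULL) return EXIT_FAILURE;
    char *buf = (char*)data;
    uintptr_t addr = (uintptr_t)buf;
    int i, j;
    i = !(addr%4) ? 1 : 0;      /* pick an offset that is not 4-byte aligned */
    uint32_t x = (1<<6)-1;      /* 63 */
    /* copy x one byte at a time, in the host's native byte order */
    for (j = 0; j < 4; j++) buf[i+j] = ((char*)&x)[j];

    /* deliberately misaligned read: works on x86 in practice,
       but is undefined behavior in C and may fault elsewhere */
    printf("%" PRIu32 "\n", *((uint32_t*) (addr+i)));
    free(data);
    return 0;
}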


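As mentioned by @Learner, endianness matters as soon as the individual bytes are given a meaning. The code above copies the bytes of x in the host's native order and reads them back the same way, so the printed value does not depend on endianness. What makes it non-portable is the misaligned uint32_t read in the printf call, which can fault on CPUs that require aligned access.

Note that my compiler reports the error "cast from ‘char*’ to ‘unsigned int’ loses precision [-fpermissive]" when trying to cast a char* to an unsigned int, as done in the original post. This post explains that uintptr_t should be used instead.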
                                                                        
                                                        
            
            
              
                