Converting a Vec<u32> to Vec<u8> in-place and with minimal overhead

独厮守ぢ 2020-12-03 22:36

I'm trying to convert a Vec of u32s to a Vec of u8s, preferably in-place and without too much overhead.

My curre

2 Answers
  • 2020-12-03 23:07

    If an in-place conversion is not mandatory, something like this gives you control over the byte order and avoids the unsafe block:

    extern crate byteorder;
    
    use byteorder::{WriteBytesExt, BigEndian};
    
    fn main() {
        let vec32: Vec<u32> = vec![0xaabbccdd, 2];
        let mut vec8: Vec<u8> = vec![];
    
        for elem in vec32 {
            vec8.write_u32::<BigEndian>(elem).unwrap();
        }
    
        println!("{:?}", vec8);
    }
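
    A similar big-endian conversion can be sketched with only the standard library, assuming a toolchain that provides u32::to_be_bytes. The helper name u32s_to_be_bytes is made up for this illustration and is not part of the answer above or of the byteorder crate:

    ```rust
    use std::mem;

    /// Serialize each u32 as four big-endian bytes (hypothetical helper name).
    fn u32s_to_be_bytes(values: &[u32]) -> Vec<u8> {
        // Reserve the exact output size up front to avoid reallocations.
        let mut bytes = Vec::with_capacity(values.len() * mem::size_of::<u32>());
        for value in values {
            // to_be_bytes yields the same bytes on every platform.
            bytes.extend_from_slice(&value.to_be_bytes());
        }
        bytes
    }

    fn main() {
        let vec32: Vec<u32> = vec![0xaabbccdd, 2];
        let vec8 = u32s_to_be_bytes(&vec32);
        println!("{:?}", vec8); // prints [170, 187, 204, 221, 0, 0, 0, 2]
    }
    ```

    Like the byteorder version, this copies the data into a new Vec rather than converting in place; swapping in to_le_bytes would give little-endian output instead.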
    
  • 2020-12-03 23:33
    1. Whenever writing an unsafe block, I strongly encourage people to include a comment on the block explaining why you think the code is actually safe. That type of information is useful for the people who read the code in the future.

    2. Instead of adding comments about the "magic number" 4, just use mem::size_of::<u32>. I'd even go so far as to use size_of for u8 and perform the division for maximum clarity.

    3. You can return the newly-created Vec from the unsafe block.

    4. As mentioned in the comments, "dumping" a block of data like this makes the data format platform dependent; you will get different answers on little-endian and big-endian systems. This can lead to massive debugging headaches in the future. File formats either encode the platform endianness into the file (making the reader's job harder) or only write a specific endianness to the file (making the writer's job harder).

    5. I'd probably move the whole unsafe block to a function and give it a name, just for organization purposes.

    6. You don't need to import Vec, it's in the prelude.

    use std::mem;
    
    fn main() {
        let mut vec32 = vec![1u32, 2];
    
        // I copy-pasted this code from StackOverflow without reading the answer 
        // surrounding it that told me to write a comment explaining why this code 
        // is actually safe for my own use case.
        let vec8 = unsafe {
            let ratio = mem::size_of::<u32>() / mem::size_of::<u8>();
    
            let length = vec32.len() * ratio;
            let capacity = vec32.capacity() * ratio;
            let ptr = vec32.as_mut_ptr() as *mut u8;
    
            // Don't run the destructor for vec32
            mem::forget(vec32);
    
            // Construct new Vec
            Vec::from_raw_parts(ptr, length, capacity)
        };
    
        println!("{:?}", vec8)
    }
    


    My biggest unresolved worry about this code is the alignment of the memory associated with the Vec.

    Rust's underlying allocator allocates and deallocates memory with a specific Layout. A Layout describes the size and alignment of the block of memory being requested or released.

    I'd assume that this code needs the Layout to match between paired calls to alloc and dealloc. If that's the case, dropping the Vec<u8> constructed from a Vec<u32> might tell the allocator the wrong alignment since that information is based on the element type.
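
    The mismatch can be made visible with a small sketch. It assumes, as the paragraph above does, that the layout for a buffer is derived from the element type; Layout::array is used here purely for illustration, and the helper name layouts is hypothetical:

    ```rust
    use std::alloc::Layout;
    use std::mem;

    /// Layouts for the same block of memory described as `n` u32s and as
    /// `4 * n` u8s (hypothetical helper for illustration).
    fn layouts(n: usize) -> (Layout, Layout) {
        let as_u32 = Layout::array::<u32>(n).expect("layout overflow");
        let as_u8 = Layout::array::<u8>(n * mem::size_of::<u32>()).expect("layout overflow");
        (as_u32, as_u8)
    }

    fn main() {
        let (as_u32, as_u8) = layouts(2);
        // Both layouts cover the same number of bytes, but the alignment
        // follows the element type, so the two descriptions are not equal.
        println!("u32: size {}, align {}", as_u32.size(), as_u32.align());
        println!("u8:  size {}, align {}", as_u8.size(), as_u8.align());
        println!("equal: {}", as_u32 == as_u8);
    }
    ```

    The sizes agree, while the alignment of the u8 layout is 1 and that of the u32 layout is whatever mem::align_of::<u32>() reports on the target.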

    Without better knowledge, the "best" thing to do would be to leave the Vec<u32> as-is and simply get a &[u8] to it. The slice has no interaction with the allocator, avoiding this problem.
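
    A minimal sketch of that borrowing approach follows. The helper name as_bytes is hypothetical, and the SAFETY comment records the assumptions it relies on:

    ```rust
    use std::mem;
    use std::slice;

    /// View a slice of u32s as bytes in native byte order (hypothetical helper).
    fn as_bytes(values: &[u32]) -> &[u8] {
        // SAFETY: u8 has alignment 1 and every bit pattern is a valid u8; the
        // byte length covers exactly the memory occupied by `values`; and the
        // returned slice borrows `values`, so it cannot outlive the data.
        unsafe { slice::from_raw_parts(values.as_ptr() as *const u8, mem::size_of_val(values)) }
    }

    fn main() {
        let vec32 = vec![1u32, 2];
        let bytes = as_bytes(&vec32);
        // The order of the bytes within each u32 depends on the platform.
        println!("{:?}", bytes);
    }
    ```

    The Vec<u32> keeps ownership and is later freed with its original layout, so the allocator never sees a u8 view of the buffer.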

    Even without interacting with the allocator, you need to be careful about alignment!
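
    One place where this matters is the reverse direction, viewing bytes as u32s, because an arbitrary byte pointer is not necessarily aligned for u32. As a sketch, assuming a toolchain that provides slice::align_to, the standard library can split off the parts that cannot be reinterpreted; the helper name split_aligned is hypothetical:

    ```rust
    /// Split a byte slice into an unaligned head, a middle of properly aligned
    /// u32s, and a tail (hypothetical helper wrapping `align_to`).
    fn split_aligned(bytes: &[u8]) -> (&[u8], &[u32], &[u8]) {
        // SAFETY: every bit pattern is a valid u32, and `align_to` only puts
        // correctly aligned, whole elements into the middle slice.
        unsafe { bytes.align_to::<u32>() }
    }

    fn main() {
        let buffer = [0u8; 16];
        // Skipping one byte moves the start of the slice by one address, so it
        // cannot have the same alignment as the start of the buffer.
        let (head, middle, tail) = split_aligned(&buffer[1..]);
        println!(
            "head: {} bytes, middle: {} u32s, tail: {} bytes",
            head.len(),
            middle.len(),
            tail.len()
        );
    }
    ```

    How the bytes are divided between the three parts depends on where the buffer happens to sit in memory, so callers need to handle a non-empty head and tail.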

    See also:

    • How to slice a large Vec<i32> as &[u8]?
    • https://stackoverflow.com/a/48309116/155423