How to read a struct from a file in Rust?

心在旅途 2020-12-01 12:23

Is there a way I can read a structure directly from a file in Rust? My code is:

use std::fs::File;

struct Configuration {
    item1: u8,
    item2: u16,
    item3: i32,
    item4: [char; 8],
}

3 Answers
  • 2020-12-01 12:58

    The following code does not take into account any endianness or padding issues and is intended to be used with POD (plain old data) types only. struct Configuration should be safe in this case.
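
    To make the padding caveat concrete, here is a minimal sketch. Padded, Packed and layout_sizes are hypothetical names, not from the question, and the sizes in the note below assume a typical target where u16 is 2-byte aligned.

```rust
use std::mem::size_of;

// Hypothetical `repr(C)` layout of the question's first two fields:
// one padding byte is inserted so that `item2` is 2-byte aligned.
#[allow(dead_code)]
#[repr(C)]
struct Padded {
    item1: u8,
    item2: u16,
}

// The same fields with padding disabled.
#[allow(dead_code)]
#[repr(C, packed)]
struct Packed {
    item1: u8,
    item2: u16,
}

/// Returns (size of the padded layout, size of the packed layout).
fn layout_sizes() -> (usize, usize) {
    (size_of::<Padded>(), size_of::<Packed>())
}
```

    On such a target layout_sizes() returns (4, 3), so a file written as 3 tightly packed bytes does not line up with the padded in-memory layout.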


    Here is a function that can read a struct (of a POD type) from a file:

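    use std::io::{self, Read};
    use std::mem::{self, MaybeUninit};
    use std::slice;
    
    // `T` must be a POD type: any byte pattern has to be a valid `T`.
    fn read_struct<T, R: Read>(mut read: R) -> io::Result<T> {
        let num_bytes = mem::size_of::<T>();
        // Zero-initialized storage replaces the deprecated `mem::uninitialized()`.
        let mut s = MaybeUninit::<T>::zeroed();
        unsafe {
            // View the storage as bytes so `read_exact` can fill it.
            let buffer = slice::from_raw_parts_mut(s.as_mut_ptr() as *mut u8, num_bytes);
            read.read_exact(buffer)?;
            // Every byte is initialized: either zeroed or overwritten by the read.
            Ok(s.assume_init())
        }
    }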
    
    // use
    // read_struct::<Configuration, _>(reader)
    

    If you want to read a sequence of structs from a file, you can execute read_struct multiple times or read all the file at once:

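    use std::fs::{self, File};
    use std::io::BufReader;
    use std::path::Path;
    // Also relies on `io`, `Read` and `slice` imported in the snippet above.
    
    fn read_structs<T, P: AsRef<Path>>(path: P) -> io::Result<Vec<T>> {
        let path = path.as_ref();
        let struct_size = ::std::mem::size_of::<T>();
        assert!(struct_size > 0, "T must not be zero-sized");
        // Only whole structs are read; trailing bytes are ignored.
        let num_structs = fs::metadata(path)?.len() as usize / struct_size;
        let num_bytes = num_structs * struct_size;
        let mut reader = BufReader::new(File::open(path)?);
        let mut r = Vec::<T>::with_capacity(num_structs);
        unsafe {
            // Zero the allocation first so the byte slice never exposes uninitialized memory.
            ::std::ptr::write_bytes(r.as_mut_ptr(), 0, num_structs);
            let buffer = slice::from_raw_parts_mut(r.as_mut_ptr() as *mut u8, num_bytes);
            reader.read_exact(buffer)?;
            r.set_len(num_structs);
        }
        Ok(r)
    }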
    
    // use
    // read_structs::<StructName, _>("path/to/file")
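
    If unsafe code is not required, a sequence of fixed-size records can also be decoded field by field. The following is only a sketch under stated assumptions: Record, RECORD_SIZE and parse_records are hypothetical names, and the on-disk format is assumed to be 3 tightly packed bytes per record with a little-endian u16.

```rust
// Hypothetical record: 1 byte followed by a little-endian u16, no padding on disk.
#[derive(Debug, PartialEq)]
struct Record {
    item1: u8,
    item2: u16,
}

// Assumed on-disk size of one record in bytes.
const RECORD_SIZE: usize = 3;

/// Parses as many whole records as `bytes` contains; trailing bytes are ignored.
fn parse_records(bytes: &[u8]) -> Vec<Record> {
    bytes
        .chunks_exact(RECORD_SIZE)
        .map(|chunk| Record {
            item1: chunk[0],
            item2: u16::from_le_bytes([chunk[1], chunk[2]]),
        })
        .collect()
}
```

    The whole file can be loaded with std::fs::read(path)? and the resulting bytes passed to parse_records as a slice.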
    
  • 2020-12-01 13:01

    Here you go:

    use std::io::Read;
    use std::mem;
    use std::slice;
    
    #[repr(C, packed)]
    #[derive(Debug, Copy, Clone)]
    struct Configuration {
        item1: u8,
        item2: u16,
        item3: i32,
        item4: [char; 8],
    }
    
    const CONFIG_DATA: &[u8] = &[
        0xfd, // u8
        0xb4, 0x50, // u16
        0x45, 0xcd, 0x3c, 0x15, // i32
        0x71, 0x3c, 0x87, 0xff, // char
        0xe8, 0x5d, 0x20, 0xe7, // char
        0x5f, 0x38, 0x05, 0x4a, // char
        0xc4, 0x58, 0x8f, 0xdc, // char
        0x67, 0x1d, 0xb4, 0x64, // char
        0xf2, 0xc5, 0x2c, 0x15, // char
        0xd8, 0x9a, 0xae, 0x23, // char
        0x7d, 0xce, 0x4b, 0xeb, // char
    ];
    
    fn main() {
        let mut buffer = CONFIG_DATA;
    
        let mut config: Configuration = unsafe { mem::zeroed() };
    
        let config_size = mem::size_of::<Configuration>();
        unsafe {
            let config_slice = slice::from_raw_parts_mut(&mut config as *mut _ as *mut u8, config_size);
            // `read_exact()` comes from `Read` impl for `&[u8]`
            buffer.read_exact(config_slice).unwrap();
        }
    
        println!("Read structure: {:#?}", config);
    }
    

    (The code above was updated for Rust 1.38.)

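    You need to be careful, however, as unsafe code is, well, unsafe. The slice returned by slice::from_raw_parts_mut() aliases config, but because it is built from a raw pointer the borrow checker no longer tracks that relationship; touching config while the slice is still alive would violate Rust's aliasing rules. Therefore you would want to keep the mutable slice alive for the shortest possible time. Note also that char only accepts valid Unicode scalar values, so filling a [char; 8] with arbitrary bytes, as the sample data does, is undefined behavior; a [u32; 8] or [u8; 8] field avoids that. I also assume that you know about endianness issues: the code above is by no means portable, and will return different results if compiled and run on machines with different byte order (little-endian vs big-endian targets).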
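
    The endianness caveat can be seen with the standard library alone. This is a small sketch, and decode_both is a hypothetical helper name: the same two bytes decode to different u16 values depending on the byte order assumed.

```rust
/// Decodes the same two bytes as little-endian and as big-endian.
fn decode_both(bytes: [u8; 2]) -> (u16, u16) {
    (u16::from_le_bytes(bytes), u16::from_be_bytes(bytes))
}
```

    For the u16 bytes in CONFIG_DATA, decode_both([0xb4, 0x50]) gives (0x50b4, 0xb450). Reinterpreting raw memory behaves like u16::from_ne_bytes, which matches one of the two depending on the machine.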

    If you can choose the format and you want a compact binary one, consider using bincode. Otherwise, if you need to parse a pre-defined binary structure, for example, the byteorder crate is the way to go.

  • 2020-12-01 13:03

    As Vladimir Matveev mentions, using the byteorder crate is often the best solution. This way, you account for endianness issues, don't have to deal with any unsafe code, or worry about alignment or padding:

    use byteorder::{LittleEndian, ReadBytesExt}; // 1.2.7
    use std::{
        fs::File,
        io::{self, Read},
    };
    
    struct Configuration {
        item1: u8,
        item2: u16,
        item3: i32,
    }
    
    impl Configuration {
        fn from_reader(mut rdr: impl Read) -> io::Result<Self> {
            let item1 = rdr.read_u8()?;
            let item2 = rdr.read_u16::<LittleEndian>()?;
            let item3 = rdr.read_i32::<LittleEndian>()?;
    
            Ok(Configuration {
                item1,
                item2,
                item3,
            })
        }
    }
    
    fn main() {
        let file = File::open("/dev/random").unwrap();
    
        let config = Configuration::from_reader(file).unwrap();
        println!("{} {} {}", config.item1, config.item2, config.item3);
    }
    

    I've ignored the [char; 8] for a few reasons:

    1. Rust's char is a 32-bit type and it's unclear if your file has actual Unicode code points or C-style 8-bit values.
    2. You can't easily parse an array with byteorder; you have to parse N values and then build the array yourself.
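
    Both points can be sketched with the standard library only. This is an illustration under stated assumptions, not the asker's actual format: read_chars is a hypothetical helper, and the file is assumed to store each character as a little-endian 32-bit Unicode code point.

```rust
use std::io::{self, Read};

/// Reads eight little-endian `u32` values and converts each to a `char`.
/// Fails with `InvalidData` if a value is not a valid Unicode scalar value.
fn read_chars(mut rdr: impl Read) -> io::Result<[char; 8]> {
    let mut out = ['\0'; 8];
    for slot in out.iter_mut() {
        // Parse one value at a time, then store it in the array.
        let mut buf = [0u8; 4];
        rdr.read_exact(&mut buf)?;
        let code = u32::from_le_bytes(buf);
        *slot = char::from_u32(code)
            .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidData, "invalid code point"))?;
    }
    Ok(out)
}
```

    If the file instead holds C-style 8-bit values, reading into a [u8; 8] with a single read_exact call is enough.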