Read an arbitrary number of bytes from type implementing Read

南旧 2021-01-13 08:33

I have something that is Read; currently it's a File. I want to read a number of bytes from it that is only known at runtime (length prefix in a b

3 Answers
  • 2021-01-13 09:12

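    You can use a bit of unsafe to create a vector of uninitialized memory and skip the zeroing. Be aware that this is not strictly sound: the documentation for Read::read warns that calling it with an uninitialized buffer can lead to undefined behavior, because an implementation is allowed to read from the buffer before writing to it. With that caveat, the approach looks like this: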

    let mut vec: Vec<u8> = Vec::with_capacity(length);
    // Unsound in general: the bytes are uninitialized until `read` overwrites them.
    unsafe { vec.set_len(length); }
    let count = file.read(vec.as_mut_slice()).unwrap();
    

    This way, vec.len() is set to length, and the bytes in it start out uninitialized (often zeros, but possibly garbage). This avoids zeroing the memory, but after the call only the first count bytes hold data from the file; anything beyond them must not be read.

    Note that the read() method on Read is not guaranteed to fill the whole slice; it may return after reading fewer bytes than the slice length. Several RFCs proposed methods to fill this gap, and Read::read_exact (stable since Rust 1.6) now does so.
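
A minimal sketch of filling a buffer completely with Read::read_exact, which is part of the standard library today. The helper name `read_exact_vec` is made up for illustration, and the buffer is zero-initialized rather than left uninitialized, so no unsafe is needed:

```rust
use std::io::{self, Read};

/// Reads exactly `length` bytes from `reader` into a freshly allocated buffer.
/// `read_exact_vec` is a hypothetical helper name, not a std API.
fn read_exact_vec<R: Read>(reader: &mut R, length: usize) -> io::Result<Vec<u8>> {
    // Zeroing costs a little, but the buffer is initialized before `read` sees it.
    let mut buf = vec![0u8; length];
    // Fails with `ErrorKind::UnexpectedEof` if the source ends before `length` bytes.
    reader.read_exact(&mut buf)?;
    Ok(buf)
}
```

With a File this would be called as `read_exact_vec(&mut file, length)`; the error case is how a too-short file shows up.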

  • 2021-01-13 09:25

    1. Fill-this-vector version

    Your first solution is close to working. You identified the problem but did not try to solve it! The problem is that, whatever the capacity of the vector, it is still empty (vec.len() == 0), so read() sees a zero-length slice and reads nothing. Instead, you can fill it with zeroed elements:

    let mut vec = vec![0u8; length];
    

    The following full code works:

    // `as_mut_slice()` is stable in current Rust, so no feature gate is needed.
    
    use std::fs::File;
    use std::io::Read;
    
    fn main() {
        let mut file = File::open("/usr/share/dict/words").unwrap();
        let length: usize = 100;
        let mut vec = vec![0u8; length];
        let count = file.read(vec.as_mut_slice()).unwrap();
        println!("read {} bytes.", count);
        println!("vec = {:?}", vec);
    }
    

    Of course, you still have to check whether count == length, and read more data into the buffer if that's not the case.
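
The "read more data into the buffer" step could look roughly like the loop below. This is a sketch, not the only way to do it; `read_full` is a hypothetical helper name, and treating `Interrupted` as retryable is a choice made here that mirrors what the standard library does internally.

```rust
use std::io::{self, ErrorKind, Read};

/// Keeps calling `read` until `buf` is full or the source is exhausted.
/// Returns the number of bytes actually read. (`read_full` is a hypothetical name.)
fn read_full<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        // Each call may return fewer bytes than requested, so read into the unfilled tail.
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break, // end of input: nothing more will arrive
            Ok(n) => filled += n,
            Err(e) if e.kind() == ErrorKind::Interrupted => continue, // retry
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}
```

If the returned count is smaller than `buf.len()`, the input ended early and the caller decides whether that is an error.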


    2. Iterator version

    Your second solution is better because you won't have to check how many bytes have been read, and you won't have to re-read in case count != length. You need to use the bytes() function on the Read trait (implemented by File). This transforms the file into a stream (i.e. an iterator). Because errors can still happen, you don't get an Iterator<Item=u8> but an Iterator<Item=Result<u8, std::io::Error>>. Hence you need to deal with failures explicitly within the iterator. We're going to use unwrap() here for simplicity:

    use std::fs::File;
    use std::io::Read;
    
    fn main() {
        let file = File::open("/usr/share/dict/words").unwrap();
        let length: usize = 100;
        let vec: Vec<u8> = file
            .bytes()
            .take(length)
            .map(|r: Result<u8, _>| r.unwrap()) // or deal explicitly with failure!
            .collect();
        println!("vec = {:?}", vec);
    }
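
To deal with failure explicitly instead of calling unwrap(), one option is to collect into a Result. The sketch below assumes a hypothetical helper named `take_bytes`; it relies on the fact that an iterator of Result items can be collected into a single Result holding the collection.

```rust
use std::io::{self, Read};

/// Collects up to `length` bytes, stopping at the first I/O error.
/// (`take_bytes` is a hypothetical helper name.)
fn take_bytes<R: Read>(reader: R, length: usize) -> io::Result<Vec<u8>> {
    reader
        .bytes()
        .take(length)
        // `collect` on an iterator of `Result`s yields `Ok(vec)` or the first `Err`.
        .collect()
}
```

The result can be shorter than `length` if the input ends early, so the length still needs checking when an exact size is required. For a File, wrapping it in std::io::BufReader first avoids issuing one small read per byte.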
    
  • 2021-01-13 09:39

    Like the Iterator adaptors, the IO adaptors take self by value to be as efficient as possible. Also like the Iterator adaptors, a mutable reference to a Read is also a Read.

    To solve your problem, you just need Read::by_ref:

    use std::io::Read;
    use std::fs::File;
    
    fn main() {
        let mut file = File::open("/etc/hosts").unwrap();
        let length = 5;
    
        let mut vec = Vec::with_capacity(length);
        file.by_ref().take(length as u64).read_to_end(&mut vec).unwrap();
    
        let mut the_rest = Vec::new();
        file.read_to_end(&mut the_rest).unwrap();
    }
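
Applied to the length-prefix case from the question, the same by_ref/take pattern might look like this. The prefix format (4 bytes, big-endian) and the name `read_record` are assumptions made for the sketch; the question does not say how the length is encoded.

```rust
use std::io::{self, ErrorKind, Read};

/// Reads one length-prefixed record. The 4-byte big-endian prefix is an
/// assumption made for this sketch, not something the question specifies.
fn read_record<R: Read>(reader: &mut R) -> io::Result<Vec<u8>> {
    let mut prefix = [0u8; 4];
    reader.read_exact(&mut prefix)?;
    let length = u32::from_be_bytes(prefix) as usize;

    let mut payload = Vec::with_capacity(length);
    // `by_ref` borrows the reader, so it is still usable after `take`.
    reader.by_ref().take(length as u64).read_to_end(&mut payload)?;

    // `take` stops early at end of input, so check the length explicitly.
    if payload.len() != length {
        return Err(io::Error::new(ErrorKind::UnexpectedEof, "record truncated"));
    }
    Ok(payload)
}
```

Because the length comes from the input itself, a real parser would probably cap it before allocating.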
    