Compile-time generic type size check

终归单人心 2020-12-10 11:40

I'm attempting to write Rust bindings for a C collection library (Judy Arrays [1]) which only provides itself room to store a pointer-width value. My company has a fair amo

2 Answers
  • 2020-12-10 12:02

    Compile-time check?

    Is there a better way to do this, or is this run-time check the best Rust 1.0 supports?

    In general, there are some hacky solutions to do some kind of compile time testing of arbitrary conditions. For example, there is the static_assertions crate which offers some useful macros (including one macro similar to C++'s static_assert). However, this is hacky and very limited.

    In your particular situation, I haven't found a way to perform the check at compile time. The root problem here is that you can't use mem::size_of on a generic type in an array-length expression (which is what these tricks rely on), nor mem::transmute on a generic type. Related issues: #43408 and #47966. For this reason, the static_assertions crate doesn't work either.

    If you think about it, this would also allow a kind of error very unfamiliar to Rust programmers: an error when instantiating a generic function with a specific type. This is well known to C++ programmers -- Rust's trait bounds are used to fix those often very bad and unhelpful error messages. In the Rust world, one would need to specify the requirement as a trait bound: something like where size_of::<T>() == size_of::<usize>().

    However, this is currently not possible. There once was a fairly famous "const-dependent type system" RFC which would have allowed these kinds of bounds, but it was rejected for now. Support for these kinds of features is slowly but steadily progressing. "Miri" was merged into the compiler some time ago, allowing much more powerful constant evaluation. This is an enabler for many things, including the "Const Generics" RFC, which was actually merged. It is not yet implemented, but it is expected to land in 2018 or 2019.

    Unfortunately, it still doesn't enable the kind of bound you need. Comparing two const expressions for equality was purposefully left out of the main RFC, to be resolved in a future RFC.

    So it is to be expected that a bound similar to where size_of::<T>() == size_of::<usize>() will eventually be possible. But this shouldn't be expected in the near future!
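
    For readers on newer toolchains, a check that fails at instantiation time can be sketched without such a bound. This is only a sketch under a stated assumption: Rust 1.57 or later, where assert! may be used inside constants. The names PointerSized, CHECK and store are hypothetical and made up for illustration. Note that this is not a trait bound: the error appears when the function is instantiated with a concrete type, which is exactly the C++-style error discussed above.

```rust
use std::marker::PhantomData;
use std::mem::size_of;

/// Hypothetical helper type (not from any library); it only carries `T`.
pub struct PointerSized<T>(PhantomData<T>);

impl<T> PointerSized<T> {
    /// Evaluated per concrete `T`. If the sizes differ, constant evaluation
    /// panics and the build fails (assumes Rust 1.57+).
    pub const CHECK: () = assert!(size_of::<T>() == size_of::<usize>());
}

/// Hypothetical generic function that only accepts pointer-sized values.
pub fn store<T>(value: T) -> T {
    // Referencing the constant forces its evaluation for this `T`.
    let () = PointerSized::<T>::CHECK;
    value
}
```

    With this sketch, store(5usize) compiles, while a call such as store(true) should be rejected at build time, because evaluating CHECK for bool fails.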


    Workaround

    In your situation, I would probably introduce an unsafe trait AsBigAsUsize. To implement it, you could write a macro impl_as_big_as_usize which performs a size check and implements the trait. Maybe something like this:

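    use std::mem;

    // Marker trait: implementors promise to be exactly as large as usize.
    unsafe trait AsBigAsUsize: Sized {
        const _DUMMY: [(); 0];
    }
    
    macro_rules! impl_as_big_as_usize {
        ($type:ty) => {
            unsafe impl AsBigAsUsize for $type {
                // The length is 0 only when the sizes match; otherwise the
                // value has type [(); 1] and the impl fails to type-check.
                const _DUMMY: [(); 0] = 
                    [(); (mem::size_of::<$type>() != mem::size_of::<usize>()) as usize];
                // We should probably also check the alignment!
            }
        }
    }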
    

    This uses basically the same trickery as static_assertions. It works because we never use size_of on a generic type, only on the concrete types named in the macro invocations.

    So... this is obviously far from perfect. The user of your library has to invoke impl_as_big_as_usize once for every type they want to use in your data structure. But at least it's safe: as long as programmers only use the macro to impl the trait, the trait is in fact only implemented for types that have the same size as usize. Also, the error "trait bound AsBigAsUsize is not satisfied" is very understandable.
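
    To make the user-facing side concrete, here is a minimal sketch of how a library function could rely on the trait. The helpers into_word and from_word are hypothetical names invented for this example, and the array length is written so that a matching size yields a zero-length array.

```rust
use std::mem;

/// Marker trait: implementors promise to be exactly as large as `usize`.
unsafe trait AsBigAsUsize: Sized {
    const _DUMMY: [(); 0];
}

macro_rules! impl_as_big_as_usize {
    ($type:ty) => {
        unsafe impl AsBigAsUsize for $type {
            // Length 0 when the sizes match, length 1 (a type error) otherwise.
            const _DUMMY: [(); 0] =
                [(); (::std::mem::size_of::<$type>() != ::std::mem::size_of::<usize>()) as usize];
        }
    };
}

// One invocation per type the user wants to store.
impl_as_big_as_usize!(usize);
impl_as_big_as_usize!(isize);
impl_as_big_as_usize!(*const u8);
// impl_as_big_as_usize!(bool); // would be rejected: mismatched array lengths

/// Hypothetical helper: packs a value into one machine word.
fn into_word<T: AsBigAsUsize>(value: T) -> usize {
    // Relies on the trait's promise that T and usize have the same size.
    let word = unsafe { mem::transmute_copy::<T, usize>(&value) };
    mem::forget(value); // ownership now lives in the word
    word
}

/// Hypothetical helper: the inverse of `into_word`.
fn from_word<T: AsBigAsUsize>(word: usize) -> T {
    unsafe { mem::transmute_copy::<usize, T>(&word) }
}
```

    Calling into_word(true) is rejected with the "trait bound is not satisfied" error, since bool never received an impl.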


    What about the run-time check?

    As bluss said in the comments, in your assert! code, there is no run-time check, because the optimizer constant-folds the check. Let's test that statement with this code:

    #![feature(asm)]
    
    fn main() {
        foo(3u64);
        foo(true);
    }
    
    #[inline(never)]
    fn foo<T>(t: T) {
        use std::mem::size_of;
    
        unsafe { asm!("" : : "r"(&t)) }; // black box
        assert!(size_of::<usize>() == size_of::<T>());
        unsafe { asm!("" : : "r"(&t)) }; // black box
    }
    

    The crazy asm!() expressions serve two purposes:

    • “hiding” t from LLVM, such that LLVM can't perform optimizations we don't want (like removing the whole function)
    • marking specific spots in the resulting ASM code we'll be looking at

    Compile it with a nightly compiler (in a 64-bit environment!):

    rustc -O --emit=asm test.rs
    

    As usual, the resulting assembly code is hard to read; here are the important spots (with some cleanup):

    _ZN4test4main17he67e990f1745b02cE:  # main()
        subq    $40, %rsp
        callq   _ZN4test3foo17hc593d7aa7187abe3E
        callq   _ZN4test3foo17h40b6a7d0419c9482E
        ud2
    
    _ZN4test3foo17h40b6a7d0419c9482E: # foo<bool>()
        subq    $40, %rsp
        movb    $1, 39(%rsp)
        leaq    39(%rsp), %rax
        #APP
        #NO_APP
        callq   _ZN3std9panicking11begin_panic17h0914615a412ba184E
        ud2
    
    _ZN4test3foo17hc593d7aa7187abe3E: # foo<u64>()
        pushq   %rax
        movq    $3, (%rsp)
        leaq    (%rsp), %rax
        #APP
        #NO_APP
        #APP
        #NO_APP
        popq    %rax
        retq
    

    The #APP-#NO_APP pair is our asm!() expression.

    • The foo<bool> case: you can see that our first asm!() instruction is compiled, then an unconditional call to panic!() is made, and nothing comes afterwards (ud2 just says “the program can never reach this spot, panic!() diverges”).
    • The foo<u64> case: you can see both #APP-#NO_APP pairs (both asm!() expressions) without anything in between.

    So yes: the compiler removes the check completely.

    It would be way better if the compiler just refused to compile the code. But this way we at least know that there's no run-time overhead.

  • 2020-12-10 12:06

    Contrary to the accepted answer, you can check at compile-time!

    The trick is to insert, when compiling with optimizations, a call to an undefined C function in the dead-code path. You will get a linker error if your assertion would fail.
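
    A minimal sketch of that idea follows. The symbol name judy_value_must_be_pointer_sized and the function assert_pointer_sized are hypothetical, and the debug-build fallback to a plain assert! is an assumption added here: without optimizations the dead branch is not guaranteed to be removed, so the undefined symbol could break linking even for valid types.

```rust
use std::mem::size_of;

#[cfg(not(debug_assertions))]
extern "C" {
    // Hypothetical symbol that is deliberately never defined anywhere.
    fn judy_value_must_be_pointer_sized() -> !;
}

/// Optimized builds: for a matching `T` the branch is dead code and is
/// removed, so the undefined symbol is never referenced. For a mismatching
/// `T` the call survives and linking fails.
#[cfg(not(debug_assertions))]
#[inline(always)]
pub fn assert_pointer_sized<T>() {
    if size_of::<T>() != size_of::<usize>() {
        unsafe { judy_value_must_be_pointer_sized() }
    }
}

/// Debug builds (assumed fallback): an ordinary run-time assertion.
#[cfg(debug_assertions)]
pub fn assert_pointer_sized<T>() {
    assert!(
        size_of::<T>() == size_of::<usize>(),
        "value type must be exactly pointer-sized"
    );
}
```

    The linker error names only the missing symbol, not the offending type, so a descriptive symbol name is about the only diagnostic this approach can offer.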
