I'm attempting to write Rust bindings for a C collection library (Judy Arrays [1]) which only provides itself room to store a pointer-width value. My company has a fair amo
Is there a better way to do this, or is this run-time check the best Rust 1.0 supports?
In general, there are some hacky solutions to do some kind of compile time testing of arbitrary conditions. For example, there is the static_assertions crate which offers some useful macros (including one macro similar to C++'s static_assert). However, this is hacky and very limited.
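To make the idea concrete, here is a minimal sketch of this kind of trick, written without the crate. It is in the spirit of what such macros have done, not a copy of the crate's code, and the conditions being checked are illustrative assumptions.

```rust
use std::mem::size_of;

// If the condition is true, the array length is `0 - 0 = 0` and `[]` type-checks.
// If it is false, the length is `0 - 1`, which overflows during constant
// evaluation and stops compilation.
const _: [(); 0 - !(size_of::<u32>() == 4) as usize] = [];
const _: [(); 0 - !(size_of::<usize>() == size_of::<*const u8>()) as usize] = [];

// A false condition such as this one would be rejected at compile time:
// const _: [(); 0 - !(size_of::<u8>() == size_of::<usize>()) as usize] = [];
```

Note that every type named here is concrete; the trick has no generic parameter to deal with.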
In your particular situation, I haven't found a way to perform the check at compile time. The root problem here is that you can't use mem::size_of in a constant expression (an array length, for example), or mem::transmute, on a generic type. Related issues: #43408 and #47966. For this reason, the static_assertions crate doesn't work either.
If you think about it, this would also allow a kind of error very unfamiliar to Rust programmers: an error when instantiating a generic function with a specific type. This is well known to C++ programmers -- Rust's trait bounds are used to fix those often very bad and unhelpful error messages. In the Rust world, you would need to specify your requirement as a trait bound: something like where size_of::<T>() == size_of::<usize>().
However, this is currently not possible. There once was a fairly famous "const-dependent type system" RFC which would allow these kinds of bounds, but it was rejected for now. Support for these kinds of features is slowly but steadily progressing. "Miri" was merged into the compiler some time ago, allowing much more powerful constant evaluation. This is an enabler for many things, including the "Const Generics" RFC, which was actually merged. It is not yet implemented, but it is expected to land in 2018 or 2019.
Unfortunately, it still doesn't enable the kind of bound you need. Comparing two const expressions for equality was purposefully left out of the main RFC, to be resolved in a future RFC.
So it is to be expected that a bound similar to where size_of::<T>() == size_of::<usize>() will eventually be possible. But this shouldn't be expected in the near future!
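For comparison, the "error when instantiating a generic function" flavor described above can be sketched without any such bound, assuming a compiler that allows panicking inside constants (Rust 1.57 or later, which is newer than the toolchains discussed here). This is not a trait bound: nothing shows up in the function signature, and the failure only appears when the program is actually built with an offending type. The names SameSizeAsUsize and store are hypothetical.

```rust
use std::marker::PhantomData;
use std::mem::{self, size_of};

/// Hypothetical helper type; it exists only to carry the associated constant.
struct SameSizeAsUsize<T>(PhantomData<T>);

impl<T> SameSizeAsUsize<T> {
    /// Evaluated separately for each concrete `T` that refers to it.
    const CHECK: () = assert!(
        size_of::<T>() == size_of::<usize>(),
        "T must have the same size as usize"
    );
}

/// Hypothetical stand-in for inserting a value into a pointer-width slot.
fn store<T>(value: T) -> usize {
    // Referring to the constant forces its evaluation for this particular `T`.
    let () = SameSizeAsUsize::<T>::CHECK;
    // The check above guarantees that exactly `size_of::<usize>()` bytes are read.
    let word = unsafe { mem::transmute_copy::<T, usize>(&value) };
    // The bits now live in `word`, so the original must not be dropped.
    mem::forget(value);
    word
}
```

With this sketch, store(7usize) builds, while a call such as store(true) is expected to fail the build when the constant is evaluated for bool.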
In your situation, I would probably introduce an unsafe trait AsBigAsUsize. To implement it, you could write a macro impl_as_big_as_usize which performs a size check and implements the trait. Maybe something like this:
use std::mem;

unsafe trait AsBigAsUsize: Sized {
    const _DUMMY: [(); 0];
}

macro_rules! impl_as_big_as_usize {
    ($type:ty) => {
        unsafe impl AsBigAsUsize for $type {
            // The length is 0 (matching the declared type) only when the
            // sizes are equal; otherwise it is 1 and the impl is rejected.
            const _DUMMY: [(); 0] =
                [(); (mem::size_of::<$type>() != mem::size_of::<usize>()) as usize];
            // We should probably also check the alignment!
        }
    }
}
This uses basically the same trickery that static_assertions uses. It works because we never use size_of on a generic type, only on the concrete types of the macro invocation.
So... this is obviously far from perfect. The user of your library has to invoke impl_as_big_as_usize once for every type they want to use in your data structure. But at least it's safe: as long as programmers only use the macro to impl the trait, the trait is in fact only implemented for types that have the same size as usize. Also, the error "trait bound AsBigAsUsize is not satisfied" is very understandable.
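A sketch of how a library user might apply this, under a few assumptions: Handle, to_word and from_word are hypothetical names invented for illustration, and the array length compares with != so that equal sizes produce the declared length 0.

```rust
use std::mem;

unsafe trait AsBigAsUsize: Sized {
    const _DUMMY: [(); 0];
}

macro_rules! impl_as_big_as_usize {
    ($type:ty) => {
        unsafe impl AsBigAsUsize for $type {
            // Length 0 when the sizes match, length 1 (a type mismatch) otherwise.
            const _DUMMY: [(); 0] =
                [(); (mem::size_of::<$type>() != mem::size_of::<usize>()) as usize];
        }
    };
}

/// Hypothetical user type that wraps a pointer-width value.
#[repr(transparent)]
#[derive(Debug, PartialEq)]
struct Handle(usize);

impl_as_big_as_usize!(usize);
impl_as_big_as_usize!(isize);
impl_as_big_as_usize!(Handle);
// impl_as_big_as_usize!(u8); // rejected: the initializer has type `[(); 1]`

/// Reinterprets a value as the word that would be handed to the C library.
fn to_word<T: AsBigAsUsize>(value: T) -> usize {
    let word = unsafe { mem::transmute_copy::<T, usize>(&value) };
    mem::forget(value); // ownership moves into the returned word
    word
}

/// Reinterprets a stored word as a value of type `T`.
fn from_word<T: AsBigAsUsize>(word: usize) -> T {
    unsafe { mem::transmute_copy::<usize, T>(&word) }
}
```

Calling to_word(1u8) is then rejected with an unsatisfied trait bound, because u8 never received an impl.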
As bluss said in the comments, in your assert! code, there is no run-time check, because the optimizer constant-folds the check. Let's test that statement with this code:
#![feature(asm)]

fn main() {
    foo(3u64);
    foo(true);
}

#[inline(never)]
fn foo<T>(t: T) {
    use std::mem::size_of;

    unsafe { asm!("" : : "r"(&t)) }; // black box
    assert!(size_of::<usize>() == size_of::<T>());
    unsafe { asm!("" : : "r"(&t)) }; // black box
}
The crazy asm!() expressions serve two purposes: they hide t from LLVM, such that LLVM can't perform optimizations we don't want (like removing the whole function), and they mark specific spots in the resulting assembly code that we can look for.
Compile it with a nightly compiler (in a 64 bit environment!):
rustc -O --emit=asm test.rs
As usual, the resulting assembly code is hard to read; here are the important spots (with some cleanup):
_ZN4test4main17he67e990f1745b02cE: # main()
    subq $40, %rsp
    callq _ZN4test3foo17hc593d7aa7187abe3E
    callq _ZN4test3foo17h40b6a7d0419c9482E
    ud2

_ZN4test3foo17h40b6a7d0419c9482E: # foo<bool>()
    subq $40, %rsp
    movb $1, 39(%rsp)
    leaq 39(%rsp), %rax
    #APP
    #NO_APP
    callq _ZN3std9panicking11begin_panic17h0914615a412ba184E
    ud2

_ZN4test3foo17hc593d7aa7187abe3E: # foo<u64>()
    pushq %rax
    movq $3, (%rsp)
    leaq (%rsp), %rax
    #APP
    #NO_APP
    #APP
    #NO_APP
    popq %rax
    retq
The #APP-#NO_APP pair is our asm!() expression.
foo<bool> case: you can see that our first asm!() instruction is compiled, then an unconditional call to panic!() is made and afterwards comes nothing (ud2 just says “the program can never reach this spot, panic!() diverges”).
foo<u64> case: you can see both #APP-#NO_APP pairs (both asm!() expressions) without anything in between. So yes: the compiler removes the check completely.
It would be way better if the compiler just refused to compile the code. But this way we at least know that there's no run-time overhead.
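The same experiment can be sketched on a stable toolchain, assuming std::hint::black_box (stable since Rust 1.66) as a stand-in for the empty asm!() blocks. black_box is only a best-effort optimization barrier, so the generated assembly is not guaranteed to match the listing above.

```rust
use std::hint::black_box;
use std::mem::size_of;

#[inline(never)]
fn foo<T>(t: T) {
    // Pretend to use `t`, so the optimizer cannot discard the whole function.
    black_box(&t);
    // The condition is a constant for each concrete `T`, so with optimizations
    // enabled it is expected to be folded away or turned into an unconditional panic.
    assert!(size_of::<usize>() == size_of::<T>());
    black_box(&t);
}
```

Calling foo(3usize) returns normally on any target, while foo(true) panics at run time, just as in the original experiment.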
Contrary to the accepted answer, you can check at compile-time!
The trick is to insert, when compiling with optimizations, a call to an undefined C function in the dead-code path. You will get a linker error if your assertion fails.