How does the Rust compiler know `Cell` has internal mutability?

温柔的废话 2021-01-18 11:57

Consider the following code (Playground version):

use std::cell::Cell;

struct Foo(u32);

#[derive(Clone, Copy)]
struct FooRef<'a>(&'a Foo);

//          


        
3 Answers
  •  迷失自我
    2021-01-18 12:25

    The reason the code with Cell compiles (ignoring the u2) and mutates is that Cell's whole API takes & pointers:

    impl<T> Cell<T> where T: Copy {
        fn new(value: T) -> Cell<T> { ... }

        fn get(&self) -> T { ... }

        fn set(&self, value: T) { ... }
    }
    

    It is carefully written to allow mutation while shared, i.e. interior mutability. This allows it to expose these mutating methods behind a & pointer. Conventional mutation requires a &mut pointer (with its associated non-aliasing restrictions) because having unique access to a value is the only way to ensure that mutating it will be safe, in general.

    So, the way to create types that allow mutation while shared is to ensure that their API for mutation uses & pointers instead of &mut. Generally speaking this should be done by having the type contain pre-written types like Cell, i.e. use them as building blocks.
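As a minimal sketch of using Cell as a building block: the `Counter` type and the `demo_shared_hits` helper below are hypothetical names invented for illustration. The mutating method takes `&self`, so it can be called through several shared references at once.

```rust
use std::cell::Cell;

/// Hypothetical example type: counts how many times `hit` was called.
struct Counter {
    hits: Cell<u32>,
}

impl Counter {
    fn new() -> Self {
        Counter { hits: Cell::new(0) }
    }

    /// Takes `&self`, yet updates `hits` through the `Cell`.
    fn hit(&self) -> u32 {
        let n = self.hits.get() + 1;
        self.hits.set(n);
        n
    }
}

/// Mutates one counter through two shared references to it.
fn demo_shared_hits() -> u32 {
    let c = Counter::new();
    let r1 = &c;
    let r2 = &c;
    r1.hit();
    r2.hit();
    c.hits.get()
}
```

Calling `demo_shared_hits()` from `main` exercises both references; no `&mut` borrow of the counter is ever taken.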

    The reason later use of u2 fails is a longer story...

    UnsafeCell

    At a lower level, mutating a value while it is shared (e.g. has multiple & pointers to it) is undefined behaviour, except for when the value is contained in an UnsafeCell. This is the very lowest level of interior mutability, designed to be used as a building block for building other abstractions.

    Types that allow safe interior mutability, like Cell, RefCell (for sequential code), the Atomic*s, Mutex and RwLock (for concurrent code) all use UnsafeCell internally and impose some restrictions around it to ensure that it is safe. For example, the definition of Cell is:

    pub struct Cell<T> {
        value: UnsafeCell<T>,
    }
    

    Cell ensures that mutations are safe by carefully restricting the API it offers: the T: Copy in the code above is key.

    (If you wish to write your own low-level type with interior mutability, you just need to ensure that the things that are mutated while being shared are contained in an UnsafeCell. However, I recommend not doing this: Rust has several existing tools (the ones I mentioned above) for interior mutability that are carefully vetted to be safe and correct within Rust's aliasing and mutation rules; breaking the rules is undefined behaviour and can easily result in miscompiled programs.)
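For illustration only, here is a rough sketch of how a Cell-like type can be layered on UnsafeCell. `MyCell` and `demo_my_cell` are hypothetical names, and this is a simplified stand-in, not the standard library's implementation. It relies on the same idea as the text: the API never hands out a reference to the interior, and `T: Copy` keeps values moving in and out by copy.

```rust
use std::cell::UnsafeCell;

/// Hypothetical, simplified stand-in for `Cell`.
struct MyCell<T> {
    value: UnsafeCell<T>,
}

impl<T: Copy> MyCell<T> {
    fn new(value: T) -> Self {
        MyCell { value: UnsafeCell::new(value) }
    }

    fn get(&self) -> T {
        // SAFETY: `MyCell<T>` is not `Sync` (because `UnsafeCell<T>` is not),
        // so only one thread can reach the value, and no method ever returns
        // a reference into the interior, so this copy cannot overlap a write.
        unsafe { *self.value.get() }
    }

    fn set(&self, value: T) {
        // SAFETY: same argument as in `get`; no reference to the interior
        // can be alive while this write happens.
        unsafe { *self.value.get() = value; }
    }
}

/// Writes through one shared reference and reads through another.
fn demo_my_cell() -> u32 {
    let cell = MyCell::new(1);
    let a = &cell;
    let b = &cell;
    a.set(b.get() + 41);
    b.get()
}
```

In real code, prefer `std::cell::Cell`, as the answer recommends.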

    Lifetime Variance

    Anyway, what makes the compiler understand that &u2 stays borrowed in the Cell case is the variance of lifetimes. Typically, the compiler will shorten lifetimes when you pass things to functions, which makes things work great, e.g. you can pass a string literal (&'static str) to a function expecting &'a str, because the long 'static lifetime is shortened to 'a. This is happening for testa: the testa(&a, &u2) call is shortening the lifetimes of the references from the longest they could possibly be (the whole of the body of main) to just that function call. The compiler is free to do this because normal references are variant [1] in their lifetimes, i.e. it can vary them.
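A small sketch of that shortening, using hypothetical helpers `longer` and `demo_shortening` that are not part of the question's code. A `&'static str` is passed where the function wants both arguments to share one lifetime 'a with a short-lived borrow.

```rust
/// Returns whichever string slice is longer; both arguments share lifetime 'a.
fn longer<'a>(x: &'a str, y: &'a str) -> &'a str {
    if x.len() >= y.len() { x } else { y }
}

fn demo_shortening() -> usize {
    let literal: &'static str = "static text";
    let local = String::from("local");
    // `literal` is `&'static str`, but it is accepted where `&'a str` is
    // expected: 'static is shortened to match the borrow of `local`.
    let picked = longer(literal, &local);
    picked.len()
}
```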

    However, for testa_mut, the &mut FooRef<'a> stops the compiler being able to shorten that lifetime (in technical terms &mut T is "invariant in T"), exactly because something like testa_mut can happen. In this case, the compiler sees the &mut FooRef<'a> and understands that the 'a lifetime can't be shortened at all, and so in the call testa_mut(&mut a, &u2) it has to take the true lifetime of the u2 value (the whole function), and hence u2 is borrowed for that whole region.
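The same effect can be sketched with plain string slices. `replace` and `demo_mut_invariance` are hypothetical names. The call compiles when the slot's lifetime can be inferred to fit the short borrow, and the commented-out variant shows the case where it is pinned to 'static and cannot be shortened behind the `&mut`.

```rust
/// Overwrites the slot; `&mut &'a str` is invariant in 'a, so the slot
/// and `y` must agree on exactly the same lifetime.
fn replace<'a>(slot: &mut &'a str, y: &'a str) {
    *slot = y;
}

fn demo_mut_invariance() -> usize {
    let local = String::from("abc");
    // The lifetime of `current` is inferred, so it can match the borrow of `local`.
    let mut current: &str = "literal";
    replace(&mut current, &local);

    // Rejected if uncommented: the slot is fixed at 'static, which cannot be
    // shortened behind `&mut`, so `local` "does not live long enough".
    // let mut fixed: &'static str = "literal";
    // replace(&mut fixed, &local);

    current.len()
}
```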

    So, coming back to interior mutability: UnsafeCell<T> not only tells the compiler that a thing may be mutated while aliased (and hence inhibits some optimisations that would otherwise be unsound), it is also invariant in T, i.e. it acts like a &mut T for the purposes of this lifetime/borrowing analysis, exactly because it allows code like testb.

    The compiler infers this variance automatically; a type becomes invariant in a type parameter or lifetime when that parameter is contained in an UnsafeCell or &mut somewhere in the type (like FooRef<'a> in Cell<FooRef<'a>>).
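A sketch of the inferred invariance for Cell, again with hypothetical names (`store`, `demo_invariance`). It mirrors the shape of testb but uses `&str` instead of `FooRef` to stay self-contained.

```rust
use std::cell::Cell;

/// Stores `y` in the cell through a shared reference. Because
/// `Cell<&'a str>` is invariant in 'a, the cell and `y` must use
/// exactly the same lifetime.
fn store<'a>(slot: &Cell<&'a str>, y: &'a str) {
    slot.set(y);
}

fn demo_invariance() -> usize {
    let local = String::from("local text");
    // The cell's lifetime parameter is inferred here, so it can be chosen
    // to fit the borrow of `local` and the call is accepted.
    let slot = Cell::new("literal");
    store(&slot, &local);

    // Rejected if uncommented: 'static inside the Cell cannot be shortened,
    // so `&local` would itself have to be 'static.
    // let fixed: Cell<&'static str> = Cell::new("literal");
    // store(&fixed, &local);

    slot.get().len()
}
```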

    The Rustonomicon talks about this and other detailed considerations like it.

    [1] Strictly speaking, there are four levels of variance in type system jargon: bivariance, covariance, contravariance and invariance. In practice Rust code mostly deals with covariance and invariance (contravariance does exist, but only in narrow places such as function argument types). When I say "variant" it really means "covariant". See the Rustonomicon for more detail.
