Is it undefined behavior to do runtime borrow management with the help of raw pointers in Rust?

…衆ロ難τιáo~ submitted on 2020-01-14 19:50:51

Question


As part of binding a C API to Rust, I have a mutable reference ph: &mut Ph, a struct EnsureValidContext<'a> { ph: &'a mut Ph }, and some methods:

impl Ph {
    pub fn print(&mut self, s: &str) {
        /*...*/
    }
    pub fn with_context<F, R>(&mut self, ctx: &Context, f: F) -> Result<R, InvalidContextError>
    where
        F: Fn(EnsureValidContext) -> R,
    {
        /*...*/
    }
    /* some others */
}

impl<'a> EnsureValidContext<'a> {
    pub fn print(&mut self, s: &str) {
        self.ph.print(s)
    }
    pub fn close(self) {}
    /* some others */
}

I don't control these. I can only use these.

Now, the closure API is nice if you want the compiler to force you to think about performance and about the tradeoffs between performance and the behaviour you want (context validation is expensive). However, let's say you don't care about that and just want it to work.

I was thinking of making a wrapper that handles it for you:

enum ValidPh<'a> {
    Ph(&'a mut Ph),
    Valid(*mut Ph, EnsureValidContext<'a>),
    Poisoned,
}

impl<'a> ValidPh<'a> {
    pub fn print(&mut self, s: &str) {
        /* whatever the case, just call .print() on the inner object */
    }
    pub fn set_context(&mut self, ctx: &Context) {
        /*...*/
    }
    pub fn close(&mut self) {
        /*...*/
    }
    /* some others */
}

This would work by, whenever necessary, checking if we're a Ph or a Valid, and if we're a Ph we'd upgrade to a Valid by going:

fn upgrade(&mut self) {
    if let Ph(_) = self { // don't call mem::replace unless we need to
        if let Ph(ph) = mem::replace(self, Poisoned) {
            let ptr = ph as *mut _;
            let evc = ph.with_context(ph.get_context(), |evc| evc);
            *self = Valid(ptr, evc);
        }
    }
}

Downgrading is different for each method, as it has to call the target method, but here's an example close:

pub fn close(&mut self) {
    if let Valid(_, _) = self {
        /* ok */
    } else {
        self.upgrade()
    }
    if let Valid(ptr, evc) = mem::replace(self, Poisoned) {
        evc.close(); // consume the evc, dropping the borrow.

        // we can now use our original borrow, but since we don't have it anymore, bring it back using our trusty ptr
        *self = unsafe { Ph(&mut *ptr) };
    } else {
        // this can only happen due to a bug in our code
        unreachable!();
    }
}

You get to use a ValidPh like:

/* given a &mut vph */
vph.print("hello world!");
if vph.set_context(ctx) {
    vph.print("closing existing context");
    vph.close();
}
vph.print("opening new context");
vph.open("context_name");
vph.print("printing in new context");

Without vph, you'd have to juggle &mut Ph and EnsureValidContext around on your own. While the Rust compiler makes this trivial (just follow the errors), you may want to let the library handle it automatically for you. Otherwise you might end up just calling the very expensive with_context for every operation, regardless of whether the operation can invalidate the context or not.

Note that this code is rough pseudocode. I haven't compiled or tested it yet.

One might argue I need an UnsafeCell or a RefCell or some other Cell. However, from reading this it appears UnsafeCell is only a lang item because of interior mutability — it's only necessary if you're mutating state through an &T, while in this case I have &mut T all the way.

However, my reading may be flawed. Does this code invoke UB?

(Full code of Ph and EnsureValidContext, including FFI bits, available here.)


Answer 1:


Taking a step back, the guarantees upheld by Rust are:

  • &T is a reference to T which is potentially aliased,
  • &mut T is a reference to T which is guaranteed not to be aliased.

The crux of the question therefore is: what does "guaranteed not to be aliased" mean?


Let's consider a safe Rust sample:

struct Foo(u32);

impl Foo {
    fn foo(&mut self) { self.bar(); }
    fn bar(&mut self) { self.0 += 1; }
}

fn main() { Foo(0).foo(); }

If we take a peek at the stack when Foo::bar is being executed, we'll see at least two pointers to Foo: one in bar and one in foo, and there may be further copies on the stack or in other registers.

So, clearly, there are aliases in existence. How come? It's guaranteed NOT to be aliased!


Take a deep breath: how many of those aliases can you access at any given time?

Only 1. The guarantee of no aliasing is not spatial but temporal.

I would think, therefore, that at any point in time, if a &mut T is accessible, then no other reference to the same instance may be accessible.
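To make the temporal reading concrete, here is a minimal safe-Rust sketch. The names bump and temporal_exclusivity are made up for illustration and are not part of the question's API. Two mutable references to the same Foo exist at once, but the borrow checker only lets one of them be used at any given moment:

```rust
struct Foo(u32);

// Hypothetical helper, only here to give the references something to do.
fn bump(foo: &mut Foo) {
    foo.0 += 1;
}

fn temporal_exclusivity() -> u32 {
    let mut value = Foo(0);
    let outer = &mut value;

    // Reborrow: `inner` and `outer` now both point at `value`.
    let inner = &mut *outer;
    bump(inner); // `inner` is the one accessible alias right now

    // bump(outer); // rejected by the borrow checker if uncommented,
    //              // because `inner` is still used on the next line

    bump(inner);

    // `inner` is never used again, so `outer` becomes accessible again.
    bump(outer);
    outer.0
}
```

Calling temporal_exclusivity() performs three increments, one per bump call that is not commented out.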

Having a raw pointer (*mut T) is perfectly fine, since it requires unsafe to access. However, forming a second reference may or may not be safe, even without using it, so I would avoid it.




Answer 2:


Rust's memory model is not rigorously defined yet, so it's hard to say for sure, but I believe it's not undefined behavior to:

  1. carry a *mut Ph around while a &'a mut Ph is also reachable from another path, so long as you don't dereference the *mut Ph, even just for reading, and don't convert it to a &Ph or &mut Ph, because mutable references grant exclusive access to the pointee.
  2. cast the *mut Ph back to a &'a mut Ph once the other &'a mut Ph falls out of scope.
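The two points above can be sketched roughly as follows. Counter and Guard are hypothetical stand-ins for Ph and EnsureValidContext, not the real types, and round_trip is an invented name. One assumption worth flagging: in this sketch the temporary borrow is derived from the raw pointer itself rather than from the original reference. As far as I know that is the ordering that checkers such as Miri accept, but treat it as an illustration, not as a ruling on the code in the question.

```rust
// Hypothetical stand-in for `Ph`.
struct Counter(u32);

impl Counter {
    fn bump(&mut self) {
        self.0 += 1;
    }
}

// Hypothetical stand-in for `EnsureValidContext`: it holds the exclusive borrow.
struct Guard<'a> {
    inner: &'a mut Counter,
}

impl<'a> Guard<'a> {
    fn bump(&mut self) {
        self.inner.bump();
    }
    // Consuming the guard ends its borrow.
    fn close(self) {}
}

fn round_trip(counter: &mut Counter) -> u32 {
    // Turn the reference into a raw pointer. Carrying it around is fine;
    // `counter` itself is not touched again below.
    let ptr: *mut Counter = counter;

    // Assumption of this sketch: the guard's borrow comes from `ptr`.
    let mut guard = Guard { inner: unsafe { &mut *ptr } };
    guard.bump();
    guard.close(); // the guard's reference is gone from here on

    // Point 2: no other reference is live, so cast the pointer back.
    let restored: &mut Counter = unsafe { &mut *ptr };
    restored.bump();
    restored.0
}
```

The raw pointer is never dereferenced while the guard's reference is still in use, which is the condition stated in point 1.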


Source: https://stackoverflow.com/questions/49503331/is-it-undefined-behavior-to-do-runtime-borrow-management-with-the-help-of-raw-po
