Moving out of a Drop struct in Rust?

Moving out of a Drop struct in Rust?

Rust, Drop, forget, unsafe

2018-05-22 22:22:00 UTC, by Dimitri Sabadie


Today, I’ve come across a very interesting problem in Rust. This problem comes from a very specific part of luminance.

Rust has this feature called move semantics. By default, the move semantics are enforced. For instance:

fn foo_move<T>(x: T)

This function says that it expects a T and need to move it in. If you don’t want to move it, you have to borrow it. Either immutable or mutably.

fn foo_borrow<T>(x: &T)
fn foo_borrow_mut<T>(x: &mut T)

Now, when you move a value around, it’s possible that you don’t use that value anymore. If you look at the foo_move function, is very likely that we’ll not use that T object after the function has returned. So Rust knows that x will go out of scope and then its memory must be invalidated and made available to future values. This is very simple as it will happen on the stack, but it can get a bit tricky.

Rust doesn’t have the direct concept of destructors. Or does it? Actually, it does. And it’s called the Drop trait. If a type implements Drop, any value of this type will call a specific function for disposal when it goes out of scope. This is akin to running a destructor.

struct Test;

impl Drop for Test {
  fn drop(&mut self) {
    println!("drop called!");
  }
}

Whenever a value of type Test goes out of scope / must die, it will call its Drop::drop implementation and then its memory will be invalidated.

Ok, now let’s get into my problem.

Moving out of a field

Consider the following code:

struct A;

struct B;

struct Foo {
  a: A,
  b: B
}

impl Foo {
  fn take(self) -> (A, B) {
    (self.a, self.b)
  }
}

fn main() {
  let foo = Foo { a: A, b: B };
  let (a, b) = foo.take();
}

The implementation of Foo::take shows that we move out of Foo – two, actually. We move the a: A field out of Foo and b: B out of Foo. Which means that we destructured Foo. The call to Foo::take in the main function demonstrates how you would use that function.

Now consider this. A and B here are opaque type we don’t want to know the implementation. But keep in mind they represent scarce resources – like UUID to important resources we must track in the Foo object. Consider this new code:

struct A;

struct B;

struct Foo {
  a: A,
  b: B
}

impl Drop for Foo {
  fn drop(&mut self) {
    // something tricky with self.a and self.b
  }
}

impl Foo {
  fn take(self) -> (A, B) {
    (self.a, self.b) // wait, that doesn’t compile anymore!
  }
}

You would get the following error:

error[E0509]: cannot move out of type Foo, which implements the Drop trait

As you can see, Rust doesn’t allow you to move out of a value which type implements Drop, and this is quite logical. When Foo::take returns, because of self going out of scope, it must call its Drop::drop implementation. If you have moved out of it – both a: A and b: B fields, the Drop::drop implementation is now a complete UB. So Rust is right here and doesn’t allow you to do this.

But imagine that we have to do this. For insance, we need to hand over both the scarce resources a and b to another struct (in our case, a (A, B), but you could easily imagine a better type for this).

There’s a way to, still, implement Foo::take with Foo implementing Drop. Here’s how:

impl Foo {
  fn take(mut self) -> (A, B) {
    let a = mem::replace(&mut self.a, unsafe { mem::uninitialized() });
    let b = mem::replace(&mut self.b, unsafe { mem::uninitialized() });
    
    mem::forget(self);
    
    (a, b)
  }
}

There’s a few new things going on here. First, the mem::replace function takes as first parameter a mutable reference to something we want to replace by its second argument and returns the initial value the mutable reference pointed to. Because mem::replace doesn’t deinitialize any of its argument, I like to think of that function as a move-swap. It just moves out the first argument and put something in the hole it has made with its second argument.

The unsafe function, mem::uninitialized, is a bit special as it bypasses Rust’s normal memory-initialization and actually doesn’t do anything. However, there’s no specification about what this function really does and you shouldn’t be concerned. What you just need to know is that this function just gives you uninitialized memory.

However, if we just returned (a, b) right after the two mem::replace calls, the Drop::drop implementation would run at the end of the function, and it would run badly, because of the now uninitialized data you find in the self.a and self.b fields. It could still be the old values. Or it could be garbage. We don’t care. We mustn’t read from there. Rust provide a solution to this, and this is the mem::forget function. This function is like dropping your value… without calling its Drop::drop implementation. It’s an escape hatch. It’s a way to invalidate a value and in our case that function is super handy because self now contains uninitialized data!

And here we have a complete working Foo::take function.

So what’s the problem again?

The problem is that I used to function I wish I didn’t:

The problem is that Rust doesn’t allow you to move out of a value which implement Drop even if you know that you’ll forget it. Rust doesn’t currently have a primitive / function to express both principles at the same time: forgetting and moving out of a Drop value.

This had me to come to the realization that I would like an interface like this:

fn nodrop<T, F, A>(
  x: T,
  f: F
) -> A
where T: Drop,
      F: FnOnce(T) -> A

This function would give you the guarantee that the closure will not drop its argument but would still leak it, temporarily giving you the power to move out of it. You will also notice that if you want to have A = T, it’s possible: that would mean you destructure a value and re-build it back again inside the same type, which is still sound.

However, this function would make it very easy to leak any field you don’t move out. This is actually the same situation as with mem::forget: if you forget to mem::replace a field before calling mem::forget, you end up in the exact same situation.

I agree it’s a very niche need, and if you’re curious about why I need this, it’s because of how buffers are handled in luminance. The Buffer<T> type has an internal RawBuffer as field that holds GPU handles and internal details and a PhantomData<T>. It’s possible to convert from one type to the other but both implement Drop, so you have to implement the hack I just presented in this blog post to make both conversions. With my nodrop function, the code would be way easier, without needing to unfold unsafe blocks. Something like this would be possible:

pub fn to_raw(self) -> RawBuffer {
  mem::nodrop(self, |buffer| buffer.raw) // combine mem::forget and mem::replace + uninitialized
}

What do you peeps think of such a function? For sure that would add possible easy leaks, that’s a drawback I find very difficult not to agree with. However, I just wanted to share my little experience of Tuesday evening.

That’s all for me. Thanks for having read me, I love you. Keep the vibes!