co_await appears to be suboptimal?

Posted by 偶尔善良 on 2020-08-04 06:12:20

Question


I have an async function

void async_foo(A& a, B& b, C& c, function<void(X&, Y&)> callback);

I want to use it in a stackless coroutine, so I write:

auto coro_foo(A& a, B& b, C& c, X& x) /* -> Y */ {
  struct Awaitable {
    bool await_ready() const noexcept { return false; }
    bool await_suspend(coroutine_handle<> h) {
      async_foo(*a_, *b_, *c_, [this, h](X& x, Y& y){
        *x_ = std::move(x);
        y_ = std::move(y);
        h.resume();
      });
      return true;  // stay suspended until the callback resumes us
    }
    Y await_resume() {
      return std::move(y_);
    }
    A* a_; B* b_; C* c_; X* x_; Y y_;
  };
  return Awaitable{&a, &b, &c, &x};
}

then I can use it like this:

Y y = co_await coro_foo(a, b, c, x);

and the compiler would rewrite it to this:

  auto e = coro_foo(a, b, c, x);
  if (!e.await_ready()) {
    <suspend>
    if (e.await_suspend(h)) return;
resume-point:
    <resume>
  }
  Y y = e.await_resume();

With this, the coroutine keeps a_, b_, and c_ while it is suspended, even though it only has to keep them until we get the coroutine_handle in await_suspend(h).
(Btw I'm not sure if I can keep references to the arguments here.)

It would be much more efficient if the wrapper function could directly get coroutine_handle as an argument.

It could be an implicit argument:

Promise f(coroutine_handle<> h);
co_await f();

Or it could be a special keyword-argument:

Promise f(coroutine_handle<> h);
f(co_await);

Am I missing something here? (Other than the fact that the overhead is not that big.)


Answer 1:


The "coroutine" system defined by the Coroutine TS is designed to handle asynchronous functions which:

  1. Return a future-like object (an object which represents a delayed return value).
  2. The future-like object has the ability to be associated with a continuation function.

async_foo doesn't fulfill these requirements. It doesn't return a future-like object; it "returns" a value by invoking a continuation function, and that continuation is passed in as a parameter rather than being attached to a returned object.

By the time the co_await happens at all, the potentially asynchronous process that generated the future is expected to have already started. Or at least, the co_await machinery makes it possible for it to have started.

Your proposed version loses out on the await_ready feature, which is what allows co_await to handle potentially-asynchronous processes. Between the time the future is generated and await_ready is called, the process may have finished. If it has, there is no need to schedule the resumption of the coroutine. It should therefore happen right here, on this thread.

If that minor stack inefficiency bothers you, then you would have to do things the way the Coroutine TS wants you to.

The general way to handle this is for coro_foo to call async_foo directly and return a future-like object with a .then-like mechanism. Your problem is that async_foo itself doesn't have a .then-like mechanism, so you have to create one.

That means coro_foo must pass async_foo a functor that stores a coroutine_handle<>, one that can be updated by the future's continuation mechanism. Of course, you'll also need synchronization primitives. If the handle has been initialized by the time the functor has been executed, then the functor calls it, resuming the coroutine. If the functor completes without resuming a coroutine, the functor will set a variable to let the await machinery know that the value is ready.

Since the handle and this variable are shared between the await machinery and the functor, you'll need to ensure synchronization between the two. That's a fairly complex thing, but it's whatever .then-style machinery requires.
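
As a rough illustration of that handshake (a minimal sketch only; the then_state name, its members, and the memory orderings are mine, not anything the Coroutine TS prescribes), the shared state can be boiled down to one atomic flag plus the stored handle:

#include <atomic>
#include <experimental/coroutine>

struct then_state {
  // Each side exchanges this to true; whoever sees true knows the other side already ran.
  std::atomic<bool> done{false};
  // Written by await_suspend before it flips done, so the callback may safely resume it.
  std::experimental::coroutine_handle<> handle;

  // Called from async_foo's callback once the result has been stored elsewhere.
  void on_result_ready() {
    // If done was already true, await_suspend ran first: resume the coroutine.
    if (done.exchange(true, std::memory_order_acq_rel))
      handle.resume();
    // Otherwise the result arrived before suspension; await_suspend will observe
    // done == true and return false, so the coroutine never actually suspends.
  }

  // Called from the awaitable's await_suspend(h).
  bool suspend(std::experimental::coroutine_handle<> h) {
    handle = h;
    // Returning false means "do not suspend": the result is already available.
    return !done.exchange(true, std::memory_order_acq_rel);
  }
};

The two answers below package essentially this handshake in different ways.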

Or you could just live with the minor inefficiency.




Answer 2:


The current design has an important feature: co_await takes a general expression, not just a call expression.

This allows us to write code like this:

auto f = coro_1();
co_await coro_2();
co_await f;

We can run two or more asynchronous tasks in parallel and then wait for both of them.

Consequently, the implementation of coro_1 should start its work when it is called, not in await_suspend.

This also means that there must be pre-allocated memory where coro_1 can store its result and from which it can pick up the coroutine_handle.

We can use a non-copyable Awaitable and rely on guaranteed copy elision;
async_foo would then be called from the constructor of Awaitable:

auto coro_foo(A& a, B& b, C& c, X& x) /* -> Y */ {
  struct Awaitable {
    Awaitable(A& a, B& b, C& c, X& x) : x_(&x) {
      async_foo(a, b, c, [this](X& x, Y& y){
        *x_ = std::move(x);
        y_ = &y;
        if (done_.exchange(true)) {
          h_.resume();  // Coroutine resumes inside of resume()
        }
      });
    }
    bool await_ready() const noexcept {
      return done_;
    }
    bool await_suspend(coroutine_handle<> h) {
      h_ = h;
      return !done_.exchange(true);
    }
    Y await_resume() {
      return std::move(*y_);
    }
    atomic<bool> done_{false};
    coroutine_handle<> h_;
    X* x_;
    Y* y_;
  };
  return Awaitable(a, b, c, x);
}



Answer 3:


async_foo could be called directly from coro_foo if we use a future-like class.
It would cost us a single allocation and an atomic variable:

#include <atomic>
#include <experimental/coroutine>

static char done = 0;  // sentinel address meaning "the result is already there"

template<typename T>
struct Future {
  T t_;
  std::atomic<void*> addr_{nullptr};

  template<typename X>
  void SetResult(X&& r) {
    t_ = std::move(r);
    void* h = addr_.exchange(&done);
    if (h) std::experimental::coroutine_handle<>::from_address(h).resume();
  }

  bool await_ready() const noexcept { return false; }
  bool await_suspend(std::experimental::coroutine_handle<> h) noexcept {
    return addr_.exchange(h.address()) != &done;
  }
  auto await_resume() noexcept {
    auto t = std::move(t_);
    delete this;  // unsafe, will be leaked on h.destroy()
    return t;
  }
};

Future<Y>& coro_foo(A& a, B& b, C& c, X& x) {
  auto* p = new Future<Y>;
  async_foo(a, b, c, [p, &x](X& x_, Y& y_) {
    x = std::move(x_);
    p->SetResult(y_);
  });
  return *p;
}

It doesn't look very expensive, but it doesn't significantly improve on the code in the question. (It also hurts my eyes.)
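
For completeness, here is a hedged sketch of how any of these coro_foo variants might actually be consumed. co_await can only appear inside a coroutine, so some return type with a promise_type is needed; the trivial fire-and-forget Task below is purely illustrative and not part of the answers:

#include <exception>
#include <experimental/coroutine>

// Minimal fire-and-forget coroutine return type (illustrative only).
struct Task {
  struct promise_type {
    Task get_return_object() noexcept { return {}; }
    std::experimental::suspend_never initial_suspend() noexcept { return {}; }
    std::experimental::suspend_never final_suspend() noexcept { return {}; }
    void return_void() noexcept {}
    void unhandled_exception() noexcept { std::terminate(); }
  };
};

Task consumer(A& a, B& b, C& c, X& x) {
  // Suspends only if async_foo's callback hasn't fired yet by this point.
  Y y = co_await coro_foo(a, b, c, x);
  // ... use x and y ...
}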



Source: https://stackoverflow.com/questions/45311488/co-await-appears-to-be-suboptimal
