Multi-thread rendering vs command pools

Submitted by 帅比萌擦擦* on 2019-12-03 14:25:18

If you intend to record in parallel, then you would be better off having a separate pool for each thread, isn't that right?

I don't see how having a separate pool per thread "kills the whole purpose of command pools when it comes to recording in parallel". Indeed, it helps it quite a bit, since each thread can manage its own command pool as it sees fit.

Consider the structural difference between, say, a descriptor pool and a command pool. With a descriptor pool, you basically tell it exactly what you will allocate from it. VkDescriptorPoolCreateInfo provides detailed information which allows implementations to allocate up-front exactly how much memory you'll use for each pool. And you cannot allocate more than this from a descriptor pool.

By contrast, VkCommandPoolCreateInfo contains... nothing. Oh, you tell it which queue family the command buffers will be submitted to. You can flag whether the command buffers will be short-lived and whether they can be reset individually. But other than that, you say nothing about the contents of the command buffers. You don't even give it information on how many buffers you'll allocate.

Descriptor pools are intended to be fixed: allocated from as needed, but only up to a quantity set at creation time. Command pools are intended to be very dynamic: allocated from as needed for your particular use cases.
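To make the contrast concrete, here is a toy model in plain C++. It calls no Vulkan functions, and FixedPool and GrowingPool are hypothetical names invented for this sketch. It only illustrates the difference in contract: one pool is told its capacity up front and refuses anything beyond it, the other is told nothing and grows as commands are recorded.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Toy model only: these types are hypothetical and call no Vulkan functions.

// Models a descriptor pool: capacity is declared at creation and never grows.
class FixedPool {
public:
    explicit FixedPool(std::size_t maxSets) : maxSets_(maxSets) {}

    // Returns a handle index, or std::nullopt once the declared capacity is used up.
    std::optional<std::size_t> allocate() {
        if (used_ == maxSets_) return std::nullopt;
        return used_++;
    }

private:
    std::size_t maxSets_;
    std::size_t used_ = 0;
};

// Models a command pool: no capacity is declared, storage grows on demand.
class GrowingPool {
public:
    // Allocating a command buffer always succeeds in this model.
    std::size_t allocateBuffer() {
        buffers_.emplace_back();
        return buffers_.size() - 1;
    }

    // Recording a command appends to the buffer, growing pool-owned memory.
    void record(std::size_t buffer, std::uint32_t command) {
        buffers_.at(buffer).push_back(command);
    }

    std::size_t bufferCount() const { return buffers_.size(); }
    std::size_t commandCount(std::size_t buffer) const {
        return buffers_.at(buffer).size();
    }

private:
    std::vector<std::vector<std::uint32_t>> buffers_;
};
```

A FixedPool built with a capacity of 2 hands out two handles and then returns std::nullopt, while a GrowingPool keeps accepting buffers and commands until the process runs out of memory.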

Think of it as each pool having its own malloc/free. Since the user is forced to synchronize access to pools and their buffers, no vkCmd* function has to take a lock when it allocates memory. That makes command building faster, which helps threading. When a thread decides to reset its command pool, it doesn't have to lock a mutex or do anything of the sort.
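The "own malloc/free" idea can be sketched without any Vulkan at all. In the model below, ThreadPoolModel and recordInParallel are made-up names, and a std::vector stands in for the memory a command pool owns. The point is only that each worker touches its own pool, so neither recording nor resetting needs a mutex.

```cpp
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for a command pool plus the one buffer recorded from it.
struct ThreadPoolModel {
    std::vector<std::uint32_t> commands;  // memory owned by this pool only

    void record(std::uint32_t command) { commands.push_back(command); }  // no lock taken
    void reset() { commands.clear(); }                                    // no lock taken
};

// Each worker touches only pools[i], so no mutex is needed anywhere.
std::vector<ThreadPoolModel> recordInParallel(std::size_t threadCount,
                                              std::uint32_t commandsPerThread) {
    std::vector<ThreadPoolModel> pools(threadCount);
    std::vector<std::thread> workers;
    workers.reserve(threadCount);
    for (std::size_t i = 0; i < threadCount; ++i) {
        workers.emplace_back([&pools, i, commandsPerThread] {
            pools[i].reset();
            for (std::uint32_t c = 0; c < commandsPerThread; ++c) {
                pools[i].record(c);
            }
        });
    }
    for (std::thread& worker : workers) worker.join();
    return pools;
}
```

If all threads shared one pool instead, every record and reset call would have to be wrapped in a lock, which is exactly the cost that per-thread pools avoid.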

There's nothing conceptually wrong with having one command pool per thread. Indeed, having two per thread (double-buffering) makes even more sense.

I don't personally know why you wouldn't just pre-record everything.

Because you're not making a static tech demo.

I guess this comes from lack of experience, but I imagined the parallel-recording would look like "threads 2-N record secondary command buffers, thread 1 calls all of them in one primary command buffer", in which case there is only one command buffer per thread. That was why I said it kills the purpose of command pools, because you are only making a single allocation per pool.

That's certainly a viable form of recording command buffers in parallel. But there are a few things you've missed.

First, that is not the only form of parallel recording. If you're doing deferred rendering, the thread that builds the command buffer (CB) for the lighting passes will finish its work much sooner than any of the threads responsible for (part of) the geometry pass. So a well-designed multithreaded system has to apportion work to threads based on need, not according to some fixed arrangement. As a result, an individual thread will often end up building multiple command buffers.
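One way to apportion work based on need is a shared job counter that idle threads pull from. The sketch below is only a model: WorkerPool and buildJobs are hypothetical names, and pushing a job index stands in for allocating a command buffer from the thread's own pool and recording into it.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical model: one entry per command buffer a thread built from its own pool.
struct WorkerPool {
    std::vector<std::size_t> builtJobs;  // job index recorded into each command buffer
};

// Workers pull the next unclaimed job until none remain, so a thread that
// finishes a cheap job early simply picks up another one.
std::vector<WorkerPool> buildJobs(std::size_t jobCount, std::size_t threadCount) {
    std::vector<WorkerPool> pools(threadCount);
    std::atomic<std::size_t> nextJob{0};
    std::vector<std::thread> workers;
    workers.reserve(threadCount);
    for (std::size_t t = 0; t < threadCount; ++t) {
        workers.emplace_back([&pools, &nextJob, jobCount, t] {
            for (;;) {
                const std::size_t job = nextJob.fetch_add(1);
                if (job >= jobCount) break;
                pools[t].builtJobs.push_back(job);  // "allocate and record" one buffer
            }
        });
    }
    for (std::thread& worker : workers) worker.join();
    return pools;
}
```

The only shared state is the atomic counter. How many buffers each thread ends up building depends on timing, which is why a pool cannot know its allocation count up front.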

And even if that were not the case, you forget about buffering. When it comes time to build the CBs for the next frame, you can't just overwrite the existing ones. After all, they're probably still in the queue doing work. So each thread will need at least two CBs; the one that's currently being executed and the one that's currently being built.
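Double-buffering per thread can be modeled as two pools indexed by frame parity. Everything here is a sketch: PoolModel and DoubleBufferedPools are invented for illustration, and the model assumes the caller has already waited (with a fence in real Vulkan code) until the GPU is done with the frame that last used the slot being reset.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a command pool and the commands recorded from it.
struct PoolModel {
    std::vector<std::uint32_t> commands;
    void reset() { commands.clear(); }
};

// Two pools per thread: one being recorded, one possibly still executing.
class DoubleBufferedPools {
public:
    // Returns the pool to record into for this frame, reset and ready.
    // Assumption: the caller has already waited until the GPU finished the
    // frame that last used this slot (frameIndex - 2).
    PoolModel& beginFrame(std::uint64_t frameIndex) {
        PoolModel& pool = pools_[frameIndex % pools_.size()];
        pool.reset();
        return pool;
    }

    // The pool the previous frame recorded into; beginFrame(frameIndex)
    // leaves it untouched because it lives in the other slot.
    const PoolModel& inFlight(std::uint64_t frameIndex) const {
        return pools_[(frameIndex + 1) % pools_.size()];
    }

private:
    std::array<PoolModel, 2> pools_;
};
```

Recording frame 1 never touches the commands recorded for frame 0, and frame 2 reuses frame 0's slot only after resetting it.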

And even if that were not the case, command pools allocate all memory associated with a CB. There's a reason why I analogized them to malloc/free. Even if you only use a single CB with a particular pool, the fact that this CB's allocations (which can happen due to any vkCmd* function) never have to synchronize with another thread is a good thing.

So no, this does not in any way inhibit the ability to use multiple threads to build CBs.

If you intend to record in parallel, then you would be better off having a separate pool for each thread, isn't that right?

It is exactly right. That is what your spec quote implies.

I would understand it if you pre-recorded command buffers, all allocated from the same pool (in one thread), and then executed them in parallel.

Vulkan does one better. You can pre-record command buffers (allocated from per-thread pools) in parallel and then execute them in parallel too (if your workload is conducive to that).

I don't personally know why you wouldn't just pre-record everything. So why is building command buffers in parallel so needed?

Because it's hard (especially as your app grows in complexity). At some point it even becomes counterproductive, for example when you contort the command buffers to make them pre-recordable by filling them with empty placeholder bindings, 80% of which won't be used.
It is not necessarily "needed"; Vulkan just lets you choose what you deem best for your app (or part of it).
