The performance of vkCmdDrawIndexedIndirect and vkCmdDrawIndexedIndirectCount is terrible. Both are slower than issuing vkCmdDrawIndexed once per draw, and much slower than the equivalent per-draw OpenGL ES calls.
Vulkan app with 16641 DIPs (32k tris) per frame:
Single call of vkCmdDrawIndexed: 16M tris/sec
16641 calls of vkCmdDrawIndexed: 1.45M tris/sec
16641 calls of vkCmdDrawIndexedIndirect: 169K tris/sec
Single call of vkCmdDrawIndexedIndirect with 16641 draws: 216K tris/sec
Single call of vkCmdDrawIndexedIndirectCount with 16641 draws from buffer: 216K tris/sec
OpenGLES app with 16641 DIPs (32k tris) per frame:
Single call of glDrawElements: 16M tris/sec
16641 calls of glDrawElements: 5.8M tris/sec
16641 calls of glDrawElementsIndirect: 4.8M tris/sec
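To make the gap easier to read, here is a rough sketch that converts the throughput figures above into frame times and slowdown factors. It assumes each draw emits 2 triangles (16641 draws, roughly 32k tris per frame); the helper names `frame_ms` and `slowdown` are only illustrative.

```python
# Throughput figures (triangles per second) copied from the results above.
# Assumption: every draw emits 2 triangles, so one frame is 16641 * 2 tris.
TRIS_PER_FRAME = 16641 * 2

vulkan = {
    "single vkCmdDrawIndexed": 16e6,
    "16641 x vkCmdDrawIndexed": 1.45e6,
    "16641 x vkCmdDrawIndexedIndirect": 169e3,
    "vkCmdDrawIndexedIndirect, 16641 draws": 216e3,
    "vkCmdDrawIndexedIndirectCount, 16641 draws": 216e3,
}
gles = {
    "single glDrawElements": 16e6,
    "16641 x glDrawElements": 5.8e6,
    "16641 x glDrawElementsIndirect": 4.8e6,
}


def frame_ms(tris_per_sec):
    """Time to render one frame, in milliseconds, at the given throughput."""
    return TRIS_PER_FRAME / tris_per_sec * 1000.0


def slowdown(baseline, measured):
    """How many times slower `measured` is than `baseline`."""
    return baseline / measured


for name, rate in {**vulkan, **gles}.items():
    print(f"{name:<45} {frame_ms(rate):8.2f} ms/frame")

# Multi-draw indirect in Vulkan compared with per-draw indirect in OpenGL ES.
ratio = slowdown(gles["16641 x glDrawElementsIndirect"],
                 vulkan["vkCmdDrawIndexedIndirect, 16641 draws"])
print(f"Vulkan multi-draw indirect is about {ratio:.1f}x slower than GLES indirect")
```

Under that assumption, the multi-draw indirect path needs well over 100 ms per frame, while the per-draw OpenGL ES indirect path stays in single-digit milliseconds.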
https://drive.google.com/file/d/18uOBLEjQXwMa7JV-1jZUK5iVkB1FWwX1/view
https://drive.google.com/file/d/1EMlW_h4c7MCT6jDvyaCnsvGWY8-JMn40/view
You were the chosen one! It was said that you would destroy the Sith, not join them! Bring balance to the Force, not leave it in darkness!
Thank you!
Adreno 610 (Android 10):
OpenGLES app with 16641 DIPs (32k tris) per frame:
Single call of glDrawElements: 7.35M tris/sec
16641 calls of glDrawElements: 3.4M tris/sec
16641 calls of glDrawElementsIndirect: 2.83M tris/sec
Vulkan app with 16641 DIPs (32k tris) per frame:
Single call of vkCmdDrawIndexed: 7M tris/sec
16641 calls of vkCmdDrawIndexed: 1.12M tris/sec
16641 calls of vkCmdDrawIndexedIndirect: 83.56K tris/sec
Single call of vkCmdDrawIndexedIndirect with 16641 draws: 97.2K tris/sec
Single call of vkCmdDrawIndexedIndirectCount with 16641 draws from buffer: 97.2K tris/sec
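For anyone trying to reproduce the multi-draw cases, this is a minimal sketch of what the indirect buffer contents could look like. It is not the code from the test apps. It mirrors the field order of VkDrawIndexedIndirectCommand (indexCount, instanceCount, firstIndex, vertexOffset, firstInstance) and assumes 6 indices (2 triangles) per draw, with all draws reading consecutive ranges of one shared index buffer. Python is used here only to show the byte layout; `build_indirect_buffer` is a hypothetical helper.

```python
import struct

# Little-endian, no padding: uint32 indexCount, uint32 instanceCount,
# uint32 firstIndex, int32 vertexOffset, uint32 firstInstance.
CMD = struct.Struct("<IIIiI")

DRAW_COUNT = 16641
INDICES_PER_DRAW = 6  # assumption: 2 triangles per draw


def build_indirect_buffer(draw_count=DRAW_COUNT):
    """Pack one indirect command per draw into a contiguous byte buffer."""
    buf = bytearray(CMD.size * draw_count)
    for i in range(draw_count):
        CMD.pack_into(
            buf, i * CMD.size,
            INDICES_PER_DRAW,      # indexCount
            1,                     # instanceCount
            i * INDICES_PER_DRAW,  # firstIndex: this draw's slice of the index buffer
            0,                     # vertexOffset
            0,                     # firstInstance
        )
    return bytes(buf)


data = build_indirect_buffer()
print(CMD.size, len(data))
```

These bytes would be uploaded to a buffer created with VK_BUFFER_USAGE_INDIRECT_BUFFER_BIT, and the single-call case would then pass drawCount = 16641 and stride = 20 to vkCmdDrawIndexedIndirect. A drawCount above 1 requires the multiDrawIndirect device feature and must not exceed the maxDrawIndirectCount limit.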