These tile-based deferred rendering GPUs and their drivers have problems with VBOs that are updated frequently via either glBufferSubData or glBufferData. The updates trigger shadow-copy code paths in the driver, which lead to poor performance and potentially to OOM errors with content that uses a lot of dynamic vertex data.
The proposed fix is to have the GPU process server internally handle VBOs marked STREAM and/or DYNAMIC as client-side vertex arrays, transparently to the command buffer client. In standalone Skia testing with content that closely mirrors the GUIMark2 Vector test, client-side vertex arrays have shown a 40-50% reduction in frame time and eliminated the OOM errors.
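To make the idea concrete, here is a rough sketch of how buffer updates might be routed. It is only an illustration under stated assumptions, not the actual GPU process code: `EmulatedBuffer`, `use_client_side_array`, `emulated_buffer_data` and `emulated_buffer_sub_data` are hypothetical names, and the real GL calls are left as comments so the snippet builds without GL headers. The usage token values are the ones defined by OpenGL ES 2.0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Usage tokens with their OpenGL ES 2.0 values, repeated here so the
 * sketch compiles without GL headers. */
#define USAGE_STREAM_DRAW  0x88E0
#define USAGE_STATIC_DRAW  0x88E4
#define USAGE_DYNAMIC_DRAW 0x88E8

/* Hypothetical service-side record for one VBO name. */
typedef struct {
    unsigned usage;
    size_t size;
    void *client_copy; /* non-NULL when emulated as a client-side array */
} EmulatedBuffer;

/* STREAM and DYNAMIC buffers are the ones assumed to be updated often. */
static bool use_client_side_array(unsigned usage) {
    return usage == USAGE_STREAM_DRAW || usage == USAGE_DYNAMIC_DRAW;
}

/* Mirrors glBufferData. Returns false if the CPU copy cannot be allocated. */
static bool emulated_buffer_data(EmulatedBuffer *b, const void *data,
                                 size_t size, unsigned usage) {
    assert(b != NULL);
    free(b->client_copy);
    b->client_copy = NULL;
    b->usage = usage;
    b->size = size;
    if (!use_client_side_array(usage) || size == 0) {
        /* A real service would forward to glBufferData here. */
        return true;
    }
    b->client_copy = malloc(size);
    if (b->client_copy == NULL) {
        b->size = 0;
        return false;
    }
    if (data != NULL) {
        memcpy(b->client_copy, data, size);
    }
    return true;
}

/* Mirrors glBufferSubData. Rejects ranges outside the buffer. */
static bool emulated_buffer_sub_data(EmulatedBuffer *b, size_t offset,
                                     const void *data, size_t size) {
    assert(b != NULL && data != NULL);
    if (offset > b->size || size > b->size - offset) {
        return false;
    }
    if (b->client_copy != NULL) {
        /* Plain memory write: no driver-side shadow copy is involved. */
        memcpy((char *)b->client_copy + offset, data, size);
    }
    /* Otherwise a real service would forward to glBufferSubData. */
    return true;
}
```

At draw time, an emulated buffer would presumably be supplied by binding buffer 0 to GL_ARRAY_BUFFER and passing `client_copy` plus the attribute offset as the pointer argument of glVertexAttribPointer, so the command buffer client never sees the difference.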
Some of the other ideas that have been tried, but either didn't fix the OOMs or had other problems:
*Tried all 3 usage params to glBufferData. [No difference]
*Deleting the old VBO ID and using a new VBO ID rather than calling glBufferData with a recycled ID. [OOM fixed, but there's a huge hitch on Mali/Adreno when deleted VBOs are reclaimed by the driver]
*Using a ring with various numbers of VBOs to give the driver time to have fully consumed a VBO before it is reused.
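For reference, the ring idea in the last item can be sketched roughly as follows. This is a minimal illustration, assuming the VBO names were already created with glGenBuffers; `VboRing`, `vbo_ring_init`, `vbo_ring_acquire` and the capacity `VBO_RING_MAX` are hypothetical names, and the sketch says nothing about which ring sizes were actually tested.

```c
#include <assert.h>
#include <stddef.h>

#define VBO_RING_MAX 8 /* hypothetical upper bound on ring size */

/* Fixed set of VBO names that are handed out in rotation. */
typedef struct {
    unsigned ids[VBO_RING_MAX];
    size_t count;
    size_t next;
} VboRing;

/* Copies the caller's VBO names into the ring and starts at slot 0. */
static void vbo_ring_init(VboRing *ring, const unsigned *ids, size_t count) {
    assert(ring != NULL && ids != NULL);
    assert(count > 0 && count <= VBO_RING_MAX);
    for (size_t i = 0; i < count; ++i) {
        ring->ids[i] = ids[i];
    }
    ring->count = count;
    ring->next = 0;
}

/* Returns the VBO to fill for the next upload. A name is reused only
 * after every other name in the ring has been handed out once. */
static unsigned vbo_ring_acquire(VboRing *ring) {
    assert(ring != NULL && ring->count > 0);
    unsigned id = ring->ids[ring->next];
    ring->next = (ring->next + 1) % ring->count;
    return id;
}
```

Each upload would bind the name returned by `vbo_ring_acquire` before calling glBufferData or glBufferSubData, so a larger ring gives the driver more draws to finish with a buffer before it is overwritten.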
It afflicts Adreno devices as well as Mali.
glBufferSubData/glMapBufferRange stall driver
Posted: Mon, 2014-05-19 17:00
Thanks for the write-up on your performance issues with glBufferSubData and glMapBufferRange.
1) Are these specific to an Adreno Driver and/or device?
2) Is there any sample app or benchmark you can refer to that demonstrates the performance issues or Out of Memory messages?
thanks...
I'll generate some examples for this once I get off of vacation this weekend.