I am trying to use UBOs in my current framework and I am seeing a performance drop with them compared to glUniform* calls.
My implementation looks like this:
Let's say we have 5 meshes in a scene. For each mesh I upload a fixed number of matrices, let's say 4, even if the mesh doesn't use all of them. The matrices of all 5 meshes lie next to each other in one array. I then pass the starting index of each mesh's matrices to the shader: for the first object the index is 0, for the 2nd it is 5, for the 3rd it is 10, and so on.
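To make the layout concrete, here is a minimal sketch of the index arithmetic, assuming each matrix is a 64-byte mat4 in std140 layout. The names `Mat4`, `mesh_base_index` and `mesh_byte_offset` are made up for illustration and are not from my framework; the per-mesh stride is a parameter, and a stride of 5 gives the 0, 5, 10 start indices mentioned above.

```c
#include <stddef.h>

/* One 4x4 float matrix: 16 floats = 64 bytes, the size of a mat4
   array element in a std140 uniform block (assumed layout). */
typedef struct { float m[16]; } Mat4;

/* Hypothetical helper: first matrix slot reserved for a mesh.
   'stride' is the number of matrix slots reserved per mesh. */
static int mesh_base_index(int mesh, int stride)
{
    return mesh * stride;
}

/* Hypothetical helper: byte offset of that slot inside the buffer. */
static size_t mesh_byte_offset(int mesh, int stride)
{
    return (size_t)mesh_base_index(mesh, stride) * sizeof(Mat4);
}
```

The value returned by `mesh_base_index` is what gets passed to the shader as the per-mesh index; `mesh_byte_offset` is where that mesh's matrices start in the mapped buffer on the CPU side.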
My application runs at 44-52 fps (the fps varies as the light rotates) with glUniform* calls. I am trying this experiment only with the transform UBOs; I am not using UBOs for the light.
Here are a few cases I tried:
1. Use the same UBO in each frame, with/without unsynchronized maps, and update the UBO for the first few frames only. Even then my application runs at 42-47 fps. I have no idea why the fps is lower even when I am not uploading UBO data in each frame.
2. Use 2 UBOs, with/without unsynchronized maps, and update the UBOs in each frame. The fps is 42-47.
I am orphaning the buffer in both cases.
What is puzzling is that the only difference is the transform UBOs, and even when I don't update them I get a lower fps. So is it due to data access in the shader? Cache misses? Or just the binding of the UBO?
I tried taking the binds out, but that didn't help. It looks like the bottleneck lies in accessing the data in the shader. I am passing a "uniform int index" for each mesh in the scene to offset into the matrix array.
I size the UBO at its max capacity, so even if I don't use the whole buffer I still allocate that much memory. Even if I allocate only minimal memory (12800 bytes) for the buffer, it doesn't improve. I was wondering whether I was hitting a register size limit.
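For reference, a rough sketch of the size arithmetic behind that 12800-byte figure, again assuming 64-byte mat4 entries. The helper names are made up for illustration. 16384 bytes is the minimum that OpenGL ES 3.0 guarantees for GL_MAX_UNIFORM_BLOCK_SIZE; the real limit on the device would have to be queried with glGetIntegerv, and this says nothing about how the driver maps the block to registers.

```c
/* Size of one mat4 array element in std140 layout (assumed). */
#define MAT4_BYTES 64u

/* Minimum value of GL_MAX_UNIFORM_BLOCK_SIZE guaranteed by OpenGL ES 3.0.
   The actual device limit may be larger and should be queried at runtime. */
#define ES30_MIN_UNIFORM_BLOCK_BYTES 16384u

/* Hypothetical helper: how many mat4 entries fit in a buffer. */
static unsigned mat4_capacity(unsigned buffer_bytes)
{
    return buffer_bytes / MAT4_BYTES;
}

/* Hypothetical helper: does a block of this size fit under a given limit? */
static int fits_in_block(unsigned buffer_bytes, unsigned max_block_bytes)
{
    return buffer_bytes <= max_block_bytes;
}
```

By this arithmetic, 12800 bytes holds 200 matrices and is below the 16384-byte guaranteed minimum, so the block size limit itself should not be what I am hitting.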
When I removed one of the models from the scene, the one that has the light with it, the UBO path (with a chain of UBOs) performs better than glUniform. The interesting thing to note is that in this case my UBO size is 12800 bytes, the same as above.
I have now disabled the updates to the UBO after a certain number of frames, just to eliminate the other bottleneck.
Adreno 530 - G9350.