Which metrics to capture?
Posted: Thu, 2020-06-25 12:14
I am profiling an application on an Oculus Quest (Adreno 540). I want to improve the framerate of the application and I have determined with Quest performance tools that the application is GPU-limited.
I am learning to use the Snapdragon Profiler and I understand how to take a trace. However, when I trace, there are many dozens of metrics I can choose from. The user guide lists the names of the metrics but does not explain exactly what they do. Put simply, I do not know which metrics are the "useful" ones. For some of them, such as "GPU general->Clocks->Number of GPU clocks", I do not even understand what is being measured.
How should I proceed? I can imagine various things that might be taking a lot of time (my meshes are too big, I am doing too much alpha shading, I need to coalesce into fewer draw calls, my shaders are too complicated, I use a compute shader and switching compute->draw->compute takes a long time, I use a stencil buffer and maybe this is forcing tiling flushes). Are there ways I could use the metrics in trace mode to tell which of the above scenarios might be happening, or what GPU activities are taking up the most time? Which metrics?*
I see in the "snapshot view" I can get a listing of my draw calls; is there any way to tell what percentage of frame render time on the GPU came from each of these draw calls, or to line up in the Trace view which draw call is being processed by the GPU at a given moment in the trace?
If I trace "OpenGL ES->Rendering Stages" I get a very complicated diagram with labels such as "IB1 start markers", "Flush markers", "Render", "Blit" and "Binning". How do I interpret this diagram?
I see the website has a couple videos on simple usage of the profiler. Are there any videos or blog posts about how to use the profiler in a real world optimization workflow?
I am using profiler 2020.1.0 on Mac OS X 10.13.6.
* Percent Time Shading Fragments and Percent Time Shading Vertices seem potentially useful, but when I measure them they only bounce between 100% and 0% on a microsecond-to-microsecond basis, no matter which averaging options I pick, so I cannot get actionable data out of them.
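To make the footnote concrete, this is roughly the kind of averaging I was hoping to get: collapse the raw samples into fixed-width time windows (for example, one frame long) and look at the mean per window. The sketch below is only an illustration in Python. It assumes the samples have somehow been obtained as (timestamp in microseconds, value) pairs that are roughly evenly spaced; the function name `window_average` and the sample data are hypothetical and are not part of the Snapdragon Profiler.

```python
from collections import defaultdict


def window_average(samples, window_us):
    """Average (timestamp_us, value) samples into fixed-width time windows.

    Returns a list of (window_start_us, mean_value) pairs sorted by time.
    Assumes samples are roughly evenly spaced, so a plain mean is a fair
    estimate of the time-weighted average inside each window.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for timestamp_us, value in samples:
        # Snap each timestamp down to the start of its window.
        start = (timestamp_us // window_us) * window_us
        sums[start] += value
        counts[start] += 1
    return [(start, sums[start] / counts[start]) for start in sorted(sums)]


# Hypothetical samples: a metric that flips between 100% and 0%.
samples = [(0, 100.0), (5, 0.0), (10, 100.0), (15, 100.0), (20, 0.0), (25, 0.0)]
averaged = window_average(samples, window_us=10)
print(averaged)
```

With a window close to the frame time, a metric that flips between 100% and 0% would instead show up as a percentage of each frame, which is the number I would actually want to compare between builds.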