Remove Unused Render Targets
Graphics memory (GMEM) is a precious resource inside the Qualcomm® Adreno™ GPU. The GPU generates tiles based on frame buffer size, then reconstructs surfaces in main memory by resolving those tiles. This resolve operation is known as a GMEM Store. More render targets mean more tiles, which means more GMEM Store operations and greater potential for lost performance.
A suitable analogy is that GMEM is like a high-speed L1 cache for the GPU. Loading anything into that cache is expensive and should be avoided unless necessary. Storing anything from that cache into non-tiled memory is also expensive and should be avoided unless necessary.
In Snapdragon Profiler's Trace Capture mode, you can enable the Rendering Stages metric, which shows GMEM Stores in their own track.
The screenshot below is based on the Depth of Field demo application. The purple blocks at the bottom illustrate that GMEM Stores for Depth Stencil and Color are taking place as three different surfaces (the downscale and blur frame buffer objects) are rendered:
Clicking on those surfaces opens the Inspector view, with the properties for the surface block. As shown below, Color, Depth and Stencil attachments will appear in this view if they are present:
The Inspector view is a quick way to traverse all the drawn surfaces and find the ones with attachments that aren’t needed. Here, the view for this frame indicates that the downscale and blur frame buffer objects both have Depth and Stencil attachments that are not needed. The passes gain nothing from them; the attachments are merely being worked on and carried along.
(“Render target” is a term used in high-level graphics APIs. “Binning” is the process by which the Adreno GPU divides a render target of a specific size into one or more bins that can then be worked on by hardware targeted at GMEM. A tile of GMEM should be roughly equivalent to one bin.)
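As a rough illustration of what removing the unneeded attachments can look like, the sketch below builds a frame buffer object with a Color attachment only. It assumes the application uses OpenGL ES 3.0 and that the pass (for example, a downscale or blur pass) never depth-tests or stencil-tests; the demo's actual code is not shown here, and the helper name `create_color_only_fbo` is hypothetical.

```c
#include <GLES3/gl3.h>

/* Hypothetical helper: creates an FBO with a single RGBA8 color texture and
 * no depth or stencil attachment. Requires a current OpenGL ES 3.0 context.
 * Returns the FBO name, or 0 if the frame buffer is incomplete. */
GLuint create_color_only_fbo(GLsizei width, GLsizei height, GLuint *out_color_tex)
{
    GLuint fbo = 0;
    GLuint tex = 0;

    /* Color target for the pass. */
    glGenTextures(1, &tex);
    glBindTexture(GL_TEXTURE_2D, tex);
    glTexStorage2D(GL_TEXTURE_2D, 1, GL_RGBA8, width, height);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);

    glGenFramebuffers(1, &fbo);
    glBindFramebuffer(GL_FRAMEBUFFER, fbo);
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,
                           GL_TEXTURE_2D, tex, 0);

    /* Deliberately no GL_DEPTH_ATTACHMENT or GL_STENCIL_ATTACHMENT:
     * with nothing attached, there is no depth/stencil data to store. */

    if (glCheckFramebufferStatus(GL_FRAMEBUFFER) != GL_FRAMEBUFFER_COMPLETE) {
        glBindFramebuffer(GL_FRAMEBUFFER, 0);
        glDeleteFramebuffers(1, &fbo);
        glDeleteTextures(1, &tex);
        return 0;
    }

    glBindFramebuffer(GL_FRAMEBUFFER, 0);
    *out_color_tex = tex;
    return fbo;
}
```

Passes that render into such an object should also leave depth and stencil testing disabled, since there is no buffer for those tests to read or write.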
After optimizing the code for the selected surfaces, you can validate the results in Trace Capture mode:
The GMEM Stores for Depth and Stencil stop, so the rendering for those three surfaces has fewer interruptions. The Inspector view for those surfaces still shows the Color attachment, but the Depth and Stencil attachments are now absent:
Also, the number of bins rendered has dropped from 8 to 4. (The 8-to-4 reduction in this example is specific to this particular app and workload. In general, it is not easy, and may not even be possible, to predict the exact reduction, since several platform-specific variables are at work, including the size and configuration of GMEM tiles and the particulars of the binning process.)
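To build intuition for why dropping attachments can reduce the bin count, here is a deliberately simplified estimate. It is a toy model, not the Adreno binning algorithm: it assumes every attachment of a bin must fit in GMEM at once, ignores alignment and hardware constraints, and uses hypothetical numbers (a 1 MiB GMEM, a 1024x1024 surface, 4 bytes per pixel for Color and 4 for combined Depth and Stencil). The function name `estimate_bins` is also hypothetical.

```c
#include <stdint.h>

/* Toy estimate: how many GMEM-sized pieces a surface splits into when all of
 * its attachments (bytes_per_pixel in total) must fit in GMEM together.
 * This is NOT how real hardware computes bins; it only shows the trend. */
uint64_t estimate_bins(uint32_t width, uint32_t height,
                       uint32_t bytes_per_pixel, uint64_t gmem_bytes)
{
    if (gmem_bytes == 0) {
        return 0; /* no meaningful answer without a GMEM size */
    }
    uint64_t surface_bytes = (uint64_t)width * height * bytes_per_pixel;
    /* Ceiling division: a partially filled bin still counts as a bin. */
    return (surface_bytes + gmem_bytes - 1) / gmem_bytes;
}
```

Under these assumed numbers, `estimate_bins(1024, 1024, 8, 1u << 20)` gives 8 for Color plus Depth and Stencil, and `estimate_bins(1024, 1024, 4, 1u << 20)` gives 4 for Color alone. The inputs were chosen to illustrate the halving; they are not measurements from the demo.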
In this case, render time for the surface decreased from 601 ms to 475 ms, shaving about 5 percent off overall frame time.