Improving Foveated Rendering with the Fragment Density Map Offset Extension for Vulkan

Tuesday 8/9/22 08:15am
Posted By Jonathan Tinkham

Qualcomm products mentioned within this post are offered by
Qualcomm Technologies, Inc. and/or its subsidiaries.

Co-written by Jonathan Wicks and Sam Holmes

In The Evolution of High Performance Foveated Rendering on Adreno, we detailed how foveated rendering reduces the rendering workload in XR devices. We introduced Vulkan and OpenGL extensions and recent improvements in the Snapdragon XR2 platform that reduce bandwidth, increase performance, and reduce power consumption.

Let’s see how our newly developed fragment density map (FDM) offset extension for Vulkan provides more control to help you achieve better visual quality for foveated rendering, while maintaining framerate performance.

Optimizing Rendering
To understand how this new extension works, let’s first review how rendering is typically optimized on a mobile device. A typical mobile game is played at arm’s length and runs at about 60 FPS, leaving under 17 ms (1000 ms / 60) to render each frame. While some framerate drops are acceptable on this medium, players still demand the best overall framerates possible.

One technique to satisfy this is tile-based rendering, like that used on our Qualcomm Adreno GPU. It divides a frame into tiles, where each is rendered in sequence to high-speed memory on the GPU and then transferred to system memory for sampling and final display. A key part of this process is binning, where the geometry to render is mapped to the tiles so that each tile renders the appropriate pixels.
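As a rough illustration of the binning step described above, the sketch below maps a primitive's screen-space bounding box to the tiles it overlaps. The Rect type, the binBoundingBox name, and the tile size are assumptions made for this example. On Adreno, binning is performed by the GPU and driver and is not exposed as an API like this.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical type for illustration: inclusive pixel bounds of a primitive.
struct Rect { int32_t minX, minY, maxX, maxY; };

// Returns the linear (row-major) indices of every tile the box touches.
// tileSize, fbWidth and fbHeight are in pixels.
std::vector<uint32_t> binBoundingBox(const Rect& box, int32_t tileSize,
                                     int32_t fbWidth, int32_t fbHeight) {
    assert(tileSize > 0 && fbWidth > 0 && fbHeight > 0);
    const int32_t tilesX = (fbWidth + tileSize - 1) / tileSize;  // tiles per row

    std::vector<uint32_t> tiles;
    // Boxes entirely outside the framebuffer land in no tile.
    if (box.maxX < 0 || box.maxY < 0 || box.minX >= fbWidth || box.minY >= fbHeight) {
        return tiles;
    }
    // Clamp to the framebuffer, then convert pixel bounds to tile bounds.
    const int32_t x0 = std::max(box.minX, 0) / tileSize;
    const int32_t y0 = std::max(box.minY, 0) / tileSize;
    const int32_t x1 = std::min(box.maxX, fbWidth - 1) / tileSize;
    const int32_t y1 = std::min(box.maxY, fbHeight - 1) / tileSize;
    for (int32_t ty = y0; ty <= y1; ++ty) {
        for (int32_t tx = x0; tx <= x1; ++tx) {
            tiles.push_back(static_cast<uint32_t>(ty * tilesX + tx));
        }
    }
    return tiles;
}
```

A box that straddles a tile corner is assigned to all four neighboring tiles, which is why geometry near tile boundaries is processed more than once in a tiled renderer.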

In XR, additional factors raise the risk of framerate drops and make them more noticeable. XR devices bring the viewports close to the user’s eyes, and stereoscopic rendering involves one frame per eye, typically at 90 FPS. This leaves about 11 ms to render the scene twice, from two angles. In addition, framerate drops must be avoided to help prevent cybersickness. XR therefore requires its own set of techniques to maximize rendering performance.
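The frame budgets quoted above follow directly from the refresh rate. The sketch below works through the arithmetic. The function names are made up for this example, and the per-view figure assumes the two views are rendered one after the other, which is a simplification because techniques such as multiview can share work between eyes.

```cpp
#include <cassert>

// Time available per frame, in milliseconds, at a given refresh rate.
constexpr double frameBudgetMs(double fps) { return 1000.0 / fps; }

// Budget per view under the simplifying assumption that views are rendered
// sequentially and split the frame time evenly.
double perViewBudgetMs(double fps, int viewCount) {
    assert(fps > 0.0 && viewCount > 0);
    return frameBudgetMs(fps) / viewCount;
}
```

With these definitions, frameBudgetMs(60.0) is about 16.7 ms, frameBudgetMs(90.0) is about 11.1 ms, and perViewBudgetMs(90.0, 2) is about 5.6 ms per eye.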

One common approach is foveated rendering. Foveation is often discussed in the context of eye tracking, but eye tracking is not required. In fact, foveation can be very beneficial for countering the over-rendering that can occur in the periphery due to the characteristics of the lens. Adreno foveated rendering provides control over the number of pixels shaded per tile and can focus shading on the areas that matter most. This results in large savings on the fragment end of the graphics pipeline.

This functionality is available in Vulkan and OpenGL via the VK_EXT_fragment_density_map (FDM) and QCOM_texture_foveated_subsampled_layout extensions respectively.

Adding an Offset to Provide Finer Control
For static foveation maps, this works well because the map is written once and used repeatedly. However, if the map needs to be updated often, generating the maps can require additional CPU time, and the alignment of the map to the tile grid can cause areas to pop back and forth between different quality levels.

Animation showing a visualization of how the foveated area can appear to pop as the focal point moves.

To address this, we published the VK_QCOM_fragment_density_map_offset extension for Vulkan. This extension allows you to shift the foveation pattern by fine-grained amounts without updating the underlying FDM attachment. In other words, it changes the framebuffer location where density values are applied without having to regenerate the FDM. This effectively moves the content behind the tile grid, so the fovea region’s position can be updated quickly and smoothly in applications such as eye tracking.

The following diagram illustrates how our new extension enables smooth control over the foveation map:

Animation showing how FDM offsetting improves translation of the foveated area as the focal point moves.

Usage of the Extension
VK_QCOM_fragment_density_map_offset requires the aforementioned VK_EXT_fragment_density_map extension, and either Vulkan 1.2 or the VK_KHR_create_renderpass2 extension to access the VkSubpassEndInfo struct. Hardware-wise, it targets and has been optimized for the XR2 platform, but it is available from the Snapdragon 865 mobile platform onwards with updated drivers.
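The requirements above can be checked before enabling the extension. The following is a minimal sketch that works on a plain list of extension name strings. The hasFdmOffsetSupport function is hypothetical, and it assumes the application has already collected the device's extension names (for example with vkEnumerateDeviceExtensionProperties) and knows whether the device reports Vulkan 1.2 or later.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: returns true if the requirements listed in the text
// are met, given the device's extension names and whether the device
// supports Vulkan 1.2 or later.
bool hasFdmOffsetSupport(const std::vector<std::string>& available, bool hasVulkan12) {
    auto has = [&available](const char* name) {
        return std::find(available.begin(), available.end(), name) != available.end();
    };
    // VkSubpassEndInfo comes from Vulkan 1.2 core or VK_KHR_create_renderpass2.
    const bool hasRenderPass2 = hasVulkan12 || has("VK_KHR_create_renderpass2");
    return has("VK_QCOM_fragment_density_map_offset") &&
           has("VK_EXT_fragment_density_map") &&
           hasRenderPass2;
}
```

If the check fails, an application can fall back to a static FDM, or to regenerating the map when the focal point moves.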

Offsets are specified when ending the last subpass of a renderpass, so the offset is updated as late as possible to reduce latency. This is accomplished by populating the VkSubpassFragmentDensityMapOffsetEndInfoQCOM structure with the desired offsets and passing it to vkCmdEndRenderPass2 through the pNext chain of the VkSubpassEndInfo struct. VkSubpassFragmentDensityMapOffsetEndInfoQCOM takes an array of VkOffset2D values, where each describes the desired offset of the FDM in the X and Y directions. A VkOffset2D must be provided for each layer of the FDM used in the renderpass, and each offset applies to the corresponding layer. The following code shows how to set this up:

VkOffset2D offsets[2] = {};
// Fill in x and y for each layer, e.g. offsets[0].x = …
VkSubpassFragmentDensityMapOffsetEndInfoQCOM offsetInfo = {
 VK_STRUCTURE_TYPE_SUBPASS_FRAGMENT_DENSITY_MAP_OFFSET_END_INFO_QCOM, // sType
 nullptr, // pNext
 2, // fragmentDensityOffsetCount; 1 for each layer/multiview view
 offsets, // pFragmentDensityOffsets; aligned to fragmentDensityOffsetGranularity
};
VkSubpassEndInfo subpassEndInfo = {};
subpassEndInfo.sType = VK_STRUCTURE_TYPE_SUBPASS_END_INFO;
subpassEndInfo.pNext = &offsetInfo;
// Only offsets given to the last subpass are used for the whole renderpass
// Offsets given in other subpasses are ignored
vkCmdEndRenderPass2(commandBuffer, &subpassEndInfo); // commandBuffer is the VkCommandBuffer being recorded

Note: There is an alignment restriction for the offset values. For current Adreno GPUs, particularly in the Snapdragon 865, this is 8 x 8 pixels in screen/framebuffer space. Rather than hard-coding this value, fetch the limit at runtime by querying the physical device properties: chain a VkPhysicalDeviceFragmentDensityMapOffsetPropertiesQCOM struct to VkPhysicalDeviceProperties2 and call vkGetPhysicalDeviceProperties2(). The fragmentDensityOffsetGranularity field (a VkExtent2D) will be filled with the supported granularity in the X and Y directions, and offsets must be aligned to this granularity. The following code shows how to get this information at runtime:

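VkPhysicalDeviceFragmentDensityMapOffsetPropertiesQCOM offsetProperties = {};
offsetProperties.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_FRAGMENT_DENSITY_MAP_OFFSET_PROPERTIES_QCOM;
VkPhysicalDeviceProperties2 properties = {};
properties.sType = VK_STRUCTURE_TYPE_PHYSICAL_DEVICE_PROPERTIES_2;
properties.pNext = &offsetProperties;
vkGetPhysicalDeviceProperties2(pDevice, &properties); // pDevice is the VkPhysicalDevice
// align() is an application-provided helper that snaps a value to a multiple of the granularity
offsets[0].x = align(offsets[0].x, offsetProperties.fragmentDensityOffsetGranularity.width);
offsets[0].y = align(offsets[0].y, offsetProperties.fragmentDensityOffsetGranularity.height);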
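The snippet above relies on an align helper that is not part of Vulkan. One possible implementation is sketched below under the name alignOffset, which is a hypothetical name. It snaps a signed offset to the nearest multiple of the granularity, which is a design choice rather than a requirement. Rounding down or toward zero would also produce valid offsets.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: snaps a signed offset to the nearest multiple of
// granularity. Ties round toward positive infinity.
int32_t alignOffset(int32_t value, uint32_t granularity) {
    assert(granularity > 0);
    const int64_t g = static_cast<int64_t>(granularity);
    const int64_t shifted = static_cast<int64_t>(value) + g / 2;
    // C++ division truncates toward zero, so step down by one when the
    // remainder is negative to get a true floor division.
    int64_t q = shifted / g;
    if (shifted % g < 0) {
        --q;
    }
    return static_cast<int32_t>(q * g);
}
```

With a granularity of 8, an offset of 13 becomes 16, 11 becomes 8, and -5 becomes -8. Rounding to the nearest multiple keeps the applied offset within half a granule of the requested focal point shift.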

Lastly, all images used in the framebuffer MUST be created with the VK_IMAGE_CREATE_FRAGMENT_DENSITY_MAP_OFFSET_BIT_QCOM flag. This is required on all image attachments, including FDM attachments, depth buffers, color attachments, and resolve attachments.

Developer Tips
The sizing of the FDM doesn’t change with this extension, and the FDM must be sized to fit the framebuffer extent. Shifting the FDM requires that tile values are populated for the portion of the framebuffer no longer covered. The behavior of this extension is to clamp to the edge value of the FDM. This fits typical foveation patterns that have lower-quality values on the periphery. However, other strategies are possible, such as having a single-tile edge value of 0 to take advantage of the tile-exclusion behavior of Adreno foveation. This fills in missing portions with 0, thus causing the GPU to skip rendering of these portions of the screen.
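To make the clamp-to-edge behavior concrete, the sketch below models an FDM lookup on the CPU. It is a conceptual model only: the DensityMap type and the densityAt function are invented for this example, and the sign convention (a positive offset moves the map content toward +X/+Y in the framebuffer) is an assumption that should be checked against the extension specification.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU-side model of a single-layer fragment density map.
struct DensityMap {
    int32_t width;                // map width in texels
    int32_t height;               // map height in texels
    int32_t texelSize;            // framebuffer pixels covered per texel (assumed square)
    std::vector<uint8_t> texels;  // row-major density values
};

// Density applied at framebuffer pixel (x, y) when the map is shifted by
// (offsetX, offsetY). Lookups that fall outside the map clamp to the edge.
uint8_t densityAt(const DensityMap& fdm, int32_t x, int32_t y,
                  int32_t offsetX, int32_t offsetY) {
    assert(fdm.width > 0 && fdm.height > 0 && fdm.texelSize > 0);
    assert(fdm.texels.size() == static_cast<std::size_t>(fdm.width) * fdm.height);
    // Truncating division is sufficient here: any negative coordinate is
    // clamped to texel 0 in the next step.
    int32_t tx = (x - offsetX) / fdm.texelSize;
    int32_t ty = (y - offsetY) / fdm.texelSize;
    tx = std::min(std::max(tx, 0), fdm.width - 1);
    ty = std::min(std::max(ty, 0), fdm.height - 1);
    return fdm.texels[static_cast<std::size_t>(ty) * fdm.width + tx];
}
```

In this model, shifting a map whose outermost texels are 0 makes every uncovered pixel read 0, which corresponds to the tile-exclusion strategy described above.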

Stay Focused with FDM Offsetting!
FDM offsetting is a great way to improve the visual performance of foveated rendering as the focal point moves. This is especially true for cases like eye-tracking, where quick shifts of the fovea demand quick changes to quality levels in periphery regions to ensure smooth shifting of detail on the edge of the fovea.

For more information about Vulkan, be sure to check out the following resources:

Qualcomm Adreno and Snapdragon are products of Qualcomm Technologies, Inc. and/or its subsidiaries.