Qualcomm products mentioned within this post are offered by
Qualcomm Technologies, Inc. and/or its subsidiaries.
Wikipedia’s definition of augmented reality (AR) includes terms like “sensory modalities” and “somatosensory”, but don’t let that terminology intimidate you. AR development really boils down to a few building blocks like real-time video capture, rendering, sensors, some form of input, and code to tie it together.
Let’s review some of the basics to help you get started.
Anchors and Scenes
A common feature of many AR apps is the ability for users to place and move virtual objects. This involves anchors: metadata that stores a virtual object's real-world position and orientation, often persisted across sessions. For example, a virtual object placed at a location in one session should appear in the same place in a subsequent session, even when viewed from a different position or orientation.
A related concept is the trackable: a point or plane to which anchors can be attached. For example, if a trackable is associated with a moving surface in a dynamic environment, all objects anchored to it reposition and reorient accordingly.
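The relationship between anchors and trackables can be sketched in a few lines of code. This is a minimal illustration, not a real AR framework API; frameworks like ARCore and OpenXR expose much richer versions of these constructs, and the names and fields below are assumptions for clarity (rotation is ignored to keep the example short).

```python
from dataclasses import dataclass

@dataclass
class Pose:
    position: tuple      # (x, y, z) in meters
    orientation: tuple   # quaternion (x, y, z, w)

@dataclass
class Trackable:
    """A detected point or plane in the physical world."""
    pose: Pose  # updated by the tracking system each frame

@dataclass
class Anchor:
    """A virtual object's attachment to a trackable."""
    trackable: Trackable
    local_offset: tuple  # position relative to the trackable

    def world_position(self):
        # When the trackable moves, every anchor attached to it
        # moves with it (rotation omitted for simplicity).
        tx, ty, tz = self.trackable.pose.position
        ox, oy, oz = self.local_offset
        return (tx + ox, ty + oy, tz + oz)

plane = Trackable(Pose((0.0, 0.0, 0.0), (0, 0, 0, 1)))
anchor = Anchor(plane, (0.5, 0.0, 0.2))
print(anchor.world_position())   # (0.5, 0.0, 0.2)

# The tracking system detects that the surface moved; the anchored
# virtual object follows automatically.
plane.pose = Pose((1.0, 0.0, 0.0), (0, 0, 0, 1))
print(anchor.world_position())   # (1.5, 0.0, 0.2)
```

The key design point is that the anchor never stores an absolute world position; it always derives one from its trackable, which is what keeps virtual objects glued to the physical world as tracking updates arrive.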
Scene (aka environment) understanding is another fundamental aspect of AR. It involves analyzing the user's physical environment to create a digital twin that maps the virtual world onto the physical world. This lets objects be anchored in space so users can move around them as if they were real. It also determines how the device (smartphone or headworn unit) is positioned and oriented relative to the environment. An app can analyze the surroundings once (e.g., when defining a limited area during startup) or continually build up and persist scene information over time as the user navigates.
There are two main approaches for scene understanding:
- Marker-based approaches identify visual features or markers captured by the camera. Image processing algorithms and computer vision techniques are often employed to detect features like corners or edges of objects.
- Markerless approaches use data from sensors such as the IMU's accelerometer and gyroscope, the compass, and GPS. This sensor data is often fused and exposed to the application developer through high-level API constructs.
Developers often use a hybrid of the two approaches. Together they can provide richer information or derive missing information. For example, when GPS is unavailable, sensor and visual data can approximate a device’s location from the last known GPS location.
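The GPS-outage example above amounts to dead reckoning: offsetting the last known fix using fused sensor data. Here is a rough sketch of that idea; the function name is an assumption, and the flat-earth meters-per-degree conversion is only adequate over short distances.

```python
import math

def estimate_position(last_fix, heading_deg, distance_m):
    """Approximate a new (lat, lon) by offsetting the last GPS fix
    with a heading (e.g., from the compass) and a travelled distance
    (e.g., integrated from accelerometer data)."""
    lat, lon = last_fix
    # Rough meters-per-degree conversion (flat-earth approximation).
    dlat = distance_m * math.cos(math.radians(heading_deg)) / 111_320
    dlon = distance_m * math.sin(math.radians(heading_deg)) / (
        111_320 * math.cos(math.radians(lat)))
    return (lat + dlat, lon + dlon)

# User walked ~50 m due east since the last GPS fix:
print(estimate_position((37.7749, -122.4194), 90.0, 50.0))
```

In practice the high-level AR APIs do this fusion (and much more, e.g., Kalman filtering) for you, but it helps to understand what is happening underneath.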
Related to scene understanding is depth understanding (aka depth estimation). This derives the distance to features or objects in the scene. With this information, virtual objects can interact with the environment (e.g., preventing the user from pushing a virtual object through a physical wall). It also facilitates occlusion calculations, where physical objects can cover virtual objects.
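Both uses of depth described above reduce to simple comparisons once you have per-pixel or per-object distances. A sketch, with illustrative names and thresholds:

```python
def is_occluded(scene_depth_m, virtual_depth_m):
    """True when a physical surface sits in front of the virtual
    object at this pixel, so the physical surface should cover it."""
    return scene_depth_m < virtual_depth_m

def clamp_to_wall(desired_depth_m, wall_depth_m, margin_m=0.05):
    """Prevent a user from pushing a virtual object through a
    physical wall: stop it a small margin in front of the wall."""
    return min(desired_depth_m, wall_depth_m - margin_m)

# A wall at 1.2 m covers a virtual object placed at 2.0 m:
print(is_occluded(1.2, 2.0))        # True
# User tries to push an object to 3.0 m, but the wall is at 2.5 m:
print(clamp_to_wall(3.0, 2.5))      # 2.45
```

Real depth APIs hand you a full depth map per frame, and the occlusion test runs per pixel in a shader rather than per object on the CPU, but the comparison is the same.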
Key AR Development Skills
AR blends rendering knowledge, art asset creation, and UI/UX design.
On the rendering side, AR requires knowledge of:
- Working in 3D space, including 3D math like vectors and matrices.
- Graphics pipelines to convert assets from art software packages to a format optimized for a given platform. Developing these tools often requires first-hand knowledge of a target platform, typically acquired through platform documentation (e.g., our Qualcomm Adreno GPU SDK’s documentation describes our GPU’s architecture).
- Shaders to implement special effects.
- Scene management to load/render only what’s required for the current viewport (i.e., the field of view provided by the user’s 2D screen or immersive headworn device).
You should be familiar with real-time, frame-based software architectures. For example, a typical game loop acquires user input, updates game logic based on that input, and then renders it accordingly. An AR loop adds sensor input collection and considers the physical world in the update and rendering phases.
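The loop structure described above can be sketched as follows. The Camera, Sensors, Tracker, Scene, and Renderer classes are placeholder stubs standing in for whatever your AR framework provides; only the per-frame sequence is the point.

```python
class Camera:
    def capture(self):
        return "frame"

class Sensors:
    def read(self):
        return {"accel": (0.0, 0.0, 9.8), "gyro": (0.0, 0.0, 0.0)}

class Tracker:
    def update(self, frame, imu):
        return (0.0, 0.0, 0.0)  # estimated device pose

class Scene:
    def update(self, pose, user_input):
        pass

class Renderer:
    def draw(self, scene, frame, pose):
        pass

def ar_loop(frames=3):
    """Run the AR loop for a fixed number of frames
    (a real app loops until the user exits)."""
    cam, sensors, tracker = Camera(), Sensors(), Tracker()
    scene, renderer = Scene(), Renderer()
    rendered = 0
    for _ in range(frames):
        frame = cam.capture()                 # 1. real-time video capture
        imu = sensors.read()                  # 2. collect sensor input
        pose = tracker.update(frame, imu)     # 3. locate device in the world
        scene.update(pose, user_input=None)   # 4. game-style logic update
        renderer.draw(scene, frame, pose)     # 5. composite virtual over physical
        rendered += 1
    return rendered

print(ar_loop())  # 3
```

Steps 2 and 3 are what distinguish this from a conventional game loop: the physical world feeds into both the update and the rendering phases every frame.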
Real-time architectures require real-time debugging techniques. Remote debugging allows the AR viewport to remain visible on the device, reserving the development machine for code, breakpoints, etc. Temporary debug overlays are also useful for displaying everything from frame rates to names of objects in the current scene.
Art Asset Considerations
Art assets provide the visuals in AR, so bring artists skilled in both 2D and 3D graphics onto the team. They can include character modelers and animators, object modelers, UI designers, and texture artists.
2D art assets can include imagery and textures for signage, information boards, virtual UIs, as well as heads-up displays (HUDs) that remain fixed on screen. Textures are also used for effects like particle systems (e.g., water effects, smoke, etc.).
3D art assets include objects, characters, and environmental models that augment the surroundings. Model rigs can be created to procedurally animate or use streams of animation data.
Here are a few general considerations for creating art assets:
- Review how virtual objects look in AR. Since these objects are rendered over the physical world, their size should be appropriate for the surrounding environment. This topic from Google discusses the need to fit object sizes to their environments.
- Use multiple levels of detail to balance performance and realism. Use complex models and detailed textures when the extra rendering load is warranted to show details of close-up objects. As the user moves away, switch to less detailed models to reduce the rendering load.
- Consider adding physically based rendering (PBR). PBR simulates how light reflects off different materials, as in the real world. This improves realism and helps virtual objects blend in with the physical world.
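The level-of-detail switching described in the list above is typically just a distance-threshold lookup. A minimal sketch, with illustrative mesh names and thresholds (engines like Unity and Unreal provide built-in LOD groups that do this for you):

```python
# Ordered from most to least detailed; thresholds are illustrative.
LODS = [
    (2.0, "statue_high.mesh"),     # within 2 m: full detail
    (8.0, "statue_medium.mesh"),   # within 8 m: reduced detail
    (float("inf"), "statue_low.mesh"),  # beyond that: cheapest model
]

def select_lod(distance_m, lods=LODS):
    """Pick the mesh for the first distance band the object falls in."""
    for max_dist, mesh in lods:
        if distance_m <= max_dist:
            return mesh
    return lods[-1][1]

print(select_lod(1.0))    # statue_high.mesh
print(select_lod(20.0))   # statue_low.mesh
```

In production you would also hysterese the thresholds (switch down at a slightly different distance than you switch up) to avoid visible popping when the user hovers near a boundary.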
UI and UX Considerations
Many of the gestures used in today's 2D mobile apps (e.g., taps to select objects, swipes to move objects around, and pinches to resize objects) generally translate well to AR interactions, so mobile app developers will feel at home here. This article from Google provides a good overview of gestures in AR, and this blog post from Wikitude has some great information for developers on how to apply UX design. In addition to implementing these gestures on touchscreens (e.g., smartphones), developers may also implement them via hand tracking, where the movements of the user's hands and fingers are captured using cameras or sensors on headworn devices. See An Ultra Leap into a Whole new World of Hand Tracking where we discuss this in more detail.
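As a concrete example, the pinch-to-resize gesture mentioned above usually scales an object by the ratio of the current finger spacing to the spacing when the pinch began. A sketch with illustrative names and clamp limits:

```python
import math

def distance(p1, p2):
    """2D distance between two touch points in screen pixels."""
    return math.hypot(p1[0] - p2[0], p1[1] - p2[1])

def pinch_scale(start_touches, current_touches, base_scale=1.0,
                min_scale=0.25, max_scale=4.0):
    """Scale factor for an object being pinch-resized.

    start_touches / current_touches: pairs of (x, y) touch points.
    Clamped so users cannot shrink objects to nothing or blow them up."""
    ratio = distance(*current_touches) / distance(*start_touches)
    return max(min_scale, min(max_scale, base_scale * ratio))

# Fingers move from 100 px apart to 200 px apart: object doubles in size.
print(pinch_scale(((0, 0), (100, 0)), ((0, 0), (200, 0))))  # 2.0
```

The same math applies whether the two points come from a touchscreen or from hand tracking (e.g., thumb and index fingertip positions).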
And always remember, safety first! AR experiences are more immersive than standard applications, so users can lose track of their surroundings as they interact or even experience cybersickness. To help prevent this, remind users to be aware of their surroundings and avoid having them walk backward. Also, limit AR session time so users can re-ground themselves in reality, but make it easy to resume their session in the same state. See our eBook on Cybersickness for more tips.
Today’s smartphones, tablets, and headworn units come packed with high-resolution cameras and smart sensors. Some also include technologies like 5G mmWave for low-latency cloud connectivity so developers can decide where the heavy processing is best performed.
These technologies are at the core of our Snapdragon mobile platforms, which power many of today's mobile devices and AR experiences. AR on smartphones, like the Motorola edge+, is a simple option to get you started on your AR development journey. Or try head-mounted displays like Lenovo's ThinkReality A3 for more immersive AR development.
What Experience Will You Build?
AR applications can span many verticals, including games, healthcare, tech support, manufacturing, and more. And while AR headsets are soon poised to power even more immersive experiences, we feel that currently, the ideal consumer mobile AR applications are the ones that enhance everyday life experiences. You may already be familiar with a few.
For example, many people rely on real-time OCR text translation overlaid on real-world objects (e.g., a product's label in a foreign language). With remote assistance, objects are highlighted in the user's viewport as they try to fix something, and the rendered view is shared with a remote support worker for what-you-see assistance. And if you've ever gotten lost, AR can overlay directional arrows onto your device's camera view to help you navigate.
Start Developing in AR Today
Our new Snapdragon Spaces platform provides the tools you need to build immersive AR experiences. The Snapdragon Spaces HDK includes the Motorola edge+ smartphone and Lenovo ThinkReality A3 smart glasses. Check out our Quick Start Guide, and then download our Snapdragon Spaces SDK. You can then get started using the SDK with either Unity or Unreal, where you can implement and apply some of the constructs and ideas listed in this blog post.
There are also several other tools and frameworks that you can use to build AR experiences, including:
- Google’s ARCore is an API for developing AR applications on Android devices. Download their free ARCore Elements app to see what’s possible.
- Unity’s XR Interaction Toolkit and MARS are tools for visual AR development.
- OpenXR is an open AR API standard for which there are several implementations for different devices. Snapdragon Spaces provides an OpenXR compliant runtime and various OpenXR extensions.
- For powerful handheld augmented reality applications, check out the Wikitude SDK.
- Our Qualcomm Computer Vision SDK can be used for gesture detection and computer vision features in AR.
With the rapid growth of AR on mobile, there has never been a better time to get into AR development. By building these skills now, you will be well positioned to create more complex AR experiences on both mobile devices and headsets as the industry grows.
Snapdragon, Qualcomm Adreno, Qualcomm Computer Vision SDK, and Snapdragon Spaces are products of Qualcomm Technologies, Inc. and/or its subsidiaries.