We're experiencing a large number of crashes during glLinkProgram on Adreno 505 and 506 devices. The most common device is the Moto G6 running Android 9, driver version V@331.
Which shader is being linked when the crash occurs doesn't appear to matter. However, I believe there is a race between shaders being compiled on our application's GL thread and on the Android UI's render thread (i.e. the thread that renders our application's UI, not the system UI).
Sometimes we get this backtrace from our GL thread:
Build fingerprint: 'motorola/ali_n/ali_n:9/PPS29.118-15-11/a37fd:user/release-keys'
Revision: 'PVT2'
ABI: 'arm'
pid: 6220, tid: 6324, name: Renderer >>> org.mozilla.fenix.debug <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x54
Cause: null pointer dereference
r0 78bf6889 r1 00000000 r2 00000000 r3 40000000
r4 00000000 r5 6be6e900 r6 00000000 r7 6af14000
r8 6af14054 r9 6af1403c r10 00000000 r11 0000000f
ip 80000000 sp 78bf6858 lr 98d54601 pc 98d54610
backtrace:
#00 pc 00732610 /vendor/lib/libllvm-glnext.so (ESXLinker::bcConstruct()+752)
#01 pc 00734e19 /vendor/lib/libllvm-glnext.so (SOLinker::linkShaders(QGLC_LINKPROGRAM_DATA*, QGLC_LINKPROGRAM_RESULT*)+88)
#02 pc 0072dafb /vendor/lib/libllvm-glnext.so (CompilerContext::LinkProgram(unsigned int, QGLC_SRCSHADER_IRSHADER**, QGLC_LINKPROGRAM_DATA*, QGLC_LINKPROGRAM_RESULT*)+322)
#03 pc 007e0367 /vendor/lib/libllvm-glnext.so (QGLCLinkProgram(void*, unsigned int, QGLC_SRCSHADER_IRSHADER**, QGLC_LINKPROGRAM_DATA*, QGLC_LINKPROGRAM_RESULT*)+58)
#04 pc 0015091d /vendor/lib/egl/libGLESv2_adreno.so (EsxShaderCompiler::CompileProgram(EsxContext*, EsxProgram const*, EsxLinkedList const*, EsxLinkedList const*, EsxInfoLog*)+1700)
#05 pc 00127b57 /vendor/lib/egl/libGLESv2_adreno.so (EsxProgram::Link(EsxContext*)+494)
#06 pc 000a881b /vendor/lib/egl/libGLESv2_adreno.so (EsxContext::LinkProgram(EsxProgram*)+62)
#07 pc 0553d865 /data/app/org.mozilla.fenix.debug-1ifaNJWoJ7dxuSFtgz58tA==/lib/arm/libxul.so (offset 0x2688000) (webrender::device::gl::Device::compile_shader::h1719f7991337b528+200)
And sometimes we get this backtrace from the Android UI's RenderThread:
pid: 25170, tid: 25294, name: RenderThread >>> org.mozilla.fenix.debug <<<
signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x54
Cause: null pointer dereference
r0 6e8bde59 r1 00000000 r2 00000000 r3 40000000
r4 00000000 r5 6a0ed800 r6 00000000 r7 6a203000
r8 6a203054 r9 6a20303c r10 00000000 r11 0000000f
ip 80000000 sp 6e8bde28 lr 98d54601 pc 98d54610
backtrace:
#00 pc 00732610 /vendor/lib/libllvm-glnext.so (ESXLinker::bcConstruct()+752)
#01 pc 00734e19 /vendor/lib/libllvm-glnext.so (SOLinker::linkShaders(QGLC_LINKPROGRAM_DATA*, QGLC_LINKPROGRAM_RESULT*)+88)
#02 pc 0072dafb /vendor/lib/libllvm-glnext.so (CompilerContext::LinkProgram(unsigned int, QGLC_SRCSHADER_IRSHADER**, QGLC_LINKPROGRAM_DATA*, QGLC_LINKPROGRAM_RESULT*)+322)
#03 pc 007e0367 /vendor/lib/libllvm-glnext.so (QGLCLinkProgram(void*, unsigned int, QGLC_SRCSHADER_IRSHADER**, QGLC_LINKPROGRAM_DATA*, QGLC_LINKPROGRAM_RESULT*)+58)
#04 pc 0015091d /vendor/lib/egl/libGLESv2_adreno.so (EsxShaderCompiler::CompileProgram(EsxContext*, EsxProgram const*, EsxLinkedList const*, EsxLinkedList const*, EsxInfoLog*)+1700)
#05 pc 00127b57 /vendor/lib/egl/libGLESv2_adreno.so (EsxProgram::Link(EsxContext*)+494)
#06 pc 000a881b /vendor/lib/egl/libGLESv2_adreno.so (EsxContext::LinkProgram(EsxProgram*)+62)
#07 pc 003c12bb /system/lib/libhwui.so (GrGLProgramBuilder::CreateProgram(GrPipeline const&, GrPrimitiveProcessor const&, GrProgramDesc*, GrGLGpu*)+750)
#08 pc 003569bb /system/lib/libhwui.so (GrGLGpu::ProgramCache::refProgram(GrGLGpu const*, GrPipeline const&, GrPrimitiveProcessor const&, bool)+658)
#09 pc 00354f97 /system/lib/libhwui.so (GrGLGpu::flushGLState(GrPipeline const&, GrPrimitiveProcessor const&, bool)+38)
#10 pc 00355a65 /system/lib/libhwui.so (GrGLGpu::draw(GrPipeline const&, GrPrimitiveProcessor const&, GrMesh const*, GrPipeline::DynamicState const*, int)+72)
#11 pc 0034d2bd /system/lib/libhwui.so (GrOpFlushState::executeDrawsAndUploadsForMeshDrawOp(unsigned int, SkRect const&)+184)
#12 pc 0039ff91 /system/lib/libhwui.so (GrRenderTargetOpList::onExecute(GrOpFlushState*)+204)
#13 pc 00396df1 /system/lib/libhwui.so (GrDrawingManager::executeOpLists(int, int, GrOpFlushState*)+304)
#14 pc 00396a8b /system/lib/libhwui.so (GrDrawingManager::internalFlush(GrSurfaceProxy*, GrResourceCache::FlushType, int, GrBackendSemaphore*)+966)
#15 pc 003970db /system/lib/libhwui.so (GrDrawingManager::prepareSurfaceForExternalIO(GrSurfaceProxy*, int, GrBackendSemaphore*)+58)
#16 pc 0035f723 /system/lib/libhwui.so (android::uirenderer::skiapipeline::SkiaPipeline::renderFrame(android::uirenderer::LayerUpdateQueue const&, SkRect const&, std::__1::vector<android::sp<android::uirenderer::RenderNode>, std::__1::allocator<android::sp<android::uirenderer::RenderNode>>> const&, bool, bool, android::uirenderer::Rect const&, sk_sp<SkSurface>)+130)
#17 pc 0035ed8b /system/lib/libhwui.so (android::uirenderer::skiapipeline::SkiaOpenGLPipeline::draw(android::uirenderer::renderthread::Frame const&, SkRect const&, SkRect const&, android::uirenderer::FrameBuilder::LightGeometry const&, android::uirenderer::LayerUpdateQueue*, android::uirenderer::Rect const&, bool, bool, android::uirenderer::BakedOpRenderer::LightInfo const&, std::__1::vector<android::sp<android::uirenderer::RenderNode>, std::__1::allocator<android::sp<android::uirenderer::RenderNode>>
#18 pc 00099b2b /system/lib/libhwui.so (android::uirenderer::renderthread::CanvasContext::draw()+150)
#19 pc 003624b5 /system/lib/libhwui.so (_ZNSt3__110__function6__funcIZN7android10uirenderer12renderthread13DrawFrameTask11postAndWaitEvE3$_0NS_9allocatorIS6_EEFvvEEclEv$c303f2d2360db58ed70a2d0ac7ed911b+576)
#20 pc 0032afcf /system/lib/libhwui.so (android::uirenderer::WorkQueue::process()+122)
#21 pc 000a256f /system/lib/libhwui.so (android::uirenderer::renderthread::RenderThread::threadLoop()+178)
Adding android:hardwareAccelerated="false" to the application's AndroidManifest.xml disables the UI render thread and prevents the crash. However, this is not an ideal solution.
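For reference, the workaround looks roughly like the fragment below. This is only a sketch: every other attribute and child element is omitted, and it assumes the attribute is set on the application element rather than on individual activities (it can also be set per activity).

```xml
<!-- AndroidManifest.xml (fragment; other attributes and child elements omitted) -->
<manifest xmlns:android="http://schemas.android.com/apk/res/android">
    <application android:hardwareAccelerated="false">
        <!-- activities, services, etc. -->
    </application>
</manifest>
```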
Is there anything better we can do to avoid this crash? Thanks in advance.