Hi, we have recently been developing a heterogeneous neural network framework using shader storage buffer objects and compute shaders in OpenGL ES 3.1 on Android. We have run into a frustrating problem: when we read back the result through glMapBufferRange, the call takes even longer on the Adreno 540 than on the Mali-T880. The sample code is as follows:
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, input_ssbo);
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 1, output_ssbo);
glDispatchCompute(1, 1, 1);
glMemoryBarrier(GL_SHADER_STORAGE_BARRIER_BIT);
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 0, 0);
glBindBufferBase(GL_SHADER_STORAGE_BUFFER, 1, 0);
glBindBuffer(GL_SHADER_STORAGE_BUFFER, output_ssbo);
float* buffer = (float*)glMapBufferRange(GL_SHADER_STORAGE_BUFFER, 0, sizeInBytes, GL_MAP_READ_BIT);
memcpy(out, buffer, sizeInBytes);
glUnmapBuffer(GL_SHADER_STORAGE_BUFFER);
Please help us understand whether something is wrong here and how to solve this problem. Thank you.