Parallelism
ofri
Join Date: 23 Oct 16
Posts: 4
Posted: Sun, 2016-10-23 07:33

Hello,

I just found out about QSML and am trying to replace our current Eigen-based implementation with it in our Android app. We're currently interested in only a single function, cblas_sgemm(), and were wondering how to enable QSML's internal parallelism. This function is the core of running a convolutional neural network, and its performance is critical for us.

Thanks,

Ofri

Re: Parallelism Best Answer
mbadin (not verified)
Posted: Sun, 2016-10-23 16:03
Hi ofri,
 
That is an excellent question!  Within the "lib" folder for a given platform there are two versions of QSML, one called "libQSML.so" and one called "libQSML-sequential.so".  Linking "libQSML.so" will enable full parallelism whereas linking "libQSML-sequential.so" will only enable a sequential version of QSML.  We provide both versions to not only maximize compatibility with other parallel runtimes, but also to give developers like yourself options for how to take advantage of the highly tuned kernels we provide within QSML.  Additionally, the header you will want to include in your code is the "qblas_cblas.h" header located in the include folder.  It contains all the declarations for all the cblas primitives we support.
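As a rough illustration of the setup described above, here is a minimal sketch of a row-major matrix multiply that goes through cblas_sgemm() when built against QSML. It assumes that "qblas_cblas.h" declares the standard CBLAS prototype and enum names (CblasRowMajor, CblasNoTrans); the USE_QSML macro and the gemm_rowmajor() helper are hypothetical names introduced only for this example. Without USE_QSML, a plain reference loop is compiled instead, so the snippet builds on machines that do not have QSML installed.

```c
#include <assert.h>
#include <stddef.h>

#ifdef USE_QSML
/* Header named in the reply above. Assumption: it declares cblas_sgemm
   with the standard CBLAS signature and enum names. */
#include "qblas_cblas.h"
#endif

/* Hypothetical helper: C (m x n) = A (m x k) * B (k x n), all row-major. */
void gemm_rowmajor(int m, int n, int k,
                   const float *a, const float *b, float *c)
{
    assert(m > 0 && n > 0 && k > 0);
    assert(a != NULL && b != NULL && c != NULL);
#ifdef USE_QSML
    /* alpha = 1, beta = 0; leading dimensions equal the row lengths. */
    cblas_sgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
                m, n, k,
                1.0f, a, k,
                b, n,
                0.0f, c, n);
#else
    /* Unoptimized reference loop, used only when QSML is not linked. */
    for (int i = 0; i < m; ++i) {
        for (int j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (int p = 0; p < k; ++p)
                acc += a[i * k + p] * b[p * n + j];
            c[i * n + j] = acc;
        }
    }
#endif
}
```

With -DUSE_QSML, whether the call runs in parallel would then depend only on which of the two libraries is linked (libQSML.so or libQSML-sequential.so); the calling code stays the same.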
 
Please don't hesitate to contact us again if you run into any additional problems.
 
Thank you,
Matthew
ofri
Join Date: 23 Oct 16
Posts: 4
Posted: Mon, 2016-10-24 07:10

Thanks Matthew for the quick response. Very helpful. Now I have two more questions:

 

1. I'm testing the library on my Nexus 6P (I believe it's a Snapdragon 810) and I'm observing extreme fluctuations in running times. The previous code, which was based on Eigen (we used serial sgemm with our own custom parallelization), took about 9.5 seconds to complete most of the time, and occasionally ~10.5 or ~8.5 seconds. So the Eigen-based code runs at 9.5 sec +/- 1 sec.

Now that I've replaced the Eigen sgemm function with QSML's (letting QSML do its own parallelization), I'm getting anything between 4 sec and 10.7 sec. The algorithm is run many times and the actual running time varies greatly. The initial run is almost always quite fast (around 5.5 to 6.5 seconds), whereas later runs are all over the scale, though they tend to slow down. Strangely, the Eigen-based code almost always took 9.5 sec. Stopping execution of the algorithm for a short time and then starting again gives the same pattern: the first run is very fast, then things slow down. What could explain such behavior?

 

2. Since QSML is compatible only with Snapdragon processors, do you guys have a best practice for detecting compatible processors at runtime? We'd like to use QSML on compatible devices but fall back to Eigen for others.
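For the fallback in question 2, one possible stopgap is a heuristic check of /proc/cpuinfo. This is only a sketch and not a QSML API: it assumes the kernel exposes a "Hardware" line that mentions "Qualcomm", which many Android devices of this generation do but which is not guaranteed on every kernel. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if any line starting with "Hardware" in cpuinfo-style text
   contains "Qualcomm", else 0. Heuristic only; not an official check. */
int cpuinfo_mentions_qualcomm(const char *text)
{
    assert(text != NULL);
    const char *line = text;
    while (*line != '\0') {
        const char *end = strchr(line, '\n');
        size_t len = end ? (size_t)(end - line) : strlen(line);
        if (len >= 8 && strncmp(line, "Hardware", 8) == 0) {
            /* Scan this line only, so matches in other fields are ignored. */
            for (size_t i = 0; i + 8 <= len; ++i) {
                if (strncmp(line + i, "Qualcomm", 8) == 0)
                    return 1;
            }
        }
        line += len;
        if (*line == '\n')
            ++line;
    }
    return 0;
}

/* Reads /proc/cpuinfo and applies the heuristic above.
   Returns 0 if the file cannot be read. */
int device_looks_like_snapdragon(void)
{
    char buf[16384];
    FILE *f = fopen("/proc/cpuinfo", "r");
    if (f == NULL)
        return 0;
    size_t n = fread(buf, 1, sizeof buf - 1, f);
    fclose(f);
    buf[n] = '\0';
    return cpuinfo_mentions_qualcomm(buf);
}
```

An app could call device_looks_like_snapdragon() once at startup and choose the QSML or Eigen code path based on the result, treating a 0 as "use the fallback" rather than as proof that the device is unsupported.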

 

Thanks again,

Ofri

mbadin (not verified)
Posted: Wed, 2016-10-26 16:35
Hi Ofri,
 
That is an excellent question. The next version of QSML (0.15.1), which should be ready very soon, will have a function that returns a boolean indicating whether or not the platform is supported.  We are also making some significant changes to the parallel runtime we use and to the work division, which should help reduce the performance jitter you see.  Another thing to look for, if you are running many tests back to back, is thermal throttling on the device.  This might explain why letting the device rest for a bit results in better performance.
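To check whether thermal throttling is a plausible cause, one option is to time the same workload repeatedly, with and without a rest between runs, and compare the spread. The sketch below is a generic harness rather than anything QSML-specific; time_once(), summarize() and benchmark() are hypothetical names, and work() stands for whatever function wraps the network's forward pass.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Wall-clock seconds taken by one call to work(). */
double time_once(void (*work)(void))
{
    struct timespec t0, t1;
    assert(work != NULL);
    clock_gettime(CLOCK_MONOTONIC, &t0);
    work();
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Sorts samples in place and reports min, median and max. */
void summarize(double *samples, size_t n,
               double *min, double *median, double *max)
{
    assert(samples != NULL && n > 0);
    qsort(samples, n, sizeof *samples, cmp_double);
    *min = samples[0];
    *max = samples[n - 1];
    *median = (n % 2) ? samples[n / 2]
                      : 0.5 * (samples[n / 2 - 1] + samples[n / 2]);
}

/* Runs work() `runs` times, sleeping rest_seconds between runs, and
   stores each duration in samples[] (which must hold `runs` entries). */
void benchmark(void (*work)(void), size_t runs,
               unsigned rest_seconds, double *samples)
{
    for (size_t i = 0; i < runs; ++i) {
        samples[i] = time_once(work);
        printf("run %zu: %.3f s\n", i, samples[i]);
        if (rest_seconds > 0 && i + 1 < runs) {
            struct timespec rest = { (time_t)rest_seconds, 0 };
            nanosleep(&rest, NULL);
        }
    }
}
```

Running benchmark() once with rest_seconds set to 0 and once with a longer pause gives two sets of samples to compare. If the rested runs stay close to the fast first run while the back-to-back runs drift slower, throttling is a likely contributor; if both sets are equally scattered, the jitter more likely comes from scheduling or work division.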
 
Please don't hesitate to contact us again if you run into any additional problems or have any suggestions for how we can make QSML easier to use.
 
Thank you,
Matthew
ds3427756
Join Date: 9 Jan 19
Posts: 1
Posted: Wed, 2019-01-09 20:58

Hey, thanks for this valuable reply.

