Hi, I have two questions about the benchmark (snpe_bench.py) results.
In this experiment, I used the mobilenetV2.onnx file (https://github.com/onnx/models/tree/master/vision/classification/mobilenet).
The results are below.
DSP_timing (1 runs)
                      avg (us)  max (us)  min (us)  runtime
Load                       105       105       105  CPU
Deserialize               3848      3848      3848  CPU
Create                   75472     75472     75472  CPU
Init                     82414     82414     82414  CPU
De-Init                  17164     17164     17164  CPU
Total Inference Time      4828      4828      4828  CPU

DSP_mem (1 runs)
                      avg (kB)  max (kB)  min (kB)  runtime
pss                      14910     16671     13131  NA
prv_dirty                 9079     10840      7300  NA
prv_clean                 5380      5380      5380  NA
<DSP result: in the /SNPE_ROOT/benchmarks/alexnet_sample.json file I set "Runtimes":DSP, and when running snpe-dlc-quantize I also set the runtime mode to 'dsp'>
GPU_timing (1 runs)
                      avg (us)  max (us)  min (us)  runtime
Load                        88        88        88  CPU
Deserialize              11669     11669     11669  CPU
Create                  632168    632168    632168  CPU
Init                    661088    661088    661088  CPU
De-Init                  17130     17130     17130  CPU
Total Inference Time      7909      7909      7909  CPU

GPU_mem (1 runs)
                      avg (kB)  max (kB)  min (kB)  runtime
pss                      67982     79273     35765  NA
prv_dirty                45045     56128     12476  NA
prv_clean                22378     22712     21548  NA
<GPU result: in the /SNPE_ROOT/benchmarks/alexnet_sample.json file I set "Runtimes":GPU>
My first question: what do 'pss', 'prv_dirty', and 'prv_clean' in the tables mean?
My second question: I ran on 'dsp' and on 'gpu' (with different settings), so why does the 'runtime' column in the timing tables show 'CPU'?
Thanks for the help!
Hi Seo,
1. Proportional Set Size (PSS) is the portion of main memory (RAM) attributed to a process. It is made up of the private (unshared) memory of that process plus its proportional share of any memory it shares with other processes: each shared page is divided evenly among the processes that map it.
Clean refers to a page that has been loaded into memory and read by the process but not written to. If a page has been written to and the changes have not yet been written back to storage, it is called Dirty. Pages move from Clean to Dirty when they are written to.
Private_Clean is the set of pages in the mapping that have been read but not written by this process and are not referenced by any other process.
Private_Dirty is the set of pages in the mapping that have been written by this process and are not referenced by any other process.
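To make these counters concrete, here is a rough Python sketch of how they add up. It assumes the per-mapping format of Linux /proc/<pid>/smaps (where the fields are named Pss, Private_Clean and Private_Dirty); it is not necessarily how snpe_bench.py collects its numbers, and the helper names and the sample text are made up for illustration.

```python
import re

# Counter names as they appear in Linux /proc/<pid>/smaps.
FIELDS = ("Pss", "Private_Clean", "Private_Dirty")

# Matches counter lines such as "Pss:                  16 kB".
_LINE = re.compile(r"^(\w+):\s+(\d+)\s+kB$")


def sum_smaps(text):
    """Sum the selected counters (in kB) over every mapping in smaps-style text."""
    totals = dict.fromkeys(FIELDS, 0)
    for line in text.splitlines():
        m = _LINE.match(line.strip())
        if m and m.group(1) in totals:
            totals[m.group(1)] += int(m.group(2))
    return totals


def pss_kb(private_kb, shared):
    """PSS = private memory + each shared region divided by its number of sharers.

    shared: iterable of (size_kb, number_of_processes_mapping_it) pairs.
    """
    return private_kb + sum(size / n for size, n in shared)


# Hypothetical two-mapping example: a 64 kB library shared by 4 processes
# (so it contributes 64 / 4 = 16 kB of PSS) and a 128 kB private heap.
SAMPLE = """\
7f00-7f10 r-xp 00000000 08:01 123 /lib/libexample.so
Size:                 64 kB
Pss:                  16 kB
Shared_Clean:         64 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
7f20-7f30 rw-p 00000000 00:00 0 [heap]
Size:                128 kB
Pss:                 128 kB
Shared_Clean:          0 kB
Private_Clean:        28 kB
Private_Dirty:       100 kB
"""

if __name__ == "__main__":
    print(sum_smaps(SAMPLE))
    print(pss_kb(128, [(64, 4)]))
```

In this made-up example the heap is entirely private (28 kB clean plus 100 kB dirty), and the shared library adds only a quarter of its size to the PSS, so both helpers arrive at the same 144 kB total.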
2. This can happen in either of the following two cases:
i. The GPU or DSP runtime is not available on the hardware on which you are running the NPE benchmarking tool. Some end products that are not meant for development are restricted from using the GPU/DSP even though it is physically present.
ii. A layer is not supported by the runtime you selected but is supported by the fallback runtime.