[snpe][1.61] snpe-tensorflow-to-dlc --quantization_overrides option only affects activations

yingwei_ji
Join Date: 8 Mar 23
Posts: 1
Posted: Thu, 2024-03-14 00:02
Environment: Ubuntu 18.04, SNPE 1.61.

With --quantization_overrides I can only change the quantization encodings of the activations, not those of the weights. What am I doing wrong?

My model is defined with TF 2.3:
-------------------------------------------------
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Small Sequential model: a bias-free Conv2D followed by a bias-free Dense layer.
model = keras.Sequential([
    keras.layers.InputLayer(input_shape=(28, 28, 1), name='input'),
    keras.layers.Conv2D(filters=12, kernel_size=(3, 3), use_bias=False),
    keras.layers.Flatten(),
    keras.layers.Dense(10, use_bias=False),
])

# Run one dummy batch through the model, then export it as a SavedModel.
x = np.zeros((1, 28, 28, 1), dtype=np.float32)
y = model(x)
tf.saved_model.save(model, "my_model")
------------------------------------------------------------
I then edited the quantization overrides JSON file (q.json) as follows:
---------------------------------------------------------------
{
    "activation_encodings": {
        "input:0": [
            {
                "bitwidth": 8,
                "max": 12.82344407824954,
                "min": 0.0,
                "offset": 0,
                "scale": 0.050288015993135454
            }
        ],
        "StatefulPartitionedCall/sequential/conv2d/Conv2D:0": [
            {
                "bitwidth": 8,
                "max": 12.82344407824954,
                "min": 0.0,
                "offset": 0,
                "scale": 0.050288015993135454
            }
        ]
    },
    "param_encodings": {
        "StatefulPartitionedCall/sequential/conv2d/Conv2D/biases": [
            {
                "bitwidth": 8,
                "max": 1.700559472933134,
                "min": -2.1006477158567995,
                "offset": 140,
                "scale": 0.01490669485799974
            }
        ],
        "StatefulPartitionedCall/sequential/conv2d/Conv2D/weights": [
            {
                "bitwidth": 8,
                "max": 1.700559472933134,
                "min": -2.1006477158567995,
                "offset": 139,
                "scale": 0.01490669485799974
            }
        ]
    }
}
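
As a sanity check on the file itself, the sketch below tests whether each "scale" agrees with its "min", "max" and "bitwidth". The relation scale = (max - min) / (2^bitwidth - 1) is an assumption read off the numbers in this post, not a statement of how SNPE defines the fields, and the helper names are hypothetical.

```python
import math

# min / max / scale / bitwidth copied from the overrides file above.
ENCODINGS = {
    "input:0": {
        "bitwidth": 8, "min": 0.0, "max": 12.82344407824954,
        "scale": 0.050288015993135454,
    },
    "StatefulPartitionedCall/sequential/conv2d/Conv2D/weights": {
        "bitwidth": 8, "min": -2.1006477158567995, "max": 1.700559472933134,
        "scale": 0.01490669485799974,
    },
}


def expected_scale(enc):
    """Scale implied by min/max under the assumed uniform-quantization relation."""
    return (enc["max"] - enc["min"]) / (2 ** enc["bitwidth"] - 1)


def scale_is_consistent(enc, rel_tol=1e-9):
    """True if the stored scale matches the one implied by min/max/bitwidth."""
    return math.isclose(enc["scale"], expected_scale(enc), rel_tol=rel_tol)


for name, enc in ENCODINGS.items():
    print(name, "scale consistent:", scale_is_consistent(enc))
```

Under that assumption both entries are internally consistent, so a mismatch between scale and range does not look like the reason the weight override is not applied.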
-----------------------------------------------
snpe-tensorflow-to-dlc --input_network my_model --input_dim input "1,28,28,1" --out_node "dense" --output_path my_model.dlc --quantization_overrides q.json --show_unconsumed_nodes
2024-03-14 14:58:42.548331: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-03-14 14:58:42.571246: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2592005000 Hz
2024-03-14 14:58:42.572645: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x22703e0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2024-03-14 14:58:42.572753: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
2024-03-14 14:58:42,574 - 214 - INFO - Processing user provided quantization encodings: 
2024-03-14 14:58:42.661712: I tensorflow/core/grappler/devices.cc:78] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0 (Note: TensorFlow was not compiled with CUDA or ROCm support)
2024-03-14 14:58:42.661841: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2024-03-14 14:58:42.667244: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:816] Optimization results for grappler item: graph_to_optimize
2024-03-14 14:58:42.667307: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:818]   function_optimizer: Graph size after: 16 nodes (11), 15 edges (10), time = 0.48ms.
2024-03-14 14:58:42.667316: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:818]   function_optimizer: function_optimizer did nothing. time = 0.008ms.
2024-03-14 14:58:42,684 - 214 - INFO - INFO_TF_CHANGE_NODE_NAME: Change node name: Identity with name dense
2024-03-14 14:58:42,713 - 214 - INFO - INFO_ALL_BUILDING_NETWORK: 
    ==============================================================
    Building Network
    ==============================================================
2024-03-14 14:58:42,721 - 214 - INFO - Resolving static sub-graphs in network...
2024-03-14 14:58:42,722 - 214 - INFO - Resolving static sub-graphs in network, complete.
2024-03-14 14:58:42,724 - 214 - INFO - Processed 3 quantization encodings
2024-03-14 14:58:42,732 - 214 - INFO - INFO_DLC_SAVE_LOCATION: Saving model at my_model.dlc
2024-03-14 14:58:42,735 - 214 - INFO - INFO_CONVERSION_SUCCESS: Conversion completed successfully
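
One detail in the converter output may be worth checking: the overrides file lists four entries (two activations, two params), while the log reports "Processed 3 quantization encodings". The sketch below only makes that comparison explicit. Which entry was skipped, and why, cannot be determined from the log, and the helper names are hypothetical.

```python
import re

# Entry names copied from q.json; the encoding values are omitted because
# only the number of entries matters here.
OVERRIDES = {
    "activation_encodings": {
        "input:0": [{"bitwidth": 8}],
        "StatefulPartitionedCall/sequential/conv2d/Conv2D:0": [{"bitwidth": 8}],
    },
    "param_encodings": {
        "StatefulPartitionedCall/sequential/conv2d/Conv2D/biases": [{"bitwidth": 8}],
        "StatefulPartitionedCall/sequential/conv2d/Conv2D/weights": [{"bitwidth": 8}],
    },
}

# Line copied from the converter log above.
LOG_LINE = "2024-03-14 14:58:42,724 - 214 - INFO - Processed 3 quantization encodings"


def count_entries(overrides):
    """Number of named tensors across both sections of the overrides file."""
    return sum(len(section) for section in overrides.values())


def processed_count(line):
    """Number reported by the converter, or None if the line does not match."""
    match = re.search(r"Processed (\d+) quantization encodings", line)
    return int(match.group(1)) if match else None


print("entries in file:", count_entries(OVERRIDES))
print("processed by converter:", processed_count(LOG_LINE))
```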
 
----------------------------------------------
~/snpe-1.61.0.3358/torch_ws$ snpe-dlc-quantize --input_dlc my_model.dlc --output_dlc quant_model.dlc --input_list a.txt --override_params --debug3
[INFO] InitializeStderr: DebugLog initialized.
[DEBUG1] Reading DLC: my_model.dlc
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] DlContainerImpl::CreateCatalog try insert quant_params
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] DlContainerImpl::CreateCatalog try insert quant_params
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] DlContainerImpl::CreateCatalog try insert quant_params
[INFO] Writing intermediate model
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] Reading DLC: quant_model.dlc
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] Quantizing weights with bw 8, min 0.000000, max 0.010000 delta 0.000039, offset 0.000000 for layer name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] Quantizing weights with bw 8, min -0.223613, max 0.225374 delta 0.001761, offset -127.000000 for layer name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] Quantizing weights with bw 8, min 0.000000, max 0.010000 delta 0.000039, offset 0.000000 for layer name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] Quantizing weights with bw 8, min -0.027285, max 0.027072 delta 0.000213, offset -128.000000 for layer name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] Reading DLC: quant_model.dlc
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] DnnRuntime::CreateNetwork open quant_model.dlc
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] Done getting network info
[DEBUG1] No resizing of network is requested.
[DEBUG1] Layer runtime: position = 0, type = 1
[DEBUG1] Layer runtime: position = 1, type = 1
[DEBUG1] Layer runtime: position = 2, type = 1
[DEBUG1] Output Tensor name for the model: StatefulPartitionedCall/sequential/conv2d/Conv2D:0
[DEBUG1] Output Tensor name for the model: dense:0
[DEBUG1] Output Tensor name for the model: input:0
[DEBUG1] Created input transition for buffer input:0
[DEBUG1] Added network output transition for input:0
[DEBUG1] Added network output transition for StatefulPartitionedCall/sequential/conv2d/Conv2D:0
[DEBUG1] Added network output transition for dense:0
[DEBUG1] Invalid cache record found. Cannot use cache.
[DEBUG1] Subnet number: 1
[DEBUG1] Network descriptor partition start 0 end 3
[DEBUG1] NetworkDescriptor::Partition adding input buffer name input:0
[DEBUG1] NetworkDescriptor::Partition adding output buffer name input:0
[DEBUG1] Network descriptor pushing layer input:0 into subnet
[DEBUG1] NetworkDescriptor::Partition adding input buffer name input:0
[DEBUG1] NetworkDescriptor::Partition adding output buffer name StatefulPartitionedCall/sequential/conv2d/Conv2D:0
[DEBUG1] Network descriptor pushing layer StatefulPartitionedCall/sequential/conv2d/Conv2D into subnet
[DEBUG1] NetworkDescriptor::Partition adding input buffer name StatefulPartitionedCall/sequential/conv2d/Conv2D:0
[DEBUG1] NetworkDescriptor::Partition adding output buffer name dense:0
[DEBUG1] Network descriptor pushing layer StatefulPartitionedCall/sequential/dense/MatMul into subnet
[DEBUG1] Entering CopyRuntimeInfo, trying to get RuntimeSpecificInfo
[DEBUG1] Adding layer input:0 id 0
[DEBUG1] Adding layer StatefulPartitionedCall/sequential/conv2d/Conv2D id 1
[DEBUG1] LayerCpu::SetupBuffers
[DEBUG1] LayerCpu::SetupBuffers input name input:0 buf 0x0x275b710
[DEBUG1] LayerCpu::SetupBuffers output name StatefulPartitionedCall/sequential/conv2d/Conv2D:0 buf 0x0x275b7d0
[DEBUG1] Adding layer StatefulPartitionedCall/sequential/dense/MatMul id 2
[DEBUG1] LayerCpu::SetupBuffers
[DEBUG1] LayerCpu::SetupBuffers input name StatefulPartitionedCall/sequential/conv2d/Conv2D:0 buf 0x0x275b7d0
[DEBUG1] LayerCpu::SetupBuffers output name dense:0 buf 0x0x26db0f0
[DEBUG1] NeuralNetworkCpu::FinishInit
[DEBUG1] *** Loading images from input list: a.txt***
[DEBUG1] Reading Non-Nv21 input files.
[DEBUG1] Running ForwardPropagate on the next network id 0
[DEBUG1] NeuralNetworkCpu running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu done running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] NeuralNetworkCpu done running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1] 
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1] 
[DEBUG1] NeuralNetworkCpu running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] NeuralNetworkCpu done running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1] 
[DEBUG1] Done ForwardPropagate on the network id 0
[DEBUG1] Getting output name input:0, buffer dims {1, 28, 28, 1}, size 784
[DEBUG1] Getting output name StatefulPartitionedCall/sequential/conv2d/Conv2D:0, buffer dims {1, 26, 26, 12}, size 8112
[DEBUG1] Getting output name dense:0, buffer dims {1, 10}, size 10
[DEBUG1] Successfully parsed input [1/6]
[DEBUG1] Reading Non-Nv21 input files.
[DEBUG1] Running ForwardPropagate on the next network id 0
[DEBUG1] NeuralNetworkCpu running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu done running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] NeuralNetworkCpu done running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1] 
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1] 
[DEBUG1] NeuralNetworkCpu running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] NeuralNetworkCpu done running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1] 
[DEBUG1] Done ForwardPropagate on the network id 0
[DEBUG1] Getting output name input:0, buffer dims {1, 28, 28, 1}, size 784
[DEBUG1] Getting output name StatefulPartitionedCall/sequential/conv2d/Conv2D:0, buffer dims {1, 26, 26, 12}, size 8112
[DEBUG1] Getting output name dense:0, buffer dims {1, 10}, size 10
[DEBUG1] Successfully parsed input [2/6]
[DEBUG1] Reading Non-Nv21 input files.
[DEBUG1] Running ForwardPropagate on the next network id 0
[DEBUG1] NeuralNetworkCpu running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu done running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] NeuralNetworkCpu done running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1] 
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1] 
[DEBUG1] NeuralNetworkCpu running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] NeuralNetworkCpu done running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1] 
[DEBUG1] Done ForwardPropagate on the network id 0
[DEBUG1] Getting output name input:0, buffer dims {1, 28, 28, 1}, size 784
[DEBUG1] Getting output name StatefulPartitionedCall/sequential/conv2d/Conv2D:0, buffer dims {1, 26, 26, 12}, size 8112
[DEBUG1] Getting output name dense:0, buffer dims {1, 10}, size 10
[DEBUG1] Successfully parsed input [3/6]
[DEBUG1] Reading Non-Nv21 input files.
[DEBUG1] Running ForwardPropagate on the next network id 0
[DEBUG1] NeuralNetworkCpu running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu done running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] NeuralNetworkCpu done running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1] 
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1] 
[DEBUG1] NeuralNetworkCpu running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] NeuralNetworkCpu done running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1] 
[DEBUG1] Done ForwardPropagate on the network id 0
[DEBUG1] Getting output name input:0, buffer dims {1, 28, 28, 1}, size 784
[DEBUG1] Getting output name StatefulPartitionedCall/sequential/conv2d/Conv2D:0, buffer dims {1, 26, 26, 12}, size 8112
[DEBUG1] Getting output name dense:0, buffer dims {1, 10}, size 10
[DEBUG1] Successfully parsed input [4/6]
[DEBUG1] Reading Non-Nv21 input files.
[DEBUG1] Running ForwardPropagate on the next network id 0
[DEBUG1] NeuralNetworkCpu running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu done running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] NeuralNetworkCpu done running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1] 
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1] 
[DEBUG1] NeuralNetworkCpu running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] NeuralNetworkCpu done running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1] 
[DEBUG1] Done ForwardPropagate on the network id 0
[DEBUG1] Getting output name input:0, buffer dims {1, 28, 28, 1}, size 784
[DEBUG1] Getting output name StatefulPartitionedCall/sequential/conv2d/Conv2D:0, buffer dims {1, 26, 26, 12}, size 8112
[DEBUG1] Getting output name dense:0, buffer dims {1, 10}, size 10
[DEBUG1] Successfully parsed input [5/6]
[DEBUG1] Reading Non-Nv21 input files.
[DEBUG1] Running ForwardPropagate on the next network id 0
[DEBUG1] NeuralNetworkCpu running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu done running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] NeuralNetworkCpu done running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1] 
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1] 
[DEBUG1] NeuralNetworkCpu running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] NeuralNetworkCpu done running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1] 
[DEBUG1] Done ForwardPropagate on the network id 0
[DEBUG1] Getting output name input:0, buffer dims {1, 28, 28, 1}, size 784
[DEBUG1] Getting output name StatefulPartitionedCall/sequential/conv2d/Conv2D:0, buffer dims {1, 26, 26, 12}, size 8112
[DEBUG1] Getting output name dense:0, buffer dims {1, 10}, size 10
[DEBUG1] Successfully parsed input [6/6]
[DEBUG1] Overriding activation quantization with bw 8, min 0.000000, max 12.823444 delta 0.050288, offset 0.000000 for tensor input:0
[INFO] Setting activation for layer: input:0 and buffer: input:0
[INFO] bw: 8, min: 0.000000, max: 12.823444, delta: 0.050288, offset: 0.000000
[DEBUG1] Overriding activation quantization with bw 8, min 0.000000, max 12.823444 delta 0.050288, offset 0.000000 for tensor StatefulPartitionedCall/sequential/conv2d/Conv2D:0
[INFO] Setting activation for layer: StatefulPartitionedCall/sequential/conv2d/Conv2D and buffer: StatefulPartitionedCall/sequential/conv2d/Conv2D:0
[INFO] bw: 8, min: 0.000000, max: 12.823444, delta: 0.050288, offset: 0.000000
[INFO] Setting activation for layer: StatefulPartitionedCall/sequential/dense/MatMul and buffer: dense:0
[INFO] bw: 8, min: -1.077195, max: 0.492432, delta: 0.006155, offset: -175.000000
[INFO] Writing quantized model to: quant_model.dlc
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[INFO] DebugLog shutting down.
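
The log above shows the problem directly. Both activation overrides are reported as applied ("Overriding activation quantization ..."), but the Conv2D weights are quantized with min -0.223613 and max 0.225374 rather than the range given in param_encodings. The sketch below parses the two "Quantizing weights" lines for the Conv2D layer, recomputes delta and offset, and compares the range with the override. The relations delta = (max - min) / (2^bw - 1) and offset = round(min / delta) are inferred from the logged numbers only, and the helper names are hypothetical.

```python
import re

# The two weight-quantization lines logged for the Conv2D layer.
LOG = """\
[DEBUG1] Quantizing weights with bw 8, min 0.000000, max 0.010000 delta 0.000039, offset 0.000000 for layer name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] Quantizing weights with bw 8, min -0.223613, max 0.225374 delta 0.001761, offset -127.000000 for layer name StatefulPartitionedCall/sequential/conv2d/Conv2D
"""

# Weight range requested in param_encodings of q.json.
OVERRIDE_MIN, OVERRIDE_MAX = -2.1006477158567995, 1.700559472933134

PATTERN = re.compile(
    r"Quantizing weights with bw (\d+), min (-?[\d.]+), max (-?[\d.]+) "
    r"delta ([\d.]+), offset (-?[\d.]+) for layer name (\S+)"
)


def parse_weight_lines(log):
    """Return (layer, bw, min, max, delta, offset) for each matching log line."""
    rows = []
    for m in PATTERN.finditer(log):
        bw, lo, hi, delta, offset, layer = m.groups()
        rows.append((layer, int(bw), float(lo), float(hi), float(delta), float(offset)))
    return rows


def recompute(bw, lo, hi):
    """Delta and offset under the assumed relation (inferred from the log)."""
    delta = (hi - lo) / (2 ** bw - 1)
    return delta, round(lo / delta)


for layer, bw, lo, hi, delta, offset in parse_weight_lines(LOG):
    calc_delta, calc_offset = recompute(bw, lo, hi)
    uses_override = abs(lo - OVERRIDE_MIN) < 1e-6 and abs(hi - OVERRIDE_MAX) < 1e-6
    print(layer, f"delta={calc_delta:.6f}", f"offset={calc_offset}",
          "override applied:", uses_override)
```

The recomputed values agree with the logged delta and offset, and neither line uses the override range. The log also prints negative offsets (-127, -128, -175), whereas the file uses positive ones (139, 140). Whether that sign convention matters to SNPE 1.61 is not something the log settles.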

 

