Ubuntu 18.04
SNPE 1.61
I can override the quantization encodings for activations, but not for the weights. What's wrong?
My model is defined using TF 2.3.
-------------------------------------------------
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Define a small Sequential model (both layers are created without biases)
model = keras.Sequential([
    keras.layers.InputLayer(input_shape=(28, 28, 1), name='input'),
    keras.layers.Conv2D(filters=12, kernel_size=(3, 3), use_bias=False),
    keras.layers.Flatten(),
    keras.layers.Dense(10, use_bias=False),
])

# Run one dummy forward pass so the model is built, then export it as a SavedModel
x = np.zeros((1, 28, 28, 1))
y = model(x)
tf.saved_model.save(model, "my_model")
------------------------------------------------------------
Then I edited the quantization overrides JSON file (q.json) as follows:
---------------------------------------------------------------
{
    "activation_encodings": {
        "input:0": [
            {
                "bitwidth": 8,
                "max": 12.82344407824954,
                "min": 0.0,
                "offset": 0,
                "scale": 0.050288015993135454
            }
        ],
        "StatefulPartitionedCall/sequential/conv2d/Conv2D:0": [
            {
                "bitwidth": 8,
                "max": 12.82344407824954,
                "min": 0.0,
                "offset": 0,
                "scale": 0.050288015993135454
            }
        ]
    },
    "param_encodings": {
        "StatefulPartitionedCall/sequential/conv2d/Conv2D/biases": [
            {
                "bitwidth": 8,
                "max": 1.700559472933134,
                "min": -2.1006477158567995,
                "offset": 140,
                "scale": 0.01490669485799974
            }
        ],
        "StatefulPartitionedCall/sequential/conv2d/Conv2D/weights": [
            {
                "bitwidth": 8,
                "max": 1.700559472933134,
                "min": -2.1006477158567995,
                "offset": 139,
                "scale": 0.01490669485799974
            }
        ]
    }
}
-----------------------------------------------
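A quick sanity check on the encodings above, as a minimal sketch: it assumes the scale is derived as (max - min) / (2^bitwidth - 1), which agrees with the numbers in this file. The helper name `expected_scale` is hypothetical and not part of SNPE. The offset is deliberately not checked, because its sign convention is not stated in this post.

```python
# Hypothetical helper (not an SNPE API): recompute the scale from min/max/bitwidth.
def expected_scale(enc):
    levels = 2 ** enc["bitwidth"] - 1  # 255 steps for 8 bits
    return (enc["max"] - enc["min"]) / levels

# Values copied from the overrides file in this post.
weights = {"bitwidth": 8, "max": 1.700559472933134,
           "min": -2.1006477158567995, "offset": 139,
           "scale": 0.01490669485799974}
activation = {"bitwidth": 8, "max": 12.82344407824954,
              "min": 0.0, "offset": 0,
              "scale": 0.050288015993135454}

for name, enc in (("weights", weights), ("activation", activation)):
    derived = expected_scale(enc)
    consistent = abs(derived - enc["scale"]) < 1e-9
    print(name, "derived scale:", derived, "file scale:", enc["scale"],
          "consistent" if consistent else "MISMATCH")
```

If both entries come out consistent, the values themselves are probably not why the weight override is ignored, and the tensor names become the more likely suspect.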
snpe-tensorflow-to-dlc --input_network my_model --input_dim input "1,28,28,1" --out_node "dense" --output_path my_model.dlc --quantization_overrides q.json --show_unconsumed_nodes
2024-03-14 14:58:42.548331: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN)to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2024-03-14 14:58:42.571246: I tensorflow/core/platform/profile_utils/cpu_utils.cc:104] CPU Frequency: 2592005000 Hz
2024-03-14 14:58:42.572645: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x22703e0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2024-03-14 14:58:42.572753: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2024-03-14 14:58:42,574 - 214 - INFO - Processing user provided quantization encodings:
2024-03-14 14:58:42.661712: I tensorflow/core/grappler/devices.cc:78] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0 (Note: TensorFlow was not compiled with CUDA or ROCm support)
2024-03-14 14:58:42.661841: I tensorflow/core/grappler/clusters/single_machine.cc:356] Starting new session
2024-03-14 14:58:42.667244: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:816] Optimization results for grappler item: graph_to_optimize
2024-03-14 14:58:42.667307: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:818] function_optimizer: Graph size after: 16 nodes (11), 15 edges (10), time = 0.48ms.
2024-03-14 14:58:42.667316: I tensorflow/core/grappler/optimizers/meta_optimizer.cc:818] function_optimizer: function_optimizer did nothing. time = 0.008ms.
2024-03-14 14:58:42,684 - 214 - INFO - INFO_TF_CHANGE_NODE_NAME: Change node name: Identity with name dense
2024-03-14 14:58:42,713 - 214 - INFO - INFO_ALL_BUILDING_NETWORK:
==============================================================
Building Network
==============================================================
2024-03-14 14:58:42,721 - 214 - INFO - Resolving static sub-graphs in network...
2024-03-14 14:58:42,722 - 214 - INFO - Resolving static sub-graphs in network, complete.
2024-03-14 14:58:42,724 - 214 - INFO - Processed 3 quantization encodings
2024-03-14 14:58:42,732 - 214 - INFO - INFO_DLC_SAVE_LOCATION: Saving model at my_model.dlc
2024-03-14 14:58:42,735 - 214 - INFO - INFO_CONVERSION_SUCCESS: Conversion completed successfully
----------------------------------------------
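The converter reports "Processed 3 quantization encodings", while the overrides file has four entries (two activations, two params). A minimal sketch for comparing the two counts follows. Exactly what the converter counts is an assumption here, so a mismatch is only a hint that one name may not have matched a tensor in the graph. The function names are hypothetical, and the encoding values are elided because only the names matter for counting.

```python
import re

# Hypothetical helpers (not SNPE APIs).
def count_override_entries(overrides):
    """Count the named tensors across both sections of the overrides dict."""
    return sum(len(overrides.get(section, {}))
               for section in ("activation_encodings", "param_encodings"))

def processed_count(converter_log):
    """Extract N from 'Processed N quantization encodings', or None if absent."""
    match = re.search(r"Processed (\d+) quantization encodings", converter_log)
    return int(match.group(1)) if match else None

# Names copied from the overrides file in this post; values elided.
overrides = {
    "activation_encodings": {
        "input:0": [],
        "StatefulPartitionedCall/sequential/conv2d/Conv2D:0": [],
    },
    "param_encodings": {
        "StatefulPartitionedCall/sequential/conv2d/Conv2D/biases": [],
        "StatefulPartitionedCall/sequential/conv2d/Conv2D/weights": [],
    },
}
log_line = "2024-03-14 14:58:42,724 - 214 - INFO - Processed 3 quantization encodings"

print("entries in file:", count_override_entries(overrides))
print("processed by converter:", processed_count(log_line))
```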
~/snpe-1.61.0.3358/torch_ws$ snpe-dlc-quantize --input_dlc my_model.dlc --output_dlc quant_model.dlc --input_list a.txt --override_params --debug3
[INFO] InitializeStderr: DebugLog initialized.
[DEBUG1] Reading DLC: my_model.dlc
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] DlContainerImpl::CreateCatalog try insert quant_params
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] DlContainerImpl::CreateCatalog try insert quant_params
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] DlContainerImpl::CreateCatalog try insert quant_params
[INFO] Writing intermediate model
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] Reading DLC: quant_model.dlc
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] Quantizing weights with bw 8, min 0.000000, max 0.010000 delta 0.000039, offset 0.000000 for layer name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] Quantizing weights with bw 8, min -0.223613, max 0.225374 delta 0.001761, offset -127.000000 for layer name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] Quantizing weights with bw 8, min 0.000000, max 0.010000 delta 0.000039, offset 0.000000 for layer name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] Quantizing weights with bw 8, min -0.027285, max 0.027072 delta 0.000213, offset -128.000000 for layer name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] Reading DLC: quant_model.dlc
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] DnnRuntime::CreateNetwork open quant_model.dlc
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] Done getting network info
[DEBUG1] No resizing of network is requested.
[DEBUG1] Layer runtime: position = 0, type = 1
[DEBUG1] Layer runtime: position = 1, type = 1
[DEBUG1] Layer runtime: position = 2, type = 1
[DEBUG1] Output Tensor name for the model: StatefulPartitionedCall/sequential/conv2d/Conv2D:0
[DEBUG1] Output Tensor name for the model: dense:0
[DEBUG1] Output Tensor name for the model: input:0
[DEBUG1] Created input transition for buffer input:0
[DEBUG1] Added network output transition for input:0
[DEBUG1] Added network output transition for StatefulPartitionedCall/sequential/conv2d/Conv2D:0
[DEBUG1] Added network output transition for dense:0
[DEBUG1] Invalid cache record found. Cannot use cache.
[DEBUG1] Subnet number: 1
[DEBUG1] Network descriptor partition start 0 end 3
[DEBUG1] NetworkDescriptor::Partition adding input buffer name input:0
[DEBUG1] NetworkDescriptor::Partition adding output buffer name input:0
[DEBUG1] Network descriptor pushing layer input:0 into subnet
[DEBUG1] NetworkDescriptor::Partition adding input buffer name input:0
[DEBUG1] NetworkDescriptor::Partition adding output buffer name StatefulPartitionedCall/sequential/conv2d/Conv2D:0
[DEBUG1] Network descriptor pushing layer StatefulPartitionedCall/sequential/conv2d/Conv2D into subnet
[DEBUG1] NetworkDescriptor::Partition adding input buffer name StatefulPartitionedCall/sequential/conv2d/Conv2D:0
[DEBUG1] NetworkDescriptor::Partition adding output buffer name dense:0
[DEBUG1] Network descriptor pushing layer StatefulPartitionedCall/sequential/dense/MatMul into subnet
[DEBUG1] Entering CopyRuntimeInfo, trying to get RuntimeSpecificInfo
[DEBUG1] Adding layer input:0 id 0
[DEBUG1] Adding layer StatefulPartitionedCall/sequential/conv2d/Conv2D id 1
[DEBUG1] LayerCpu::SetupBuffers
[DEBUG1] LayerCpu::SetupBuffers input name input:0 buf 0x0x275b710
[DEBUG1] LayerCpu::SetupBuffers output name StatefulPartitionedCall/sequential/conv2d/Conv2D:0 buf 0x0x275b7d0
[DEBUG1] Adding layer StatefulPartitionedCall/sequential/dense/MatMul id 2
[DEBUG1] LayerCpu::SetupBuffers
[DEBUG1] LayerCpu::SetupBuffers input name StatefulPartitionedCall/sequential/conv2d/Conv2D:0 buf 0x0x275b7d0
[DEBUG1] LayerCpu::SetupBuffers output name dense:0 buf 0x0x26db0f0
[DEBUG1] NeuralNetworkCpu::FinishInit
[DEBUG1] *** Loading images from input list: a.txt***
[DEBUG1] Reading Non-Nv21 input files.
[DEBUG1] Running ForwardPropagate on the next network id 0
[DEBUG1] NeuralNetworkCpu running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu done running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] NeuralNetworkCpu done running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1]
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1]
[DEBUG1] NeuralNetworkCpu running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] NeuralNetworkCpu done running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1]
[DEBUG1] Done ForwardPropagate on the network id 0
[DEBUG1] Getting output name input:0, buffer dims {1, 28, 28, 1}, size 784
[DEBUG1] Getting output name StatefulPartitionedCall/sequential/conv2d/Conv2D:0, buffer dims {1, 26, 26, 12}, size 8112
[DEBUG1] Getting output name dense:0, buffer dims {1, 10}, size 10
[DEBUG1] Successfully parsed input [1/6]
[DEBUG1] Reading Non-Nv21 input files.
[DEBUG1] Running ForwardPropagate on the next network id 0
[DEBUG1] NeuralNetworkCpu running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu done running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] NeuralNetworkCpu done running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1]
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1]
[DEBUG1] NeuralNetworkCpu running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] NeuralNetworkCpu done running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1]
[DEBUG1] Done ForwardPropagate on the network id 0
[DEBUG1] Getting output name input:0, buffer dims {1, 28, 28, 1}, size 784
[DEBUG1] Getting output name StatefulPartitionedCall/sequential/conv2d/Conv2D:0, buffer dims {1, 26, 26, 12}, size 8112
[DEBUG1] Getting output name dense:0, buffer dims {1, 10}, size 10
[DEBUG1] Successfully parsed input [2/6]
[DEBUG1] Reading Non-Nv21 input files.
[DEBUG1] Running ForwardPropagate on the next network id 0
[DEBUG1] NeuralNetworkCpu running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu done running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] NeuralNetworkCpu done running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1]
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1]
[DEBUG1] NeuralNetworkCpu running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] NeuralNetworkCpu done running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1]
[DEBUG1] Done ForwardPropagate on the network id 0
[DEBUG1] Getting output name input:0, buffer dims {1, 28, 28, 1}, size 784
[DEBUG1] Getting output name StatefulPartitionedCall/sequential/conv2d/Conv2D:0, buffer dims {1, 26, 26, 12}, size 8112
[DEBUG1] Getting output name dense:0, buffer dims {1, 10}, size 10
[DEBUG1] Successfully parsed input [3/6]
[DEBUG1] Reading Non-Nv21 input files.
[DEBUG1] Running ForwardPropagate on the next network id 0
[DEBUG1] NeuralNetworkCpu running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu done running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] NeuralNetworkCpu done running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1]
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1]
[DEBUG1] NeuralNetworkCpu running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] NeuralNetworkCpu done running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1]
[DEBUG1] Done ForwardPropagate on the network id 0
[DEBUG1] Getting output name input:0, buffer dims {1, 28, 28, 1}, size 784
[DEBUG1] Getting output name StatefulPartitionedCall/sequential/conv2d/Conv2D:0, buffer dims {1, 26, 26, 12}, size 8112
[DEBUG1] Getting output name dense:0, buffer dims {1, 10}, size 10
[DEBUG1] Successfully parsed input [4/6]
[DEBUG1] Reading Non-Nv21 input files.
[DEBUG1] Running ForwardPropagate on the next network id 0
[DEBUG1] NeuralNetworkCpu running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu done running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] NeuralNetworkCpu done running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1]
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1]
[DEBUG1] NeuralNetworkCpu running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] NeuralNetworkCpu done running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1]
[DEBUG1] Done ForwardPropagate on the network id 0
[DEBUG1] Getting output name input:0, buffer dims {1, 28, 28, 1}, size 784
[DEBUG1] Getting output name StatefulPartitionedCall/sequential/conv2d/Conv2D:0, buffer dims {1, 26, 26, 12}, size 8112
[DEBUG1] Getting output name dense:0, buffer dims {1, 10}, size 10
[DEBUG1] Successfully parsed input [5/6]
[DEBUG1] Reading Non-Nv21 input files.
[DEBUG1] Running ForwardPropagate on the next network id 0
[DEBUG1] NeuralNetworkCpu running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu done running layer number 0, name input:0
[DEBUG1] NeuralNetworkCpu running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] NeuralNetworkCpu done running layer number 1, name StatefulPartitionedCall/sequential/conv2d/Conv2D
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1]
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1]
[DEBUG1] NeuralNetworkCpu running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] NeuralNetworkCpu done running layer number 2, name StatefulPartitionedCall/sequential/dense/MatMul
[DEBUG1] ClusterMgr::getInstance: for ThreadId : 140056158484416
[DEBUG1]
[DEBUG1] Done ForwardPropagate on the network id 0
[DEBUG1] Getting output name input:0, buffer dims {1, 28, 28, 1}, size 784
[DEBUG1] Getting output name StatefulPartitionedCall/sequential/conv2d/Conv2D:0, buffer dims {1, 26, 26, 12}, size 8112
[DEBUG1] Getting output name dense:0, buffer dims {1, 10}, size 10
[DEBUG1] Successfully parsed input [6/6]
[DEBUG1] Overriding activation quantization with bw 8, min 0.000000, max 12.823444 delta 0.050288, offset 0.000000 for tensor input:0
[INFO] Setting activation for layer: input:0 and buffer: input:0
[INFO] bw: 8, min: 0.000000, max: 12.823444, delta: 0.050288, offset: 0.000000
[DEBUG1] Overriding activation quantization with bw 8, min 0.000000, max 12.823444 delta 0.050288, offset 0.000000 for tensor StatefulPartitionedCall/sequential/conv2d/Conv2D:0
[INFO] Setting activation for layer: StatefulPartitionedCall/sequential/conv2d/Conv2D and buffer: StatefulPartitionedCall/sequential/conv2d/Conv2D:0
[INFO] bw: 8, min: 0.000000, max: 12.823444, delta: 0.050288, offset: 0.000000
[INFO] Setting activation for layer: StatefulPartitionedCall/sequential/dense/MatMul and buffer: dense:0
[INFO] bw: 8, min: -1.077195, max: 0.492432, delta: 0.006155, offset: -175.000000
[INFO] Writing quantized model to: quant_model.dlc
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[DEBUG1] DlContainerImpl::CreateCatalog try insert dlc.metadata
[DEBUG1] DlContainerImpl::CreateCatalog try insert model
[DEBUG1] DlContainerImpl::CreateCatalog try insert model.params
[INFO] DebugLog shutting down.
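To make the symptom easier to see, the "Quantizing weights" lines in the quantizer log can be compared against the weight range requested in the overrides file. The following is a minimal sketch with a hypothetical helper name; the regular expression only assumes the log line format shown above.

```python
import re

# Matches lines such as:
# "Quantizing weights with bw 8, min -0.223613, max 0.225374 delta 0.001761,
#  offset -127.000000 for layer name <layer>"
WEIGHT_RE = re.compile(
    r"Quantizing weights with bw (\d+), min (-?[\d.]+), max (-?[\d.]+) "
    r"delta ([\d.]+), offset (-?[\d.]+) for layer name (\S+)"
)

def weight_ranges(log_text):
    """Return a list of (layer_name, min, max) found in the quantizer log."""
    return [(m.group(6), float(m.group(2)), float(m.group(3)))
            for m in WEIGHT_RE.finditer(log_text)]

# Two lines copied from the log in this post.
sample = (
    "[DEBUG1] Quantizing weights with bw 8, min 0.000000, max 0.010000 delta 0.000039, "
    "offset 0.000000 for layer name StatefulPartitionedCall/sequential/conv2d/Conv2D\n"
    "[DEBUG1] Quantizing weights with bw 8, min -0.223613, max 0.225374 delta 0.001761, "
    "offset -127.000000 for layer name StatefulPartitionedCall/sequential/conv2d/Conv2D\n"
)

# Range requested for the Conv2D weights in the overrides file.
override_min, override_max = -2.1006477158567995, 1.700559472933134

for layer, lo, hi in weight_ranges(sample):
    applied = abs(lo - override_min) < 1e-4 and abs(hi - override_max) < 1e-4
    print(layer, lo, hi, "override applied" if applied else "override NOT applied")
```

Neither logged range for the Conv2D layer matches the requested min/max, whereas the activation lines in the same log do report the overridden values (min 0.000000, max 12.823444).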