So far I've been able to get the image-classifiers example running, and we've all had a blast playing around with it, taking pictures of things and seeing what it says. My next step is to get our own networks installed, and I've hit a roadblock. We are currently working with both inception-v4 and inception-resnet-v2 networks, and I have not been able to get the snpe-tensorflow-to-dlc script to succeed on either architecture. I invoke it as follows:
snpe-tensorflow-to-dlc \
--graph graph_opt.pb \
--input_dim input 299,299,3 \
--out_node InceptionV4/Logits/Predictions \
--dlc inception-v4.dlc
And am presented with the following output:
2017-11-24 19:02:47.238671: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-11-24 19:02:47.238690: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-11-24 19:02:47.238708: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-11-24 19:02:47.238715: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-11-24 19:02:47.238731: W tensorflow/core/platform/cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-11-24 19:02:47.299022: E tensorflow/stream_executor/cuda/cuda_driver.cc:406] failed call to cuInit: CUDA_ERROR_NO_DEVICE
2017-11-24 19:02:47.299060: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:158] retrieving CUDA diagnostic information for host: tyrion
2017-11-24 19:02:47.299066: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:165] hostname: tyrion
2017-11-24 19:02:47.299105: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:189] libcuda reported version is: 384.90.0
2017-11-24 19:02:47.299131: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:369] driver version file contents: """NVRM version: NVIDIA UNIX x86_64 Kernel Module 384.66 Tue Aug 1 16:02:12 PDT 2017
GCC version: gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04.3)
"""
2017-11-24 19:02:47.299152: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:193] kernel reported version is: 384.66.0
2017-11-24 19:02:47.299159: E tensorflow/stream_executor/cuda/cuda_diagnostics.cc:303] kernel version 384.66.0 does not match DSO version 384.90.0 -- cannot find working devices in this configuration
2017-11-24 19:02:59,727 - 388 - WARNING - WARNING_TF_SCOPE_OP_NOT_CONSUMED: Scope (InceptionV4/Logits/PreLogitsFlatten) operation(s) not consumed by converter: [u'Shape', u'Slice', u'Slice', u'Prod', u'ExpandDims'].
2017-11-24 19:02:59,727 - 122 - ERROR - Conversion failed: Some nodes in the Tensorflow graph were not resolved to a layer!
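As an aside on what the converter appears to be rejecting: a Shape → Slice → Prod → ExpandDims chain inside a flatten scope is typically the dynamic-flatten pattern, where the flattened size is computed from the tensor's runtime shape. With a fixed 299x299 input the pre-logits shape is static, so the same result can be produced by a reshape with hard-coded dimensions. A minimal sketch of the equivalence (the tensor sizes here are illustrative, not read from the actual graph):

```python
import numpy as np

# Pre-logits activations: batch of 2, spatial 1x1, 1536 channels (illustrative)
x = np.arange(2 * 1 * 1 * 1536).reshape(2, 1, 1, 1536).astype(np.float32)

# Dynamic flatten: keep the batch dim, multiply the remaining dims at runtime
# (this is what the Shape/Slice/Prod/ExpandDims ops compute in the graph)
flat_dim = int(np.prod(x.shape[1:]))
y_dynamic = x.reshape(x.shape[0], flat_dim)

# Static flatten: dimensions hard-coded for the fixed input size
y_static = x.reshape(2, 1536)

print(np.array_equal(y_dynamic, y_static))  # True
```

If the dynamic-flatten subgraph is the only blocker, rewriting it as a fixed Reshape before freezing the graph may let the converter map the scope to a single layer.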
Similarly, if I try to convert an inception-resnet-v2 network:
snpe-tensorflow-to-dlc \
--graph graph_opt.pb \
--input_dim input 299,299,3 \
--out_node InceptionResnetV2/Logits/Predictions \
--dlc inception-resnet-v2.dlc
I'm met with:
(The same CPU-feature warnings and CUDA diagnostics as above, followed by:)
2017-11-24 19:06:04,941 - 388 - WARNING - WARNING_TF_SCOPE_OP_NOT_CONSUMED: Scope (InceptionResnetV2/InceptionResnetV2/Repeat/block35_1) operation(s) not consumed by converter: [u'Mul', u'Add'].
2017-11-24 19:06:04,941 - 122 - ERROR - Conversion failed: Some nodes in the Tensorflow graph were not resolved to a layer!
Am I doing something wrong? I presume the CUDA messages are unimportant at this stage, but perhaps they indicate a deeper problem?
A little more detail... Here are the nodes from the graph_def that it seems to be complaining about.
inception-v4:
And from the inception-resnet-v2 graph_def:
You may want to upgrade to the latest SDK release and try the --allow_unconsumed_nodes option in the snpe-tensorflow-to-dlc tool.
Thanks for the prompt response! I downloaded version 1.8.0 (presumably the latest?) and tried adding the --allow_unconsumed_nodes flag, but it still fails, although with much more profuse error output this time.
for inception-resnet-v2:
from inception-v4:
and I went ahead and tried my own inception-v3, assuming that would work since it's used in the example, but that failed as well:
The first two models are not currently supported, I suspect because some of the Add/Mul operations have constants as inputs.
SNPE will only convert successfully if the inputs of an element-wise Add/Mul are actual layers (constant inputs are not currently supported).
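To illustrate one possible workaround (a sketch, not an official recommendation): because convolution is linear, a scalar constant feeding an element-wise Mul, like the residual scaling in Inception-ResNet-v2's block35 units, can often be folded into the preceding convolution's weights and bias, eliminating the Mul node entirely. Here a 1x1 convolution is modeled as a matrix multiply over channels, and the 0.17 scale is a stand-in for the graph's actual constant:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((8, 32))      # 8 spatial positions, 32 input channels
W = rng.standard_normal((32, 16))     # 1x1 conv kernel as a channel matmul
b = rng.standard_normal(16)           # conv bias
scale = 0.17                          # residual scaling constant (illustrative)

# Original pattern: conv output multiplied by a scalar constant (the Mul node)
y_mul = (x @ W + b) * scale

# Folded: the scalar baked into the weights and bias; no Mul op remains
y_folded = x @ (W * scale) + b * scale

print(np.allclose(y_mul, y_folded))   # True -- numerically equivalent
```

The same folding applies to a constant-input Add by adjusting only the bias, so rewriting the graph this way before conversion removes the unsupported element-wise ops without changing the network's output.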
With regard to the inception-v3 model, where is it from and what command-line arguments are you using? The SDK documentation has an inception-v3 example that you should be able to follow to completion.
I managed to get both inception-v3 and inception-v4 networks converted with some tweaks to the layers used.
I'm stuck with the inception-resnet-v2 network though. Without support for scalar multiplies and additions, and without support for tiling/repeating, there seems to be no path forward.