I have a couple of questions about sconv_mm. I have been unable to make it produce the expected results.
1. The documentation says the data is in channel-major format. With 2 channels, width 3 and height 2, I assume this means
C1 C1 C1 C1 C1 C1 C2 C2 C2 C2 C2 C2
Is this correct? And does the same apply to the filters and the output?
2. I am surprised to see that the output width/height has to be specified. It should be derivable from the input size and kernel_size/stride/pad. What is it used for? I assume the output is generated for each channel, also in channel-major format as above?
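To make both assumptions concrete, here is a minimal sketch in plain Python. It does not call sconv_mm; the helper names `channel_major_index` and `conv_output_size` are my own, and the output-size formula is the usual convolution arithmetic (symmetric padding, no dilation), which may or may not be what sconv_mm expects.

```python
def channel_major_index(c, y, x, height, width):
    # Hypothetical helper: flat offset of (channel c, row y, column x)
    # when all values of one channel are stored before the next channel.
    return c * height * width + y * width + x


def conv_output_size(in_size, kernel_size, stride, pad):
    # Standard convolution arithmetic, assuming the same padding on both
    # sides and no dilation. Not confirmed against the sconv_mm docs.
    return (in_size + 2 * pad - kernel_size) // stride + 1


channels, height, width = 2, 2, 3

# Fill a flat buffer with the channel label of each element.
layout = [None] * (channels * height * width)
for c in range(channels):
    for y in range(height):
        for x in range(width):
            layout[channel_major_index(c, y, x, height, width)] = f"C{c + 1}"

print(" ".join(layout))
# prints: C1 C1 C1 C1 C1 C1 C2 C2 C2 C2 C2 C2

# Output size I would expect for a 3x3 kernel, stride 1, pad 1.
out_width = conv_output_size(width, kernel_size=3, stride=1, pad=1)
out_height = conv_output_size(height, kernel_size=3, stride=1, pad=1)
print(out_width, out_height)
# prints: 3 2
```

If sconv_mm uses a different layout or a different size rule, that would explain the results I am seeing.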
Any hints on how to make sconv_mm work are highly appreciated. Example code would be even more awesome.