Hi all,
recently, I am trying to implement neural networks on one of my Cortex-M devices. However, I wasn't know what "int_width" in arm_nn_activations_direct_q7 is about. The definition in the document is: "bit-width of the integer part, assume to be smaller than 3" but it is still unclear to me. Please give me some example about it as I am trying to transfer neural networks from other DL framework like tensorflow and pytorch. Many thanks!!
So in the quantization point of view, the int_width is a scale factor that remaps the float values into [-8, 8), I suppose?
As I undertand it, q7_t is a single data-type being used as one of q7, q1.6, q2.5, or q3.4 (all in TI notation). By default, q7_t means q7 (in TI notation). To distinguish between different uses of q7_t, they introduced another variable int_width, so q7_t with int_width==3 is q3.4 (TI), q7_t with int_width==0 is q7 (TI), etc. They could have created different types q7_t, q1_6_t, q2_5_t, q3_4_t, etc to make the Q format explicit, i.e. the API could have been designed a bit better.
So how could we know the output of the previous layer is q1.6, q2.5 or q3.4 according to the library?
I am unable answer that question. The caller of arm_nn_activations_direct_q7 must know the format/range of the input and so must appopriately select the int_width value.