Vectorizing Compiler

Note: This was originally posted on 29th June 2010 at http://forums.arm.com

Hi,

Please see the following tool chain

CPP=arm-none-linux-gnueabi-gcc
SWS=-march=armv7-a -mtune=cortex-a8 -mfpu=neon -mfloat-abi=softfp -flax-vector-conversions
Target is a BeagleBoard.

How can I disable vectorization?
With the toolchain settings above, the compiler generates vectorized code by default for the given C source.
If I write NEON C intrinsics, will the compiler override its own optimization and follow the programmer's NEON code?

Please help me resolve these doubts.
  • Note: This was originally posted on 1st July 2010 at http://forums.arm.com

    [...]
    Then in my IMViewer application I measure the performance of both versions.
    The code fragment is given below.

    [...]
    void main(int argc, char**argv)
    {
      gettimeofday(&First, NULL);
      [...]
    }


    I'd suggest using 'times()' or 'getrusage(RUSAGE_SELF, ...)' instead of 'gettimeofday()', since gettimeofday reports wall-clock time, which includes time the CPU spends running other processes, not just yours.  The other functions should be less susceptible to interference from outside sources and give you more consistent numbers.  But it may not make much difference on a quiet system.
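
    As a rough illustration of the getrusage() suggestion, here is a minimal sketch that measures the CPU time of the current process only.  The helper names cpu_seconds() and work() are made up for this example; work() is just a placeholder for whichever routine is being compared.

    ```c
    #include <stdio.h>
    #include <sys/resource.h>
    #include <sys/time.h>

    /* Hypothetical helper: user + system CPU time consumed by this
       process so far, in seconds.  Returns -1.0 if getrusage() fails. */
    static double cpu_seconds(void)
    {
        struct rusage ru;

        if (getrusage(RUSAGE_SELF, &ru) != 0)
            return -1.0;
        return (double)ru.ru_utime.tv_sec + (double)ru.ru_utime.tv_usec / 1e6
             + (double)ru.ru_stime.tv_sec + (double)ru.ru_stime.tv_usec / 1e6;
    }

    /* Hypothetical stand-in for the code under test: sums 0..n-1.
       'volatile' keeps the compiler from optimizing the loop away. */
    static unsigned long work(unsigned long n)
    {
        volatile unsigned long sum = 0;
        unsigned long i;

        for (i = 0; i < n; i++)
            sum += i;
        return sum;
    }

    int main(void)
    {
        double start, end;
        unsigned long result;

        start = cpu_seconds();
        result = work(50000000UL);
        end = cpu_seconds();

        printf("result=%lu cpu=%.3f s\n", result, end - start);
        return 0;
    }
    ```

    Because only time charged to this process is counted, the figure should move less when something else is busy on the board, though the resolution of the reported times depends on the kernel.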

    And the pedant in me says, that should be 'int main() { ... return 0; }'  -- 'void main() { ... }' isn't really legal.  But that's not causing any timing difference.

    But sadly the performance of version 2 is not good. It is close to the C version. I can't spot
    what the problem is here!


    Since you're specifying -O3 for the C version, gcc may be doing vectorization.  You can add -ftree-vectorizer-verbose=2 and look for 'LOOP VECTORIZED' in gcc's messages.  Or you can run 'arm-...-objdump -d' on the .o file (or even the executable?) and look for the vector instructions.
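
    To try this out, a small test file along the following lines may help.  It is only a sketch: the file name vec.c and the function add_arrays() are invented for illustration, and whether the loop actually gets vectorized depends on the gcc version and flags.  GCC also has -fno-tree-vectorize, which should turn the auto-vectorizer off even at -O3 and so gives a scalar baseline to compare against.  Newer GCC releases replaced -ftree-vectorizer-verbose with -fopt-info-vec, so the reporting flag may need adjusting.

    ```c
    /* vec.c (hypothetical file name).
     *
     * Possible builds to compare, using the flags from this thread:
     *   arm-none-linux-gnueabi-gcc -O3 -march=armv7-a -mtune=cortex-a8 \
     *       -mfpu=neon -mfloat-abi=softfp -ftree-vectorizer-verbose=2 -c vec.c
     *   arm-none-linux-gnueabi-gcc -O3 -march=armv7-a -mtune=cortex-a8 \
     *       -mfpu=neon -mfloat-abi=softfp -fno-tree-vectorize -c vec.c
     * Then disassemble each vec.o with objdump -d and compare the loop bodies.
     */
    #include <stdio.h>

    #define N 1024

    /* Hypothetical kernel: element-wise add, a typical candidate
       for automatic vectorization. */
    static void add_arrays(const int *a, const int *b, int *out, int n)
    {
        int i;

        for (i = 0; i < n; i++)
            out[i] = a[i] + b[i];
    }

    int main(void)
    {
        static int a[N], b[N], out[N];
        int i;

        for (i = 0; i < N; i++) {
            a[i] = i;
            b[i] = 2 * i;
        }
        add_arrays(a, b, out, N);

        /* Print two results so the computation cannot be discarded. */
        printf("%d %d\n", out[0], out[N - 1]);
        return 0;
    }
    ```

    If the hand-written intrinsics version runs at about the same speed as the -O3 build but clearly faster than the -fno-tree-vectorize build, that would suggest the compiler was already vectorizing the C loop.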


    The following doubts still exist:

    1. Can I configure the L1 and L2 cache sizes in the OS kernel?

    No, the kernel should enable and deal with the caches -- that's part of its job.

    2. Is any hand-written assembly needed to enable the NEON unit on the BeagleBoard?

    That's also the kernel's job.  If you executed a NEON instruction with a kernel that had NEON disabled, I'd expect your process to be killed by SIGILL.  'uname -a' will tell us the kernel version number.

    3. My gcc version is Red Hat 3.4.4-2

    That looks like the host compiler.  I should have said 'arm-none-linux-gnueabi-gcc --version'
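
    On question 2, one way to check from user space whether the kernel reports NEON is to look at the "Features" line that ARM Linux kernels print in /proc/cpuinfo.  The sketch below does that; the helper name features_line_has() is invented for this example, and it assumes a Linux system with /proc mounted.

    ```c
    #include <stdio.h>
    #include <string.h>

    /* Hypothetical helper: returns 1 if 'line' is a /proc/cpuinfo
       "Features" line that lists 'feature' as a whole word, else 0. */
    static int features_line_has(const char *line, const char *feature)
    {
        const char *p;
        size_t len = strlen(feature);

        if (len == 0 || strncmp(line, "Features", 8) != 0)
            return 0;
        p = strchr(line, ':');
        if (p == NULL)
            return 0;
        p++;
        while ((p = strstr(p, feature)) != NULL) {
            char before = p[-1];   /* safe: p is always past the ':' */
            char after = p[len];

            if ((before == ' ' || before == '\t' || before == ':') &&
                (after == ' ' || after == '\n' || after == '\0'))
                return 1;
            p += len;
        }
        return 0;
    }

    int main(void)
    {
        char line[1024];
        int found = 0;
        FILE *f = fopen("/proc/cpuinfo", "r");

        if (f == NULL) {
            perror("/proc/cpuinfo");
            return 1;
        }
        while (fgets(line, sizeof line, f) != NULL) {
            if (features_line_has(line, "neon"))
                found = 1;
        }
        fclose(f);

        printf("NEON %s in /proc/cpuinfo\n", found ? "listed" : "not listed");
        return 0;
    }
    ```

    If "neon" is not listed on the target, the SIGILL behaviour described above is the likely outcome when a NEON instruction is executed.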