
SSEtoNEON FAQ

_mm_sfence() and _mm_pause() are two intrinsics from the Intel instruction set.

Unlike ordinary arithmetic instructions, they do not compute anything: one orders stores and the other is a hint used for optimization. If I want to implement them on Arm, is there a suitable instruction or statement to replace them?

Thanks.

  • Hi,

    Here are two repositories with the SIMD translations between two architectures:

    github.com/.../sse2neon

    github.com/.../ARM_NEON_2_x86_SSE

    From https://github.com/DLTcollab/sse2neon, you can download the code and find sse2neon.h, where _mm_sfence() is implemented with a compiler builtin and _mm_pause() with an inline-assembly instruction, as there are no equivalent Neon instructions:

    /* Streaming Extensions */

    // Guarantees that every preceding store is globally visible before any
    // subsequent store.
    // msdn.microsoft.com/.../5h2w73d1(v=vs.90).aspx
    FORCE_INLINE void _mm_sfence(void)
    {
        __sync_synchronize();
    }

    // Pause the processor. This is typically used in spin-wait loops and depending
    // on the x86 processor typical values are in the 40-100 cycle range. The
    // 'yield' instruction isn't a good fit because it's effectively a nop on most
    // Arm cores. Experience with several databases has shown an 'isb' is
    // a reasonable approximation.
    FORCE_INLINE void _mm_pause()
    {
        __asm__ __volatile__("isb\n");
    }
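
To show where a pause-style hint usually sits, here is a minimal sketch of a spin lock that calls such a hint while it waits. The names `cpu_relax`, `spinlock_t`, `spin_lock`, `spin_trylock` and `spin_unlock` are hypothetical and not part of sse2neon. The Arm branch uses the same `isb` as the listing above (assuming AArch64 or ARMv7 and later), and the x86 branch assumes the GCC/Clang builtin `__builtin_ia32_pause()`.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical helper (not from sse2neon): a spin-wait hint that compiles
 * on both architectures. Other targets fall through to no hint at all. */
static inline void cpu_relax(void)
{
#if defined(__aarch64__) || (defined(__ARM_ARCH) && __ARM_ARCH >= 7)
    __asm__ __volatile__("isb\n");   /* same instruction as _mm_pause() above */
#elif defined(__x86_64__) || defined(__i386__)
    __builtin_ia32_pause();          /* emits the x86 'pause' instruction */
#endif
}

/* Hypothetical minimal spin lock built on a C11 atomic_flag. */
typedef struct {
    atomic_flag locked;
} spinlock_t;

/* Try once to take the lock; returns true if this call acquired it. */
static inline bool spin_trylock(spinlock_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire);
}

/* Spin until the lock is acquired, issuing the hint on every failed attempt. */
static inline void spin_lock(spinlock_t *l)
{
    while (!spin_trylock(l))
        cpu_relax();
}

/* Release the lock; the release ordering publishes the protected writes. */
static inline void spin_unlock(spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

A lock is initialised with `spinlock_t l = { ATOMIC_FLAG_INIT };`. The hint does not affect correctness. It only changes how aggressively a waiting core polls, so the right instruction and the number of repetitions are tuning decisions for the target core.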