Big-endian kernel run on Juno board

Hi All,

     My Juno board can run a big-endian kernel, but only in UP mode. When I compile the kernel in SMP mode, it always hangs in "arch_spin_lock".

     I checked the PE status: when it hangs, the PC always points to line 3 of the following code from "arch_spin_lock".

     line1     "     sevl\n"
     line2     "2:  wfe\n"
     line3     "     ldaxrh %w2, %4\n"         //pc always points here

     I don't know how this could happen, and I really need some help getting the Juno board to run an SMP kernel.

Thanks

Best regards

  • Hello,

    Please note that as per our response to your previous post here, running a big-endian Linux kernel on the Juno is not officially supported. With that said, I'll outline my thoughts below.

    While executing a WFI or WFE instruction, the PC will point to the address of the next instruction. This means that the PE is getting stuck on the WFE waiting for an event, hence the PC pointing to the LDAXRH instruction.

    The arch_spin_lock() function is defined here.

    34 static inline void arch_spin_lock(arch_spinlock_t *lock)
    35 {
    36     unsigned int tmp;
    37     arch_spinlock_t lockval, newval;
    38
    39     asm volatile(
    40     /* Atomically increment the next ticket. */
    41 "   prfm pstl1strm, %3\n"
    42 "1: ldaxr %w0, %3\n"
    43 "   add %w1, %w0, %w5\n"
    44 "   stxr %w2, %w1, %3\n"
    45 "   cbnz %w2, 1b\n"
    46     /* Did we get the lock? */
    47 "   eor %w1, %w0, %w0, ror #16\n"
    48 "   cbz %w1, 3f\n"
    49     /*
    50      * No: spin on the owner. Send a local event to avoid missing an
    51      * unlock before the exclusive load.
    52      */
    53 "   sevl\n"
    54 "2: wfe\n"
    55 "   ldaxrh %w2, %4\n"
    56 "   eor %w1, %w2, %w0, lsr #16\n"
    57 "   cbnz %w1, 2b\n"
    58     /* We got the lock. Critical section starts here. */
    59 "3:"
    60     : "=&r" (lockval), "=&r" (newval), "=&r" (tmp), "+Q" (*lock)
    61     : "Q" (lock->owner), "I" (1 << TICKET_SHIFT)
    62     : "memory");
    63 }

    The PE is failing to acquire the lock, so it makes a single pass of the second loop (lines 54-57) before getting stuck waiting for an event at line 54. The SEVL on line 53 lets the PE pass line 54's WFE on the first pass, but when the CBNZ on line 57 loops back to line 54 there is no pending event, so the PE waits there.

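    The idea here is that the PE will receive a WFE wakeup event when the current owner of the spinlock releases it using an STLR (Store Release) instruction, which you can see in the arch_spin_unlock() function here. That store clears the exclusive monitor the waiting PE set with LDAXRH, and clearing the monitor is what generates the event.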
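    To make the lock/unlock handshake easier to follow, here is a rough user-space model of the same ticket scheme in portable C11. This is only a sketch: `ticket_lock_t`, `ticket_lock()`, `ticket_unlock()` and `ticket_is_locked()` are hypothetical names and not the kernel's code. It busy-polls where the kernel waits on WFE, so it cannot reproduce a missed wakeup event, only the ticket logic.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model of a ticket lock; not the kernel's type. */
typedef struct {
    _Atomic uint16_t owner; /* ticket currently being served */
    _Atomic uint16_t next;  /* next ticket to hand out */
} ticket_lock_t;

static void ticket_lock(ticket_lock_t *l)
{
    /* Take a ticket: models the LDAXR/ADD/STXR loop (lines 42-45). */
    uint16_t my = atomic_fetch_add_explicit(&l->next, 1, memory_order_relaxed);

    /*
     * Spin until our ticket is served: models the LDAXRH/EOR/CBNZ loop
     * (lines 54-57). The real code sleeps in WFE between polls; this
     * model simply polls.
     */
    while (atomic_load_explicit(&l->owner, memory_order_acquire) != my)
        ;
}

static void ticket_unlock(ticket_lock_t *l)
{
    /* Only the holder writes owner, so a plain read-then-store is enough. */
    uint16_t o = atomic_load_explicit(&l->owner, memory_order_relaxed);

    /* Store-release of owner + 1: the counterpart of the STLR in unlock. */
    atomic_store_explicit(&l->owner, (uint16_t)(o + 1), memory_order_release);
}

static bool ticket_is_locked(ticket_lock_t *l)
{
    /* The lock is held whenever a ticket has been issued but not served. */
    return atomic_load_explicit(&l->owner, memory_order_relaxed) !=
           atomic_load_explicit(&l->next, memory_order_relaxed);
}
```

    Note that in this model, calling ticket_lock() twice on the same thread without an unlock in between spins forever, which is the same self-deadlock that can happen if a PE tries to retake a spinlock it already holds.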

    So with all of that said, three possibilities come to mind:

    1. The owner of the spinlock is never calling arch_spin_unlock()
    2. The owner of the spinlock is calling arch_spin_unlock(), but the dead-locked PE is not receiving a WFE wakeup event
    3. The owner of the spinlock is the same PE calling arch_spin_lock(), and is doing so before it has released the spinlock

    So if I were you I would start by asking yourself the following:

    1. How many CPUs are running? Is this the only CPU? If so, why is it trying to reacquire the spinlock before it's released it?
    2. How many CPUs are dead-locked? All or only some? Can you try forcing the spinlock owner to call arch_spin_unlock()?
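    When answering these questions from a debugger, it may help to decode the 32-bit lock word by hand. The sketch below assumes the register view implied by the assembly above: after the 32-bit LDAXR, the owner ticket sits in the low halfword and the next ticket in the high halfword, with TICKET_SHIFT equal to 16 (inferred from the `ror #16` and `lsr #16` operands). The helper names are hypothetical. If you read the lock from a raw memory dump instead of a register, check the halfword order for your endianness before trusting the result.

```c
#include <stdbool.h>
#include <stdint.h>

#define TICKET_SHIFT 16 /* assumed, matching the #16 shifts in the asm */

/* Same test as "eor %w1, %w0, %w0, ror #16": zero iff both halves match. */
static bool lock_word_is_free(uint32_t v)
{
    uint32_t rotated = (v >> 16) | (v << 16);
    return (v ^ rotated) == 0;
}

/* Ticket currently being served (low halfword of the register value). */
static uint16_t lock_word_owner(uint32_t v)
{
    return (uint16_t)(v & 0xffffu);
}

/* Next ticket to be handed out (high halfword of the register value). */
static uint16_t lock_word_next(uint32_t v)
{
    return (uint16_t)(v >> TICKET_SHIFT);
}

/* Number of PEs queued behind the current holder; 0 if the lock is free. */
static unsigned lock_word_waiters(uint32_t v)
{
    /* 16-bit subtraction so that ticket wraparound is handled. */
    uint16_t outstanding = (uint16_t)(lock_word_next(v) - lock_word_owner(v));
    return outstanding ? outstanding - 1u : 0u;
}
```

    For example, a register value of 0x00030001 decodes as owner 1 and next 3: the lock is held by the PE with ticket 1, and one other PE (ticket 2) is waiting. If the owner field never changes while PEs sit in WFE, that points towards the holder never unlocking; if it does change but a waiter stays asleep, that points towards a missed wakeup event.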

    As I mentioned earlier, running a big-endian Linux kernel is not officially supported on the Juno so you may have to consult the open source community regarding this issue.

    I hope this helps,

    Ash.
