Optimizing specific code

Hi,
I'm looking for a way to optimize the following code:

void prepareData(uint16_t* dataOut, uint16_t* dataIn, uint32_t length)
{
        uint32_t i;
        for (i = 0; i < length; i += 2)
        {
                dataOut[i] = (dataIn[i+1] >> 4) & 0x03FF;
                dataOut[i+1] = (dataIn[i] >> 4) & 0x03FF;
        }
}

It just swaps each pair of 16-bit words, shifts them right by 4 and clears the upper 6 bits.
I already tried the hints from http://www.keil.com/support/man/docs/armcc/armcc_cjajacch.htm , but it gets slower with a decrementing counter.

It takes about 50 ms (55 ms with a decrementing counter) for a length of 350000.
Target: AT91SAM9260, executed from external RAM.

Marcus Harnisch over 16 years ago in reply to Stefan Hartwig

> I tried to enable it, by setting it to 0x0005107D -
> MMU and DCache enabled - but the processor then hangs.
> Is there a special proceeding to enable the data cache?

Did you set up a page table at all? The MMU needs one to work properly. Don't forget to initialize cp15,c2 (TTB). RTFTRM ;-)
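For illustration, here is a minimal sketch of what such a page table could look like: a flat (virtual == physical) first-level table of 1 MB sections, with one region marked cacheable and bufferable. The helper name `buildFlatPageTable`, the macro names, and the SDRAM base and size used in the usage note are assumptions for this sketch, not taken from the thread. The descriptor layout follows the ARM926EJ-S section format, but check it against the TRM before relying on it.

```c
#include <stdint.h>

/* First-level section descriptor bits (ARM926EJ-S, 1 MB sections). */
#define SECTION_DESC   0x00000012u  /* bits[1:0] = 10 (section), bit 4 set */
#define SECTION_AP_RW  0x00000C00u  /* AP = 11: read/write access */
#define SECTION_C      0x00000008u  /* cacheable */
#define SECTION_B      0x00000004u  /* bufferable */

/*
 * Hypothetical helper: fill a 4096-entry flat-mapped table.
 * 'table' must be 16 KB aligned on the target. All sections use
 * domain 0, so domain 0 must be enabled in cp15,c3 as well.
 */
static void buildFlatPageTable(uint32_t *table,
                               uint32_t cachedBase,
                               uint32_t cachedSizeMB)
{
    uint32_t first = cachedBase >> 20;  /* index of first cached section */
    uint32_t mb;

    for (mb = 0; mb < 4096u; mb++) {
        uint32_t desc = (mb << 20) | SECTION_AP_RW | SECTION_DESC;
        if (mb >= first && mb - first < cachedSizeMB)
            desc |= SECTION_C | SECTION_B;  /* cache only the RAM region */
        table[mb] = desc;
    }
}

/*
 * On the target, before setting the M and C bits in cp15,c1:
 *   MCR p15, 0, <table address>, c2, c0, 0   ; TTB
 *   MCR p15, 0, <domain access>, c3, c0, 0   ; e.g. domain 0 = client
 */
```

Assuming the external SDRAM sits at 0x20000000 and is 64 MB in size, the call would be `buildFlatPageTable(table, 0x20000000u, 64)`. Peripheral regions stay uncached because only the given range gets the C and B bits.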

Looking at the assembler output, I am not sure if the unrolled loop is better than the single-word parallel version that I posted.
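The single-word parallel version referred to above is not reproduced in this part of the thread. As a rough sketch of the idea only (not the code that was actually posted): load both 16-bit values as one 32-bit word, shift and mask both halves at once, then swap the halves with a rotate by 16. The function name `prepareDataWord` is hypothetical. `memcpy` is used here to stay portable with respect to alignment; on the target, a direct 32-bit access to word-aligned buffers would be the likely choice.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: handle two 16-bit samples per 32-bit operation. */
static void prepareDataWord(uint16_t *dataOut, const uint16_t *dataIn,
                            uint32_t length)
{
    uint32_t i;

    /* A trailing odd element, if any, is left untouched. */
    for (i = 0; i + 1 < length; i += 2) {
        uint32_t w;

        memcpy(&w, &dataIn[i], sizeof w);   /* load two halfwords */
        w = (w >> 4) & 0x03FF03FFu;         /* shift and mask both halves;
                                               the mask also removes bits that
                                               leaked from the upper half */
        w = (w >> 16) | (w << 16);          /* rotate by 16: swap the halves */
        memcpy(&dataOut[i], &w, sizeof w);  /* store two halfwords */
    }
}
```

Whether this beats the unrolled halfword loop depends on the compiler output and on memory timing, so it is worth measuring both on the actual hardware.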

Regards
Marcus
http://www.doulos.com/arm/

PS: With the RealView Compiler, -Otime seems to be detrimental to the performance of all variants that have been posted here.