Which compiler optimizations can be applied to this code?

compiler: linaro-aarch64-2020.09-gcc10.2-linux5.4

optimization option: -O3

CPU: Arm Cortex-A53, 1 GHz

Hello, I am a newbie.

Code 1 is about 3.1x slower than code 2:

- code1: 106 ms

- code2: 34 ms

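I think the only(?) difference is that code 2 uses constants for the image width, height, and bit shift, while code 1 receives them as function arguments.

I really wonder why there is such a big performance difference between the two versions.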

<code 1: img_bitshift function>

void img_bitshift
(
    CAMERA_OPAQUE_t *pstDevInfo,
    int16_t img_width,
    int16_t img_height,
    int16_t bitshift
)
{
    uint16_t *src_img = (uint16_t *) pstDevInfo->some_field.pVirt;
    uint8_t *dst_img = (uint8_t *) pstDevInfo->some_field.pVirt;
    for (int i = 0; i < img_height; i++)
    {
        for (int j = 0; j < img_width; j++)
        {
            uint16_t pixel = src_img[i*img_width + j];
            dst_img[i*img_width + j] = pixel >> bitshift;
        }
    }
}

<code 2: copy and paste of img_bitshift function>

void dummy
(
    CAMERA_OPAQUE2_t *camerainfo,
    DummyType *dummy
)
{
    int32_t channelIndex = 0;
    for( channelIndex = 0 ; channelIndex < 1 ; channelIndex++ )
    {
        // copy&paste of img_bitshift()
        CAMERA_OPAQUE_t *pstDevInfo = camerainfo->channelDevice;
        uint16_t *src_img = (uint16_t *) pstDevInfo->somefield.pVirt;
        uint8_t *dst_img = (uint8_t *) pstDevInfo->somefield.pVirt;
        // NOTE:-----------------------------------------
        // Here, we used constant instead of variable!
        // ----------------------------------------------
        uint16_t img_width = 12800;
        uint16_t img_height = 8000;
        uint16_t bitshift = 8;
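
For anyone who wants to reproduce the comparison outside the real project, here is a minimal self-contained sketch of the two variants. It is only an approximation under several assumptions: the names shift_runtime, shift_constant, and variants_agree are hypothetical, a plain buffer stands in for the pVirt field, the cut-off loop of code 2 is assumed to be identical to the loop in code 1, and the image size is scaled down so it runs quickly.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Scaled-down stand-ins for the 12800 x 8000 image (assumption);
   override with -DIMG_WIDTH=12800 -DIMG_HEIGHT=8000 when timing. */
#ifndef IMG_WIDTH
#define IMG_WIDTH 128
#endif
#ifndef IMG_HEIGHT
#define IMG_HEIGHT 80
#endif
#define BITSHIFT 8

/* Like code 1: width, height and shift are run-time arguments.
   buf is read as 16-bit pixels and overwritten in place with
   8-bit pixels, because source and destination share one buffer. */
void shift_runtime(void *buf, int16_t img_width, int16_t img_height,
                   int16_t bitshift)
{
    uint16_t *src_img = (uint16_t *) buf;
    uint8_t *dst_img = (uint8_t *) buf;
    for (int i = 0; i < img_height; i++)
    {
        for (int j = 0; j < img_width; j++)
        {
            uint16_t pixel = src_img[i*img_width + j];
            dst_img[i*img_width + j] = pixel >> bitshift;
        }
    }
}

/* Like code 2: the same loop (assumed), but width, height and
   shift are compile-time constants. */
void shift_constant(void *buf)
{
    uint16_t *src_img = (uint16_t *) buf;
    uint8_t *dst_img = (uint8_t *) buf;
    uint16_t img_width = IMG_WIDTH;
    uint16_t img_height = IMG_HEIGHT;
    uint16_t bitshift = BITSHIFT;
    for (int i = 0; i < img_height; i++)
    {
        for (int j = 0; j < img_width; j++)
        {
            uint16_t pixel = src_img[i*img_width + j];
            dst_img[i*img_width + j] = pixel >> bitshift;
        }
    }
}

/* Runs both variants on identical input and returns 1 if the
   8-bit outputs match, 0 otherwise. */
int variants_agree(void)
{
    const size_t pixels = (size_t) IMG_WIDTH * IMG_HEIGHT;
    uint16_t *a = malloc(pixels * sizeof *a);
    uint16_t *b = malloc(pixels * sizeof *b);
    assert(a != NULL && b != NULL);
    for (size_t k = 0; k < pixels; k++)
    {
        a[k] = b[k] = (uint16_t) (k * 257u + 13u);
    }
    shift_runtime(a, IMG_WIDTH, IMG_HEIGHT, BITSHIFT);
    shift_constant(b);
    int same = memcmp(a, b, pixels) == 0; /* one output byte per pixel */
    free(a);
    free(b);
    return same;
}
```

To time the two functions, each call can be wrapped with clock_gettime(CLOCK_MONOTONIC, ...). If both functions and the caller are in one source file, GCC at -O3 may propagate the constant arguments into shift_runtime, so it is probably safer to put shift_runtime in its own source file when measuring.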

Thanks in advance.
