Hi Experts,
unsigned int var1_32;
unsigned int var2_32;
unsigned short int var1_16;
unsigned short int var2_16;
unsigned char var1_8;
unsigned char var2_8;
Given the above declarations, which of the following is fastest:
if(var1_32 == var2_32) { }
or
if(var1_16 == var2_16) { }
or
if(var1_8 == var2_8) { }
As a follow-on to Chris' comment about type conversions often coming for free, it's worth pointing out that compilers also know that there is no need to do type conversions for intermediate results. For example, in the following code:
extern unsigned char c[4];

unsigned char sum(void)
{
    return c[0] + c[1] + c[2] + c[3];
}
... the compiler may generate code like this for the core of sum():
sum():
    ldrb r1, [r3]
    ldrb r2, [r3, #2]
    add  r0, r0, r1
    ldrb r3, [r3, #3]
    add  r0, r0, r2
    add  r0, r0, r3
    uxtb r0, r0
The upper bits of the intermediate result in r0 contain garbage in the form of overflowed bits, but the compiler knows that this doesn't affect the bits that matter for the result. Only one truncation is needed, at the end - and that is only needed because the procedure call standard requires the spare bits to be zero when returning a value of type unsigned char.
If the function is inlined, the compiler doesn't need to follow the procedure call standard for this value and the uxtb will likely disappear.
On the whole, you should not worry about which types are "more efficient" - between them, the CPU architecture, its implementation, and the compiler will generally do a pretty good job. Good choice of algorithms and data representation, or use of appropriate pre-optimized libraries, has a much bigger impact on performance. This is the part the compiler can't do for you. Focusing on the code design also keeps your code more portable - important if you want it to perform well on both AArch32 and AArch64, for example.
It's definitely worth getting into the habit of disassembling the code coming out of the compiler - the optimisations the compiler applies (or fails to apply) can be very surprising, especially at high optimisation levels.
Dave's comments are spot on here. In general, the compiler will do a very good job with what you give it. But putting some thought into the most efficient, appropriate data types will give it a lot of help.
One other reason which occurs to me for using "small" containers is the possibility of getting much more value out of SIMD instructions. The NEON architecture (and, to a lesser extent, the v6 SIMD extensions) is capable of handling a number of individual data items packed into wide vector registers. The smaller the items are, the more of them you can fit into a vector. This can pay huge dividends if used correctly.
Chris