We are running a survey to help us improve the experience for all of our members. If you see the survey appear, please take the time to tell us about your experience if you can.
CMSIS Driver UART Revision: V1.02
file C:\Keil\ARM\Pack\Keil\STM32F4xx_DFP\1.0.8\RTE_Driver\UART_STM32F4xx
function UART_ReadData
static int32_t UART_ReadData (uint8_t *data, uint32_t size, UART_RESOURCES *ptr_uart) { volatile uint32_t val; osMutexWait(ptr_uart->info->mutex_read, osWaitForever); if ((size == 0) || (ptr_uart->info->rx.empty == true)) { osMutexRelease(ptr_uart->info->mutex_read); return 0; } val = (ptr_uart->info->rx.idx_in - ptr_uart->info->rx.idx_out) & (ptr_uart->rx_buffer_size - 1); if (val == 0) val = ptr_uart->rx_buffer_size; if (size > val) size = val; if ((ptr_uart->hw_flow == true) && (ptr_uart->manual_rts_cts == false)) { if (ptr_uart->flow_ctrl_threshold < val) GPIO_PinWrite(ptr_uart->pin_rts.port, ptr_uart->pin_rts.num, 0); } if (size > (ptr_uart->rx_buffer_size - ptr_uart->info->rx.idx_out)) { memcpy(data, ptr_uart->rx_buffer + ptr_uart->info->rx.idx_out, val); memcpy(data + val, ptr_uart->rx_buffer, size - val); if (ptr_uart->info->rx.idx_in == (size - val)) ptr_uart->info->rx.empty = true; ptr_uart->info->rx.idx_out = size - val; } else { memcpy(data, ptr_uart->rx_buffer + ptr_uart->info->rx.idx_out, size); val = (ptr_uart->info->rx.idx_out + size) & (ptr_uart->rx_buffer_size - 1); if (ptr_uart->info->rx.idx_in == val) ptr_uart->info->rx.empty = true; ptr_uart->info->rx.idx_out = val; }
Guess what will be if variable size is less than val?
It's called production-ready code.
Someone else have already set up the test bench and walked through all corner cases.
Or maybe not...
But the code is at least well documented with source code comments.
And I like a UART driver that not only uses a ring buffer with individual rx and tx pointers. But also adds a separate flag "empty". All to make sure that it suddenly needs to use a mutex for synchronization. And here I thought people used ring buffers because it separated the ownership of the rx and tx pointers, allowing the producer and consumer to access the ring buffer without any locking.
I know that I don't know how to fix the mentioned problem properly, so I would like to learn more, if possible.
if ( (size - val) > 0 ) memcpy(data + val, ptr_uart->rx_buffer, size - val);
I guess I can fix it by the above if statement. But is this a correct way? Should I also consider the branch prediction overhead?
To reproduce this issue:
As an example, UART_BAUDRATE = 921600; default ptr_uart->rx_buffer_size = 0x80 (128) HSE = 25MHz. Internall PLL is Off (system_clock is also 25MHz).
Send any bytes continuously to stm32f4 UART from outside.
On stm32f4 side:
#define BUF_SIZE 256 uint8_t buf[BUF_SIZE]; uint8_t num_read; uint8_t num = 10; while(1) { osDelay(1000); num_read = sect_uart.ReadData(buf, num); }
In this particular example, the condition
(size > (ptr_uart->rx_buffer_size - ptr_uart->info->rx.idx_out) && (size < val)
will rise in (rx_buffer_size / num) = 128/10 = 12 seconds after start of that code. This issue will happen only if ((rx_buffer_size % num) != 0). That means, hard_fault won't happen if num = 1,2,4,.. (in case rx_buffer_size = 128).
This can be fixed as you(John) suggested. But you also should check first memcpy for correctness:
memcpy(data, ptr_uart->rx_buffer + ptr_uart->info->rx.idx_out, val);
I start a separate thread for different purpose.
if statement and Branch Penalty http://www.keil.com/forum/58512/
The first memcpy is not correct also, because it over reads the input buffer. So first I've added the following line before the first memcpy:
//number of characters before the end of the input buffer val=ptr_uart->rx_buffer_size-ptr_uart->info->rx.idx_out;
After I've added this line, then I checked the following logic, and everything seems to be correct. So finally the code, which is works fine for me:
if (size > (ptr_uart->rx_buffer_size - ptr_uart->info->rx.idx_out)) { val=ptr_uart->rx_buffer_size-ptr_uart->info->rx.idx_out; memcpy(data, ptr_uart->rx_buffer + ptr_uart->info->rx.idx_out, val); memcpy(data + val, ptr_uart->rx_buffer, size - val); if (ptr_uart->info->rx.idx_in == (size - val)){ ptr_uart->info->rx.empty = true; } ptr_uart->info->rx.idx_out = size - val; }
When I tested the original version, that causes hard fault. The same test works fine with this modified version. There is no hard fault, nor data losses.
I hope, that I didn't make another bug, which is not discovered yet, but if yes, please post a comment.
Attila Zombori
For the special case where the processor has 2^n wide registers with "traditional" overflow of unsigned integers and where the ring buffer has 2^n elements, it would have been way better if you had completely rewritten the code into a more classical ring-buffer solution. This CMSIS implementation is just poor code from the start.
right,it is good,also bug in other dirvers for ST like STM32F10X,STM32F20X,STM32F30X