Flash error correction on XC167CI

Hello! I am trying to work out whether it is necessary to run a CRC-8 check on the XC167. I have read about the ECC (Error Correction Code) that is used for Flash accesses. As I understand it, every 64 bits of Flash data have an associated 8-bit ECC, which is generated automatically with a Hamming code when the Flash is programmed. Is that right? The ECC corrects only single-bit errors. When a double-bit error occurs, it triggers an Access Fault trap. The user's manual recommends reprogramming (refreshing) a wordline that has a single-bit error to prevent a possible double-bit error. I think this is not suitable in a real-time application while the unit is in operation.

I have the following questions:

1. Is the ECC sufficient to ensure normal operation? Is a CRC needed when the ECC is already working?

2. How do I identify a double-bit error and the Access Fault trap, and then process it to indicate a fault condition? (Continuously check FSR, or something else?) Maybe there are code examples of error correction from real applications in systems with a very long service life.

3. Are there common guidelines for checking that the XC167CI is functioning properly?

Thanks for any advice!

  • Read section 3.9.3 of the system user's manual…

    Data integrity is supported by the Error Correction Code (ECC). This ECC is dynamically (by the internal flash state machine) generated during Flash write operations and stored in the Flash array together with the corresponding data. For each read access the associated 8-bit ECC is fetched together with the 64-bit read data and is evaluated.

    Single-bit errors are detected and automatically corrected on-the-fly (during run-time). Therefore, single bit errors do not affect system operation.

    Double-bit errors are detected and trigger an Access Fault trap. This prevents erroneous instructions or data from being used. You can erase the wordline (256 bytes) and reprogram the pages. This is entirely feasible in a real-time system, provided you are able to suspend normal operation until the update is finished.

    Generally, you would still check the memory periodically, albeit at different margin levels, so that you can detect a "weak bit" and correct it before it develops into a double-bit failure.

    If you get a double-bit error you will get a Class B trap "Program Memory Access Error".

    All of this information is in the manual but implementation is left to the user since the user has to decide on the behavior of the system.
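To illustrate the behaviour described in this reply (single-bit errors corrected, double-bit errors only detected), here is a minimal sketch of a textbook extended Hamming (72,64) code in C. It is an assumption for illustration only: the thread does not say which code the XC167 Flash hardware actually uses, and the names `ecc_encode`, `ecc_decode` and `ecc_status_t` are hypothetical. On the real device this work is done by the Flash module, not by software.

```c
#include <stdint.h>

/* Hypothetical result codes for this sketch. */
typedef enum { ECC_OK, ECC_CORRECTED, ECC_UNCORRECTABLE } ecc_status_t;

/* True for 1, 2, 4, 8, ... (also true for 0; callers handle 0 separately). */
static int is_pow2(unsigned v) { return (v & (v - 1u)) == 0u; }

/* Parity (XOR of all bits) of a 64-bit value. */
static uint8_t parity64(uint64_t v)
{
    v ^= v >> 32; v ^= v >> 16; v ^= v >> 8;
    v ^= v >> 4;  v ^= v >> 2;  v ^= v >> 1;
    return (uint8_t)(v & 1u);
}

/* Data bits occupy code-word positions 3, 5, 6, 7, 9, ... 71 (every position
 * from 1 to 71 that is not a power of two). The result is the XOR of the
 * positions of all data bits that are set. */
static uint8_t data_syndrome(uint64_t data)
{
    uint8_t s = 0;
    unsigned pos = 1;
    for (unsigned i = 0; i < 64; ++i) {
        while (is_pow2(pos)) ++pos;          /* skip check-bit positions */
        if ((data >> i) & 1u) s ^= (uint8_t)pos;
        ++pos;
    }
    return s;
}

/* Build the 8-bit ECC: 7 Hamming check bits plus 1 overall parity bit. */
uint8_t ecc_encode(uint64_t data)
{
    uint8_t check = data_syndrome(data);
    uint8_t overall = (uint8_t)(parity64(data) ^ parity64(check));
    return (uint8_t)(check | (uint8_t)(overall << 7));
}

/* Check a 64-bit word against its ECC and repair a single flipped data bit. */
ecc_status_t ecc_decode(uint64_t *data, uint8_t ecc)
{
    uint8_t check = (uint8_t)(ecc & 0x7Fu);
    uint8_t syndrome = (uint8_t)(data_syndrome(*data) ^ check);
    uint8_t parity_err =
        (uint8_t)(parity64(*data) ^ parity64(check) ^ (uint8_t)(ecc >> 7));

    if (syndrome == 0 && parity_err == 0) return ECC_OK;
    if (parity_err == 0) return ECC_UNCORRECTABLE;   /* two bits flipped */
    if (syndrome == 0 || is_pow2(syndrome))
        return ECC_CORRECTED;                /* flip was in the ECC byte */

    unsigned pos = 1;
    for (unsigned i = 0; i < 64; ++i) {      /* map position to data bit */
        while (is_pow2(pos)) ++pos;
        if (pos == syndrome) {
            *data ^= (1ull << i);
            return ECC_CORRECTED;
        }
        ++pos;
    }
    return ECC_UNCORRECTABLE;                /* syndrome outside code word */
}
```

In this sketch, `ECC_UNCORRECTABLE` corresponds to the case where the hardware would raise the Access Fault trap instead of delivering data.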

  • I have read the user's manual, and I understand which correction features the XC167 supports. My question was mostly about the software implementation, because I lack experience with embedded controllers. How do I handle traps, and what should I do if a double-bit error occurs? In hard real time I have no time to stop the algorithm and save/reprogram a wordline. Is a CRC needed if there is ECC? I thought these questions were common. Thanks!
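On the hard real-time concern: the periodic memory check suggested earlier in the thread does not have to run in one go. The following is a minimal sketch, assuming a CRC-8 with polynomial 0x07 and initial value 0x00, that checks a memory region a few bytes at a time from a background task. The names `flash_check_t`, `flash_check_init` and `flash_check_step` are hypothetical, and the expected CRC is assumed to be computed at build time and stored with the image. Selecting Flash read margin levels is device-specific and is not shown.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-8, polynomial x^8 + x^2 + x + 1 (0x07), no reflection.
 * Pass the previous result as `crc` to continue a running calculation. */
uint8_t crc8_update(uint8_t crc, const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; ++i) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; ++bit) {
            crc = (crc & 0x80u) ? (uint8_t)((uint8_t)(crc << 1) ^ 0x07u)
                                : (uint8_t)(crc << 1);
        }
    }
    return crc;
}

typedef enum { CHECK_IN_PROGRESS, CHECK_PASSED, CHECK_FAILED } check_result_t;

/* State of one background pass over a memory region (hypothetical type). */
typedef struct {
    const uint8_t *base;     /* start of the region to verify           */
    size_t         length;   /* region size in bytes                    */
    size_t         offset;   /* next byte to process                    */
    uint8_t        crc;      /* running CRC of bytes processed so far   */
    uint8_t        expected; /* reference CRC, assumed stored at build  */
} flash_check_t;

void flash_check_init(flash_check_t *c, const uint8_t *base,
                      size_t length, uint8_t expected)
{
    c->base = base;
    c->length = length;
    c->offset = 0;
    c->crc = 0;
    c->expected = expected;
}

/* Process at most `slice` bytes, then return. Call this from idle time or a
 * low-priority task so each call has a small, bounded execution time. */
check_result_t flash_check_step(flash_check_t *c, size_t slice)
{
    size_t remaining = c->length - c->offset;
    size_t n = (remaining < slice) ? remaining : slice;

    c->crc = crc8_update(c->crc, c->base + c->offset, n);
    c->offset += n;
    if (c->offset < c->length) return CHECK_IN_PROGRESS;

    check_result_t result =
        (c->crc == c->expected) ? CHECK_PASSED : CHECK_FAILED;
    c->offset = 0;           /* restart so the next call begins a new pass */
    c->crc = 0;
    return result;
}
```

A pass over the whole region then costs many short calls instead of one long one, and the slice size can be tuned to the time budget available. A longer CRC (16 or 32 bits) gives better detection over large regions; CRC-8 is used here only because it is the one mentioned in the question.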

  • What you do in an embedded system when you get a trap depends very much on the nature of your system. For a safety-critical system you would usually have some failsafe halted state, and you should enter that state. If the processor has a user interface, or is part of a network, it should report the occurrence of the trap, because a system with flash double-bit errors will need to be replaced.
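The approach in the last reply could be sketched roughly as follows. This is hardware-independent C under stated assumptions: `TRAP_FLAG_PROG_MEM_ACCESS` is a placeholder mask whose real bit position must be taken from the trap flag register description in the user's manual, and `fault_monitor_t`, `fault_on_class_b_trap` and `fault_report_pending` are hypothetical names. The actual Class B trap routine (declared with the compiler's own interrupt syntax) would read and clear the trap flags and then call this function.

```c
#include <stdint.h>

/* ASSUMPTION: placeholder mask for the program memory access error flag.
 * Verify the real bit position against the user's manual. */
#define TRAP_FLAG_PROG_MEM_ACCESS  (1u << 4)

typedef enum { SYS_RUNNING, SYS_FAILSAFE } sys_state_t;

/* Fault bookkeeping shared between the trap routine and the main loop. */
typedef struct {
    sys_state_t state;        /* current operating state                 */
    uint16_t    fault_flags;  /* latched trap flags, kept for reporting  */
    uint16_t    fault_count;  /* number of Class B traps seen, saturating */
} fault_monitor_t;

/* Called from the Class B trap routine with the trap flags it has read.
 * `drive_outputs_safe` is an optional application hook (may be a null
 * pointer) that puts actuators and outputs into their safe condition. */
void fault_on_class_b_trap(fault_monitor_t *m, uint16_t trap_flags,
                           void (*drive_outputs_safe)(void))
{
    m->fault_flags |= trap_flags;
    if (m->fault_count < UINT16_MAX) m->fault_count++;

    if (trap_flags & TRAP_FLAG_PROG_MEM_ACCESS) {
        /* Code or data in Flash can no longer be trusted: stop normal
         * operation and stay in the failsafe state. */
        if (drive_outputs_safe) drive_outputs_safe();
        m->state = SYS_FAILSAFE;
    }
}

/* Polled by the user interface or network task to report the fault. */
int fault_report_pending(const fault_monitor_t *m)
{
    return m->fault_flags != 0;
}
```

In the failsafe state the application would typically not return to the code that faulted. Whether it halts, waits for a watchdog reset, or keeps a minimal communication task alive to report the fault is a system-level decision that this sketch leaves open.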