Cache and store buffer maintenance in Cortex-A8!

Hamed over 11 years ago

Dear All,

Technical data sheets for the ARM7500FE and ARM7100 say that:

"In the ARM Processor the cache will be searched regardless of the state of the C bit, only reads that miss the cache will be affected."

Now the question is whether this also holds for the Cortex-A8 processor family.

The other question is: when switching ARM domains, is the store buffer drained automatically, or do we have to use barriers for this?

Many thanks in advance.

Matt Sealey over 11 years ago in reply to Hamed

Hi Hamed,
When you say switch, I'll take that to mean that you are changing the processor mode in the CPSR (either manually, or by returning from an exception in an approved manner). In this case, the mode switch is a context synchronization event, and it ensures that any state changes you have made take effect upon entry to the new mode. This does imply an instruction pipeline flush (i.e. the same effect as an explicit 'isb' barrier instruction), but I could not find any architectural guarantee that the store buffer would be drained, nor anything specific to the Cortex-A8. Draining it is not usually something you need to do before switching processes in an operating system: it does not really matter whether the writes for one application are still in the store buffer when you switch to another, and where it does matter, explicit synchronization primitives would usually be in use at the application level anyway. If you have a concern, then you should manually insert a 'dsb' barrier (see the note in section A3.8.3/A3-153 on Memory Barriers in the ARMv7-A/R Architecture Reference Manual).
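As a rough illustration of inserting the barrier by hand, here is a minimal sketch in C using GCC-style inline assembly. The helper names (data_sync_barrier, instr_sync_barrier, store_and_drain) are made up for this example and do not come from the ARM ARM or the Cortex-A8 TRM. The non-ARM fallbacks are only an assumption so that the sketch also compiles on a development host; they say nothing about Cortex-A8 behaviour.

```c
#include <stdint.h>

/* Hypothetical helper names, used for illustration only. */

/* Data Synchronization Barrier: does not complete until earlier explicit
 * memory accesses, including buffered stores, have completed. */
static inline void data_sync_barrier(void)
{
#if defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
    __asm__ volatile("dsb sy" ::: "memory");
#else
    /* Assumption: on other hosts we only need something that compiles,
     * so use the compiler's full memory barrier builtin. */
    __sync_synchronize();
#endif
}

/* Instruction Synchronization Barrier: flushes the pipeline so that
 * later instructions are fetched after earlier state changes. */
static inline void instr_sync_barrier(void)
{
#if defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
    __asm__ volatile("isb sy" ::: "memory");
#else
    __asm__ volatile("" ::: "memory"); /* compiler-only barrier off-target */
#endif
}

/* Write a value, then wait for the write to complete before the caller
 * goes on to change mode or switch context. */
static inline void store_and_drain(volatile uint32_t *addr, uint32_t value)
{
    *addr = value;
    data_sync_barrier();
}
```

You would call store_and_drain() (or just data_sync_barrier()) immediately before the mode change, and only in the places where it actually matters that the earlier writes are visible.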
In the statement you quoted, the C bit refers to the C bit in the System Control Register (SCTLR). This is the global enable for cacheability within an ARM core, although its full effects are somewhat implementation defined. In practice the implementation defined effects matter little, as it is really not very common to modify this bit at runtime. Usually you turn it on and leave it on until you need to power down the cores, at which time preventing allocation into the cache is important so that you have a static set of data to clean and invalidate out to the next level of cache, and the next, and so on until you get to the memory subsystem. There is a really good discussion of this in section B2.2.3 (Cache enabling and disabling) of the ARMv7-A/R ARM, with these pertinent points:

In ARMv7:

• SCTLR.C enables or disables all data and unified caches for data accesses, across all levels of cache visible to the processor. It is IMPLEMENTATION DEFINED whether it also enables or disables the use of unified caches for instruction accesses.

..

When a cache is disabled for a type of access, for a particular translation regime:

• it is IMPLEMENTATION DEFINED whether a cache hit occurs if a location that is held in the cache is accessed by that type of access.

• any location that is not held in the cache is not brought into the cache as a result of a memory access of that type.
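
To make the SCTLR.C bit a little more concrete, here is a small sketch in C. The bit positions (M = bit 0, C = bit 2, I = bit 12) are those of the ARMv7-A SCTLR; the function and macro names are invented for this example. The CP15 read is privileged-only and is compiled only for ARMv7-A targets, so treat it as an assumption about your build environment rather than something you can run from user space.

```c
#include <stdint.h>

/* Bit positions in the ARMv7-A System Control Register (SCTLR). */
#define SCTLR_M (1u << 0)   /* MMU enable */
#define SCTLR_C (1u << 2)   /* data and unified cache enable */
#define SCTLR_I (1u << 12)  /* instruction cache enable */

/* Returns nonzero if the given SCTLR value has data caching enabled. */
static inline int sctlr_data_cache_enabled(uint32_t sctlr)
{
    return (sctlr & SCTLR_C) != 0;
}

/* Returns a copy of the SCTLR value with the C bit set or cleared.
 * Writing the result back to the register is left to the caller. */
static inline uint32_t sctlr_with_data_cache(uint32_t sctlr, int enable)
{
    return enable ? (sctlr | SCTLR_C) : (sctlr & ~SCTLR_C);
}

#if defined(__ARM_ARCH_7A__)
/* Privileged-only read of SCTLR through CP15 (c1, c0, 0). */
static inline uint32_t read_sctlr(void)
{
    uint32_t value;
    __asm__ volatile("mrc p15, 0, %0, c1, c0, 0" : "=r"(value));
    return value;
}
#endif
```

Note that clearing the bit only stops new allocations; as the quoted text says, whether an access can still hit in the cache afterwards is IMPLEMENTATION DEFINED, so you still need the clean and invalidate step.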

Note that with SCTLR.C disabled, MMU table walks will also act as if they access non-cacheable memory. With regard to the C bit within the translation tables, this note (in section B3.8.2) is important:

The B (Bufferable), C (Cacheable), and TEX (Type extension) bit names are inherited from earlier versions of the architecture. These names no longer adequately describe the function of the B, C, and TEX bits.

If you are used to older versions of the architecture and older page table descriptor formats, this can be quite a challenge. In fact, depending on other settings, the meaning of these bits changes completely and they become an indirect index to a memory type defined elsewhere. They do somewhat confer cacheability of memory accesses, but they have no real physical impact on the operation of the cache itself, as the SCTLR does.
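As a sketch of what that "indirect index" looks like, assuming the setting in question is TEX remap (SCTLR.TRE = 1): the TEX[0], C and B bits of the descriptor form a 3-bit index n, and the memory type is then read from field TRn (bits [2n+1:2n]) of the Primary Region Remap Register (PRRR). The function and enum names below are invented for this example, and the PRRR value in the usage note is arbitrary, not one taken from a real system.

```c
#include <assert.h>
#include <stdint.h>

/* Memory type encodings of the PRRR.TRn fields. */
enum mem_type {
    MEM_STRONGLY_ORDERED = 0,
    MEM_DEVICE           = 1,
    MEM_NORMAL           = 2,
    MEM_RESERVED         = 3
};

/* With TEX remap enabled, {TEX[0], C, B} select one of eight entries. */
static inline unsigned tex_remap_index(unsigned tex0, unsigned c, unsigned b)
{
    return ((tex0 & 1u) << 2) | ((c & 1u) << 1) | (b & 1u);
}

/* Extract the memory type for entry n from a PRRR value. */
static inline enum mem_type prrr_memory_type(uint32_t prrr, unsigned n)
{
    assert(n < 8u);
    return (enum mem_type)((prrr >> (2u * n)) & 3u);
}
```

For example, with a PRRR value of 0x24, a descriptor with TEX[0]=0, C=1, B=0 selects entry 2, which that value marks as Normal memory. The cacheability of that Normal memory comes from yet another register (NMRR), which is why the C bit in the descriptor no longer simply means "cacheable".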
I'd suggest you have a quick read through at least section D15 of the ARMv7-A/R ARM, which describes the differences between ARMv4/5 and ARMv7. Section D12 (differences from ARMv6 to ARMv7) is also useful, but since you are coming from a much older version of the architecture, not quite as much as D15. It is arranged in such a way that you should very quickly find out there have been quite a few changes - the part which is going to be of most use is D15.3.4 (Memory model and memory ordering). Unfortunately it is quite short and only a summary - the real meat of the topic is in the first few chapters of section B (and it is very long).
For your information, you can find the ARM ARM here:
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0406-/index.html
And the latest Cortex-A8 TRM here:
http://infocenter.arm.com/help/topic/com.arm.doc.ddi0344-/index.html
If you have any questions as you read through, don't hesitate to ask - although if something is not mentioned in either of those documents, it is usually irrelevant or not worth knowing.
Ta,
Matt

Hamed over 11 years ago in reply to Matt Sealey

Thanks a lot, Matt.