Regarding mismatched memory attributes and cacheability

As described in the ARM ARM (ARMv7), a physical region is accessed with mismatched memory attributes when any or all of the memory type, shareability, or cacheability of its aliases differ.

My question is specific to the case where only the cacheability differs across aliases. For example, consider a physical page @0x80000000 mapped @0xE0000000 and @0xF0000000 with:

1) mapping @0xE0000000 is normal memory, inner (L1) + outer (L2) cacheable

2) mapping @0xF0000000 is normal memory, inner (L1) cacheable only

Now, if there are two execution threads in the system (possibly even across the public and secure modes) using the above virtual addresses to access and share the same physical region, but taking care to do an L2 clean/invalidate before using 0xF0000000 to access that region, do you see this falling into UNPREDICTABLE operation due to mismatched attributes?
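
For concreteness, the kind of descriptor encodings I have in mind would look roughly like the sketch below, assuming the short-descriptor format with TEX remapping disabled and write-back write-allocate policies; the macro names are made up for illustration and the AP/domain/shareability fields are omitted, so please check the exact values against the ARM ARM rather than taking them as given:

    #include <stdint.h>

    /* ARMv7 short-descriptor section attributes (TEX remap disabled).
     * With TEX[2] = 1: TEX[1:0] selects the outer policy and {C,B} the inner
     * policy, where 00 = non-cacheable, 01 = write-back write-allocate,
     * 10 = write-through, 11 = write-back no write-allocate.                */
    #define SECTION     0x2u                    /* descriptor[1:0] = 0b10    */
    #define TEX(x)      ((uint32_t)(x) << 12)   /* TEX[2:0] at bits 14:12    */
    #define C_BIT       (1u << 3)
    #define B_BIT       (1u << 2)

    /* Alias @0xE0000000: Normal, inner WB-WA + outer WB-WA                  */
    #define ATTR_INNER_OUTER_CACHEABLE  (SECTION | TEX(0x5) | B_BIT)

    /* Alias @0xF0000000: Normal, inner WB-WA, outer non-cacheable           */
    #define ATTR_INNER_CACHEABLE_ONLY   (SECTION | TEX(0x4) | B_BIT)

The only difference between the two values is TEX[1:0], i.e. the outer policy, which is exactly the single-attribute mismatch being asked about.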

Thanks.

Parents
  • Thanks for the response.

    I agree this is a mismatched-attributes case, but will it fall under unpredictable behavior? Regarding speculative fetching into the L2: is it enough if I ensure (please correct me if this is not possible, or is susceptible to races) that the latest copy of the data is present in L1, in memory, or in both at any given time?

    That is, considering your point about the L2 doing speculative fetches, take the following scenario (from the earlier example) with an inclusive cache:

    1) Thread1 (using 0xE0000000 - inner + outer cacheable) writes to memory and does L2 clean range

    2) Thread2 (using 0xF0000000 - inner cacheable only) reads and writes the same memory

    3) Thread1 does L2 invalidate range

    4) Thread1 reads the memory, and so on...
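
    For reference, roughly what I mean by these operations on a Cortex-A9 class system with an external L2C-310 (PL310) controller is sketched below. The L2C_BASE value and helper names are placeholders for illustration only, not my actual platform code, and this only shows the operations themselves, not whether the sequence is safe:

        #include <stdint.h>

        #define LINE            32u          /* Cortex-A9 L1 / PL310 line size */
        #define L2C_BASE        0xF8F02000u  /* placeholder PL310 base address */
        #define L2C_REG(off)    (*(volatile uint32_t *)(L2C_BASE + (off)))
        #define L2C_SYNC        0x730        /* Cache Sync                     */
        #define L2C_INV_PA      0x770        /* Invalidate Line by PA          */
        #define L2C_CLEAN_PA    0x7B0        /* Clean Line by PA               */

        static inline void dsb(void)
        {
            __asm__ volatile("dsb" ::: "memory");
        }

        /* Inner (L1) maintenance by virtual address, to the Point of Coherency. */
        static inline void l1_clean_line(void *va)    /* DCCMVAC */
        {
            __asm__ volatile("mcr p15, 0, %0, c7, c10, 1" :: "r"(va) : "memory");
        }

        static inline void l1_inv_line(void *va)      /* DCIMVAC */
        {
            __asm__ volatile("mcr p15, 0, %0, c7, c6, 1" :: "r"(va) : "memory");
        }

        static inline void l1_flush_line(void *va)    /* DCCIMVAC (clean + invalidate) */
        {
            __asm__ volatile("mcr p15, 0, %0, c7, c14, 1" :: "r"(va) : "memory");
        }

        /* Outer (PL310) maintenance is by *physical* address, one line at a time. */
        static void l2_clean_range(uint32_t pa, uint32_t len)
        {
            for (uint32_t a = pa & ~(LINE - 1u); a < pa + len; a += LINE)
                L2C_REG(L2C_CLEAN_PA) = a;
            L2C_REG(L2C_SYNC) = 0;           /* drain the L2 buffers */
        }

        static void l2_inv_range(uint32_t pa, uint32_t len)
        {
            for (uint32_t a = pa & ~(LINE - 1u); a < pa + len; a += LINE)
                L2C_REG(L2C_INV_PA) = a;
            L2C_REG(L2C_SYNC) = 0;
        }

    Note that the CP15 operations work on virtual addresses while the PL310 operates on physical addresses, so the physical address of the shared region is needed for the outer-cache part.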

    Do you see any issue, particularly anything unpredictable, in the above sequence? More specifically, is this data-sharing scenario something that can always be made to work by using the proper cache operations (I may have missed some steps above due to my limited understanding), even though it falls under the mismatched-attributes case?

    Thanks.

Children
  • There are a couple of ways to go about answering that.

    Let's start with the architectural answer.  The ARMv7-A Architecture Reference Manual says:

    Mismatched memory attributes

    A physical memory location is accessed with mismatched attributes if all accesses to the location do not use a common definition of all of the following attributes of that location:

    • memory type, Strongly-ordered, Device, or Normal

    • shareability

    • cacheability, for both the inner and outer levels of cache, but excluding any cache allocation hints

    The following rules apply when a physical memory location is accessed with mismatched attributes:

    ...

    (my emphasis)

    For a strict architectural answer, then: the outer cacheability attributes for your two aliases are different, therefore this rule applies.

    Then there is the micro-architectural, or practical, answer. That is: will this really be a problem on processor X in system Y, and will preventative measures Z be enough?

    What you are doing is something which is architecturally a bad idea (mismatched aliases), and then trying to mitigate for it (lots of cleans/invalidates). That begs the question: wouldn't it be easier to match the attributes instead? Don't forget that these cleans and invalidates take time to perform.

    What could go wrong? You didn't say which processor(s) you are using, so I'll have to be generic. After (2) you expect L1 to be dirty and L2 to be clean (and out of date), so in (3) you invalidate the L2. The problem is that L1 could have written the data out to L2 between (2) and (3) - and that is exactly what you have just invalidated. You could work around this, but it gets increasingly complicated.

    (NOTE: Section B2.2.2 of the Architecture Reference Manual covers the rules for caches. But broadly, cache locations can be prefetched at any time, evicted at any time, and, if dirty, written back at any time. When looking at code I work on the principle that any or all of these will happen, and at the most inconvenient time.)

    A final cautionary note: this kind of thing can be a real pain to debug if you get it wrong, because it works "most" of the time, and when it doesn't it fails in hard-to-predict, often timing-sensitive ways.

  • Sorry for not being clear about my requirement earlier. Basically, my concern is:

    Does the above scenario, where only the cacheability is mismatched, fall into UNPREDICTABLE behavior, provided software takes care of the cache maintenance correctly?

    The reasons I am seeking clarification on this are the following:

    1) I have a Cortex-A9 based system (single core) where the public and secure worlds coexist. The DDR mappings on the public side are inner+outer cacheable (Thread1 in the example), while the DDR mappings on the secure side are inner cacheable only (Thread2). There is a constraint that I cannot use the L2 on the secure side (I might get around this with TEX remapping, but I am not sure whether CP15SDISABLE plays a role in preventing that - see the sketch after this list), yet I need to share data across both worlds as efficiently as possible.

    Thus, if the above behavior does not fall into the UNPREDICTABLE case and proper cache operations can _always_ ensure data correctness, then I can still share data properly (leaving out the L2).

    2) Compare the above scenario with mismatched attributes concerning memory types - one mapping as NORMAL memory while the other is DEVICE - that would remain UNPREDICTABLE in any case, right?

    3) There is one post in the Linux community where it is mentioned (quoting from http://lwn.net/Articles/409700/):

    "multiple mappings of the same physical address region with differing cache attributes is also unpredictable - you can't guarantee whether the access will be performed using the cache attributes through the mapping you're performing the access through."

    My understanding of that statement, applied to our example, is that an access through 0xF0000000 (outer non-cacheable) may still end up being performed through the L2 (as if it had used 0xE0000000). That would indicate the behavior is UNPREDICTABLE, but it is not clear to me whether that can really be the case, and how.
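
    For completeness on the TEX remapping mentioned in point 1: whether TEX remap is currently in effect can be seen from SCTLR.TRE (bit 28). A minimal sketch, assuming privileged CP15 access:

        #include <stdint.h>
        #include <stdbool.h>

        /* Returns true if TEX remapping is enabled (SCTLR.TRE, bit 28). */
        static inline bool tex_remap_enabled(void)
        {
            uint32_t sctlr;
            __asm__ volatile("mrc p15, 0, %0, c1, c0, 0" : "=r"(sctlr));
            return (sctlr >> 28) & 1u;
        }

    This only shows whether remapping is already on; whether the Secure copies of the relevant registers can be changed at all is a separate question that depends on how CP15SDISABLE is driven, which is the constraint I referred to.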

    Thanks.

  • Continuing from the above: I have confirmed that using TEX remapping is not possible in this case, so I have to live with the aforementioned aliases, where one has L1 & L2 cacheable attributes and the other L1 only.

    Since I can't change either of the mappings, the only option is to do explicit flushes on both threads, without caring about performance:

    1) Thread1 (using 0xE0000000 - inner + outer cacheable) writes to memory and does L1 flush, L2 clean range

    2) Thread2 (using 0xF0000000 - inner cacheable only) reads and writes the same memory and does an L1 flush to the PoC

    3) Thread1 does L2 invalidate range

    4) Thread1 reads the memory, and so on...

    Will the above sequence (sketched in code below) be safe in this case, given that both threads perform these operations exclusively?

    Thanks.
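
    Spelling that sequence out in terms of operations, reusing the placeholder helpers from my earlier sketch (l1_flush_line/l1_inv_line by VA, l2_clean_range/l2_inv_range by PA, dsb, LINE) and treating "flush" as clean+invalidate; SHARED_VA1/SHARED_VA2/SHARED_PA/SIZE are illustrative values from the example, and this only spells out the steps as listed, it is not a claim that the sequence is architecturally safe:

        /* Uses the l1_/l2_ helpers, dsb() and LINE from the earlier sketch. */
        #define SHARED_VA1  0xE0000000u   /* inner + outer cacheable alias   */
        #define SHARED_VA2  0xF0000000u   /* inner cacheable only alias      */
        #define SHARED_PA   0x80000000u   /* physical address of the region  */
        #define SIZE        4096u         /* example size of the shared page */

        /* 1) Thread1, via 0xE0000000 (inner + outer cacheable):             */
        static void thread1_publish(void)
        {
            /* ... writes data through SHARED_VA1 ... */
            for (uint32_t off = 0; off < SIZE; off += LINE)
                l1_flush_line((void *)(SHARED_VA1 + off));  /* L1 flush      */
            dsb();
            l2_clean_range(SHARED_PA, SIZE);                /* L2 clean range */
        }

        /* 2) Thread2, via 0xF0000000 (inner cacheable only):                */
        static void thread2_update(void)
        {
            /* ... reads and writes data through SHARED_VA2 ... */
            for (uint32_t off = 0; off < SIZE; off += LINE)
                l1_flush_line((void *)(SHARED_VA2 + off));  /* L1 flush to PoC */
            dsb();
        }

        /* 3) and 4) Thread1 again:                                          */
        static void thread1_reacquire(void)
        {
            l2_inv_range(SHARED_PA, SIZE);                  /* L2 invalidate range */
            dsb();
            /* ... reads data through SHARED_VA1 ... */
        }

    Even with the operations placed exactly as listed, the earlier caution still applies: lines for this physical region can be allocated, evicted or written back at any time through either alias, so this does not remove the mismatched-attributes concern.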

  • Thanks for the clarifications. I understand now that this kind of mapping will lead to UNPREDICTABLE behavior at some point.