DSB instruction for external NOR flash memory (Cortex-M7)

Good afternoon, dear colleagues!

I've started connecting an STM32H7 (Cortex-M7) MCU to an external NOR flash memory through the FMC. I'm a little confused by the write command: without a __DSB() call it does not work correctly. The program runs from the embedded flash (7 wait states).

Doesn't work:

*(__IO uint16_t *)((uint32_t)(__ADDRESS__)) = (__DATA__);

Asm:

MOVS r0,#0xF0
STRH r0,[r4,#0xAAA]
(next instruction)
...

Works correctly:

*(__IO uint16_t *)((uint32_t)(__ADDRESS__)) = (__DATA__);

__DSB();

Asm:

MOVS r0,#0xF0
STRH r0,[r4,#0xAAA]
NOP
NOP
NOP
DSB.W
NOP
NOP
NOP
NOP
(next instruction)
...
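
For reference, the working variant above could be wrapped in a small helper so that the barrier is never forgotten. This is only a sketch: the names nor_write16 and NOR_BARRIER are hypothetical and not part of any ST or CMSIS header, GCC/Clang inline assembly is assumed, and the non-ARM fallback is just a compiler barrier so that the code also builds off-target. On the real target, the CMSIS __DSB() intrinsic could be used in place of the inline assembly.

```c
#include <stdint.h>

/* Hypothetical barrier macro (assumption: GCC/Clang inline asm syntax).
 * On ARMv7 and later it issues a real DSB; elsewhere it is only a
 * compiler barrier so the sketch can be built and tried off-target. */
#if defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
  #define NOR_BARRIER() __asm volatile ("dsb sy" ::: "memory")
#else
  #define NOR_BARRIER() __asm volatile ("" ::: "memory")
#endif

/* Hypothetical helper: one 16-bit write to an FMC-mapped NOR address,
 * followed by a data synchronization barrier, as in the working case. */
static inline void nor_write16(volatile uint16_t *addr, uint16_t data)
{
    *addr = data;    /* compiles to an STRH to the given address */
    NOR_BARRIER();   /* do not continue until the store has completed */
}
```

With this, the write from the listings would look like nor_write16((volatile uint16_t *)(base + 0xAAA), 0x00F0), where base stands for whatever bank address r4 holds.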

Can anybody explain why it works this way? Why do we need to wait for all previous instructions to complete before storing new data to a new address? On the STM32F4 (Cortex-M4) it works without the DSB instruction.

How long (in CPU cycles) does an STR instruction take? The ARMv7 reference manual gives 2 N-cycles, but I don't understand what that means in CPU cycles.

I'd be glad of any help!