On ARM7 and ARM9, the PC reads as the current instruction's address + 8. On the Cortex-A7 (8-stage pipeline) the PC has the same value (PC = current + 8). How does this work?
I think the Cortex-A7 has an 8-stage pipeline, yet the PC value is still current + 8 (for backward compatibility with the old design). How does that work?
It's probably best to see all core registers as "virtual registers". What happens in the core and its pipeline is not the same as what you see in your asm code. The two may have matched a long time ago when cores were simple - and that is where the PC+8 thing comes from. Since then ARM has kept it for compatibility and you just have to live with it. Don't try to make too much sense of it on a Cortex core.
In the 64-bit ARMv8 architecture they've got rid of this bit of history, so the PC just points to the current instruction, as a human would think of it when going through the instructions one at a time. In an out-of-order processor you could have lots of instructions executing at the same time, and for each of them the PC is just the PC for that instruction. I hate to think how on earth we'd cope if we had to deal with a single 'hardware' PC. I second Axel's advice: don't try to make too much sense of PC+8 in ARMv7, and just accept that's what you get.
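To make the "PC for that instruction" idea concrete, here is a minimal Python sketch of the value an instruction sees when it reads the PC. The function name `pc_as_read` and the state labels are made up for illustration. The Thumb entry is not discussed above; it is included on the understanding that Thumb state reads the PC as current + 4.

```python
# Offset added to an instruction's own address when that instruction reads
# the PC. The dictionary keys are illustrative labels, not official names.
PC_READ_OFFSET = {
    "arm": 8,      # 32-bit ARM state: current + 8 (the historical behaviour)
    "thumb": 4,    # 32-bit Thumb state: current + 4
    "aarch64": 0,  # 64-bit ARMv8: the address of the instruction itself
}


def pc_as_read(instr_addr, state):
    """Return the PC value visible to the instruction located at instr_addr."""
    return instr_addr + PC_READ_OFFSET[state]


if __name__ == "__main__":
    for state in ("arm", "thumb", "aarch64"):
        print(state, hex(pc_as_read(0x8000, state)))
```

The point of the sketch is that the value depends only on the instruction's own address and the instruction set state, not on how deep the pipeline is, which is why an 8-stage Cortex-A7 can report the same current + 8 as a 3-stage ARM7.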
The original ARM design had a 3-stage pipeline (fetch-decode-execute). To simplify the design they chose to have the PC read as the value currently on the instruction fetch address lines, rather than the address of the currently executing instruction, which was fetched 2 cycles earlier. With 4-byte instructions, those 2 instructions are where the +8 comes from. Since most PC-relative addresses are calculated at link time, it's easier to have the assembler/linker compensate for that 2-instruction offset than to design all the logic to 'correct' the PC register.
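The two ideas in this answer can be sketched in a few lines of Python. This is a toy model of the 3-stage pipeline as described here, not a description of any real core's microarchitecture, and the names `three_stage_pipeline` and `branch_imm24` are invented for the example. The second function shows the kind of compensation an assembler or linker applies when it encodes an ARM-state `B` instruction: the 24-bit word offset is measured from the PC as read (instruction address + 8), not from the instruction itself.

```python
INSTR_SIZE = 4  # bytes per ARM-state instruction


def three_stage_pipeline(start_addr, cycles):
    """Return one (fetch, decode, execute) address tuple per cycle.

    None marks a stage that is still empty while the pipeline fills.
    """
    trace = []
    fetch, decode, execute = start_addr, None, None
    for _ in range(cycles):
        trace.append((fetch, decode, execute))
        # Each cycle every instruction moves one stage along and the
        # fetch address advances by one instruction.
        fetch, decode, execute = fetch + INSTR_SIZE, fetch, decode
    return trace


def branch_imm24(instr_addr, target_addr):
    """Compute the 24-bit offset field for an ARM-state B at instr_addr."""
    # The offset is relative to the PC as read: instr_addr + 8.
    offset = target_addr - (instr_addr + 2 * INSTR_SIZE)
    if offset % INSTR_SIZE != 0:
        raise ValueError("branch target must be word aligned")
    word_offset = offset >> 2
    if not -(1 << 23) <= word_offset < (1 << 23):
        raise ValueError("branch target out of range")
    return word_offset & 0xFFFFFF  # two's complement, 24 bits


if __name__ == "__main__":
    fetch, decode, execute = three_stage_pipeline(0x8000, 3)[-1]
    print(hex(fetch), hex(execute), fetch - execute)
    print(hex(branch_imm24(0x8000, 0x8000)))
```

Once the toy pipeline is full, the fetch address is always 8 bytes ahead of the executing instruction. A branch to itself comes out as 0xFFFFFE, i.e. -2 words, which matches the familiar 0xEAFFFFFE encoding of `b .`.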
Of course, that's all firmly on the "things that made sense 30 years ago" pile. Now imagine what it takes to keep a meaningful value in that register on today's 15+ stage, multiple-issue, out-of-order pipelines, and you might appreciate why it's hard to find a CPU designer these days who thinks exposing the PC as a register is a good idea.