Linaro provides a solution for instruction trace that needs no external debugger, provided that CoreSight components are embedded in the SoC.
This article describes the related build, setup and command steps.
The test environment is Juno-busybox: Linux (none) 4.9.0-dirty #9 SMP PREEMPT Tue Mar 28 10:39:46 CST 2017 aarch64 GNU/Linux
If you download the Juno kernel code, the Coresight driver code is in the Linux kernel directory: drivers/hwtracing/coresight/. Alternatively you can download the workspace for Juno platform by following instructions described at https://community.arm.com/dev-platforms/b/documents/posts/using-linaros-deliverables-on-juno
CONFIG_CORESIGHT=y
CONFIG_CORESIGHT_LINKS_AND_SINKS=y
CONFIG_CORESIGHT_LINK_AND_SINK_TMC=y
CONFIG_CORESIGHT_SINK_TPIU=y
CONFIG_CORESIGHT_SINK_ETBV10=y
CONFIG_CORESIGHT_SOURCE_ETM4X=y
CONFIG_CORESIGHT_QCOM_REPLICATOR=y
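Before building, it can help to confirm that all of the options listed above are actually set in the kernel configuration. The following is a minimal sketch, not part of the Linaro deliverables: the function name check_coresight_config is hypothetical, and it assumes the configuration is a plain-text file (for example the .config in the kernel build directory) passed as the first argument.

```shell
#!/bin/sh
# Hypothetical helper: report which of the CoreSight options listed in this
# article are not set to "y" in a kernel config file.
# Usage (assumed path): check_coresight_config .config
# Returns 0 if every option is present, 1 otherwise.
check_coresight_config() {
    config_file=$1
    missing=0
    for opt in CONFIG_CORESIGHT \
               CONFIG_CORESIGHT_LINKS_AND_SINKS \
               CONFIG_CORESIGHT_LINK_AND_SINK_TMC \
               CONFIG_CORESIGHT_SINK_TPIU \
               CONFIG_CORESIGHT_SINK_ETBV10 \
               CONFIG_CORESIGHT_SOURCE_ETM4X \
               CONFIG_CORESIGHT_QCOM_REPLICATOR; do
        # Anchor both ends so CONFIG_CORESIGHT does not match longer names.
        if ! grep -q "^${opt}=y\$" "$config_file"; then
            echo "missing: ${opt}"
            missing=1
        fi
    done
    return "$missing"
}
```

The function prints one "missing:" line per absent option, so an empty output together with a zero exit status means the configuration matches the list above.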
/*
 * Juno TRMs specify the size for these coresight components as 64K.
 * The actual size is just 4K though 64K is reserved. Access to the
 * unmapped reserved region results in a DECERR response.
 */
etf@20010000 {
    compatible = "arm,coresight-tmc", "arm,primecell";
    reg = <0 0x20010000 0 0x1000>;

    clocks = <&soc_smc50mhz>;
    clock-names = "apb_pclk";
    power-domains = <&scpi_devpd 0>;
    ports {
        #address-cells = <1>;
        #size-cells = <0>;

        /* input port */
        port@0 {
            reg = <0>;
            etf_in_port: endpoint {
                slave-mode;
                remote-endpoint = <&main_funnel_out_port>;
            };
        };

        /* output port */
        port@1 {
            reg = <0>;
            etf_out_port: endpoint {
                remote-endpoint = <&replicator_in_port0>;
            };
        };
    };
};

tpiu@20030000 {
    compatible = "arm,coresight-tpiu", "arm,primecell";
    reg = <0 0x20030000 0 0x1000>;

    clocks = <&soc_smc50mhz>;
    clock-names = "apb_pclk";
    power-domains = <&scpi_devpd 0>;
    port {
        tpiu_in_port: endpoint {
            slave-mode;
            remote-endpoint = <&replicator_out_port0>;
        };
    };
};

main-funnel@20040000 {
    compatible = "arm,coresight-funnel", "arm,primecell";
    reg = <0 0x20040000 0 0x1000>;

    clocks = <&soc_smc50mhz>;
    clock-names = "apb_pclk";
    power-domains = <&scpi_devpd 0>;
    ports {
        #address-cells = <1>;
        #size-cells = <0>;

        port@0 {
            reg = <0>;
            main_funnel_out_port: endpoint {
                remote-endpoint = <&etf_in_port>;
            };
        };

        port@1 {
            reg = <0>;
            main_funnel_in_port0: endpoint {
                slave-mode;
                remote-endpoint = <&cluster0_funnel_out_port>;
            };
        };

        port@2 {
            reg = <1>;
            main_funnel_in_port1: endpoint {
                slave-mode;
                remote-endpoint = <&cluster1_funnel_out_port>;
            };
        };
    };
};

etr@20070000 {
    compatible = "arm,coresight-tmc", "arm,primecell";
    reg = <0 0x20070000 0 0x1000>;

    clocks = <&soc_smc50mhz>;
    clock-names = "apb_pclk";
    power-domains = <&scpi_devpd 0>;
    port {
        etr_in_port: endpoint {
            slave-mode;
            remote-endpoint = <&replicator_out_port1>;
        };
    };
};

etm0: etm@22040000 {
    compatible = "arm,coresight-etm4x", "arm,primecell";
    reg = <0 0x22040000 0 0x1000>;

    clocks = <&soc_smc50mhz>;
    clock-names = "apb_pclk";
    power-domains = <&scpi_devpd 0>;
    port {
        cluster0_etm0_out_port: endpoint {
            remote-endpoint = <&cluster0_funnel_in_port0>;
        };
    };
};

cluster0-funnel@220c0000 {
    compatible = "arm,coresight-funnel", "arm,primecell";
    reg = <0 0x220c0000 0 0x1000>;

    clocks = <&soc_smc50mhz>;
    clock-names = "apb_pclk";
    power-domains = <&scpi_devpd 0>;
    ports {
        #address-cells = <1>;
        #size-cells = <0>;

        port@0 {
            reg = <0>;
            cluster0_funnel_out_port: endpoint {
                remote-endpoint = <&main_funnel_in_port0>;
            };
        };

        port@1 {
            reg = <0>;
            cluster0_funnel_in_port0: endpoint {
                slave-mode;
                remote-endpoint = <&cluster0_etm0_out_port>;
            };
        };

        port@2 {
            reg = <1>;
            cluster0_funnel_in_port1: endpoint {
                slave-mode;
                remote-endpoint = <&cluster0_etm1_out_port>;
            };
        };
    };
};

etm1: etm@22140000 {
    compatible = "arm,coresight-etm4x", "arm,primecell";
    reg = <0 0x22140000 0 0x1000>;

    clocks = <&soc_smc50mhz>;
    clock-names = "apb_pclk";
    power-domains = <&scpi_devpd 0>;
    port {
        cluster0_etm1_out_port: endpoint {
            remote-endpoint = <&cluster0_funnel_in_port1>;
        };
    };
};

etm2: etm@23040000 {
    compatible = "arm,coresight-etm4x", "arm,primecell";
    reg = <0 0x23040000 0 0x1000>;

    clocks = <&soc_smc50mhz>;
    clock-names = "apb_pclk";
    power-domains = <&scpi_devpd 0>;
    port {
        cluster1_etm0_out_port: endpoint {
            remote-endpoint = <&cluster1_funnel_in_port0>;
        };
    };
};

cluster1-funnel@230c0000 {
    compatible = "arm,coresight-funnel", "arm,primecell";
    reg = <0 0x230c0000 0 0x1000>;

    clocks = <&soc_smc50mhz>;
    clock-names = "apb_pclk";
    power-domains = <&scpi_devpd 0>;
    ports {
        #address-cells = <1>;
        #size-cells = <0>;

        port@0 {
            reg = <0>;
            cluster1_funnel_out_port: endpoint {
                remote-endpoint = <&main_funnel_in_port1>;
            };
        };

        port@1 {
            reg = <0>;
            cluster1_funnel_in_port0: endpoint {
                slave-mode;
                remote-endpoint = <&cluster1_etm0_out_port>;
            };
        };

        port@2 {
            reg = <1>;
            cluster1_funnel_in_port1: endpoint {
                slave-mode;
                remote-endpoint = <&cluster1_etm1_out_port>;
            };
        };

        port@3 {
            reg = <2>;
            cluster1_funnel_in_port2: endpoint {
                slave-mode;
                remote-endpoint = <&cluster1_etm2_out_port>;
            };
        };

        port@4 {
            reg = <3>;
            cluster1_funnel_in_port3: endpoint {
                slave-mode;
                remote-endpoint = <&cluster1_etm3_out_port>;
            };
        };
    };
};

etm3: etm@23140000 {
    compatible = "arm,coresight-etm4x", "arm,primecell";
    reg = <0 0x23140000 0 0x1000>;

    clocks = <&soc_smc50mhz>;
    clock-names = "apb_pclk";
    power-domains = <&scpi_devpd 0>;
    port {
        cluster1_etm1_out_port: endpoint {
            remote-endpoint = <&cluster1_funnel_in_port1>;
        };
    };
};

etm4: etm@23240000 {
    compatible = "arm,coresight-etm4x", "arm,primecell";
    reg = <0 0x23240000 0 0x1000>;

    clocks = <&soc_smc50mhz>;
    clock-names = "apb_pclk";
    power-domains = <&scpi_devpd 0>;
    port {
        cluster1_etm2_out_port: endpoint {
            remote-endpoint = <&cluster1_funnel_in_port2>;
        };
    };
};

etm5: etm@23340000 {
    compatible = "arm,coresight-etm4x", "arm,primecell";
    reg = <0 0x23340000 0 0x1000>;

    clocks = <&soc_smc50mhz>;
    clock-names = "apb_pclk";
    power-domains = <&scpi_devpd 0>;
    port {
        cluster1_etm3_out_port: endpoint {
            remote-endpoint = <&cluster1_funnel_in_port3>;
        };
    };
};

coresight-replicator {
    /*
     * Non-configurable replicators don't show up on the
     * AMBA bus. As such no need to add "arm,primecell".
     */
    compatible = "arm,coresight-replicator";

    ports {
        #address-cells = <1>;
        #size-cells = <0>;

        /* replicator output ports */
        port@0 {
            reg = <0>;
            replicator_out_port0: endpoint {
                remote-endpoint = <&tpiu_in_port>;
            };
        };

        port@1 {
            reg = <1>;
            replicator_out_port1: endpoint {
                remote-endpoint = <&etr_in_port>;
            };
        };

        /* replicator input port */
        port@2 {
            reg = <0>;
            replicator_in_port0: endpoint {
                slave-mode;
                remote-endpoint = <&etf_out_port>;
            };
        };
    };
};
git clone -b master https://github.com/Linaro/OpenCSD.git
cd OpenCSD/decoder/build/linux/
make LINUX64=1 DEBUG=1
make LINUX64=1 DEBUG=0
The OpenCSD library is in the decoder/lib/linux64/dbg[rel]/ directory.
git clone -b perf-opencsd-4.8 https://github.com/Linaro/OpenCSD.git perf-opencsd-4.8
cd perf-opencsd-4.8/tools/perf
export CSTRACE_PATH=xxx/OpenCSD/decoder
/* build the perf running on the HOST */
make
/* build the perf running on the TARGET */
make ARCH=arm64 CROSS_COMPILE=gcc-linaro-4.9-2015.05-x86_64_aarch64-linux-gnu/bin/aarch64-linux-gnu-
If you want to build a static perf, specify the option "LDFLAGS=-static" in Makefile.perf.
The target perf is generated in the current directory.
aarch64-linux-gnu-gcc -static -o test main.c
#include <stdio.h>
#include <string.h>

int array[32][32];

int main()
{
    int loop=0;
    int i=0, j=0;

    //while(1)
    {
        memset(array,0,sizeof(array));
        for(i=0;i<32;i++)
        {
            for(j=0;j<32;j++)
            {
                array[i][j] = i+j;
            }
        }
    }
    return 0;
}
Disassemble the binary:
0000000000400658 <main>:
  400658:   a9be7bfd    stp x29, x30, [sp,#-32]!
  40065c:   910003fd    mov x29, sp
  400660:   b9001fbf    str wzr, [x29,#28]
  400664:   b90017bf    str wzr, [x29,#20]
  400668:   b9001bbf    str wzr, [x29,#24]
  40066c:   90000480    adrp x0, 490000 <tzfile_mtime>
  400670:   91110000    add x0, x0, #0x440
  400674:   d2820002    mov x2, #0x1000 // #4096
  400678:   52800001    mov w1, #0x0 // #0
  40067c:   940052e1    bl 415200 <__memset>
  400680:   b90017bf    str wzr, [x29,#20]
  400684:   14000016    b 4006dc <main+0x84>
  400688:   b9001bbf    str wzr, [x29,#24]
  40068c:   1400000e    b 4006c4 <main+0x6c>
  400690:   b94017a1    ldr w1, [x29,#20]
  400694:   b9401ba0    ldr w0, [x29,#24]
  400698:   0b000022    add w2, w1, w0
  40069c:   90000480    adrp x0, 490000 <tzfile_mtime>
  4006a0:   91110000    add x0, x0, #0x440
  4006a4:   b9801ba1    ldrsw x1, [x29,#24]
  4006a8:   b98017a3    ldrsw x3, [x29,#20]
  4006ac:   d37be863    lsl x3, x3, #5
  4006b0:   8b010061    add x1, x3, x1
  4006b4:   b8217802    str w2, [x0,x1,lsl #2]
  4006b8:   b9401ba0    ldr w0, [x29,#24]
  4006bc:   11000400    add w0, w0, #0x1
  4006c0:   b9001ba0    str w0, [x29,#24]
  4006c4:   b9401ba0    ldr w0, [x29,#24]
  4006c8:   71007c1f    cmp w0, #0x1f
  4006cc:   54fffe2d    b.le 400690 <main+0x38>
  4006d0:   b94017a0    ldr w0, [x29,#20]
  4006d4:   11000400    add w0, w0, #0x1
  4006d8:   b90017a0    str w0, [x29,#20]
  4006dc:   b94017a0    ldr w0, [x29,#20]
  4006e0:   71007c1f    cmp w0, #0x1f
  4006e4:   54fffd2d    b.le 400688 <main+0x30>
  4006e8:   52800000    mov w0, #0x0 // #0
  4006ec:   a8c27bfd    ldp x29, x30, [sp],#32
  4006f0:   d65f03c0    ret
  4006f4:   00000000    .inst 0x00000000 ; undefined
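The start address used in the trace filter (0x400658) is the entry of main in the listing above. As a rough sketch, it could be pulled out of the disassembly automatically instead of being read by eye. The helper name func_start is hypothetical, and it assumes the usual objdump -d layout in which a function header line looks like "0000000000400658 <main>:".

```shell
#!/bin/sh
# Hypothetical helper: print the address of a function's first instruction.
# Reads "objdump -d" output on stdin; $1 is the symbol name.
# Assumed usage: aarch64-linux-gnu-objdump -d test | func_start main
func_start() {
    awk -v sym="<$1>:" '
        $2 == sym {
            addr = $1
            sub(/^0+/, "", addr)   # drop the zero padding of the header line
            print "0x" addr
            exit
        }'
}
```

The stop address (the ret at 0x4006f0 in this listing) still has to be chosen by hand, since where tracing should end depends on what you want to capture.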
./perf record -e cs_etm/@20010000.etf/u --filter 'start 0x400658@/test, stop 0x4006f0@/test' --per-thread ./test
tar czf cs_example.tgz .debug/ perf.data
tar xzf cs_example.tgz
rm -rf ~/.debug
mv .debug ~/.debug
perf report --stdio --dump
# To display the perf.data header info, please use --header/--header-only options.
#

0x178 [0x1d8]: event: 70
.
. ... raw event: size 472 bytes
. 0000: 46 00 00 00 00 00 d8 01 03 00 00 00 00 00 00 00 F...............
. 0010: 00 00 00 00 00 00 00 00 06 00 00 00 08 00 00 00 ................
. 0020: 00 00 00 00 00 00 00 00 40 40 40 40 40 40 40 40 ........@@@@@@@@
. 0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0040: 10 00 00 00 00 00 00 00 a1 0e 00 28 00 00 00 00 ...........(....
. 0050: 03 f4 00 41 00 00 00 00 88 04 00 00 00 00 00 00 ...A............
. 0060: 00 00 00 00 00 00 00 00 cc 00 00 00 00 00 00 00 ................
. 0070: 40 40 40 40 40 40 40 40 01 00 00 00 00 00 00 00 @@@@@@@@........
. 0080: 00 00 00 00 00 00 00 00 12 00 00 00 00 00 00 00 ................
. 0090: a1 0e 00 28 00 00 00 00 00 f4 00 41 00 00 00 00 ...(.......A....
. 00a0: 88 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 00b0: cc 00 00 00 00 00 00 00 40 40 40 40 40 40 40 40 ........@@@@@@@@
. 00c0: 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 00d0: 14 00 00 00 00 00 00 00 a1 0e 00 28 00 00 00 00 ...........(....
. 00e0: 00 f4 00 41 00 00 00 00 88 04 00 00 00 00 00 00 ...A............
. 00f0: 00 00 00 00 00 00 00 00 cc 00 00 00 00 00 00 00 ................
. 0100: 40 40 40 40 40 40 40 40 03 00 00 00 00 00 00 00 @@@@@@@@........
. 0110: 00 00 00 00 00 00 00 00 16 00 00 00 00 00 00 00 ................
. 0120: a1 0e 00 28 00 00 00 00 03 f4 00 41 00 00 00 00 ...(.......A....
. 0130: 88 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0140: cc 00 00 00 00 00 00 00 40 40 40 40 40 40 40 40 ........@@@@@@@@
. 0150: 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0160: 18 00 00 00 00 00 00 00 a1 0e 00 28 00 00 00 00 ...........(....
. 0170: 03 f4 00 41 00 00 00 00 88 04 00 00 00 00 00 00 ...A............
. 0180: 00 00 00 00 00 00 00 00 cc 00 00 00 00 00 00 00 ................
. 0190: 40 40 40 40 40 40 40 40 05 00 00 00 00 00 00 00 @@@@@@@@........
. 01a0: 00 00 00 00 00 00 00 00 1a 00 00 00 00 00 00 00 ................
. 01b0: a1 0e 00 28 00 00 00 00 03 f4 00 41 00 00 00 00 ...(.......A....
. 01c0: 88 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 01d0: cc 00 00 00 00 00 00 00 ........

0x178 [0x1d8]: PERF_RECORD_AUXTRACE_INFO type: 3

0x350 [0x50]: event: 1
.
. ... raw event: size 80 bytes
. 0000: 01 00 00 00 01 00 50 00 ff ff ff ff 00 00 00 00 ......P.........
. 0010: 00 10 08 08 80 ff ff ff ff ef f7 f7 7f 00 00 00 ................
. 0020: 00 10 08 08 80 ff ff ff 5b 6b 65 72 6e 65 6c 2e ........[kernel.
. 0030: 6b 61 6c 6c 73 79 6d 73 5d 5f 73 74 65 78 74 00 kallsyms]_stext.
. 0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................

0x350 [0x50]: PERF_RECORD_MMAP -1/0: [0xffffff8008081000(0x7ff7f7efff) @ 0xffffff8008081000]: x [kernel.kallsyms]_stext

0x3a0 [0x28]: event: 3
.
. ... raw event: size 40 bytes
. 0000: 03 00 00 00 00 00 28 00 ba 00 00 00 ba 00 00 00 ......(.........
. 0010: 70 65 72 66 00 00 00 00 00 00 00 00 00 00 00 00 perf............
. 0020: 00 00 00 00 00 00 00 00 ........

0x3a0 [0x28]: PERF_RECORD_COMM: perf:186/186

0x3c8 [0x30]: event: 11
.
. ... raw event: size 48 bytes
. 0000: 0b 00 00 00 00 00 30 00 00 00 00 00 00 00 00 00 ......0.........
. 0010: 30 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 0...............
. 0020: ba 00 00 00 ba 00 00 00 5a 00 00 00 00 00 00 00 ........Z.......

0x3c8 [0x30]: PERF_RECORD_AUX offset: 0 size: 0x30 flags: 0 []

0x3f8 [0x28]: event: 3
.
. ... raw event: size 40 bytes
. 0000: 03 00 00 00 00 20 28 00 ba 00 00 00 ba 00 00 00 ..... (.........
. 0010: 74 65 73 74 00 00 00 00 ba 00 00 00 ba 00 00 00 test............
. 0020: 5b 00 00 00 00 00 00 00 [.......

0x3f8 [0x28]: PERF_RECORD_COMM exec: test:186/186

0x420 [0x60]: event: 10
.
. ... raw event: size 96 bytes
. 0000: 0a 00 00 00 02 00 60 00 ba 00 00 00 ba 00 00 00 ......`.........
. 0010: 00 00 40 00 00 00 00 00 00 d0 07 00 00 00 00 00 ..@.............
. 0020: 00 00 00 00 00 00 00 00 00 00 00 00 01 00 00 00 ................
. 0030: 14 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0040: 05 00 00 00 02 18 00 00 2f 74 65 73 74 00 00 00 ......../test...
. 0050: ba 00 00 00 ba 00 00 00 5b 00 00 00 00 00 00 00 ........[.......

0x420 [0x60]: PERF_RECORD_MMAP2 186/186: [0x400000(0x7d000) @ 0 00:01 1044 0]: r-xp /test

0x480 [0x60]: event: 10
.
. ... raw event: size 96 bytes
. 0000: 0a 00 00 00 02 00 60 00 ba 00 00 00 ba 00 00 00 ......`.........
. 0010: 00 20 1f 7c 7f 00 00 00 00 10 00 00 00 00 00 00 . .|............
. 0020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0040: 00 00 00 00 00 00 00 00 5b 76 64 73 6f 5d 00 00 ........[vdso]..
. 0050: ba 00 00 00 ba 00 00 00 5b 00 00 00 00 00 00 00 ........[.......

0x480 [0x60]: PERF_RECORD_MMAP2 186/186: [0x7f7c1f2000(0x1000) @ 0 00:00 0 0]: ---p [vdso]

0x4e0 [0x30]: event: 11
.
. ... raw event: size 48 bytes
. 0000: 0b 00 00 00 00 00 30 00 30 00 00 00 00 00 00 00 ......0.0.......
. 0010: 90 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
. 0020: ba 00 00 00 ba 00 00 00 5a 00 00 00 00 00 00 00 ........Z.......

0x4e0 [0x30]: PERF_RECORD_AUX offset: 0x30 size: 0x90 flags: 0 []

0x510 [0x30]: event: 4
.
. ... raw event: size 48 bytes
. 0000: 04 00 00 00 00 00 30 00 ba 00 00 00 ba 00 00 00 ......0.........
. 0010: ba 00 00 00 ba 00 00 00 f4 a9 49 cc 77 06 00 00 ..........I.w...
. 0020: ba 00 00 00 ba 00 00 00 5b 00 00 00 00 00 00 00 ........[.......

0x510 [0x30]: PERF_RECORD_EXIT(186:186):(186:186)

0x540 [0x30]: event: 71
.
. ... raw event: size 48 bytes
. 0000: 47 00 00 00 00 00 30 00 c0 00 00 00 00 00 00 00 G.....0.........
. 0010: 00 00 00 00 00 00 00 00 b4 75 4c 6f a0 a5 b0 23 .........uLo...#
. 0020: 00 00 00 00 ba 00 00 00 ff ff ff ff 00 00 00 00 ................

0x540 [0x30]: PERF_RECORD_AUXTRACE size: 0xc0 offset: 0 ref: 0x23b0a5a06f4c75b4 idx: 0 tid: 186 cpu: -1

. ... CoreSight ETM Trace data: size 192 bytes

0: I_ASYNC : Alignment Synchronisation.
12: I_TRACE_INFO : Trace Info.; PCTL=0x0
48: I_ASYNC : Alignment Synchronisation.
60: I_TRACE_INFO : Trace Info.; PCTL=0x0
65: I_TRACE_ON : Trace On.
66: I_ADDR_CTXT_L_64IS0 : Address & Context, Long, 64 bit, IS0.; Addr=0x0000000000400658; Ctxt: AArch64,EL0, NS;
76: I_ATOM_F3 : Atom format 3.; EEN
77: I_ATOM_F3 : Atom format 3.; ENN
78: I_ATOM_F5 : Atom format 5.; NEEEE
80: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
81: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
82: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEN
83: I_ATOM_F2 : Atom format 2.; NE
84: I_ADDR_L_64IS0 : Address, Long, 64 bit, IS0.; Addr=0x0000000000400680;
93: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
94: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEN
96: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
97: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
98: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
99: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
100: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
101: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
102: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
103: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
104: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
105: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
106: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
107: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
108: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
109: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
110: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
112: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
113: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
114: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
115: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
116: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
117: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
118: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
119: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
120: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
121: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
122: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
123: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
124: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
125: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
126: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
128: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
129: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
130: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
131: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
132: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
133: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
134: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
135: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
136: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
137: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
138: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
139: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
140: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
141: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
142: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
144: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
145: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
146: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
147: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
148: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
149: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
150: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
151: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
152: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
153: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
154: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
155: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
156: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
157: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
158: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
160: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEEEEEEEEEEEEEEE
161: I_ATOM_F6 : Atom format 6.; EEEEEEEEEEN
162: I_ATOM_F2 : Atom format 2.; NE
163: I_ADDR_S_IS0 : Address, Short, IS0.; Addr=0x000000000040093C ~[0x93C]

0x630 [0x8]: event: 68
.
. ... raw event: size 8 bytes
. 0000: 44 00 00 00 00 00 08 00 D.......

0x630 [0x8]: PERF_RECORD_FINISHED_ROUND

Aggregated stats: (excludes AUX area (e.g. instruction trace) decoded / synthesized events)
TOTAL events: 11
MMAP events: 1
COMM events: 2
EXIT events: 1
MMAP2 events: 2
AUX events: 2
FINISHED_ROUND events: 1
AUXTRACE_INFO events: 1
AUXTRACE events: 1
cs_etm/@20010000.etf/u stats:
dummy:u stats:
~/code/OpenCSD/trace-data$ cat disapp.sh
#!/bin/bash
export PERF_EXEC_PATH=xxx/perf-opencsd-4.8/tools/perf/
export EXEC_PATH=xxx/perf-opencsd-4.8/tools/perf/
export SCRIPT_PATH=$EXEC_PATH/scripts/python/
export XTOOL_PATH=xxx/gcc-linaro-4.9-2015.05-x86_64_aarch64-linux-gnu/bin/
perf --exec-path=${EXEC_PATH} script --script=python:${SCRIPT_PATH}/cs-trace-disasm.py -- -d ${XTOOL_PATH}/aarch64-linux-gnu-objdump
./disapp.sh
FILE: /test CPU: 1 FNAME: ~/.debug/test/054e3cdda6a6a75dd8376bfe48fbab64e8e0b81e/elf
  400658:   a9be7bfd    stp x29, x30, [sp,#-32]!
  40065c:   910003fd    mov x29, sp
  400660:   b9001fbf    str wzr, [x29,#28]
  400664:   b90017bf    str wzr, [x29,#20]
  400668:   b9001bbf    str wzr, [x29,#24]
  40066c:   90000480    adrp x0, 490000 <tzfile_mtime>
  400670:   91110000    add x0, x0, #0x440
  400674:   d2820002    mov x2, #0x1000 // #4096
  400678:   52800001    mov w1, #0x0 // #0
  40067c:   940052e1    bl 415200 <__memset>
FILE: /test CPU: 1 FNAME: ~/.debug/test/054e3cdda6a6a75dd8376bfe48fbab64e8e0b81e/elf
  415200:   aa0003e8    mov x8, x0
  415204:   72001c27    ands w7, w1, #0xff
  415208:   54000640    b.eq 4152d0 <__memset+0xd0>
….
I also tested kernel trace with perf and the CoreSight driver:
/ # ./perf record -e cs_etm/@20010000.etf/k --filter 'start 0xffffff80080cd5d0,stop 0xffffff80080cd694' --per-thread uname
cat disker.sh
export PERF_EXEC_PATH=xxx/perf-opencsd-4.8/tools/perf/
export EXEC_PATH=xxx/perf-opencsd-4.8/tools/perf/
export SCRIPT_PATH=$EXEC_PATH/scripts/python/
export XTOOL_PATH=xxx/gcc-linaro-4.9-2015.05-x86_64_aarch64-linux-gnu/bin/
perf --exec-path=${EXEC_PATH} script --vmlinux=./vmlinux --script=python:${SCRIPT_PATH}/cs-trace-disasm.py -- -d ${XTOOL_PATH}/aarch64-linux-gnu-objdump -k ./vmlinux
./disker.sh
FILE: [kernel.kallsyms] CPU: 2 FNAME: ./vmlinux
  ffffff80080cd5d0:   a9bc7bfd    stp x29, x30, [sp,#-64]!
  ffffff80080cd5d4:   910003fd    mov x29, sp
  ffffff80080cd5d8:   a90153f3    stp x19, x20, [sp,#16]
  ffffff80080cd5dc:   a9025bf5    stp x21, x22, [sp,#32]
  ffffff80080cd5e0:   f9001bf7    str x23, [sp,#48]
  ffffff80080cd5e4:   aa1e03e0    mov x0, x30
  ffffff80080cd5e8:   900055f5    adrp x21, ffffff8008b89000 <nop_trace+0x10>
  ffffff80080cd5ec:   f00054b4    adrp x20, ffffff8008b64000 <cpu_worker_pools+0x180>
  ffffff80080cd5f0:   911e0293    add x19, x20, #0x780
  ffffff80080cd5f4:   97ff153f    bl ffffff8008092af0 <_mcount>
FILE: [kernel.kallsyms] CPU: 2 FNAME: ./vmlinux
  ffffff8008092af0:   d65f03c0    ret
FILE: [kernel.kallsyms] CPU: 2 FNAME: ./vmlinux
  ffffff8008092af4:   d503201f    nop
  ffffff8008092af8:   d503201f    nop
  ffffff8008092afc:   d503201f    nop
  ffffff8008092b00:   a9bf7bfd    stp x29, x30, [sp,#-16]!
  ffffff8008092b04:   910003fd    mov x29, sp
  ffffff8008092b08:   d10013c0    sub x0, x30, #0x4
  ffffff8008092b0c:   f94003a1    ldr x1, [x29]
  ffffff8008092b10:   f9400421    ldr x1, [x1,#8]
  ffffff8008092b14:   d1001021    sub x1, x1, #0x4
  ffffff8008092b18:   d503201f    nop
  ffffff8008092b1c:   d503201f    nop
  ffffff8008092b20:   a8c17bfd    ldp x29, x30, [sp],#16
  ffffff8008092b24:   d65f03c0    ret
FILE: [kernel.kallsyms] CPU: 2 FNAME: ./vmlinux
  ffffff80080cd618:   b9485260    ldr w0, [x19,#2128]
  ffffff80080cd61c:   360803e0    tbz w0, #1, ffffff80080cd698 <scheduler_tick+0xc8>
FILE: [kernel.kallsyms] CPU: 2 FNAME: ./vmlinux
  ffffff80080cd698:   aa1303e0    mov x0, x19
  ffffff80080cd69c:   97ffee43    bl ffffff80080c8fa8 <update_rq_clock.part.24>
FILE: [kernel.kallsyms] CPU: 2 FNAME: ./vmlinux
  ffffff80080c8fa8:   a9be7bfd    stp x29, x30, [sp,#-32]!
  ffffff80080c8fac:   910003fd    mov x29, sp
  ffffff80080c8fb0:   f9000bf3    str x19, [sp,#16]
  ffffff80080c8fb4:   aa0003f3    mov x19, x0
  ffffff80080c8fb8:   aa1e03e0    mov x0, x30
  ffffff80080c8fbc:   97ff26cd    bl ffffff8008092af0 <_mcount>
The perf --filter option supports address-range tracing. The only limitations on address filters are the number of address comparators available on a given implementation and the mutual exclusion between range filters and start/stop filters.
Filter format is: filter|start|stop|tracestop <start symbol or address> [/ <end symbol or size>] [@<file name>]
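To make the format concrete, here is a small sketch that assembles the start/stop form of the filter used in the two perf record commands in this article. The function name make_start_stop_filter is hypothetical and is not part of perf. The two commands in the article separate the clauses with ", " and with ","; this sketch emits the former.

```shell
#!/bin/sh
# Hypothetical helper: build a start/stop address filter string for
# "perf record --filter".
# $1: start address, $2: stop address, $3: optional object file path
make_start_stop_filter() {
    start=$1
    stop=$2
    file=${3:-}
    if [ -n "$file" ]; then
        # User-space trace: addresses are qualified with @<file name>.
        printf 'start %s@%s, stop %s@%s\n' "$start" "$file" "$stop" "$file"
    else
        # Kernel trace: plain addresses, no file name.
        printf 'start %s, stop %s\n' "$start" "$stop"
    fi
}
```

For example, make_start_stop_filter 0x400658 0x4006f0 /test produces the filter string passed to perf record for the user-space test application, and omitting the third argument gives the kernel form.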
The demo application's disassembly file and the decoded application and kernel instruction traces are available below:
https://community.arm.com/cfs-file/__key/communityserver-blogs-components-weblogfiles/00-00-00-21-12/test.asm
https://community.arm.com/cfs-file/__key/communityserver-blogs-components-weblogfiles/00-00-00-21-12/test.trace
https://community.arm.com/cfs-file/__key/communityserver-blogs-components-weblogfiles/00-00-00-21-12/kernel.trace
For more information, refer to the slides from Linaro on Hardware Assisted Tracing on Arm with CoreSight and OpenCSD.
Information can also be found on GitHub in "HOWTO - using the library with perf".
Hi, this is a great tutorial. However, the command for git clone -b perf-opencsd-4.8 github.com/.../OpenCSD.git perf-opencsd-4.8 is no longer working
Hi Jeremy, thanks for your kind reminder, I think now it is here https://github.com/Linaro/perf-opencsd