These are the times I measured for each instruction with MegPeak. The time for the stp instruction differs greatly from the official documentation (Arm_Cortex-A76_Software_Optimization_Guide).

bandwidth: 19.067337 Gbps
ldd throughput: 0.221717 ns 3547478.000000 runs 16000000
ldq throughput: 0.221717 ns 3547478.000000 runs 16000000
stq throughput: 1.327807 ns 21244908.000000 runs 16000000
ldpq throughput: 0.442687 ns 5666398.000000 runs 12800000
lddx2 throughput: 0.442888 ns 7086206.000000 runs 16000000
ld1q throughput: 0.221316 ns 3541061.000000 runs 16000000
eor throughput: 0.221280 ns 3540478.000000 runs 16000000
fmla throughput: 0.221590 ns 3545436.000000 runs 16000000
fmlad throughput: 0.221480 ns 3543687.000000 runs 16000000
fmla_x2 throughput: 0.475682 ns 7610905.000000 runs 16000000
mla throughput: 0.884828 ns 14157246.000000 runs 16000000
fmul throughput: 0.221298 ns 3540769.000000 runs 16000000
mul throughput: 0.884700 ns 14155203.000000 runs 16000000
addp throughput: 0.221262 ns 3540187.000000 runs 16000000
sadalp throughput: 0.442833 ns 7085331.000000 runs 16000000
add throughput: 0.221262 ns 3540186.000000 runs 16000000
fadd throughput: 0.221590 ns 3545436.000000 runs 16000000
smull throughput: 0.442432 ns 7078915.000000 runs 16000000
smlal_4b throughput: 0.442724 ns 7083581.000000 runs 16000000
smlal_8b throughput: 0.442851 ns 7085622.000000 runs 16000000
dupd_lane_s8 throughput: 0.221280 ns 3540478.000000 runs 16000000
mlaq_lane_s16 throughput: 0.885192 ns 10622309.000000 runs 12000000
sshll throughput: 0.442706 ns 7083289.000000 runs 16000000
tbl throughput: 0.221262 ns 3540187.000000 runs 16000000
ins throughput: 0.442651 ns 7082415.000000 runs 16000000
sqrdmulh throughput: 0.884609 ns 14153745.000000 runs 16000000
usubl throughput: 0.221207 ns 3539311.000000 runs 16000000
abs throughput: 0.221553 ns 3544853.000000 runs 16000000
fcvtzs throughput: 0.885320 ns 14165121.000000 runs 16000000
scvtf throughput: 0.884828 ns 14157246.000000 runs 16000000
fcvtns throughput: 0.884810 ns 14156954.000000 runs 16000000
fcvtms throughput: 0.884773 ns 14156371.000000 runs 16000000
fcvtps throughput: 0.885265 ns 14164246.000000 runs 16000000
fcvtas throughput: 0.884427 ns 14150829.000000 runs 16000000
fcvtn throughput: 0.884554 ns 14152871.000000 runs 16000000
fcvtl throughput: 0.884974 ns 14159579.000000 runs 16000000
ins_ldd throughput: 0.442824 ns 5668148.000000 runs 12800000
ldq_fmlaq throughput: 0.232800 ns 3724808.000000 runs 16000000
ldd_fmlaq_sep throughput: 0.249211 ns 3189901.000000 runs 12800000
ldd_fmlaq_lane_sep throughput: 0.243519 ns 3896305.000000 runs 16000000
ldd_ldx_ins_fmlaq_lane_sep throughput: 0.364600 ns 4666874.000000 runs 12800000
ins_fmlaq_lane_1_4_sep throughput: 0.381871 ns 4887954.000000 runs 12800000
ldd_fmlaq_lane_1_4_sep throughput: 0.221891 ns 2840199.000000 runs 12800000
ins_fmlaq_lane_sep throughput: 1.089538 ns 17432604.000000 runs 16000000
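The optimization guide states throughput per cycle, while the numbers above are in nanoseconds, and the post does not give the clock frequency. One way to compare without knowing the frequency is to express each measured time as a multiple of a baseline instruction. The following is a minimal Python sketch of that idea, assuming the results are available as text with one entry per line. The helper names `parse_results` and `relative_cost`, and the choice of `add` as the baseline, are illustrative assumptions and not part of MegPeak.

```python
import re

# Matches lines shaped like:
#   "stq throughput: 1.327807 ns 21244908.000000 runs 16000000"
# Other lines (such as the "bandwidth" line) are skipped.
LINE_RE = re.compile(r"^(\S+) throughput: ([\d.]+) ns ([\d.]+) runs (\d+)$")


def parse_results(text):
    """Return {instruction_name: ns_per_instruction} for every matching line."""
    results = {}
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            results[match.group(1)] = float(match.group(2))
    return results


def relative_cost(results, baseline="add"):
    """Express each measured time as a multiple of the baseline's time."""
    base = results[baseline]
    return {name: ns / base for name, ns in results.items()}


# A few lines copied from the results in this post.
SAMPLE = """\
ldq throughput: 0.221717 ns 3547478.000000 runs 16000000
stq throughput: 1.327807 ns 21244908.000000 runs 16000000
add throughput: 0.221262 ns 3540186.000000 runs 16000000
mul throughput: 0.884700 ns 14155203.000000 runs 16000000
"""

if __name__ == "__main__":
    ratios = relative_cost(parse_results(SAMPLE))
    for name, ratio in sorted(ratios.items(), key=lambda item: item[1]):
        print(f"{name:<6} {ratio:.2f}x")
```

On these four lines, stq comes out at roughly six times the add time and mul at roughly four times. Mapping those ratios onto the guide's per-cycle figures still requires knowing how many add instructions the core issues per cycle, so the ratios are a starting point for the comparison and not a conclusion.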
MegPeak --github.com/.../MegPeak