Using Streamline Instruction Executed counter to measure MIPS

Hello,

We are adding some extra sound effects to Android's mediaserver. We're using DS-5's Streamline to measure the performance of an active thread that implements this sound effect, on a Nexus 5 phone. This phone's CPU has four cores, which Streamline detects correctly. We build the entire Android AOSP platform for Android 5/6 using the prebuilt toolchain supplied by Android. The shared library containing the thread being measured was compiled with gcc.

I use DS-5/Streamline by playing a media file for one minute while Streamline captures the CPU activity. I've done the following:

- compiled all code that implements the thread using the flags -g -fno-inline -fno-omit-frame-pointer, as described in

   Streamline User Guide | Recommended compiler options | ARM DS-5 Development Studio

- pushed the compiled shared library (with symbols) to the phone

- in Streamline's Capture & Analysis Options, selected "High Resolution Timeline", and added the location of the shared library with symbols to "Program Images"

After the test, I expand the Cross Section Marker to cover a time period of one minute. The Instructions Executed counter then displays the total number of instructions executed over this period.

I filter all counters for the process I want to measure and divide the filtered Instruction Executed count by 60 (the capture length in seconds) to get the MIPS figure, averaged over the four CPU cores.
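
The arithmetic behind this step can be sketched as follows. This is only an illustration: the function names are hypothetical, the counts are the ones quoted in question 3 below, and the per-core figure assumes the work is spread evenly over the four cores, which may not hold for a single thread.

```python
def mips(instruction_count, seconds):
    """Convert a total instruction count over a time window to MIPS."""
    instructions_per_second = instruction_count / seconds
    return instructions_per_second / 1e6  # "M" in MIPS = 10^6


def mips_per_core(instruction_count, seconds, cores):
    """Average MIPS per core, assuming an even spread (an assumption)."""
    return mips(instruction_count, seconds) / cores


if __name__ == "__main__":
    window_s = 60  # length of the Cross Section Marker selection
    for label, count in (("with flags", 40e9), ("without flags", 26e9)):
        total = mips(count, window_s)
        per_core = mips_per_core(count, window_s, cores=4)
        print(f"{label}: {total:.1f} MIPS total, {per_core:.1f} MIPS per core")
```

Note that dividing the count by 60 alone gives instructions per second; the further division by 10^6 is what turns it into MIPS.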

My questions are:

1. Is this the best way to measure the MIPS, using DS-5?

2. Using these compile-time options seems counter-intuitive when taking profiling measurements. For example, the whole point of inlining is to improve performance. Do these flags apply to profiling measurements?

3. When I don't use the compile-time flags -fno-inline -fno-omit-frame-pointer listed above, I get a total Instruction Executed count about 35% lower (26 Ginstructions vs 40 Ginstructions). However, the indicated CPU activity, averaged over the four cores for the same thread, is only about 15% lower (11.3% vs 13.4%). Using or omitting the -g flag makes no difference, which also seems counter-intuitive.
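
For reference, the two percentages in question 3 can be reproduced from the quoted figures. This is a minimal sketch; relative_drop is a hypothetical helper, and the last two lines only restate the observation that the instruction count falls faster than the CPU activity.

```python
def relative_drop(baseline, value):
    """Fractional decrease of value relative to baseline."""
    return (baseline - value) / baseline


# Figures quoted in the post: with the flags (baseline) vs without them.
instr_with, instr_without = 40e9, 26e9   # instructions over one minute
cpu_with, cpu_without = 13.4, 11.3       # CPU activity, percent

instr_drop = relative_drop(instr_with, instr_without)
cpu_drop = relative_drop(cpu_with, cpu_without)

# Instructions executed per percentage point of CPU activity.
density_with = instr_with / cpu_with
density_without = instr_without / cpu_without

if __name__ == "__main__":
    print(f"instruction count drop: {instr_drop:.1%}")
    print(f"CPU activity drop:      {cpu_drop:.1%}")
    print(f"Ginstr per CPU %: {density_with / 1e9:.2f} vs {density_without / 1e9:.2f}")
```

The build with -fno-inline -fno-omit-frame-pointer executes more instructions per unit of CPU activity, which is the discrepancy the question is asking about.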

Many thanks,

Paul

  • Hi Paul,

    The compile-time options are required for accurate sampling and stack walking of the code, as shown in the Call Paths, Functions, and Code views of Streamline. If you do not care about these views, then these compile-time options are not required. For MIPS, try the following on the Instruction Executed chart: change the display type from Accumulate to Hertz, add "IPS" to the units, and enable average selection (which averages the values displayed by the CSM). These settings may be better for what you are doing.

    Wade
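
    The difference between an accumulated count and an averaged rate can be illustrated with a small sketch. It uses made-up sample values and hypothetical function names, and is not a description of how Streamline computes its charts.

```python
def accumulated(samples):
    """Total events over the selection (an accumulate-style view)."""
    return sum(count for count, _ in samples)


def average_rate(samples):
    """Events per second averaged over the selection (a rate-style view)."""
    total_us = sum(duration_us for _, duration_us in samples)
    return accumulated(samples) / (total_us / 1e6)


# Hypothetical samples: (instructions in the bin, bin length in microseconds)
samples = [(1_000_000, 1000), (3_000_000, 1000), (2_000_000, 1000)]

if __name__ == "__main__":
    print(f"accumulated: {accumulated(samples)} instructions")
    print(f"average rate: {average_rate(samples) / 1e6:.0f} MIPS")
```

    With a rate-style display the tool does the division by time for you, so no manual division by 60 is needed.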
