Confusion about fp32 operations per cycle

Hello community,

In the GPU datasheet, the fp32 operations per cycle figure is 256 for the Immortalis-G715. Is this for all 16 cores or for 1 core only?

Thanks,

Venkatesh.

  • Industry convention for GPUs is to count only FMAs, and to count each one as two operations (mul + add). The data sheet figure of 256 fp32 ops/cy is therefore per shader core: each shader core can do 128 fp32 FMA operations per clock cycle. These numbers completely ignore the CVT and SFU units.

    Whether you get more than 256 ops/cy because of the CVT and SFU units depends on the operation and the GPU generation. Some operations can issue in parallel and some can't, because we continuously rebalance the shader core design to get the right balance of operations for industry content trends and to optimize for energy efficiency.
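
As a rough sketch of the arithmetic described above, the snippet below converts the per-core data sheet figure into FMAs per cycle and scales it to a whole GPU. The 256 ops/cy value and the 16-core count come from this thread; the clock frequency is a purely hypothetical placeholder, and the function names are made up for illustration. The result is a theoretical FMA-only peak that, like the data sheet number, ignores the CVT and SFU units.

```python
# Figures quoted in this thread.
OPS_PER_CYCLE_PER_CORE = 256   # fp32 ops/cy per shader core (data sheet)
CORES = 16                     # core count mentioned in the question

# Assumption: NOT from the data sheet, chosen only to make the math concrete.
ASSUMED_CLOCK_MHZ = 1000


def fmas_per_cycle(ops_per_cycle: int) -> int:
    """One FMA is counted as two operations (mul + add)."""
    return ops_per_cycle // 2


def gpu_ops_per_cycle(ops_per_cycle_per_core: int, cores: int) -> int:
    """Scale the per-core figure to the whole GPU."""
    return ops_per_cycle_per_core * cores


def peak_gflops(ops_per_cycle_per_core: int, cores: int, clock_mhz: float) -> float:
    """Theoretical FMA-only peak: ops/cy * cycles/s, expressed in GFLOP/s."""
    return gpu_ops_per_cycle(ops_per_cycle_per_core, cores) * clock_mhz / 1000.0


if __name__ == "__main__":
    print("FMAs per cycle per core:", fmas_per_cycle(OPS_PER_CYCLE_PER_CORE))
    print("fp32 ops per cycle, all cores:", gpu_ops_per_cycle(OPS_PER_CYCLE_PER_CORE, CORES))
    print("Peak GFLOP/s at assumed clock:",
          peak_gflops(OPS_PER_CYCLE_PER_CORE, CORES, ASSUMED_CLOCK_MHZ))
```

Real shader throughput will usually fall below this peak, since it assumes every issue slot is filled with an fp32 FMA.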
