fp64 on Mali T604
OpenCL
Mali-T604
Mali-GPU
rahul garg
over 12 years ago
Note: This was originally posted on 21st August 2012 at
http://forums.arm.com
First of all, congrats to ARM for submitting Mali T604 for OpenCL full profile conformance. I hope the tests are finished soon.
I was wondering about fp64 support on the T604. ARM has been quite vocal about T604 supporting fp64, but details (such as speed relative to fp32) have not been released. Any more details on how fp64 is implemented and what performance to expect?
fp64 support would make the T604 really useful for my project, which involves GPGPU for scientific-computing workloads.
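As a side note for anyone checking fp64 availability at run time: OpenCL devices report their extensions as a space-separated string via `clGetDeviceInfo` with `CL_DEVICE_EXTENSIONS`, and double precision is advertised as `cl_khr_fp64`. The sketch below only shows the string check and assumes the extension string has already been fetched; the helper names `has_extension` and `reports_fp64` are made up for illustration and are not part of any OpenCL API.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns true if `name` appears as a whole,
 * space-delimited token in `extensions`. A plain strstr() is not enough,
 * because it would also match a longer name that merely contains `name`. */
static bool has_extension(const char *extensions, const char *name)
{
    if (extensions == NULL || name == NULL) {
        return false;
    }

    size_t name_len = strlen(name);
    if (name_len == 0) {
        return false;
    }

    const char *p = extensions;
    while ((p = strstr(p, name)) != NULL) {
        /* The match must start at the beginning or right after a space... */
        bool starts_token = (p == extensions) || (p[-1] == ' ');
        /* ...and must end at the end of the string or right before a space. */
        char after = p[name_len];
        bool ends_token = (after == '\0') || (after == ' ');

        if (starts_token && ends_token) {
            return true;
        }
        p += name_len; /* keep scanning after this partial match */
    }
    return false;
}

/* Hypothetical convenience wrapper for the fp64 case discussed here.
 * `extensions` is assumed to be the string a device returned for
 * CL_DEVICE_EXTENSIONS. */
static bool reports_fp64(const char *extensions)
{
    return has_extension(extensions, "cl_khr_fp64");
}
```

Note that this only tells you whether the device exposes doubles at all; it says nothing about how fast they are relative to fp32, which is the actual question here.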
Peter Harris
over 12 years ago
Note: This was originally posted on 14th November 2012 at
http://forums.arm.com
> 1/24 of fp32 rate

I expect it will be significantly better than this ...

> to 1/2 of fp32 rate

... but not quite as good as this.

More specifically, the main issue is how you quantify the "fp32 rate". For graphics workloads, a lot of the fp32 (highp) and fp16 (mediump) calculations can be optimized in the hardware. Common graphics operations such as vector dot products are not usually done with "general purpose" floating-point units; they can be done more power-efficiently in fixed-function accelerators. Graphics doesn't need fp64 (or even fp32) in most cases, so this fixed-function hardware probably doesn't exist for the fp64 cases.