You know those deep questions you get in philosophy, like "If a tree falls in a forest and no one is there to hear it, does it still make a sound?", "What is the sound of one hand clapping?", or "Why is there something rather than nothing?". Well, the equivalent in graphics terms is probably "How many triangles per second is enough?".
Let’s start with a proverb…
I’ve put together this blog to help explain the appropriate use of geometry to drive fragment creation, with the hopeful outcome that more people will understand the answer to the question: “What is the purpose of a triangle in 3D graphics?”
As a generalized statement, 3D graphics objects are convex geometric approximations, built from points in 3D (X,Y,Z) space, of the outer hull or “skin” of the real world object they represent. These points are referred to as vertices, and one or more vertices combine to form a primitive, which in OpenGL® ES is constrained to a point, a line or a triangle. In the case of the triangle, the primitive represents a facet of the surface of the object.
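As a minimal sketch of that representation (the values here are my own toy example, not from any particular API), a mesh is typically stored as a list of shared vertices plus index triples, each triple naming one triangular facet:

```python
# A flat unit quad approximated by two triangle facets that share
# four vertices in 3D (X, Y, Z) space.
vertices = [
    (0.0, 0.0, 0.0),  # v0
    (1.0, 0.0, 0.0),  # v1
    (1.0, 1.0, 0.0),  # v2
    (0.0, 1.0, 0.0),  # v3
]
triangles = [
    (0, 1, 2),  # facet 1
    (0, 2, 3),  # facet 2
]
```

Sharing vertices between facets is the norm: the per-vertex data is paid for once, and each index triple is just a cheap reference to it.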
Why approximation? Well, if you throw enough triangles at it you can pretty much say it is exact (note: my internal monologue is now having a Sheldon Cooper-esque argument about the correctness of that statement), but an approximation is usually good enough for your brain to make out the general shape of the object and recognize it as a small off duty Czechoslovakian traffic warden or a banana, as it were. The job of a single triangle in this geometry is to serve as a sample container for the section of the surface it represents, with the detail of that surface being carried by the fragments that will be generated in the final image. The point of this is that the triangle is a cheap container, which can in turn be relatively cheaply manipulated (scaled, rotated, etc.), and the effects and other sample references (lighting, textures, normal maps, etc.) applied to its vertices can then be cheaply interpolated to create the fragments contained within its surface. The fragments are the things of value, as they are what will actually be seen; the triangle primitive is merely the vessel by which they arrive. For more on this see this blog.
Given the above statement, it is expected that the ratio of primitives to fragments is a 1:‘N’ relationship where ‘N’ is many. Given this, there is a watermark for ‘N’: if the geometry consistently yields an ‘N’ below that watermark when rendered, the primitive breaks down as a sample container, and so does the efficiency of the GPU; you are now limited by something that was supposed to have trivial cost. In other words, if the cost of the per-vertex calculations (where cost is a function of compute, bandwidth, etc.) outstrips the total cost of the fragments the primitive contains, then you are into negative ROI, because you are spending more compute on non-visible containers than on visible pixels.
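As a back-of-envelope sketch of that break-even point (the cycle costs below are illustrative assumptions, not measurements from any real GPU): a triangle carries a fixed cost of three vertex-shader invocations plus rasterisation setup, and it only pays for itself once the fragments it generates cost more than that fixed overhead.

```python
def breakeven_coverage(vertex_cycles, raster_cycles, fragment_cycles):
    """Fragments per triangle at which per-fragment work starts to
    outweigh the fixed per-triangle cost (illustrative model only)."""
    fixed_cost = 3 * vertex_cycles + raster_cycles  # 3 vertices + setup
    return fixed_cost / fragment_cycles

# Illustrative numbers: 20-cycle vertex shader, 10-cycle raster setup,
# 8-cycle fragment shader.
print(breakeven_coverage(20, 10, 8))  # → 8.75
```

Triangles covering fewer fragments than this are a net loss. Note that this toy model ignores bandwidth, caching and vertex reuse, all of which shift the real watermark around.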
The watermark for ‘N’ depends greatly on the relative fragment and vertex processing costs and can be very complex to calculate. To make life easier, as a rule of thumb the watermark is usually characterized by the cost of the rasterisation stage, the mathematical process that breaks the primitive into the fragments its footprint represents. In modern GPUs, rasterisation costs in the order of 8-10 cycles for a basic triangle; therefore, as the rule of thumb, coverage of 10 fragments per triangle should be used as the low watermark.
Using the above discussion, we have established the purpose of the triangle and some sensible constraints on what we could define as a meaningful triangle. Given this, we can begin to discuss how many triangles are appropriate before you reach a point of visual saturation. Plowman's Law is my attempt at defining the point at which adding more geometry to a scene will yield little to no extra visual return for the additional processing cost of that geometry, i.e. the point of visual saturation.
Plowman's Law defines visual saturation as follows: take the resolution of the surface being rendered, multiply it by the average overdraw factor to allow for overdraw, then divide by the average fragment coverage per triangle; this yields the maximum number of triangles actually drawn (i.e. not culled before rasterisation). To allow for additional triangles which may be back facing, which in an average convex object would be about 50%, we double the result. Laying that out as a formula we end up with something like this:
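Reconstructing that formula from the description above (the symbol names are my own shorthand):

```latex
T_{\mathrm{max}} = 2 \times \frac{R \times O}{C}
```

where \(R\) is the resolution of the surface in pixels, \(O\) is the average overdraw factor, and \(C\) is the average fragment coverage per triangle (10, per the rule of thumb above).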
Sensible overdraw factors should lie within the range of 1.3 to 2.0, with 2.0 being a very high overdraw factor for a well-written graphics application. Remember that this is not a single point of overdraw, but the amount of overdraw per pixel averaged across the surface being rendered. The 1.3 factor is much more sensible, as it represents about 30% of the screen being redrawn. Now that we have our formula, let's look at a simple example for a 1080p screen:
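Here is the arithmetic laid out as a short script (my own sketch of the example, using the 1.3 overdraw factor and the 10-fragment coverage rule of thumb from earlier):

```python
WIDTH, HEIGHT = 1920, 1080   # 1080p surface
OVERDRAW = 1.3               # average overdraw factor
COVERAGE = 10                # average fragments per triangle
BACKFACE_FACTOR = 2          # ~50% of a convex object faces away

pixels = WIDTH * HEIGHT                       # 2,073,600 pixels
drawn = pixels * OVERDRAW / COVERAGE          # triangles actually drawn
max_triangles = int(drawn * BACKFACE_FACTOR)  # allow for back-facing ones
print(max_triangles)  # → 539136 triangles per frame
```

So a 1080p surface saturates at roughly 540K drawn triangles per frame under these assumptions.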
Given this number, if we multiply by a target frame rate we derive the number of triangles per second required to achieve visual saturation in every frame, or the answer to the question "How many triangles per second is enough?". In the case of a 1080p screen @ 60Hz, that works out to roughly 32M tri/sec.
Next time we'll talk about the issues of getting all that performance within the constraints of a mobile device in “PHENOMENAL COSMIC POWERS! Itty-bitty living space!”