A Bug on Huawei Phones

Today I looked into a glow (bloom) flicker problem: one of our PBR motorcycles flickered badly in its specular highlights once the glow effect was turned on, and the flicker appeared only on a Huawei phone (Mali GPU).

Capturing a frame in RenderDoc showed that the highlight values at the flickering pixels were off the charts, as shown below:

As the figure above shows, the color value inside the red box is 65504. We have FP16 HDR enabled, and 65504 is the largest finite value that FP16 can represent:

0 11110 1111111111 = (-1)^0 * 2^15 * (1 + 1 - 2^-10) = 65504
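As a quick sanity check on this bit pattern, here is a minimal Python sketch that decodes it with the standard library's struct module, whose format code e is IEEE 754 binary16. The helper name half_from_bits is made up for this illustration.

```python
import struct


def half_from_bits(bits: int) -> float:
    """Interpret a 16-bit integer as an IEEE 754 binary16 (half) value."""
    return struct.unpack("<e", bits.to_bytes(2, "little"))[0]


# sign = 0, exponent = 11110, mantissa = 1111111111
max_half = half_from_bits(0b0_11110_1111111111)

# The same value computed from the formula: 2^15 * (1 + 1 - 2^-10)
from_formula = 2.0 ** 15 * (1.0 + 1.0 - 2.0 ** -10)

# The next exponent pattern (11111 with a zero mantissa) is already +infinity
next_pattern = half_from_bits(0b0_11111_0000000000)

print(max_half, from_formula, next_pattern)  # 65504.0 65504.0 inf
```

So anything even slightly larger than 65504 has nowhere to go in an FP16 render target except saturation or infinity.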

My first guess was a floating-point precision problem, since Mali GPUs have bitten us this way before 🙂

The fix

The problem can be worked around simply by clamping the final specular value.

But being a bit obsessive, I wanted to find out exactly what was going wrong, so I did some debugging and traced it to the following code:

half perceptualRoughness = SmoothnessToPerceptualRoughness(smoothness);
half roughness = PerceptualRoughnessToRoughness(perceptualRoughness);

half V = SmithJointGGXVisibilityTerm(NoL, NoV, roughness); 
half D = GGXTerm(NoH, roughness);
half specularTerm = V * D * UNITY_PI;

This specular calculation is taken directly from Unity's BRDF1 algorithm, with the Fresnel term removed. The precision of roughness in the code above affects the final specular result.

Let’s look at the code for the normal distribution function GGXTerm:

inline float GGXTerm (float NdotH, float roughness)
{
    float a2 = roughness * roughness;
    float d = (NdotH * a2 - NdotH) * NdotH + 1.0f; // 2 mad
    return UNITY_INV_PI * a2 / (d * d + 1e-7f);
    // This function is not intended to be running on Mobile,
    // therefore epsilon is smaller than what can be represented by half
}

The parameters are float, and a comment at the end of the function states clearly that it is not intended to run on mobile, because the epsilon 1e-7f is too small to be represented in half precision:

This function is not intended to be running on Mobile, therefore epsilon is smaller than what can be represented by half

The smallest positive normal number that half precision can represent is 2^-14, about 6.10×10^-5:

0 00001 0000000000 = 2^-14 ≈ 6.10×10^-5

So I changed roughness from half to float, and the problem was fixed.
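To see why an epsilon of 1e-7 offers no protection in half precision, here is a minimal Python sketch that re-evaluates GGXTerm with every intermediate rounded to binary16. It is a hypothetical model, not a trace of what the Mali shader compiler actually emitted: the helper to_half, the rnd parameter, the flush-to-zero behaviour for subnormals and the input values are all assumptions made for illustration, and Python's double stands in for float.

```python
import math
import struct

HALF_MIN_NORMAL = 2.0 ** -14  # 6.103515625e-05, smallest positive normal half


def to_half(x, flush_subnormals=True):
    """Round x to the nearest binary16 value.

    flush_subnormals=True models hardware that treats half subnormals as zero.
    That behaviour is an assumption of this sketch, not a measured property.
    """
    try:
        h = struct.unpack("<e", struct.pack("<e", x))[0]
    except OverflowError:  # magnitude too large for binary16
        return math.copysign(math.inf, x)
    if flush_subnormals and abs(h) < HALF_MIN_NORMAL:
        return math.copysign(0.0, h)
    return h


def ggx_term(n_dot_h, roughness, rnd):
    """GGXTerm from the article, rounding every intermediate with rnd."""
    a2 = rnd(roughness * roughness)
    d = rnd(rnd(rnd(n_dot_h * a2) - n_dot_h) * n_dot_h)
    d = rnd(d + 1.0)
    denom = rnd(rnd(d * d) + rnd(1e-7))
    if denom == 0.0:
        return math.inf  # division by zero yields +inf on IEEE hardware
    return rnd(rnd(1.0 / math.pi) * a2 / denom)


# 1e-7 only exists as a subnormal in half, and vanishes if subnormals are flushed
eps_subnormal = to_half(1e-7, flush_subnormals=False)
eps_flushed = to_half(1e-7)

# Same inputs, evaluated with half rounding and with no rounding at all
half_result = ggx_term(1.0, 0.01, to_half)
float_result = ggx_term(1.0, 0.01, lambda x: x)

print(eps_subnormal, eps_flushed)
print(half_result, float_result)
```

In this model, a2 - 1 rounds to exactly -1 in half, so d collapses to 0, the flushed epsilon contributes nothing, and the division blows up, while the unrounded evaluation stays at a few hundred.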

BRDF simplification in the URP pipeline

Using the Standard pipeline's BRDF1 algorithm directly on mobile devices is somewhat expensive.

Alternatively, we can use BRDF2, or borrow the simplified DirectBDRF from the URP pipeline. Its code is as follows:

// Based on Minimalist CookTorrance BRDF
// Implementation is slightly different from original derivation: http://www.thetenthplanet.de/archives/255
//
// * NDF [Modified] GGX  
// * Modified Kelemen and Szirmay-Kalos for Visibility term
// * Fresnel approximated with 1/LdotH 
half3 DirectBDRF(BRDFData brdfData, half3 normalWS, half3 lightDirectionWS, half3 viewDirectionWS)
{
#ifndef _SPECULARHIGHLIGHTS_OFF
    float3 halfDir = SafeNormalize(float3(lightDirectionWS) + float3(viewDirectionWS)); 

    float NoH = saturate(dot(normalWS, halfDir));
    half LoH = saturate(dot(lightDirectionWS, halfDir));

    // BRDFspec = (D * V * F) / 4.0
    // D = roughness^2 / ( NoH^2 * (roughness^2 - 1) + 1 )^2
    // V * F = 1.0 / ( LoH^2 * (roughness + 0.5) )
    // See "Optimizing PBR for Mobile" from Siggraph 2015 moving mobile graphics course
    // https://community.arm.com/events/1155

    // Final BRDFspec = roughness^2 / ( NoH^2 * (roughness^2 - 1) + 1 )^2 * (LoH^2 * (roughness + 0.5) * 4.0)
    // We further optimize a few light invariant terms
    // brdfData.normalizationTerm = (roughness + 0.5) * 4.0 rewritten as roughness * 4.0 + 2.0 to fit a MAD.
    float d = NoH * NoH * brdfData.roughness2MinusOne + 1.00001f;

    half LoH2 = LoH * LoH;
    half specularTerm = brdfData.roughness2 / ((d * d) * max(0.1h, LoH2) * brdfData.normalizationTerm);

    // On platforms where half actually means something, the denominator has a risk of overflow
    // clamp below was added specifically to "fix" that, but dx compiler (we convert bytecode to metal/gles)
    // sees that specularTerm have only non-negative terms, so it skips max(0,..) in clamp (leaving only min(100,...))
#if defined (SHADER_API_MOBILE) || defined (SHADER_API_SWITCH)
    specularTerm = specularTerm - HALF_MIN;
    specularTerm = clamp(specularTerm, 0.0, 100.0); // Prevent FP16 overflow on mobiles
#endif

    half3 color = specularTerm * brdfData.specular + brdfData.diffuse;
    return color;
#else
    return brdfData.diffuse;  
#endif
}

The code is clearly commented, and the simplification follows the SIGGRAPH 2015 talk Optimizing PBR for Mobile.

The classic microfacet specular BRDF formula is as follows:

Following Optimizing PBR for Mobile, V and F can be combined and approximated together:

BRDFspec = (D * V * F) / 4.0

D = roughness^2 / ( NoH^2 * (roughness^2 - 1) + 1 )^2

V * F = 1.0 / (LoH^2 * (roughness + 0.5))

The final result is as follows:
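
Following the comments in the DirectBDRF code above, the combined expression can be written as below. This is only a transcription of those comments into one fraction, not a new derivation.

```latex
\mathrm{BRDF}_{spec}
  = \frac{D \cdot V \cdot F}{4}
  = \frac{\mathit{roughness}^2}
         {\left( NoH^2 \, (\mathit{roughness}^2 - 1) + 1 \right)^2
          \cdot LoH^2 \cdot (\mathit{roughness} + 0.5) \cdot 4}
```

The factor (roughness + 0.5) * 4 does not depend on the light, which is why the code precomputes it as brdfData.normalizationTerm.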

Finally, the code above also takes care of half precision:

#define HALF_MIN 6.103515625E-5 // 2^-14, the same value for 10, 11 and 16-bit: https://www.khronos.org/opengl/wiki/Small_Float_Formats

// On platforms where half actually means something, the denominator has a risk of overflow
// clamp below was added specifically to "fix" that, but dx compiler (we convert bytecode to metal/gles)
// sees that specularTerm have only non-negative terms, so it skips max(0,..) in clamp (leaving only min(100,...))
#if defined (SHADER_API_MOBILE) || defined (SHADER_API_SWITCH)
    specularTerm = specularTerm - HALF_MIN;
    specularTerm = clamp(specularTerm, 0.0, 100.0); // Prevent FP16 overflow on mobiles
#endif
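
To make the overflow risk concrete, here is a rough Python sketch of the specular term and the guard above, evaluated in ordinary double precision. The function names direct_specular and guard and the input values are made up for illustration, and any clamping of roughness that URP performs before this point is not modeled.

```python
HALF_MAX = 65504.0          # largest finite binary16 value
HALF_MIN = 6.103515625e-5   # 2^-14, as in the #define above


def direct_specular(roughness, n_o_h, l_o_h):
    """Specular term from DirectBDRF, before the mobile guard."""
    roughness2 = roughness * roughness
    roughness2_minus_one = roughness2 - 1.0
    normalization_term = roughness * 4.0 + 2.0
    d = n_o_h * n_o_h * roughness2_minus_one + 1.00001
    l_o_h2 = l_o_h * l_o_h
    return roughness2 / ((d * d) * max(0.1, l_o_h2) * normalization_term)


def guard(specular_term):
    """The mobile guard: subtract HALF_MIN, then clamp to [0, 100]."""
    specular_term = specular_term - HALF_MIN
    return min(max(specular_term, 0.0), 100.0)


# A very smooth surface with the half vector aligned to the normal
raw = direct_specular(roughness=0.003, n_o_h=1.0, l_o_h=0.3)
guarded = guard(raw)

print(raw > HALF_MAX)  # True: this value would not fit in a half
print(guarded)         # 100.0
```

In Python the subtraction only shifts the value by HALF_MIN. In the shader, according to the comment above, its purpose is to stop the compiler from proving the term non-negative and dropping the max(0, ..) half of the clamp.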

Personal home page

Link to this article on my personal home page: baddogzz.github.io/2020/04/27/…

Okay, bye!