Myles Maxfield @Litherum

Translator: UC International research and development Jothy





This article introduces a new shading language for Web graphics: Web High Level Shading Language (WHLSL, pronounced “whistle”). The language is inspired by HLSL, the dominant shading language among graphics application developers. It extends HLSL for the Web platform so that it is safe and reliable. It is easy to read and write, and it is well specified using formal techniques.



Background

3D graphics have changed significantly over the past few decades, and the APIs programmers use to write 3D applications have changed accordingly. Five years ago, the most advanced graphics applications used OpenGL to perform rendering. In the past few years, however, the 3D graphics industry has been moving toward newer, lower-level graphics frameworks that more closely match the behavior of real hardware. In 2014, Apple created the Metal framework, which lets iOS and macOS apps take full advantage of GPUs. In 2015, Microsoft created Direct3D 12, a major update to Direct3D that allows console-level efficiency for rendering and compute.

In 2016, the Khronos Group released the Vulkan API, used primarily on Android, with similar advantages.

Just as WebGL brought OpenGL to the Web, the Web community is looking to bring this new kind of low-level 3D graphics API to the platform. Last year, Apple created the WebGPU Community Group within the W3C to standardize a new 3D graphics API that offers the benefits of native APIs but is also suitable for the Web environment. This new Web API can be implemented on top of Metal, Direct3D, and Vulkan. All major browser vendors are participating in and contributing to this standardization effort.




Each of these modern 3D graphics APIs uses shaders, and WebGPU is no exception. Shaders are programs that take advantage of the specialized architecture of GPUs. In particular, GPUs outperform CPUs at heavily parallel numerical processing. To take advantage of both architectures, modern 3D applications use a hybrid design in which CPUs and GPUs handle different tasks. By leveraging the best characteristics of each architecture, modern graphics APIs give developers a powerful framework for creating complex, rich, and fast 3D applications. Shaders are written in Metal Shading Language for Metal, HLSL for Direct3D 12, and SPIR-V or GLSL for Vulkan.





Language requirements

Like its native counterparts, WebGPU needs a shader language. The language needs to meet several requirements to be suitable for the Web platform.

It needs to be safe. Whatever the application does, a shader must only be able to read or write data from the web page’s own domain. Without this guarantee, malicious sites could run shaders that read pixels from other parts of the screen, or even from native applications.



It needs to be well specified. The language specification must state whether every possible string is a valid program. As with all other Web formats, the Web’s shading language must be specified precisely to ensure interoperability between browsers.

It also needs to be well specified so that it can be used as a compilation target. Many rendering teams write shaders in in-house custom languages and then cross-compile them into whichever language is required. For this reason, the language should have a fairly small set of unambiguous syntax and type-checking rules that compiler writers can refer to when emitting it.



It needs to be translatable into Metal Shading Language, HLSL (or DXIL), and SPIR-V. WebGPU is designed to work on top of Metal, Direct3D 12, and Vulkan, so shaders must be representable in a form that each of those APIs accepts.

It needs to be efficient. The whole reason developers want to use GPUs in the first place is performance. The compiler itself needs to run fast, and the programs it generates need to run efficiently on real GPUs.



It needs to evolve together with the WebGPU API. WebGPU features such as the binding model and the tessellation model interact deeply with the shading language. While it is possible to develop a language independently of the API, having the WebGPU API and the shading language in the same forum ensures shared goals and simplifies development.

It needs to be easy for developers to read and write. There are two parts to this. First, the language should feel familiar to both GPU programmers and CPU programmers. GPU programmers are important users because they have experience writing shaders. CPU programmers are important because GPUs are increasingly used for purposes other than rendering, including machine learning, computer vision, and neural networks. For them, the language should be compatible with familiar programming language concepts and syntax.



Second, the language should be human-readable. The culture of the Web is that anyone can start writing a web page with just a text editor and a browser. This democratization of content is one of the Web’s greatest strengths. It has created a rich ecosystem of tools and reviewers, and tinkerers can use View Source to investigate how any web page works. A single, canonical, human-readable language will greatly help the community adopt the WebGPU API.

All the major languages used on the Web today are human-readable, with one exception. The WebAssembly Community Group expected that parsing bytecode would be more efficient than parsing a text language. That turned out not to be the case: asm.js, which is JavaScript source code, is still faster than WebAssembly in many use cases.



Similarly, using a bytecode format such as WebAssembly does not spare browsers from running optimization passes over the code. Every major browser optimizes the bytecode before executing it. Unfortunately, the simpler compiler that was hoped for never materialized.

The community group is still actively discussing whether this human-readable language should be the one the API itself accepts, but the group agrees that the language shader authors write in should be easy to read and write.





A new language? Really?

While there are many existing languages, none was designed with both the Web and modern graphics applications in mind, and none meets the requirements listed above. Before we describe WHLSL, let’s look at some existing languages.



Metal Shading Language is very similar to C++, which means it has all the power of bit casts and raw pointers. It is extremely powerful; you can even compile the same source code for the CPU and the GPU, and porting existing CPU-side code to Metal Shading Language is very easy. Unfortunately, all this power has downsides. For example, in Metal Shading Language you can write a shader that casts a pointer to an integer, adds 17, casts it back to a pointer, and dereferences it. This is a security problem because it means the shader can access any resource that happens to be in the application’s address space, which is contrary to the Web’s security model. In theory, one could specify a dialect of Metal Shading Language without raw pointers, but pointers are so fundamental to C and C++ that the result would be completely unfamiliar. C++ also relies heavily on undefined behavior, so any effort to fully specify each of C++’s many features is unlikely to succeed.

HLSL is the supported language for Direct3D shaders. It is currently the most popular real-time shading language in the world, and therefore the one most familiar to graphics programmers. There are multiple implementations but no formal specification, which makes it difficult to create consistent, interoperable implementations. Nonetheless, given how ubiquitous HLSL is, it is valuable to adopt its syntax wherever possible in the design of WHLSL.



GLSL is the language used by WebGL, which adopted it for the Web platform. However, interoperability across browsers has been extremely difficult to achieve because of incompatibilities between GLSL compilers. GLSL still has long-standing security and portability bugs under investigation. Besides, GLSL is showing its age: it has no pointer-like objects and no variable-length arrays, and its inputs and outputs are global variables with hard-coded names.

SPIR-V was designed as a low-level, general-purpose intermediate format for the actual shading languages that developers use. People don’t write SPIR-V by hand; they write in a human-readable language and then use tools to convert it into SPIR-V bytecode.

There are several challenges to adopting SPIR-V on the Web. First, SPIR-V was not written with security as a first principle, and it is not clear that it can be modified to meet the security requirements of the Web. Forking the SPIR-V language would mean that developers must recompile their shaders and might be forced to rewrite their source code. In addition, browsers still cannot trust incoming bytecode and need a validator to make sure it does nothing unsafe. And since Windows and macOS/iOS do not support Vulkan, the incoming SPIR-V would still need to be translated or compiled into another language. Oddly, this means that on those two platforms the start and end points are both human-readable, while the format in between is obfuscated for no benefit.



Second, SPIR-V contains more than 50 optional capabilities that implementations may or may not support, so shader authors using SPIR-V would not know whether their shader will work on a given WebGPU implementation. This is the opposite of the Web’s write-once, run-everywhere nature.

Third, many graphics applications (such as Babylon.js) need to modify shaders dynamically at run time. With a bytecode format, these applications would have to include a compiler written in JavaScript, running in the browser, to generate bytecode from the dynamically created shaders. This would significantly increase the bloat of these sites and lead to worse performance.

Although JavaScript is the canonical language of the Web, its properties make it a poor candidate for a shading language. One of its strengths is its flexibility, but this dynamism leads to many conditionals and divergent control flow, which GPUs cannot execute efficiently. It is also garbage collected, which is definitely not a good fit for GPU hardware.



WebAssembly is another familiar possibility, but it also doesn’t map well to the architecture of GPUs. For example, WebAssembly assumes a single dynamically sized heap, whereas a GPU program has access to multiple dynamically sized buffers. There is no high-performance way to map between the two models without recompilation.

Therefore, after a fairly exhaustive search for a suitable language, we could not find one that sufficiently meets the requirements of the project, so the community group is creating a new language. Creating a new language is a daunting task, but we saw an opportunity to make something new that uses modern programming language design principles and meets our requirements.



WHLSL

WHLSL is a new shading language that fits the Web platform. It is being developed by the W3C’s WebGPU Community Group, which is working on a specification, a compiler, and a CPU-side interpreter that demonstrates its correctness.

The language is based on HLSL, but simplifies and extends it. We would really like existing HLSL shaders to just work as WHLSL shaders. WHLSL is a powerful and expressive shading language, but some HLSL shaders will need to be tweaked; in return, WHLSL can guarantee the safety and other benefits described above.

For example, here is a sample vertex shader from Microsoft’s DirectX-Graphics-Samples repository. It works as a WHLSL shader without any changes:

VSParticleDrawOut output;
output.pos = g_bufPosVelo[input.id].pos.xyz;
float mag = g_bufPosVelo[input.id].velo.w / 9;
output.color = lerp(float4(1.0f, 0.1f, 0.1f, 1.0f), input.color, mag);
return output;
Here is the associated pixel shader, which runs as a completely unmodified WHLSL shader:

float intensity = 0.5f - length(float2(0.5f, 0.5f) - input.tex);
intensity = clamp(intensity, 0.0f, 0.5f) * 2.0f;
return float4(input.color.xyz, intensity);


Basics

Let’s talk about the language itself.



As in HLSL, the primitive data types are bool, int, uint, float, and half. Doubles are not supported because they do not exist in Metal and software emulation would be too slow. Bools have no specified bit representation and therefore cannot appear in shader inputs or outputs or in resources. The same limitation exists in SPIR-V, and we want to be able to use OpTypeBool in the generated SPIR-V code. WHLSL also includes the smaller integer types char, uchar, short, and ushort, which are available directly in Metal Shading Language, can be expressed in SPIR-V through the width operand of OpTypeInt, and can be emulated in HLSL. Emulating these types is faster than emulating doubles because the types are smaller and their bit representations are less complex.

WHLSL does not provide C-style implicit conversions. We found that implicit conversions are a common source of errors in shaders, and forcing programmers to be explicit about where conversions occur eliminates this often frustrating and mysterious class of bugs. Languages such as Swift take a similar approach. In addition, the absence of implicit conversions keeps the specification and the compiler simple.



As in HLSL, WHLSL has vector and matrix types, such as float4 and int3x4. We chose to keep the standard library simple rather than add a pile of “x1” single-element vectors and matrices, because a single-element vector can already be represented as a scalar, and a single-element matrix as a vector. This is consistent with the decision to eliminate implicit conversions: requiring explicit conversions between float1 and float would be cumbersome and needlessly verbose.

Therefore, the following is a valid snippet of a shader:

int a = 7;
a += 3;
float3 b = float3(float(a) * 5, 6, 7);
float3 c = b.xxy;
float3 d = b * c;
I mentioned earlier that implicit conversions are not allowed, but you may have noticed that the snippet above writes 5 rather than 5.0. This works because literals are represented as a special type that can be unified with other numeric types. When the compiler sees the code above, it knows that the multiplication operator requires arguments of the same type, and that the first argument is clearly a float. So, when the compiler sees float(a) * 5, it reasons: “The first argument is a float, which means I have to use the (float, float) overload, so let’s unify 5 with the second parameter, and 5 becomes a float.” This works even when both arguments are literals, because literals have a preferred type. Therefore, 5 * 5 gets the (int, int) overload, 5u * 5u gets the (uint, uint) overload, and 5.0 * 5.0 gets the (float, float) overload.
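The unification rule described above can be approximated with a small Python model. This is only a sketch: the two tables, the type names, and the function `resolve_same_type_overload` are hypothetical, and which numeric types an integer literal may unify with (other than float, which the text confirms) is an assumption rather than something taken from the WHLSL specification.

```python
# Illustrative model only: both tables are assumptions, not the WHLSL spec.
PREFERRED_TYPE = {
    "int_literal": "int",
    "uint_literal": "uint",
    "float_literal": "float",
}
CAN_UNIFY_WITH = {
    "int_literal": {"int", "uint", "float"},  # assumption: 5 may become any of these
    "uint_literal": {"uint"},                 # assumption
    "float_literal": {"float"},               # assumption
}


def resolve_same_type_overload(left, right):
    """Return the (T, T) overload chosen for a binary operator such as `*`."""
    operands = (left, right)
    literals = [t for t in operands if t in PREFERRED_TYPE]
    concrete = [t for t in operands if t not in PREFERRED_TYPE]

    if concrete:
        # A non-literal operand dictates the type; two of them must agree.
        candidates = [concrete[0]] if len(set(concrete)) == 1 else []
    else:
        # Only literals: fall back to their preferred types, in operand order.
        candidates = [PREFERRED_TYPE[t] for t in literals]

    for target in candidates:
        if all(target in CAN_UNIFY_WITH[lit] for lit in literals):
            return (target, target)
    raise TypeError(f"no same-type overload for ({left}, {right})")
```

With this model, `resolve_same_type_overload("float", "int_literal")` picks the (float, float) overload, mirroring `float(a) * 5`, while two mismatched non-literal operands such as int and float are rejected because nothing is converted implicitly.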



One difference between WHLSL and C is that WHLSL zero-initializes every variable that has no initializer at its declaration site. This prevents non-portable behavior across operating systems and drivers or, even worse, reading whatever values happened to be in memory before the shader started executing. It also means that every constructible type in WHLSL has a zero value.

Enumerations

Because enumerations incur no runtime cost and are very useful, WHLSL supports them natively.

enum Weekday {
   Monday,
   Tuesday,
   Wednesday,
   Thursday,
   PizzaDay
}
The underlying type of an enum defaults to int, but you can override it, for example with enum Weekday : uint. Similarly, enum values can be given explicit underlying values, such as Tuesday = 72. Because enums have a defined type and defined values, they can be used in buffers, and they can be converted between the underlying type and the enum type. To refer to a value in code, you qualify it, as in Weekday.PizzaDay, much like a scoped enumeration in C++. This means that enum values do not pollute the global namespace, and values from different enums do not collide.

Structures

Structures in WHLSL work much like they do in HLSL and C.

struct Foo {
   int x;
   float y;
}
Structures are simple by design: they have no inheritance, virtual methods, or access control. A structure cannot have “private” members. Since structures have no access control, they have no need for member functions; free functions can see every member of every structure.

Arrays



As in other shading languages, arrays are value types that are passed to and returned from functions by value (“copy-in copy-out”, just like regular scalars). You create one with the following syntax:

int[3] x;
Like any variable declaration, this fills the array’s contents with zeros and is therefore an O(n) operation. We chose to place the brackets after the type rather than after the variable name for two reasons:

  1. Having all the type information in one place makes the parser simpler (it avoids the clockwise/spiral rule).
  2. It avoids ambiguity when declaring multiple variables in a single statement (e.g. int[10] x, y;).


One of the key ways we ensure the language is safe is by bounds-checking every array access. We make this potentially expensive operation efficient in several ways. Array indices are uint, which reduces the check to a single comparison. Arrays have no sparse implementation, and they contain a length member that is available at compile time, which brings the cost of accessing it close to zero.
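To see why an unsigned index needs only one comparison, consider the following Python sketch. The helper names `as_uint32` and `checked_load` are hypothetical, a 32-bit uint is assumed, and raising `IndexError` merely stands in for whatever the real implementation does on an out-of-bounds access.

```python
UINT32_MASK = 0xFFFFFFFF


def as_uint32(value):
    """Reinterpret a Python int as a 32-bit unsigned value, as a uint index would be."""
    return value & UINT32_MASK


def checked_load(array, index):
    """Bounds-checked read that needs only one comparison."""
    i = as_uint32(index)
    if i < len(array):  # the single comparison: no separate `i >= 0` test is needed
        return array[i]
    raise IndexError(f"index {i} out of bounds for length {len(array)}")
```

A "negative" index wraps around to a very large unsigned value, so the same `i < length` test rejects both negative and too-large indices.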

Because arrays are value types, WHLSL implements reference semantics with two other types: safe pointers and array references.

Safe pointers

The first of these is the safe pointer. Some form of reference semantics (the behavior that pointers provide) is used in almost every CPU-side programming language. Including pointers in WHLSL makes it easier for developers to migrate existing CPU-side code to the GPU, which in turn makes it easy to port things like machine learning, computer vision, and signal processing applications.



To meet the safety requirements, WHLSL uses safe pointers, which are guaranteed either to point to something valid or to be null. As in C, you can use the & operator to create a pointer to an lvalue and the * operator to dereference it. Unlike in C, you cannot index through a pointer as if it were an array, you cannot cast it to or from a scalar value, and it has no specified bit-pattern representation. It therefore cannot exist in a buffer or appear as a shader input or output.

Just as in OpenCL and Metal Shading Language, the GPU has different heaps, or address spaces, in which values can live. WHLSL has four address spaces: device, constant, threadgroup, and thread. Every reference type must be tagged with the address space it points into.



The device address space corresponds to most of the memory on the device. This memory is readable and writable, and it corresponds to unordered access views in Direct3D and device memory in Metal Shading Language. The constant address space corresponds to a read-only region of memory that is typically optimized for data broadcast to every thread. Writing to an lvalue that lives in the constant address space is therefore a compile error. Finally, the threadgroup address space corresponds to a readable and writable region of memory that is shared among the threads in a threadgroup. It can only be used in compute shaders.

By default, values exist in the thread address space:

int i = 4;
thread int* j = &i;
*j = 7;
// i is now 7
Since all variables are zero-initialized, pointers are initialized to null. Therefore, the following is valid:

thread int* i;
Attempting to dereference this pointer will result in a trap or clamp, as described later.

Array references

Array references are similar to pointers, but they can be used with the subscript operator to access multiple elements. While the length of an array is known at compile time and must be stated in its type declaration, the length of an array reference is known only at run time. Just like pointers, array references must be associated with an address space, and they may be null. Just like arrays, they are indexed with a uint so that bounds checking takes a single comparison, and they cannot be sparse.



They correspond to the OpTypeRuntimeArray type in SPIR-V and to one of Buffer, RWBuffer, StructuredBuffer, or RWStructuredBuffer in HLSL. In Metal, an array reference is represented as a tuple of a pointer and a length. Just like array accesses, all operations are checked against the length of the array reference. Buffers are passed to an entry point through the API as either array references or pointers.

You can use the @ operator to make an array reference from an lvalue:

int i = 4;
thread int[] j = @i;
j[0] = 7;
// i is 7
// j.length is 1
As you might expect, using @ on a pointer j creates an array reference that points to the same thing as j:

int i = 4;
thread int* j = &i;
thread int[] k = @j;
k[0] = 7;
// i is 7
// k.length is 1
Use @ on an array to make an array reference to that array:

int[3] i = int[3](4, 5, 6);
thread int[] j = @i;
j[1] = 7;
// i[1] is 7
// j.length is 3
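The behavior in the snippets above can be modelled in Python as a view over shared storage plus a length that is only known at run time. This is a sketch under stated assumptions: the class `ArrayRef` and the helper `make_array_ref` (a stand-in for the @ operator) are hypothetical names, Python lists play the role of thread-address-space storage, and a scalar variable is modelled as a one-element list.

```python
class ArrayRef:
    """Hypothetical model of an array reference: shared storage plus a run-time length."""

    def __init__(self, storage, length):
        self.storage = storage
        self.length = length

    def _check(self, index):
        # Python ints are signed, so this model tests both ends explicitly.
        if not 0 <= index < self.length:
            raise IndexError(f"index {index} out of bounds for length {self.length}")

    def __getitem__(self, index):
        self._check(index)
        return self.storage[index]

    def __setitem__(self, index, value):
        self._check(index)
        self.storage[index] = value  # writes through to the referenced variable


def make_array_ref(storage):
    """Stand-in for `@`: the reference aliases the storage instead of copying it."""
    return ArrayRef(storage, len(storage))
```

For example, after `i = [4, 5, 6]` and `j = make_array_ref(i)`, the assignment `j[1] = 7` changes `i[1]` as well, and `j.length` is 3, matching the last WHLSL snippet.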

Functions

Functions look very similar to their C counterparts. For example, here is a function from the standard library:

float4 lit(float n_dot_l, float n_dot_h, float m) {
   float ambient = 1;
   float diffuse = max(0, n_dot_l);
   float specular = n_dot_l < 0 || n_dot_h < 0 ? 0 : n_dot_h * m;
   float4 result;
   result.x = ambient;
   result.y = diffuse;
   result.z = specular;
   result.w = 1;
   return result;
}
This example shows how WHLSL functions resemble C functions: function declarations and calls (for example, the call to max()) have similar syntax, parameters and arguments are matched pairwise in order, and ternary expressions are supported.



Operators and operator overloading

But something else is going on here, too. When the compiler sees n_dot_h * m, it does not intrinsically know how to perform that multiplication. Instead, the compiler turns it into a call to operator*(). The standard function overload resolution algorithm then selects the specific operator to execute. This matters because it means you can write your own operator*() function and teach WHLSL how to multiply your own types.

This even applies to operations like ++. Although pre-increment and post-increment behave differently, they are both mapped to the same function: operator++(). Here is an example from the standard library:

int operator++(int value) {
   return value + 1;
}
This operator is called for both pre-increment and post-increment, and the compiler is smart enough to do the right thing with the result. This avoids the C++ design in which these are two distinct operators, distinguished by an extra dummy int parameter. For post-increment, the compiler emits code that saves the value to an anonymous variable, calls operator++(), assigns the result, and uses the saved value for subsequent processing.
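The lowering described above can be sketched in Python. The function names are hypothetical, and a dictionary of variables stands in for the shader’s local state; the point is only that one `operator++` implementation serves both forms.

```python
def operator_increment(value):
    """Python stand-in for `int operator++(int value)` from the text."""
    return value + 1


def lower_pre_increment(variables, name):
    """++x: call the operator, assign the result, and yield the new value."""
    variables[name] = operator_increment(variables[name])
    return variables[name]


def lower_post_increment(variables, name):
    """x++: save the old value, call the operator, assign, and yield the saved value."""
    saved = variables[name]  # the anonymous temporary
    variables[name] = operator_increment(saved)
    return saved
```

Both helpers leave the variable incremented by one; they differ only in which value the surrounding expression observes.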



Operator overloading is used throughout the language. It is how vector and matrix multiplication is implemented. It is how arrays are indexed. It is how swizzle operators work. Operator overloading provides both power and simplicity: the core language does not need to know about each operation directly, because they are implemented by overloaded operators.

Generated properties

But WHLSL doesn’t stop at operator overloading. An earlier example included b.xxy, where b is a float3. This expression means “make a 3-element vector whose first two elements have the same value as b.x and whose third element has the same value as b.y.” It behaves a bit like a member of the vector, except that it is not associated with any storage; instead, it is computed on access. These “swizzle operators” exist in every real-time shading language, and WHLSL is no exception. WHLSL supports them by treating them as generated properties, as in Swift.

Getters

The standard library contains many functions of the form:

float3 operator.xxy(float3 v) {
   float3 result;
   result.x = v.x;
   result.y = v.x;
   result.z = v.y;
   return result;
}
When the compiler sees a property access to a member that does not exist, it can call the operator, passing the object as the first argument. Informally, we call this a getter.

Setters

The same approach even works for setters:

float4 operator.xyz=(float4 v, float3 c) {
   float4 result = v;
   result.x = c.x;
   result.y = c.y;
   result.z = c.z;
   return result;
}
Using setters is very natural:

float4 a = float4(1, 2, 3, 4);
a.xyz = float3(7, 8, 9);
The implementation of the setter uses the new data to create a copy of the object. When the compiler sees an assignment to the generated property, it calls the setter and assigns the result to the original variable.
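As a rough illustration of how the compiler rewrites generated-property accesses into plain function calls, here is a Python sketch in which tuples stand in for vectors. The names `get_xxy`, `set_xyz`, and `run_example` are hypothetical; they mirror the getter and setter operators shown earlier rather than any real compiler API.

```python
def get_xxy(v):
    """Stand-in for `float3 operator.xxy(float3 v)`."""
    return (v[0], v[0], v[1])


def set_xyz(v, c):
    """Stand-in for `float4 operator.xyz=(float4 v, float3 c)`: returns a modified copy."""
    return (c[0], c[1], c[2], v[3])


def run_example():
    b = (5.0, 6.0, 7.0)
    c = get_xxy(b)                   # models: float3 c = b.xxy;
    a = (1.0, 2.0, 3.0, 4.0)
    a = set_xyz(a, (7.0, 8.0, 9.0))  # models: a.xyz = float3(7, 8, 9);
    return c, a
```

Note that the setter never mutates its argument: the assignment back to `a` is the step the compiler inserts after calling the setter.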

Anders

A generalization of getters and setters is the ander, which works with pointers. It exists as a performance optimization, so that setters do not have to create a copy of the object. Here’s an example:

thread float* operator&.r(thread Foo* value) {
   return &value->x;
}
Anders are more powerful than getters or setters because the compiler can use an ander to implement both reads and assignments. When reading from a generated property via an ander, the compiler calls the ander and then dereferences the result. When writing, the compiler calls the ander, dereferences the result, and assigns to it. Any user-defined type can have any combination of getters, setters, anders, and indexers; if the same type has both an ander and a getter or setter, the compiler prefers the ander.
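The read and write paths through an ander can be sketched in Python. Python has no pointers, so the hypothetical class `FieldPointer` plays the role of a safe pointer to one field; `ander_r`, `read_r`, and `write_r` are likewise illustrative names.

```python
class FieldPointer:
    """Hypothetical stand-in for a safe pointer to one field of an object."""

    def __init__(self, target, field):
        self.target = target
        self.field = field

    def load(self):
        return getattr(self.target, self.field)

    def store(self, value):
        setattr(self.target, self.field, value)


class Foo:
    def __init__(self):
        self.x = 0    # zero-initialized, as WHLSL variables are
        self.y = 0.0


def ander_r(foo):
    """Models the ander from the text: returns a pointer to foo.x."""
    return FieldPointer(foo, "x")


def read_r(foo):
    return ander_r(foo).load()   # reading foo.r: call the ander, then dereference


def write_r(foo, value):
    ander_r(foo).store(value)    # writing foo.r: call the ander, then assign through it
```

Unlike the setter, the write path modifies the object in place, so no copy of `Foo` is created.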

Indexers

But what about matrices? In most real-time shading languages, matrices are not accessed through members corresponding to their columns or rows. Instead, they are accessed using array syntax, such as myMatrix[3]. Vector types usually support this syntax as well. So how does that work? More operator overloading!

float operator[](float2 v, uint index) {
   switch (index) {
       case 0:
           return v.x;
       case 1:
           return v.y;
       default:
           /* trap or clamp, more on this below */
   }
}

float2 operator[]=(float2 v, uint index, float a) {
   switch (index) {
       case 0:
           v.x = a;
           break;
       case 1:
           v.y = a;
           break;
       default:
           /* trap or clamp, more on this below */
   }
   return v;
}
As you can see, indexing also uses operators, so it can be overloaded. Vectors get these “indexers” too, so myVector.x and myVector[0] are synonyms.

The standard library

We designed the standard library based on the Microsoft Docs pages that describe the HLSL standard library. The WHLSL standard library consists primarily of math operations, which work on scalar values and element-wise on vectors and matrices. It defines all the standard operators you would expect, including logical and bitwise operations such as operator*() and operator<<(). It also defines all swizzle operators, getters, and setters for vectors and matrices, where applicable.



One of the design principles of WHLSL is to keep the language itself small so that as much as possible can be defined in the standard library. Of course, not every function in the standard library can be expressed in WHLSL (for example, float operator*(float, float)), but almost all of them are implemented in WHLSL. For example, this function is part of the standard library:

float smoothstep(float edge0, float edge1, float x) {
   float t = clamp((x - edge0) / (edge1 - edge0), 0, 1);
   return t * t * (3 - 2 * t);
}
Because the standard library is designed to match HLSL as closely as possible, most of its functions already exist directly in HLSL. A compiler that targets HLSL can therefore omit these functions from the WHLSL standard library and use the built-in versions instead. This happens, for example, with all the vector and matrix indexers: the GPU never actually sees the code above, because the code generation step in the compiler replaces it with the built-in equivalent. However, different shading languages have different built-in functions, so every function is still defined in WHLSL to allow correctness testing. Similarly, WHLSL includes a CPU-side interpreter that uses the WHLSL implementations of these functions when executing WHLSL programs. This is true for every WHLSL function, including the texture sampling functions.
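One way to picture the correctness testing mentioned above is to port a reference definition to the CPU and compare it against a target’s built-in. The Python sketch below ports the smoothstep definition shown earlier; the helper `matches_reference`, its tolerance, and the idea of passing the built-in as an arbitrary callable are assumptions made for illustration, not a description of the actual test suite.

```python
def clamp(value, low, high):
    return max(low, min(value, high))


def smoothstep(edge0, edge1, x):
    """Line-by-line port of the WHLSL smoothstep shown above."""
    t = clamp((x - edge0) / (edge1 - edge0), 0.0, 1.0)
    return t * t * (3 - 2 * t)


def matches_reference(builtin, samples, tolerance=1e-6):
    """Compare a target's built-in (any callable) against the reference port."""
    return all(abs(builtin(*s) - smoothstep(*s)) <= tolerance for s in samples)
```

A harness could then call `matches_reference` with a wrapper around each backend’s native smoothstep and a list of `(edge0, edge1, x)` samples.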



Not every function in the HLSL standard library exists in WHLSL. For example, HLSL supports printf(), but implementing such a function in Metal Shading Language or SPIR-V would be very difficult. We included as many functions from the HLSL standard library as is reasonable in a Web environment.

Variable lifetime

But if the language has pointers, how do we deal with use-after-free problems? For example, consider the following snippet:

thread int* foo() {
   int a;
   return &a;
}
...
int b = *foo();
In a language like C, this code has undefined behavior. One solution would be for WHLSL to simply forbid this kind of construct and raise a compile error when it sees one. However, that would require tracking which values each pointer may point to, which is a difficult analysis in the presence of loops and function calls. Instead, WHLSL makes every variable behave as if it had a global lifetime.



This means that this WHLSL snippet is completely valid and well defined, for two reasons:

  1. Declaring a without an initializer fills it with zero, so the value of a is well defined. This zero-filling happens every time foo() is called.
  2. All variables have a global lifetime (similar to the static keyword in C), so a never goes out of scope.



This global lifetime is only possible because recursion is not allowed (which is common in shading languages), which means there can be no reentrancy problems. Similarly, shaders cannot allocate or free memory, so the compiler knows at compile time every piece of memory a shader could possibly access.

So, for example:

thread int* foo() {
   int a;
   return &a;
}
...
thread int* x = foo();
*x = 7;
thread int* y = foo();
// *x equals 0, because the variable got zero-filled again
*y = 8;
// *x equals 8, because x and y point to the same variable
Most variables do not need to be truly global, so this has little effect on performance. If the compiler can prove that it is unobservable whether a particular variable actually has a global lifetime, it is free to keep the variable local. Because returning a pointer to a local variable is already discouraged in other languages (and many other shading languages don’t even have pointers), examples like this one should be relatively rare.
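The global-lifetime semantics can be reproduced in a few lines of Python. In this sketch the class `Cell` and the module-level `_FOO_A` are hypothetical: a `Cell` models one storage location, and a reference to it models a pointer. The function `foo` mirrors the WHLSL function of the same name.

```python
class Cell:
    """One storage location; a "pointer" is simply a reference to a Cell."""

    def __init__(self):
        self.value = 0


_FOO_A = Cell()  # storage for foo's local `a`, allocated once for the whole program


def foo():
    _FOO_A.value = 0  # `int a;` zero-fills the variable on every call
    return _FOO_A     # `return &a;`


def run_example():
    observed = []
    x = foo()
    x.value = 7
    y = foo()
    observed.append(x.value)  # the second call zero-filled the shared variable
    y.value = 8
    observed.append(x.value)  # x and y point to the same variable
    return observed
```

Because recursion is disallowed, one storage slot per local variable is always enough, which is what makes this model faithful.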

Compilation stages

WHLSL does not use a preprocessor as other languages do. In other languages, the primary purpose of the preprocessor is to include multiple source files together. On the Web, however, there is no direct file access, and the entire shader typically arrives in a single downloaded resource. In many shading languages, the preprocessor is also used to conditionally enable rendering features in large ubershaders, but WHLSL supports this use case with specialization constants. In addition, the many variants of preprocessors are incompatible in subtle ways, so the benefit of a preprocessor for WHLSL does not outweigh the complexity of creating a specification for it.



WHLSL is designed for two-stage compilation. In our research, we found that many 3D engines compile large collections of shaders, and that each compilation includes a large library of functions that are duplicated across compilations. Rather than compiling these support functions many times, a better solution is to compile the entire library once and then let the second stage select which entry points in the library should be used together.

This two-phase compilation means that as much of the compilation as possible should be done in the first phase, so that it is not repeated for each member of a family of related shaders. This is why entry points in WHLSL are marked as vertex, fragment, or compute. Letting the first phase of compilation know which functions are entry points, and of which type, allows more of the compilation to take place in the first phase rather than the second.



The second compilation phase also provides a convenient place to specify specialization constants. Recall that WHLSL does not have a preprocessor, which is the traditional way to enable and disable features in HLSL. Engines typically customize a single shader for a particular situation by turning on a rendering effect or swapping out a BRDF with the flip of a switch. The technique of including every rendering option in a single shader, and specializing that shader based on which effects are enabled, is so common that it has a name: ubershaders. Instead of preprocessor macros, WHLSL programmers can use specialization constants, which work in the same way as specialization constants in SPIR-V. From the language’s point of view, they are just scalar constants. However, the values of these constants are supplied during the second compilation phase, which makes it very cheap to configure a program at run time.
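The division of work between the two phases can be sketched as follows. This is a simplified Python model, not the WebGPU or WHLSL compiler API: compile_library, specialize, EntryPoint and the constant name numlights are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class EntryPoint:
    stage: str                                        # "vertex", "fragment", or "compute"
    body: Callable[[Dict[str, int], float], float]


def compile_library() -> Dict[str, EntryPoint]:
    """Phase 1 (hypothetical): process the whole library once."""

    def shade(constants: Dict[str, int], base: float) -> float:
        # `numlights` stands in for a specialization constant: an ordinary
        # scalar inside the shader whose value only arrives in phase 2.
        return base * constants["numlights"]

    return {"fragmentMain": EntryPoint(stage="fragment", body=shade)}


def specialize(library: Dict[str, EntryPoint], name: str,
               constants: Dict[str, int]) -> Callable[[float], float]:
    """Phase 2 (hypothetical): pick an entry point and bind its constants."""
    entry = library[name]
    return lambda base: entry.body(constants, base)


library = compile_library()                           # done once
two_lights = specialize(library, "fragmentMain", {"numlights": 2})
four_lights = specialize(library, "fragmentMain", {"numlights": 4})
```

Both specialized shaders reuse the result of the single phase 1 run; only the cheap binding step is repeated per variant.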

Because a single WHLSL program can contain multiple shaders, the input and output of a shader are not represented by global variables as in other shader languages. Instead, the inputs and outputs of a particular shader are associated with the shader itself. The input is represented as the parameters of the shader entry point, and the output as the return value of the entry point.

The following shows how to describe compute shader entry points:

compute void ComputeKernel(device uint[] b : register(u0)) {
    ...
}

security

WHLSL is a safe language. This means that it is impossible to access information outside of the website’s origin. One of the ways WHLSL achieves this is by eliminating undefined behavior, as described above regarding uniformity.



Another way WHLSL achieves safety is by performing bounds checks on array and pointer accesses. These bounds checks may take one of three forms:

1. Trapping. When a trap occurs in the program, the shader stage exits immediately, and all of that shader stage’s outputs are filled with zeros. The draw call continues, and the next stage of the graphics pipeline runs.

Because trapping introduces new control flow, it affects the uniformity of the program. Traps are emitted inside a bounds check, which means that they necessarily sit in non-uniform control flow. This may be fine for some programs that don’t rely on uniformity, but in general it makes traps difficult to use.



2. Clamping. An array index operation can clamp the index to the size of the array. There is no new control flow involved, so it has no effect on uniformity. You can even “clamp” pointer accesses or zero-length array accesses by ignoring writes and returning 0 for reads. This is possible because the things you can do with a pointer in WHLSL are limited, so we can simply have each of those operations do something well defined with a “clamped” pointer.

3. Hardware and driver support. Some hardware and drivers already include a mode in which out-of-bounds accesses cannot occur. With this approach, the mechanism by which the hardware prohibits out-of-bounds accesses is implementation-defined. An example is the ARB_robustness OpenGL extension. Unfortunately, WHLSL should run on almost all modern hardware, and there are not enough APIs and devices that support these modes.

Whichever method the compiler uses, it must not affect the uniformity of the shader; in other words, it cannot turn an otherwise valid program into an invalid one.
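The first two strategies can be contrasted in a small model. The Python sketch below is an assumption-laden illustration rather than compiler output: the function names are hypothetical, a Python exception stands in for a trap, and Python branches stand in for what a real implementation would do without new control flow in the clamping case.

```python
def clamped_read(array, index):
    """Read with the index clamped into range.

    A zero-length array has no valid index, so the read returns 0.
    """
    if len(array) == 0:
        return 0
    return array[min(max(index, 0), len(array) - 1)]


def clamped_write(array, index, value):
    """Write with the index clamped; writes to a zero-length array are ignored."""
    if len(array) == 0:
        return
    array[min(max(index, 0), len(array) - 1)] = value


class Trap(Exception):
    """Signals that the shader stage must exit immediately."""


def trapping_read(array, index):
    if index < 0 or index >= len(array):
        raise Trap()
    return array[index]


def run_stage(stage, output_count):
    """Run a stage; on a trap, every output of that stage is filled with 0."""
    try:
        return stage()
    except Trap:
        return [0] * output_count


data = [10, 20, 30]
beyond_end = clamped_read(data, 7)          # index clamped to the last element
clamped_write(data, 99, 5)                  # also lands on the last element
empty_read = clamped_read([], 3)            # zero-length array reads as 0
trapped = run_stage(
    lambda: [trapping_read(data, 1), trapping_read(data, 9)], 2)
in_bounds = run_stage(
    lambda: [trapping_read(data, 0), trapping_read(data, 1)], 2)
```

With clamping, every access still completes with a well-defined result; with trapping, a single out-of-bounds access discards all outputs of the stage.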



To determine the best behavior for bounds checks, we ran some performance experiments. We took some of the kernels used in the Metal Performance Shaders framework and created two new versions: one using clamping and one using trapping. The kernels we chose were those that do a lot of array accesses: multiplying large matrices, for example. We ran this benchmark on a variety of devices with different data sizes. We made sure that no trap was actually hit, and that no clamp actually had any effect, so we could be sure that we were measuring the common case of properly written programs.

We expected trapping to be faster because downstream compilers can eliminate redundant traps. But we found no clear winner. On some devices trapping was significantly faster than clamping, and on others clamping was significantly faster than trapping. These results suggest that the compiler should be able to choose whichever approach is best for the particular device it runs on, rather than being forced to always use one approach.

Shader signatures

WHLSL supports a language feature of HLSL called “semantics.” They are used to identify variables between shader stages and the WebGPU API. There are four types of semantics:

  • Built-in variables, such as uint vertexID : SV_VertexID
  • Specialization constants, such as uint numlights : specialized
  • Stage input/output semantics, such as float2 coordinate : attribute(0)
  • Resource semantics, such as device float[] coordinates : register(u0)

As mentioned above, WHLSL programs accept their inputs and outputs as function parameters, not global variables.

However, shaders typically have multiple outputs. The most common example is when a vertex shader passes multiple output values to an interpolator to provide as input to a fragment shader.



To accommodate this, the shader’s return value can be a structure, and its fields are handled independently. In fact, this works recursively: a structure can contain another structure whose members are also handled independently. Nested structures are flattened, and all non-structure fields are collected and treated as the shader’s outputs.

Shader parameters work the same way. A single parameter can be a shader input, or it can be a structure containing a collection of shader inputs. Structures can also contain other structures. Variables in these structures are handled independently, as if they were additional parameters to the shader.



After flattening all of these structures into a set of inputs and a set of outputs, each item in each set must have a semantic. Each built-in variable must have a specific type and can only be used in a specific shader stage. Specialization constants must have simple scalar types.
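The recursive flattening described above can be sketched in a few lines. This Python model is an illustration under assumptions, not the WHLSL compiler’s algorithm: the structure names and the semantics mentioned in the comments are hypothetical, and dataclasses stand in for WHLSL structures.

```python
from dataclasses import dataclass, fields, is_dataclass


@dataclass
class Position:
    clip: tuple        # would carry a built-in semantic in WHLSL


@dataclass
class Surface:
    uv: tuple          # would carry attribute(0)
    normal: tuple      # would carry attribute(1)


@dataclass
class VertexOutput:
    position: Position
    surface: Surface


def flatten(value, prefix=""):
    """Recursively collect every non-structure field as (path, value)."""
    if not is_dataclass(value):
        return [(prefix, value)]
    leaves = []
    for field in fields(value):
        path = f"{prefix}.{field.name}" if prefix else field.name
        leaves.extend(flatten(getattr(value, field.name), path))
    return leaves


out = VertexOutput(Position((0.0, 0.0, 0.0, 1.0)),
                   Surface((0.5, 0.5), (0.0, 0.0, 1.0)))
flat = flatten(out)
flat_paths = [path for path, _ in flat]
```

Here flat_paths lists the three leaf fields (position.clip, surface.uv, surface.normal); in WHLSL each of these leaves is what must carry a semantic.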

Stage input/output variables have attribute semantics rather than traditional HLSL semantics, because many shaders pass data that does not match the default semantics provided by HLSL. In HLSL, it is common to pack generic data into the COLOR semantic because COLOR is a float4 and the data happens to fit in a float4. Instead, the approach of SPIR-V and Metal Shading Language (via [[user(n)]]) is to assign an identifier to each stage input/output variable and use those identifiers to match variables between shader stages.



HLSL programmers should be familiar with resource semantics. WHLSL includes both resource semantics and address spaces, but the two serve different purposes. The address space of a variable determines which cache and memory hierarchy it should be accessed within. The address space is necessary because it persists even through pointer operations; a device pointer cannot be set to point to a thread variable. In WHLSL, resource semantics are used only to identify variables from within the WebGPU API. However, for consistency with HLSL, a resource semantic must “match” the address space of the variable it is placed on. For example, you cannot place register(s0) on a texture, and you cannot place register(u0) on a constant resource. Arrays in WHLSL have no address space (because they are value types, not reference types), so if an array appears as a shader parameter, it is treated as a device resource for the purpose of matching semantics.

Just like Direct3D, WebGPU has a two-level binding model. Resource descriptors are aggregated into sets, and sets can be switched out in the WebGPU API. WHLSL matches HLSL by modeling this with an optional space parameter inside the resource semantic: register(u0, space1).



Logical mode restrictions

WHLSL’s design requirements require it to be compilable to Metal Shading Language, SPIR-V, and HLSL (or DXIL). SPIR-V has many different operating modes, targeted by different embedding APIs. Specifically, we are interested in the flavor of SPIR-V that Vulkan targets.

This flavor of SPIR-V is called logical addressing mode. In SPIR-V logical mode, variables cannot have pointer type. Similarly, pointers cannot be used in Phi operations. The result is that each pointer must always point to exactly one thing; a pointer is just a name for a value.



Because WHLSL needs to be compilable to SPIR-V, WHLSL cannot be more expressive than SPIR-V. As a result, WHLSL has some restrictions that make it expressible in SPIR-V logical mode. These restrictions are not surfaced as an optional mode of WHLSL; rather, they are part of the language itself. Eventually, we hope to lift these restrictions in a future version of the language, but until then, the language is constrained.

These limitations are:

  • Pointers and array references must not appear in device, constant, or threadgroup memory
  • Pointers and array references must not appear inside an array or an array reference
  • Pointers and array references must not be assigned outside their initializer (in their declaration)
  • Functions that return a pointer or an array reference must have only a single point of return
  • Ternary expressions must not produce pointers

With these restrictions, the compiler knows exactly what each pointer points to.



But not so fast! Recall that thread variables have a global lifetime, which means they behave as if they were declared at the beginning of the entry point. What if all of these local variables were gathered together, sorted by type, and all variables of the same type were aggregated into an array? A pointer could then simply be an offset into the appropriate array. In WHLSL, pointers cannot be recast to point to a different type, which means that the compiler can determine the appropriate array statically. Therefore, thread pointers do not need to comply with the above restrictions. However, this technique does not work for pointers in other address spaces; it only applies to thread pointers.
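The idea of representing a thread pointer as an offset into a per-type array can be sketched as follows. This Python model is hypothetical and only illustrates the technique described above: the class ThreadMemory and its methods are invented names, and a (type name, offset) tuple stands in for the pointer.

```python
from collections import defaultdict


class ThreadMemory:
    """Hypothetical model: all thread variables grouped into per-type arrays."""

    def __init__(self):
        self.arrays = defaultdict(list)   # type name -> list of values

    def declare(self, type_name, initial=0):
        """Reserve a slot and return a pointer as (type name, offset)."""
        slots = self.arrays[type_name]
        slots.append(initial)
        return (type_name, len(slots) - 1)

    def load(self, pointer):
        type_name, offset = pointer
        return self.arrays[type_name][offset]

    def store(self, pointer, value):
        type_name, offset = pointer
        self.arrays[type_name][offset] = value


memory = ThreadMemory()
a = memory.declare("int")
b = memory.declare("int")
c = memory.declare("float", 0.0)

# Unlike a pointer in logical mode, `p` may be re-pointed at another
# variable: only the offset changes, while the array (the type) stays fixed.
p = a
memory.store(p, 7)
p = b
memory.store(p, 8)
```

Because the type part of the pointer never changes, the array being indexed is known statically, and only the integer offset needs to exist at run time.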

resources

WHLSL supports textures, samplers, and array references for buffers. Just as in HLSL, texture types in WHLSL look like Texture2D<float4>. The presence of these angle brackets does not imply templates or generics; the language doesn’t have those facilities (for simplicity’s sake). The only types allowed to use them are a limited set of built-in types. This design is a middle ground: it keeps these types as they appear in HLSL, while leaving the Community Group free to use the angle bracket characters as the language develops further.



Depth textures are distinct from non-depth textures because they are distinct types in Metal Shading Language, so the compiler needs to know which one to emit when emitting Metal Shading Language. Because WHLSL does not support member functions, texture sampling is not written as texture.Sample(…); instead, it uses free functions such as Sample(texture,…).

Samplers are not specialized; there is one sampler type for all use cases. You can use this sampler with both depth textures and non-depth textures. Depth textures support operations such as comparison in the sampler. If a sampler is configured to include a depth comparison and it is used with a non-depth texture, the depth comparison is ignored.



The WebGPU API will automatically issue some resource barriers at certain locations, which means that the API needs to know which resources are used in a shader. Therefore, a “bindless” resource model cannot be used. This means that all resources are listed as explicit inputs to the shader. Similarly, the API wants to know which resources are used for reading and which are used for writing; the compiler knows this statically by inspecting the program. There is no language-level support for “const”, nor any distinction between StructuredBuffer and RWStructuredBuffer, because that information is already present in the program.



The current progress

The WebGPU Community Group is working on a formal language specification written in OTT that describes WHLSL with the same level of rigor that other Web languages have adopted. We are also working on a compiler that can generate Metal Shading Language, SPIR-V and HLSL. In addition, the compiler includes a CPU-side interpreter to show the correctness of the implementation. Please try it!





The future direction

WHLSL is still in its infancy and has a long way to go before the language design is complete. We’d love to hear your comments, concerns, and use cases! Feel free to share your thoughts and ideas in our GitHub repository!



For this first proposal, we wanted to satisfy the constraints outlined at the beginning of this article while leaving ample opportunity to extend the language. One natural evolution of the language could add facilities for type abstraction, such as protocols or interfaces. WHLSL contains simple structures with no access control or inheritance. Other shading languages such as Slang model type abstraction as a set of methods that must exist within a structure. However, Slang runs into the problem that an existing type cannot be made to conform to a new interface. Once a structure is defined, you cannot add new methods to it; the curly braces have closed the structure forever. This problem is solved by extensions, similar to those in Objective-C or Swift, which retroactively add methods to a structure after it has been defined. Java solves this problem by encouraging authors to add new classes (called adapters) that exist only to implement an interface and forward each interface call to the implementing type.

The WHLSL approach is much simpler; by using free functions instead of structure methods, we can use a system like Haskell’s type classes. Here, a type class defines a set of arbitrary functions that must exist, and a type conforms to the type class by implementing them. Such a solution may be added to the language in the future.





conclusion

This article describes a new shading language called WHLSL, owned by the W3C’s WebGPU Community Group. Its familiar HLSL-based syntax, safety guarantees, and simple, extensible design meet the language’s goals. As such, it represents the best-supported way to write shaders for use with the WebGPU API. However, the WebGPU Community Group is unsure whether WHLSL programs should be provided directly to the WebGPU API, or whether they should be compiled into an intermediate form before being delivered to the API. Either way, WebGPU programmers should write in WHLSL, as it is best suited to the API.



Please join us! We’re doing this work in the WebGPU GitHub project. We’ve been working on a formal specification of the language, a reference compiler that emits Metal Shading Language and SPIR-V, and a CPU-side interpreter for verifying correctness. We welcome you to give it a try and let us know how it works!

For more information, you can contact me at [email protected] or @litherum, or you can contact our evangelist Jonathan Davis.



English text: https://webkit.org/blog/8482/web-high-level-shading-language/
