Motivation
A long time ago, I heard that WebGL colleagues at the company were investigating GPU compressed textures. I had also done some research earlier and found that the basis_universal tool can quickly transcode UASTC and ETC1S into the compressed texture format supported by the target platform. However, because its WASM and loader JS are large, it was not adopted. Later I found a lighter transcoder implementation and wanted to take advantage of it.
Exploration
basis-universal-transcoders is written by the KhronosGroup in AssemblyScript. Compared with the 220+ KB WASM of Basis, it is very lightweight, but it supports only three transcode target formats, and development is not very active.
Later, I learned that LayaAir's approach to compressed textures is simple and crude: iOS uses PVRTC, Android uses ETC1, and everything else uses PNG/JPG. Having implemented hdr-prefilter-texture earlier, I realized the same idea can be applied to compressed textures:
Everything the runtime would otherwise have to process can be preprocessed, so the runtime only needs to load the preprocessed output.
Hence this GPU compressed texture extension: it stores the output of the Basis transcoder, and the runtime downloads the preprocessed variant that matches the formats the device supports.
Prerequisites
GLTF structure
Since the goal is a GLTF extension, you need to understand the GLTF format.
extensionsUsed tells the parser which extensions are needed to parse the GLTF. The other top-level arrays work like relational tables, except that records reference each other by array index. For example:
- scenes[i].nodes[j] points to nodes[i]
- nodes[i].mesh points to meshes[i]
- meshes[i].primitives[j].material points to materials[i]
- materials[i].normalTexture.index points to textures[i]
- textures[i].source points to images[i]
- images[i].uri points to an image file, or images[i].bufferView points to bufferViews[i]
- bufferViews[i].buffer points to buffers[i]
- buffers[i].uri points to the binary file
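The index chain can be followed in a few lines of code. The sketch below is only an illustration on a hand-written GLTF JSON object: the helper name resolveNormalTextureSource and the sample asset are hypothetical and are not part of GLTFLoader.

```javascript
// Hypothetical helper: follow material -> texture -> image -> (uri | bufferView -> buffer).
function resolveNormalTextureSource(json, materialIndex) {
  const material = json.materials[materialIndex];
  if (!material.normalTexture) return null;

  // normalTexture.index points into textures[]
  const texture = json.textures[material.normalTexture.index];
  // textures[i].source points into images[]
  const image = json.images[texture.source];

  // An image is either referenced by uri or embedded through a bufferView.
  if (image.uri !== undefined) return { kind: 'uri', uri: image.uri };

  const bufferView = json.bufferViews[image.bufferView];
  const buffer = json.buffers[bufferView.buffer];
  return {
    kind: 'bufferView',
    uri: buffer.uri,
    byteOffset: bufferView.byteOffset || 0,
    byteLength: bufferView.byteLength,
  };
}

// Made-up sample asset, only for illustration.
const sampleGltf = {
  materials: [{ normalTexture: { index: 0 } }],
  textures: [{ source: 0 }],
  images: [{ bufferView: 0, mimeType: 'image/png' }],
  bufferViews: [{ buffer: 0, byteOffset: 128, byteLength: 1024 }],
  buffers: [{ uri: 'buffer.bin', byteLength: 4096 }],
};

console.log(resolveNormalTextureSource(sampleGltf, 0));
```

Every hop is a plain array lookup, which is why an extension only has to add one more index field to hook into the same mechanism.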
GLTF extension
Now that you have a rough idea of how the pieces of a GLTF file reference each other, you can look at how a GLTF extension is written. The extension to implement here can be seen as a fallback (degradation) extension, very similar to EXT_texture_webp, which Google implemented.
function GLTFTextureWebPExtension(parser) {
  this.parser = parser;
  this.name = EXTENSIONS.EXT_TEXTURE_WEBP;
  this.isSupported = null;
}

GLTFTextureWebPExtension.prototype.loadTexture = function (textureIndex) {
  var name = this.name;
  var parser = this.parser;
  var json = parser.json;
  var textureDef = json.textures[textureIndex];
  if (!textureDef.extensions || !textureDef.extensions[name]) {
    return null;
  }
  var extension = textureDef.extensions[name];
  var source = json.images[extension.source];
  var loader = parser.textureLoader;
  if (source.uri) {
    var handler = parser.options.manager.getHandler(source.uri);
    if (handler !== null) loader = handler;
  }
  return this.detectSupport().then(function (isSupported) {
    if (isSupported) return parser.loadTextureImage(textureIndex, source, loader);
    if (json.extensionsRequired && json.extensionsRequired.indexOf(name) >= 0) {
      throw new Error('THREE.GLTFLoader: WebP required by asset but unsupported.');
    }
    // Fall back to PNG or JPEG.
    return parser.loadTexture(textureIndex);
  });
};

GLTFTextureWebPExtension.prototype.detectSupport = function () {
  if (!this.isSupported) {
    this.isSupported = new Promise(function (resolve) {
      var image = new Image();
      image.src = 'data:image/webp;base64,UklGRiIAAABXRUJQVlA4IBYAAAAwAQCdASoBAAEADsD+JaQAA3AAAAAA';
      image.onload = image.onerror = function () {
        resolve(image.height === 1);
      };
    });
  }
  return this.isSupported;
};
Both detectSupport and loadTexture are straightforward. loadTexture is invoked by GLTFLoader.
It is also easy to find out which hooks a custom GLTF extension can implement: search for this._invokeOne in GLTFLoader to see the supported hook functions.
- loadMesh
- loadBufferView
- loadMaterial
- loadTexture
- getMaterialType
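To make the hook mechanism concrete, here is a simplified model of a "first handler that answers wins" dispatcher. It is a sketch written for this article, not the GLTFLoader source: MiniParser, invokeOne, and the EXT_example plugin are illustrative names, and the real parser returns promises rather than strings.

```javascript
// Simplified model of a "first plugin that answers wins" hook dispatcher.
class MiniParser {
  constructor() {
    this.plugins = {};
  }

  register(plugin) {
    this.plugins[plugin.name] = plugin;
  }

  // Call func on each plugin, then on the parser itself as the default.
  invokeOne(func) {
    const handlers = Object.values(this.plugins);
    handlers.push(this);
    for (const handler of handlers) {
      const result = func(handler);
      if (result) return result;
    }
    return null;
  }

  // Default implementation, used when no plugin handles the texture.
  loadTexture(textureIndex) {
    return `default:${textureIndex}`;
  }

  getTexture(textureIndex) {
    return this.invokeOne(
      (handler) => handler.loadTexture && handler.loadTexture(textureIndex),
    );
  }
}

const miniParser = new MiniParser();
miniParser.register({
  name: 'EXT_example',
  // Only handles texture 0; returning null passes control on.
  loadTexture: (index) => (index === 0 ? 'plugin:0' : null),
});

console.log(miniParser.getTexture(0)); // plugin:0
console.log(miniParser.getTexture(1)); // default:1
```

This is why an extension's loadTexture returns null for textures it does not recognize: a null result lets the next handler, and finally the default loader, take over.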
Implementation
First, sort out the general plan.
GLTF extension
- Define the extension schema
- detectSupport: query the GL context for the supported compressed texture extensions
- loadTexture: load the matching data according to the schema, build a CompressedTexture, and return it
Tooling
- Load a GLTF/GLB, convert the textures it contains to Basis, and then transcode them to astc | bc7 | dxt | pvrtc | etc1
- Export a GLTF that follows the schema.
Define the schema
Looking at EXT_texture_webp, the extension configuration is stored under extensions.EXT_texture_webp of each texture, so only that part of the format needs to be defined.
{
  "textures": [{
    "source": 0,
    "extensions": {
      "EXT_GPU_COMPRESSED_TEXTURE": {
        "astc": 1, "bc7": 2, "dxt": 3, "pvrtc": 4, "etc1": 5,
        "width": 2048, "height": 2048, "hasAlpha": 0, "compress": 1
      }
    }
  }],
  "buffers": [
    { "name": "buffer", "byteLength": 207816, "uri": "buffer.bin" },
    { "name": "image3.astc", "byteLength": 48972, "uri": "image3.astc.bin" },
    { "name": "image3.bc7", "byteLength": 50586, "uri": "image3.bc7.bin" },
    { "name": "image3.dxt", "byteLength": 10686, "uri": "image3.dxt.bin" },
    { "name": "image3.pvrtc", "byteLength": 21741, "uri": "image3.pvrtc.bin" },
    { "name": "image3.etc1", "byteLength": 22360, "uri": "image3.etc1.bin" }
  ]
}
The format is simple and self-explanatory: each of the astc | bc7 | dxt | pvrtc | etc1 fields is an index into buffers[i].
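As an illustration of how a loader might consume this schema, the sketch below picks the first format that the device supports and that the texture actually provides. The helper name pickCompressedBuffer and the priority order are assumptions made for this example; the schema itself does not prescribe an order.

```javascript
// Assumed priority order; the schema itself does not define one.
const FORMAT_PRIORITY = ['astc', 'bc7', 'dxt', 'pvrtc', 'etc1'];

// Hypothetical helper: choose the first format that is both supported by the
// device and present in the extension definition.
function pickCompressedBuffer(extensionDef, supportInfo) {
  for (const format of FORMAT_PRIORITY) {
    if (supportInfo[format] && extensionDef[format] !== undefined) {
      return { format, bufferIndex: extensionDef[format] };
    }
  }
  return null; // caller falls back to the PNG/JPEG referenced by "source"
}

const extensionDef = {
  astc: 1, bc7: 2, dxt: 3, pvrtc: 4, etc1: 5,
  width: 2048, height: 2048, hasAlpha: 0, compress: 1,
};

// Made-up capability sets for three kinds of devices.
const desktop = pickCompressedBuffer(extensionDef, { bc7: true, dxt: true });
const oldAndroid = pickCompressedBuffer(extensionDef, { etc1: true });
const unsupported = pickCompressedBuffer(extensionDef, {});

console.log(desktop, oldAndroid, unsupported);
```

Returning null instead of throwing keeps the fallback path simple: the caller can load the ordinary image from "source" whenever no compressed variant fits.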
Generate a GLTF with the corresponding structure
Here you can refer to the loop in Basis's webgl/texture/index.html to generate the five kinds of compressed texture output as .bin files, and then write the GLTF file by hand.
At this point, a basic version of the extension can be written.
import {
  CompressedTexture,
  LinearFilter,
  LinearMipmapLinearFilter,
  UnsignedByteType,
} from 'three';

// typeFormatMap (not shown here) maps a format name plus the hasAlpha flag
// to the matching THREE compressed texture format constant.

export class GLTFGPUCompressedTexture {
  constructor(parser) {
    this.name = 'EXT_GPU_COMPRESSED_TEXTURE';
    this.parser = parser;
  }

  detectSupport(renderer) {
    this.supportInfo = {
      astc: renderer.extensions.has('WEBGL_compressed_texture_astc'),
      bc7: renderer.extensions.has('EXT_texture_compression_bptc'),
      dxt: renderer.extensions.has('WEBGL_compressed_texture_s3tc'),
      etc1: renderer.extensions.has('WEBGL_compressed_texture_etc1'),
      etc2: renderer.extensions.has('WEBGL_compressed_texture_etc'),
      pvrtc:
        renderer.extensions.has('WEBGL_compressed_texture_pvrtc') ||
        renderer.extensions.has('WEBKIT_WEBGL_compressed_texture_pvrtc'),
    };
    return this;
  }

  loadTexture(textureIndex) {
    const { parser, name } = this;
    const json = parser.json;
    const textureDef = json.textures[textureIndex];
    if (!textureDef.extensions || !textureDef.extensions[name]) return null;
    const extensionDef = textureDef.extensions[name];
    const { width, height, hasAlpha } = extensionDef;

    // Use the first format that is supported and present in the extension.
    for (const name in this.supportInfo) {
      if (this.supportInfo[name] && extensionDef[name] !== undefined) {
        return parser
          .getDependency('buffer', extensionDef[name])
          .then(buffer => {
            // TODO: support compressed textures with mipmaps
            // TODO: ZSTD compression
            const mipmaps = [
              {
                data: new Uint8Array(buffer),
                width,
                height,
              },
            ];
            // The buffer contents are passed to the GPU as-is
            const texture = new CompressedTexture(
              mipmaps,
              width,
              height,
              typeFormatMap[name][hasAlpha],
              UnsignedByteType,
            );
            texture.minFilter =
              mipmaps.length === 1 ? LinearFilter : LinearMipmapLinearFilter;
            texture.magFilter = LinearFilter;
            texture.generateMipmaps = false;
            texture.needsUpdate = true;
            return texture;
          });
      }
    }

    // Fall back to PNG or JPEG.
    return parser.loadTexture(textureIndex);
  }
}
Filling in the details
- Basis output in ETC1S mode is small but low in quality, while UASTC is high in quality but large, so an additional lossless compression step is needed.
- Mipmaps must be supported. Mipmaps for GPU compressed textures cannot be generated quickly on the GPU, so loading precomputed mipmaps has to be implemented.
- Since decompression is now required, Web Worker, WASM, or SIMD acceleration may be needed.
- The CLI conversion tool supports multiple processes, batch processing, and output size statistics.
- Write performance test cases, compare against the KTX2 + UASTC compressed texture scheme, and record the data in a table.
- Compare PC and mobile browsers, and also compare against ImageBitmapLoader across texture counts, file sizes, and resolutions.
- Decode on the UI thread when there are few images, and in workers when there are many.
- Complete the resource release logic (dispose).
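Because mip levels have to be shipped precomputed, the loader needs to know how many bytes each level occupies. The sketch below shows the arithmetic for formats built on 4x4 blocks. The layout it assumes (levels stored back to back, largest first, down to 1x1) and the names levelByteLength and sliceMipmaps are assumptions for illustration, not the layout documented by the library. PVRTC is left out because it has its own minimum size rules, and dxt is split into dxt1 and dxt5 because their block sizes differ.

```javascript
// Bytes per 4x4 block for common block-compressed formats.
const BYTES_PER_BLOCK = { astc: 16, bc7: 16, dxt1: 8, dxt5: 16, etc1: 8 };

// Byte length of one mip level: one block per 4x4 texel tile, rounded up.
function levelByteLength(format, width, height) {
  const blocksWide = Math.ceil(width / 4);
  const blocksHigh = Math.ceil(height / 4);
  return blocksWide * blocksHigh * BYTES_PER_BLOCK[format];
}

// Assumed layout: levels stored back to back, largest first, down to 1x1.
function sliceMipmaps(buffer, format, width, height) {
  const mipmaps = [];
  let offset = 0;
  let w = width;
  let h = height;
  while (true) {
    const byteLength = levelByteLength(format, w, h);
    mipmaps.push({
      data: new Uint8Array(buffer, offset, byteLength),
      width: w,
      height: h,
    });
    offset += byteLength;
    if (w === 1 && h === 1) break;
    w = Math.max(1, w >> 1);
    h = Math.max(1, h >> 1);
  }
  return mipmaps;
}

// 8x8 ETC1: the 8x8 level has 4 blocks (32 bytes); 4x4, 2x2 and 1x1 take 1 block (8 bytes) each.
const demoBuffer = new ArrayBuffer(32 + 8 + 8 + 8);
const demoMipmaps = sliceMipmaps(demoBuffer, 'etc1', 8, 8);
console.log(demoMipmaps.map((m) => `${m.width}x${m.height}:${m.data.byteLength}`));
```

The resulting array has the same shape as the mipmaps argument passed to CompressedTexture in the basic version, so levels sliced this way can be handed over without copying.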
The result is a relatively complete solution: gltf-gpu-compressed-texture.
It is a GLTF extension for GPU compressed textures with fallback, plus a batch CLI conversion tool, built for THREE's GLTFLoader. Demo address, extension definition.
Performance data
Test setup: ImageBitmapLoader, THREE r129, localhost, disable cache: true
Model | Parameters | Load | Render | Total time | Model size | Dependency size |
---|---|---|---|---|---|---|
banzi_blue | gltf-tc zstd no-mipmap no-worker | 36.10 ms | 1.60 ms | 37.70 ms | 506 KB | 22.3 KB |
banzi_blue | gltf-tc no-zstd mipmap no-worker | 25.80 ms | 1.50 ms | 27.30 ms | 2.2 MB | 22.3 KB |
banzi_blue | gltf-tc zstd mipmap no-worker | 37.90 ms | 1.60 ms | 39.50 ms | 648 KB | 22.3 KB |
banzi_blue | gltf ktx2 uastc | 534.70 ms | 1.70 ms | 536.40 ms | 684 KB | 249.3 KB |
banzi_blue | glb | 32.80 ms | 6.00 ms | 38.80 ms | 443 KB | |
banzi_blue | gltf | 27.70 ms | 4.90 ms | 32.60 ms | 446 KB | |
BoomBox | gltf-tc zstd mipmap worker | 153.50 ms | 23.70 ms | 177.20 ms | 6.6 MB | 22.3 KB |
BoomBox | gltf-tc zstd mipmap no-worker | 241.10 ms | 9.40 ms | 250.50 ms | 6.6 MB | 22.3 KB |
BoomBox | glb ktx2 uastc | 506.10 ms | 9.30 ms | 515.40 ms | 7.1 MB | 249.3 KB |
BoomBox | glb | 156.10 ms | 89.50 ms | 245.60 ms | 11.3 MB | |
BoomBox | gltf | 120.20 ms | 58.80 ms | 179.00 ms | 11.3 MB | |
Because banzi_blue has fewer than 4 texture maps, its ZSTD data is decoded on the UI thread, since transferring the data to a worker would itself take a lot of time. For comparison, KTX2Loader decodes all ZSTD data on the UI thread; a PR that moves this decoding into a Web Worker has been submitted. The 22.3 KB dependency size was measured from the online demo, because http-server --gzip does not work well.
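The rule described above amounts to a simple threshold check. In the sketch below, the threshold of 4 comes from the text; the function name chooseDecodeTarget and its return values are hypothetical.

```javascript
// Hypothetical helper: decide where ZSTD decoding should run.
// With only a few textures, the cost of moving data to a worker outweighs the gain.
function chooseDecodeTarget(textureCount, threshold = 4) {
  return textureCount < threshold ? 'ui-thread' : 'worker';
}

console.log(chooseDecodeTarget(3)); // ui-thread
console.log(chooseDecodeTarget(4)); // worker
```

A fixed threshold is the simplest possible policy; total byte size would be another reasonable input, but the text only mentions the texture count.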
For BoomBox, the load + render time of gltf-tc zstd mipmap worker is about the same as plain GLTF, but the model size is much smaller.
Test data on a Mi 8 can be found in the screenshots directory.
In the WeChat WebView, BoomBox is faster than GLB/GLTF, which is unexpected; in Chrome the results are normal. banzi_blue is a little slower, and KTX2 is still very slow.
Running the CLI
Make sure zstd and basisu are on the PATH before using it.
> npm i gltf-gpu-compressed-texture -S
# View help
> gltf-tc -h
  -h --help
  -i --input [dir] [?outdir] [?compress] [?mipmap] convert GLTF textures to GPU compressed textures and support fallback
Examples:
  gltf-tc -i ./examples/glb ./examples/zstd
  gltf-tc -i ./examples/glb ./examples/no-zstd 0
  gltf-tc -i ./examples/glb ./examples/no-mipmap 1 false
  gltf-tc -i ./examples/glb ./examples/no-zstd-no-mipmap 0 false
# Run
> gltf-tc -i ./examples/glb ./examples/zstd
done: 6417ms image3.png Normal:false sRGB: true
done: 13746ms image2.png Normal:true sRGB: false
done: 14245ms image0.png Normal:false sRGB: true
done: 14491ms image1.png Normal:false sRGB: false
done: 577ms findi_touming01_nomarL1.jpg Normal:true sRGB: false
done: 568ms findi_touming01_basecoler.png Normal:false sRGB: true
done: 1267ms lanse_banzi-1.jpg Normal:false sRGB: true
done: 577ms Findi_touming01_basecoler.png Normal:false sRGB: true
done: 604ms findi_touming01_nomarL1.jpg Normal:true sRGB: false
done: 1280ms lvse_banzi-1.jpg Normal:false sRGB: true
Cost: 17.75s compress: 1, Summary: Bitmap: 11.22MB ASTC: 7.18MB ETC1: 1.85MB BC7: 7.16MB DXT: 3.04MB PVRTC: 2.28MB
Using the npm package
// Assumption: three-platformize exposes the same exports as three.
// With plain three.js, import these from 'three' instead.
import { CompressedTexture, Scene, WebGLRenderer } from 'three-platformize';
import { GLTFLoader } from 'three-platformize/examples/jsm/loaders/GLTFLoader';
import GLTFGPUCompressedTexture from 'gltf-gpu-compressed-texture';

const gltfLoader = new GLTFLoader();
const renderer = new WebGLRenderer();
const scene = new Scene();

// Register the extension so GLTFLoader calls its loadTexture hook.
gltfLoader.register(parser => {
  return new GLTFGPUCompressedTexture(parser, renderer, {
    CompressedTexture,
  });
});

gltfLoader.loadAsync('./examples/zstd/BoomBox.gltf').then((gltf) => {
  scene.add(gltf.scene);
});
Findings
- Compressed textures have limited minFilter and magFilter support
- ZSTD decodes faster than PNG, which is why the ZPNG format exists
- Az64 is said to be better than ZSTD, but it is not open source, and its real-world performance is unknown
- KTX2Loader decodes with zstddec on the UI thread, so I submitted a PR that implements decoding in a worker pool
- new Uint8Array(buffer, dataOffset) creates a view and does not copy; to get a copy, use Uint8Array.from(new Uint8Array(buffer, dataOffset))
- Epic has a scheme similar to Basis transcoding, along with the Oodle compression formats, but it is closed source
- ZSTD could also be applied to TF models, although TF has its own data compression
- Huffman decoding has been implemented on the GPU; see Massively Parallel Huffman Decoding on GPUs
- The basis-universal-transcoders mentioned earlier is already used by Babylon, but only experimentally
- The ZSTD WASM build appears to be a non-SIMD version built last year; I tried compiling the latest version to WASM but could not get it to run
- On iOS, ordinary texture uploads block GIFs, while compressed texture uploads do not
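The view-versus-copy point in the list above is easy to verify. The sketch below uses a small made-up buffer: it writes to the underlying ArrayBuffer after creating a view and a copy, then shows which of the two sees the change.

```javascript
const buffer = new ArrayBuffer(8);
const dataOffset = 4;

// A view shares memory with the underlying ArrayBuffer.
const view = new Uint8Array(buffer, dataOffset);
// Uint8Array.from allocates a new, independent buffer.
const copy = Uint8Array.from(view);

// Write through a second view covering the whole buffer.
new Uint8Array(buffer)[dataOffset] = 42;

console.log(view[0]); // 42: the view reflects the write
console.log(copy[0]); // 0: the copy was taken before the write
console.log(view.buffer === buffer, copy.buffer === buffer); // true false
console.log(copy.buffer.byteLength); // 4: the copy owns only its own bytes
```

This matters when handing data to a worker: posting view.buffer as a transferable moves the entire underlying buffer, while a copy owns exactly the bytes it contains.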
Finally
gltf-gpu-compressed-texture: stars are welcome.