August More Text Challenge, Day 3. Rush rush 🚀🚀🚀

Previous articles in this series

  • A more powerful File interface: the File System Access API
  • What, web pages can communicate directly with hardware? The Web Serial API! | August More Text Challenge (juejin.cn)

Creation is not easy; a like is great encouragement and my motivation to keep writing!

Preface

The previous two articles introduced the File System Access API and the Web Serial API, both of which involve processing byte streams, so this article takes a closer look at stream manipulation in JavaScript.

We are all familiar with the concept of streams: the TCP protocol that HTTP relies on is itself based on byte streams. We have long been able to send and receive requests with XMLHttpRequest, but we could not process the response as a stream directly until the Fetch API arrived.

The Streams API enables us to work directly with data streams received over the network or created locally by any means. Traditionally, when we requested a media resource we received a binary stream that the browser automatically decoded and rendered; if we wanted to process it ourselves, we usually had to wrap it in a Blob object first. Now we can handle it directly with the Streams API. Here are some of the things it makes possible:

  • Video effects: read the video stream, connect it to a transform stream through a pipe, and convert it frame by frame to implement watermarks, clipping, added audio tracks, and so on.
  • Data decompression: decompress archives, videos, and pictures in the browser (a sketch follows below).
  • Image transcoding: stream processing can operate byte by byte, so the response for an image resource can be transcoded as it arrives; for example, converting JPG to PNG.

These capabilities used to require server-side cooperation; now they can be implemented entirely on the web.
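For instance, here is a minimal sketch of the in-browser decompression use case, assuming the browser ships the built-in DecompressionStream (currently Chromium-based browsers); the file path is hypothetical:

// Fetch a gzip-compressed text file and decompress it in the browser
async function readGzippedText() {
  const response = await fetch('./archive.gz'); // hypothetical URL
  // Pipe the response bytes through the built-in gzip decompressor
  const decompressed = response.body.pipeThrough(new DecompressionStream('gzip'));
  // Collect the decompressed stream as text via a Response wrapper
  return new Response(decompressed).text();
}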

Streams are used extensively in Node.js, where the server's request and response objects are readable and writable streams respectively. Developers familiar with Node will find the concepts in this article easy to grasp, and even if you have never used Node or its stream operations, the ideas are similar, so it should still be easy going.

Core concepts

To better understand the various stream-handling methods in the Streams API, you first need to understand a few stream concepts.

A stream generally refers to a data stream: an ordered sequence of bytes with a starting point and an ending point. The concept originated in communications, where it described a digitally encoded sequence of signals used to transmit information. It was introduced in 1998 by Henzinger, who defined a data stream as "a sequence of data that can only be read once, in a predetermined order."

By role, streams are classified into input streams, output streams, and buffered streams; by content, into byte streams and character streams. Stream transmission is unidirectional and irreversible. Common input streams include keyboard input; common output streams include printer output.

Stream processing involves multiple processing units, much like an assembly line in a factory, with each station responsible for a different task.

Data chunks (Chunks)

A chunk is a single piece of data written to or read from a stream. Before transmission, data is split into chunks for easier transfer and processing. In JavaScript a chunk can be of any type, and a single stream can contain chunks of different types; for example, a byte stream might contain chunks that are 16 KB Uint8Array objects rather than individual bytes.

Readable streams

A readable stream represents a data source that can be read from. For example, to process a requested image, we can create a ReadableStream instance that loads the image and serves as its read stream.

Writable streams

A WritableStream represents a destination to which data is written. Once we have the image's readable stream, we can create a WritableStream instance, connect the two through a pipe, and either output the data directly or transform it first.

Transform streams

The TransformStream is the heart of stream processing: create a TransformStream instance for whatever conversion you need. Typically a transform stream is first connected to a readable stream to receive the data, and then to a writable stream to output the converted result. This may sound a little abstract; the flow chart below should help.

Pipes (Pipe)

Streams must be connected by pipes. A readable stream can be connected directly to a writable stream via the pipeTo() method, or routed through one or more transform streams via pipeThrough(). Multiple pipes connected in sequence form a pipe chain.
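In code, a pipe chain looks roughly like this (a sketch; the stream instances are assumed to already exist):

// readable → transform → transform → writable
// pipeThrough() returns the readable side of the next link;
// pipeTo() returns a Promise that resolves when the chain finishes
readableStream
  .pipeThrough(transformStream1)
  .pipeThrough(transformStream2)
  .pipeTo(writableStream)
  .then(() => console.log('Pipe chain finished'));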

Backpressure

Backpressure originally describes a fluid moving through a closed passage, such as a pipe or air duct: obstacles or sharp turns obstruct the flow and exert pressure in the direction opposite to the movement.

A pipeline is often made up of several streams, with data flowing through one pipe into the next, and the streams' abilities to process data can differ. This can cause congestion, like a four-lane highway squeezing into a single lane, jammed as far as the eye can see. In stream processing, when one stream in the pipe chain reaches its processing limit, it generates a signal that propagates backwards through the chain until it reaches the original source, telling it to stop sending because the downstream cannot keep up. The source then pauses. This is a form of negative feedback, much like the body's hormonal regulation.
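In the Streams API this signal surfaces through a couple of concrete properties; here is a minimal sketch on the writing side, assuming an existing writableStream:

const writer = writableStream.getWriter();

// desiredSize is the remaining room in the internal queue;
// zero or negative means the downstream is falling behind
console.log('Remaining capacity:', writer.desiredSize);

// ready is a Promise that resolves once backpressure is relieved,
// so awaiting it naturally throttles the producer
await writer.ready;
writer.write('next chunk');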

Tee

The literal translation of tee is a golf tee, but once you understand what it does, it is more like the T-shaped tee fitting used in everyday plumbing.


tee() is a method that creates two copies of a readable stream. Since a stream can only be consumed once, the original stream is locked and effectively no longer usable; the two branches can then be processed independently. This makes it possible to handle the same stream in multiple ways.

Flow chart

The concepts above are a bit abstract, so let's walk through them with a flow chart:

Suppose we fetch a text file and want to convert all of its letters to uppercase. First we create a readable stream to load it, then connect it through a pipe to a transform stream, which is responsible for the uppercasing. The converted text could then be piped to a writable stream for output, but we need two forms of output and only have one writable stream. Once a stream's transfer is complete it no longer exists, so we cannot create one stream per output; instead we use tee() to split the stream into two branches, each feeding its own writable-stream output.

In-depth understanding

With these concepts in mind, we can dive into the methods of the Streams API, whose core is the ReadableStream, WritableStream, and TransformStream objects.

ReadableStream

A ReadableStream is usually the starting point of stream processing: we create a ReadableStream instance to serve as the data source for subsequent operations.

Its data usually comes from one of two kinds of sources:

  • Push source: the producer pushes the data to us, for example a real-time video stream where the server reads video uploaded by a user, server-pushed messages, WebSocket, and so on (a sketch follows the figure below).
  • Pull source: we pull a resource from a specific address, for example by sending a network request.

Figure: a schematic of a streaming-media service based on the RTMP protocol.
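As a sketch of the push case, a WebSocket can be wrapped in a readable stream like this (the URL is hypothetical):

// Wrap a push source (a WebSocket) in a ReadableStream
function socketToReadable(url) {
  const socket = new WebSocket(url);
  return new ReadableStream({
    start(controller) {
      // Enqueue every incoming message as a chunk
      socket.onmessage = (event) => controller.enqueue(event.data);
      socket.onclose = () => controller.close();
      socket.onerror = () => controller.error(new Error('socket error'));
    },
    cancel() {
      // The consumer no longer wants data, so stop the source
      socket.close();
    },
  });
}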

Create

ReadableStream() is a constructor that takes an optional underlyingSource argument: an object whose methods and properties define the behavior of the ReadableStream instance.

const readableStream = new ReadableStream({
  start(controller) {
    /* ... */
  },

  pull(controller) {
    /* ... */
  },

  cancel(reason) {
    /* ... */
  },
});

The three methods of the underlyingSource object are developer-defined hooks; we simply implement the ones we need.

  • start(controller): called when the constructor runs. It can access the data source to perform initialization, and may return a Promise if that work is asynchronous. It receives a ReadableStreamDefaultController object as its parameter.
  • pull(controller): called repeatedly to fetch the next chunk whenever the internal buffer queue is not yet full.
  • cancel(reason): triggered when a consumer requests that the stream stop sending, typically when a backpressure signal is received.
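Putting the three hooks together, here is a minimal sketch of a readable stream that serves a fixed set of chunks:

// A readable stream over a fixed array of chunks
const chunks = ['a', 'b', 'c'];

const fixedStream = new ReadableStream({
  start(controller) {
    console.log('initialized, queue room left:', controller.desiredSize);
  },
  pull(controller) {
    // Called whenever the internal queue has room
    if (chunks.length > 0) {
      controller.enqueue(chunks.shift());
    } else {
      // No more data: close the stream
      controller.close();
    }
  },
  cancel(reason) {
    console.log('consumer cancelled:', reason);
  },
});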

Buffer queue

Streams are read on a first-in, first-out basis: the readable stream maintains an internal buffer queue that stores the chunks retrieved from the data source. When creating a ReadableStream instance, we specify the enqueuing strategy of this queue through a second argument, the queuingStrategy object.

const readableStream = new ReadableStream(
  {
    /* The first argument, the underlyingSource object */
  },
  // The second argument, the queuingStrategy object
  {
    highWaterMark: 10,
    size(chunk) {
      return chunk.length;
    },
  },
);
  • highWaterMark: a non-negative number representing the queue's maximum capacity; in stream terms, the high water mark.
  • size(chunk): computes and returns the size of a chunk; the total is compared against highWaterMark to trigger the backpressure signal.
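The platform also ships two ready-made strategy classes, so you rarely need to write size() by hand: CountQueuingStrategy counts chunks, and ByteLengthQueuingStrategy sums their byteLength. A sketch:

// Buffer up to 10 chunks, regardless of their size
const byCount = new CountQueuingStrategy({ highWaterMark: 10 });

// Buffer up to 16 KB worth of binary chunks
const byBytes = new ByteLengthQueuingStrategy({ highWaterMark: 16 * 1024 });

const countedStream = new ReadableStream({ /* underlyingSource */ }, byCount);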

Read the data

Once you have a ReadableStream instance, call its getReader() method to create a reader (a ReadableStreamDefaultReader object), and then call the reader's read() method to obtain the data.

Stream reading is an exclusive, continuous, and asynchronous process.

  • Exclusive: reading locks the stream. While a stream is locked, no other reader can be acquired until the current one is released. You can check whether a stream is locked through the readable stream's locked property:

const locked = readableStream.locked;
console.log(`The readable stream is currently: ${locked ? 'locked' : 'unlocked'}`);
  • Continuous: the stream is read one chunk at a time until it has been fully consumed.

  • Asynchronous: read() returns a Promise instance rather than the data itself.

When the Promise resolves, its value is an object with two properties, value and done:

  • value: the chunk's data; for a byte stream, a Uint8Array object.
  • done: the stream's read state; true means reading has finished.

Here is an example of reading in an asynchronous loop:

const readableStream = new ReadableStream({ /* ... */ }, { /* ... */ });

// Create a reader
const reader = readableStream.getReader();

while (true) {
  const { done, value } = await reader.read();
  if (done) {
    console.log('Data read complete');
    break;
  }
  console.log('The raw value of the chunk is:', value);
}

{ value, done }: that's right, it looks just like the result object you get when stepping through an iterator.

An Iterator is an interface that provides a uniform access mechanism for various data structures.

If you are not familiar with iterators, see Iterator and for...of loops in Getting started with ECMAScript 6 (ruanyifeng.com).

Since the result object looks like an iterator result, we can use for...of to consume the data, and because reading is asynchronous we use asynchronous iteration (for await...of).

AsyncIterator is an ES2018 feature and is not supported by all JS engines, so use a polyfill if necessary.

Babel has supported async iteration since 6.16.0, under the babel-plugin-transform-async-generator-functions plugin and the babel-preset-stage-3 preset.

One approach is to use a helper function that returns an object implementing the async-iteration protocol.

// Create a readable stream
const readableStream = new ReadableStream({ /* ... */ }, { /* ... */ });

// Async iteration helper
function streamAsyncIterator(stream) {
  // Create a reader, which locks the stream
  const reader = stream.getReader();

  return {
    // Similar to next() on a synchronous iterator
    next() {
      // Read the next chunk
      return reader.read();
    },
    // Unlock the stream when the async iteration ends early or errors
    return() {
      reader.releaseLock();
      return {};
    },
    // The key part: declare the async iterator interface
    [Symbol.asyncIterator]() {
      return this;
    },
  };
}

// for await...of unwraps { done, value } and yields the value directly
for await (const value of streamAsyncIterator(readableStream)) {
  console.log('The raw value of the chunk is:', value);
}

Teeing operation

tee() is a method on a readable stream that creates two copies of it, returned as a two-element array. Since a reader reads a stream exclusively and a stream can only be consumed once, tee() is what allows one stream to be consumed by two readers.

// Create two copies of the readable stream A, B
const [streamA, streamB] = readableStream.tee();
// Reader A
const readerA = streamA.getReader();
// Reader B
const readerB = streamB.getReader();

WritableStream

A writable stream is typically the last stop in the pipeline. It can be thought of as a container that receives the final processed data, and as an abstraction over an underlying sink (the raw I/O the data is actually written to).

Create

As with ReadableStream, you create a WritableStream instance by calling its constructor, passing an underlyingSink object to define its behavior.

const writableStream = new WritableStream({
  start(controller) {
    /* ... */
  },

  write(chunk, controller) {
    /* ... */
  },

  close(controller) {
    /* ... */
  },

  abort(reason) {
    /* ... */
  },
});

The hooks in underlyingSink are as follows:

  • start(controller): called when the constructor runs.
  • write(chunk, controller): fired whenever a new chunk is ready to be written.
  • close(controller): triggered when all data has been written.
  • abort(reason): triggered when stream writing is forcibly terminated (a sketch follows below).
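For example, aborting through the writer triggers the underlyingSink's abort(reason) hook; a sketch assuming an existing writableStream:

const writer = writableStream.getWriter();
writer.write('partial data');

// Force-stop the stream: queued writes are discarded and
// abort(reason) in the underlyingSink receives the reason
await writer.abort('user cancelled the upload');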

Write data

Calling writableStream.getWriter() returns a writer, a WritableStreamDefaultWriter instance. As with reading, writing locks the stream; while it is locked, no other writer can write to it.

const writer = writableStream.getWriter();
const result = await writer.write("Here's a piece of data");

The writing process is also asynchronous, so write() returns a Promise. A fulfilled Promise means the data has been accepted, but not necessarily that it has reached its destination. For example, when sending data to a peripheral with the Web Serial API, a fulfilled result only means that the underlying OS has pushed the data out through the serial port; it may still be lost along the wire and never reach the device's processor, in which case the transmission has in fact failed.

You can check whether the stream is locked through writableStream's locked property:

const locked = writableStream.locked; // true or false

Here is an example that prints one letter per second; you can copy it into the console and run it.

// Create a writable stream
const writableStream = new WritableStream({
  start(controller) {
    console.log('[start]');
  },
  async write(chunk, controller) {
    await new Promise((resolve) => setTimeout(() => {
      console.log('[write]', chunk);
      resolve();
    }, 1000));
  },
  close(controller) {
    console.log('[close]');
  },
  abort(reason) {
    console.log('[abort]', reason);
  },
});

const writer = writableStream.getWriter();
const start = Date.now();
for (const char of 'abcdefghijklmnopqrstuvwxyz') {
  // Wait until the writer is ready
  await writer.ready;
  console.log('[ready]', Date.now() - start, 'ms');
  writer.write(char);
}
await writer.close();

Print result: a [ready] and a [write] entry are logged for each letter, roughly one second apart, ending with [close].

Get data from a readable stream

A readable stream can create a pipe through pipeTo() to transmit data to a writable stream.

// Create a readable stream instance
const readableStream = new ReadableStream({
  start(controller) {
    console.log('[start readableStream]');
    // Pre-fill the buffer queue with a little data
    controller.enqueue('a');
    controller.enqueue('b');
    controller.enqueue('c');
  },
  pull(controller) {
    // Called when the controller's buffer queue is empty
    console.log('[pull]');
    controller.enqueue('data');
    controller.close();
  },
  cancel(reason) {
    // Called when the stream is cancelled
    console.log('[cancel]', reason);
  },
});

// Create a writable stream instance
const writableStream = new WritableStream({
  start(controller) {
    console.log('[start writableStream]');
  },
  // Triggered each time a chunk is written
  async write(chunk, controller) {
    await new Promise((resolve) => setTimeout(() => {
      console.log('[write]', chunk);
      resolve();
    }, 1000));
  },
  close(controller) {
    console.log('[close]');
  },
  abort(reason) {
    console.log('[abort]', reason);
  },
});

await readableStream.pipeTo(writableStream);
console.log('[finished]');

TransformStream

The transform stream object is the heart of stream processing: it is where the data gets transformed.

Create

Its constructor takes an object argument whose hooks define the transformation rules.

const transformStream = new TransformStream({
  start(controller) {
    /* ... */
  },

  transform(chunk, controller) {
    /* ... */
  },

  flush(controller) {
    /* ... */
  },
});
  • start(controller): called when the constructor runs; it can be used to initialize the queue.
  • transform(chunk, controller): called as each chunk is transformed; this is usually where the conversion rules live.
  • flush(controller): triggered when the last chunk has been processed and the writable side is about to close; typically used when all the data has been transformed and some custom data needs to be appended at the end (a sketch follows below).
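As a sketch of flush(), here is a transform stream that passes chunks through unchanged and appends a trailing marker once the writable side closes:

const withFooter = new TransformStream({
  transform(chunk, controller) {
    // Pass each chunk through unchanged
    controller.enqueue(chunk);
  },
  flush(controller) {
    // All input has been processed; enqueue one last custom chunk
    controller.enqueue('--- end of stream ---');
  },
});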

Processing readable streams

Data reaches the transform stream through the readable stream's pipeThrough() method.

The following example converts the lowercase letters in a readable stream to uppercase via a transform stream.

// Create a transform stream instance
const transformStream = new TransformStream({
  transform(chunk, controller) {
    console.log('[transform]', chunk);
    // Change lowercase letters to uppercase
    controller.enqueue(chunk.toUpperCase());
  },
  flush(controller) {
    console.log('[flush]');
    // Terminate the transformation
    controller.terminate();
  },
});

// The readable stream instance is the same as in the example above
const readableStream = new ReadableStream({
  start(controller) {
    // Pre-fill some data
    console.log('[start]');
    controller.enqueue('a');
    controller.enqueue('b');
    controller.enqueue('c');
  },
  pull(controller) {
    console.log('[pull]');
    controller.enqueue('data');
    controller.close();
  },
  cancel(reason) {
    console.log('[cancel]', reason);
  },
});

// An IIFE is used because await cannot yet be called at the top level in the console
(async () => {
  const reader = readableStream.pipeThrough(transformStream).getReader();
  for (let result = await reader.read(); !result.done; result = await reader.read()) {
    console.log('[value]', result.value);
  }
})();

Processing network requests

A readable stream can also come from a network request, such as one made with fetch:

// Returns a transform stream that converts each letter to uppercase
function upperCaseStream() {
  return new TransformStream({
    transform(chunk, controller) {
      controller.enqueue(chunk.toUpperCase());
    },
  });
}

// Returns a writable stream that renders the data into the DOM
function appendToDOMStream(el) {
  return new WritableStream({
    write(chunk) {
      el.append(chunk);
    },
  });
}

// Request a piece of text data
fetch('./text.txt').then((response) =>
  response.body
    // The default is UTF-8 encoding; Chinese text here needs to be decoded as 'gb2312'
    .pipeThrough(new TextDecoderStream('gb2312'))
    .pipeThrough(upperCaseStream())
    .pipeTo(appendToDOMStream(document.body))
);

The code above requests a piece of text via fetch. Since the Response object's body is a ReadableStream, all of the stream's pipe methods are available. The data first passes through a TextDecoderStream for decoding; this is the same object used in the previous article to convert Chinese characters in the serial-port output. After decoding, the data flows into the transform stream for conversion, and finally into the writable stream, which renders it into the DOM. The whole process really is like a pipeline, with the data going through a series of processing steps from read to output.

Other objects with built-in stream interfaces

  • Blob: a Blob object represents an immutable, file-like piece of raw data. Its contents can be read as text or binary, and its stream() method converts it into a ReadableStream for processing (a sketch follows this list).
  • File: the File interface builds on Blob, inheriting its functionality and extending it to support files on the user's system.
  • Fetch: the Response.body of a fetch request is a ReadableStream.
  • The FileSystemWritableFileStream from the File System Access API, introduced in the first article.
  • The serial port's port.readable from the Web Serial API, introduced in the second article.
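For example, Blob's stream() method gives a ready-made readable stream; a minimal sketch:

// Read a Blob through its built-in stream() interface
const blob = new Blob(['hello stream'], { type: 'text/plain' });
const reader = blob.stream().getReader();

// value is a Uint8Array of the blob's bytes
const { value, done } = await reader.read();
console.log(done, value);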

Browser support

The Can I Use site shows the most complete support in Chrome and the Chromium-based Edge, with only partial support in other browsers; even the latest version of Firefox still lacks support for parts of readable streams and the pipe methods.
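Given the uneven support, it is worth feature-detecting before relying on pipes; a simple sketch:

// Detect Streams API support before using it
const supportsStreams =
  typeof ReadableStream !== 'undefined' &&
  typeof WritableStream !== 'undefined' &&
  'pipeTo' in ReadableStream.prototype;

if (!supportsStreams) {
  // Fall back or load a polyfill such as web-streams-polyfill
  console.warn('Streams API (or its pipe methods) is unavailable');
}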

Conclusion

Browsers currently support the Stream API to varying degrees, but the main features are covered and support in the major browsers is quite high, so it is entirely usable in production. Libraries such as xlsx and JSZip, which need to serialize data to binary for output, often wrap the data in Blobs internally; with the Stream API this becomes much more concise. Space is limited, so there is no demonstration here; when I have time I'll put together a data compression/conversion demo for you to play with.

While writing this article I found little material on Chinese search engines (most of it covers streams under Node), so I consulted many foreign documents. The references below are only a partial list; if there are mistakes, please point them out in the comments.

References

  • Streams API – Web API interface reference | MDN (mozilla.org)
  • Streams Standard (whatwg.org)
  • 2016 – the year of web streams – JakeArchibald.com
  • MattiasBuelens/web-streams-polyfill: Web Streams, based on the WHATWG spec reference implementation (github.com)