This article is part of the series The Depth of WebGL.

If you haven’t read the previous chapter, WebGL-02 – Linear Transformation, it is recommended to start there.

Translation

The previous article showed that a matrix can represent a linear transformation such as shear, scale, or rotation. Now let’s think about translation in two dimensions. Again, consider this square:

Suppose you want to move the square one unit to the right, so that the corner $[1,1]^T$ (the $T$ means transpose: a column vector written horizontally) becomes $[2,1]^T$. It is natural to think of this as vector addition: add 1 to the x-coordinate of every point in the square. This approach has several disadvantages:

1. The translation operation cannot be combined with other transformations

Shearing, scaling and rotation are all matrix multiplications, so a complex motion can be applied by multiplying a point by several matrices in turn, or the matrices can be multiplied together first to get a single transformation matrix. If the chain is interrupted by additions, it can no longer be collapsed into one matrix so easily.

2. It cannot express a translation to infinity

With addition alone, there is no way to represent moving a point infinitely far away.
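
The first drawback can be seen in a small sketch. The helper names `matmul` and `apply` are made up for this illustration, and the scale, shear and offset values are arbitrary choices, not taken from the article.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def apply(m, v):
    """Apply matrix m to the column vector v (given as a flat list)."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]


scale = [[2, 0], [0, 2]]  # uniform scale by 2 (arbitrary example)
shear = [[1, 1], [0, 1]]  # shear along x (arbitrary example)
p = [1, 1]

# Two linear transformations collapse into one precomputed matrix.
combined = matmul(shear, scale)
assert apply(combined, p) == apply(shear, apply(scale, p))

# A translation done by addition has to stay a separate step in the middle,
# so the chain scale -> translate -> shear is no longer a single 2x2 matrix.
offset = [1, 0]
scaled = apply(scale, p)
moved = [scaled[0] + offset[0], scaled[1] + offset[1]]
result = apply(shear, moved)
print(result)
```
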

Is there any way to represent translation as matrix multiplication?

Yes, we need to introduce homogeneous coordinates.

Homogeneous coordinates

Homogeneous coordinates can feel unintuitive when you first meet them. Put simply, a point is represented using coordinates with one extra dimension. For example, the 2D point $[3,2]^T$ is written in homogeneous coordinates as $[3,2,1]^T$. These are two different representations of the same point. The notation we usually use is Cartesian coordinates, and the notation with the extra dimension is homogeneous coordinates.

Homogeneous coordinates and Cartesian coordinates can be converted into each other. In two and three dimensions, the Cartesian point on the left is represented by the homogeneous coordinates on the right (with $w \neq 0$):

$$[x/w,\ y/w]^T := [x,\ y,\ w]^T$$

$$[x/w,\ y/w,\ z/w]^T := [x,\ y,\ z,\ w]^T$$

With $w = 1$ this is simply $[x,y]^T := [x,y,1]^T$ and $[x,y,z]^T := [x,y,z,1]^T$.

As you can see, a given point can be written in homogeneous coordinates in infinitely many ways. For example, $[3,2,1]^T$, $[6,4,2]^T$ and $[9,6,3]^T$ are all valid homogeneous coordinates of the point $[3,2]^T$: dividing each by its $w$ gives $[3,2]^T$. We say that the homogeneous representations of one point are homogeneous to each other. The implication is that they all share the same properties and all represent the same thing.
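
As a minimal sketch of this conversion (the function names `to_cartesian` and `to_homogeneous` are hypothetical, chosen only for this example), dividing by the last component brings every representation back to the same Cartesian point:

```python
def to_cartesian(h):
    """Convert homogeneous coordinates [x, ..., w] to Cartesian by dividing by w."""
    *coords, w = h
    if w == 0:
        raise ValueError("w must be non-zero to divide")
    return [c / w for c in coords]


def to_homogeneous(p, w=1):
    """Lift a Cartesian point to homogeneous coordinates using the chosen w."""
    return [c * w for c in p] + [w]


# All three representations from the text map back to the same point.
for h in ([3, 2, 1], [6, 4, 2], [9, 6, 3]):
    print(h, "->", to_cartesian(h))
```
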

In short, homogeneous coordinates are coordinates that are homogeneous to each other.

Why does computer graphics always use homogeneous coordinates to represent points? Why do homogeneous coordinates let us express translation as matrix multiplication? Here’s an example:

Take the point $[3,2]^T$ again. I want to move it to $[4,2]^T$, that is, shift it one unit to the right along the x-axis. First write the point in homogeneous coordinates as $[3,2,1]^T$, and then define the transformation matrix:


$$\left[ \begin{matrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{matrix} \right]$$

Let’s calculate:


$$\left[ \begin{matrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{matrix} \right] \left[ \begin{matrix} 3 \\ 2 \\ 1 \end{matrix} \right] = \left[ \begin{matrix} 4 \\ 2 \\ 1 \end{matrix} \right]$$

We get the new point in homogeneous coordinates, $[4,2,1]^T$. Dividing by $w$ gives the corresponding Cartesian coordinates, $[4,2]^T$.
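
The same calculation can be checked with a few lines of code. This is only a sketch: `apply` and `translation_2d` are hypothetical helper names, and matrices are plain lists of rows.

```python
def apply(m, v):
    """Apply matrix m to the column vector v (given as a flat list)."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]


def translation_2d(tx, ty):
    """3x3 homogeneous matrix that moves a 2D point by (tx, ty)."""
    return [[1, 0, tx],
            [0, 1, ty],
            [0, 0, 1]]


p = [3, 2, 1]                        # the point [3, 2] with w = 1
moved = apply(translation_2d(1, 0), p)
print(moved)                         # [4, 2, 1]

x, y, w = moved
print([x / w, y / w])                # [4.0, 2.0]
```

Because the translation is now a matrix, it can be multiplied together with shear, scale and rotation matrices like any other transformation.
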

I have shown that homogeneous coordinates do let us use matrix multiplication for translation, but you probably still have a question: why does adding one component suddenly make translation possible? Looking at the multiplication, the extra component is simply multiplied by the rightmost column of the matrix, and that column controls how much is added to x or y. Let’s look at the process in a more intuitive way.

The example above can be read in two ways. It is a translation in two-dimensional space expressed in homogeneous coordinates, and at the same time it is a linear transformation in three-dimensional space expressed in Cartesian coordinates! Now let’s look at what the same matrix does in three dimensions in Cartesian coordinates.

Similarly, place a cube with sides of one unit length in the three-dimensional coordinate system:

I’m sure you can work out the coordinates of the vertices of this cube. First look at the bottom face. Its points have z = 0, so a point such as $[1,1,0]^T$ does not change when multiplied by the matrix above, and the bottom face stays where it is. Next look at the top face: since z = 1 there, the x of every point on the top face increases by 1. For the points in between, x increases by whatever z is. So the cube ends up looking like this:

In other words, the top face of the cube is pushed sideways, which is a three-dimensional shear transformation! Then we do something like this:

Look only at the top layer and “flatten” it onto the “ground”, then compare the result with what you get by flattening the top layer before the transformation. We get a two-dimensional translation! That is, a 2D translation is obtained by doing a 3D shear and then reducing the dimension, keeping only the 2D information of one layer.

Now suppose the cube has sides of length 2. If you do the math (try it with one corner), the top face still ends up as the same 1 by 1 square after flattening, because you divide by w = 2, so the result is the same. In fact, a slice at any height w with side length w becomes 1 by 1 once it is flattened “to the ground”. You can also imagine a much bigger cube and repeat the exercise. So a 1 by 1 square in two dimensions corresponds to an infinite number of slices in three-dimensional space, and those slices are homogeneous to each other.
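
The “shear, then flatten” idea can be tried numerically. The sketch below assumes a cube sitting at the origin; `top_face`, `flatten` and `sheared_top_face` are hypothetical helper names introduced only for this illustration.

```python
def apply(m, v):
    """Apply matrix m to the column vector v (given as a flat list)."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]


# The same matrix as in the 2D translation example, read as a 3D shear.
shear = [[1, 0, 1],
         [0, 1, 0],
         [0, 0, 1]]


def top_face(side):
    """Corners of the top face of a cube with the given side length at the origin."""
    return [[0, 0, side], [side, 0, side], [side, side, side], [0, side, side]]


def flatten(points):
    """Divide by the last coordinate and keep only the 2D part."""
    return [[x / w, y / w] for x, y, w in points]


def sheared_top_face(side):
    """Shear the top face in 3D, then flatten it onto the ground."""
    return flatten([apply(shear, corner) for corner in top_face(side)])


# Cubes of side 1 and side 2 flatten to the same translated unit square.
for side in (1, 2):
    print(side, sheared_top_face(side))
```

Both sizes produce the unit square shifted one unit to the right, which is the 2D translation from the earlier example.
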

Affine transformation

A linear transformation combined with a translation is called an affine transformation. The convenient way to work with affine transformations is, of course, to express points in homogeneous coordinates. In three dimensions we use $[x,y,z,w]^T$ to represent a point, so the corresponding transformation matrix is 4 by 4. Common transformations are as follows:

Identity

$$\left[ \begin{matrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{matrix} \right]$$
Translation

$$\left[ \begin{matrix} 1 & 0 & 0 & x \\ 0 & 1 & 0 & y \\ 0 & 0 & 1 & z \\ 0 & 0 & 0 & 1 \end{matrix} \right]$$
Scaling

$$\left[ \begin{matrix} n & 0 & 0 & 0 \\ 0 & n & 0 & 0 \\ 0 & 0 & n & 0 \\ 0 & 0 & 0 & 1 \end{matrix} \right]$$

The last diagonal entry stays 1. If it were also $n$, dividing by $w$ would cancel the scale and the point would not move.
Rotation (about the z, x and y axes, in that order)

$$R_z(\theta) = \left[ \begin{matrix} \cos\theta & -\sin\theta & 0 & 0 \\ \sin\theta & \cos\theta & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{matrix} \right]$$

$$R_x(\theta) = \left[ \begin{matrix} 1 & 0 & 0 & 0 \\ 0 & \cos\theta & -\sin\theta & 0 \\ 0 & \sin\theta & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{matrix} \right]$$

$$R_y(\theta) = \left[ \begin{matrix} \cos\theta & 0 & \sin\theta & 0 \\ 0 & 1 & 0 & 0 \\ -\sin\theta & 0 & \cos\theta & 0 \\ 0 & 0 & 0 & 1 \end{matrix} \right]$$

In a right-handed coordinate system the sine signs for the y axis are mirrored relative to the other two, because the axes follow a cyclic order ($z \times x = y$).
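
To see how these 4 by 4 matrices combine, here is a minimal sketch. The function names are hypothetical, the angle and offsets are arbitrary, and the scaling matrix keeps its last diagonal entry at 1 so that the division by w does not cancel the scale.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def apply(m, v):
    """Apply matrix m to the column vector v (given as a flat list)."""
    return [sum(row[k] * v[k] for k in range(len(v))) for row in m]


def translation(x, y, z):
    return [[1, 0, 0, x], [0, 1, 0, y], [0, 0, 1, z], [0, 0, 0, 1]]


def scaling(n):
    # Last diagonal entry is 1 (an assumption stated in the lead-in).
    return [[n, 0, 0, 0], [0, n, 0, 0], [0, 0, n, 0], [0, 0, 0, 1]]


def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]


# The rightmost matrix acts first: rotate 90 degrees about z,
# then scale by 2, then translate 5 units along x.
m = matmul(translation(5, 0, 0), matmul(scaling(2), rotation_z(math.pi / 2)))

p = [1, 0, 0, 1]
result = [round(c, 9) for c in apply(m, p)]  # rounding hides float noise
print(result)
```

The point (1, 0, 0) is rotated onto the y axis, stretched to (0, 2, 0) and then moved to roughly (5, 2, 0), all through one precomputed matrix `m`.
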

CSS transform

We’re all familiar with the transform property introduced in CSS3, which we can use for animation. It provides convenient operations such as scaling, translation and rotation. Careful readers will know that it also accepts a matrix as its value. Passed around as a bare list of numbers it looks strange, but it is actually the 4 by 4 matrix described above.

Corresponding MDN: developer.mozilla.org/zh-CN/docs/…

We see that MDN writes:

    transform: matrix3d(
      1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 50, 100, 0, 1
    );

The values are listed column by column, so the list reads like the transpose of the matrices we wrote above. Flip it across the main diagonal (top left to bottom right) and it looks exactly like the translation matrix shown earlier, here with x = 50, y = 100 and z = 0.
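
The column-by-column order can be sketched as follows. The helper `to_css_matrix3d` is a hypothetical name used only here; it takes a matrix written row by row, as in this article, and emits the value list in the order `matrix3d()` expects.

```python
def to_css_matrix3d(m):
    """Serialize a 4x4 matrix (list of rows) column by column for CSS matrix3d()."""
    values = [m[row][col] for col in range(4) for row in range(4)]
    return "matrix3d(" + ", ".join(str(v) for v in values) + ")"


# Translation by x = 50, y = 100, z = 0, written row by row.
translation = [[1, 0, 0, 50],
               [0, 1, 0, 100],
               [0, 0, 1, 0],
               [0, 0, 0, 1]]

print(to_css_matrix3d(translation))
# matrix3d(1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 50, 100, 0, 1)
```
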

Exercises

  1. Try using the CSS transform matrix notation to build some animations!
  2. How do you move a point to infinity?