Hyper-intuitive understanding of homogeneous coordinates (WebGL basics)

The mathematics of computer graphics relies heavily on matrix multiplication. Scaling, rotation, and shearing can each be expressed as a matrix multiplication. Translation cannot, at least not directly, because translation is not a linear transformation: it moves the origin, while a linear transformation always maps the origin to itself. The conventional solution is to introduce homogeneous coordinates. Beginners often find homogeneous coordinates unintuitive, so here I try to explain translation in homogeneous coordinates from a very intuitive angle.

Linear transformation and homogeneous coordinate transformation in two dimensions

First we draw a square in a two-dimensional coordinate system:

Suppose the upper right corner of the square is [1,1]. We multiply all the points of the square by the following matrix:

\begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix} \tag{1}

This is of course the identity matrix, so the square does not move. Now let's change the matrix:

\begin{matrix} 1 & 1 \\ 0 & 1 \end{matrix} \tag{2}

Plugging in specific points: [1,1] becomes [2,1], [1,0] stays [1,0], and [0,1] becomes [1,1]. So matrix (2) is a shear transformation that turns the square into a parallelogram.
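The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the helper `mat_vec` and the choice of a unit square with one corner at the origin are assumptions made for illustration, not something the article specifies.

```python
def mat_vec(m, v):
    """Multiply matrix m (a list of rows) by the column vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

# Matrix (2) from the text: a shear along the x axis.
SHEAR = [[1, 1],
         [0, 1]]

# Assumed corners of the square, counter-clockwise from the origin.
square = [[0, 0], [1, 0], [1, 1], [0, 1]]

sheared = [mat_vec(SHEAR, p) for p in square]
print(sheared)  # [[0, 0], [1, 0], [2, 1], [1, 1]]
```

The bottom edge (y = 0) stays put while the top edge (y = 1) slides one unit to the right, which is exactly the parallelogram described above.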

Next we implement translation in two-dimensional space. Following the definition of homogeneous coordinates, we represent every point of the square with three coordinates, for example [1,0] => [1,0,1]. The 3×3 translation matrix is then:

\begin{matrix} 1 & 0 & h \\ 0 & 1 & k \\ 0 & 0 & 1 \\ \end{matrix} \tag{3}

Here h is the shift along the x axis and k is the shift along the y axis: [x,y,1] => [x + h, y + k, 1]. Finally, divide the first two components by w (the appended 1) to get the shifted coordinate [x + h, y + k]. We have now used two-dimensional homogeneous coordinates to perform a two-dimensional translation by matrix multiplication.
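As a rough sketch of these three steps (lift to [x, y, 1], multiply by matrix (3), divide by w), the following Python may help. The function names `translation_2d` and `translate_point` are hypothetical and chosen only for this example.

```python
def mat_vec(m, v):
    """Multiply matrix m (a list of rows) by the column vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def translation_2d(h, k):
    """Build matrix (3): a 3x3 matrix that shifts 2D points by (h, k)."""
    return [[1, 0, h],
            [0, 1, k],
            [0, 0, 1]]

def translate_point(point, h, k):
    x, y = point
    # Lift the 2D point to homogeneous coordinates [x, y, 1], then multiply.
    xh, yh, w = mat_vec(translation_2d(h, k), [x, y, 1])
    # Divide by w to come back to ordinary 2D coordinates.
    return [xh / w, yh / w]

print(translate_point([1, 0], 2, 3))  # [3.0, 3.0]
```

The division by w is a no-op here because w is 1, but keeping it makes the same function work for any nonzero w.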

Here is the key point: the multiplication of a three-dimensional vector by a 3×3 matrix that we used to represent a two-dimensional translation is also a linear transformation in three dimensions! Let's see whether we can visualize the same thing in three dimensions. First let's make a cube:

Now apply matrix (3) with h = 1 and k = 1. The point [1,1,1] becomes [2,2,1] and [1,0,1] becomes [2,1,1]. These points all lie on the top face of the cube. Points on the bottom face, such as [1,1,0], stay where they are because their third component is 0. So the cube becomes this shape:

The top face is shifted sideways while the bottom face stays fixed. This is a shear transformation of the cube. A translation in two dimensions is what remains when we discard the rest of the three-dimensional picture and keep only the top face (the plane w = 1): within that plane, the shear looks exactly like a translation.
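To see the shear numerically, one can apply matrix (3) with h = k = 1 to the corners of a cube. The sketch below assumes a unit cube with one corner at the origin; the corner lists and the helper `mat_vec` are illustrative assumptions.

```python
def mat_vec(m, v):
    """Multiply matrix m (a list of rows) by the column vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

# Matrix (3) with h = 1, k = 1, now read as a 3D linear map.
M = [[1, 0, 1],
     [0, 1, 1],
     [0, 0, 1]]

# Assumed unit cube: bottom face at height 0, top face at height 1.
bottom = [[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]]
top = [[0, 0, 1], [1, 0, 1], [1, 1, 1], [0, 1, 1]]

new_bottom = [mat_vec(M, p) for p in bottom]
new_top = [mat_vec(M, p) for p in top]

print(new_bottom)  # [[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]]
print(new_top)     # [[1, 1, 1], [2, 1, 1], [2, 2, 1], [1, 2, 1]]
```

The bottom face is unchanged, and every corner of the top face has moved by (1, 1) while staying at height 1. Dropping the third component of the top face gives the translated square.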

Note: w = 1 is just a convenient choice. w can be any other nonzero number: 2, 3, 4, and so on. For example, [1,1,1], [2,2,2], and [3,3,3] are all homogeneous coordinates of the two-dimensional point [1,1]. This is what homogeneous means: a single point has infinitely many homogeneous representations, all of which denote the same point. In the diagram, a larger w corresponds to a taller cube. You just have to divide by w at the end to get back.
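The note can be illustrated with a short check that different values of w lead to the same two-dimensional point, both before and after a translation. This is a sketch only; `to_cartesian` is a hypothetical helper name, and the translation by (1, 1) is an arbitrary choice.

```python
def mat_vec(m, v):
    """Multiply matrix m (a list of rows) by the column vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def to_cartesian(hom):
    """Divide by the last component w to recover the ordinary point."""
    *coords, w = hom
    return [c / w for c in coords]

# Three homogeneous representations of the same 2D point [1, 1].
representations = [[1, 1, 1], [2, 2, 2], [3, 3, 3]]
points = [to_cartesian(h) for h in representations]
print(points)  # [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]

# Matrix (3) with h = 1, k = 1 (values chosen for illustration).
TRANSLATE = [[1, 0, 1],
             [0, 1, 1],
             [0, 0, 1]]

moved = [to_cartesian(mat_vec(TRANSLATE, h)) for h in representations]
print(moved)  # [[2.0, 2.0], [2.0, 2.0], [2.0, 2.0]]
```

With w = 2 the matrix produces [4, 4, 2], which looks like a different shift, but dividing by w brings it back to [2, 2], the same result as with w = 1.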

Linear transformation and homogeneous coordinate transformation in three dimensional space

With the two-dimensional case understood, the three-dimensional case is easy. To translate a point in three-dimensional space, we introduce homogeneous coordinates as four-dimensional vectors, such as [1,2,1,1]. The idea is the same, and the corresponding translation matrix is:

\begin{matrix} 1 & 0 & 0 & i \\ 0 & 1 & 0 & j \\ 0 & 0 & 1 & k \\ 0 & 0 & 0 & 1 \end{matrix} \tag{4}

This transformation maps [x,y,z,1] to [x+i, y+j, z+k, 1]; dividing the first three components by w = 1 then gives [x+i, y+j, z+k]. Intuitively, a hypercube in four dimensions is being sheared, and its “top face” (a cube) is what we see in our three-dimensional space.
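The three-dimensional case follows the same pattern as the two-dimensional one, which the sketch below tries to show. The names `translation_3d` and `translate_point_3d` are hypothetical, and the sample shift (1, 1, 1) is an arbitrary choice.

```python
def mat_vec(m, v):
    """Multiply matrix m (a list of rows) by the column vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def translation_3d(i, j, k):
    """Build the 4x4 matrix that shifts 3D points by (i, j, k)."""
    return [[1, 0, 0, i],
            [0, 1, 0, j],
            [0, 0, 1, k],
            [0, 0, 0, 1]]

def translate_point_3d(point, i, j, k):
    x, y, z = point
    # Lift to homogeneous coordinates [x, y, z, 1], then multiply.
    xh, yh, zh, w = mat_vec(translation_3d(i, j, k), [x, y, z, 1])
    # Divide by w to return to ordinary 3D coordinates.
    return [xh / w, yh / w, zh / w]

print(translate_point_3d([1, 2, 1], 1, 1, 1))  # [2.0, 3.0, 2.0]
```

A vector whose fourth component is 0, such as [1, 2, 1, 0], passes through the same matrix unchanged, which mirrors the fixed bottom face of the cube in the two-dimensional discussion.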

If you can't picture this in four dimensions, here is a hint. Suppose you were a two-dimensional being. When a three-dimensional cube passes through the plane you live in, a line segment suddenly appears in your view, keeps changing, and then suddenly disappears. Now, as a three-dimensional being, imagine a hypercube passing through the space we live in: a cube suddenly appears in your view, keeps changing, and then suddenly disappears.
