Scenario

  • In our business there is an online classroom scene: the teacher draws various shapes on an online whiteboard, and the students see what is drawn. You can also think of it as a little game of "draw and guess".

Simple implementation

// Capture the selection event and draw with canvas
const canvas = document.getElementById('canvas');
const ctx = canvas.getContext('2d');
ctx.fillStyle = 'green';
ctx.fillRect(10, 10, 150, 100);
// Collect the drawing data
const mes = { type: 'rect', color: 'green', data: [10, 10, 150, 100] };
// Send it to the other ends through IM signaling (socket)
// The other end listens for the message
const im = new IM();
im.on((mes) => {
  // Call the drawing method
  draw(mes);
});


Core analysis

  • The core of the implementation is to collect the brush data and transmit it to the other ends over the network.
  • Data structure: essentially just the brush type, color, and coordinate positions.

Pain points && difficulties

  • The amount of coordinate data is too large
  • Abstraction: a shape on the canvas is a collection of points, and each point is defined by its X and Y coordinates, e.g. [[1.123, 2.342], [3.123, 4.342]]. A collection of such points can represent different shapes, such as a line
  • The more complex the shape, the more data has to be transferred to the other end
  • The more precise the points, e.g. [1.12387657, 2.342687654], the more faithfully the other ends can reproduce the drawing, and the larger the same set of points becomes
  • In a word: bigger shapes and clearer rendering need more data to back them up. Imagine a payload made of thousands or tens of thousands of high-precision points: that is several megabytes of JSON (see the rough size estimate after this list). And here is the paradox
  • If the message body is too large, transmission is limited, whether you use a third-party IM service or design your own socket back-end API, because the message size is capped
  • An oversized message body also hurts transmission efficiency. This is obvious: under the same network conditions, the smaller the payload, the faster the transfer, let alone across different networks
  • The ultimate experience demands relentless optimization: whether on a weak network or a normal one, efficient data transmission is a hard requirement for the best user experience.
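
To make the size problem concrete, here is a rough, hypothetical estimate I put together (not part of the original business code): it just measures how large the JSON payload of a single complex stroke becomes.

# Rough size estimate for a hypothetical stroke of 10,000 high-precision points
import json
import random

points = [[round(random.uniform(0, 1920), 8), round(random.uniform(0, 1080), 8)]
          for _ in range(10_000)]
message = {"type": "path", "color": "green", "data": points}

payload = json.dumps(message)
# A single stroke like this already weighs in at several hundred KB of JSON
print(round(len(payload) / 1024), "KB")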

Common Solutions

  • Data slicing and paging are the most common techniques. Instead of querying the whole data list at once, you query page by page, which is also the most common pattern in front-end/back-end API design. Even with small amounts of data, paging is part of writing robust code.
  • Standard compression algorithms: gzip is the common choice, or Huffman coding, plus various compression schemes built on domain knowledge (a minimal gzip sketch follows this list)
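
As a quick illustration of the gzip route, here is a minimal sketch of my own (Python's built-in gzip module, hypothetical stroke data), not the project's actual transport code:

import gzip
import json

# Hypothetical stroke data: the repetitive JSON structure compresses well
points = [[i * 0.5, i * 0.25] for i in range(10_000)]
raw = json.dumps({"type": "path", "data": points}).encode("utf-8")

compressed = gzip.compress(raw)
# Typically a several-fold reduction for this kind of structured data
print(len(raw), "->", len(compressed), "bytes")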

Common compression algorithm analysis

  1. Substitution: take 'hello lile' and 'hello hanmeimei'. The word 'hello' appears in both sentences, so we can replace it with a token such as h1, turning the data into 'h1 lile' and 'h1 hanmeimei'. That alone achieves some compression
  2. Binary: the substitution above is not extreme enough. What if the codes are written in 0s and 1s? The data gets even smaller.
  3. Based on data statistics: analyze the whole text to be compressed. The more often 'hello' appears, the shorter its replacement code should be, and the better the compression
  4. Based on empirical data: for example, if a certain phrase always appears as one unit in every article, can we treat it as a single word? Or if the sequence 1, 2, 3, 4, 5 always appears together in the data, can we infer the neighboring values from just the 3?
  5. Honestly, I am no expert either; this is just my shallow understanding, haha!! (A toy substitution sketch follows this list.)
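
To make points 1–3 a bit more concrete, here is a toy sketch I wrote (not a real compression library): substitute each word with a code, giving the most frequent words the shortest codes.

from collections import Counter

text = "hello lile hello hanmeimei hello world"

# More frequent words get smaller (shorter) replacement codes
freq = Counter(text.split())
codes = {word: f"#{i}" for i, (word, _) in enumerate(freq.most_common())}

compressed = " ".join(codes[w] for w in text.split())
print(compressed)  # e.g. "#0 #1 #0 #2 #0 #3"

# Decompression is just the reverse lookup
reverse = {v: k for k, v in codes.items()}
restored = " ".join(reverse[c] for c in compressed.split())
assert restored == text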

Going a step further

  • Is this where it ends for me?
  • Then one day it hit me: there is a scenario where machine learning can shrink several gigabytes of HD video down to hundreds of megabytes and then restore it for playback. Isn't that scenario very similar to mine?

Practice what you know

  • Think about it: our scenario is definitely not a media-compression scenario, but it is close
  • Simplify step by step: video is clearly not the exact target scenario, so reduce the dimensionality of the problem. Turn video into images, then images into graphics, lowering the complexity step by step: from compressing video, down to compressing images, down to compressing graphics.
  • Abstract thinking: what is the essence of an image? If it is a 25×25 pixel image, how should it be represented mathematically? Is it a 25 by 25 matrix where each entry is a pixel value from 0 to 255? Too abstract? Let's see (a small sketch follows this list)

  • Begin with the end in mind: suppose an image can be expressed as abstract data like [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]. That looks quite similar to the earlier brush data [[1.123, 2.342], [3.123, 4.342]]. Since video can be compressed, images can be compressed too, and the even simpler brush data can certainly be compressed
  • Found some material: Google Docs 1, Google Docs 2, Google Docs 3 and more!!
  • But after reading them I was still confused and did not understand (math really does set the ceiling for programmers)
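
To make the "image is just a matrix" idea concrete, here is a small sketch of my own (assuming a 25×25 grayscale image) that puts the two representations side by side:

import numpy as np

# A 25x25 grayscale image: one 0-255 value per pixel
image = np.random.randint(0, 256, size=(25, 25))
print(image.shape)   # (25, 25)

# A brush stroke: a list of (x, y) coordinate pairs
stroke = np.array([[1.123, 2.342], [3.123, 4.342]])
print(stroke.shape)  # (2, 2)

# Both are just numeric arrays, so compression ideas for images can inspire brush data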

Give up? No way!

  • Programmer or not, I believe problem-solving ability is a person's core competency in the workplace. One of my favorite interview answers is "I don't know, but let me reason about it", rather than giving up as soon as the question appears. That interview attitude is also a work attitude: don't leave unsolvable-looking problems unsolved. Right or wrong, both the workplace and life are, to some degree, results-oriented. But I digress!
  • After reading all that, did I really come away with nothing? At least I learned a new term: pseudocode
  • Isn't the goal clear? We need to compress the data, transmit it, receive it, restore it, and render it. The core problem is one of input and output! Stage one: big data in, small data out. Stage two: small data in, big data out!
  • The core implementation is roughly as follows
// Drawing end
/** @outputDraw: compressed data  @inputDraw: input brush data  @D: compression function */
const outputDraw = D(inputDraw)
// Display end
/** @outputShow: the original data to display (restored by decompression)  @outputDraw: incoming compressed data  @E: decompression function */
const outputShow = E(outputDraw)
  • It feels like the code is done

Concrete implementation idea

  • Model 1: we need an encoder that compresses a large amount of brush data into a small amount of data
  • Dimensionality reduction: represent the previous two-dimensional brush data with one-dimensional data. For example, can a length and width of [3, 4] be represented by the area 12? Of course the real mapping is not simply a product, but if the brush points can be represented in one dimension, the data volume is cut in half
  • Discard: drop a number of points along a line, reducing the data volume again (a minimal sketch follows this list)
  • Machine learning model: use the original brush data as the training input and the reduced, discarded data as the prediction target, and we have a machine learning model that compresses the data
  • Model 2: we need a decoder that restores the small amount of data back to the original brush data
  • Machine learning model: swap the training input and the prediction target and we get the decoder we need, i.e. a machine learning model that decompresses the data
  • The result: based on the user's massive brush data we can build a machine learning model with a compression ratio of at least 2x, and then compress again on top of that with traditional compression, omg!!

  • Expand the imagination: is this limited to brush data? Definitely not!! Could live video streaming also use this to boost transmission efficiency, like the plot of the TV show Silicon Valley?
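
For the "discard" step above, here is a minimal sketch of my own (simply keeping every fourth point of a stroke); the real project would pick points more carefully:

import numpy as np

stroke = np.array([[1, 2], [3, 4], [5, 6], [7, 8],
                   [9, 10], [11, 12], [13, 14], [15, 16]])

# Keep only every 4th point and discard the rest
kept = stroke[::4]
print(kept)  # [[ 1  2]
             #  [ 9 10]]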

Code implementation (parameter tuning is omitted)

# Dependencies
import tensorflow as tf
from sklearn.decomposition import PCA
import numpy as np

# Data source
# First line
x1 = [(1, 2), (3, 4), (5, 6), (7, 8)]
# Second line
x2 = [(9, 10), (11, 12), (13, 14), (15, 16)]
# Source data for dimensionality reduction
x = x1 + x2
x = np.array(x)
print(x)
# [[ 1  2]
#  [ 3  4]
#  [ 5  6]
#  [ 7  8]
#  [ 9 10]
#  [11 12]
#  [13 14]
#  [15 16]]
pca = PCA(n_components=1)
y = pca.fit_transform(x)
# Use one-dimensional data to represent the two-dimensional brush point data
print(y)
# [[ 9.89949494]
#  [ 7.07106781]
#  [ 4.24264069]
#  [ 1.41421356]
#  [-1.41421356]
#  [-4.24264069]
#  [-7.07106781]
#  [-9.89949494]]
 
# Encoder model
# Raw brush data
x_train = np.array([[[1, 2], [3, 4], [5, 6], [7, 8]], [[9, 10], [11, 12], [13, 14], [15, 16]]])
# Data after dimensionality reduction and discarding
y_train = np.array([[[9.89949494], [1.41421356]], [[-1.41421356], [-9.89949494]]])
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(4, 2)),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(2, activation='relu'),
    tf.keras.layers.Reshape((2, 1))
])
print(model.output_shape)
print(model.input_shape)
model.compile(
    optimizer='adam',
    loss='mse',
    metrics=['mae', 'mse']
)
model.fit(x_train, y_train, epochs=100, batch_size=1)
z = [[(1, 2), (3, 4), (5, 6), (7, 8)]]
train_datas = np.asarray(z)
predictions = model.predict(train_datas)
print(predictions)
# [[[9.896617]
#   [0.      ]]]
# The original data [[(1, 2), (3, 4), (5, 6), (7, 8)]] is compressed into [[[9.896617], [0.]]]
  
# Decoder model
# Compressed data
x_train = np.array([[[9.89949494], [1.41421356]], [[-1.41421356], [-9.89949494]]])
# Restored (target) data
y_train = np.array([[[1, 2], [3, 4], [5, 6], [7, 8]], [[9, 10], [11, 12], [13, 14], [15, 16]]])
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(2, 1)),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(256, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dense(32, activation='relu'),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(8, activation='relu'),
    tf.keras.layers.Reshape((4, 2))
])
print(model.output_shape)
print(model.input_shape)
model.compile(
    optimizer='adam',
    loss='mse',
    metrics=['mae', 'mse']
)
model.fit(x_train, y_train, epochs=100, batch_size=1)
z = [[[9.896617], [0]]]
train_datas = np.asarray(z)
predictions = model.predict(train_datas)
print(predictions)
# [[[1.1542983 0.       ]
#   [3.2397153 4.428882 ]
#   [0.        6.42444  ]
#   [7.5104036 8.566071 ]]]
# The compressed data [[[9.896617], [0]]] is restored to [[[1.1542983, 0.], [3.2397153, 4.428882], [0., 6.42444], [7.5104036, 8.566071]]]


Conclusion

  • Technology has to be grounded in the scenario. Whether it is a design pattern or machine learning, it must serve the current business scenario. You can't say "I learned machine learning, so the back-office admin system has to use machine learning" just because you want to!
  • Breadth of skills and knowledge gives you more candidate solutions; the job is to find the best one among them!
  • Whether based on CNNs, RNNs, or truly professional machine learning practice, a proper solution will surely beat what I wrote here. That is the direction I keep working toward. The best time to do something well was ten years ago; the second best time is now!
  • I hope to become the best at machine learning within the front-end field. Come on, come on!