How to implement a word cloud

This article is from Bean skin Fan – How to achieve a word cloud

What is a word cloud?

Tag cloud or word cloud is the visual description of keywords, which is to highlight the keywords with high frequency in the text visually, form keyword cloud or keyword rendering, so as to filter out a large amount of text information, so that the web browser can appreciate the main idea of the text as long as a glance through the text.

Students who are not familiar with the word cloud can join our “visualization team”, “Doupi Fan” background reply group, welcome to consult and exchange, we will make a visual library together, check the word cloud demo for understanding.

Steps to break up

Drawing a word cloud can be roughly divided into the following steps

Data processing: Map the information in the data to the drawing properties of words, such as size, color, word weight, etc.
Layout algorithm: calculates the placement of each word.
Draw: To draw the calculated information onto the canvas.

Implementation approach

The implementation of the first step will not be expanded here, but suppose we have a set of processed data in the following format:

Const data = [{text: 'snail noodle ', fontSize: 40, color: 'red'}, {text: 'red', fontSize: 35, color: 'blue'}, {text:' snail noodle ', fontSize: 35, color: 'blue'}, {text: 'Rou jia mo ', fontSize: 35, color: 'blue'}, {text:' fried sauce noodles ', fontSize: 32, color: 'blue'}, {text: 'shaxian snacks ', fontSize: 25, color: 'blue'}, {text: 'beancurd ', fontSize: 23, color:' beancurd '}, {text: 'beancurd ', fontSize: 23, color:' beancurd '} 'blue'}, {text: 'chicken ', fontSize: 20, color: 'red'}, {text:' hot ', fontSize: 19, color: 'blue'}, {text: FontSize: 15, color: 'blue'}, {text: 'monkey ', fontSize: 12, color: 'blue'}, {text:' monkey ', fontSize: 11, color: 'monkey'}, {text: 'monkey ', fontSize: 11, color:' monkey '}, {text: 'monkey ', fontSize: 11, color:' monkey '}, {text: 'monkey ', fontSize: 11, color:' monkey '}, 'red'}, {text: 'mash ', fontSize: 10, color: 'blue'}]Copy the code

All we need to do is sort the words in order of weight from highest to lowest, for each word:

Select an initial location
Try placing to see if it overlaps with words already placed. If it can be put down, record the word placement coordinates and try to place the next word; If not, move to the next location according to the layout logic and try again until you can drop or reach the outermost edge of the placement (that is, it is no longer possible to drop the word in the following position).

This loop until all the words are tried, at this time you can get an array of words to be placed, and finally go through the array and draw it to the canvas according to the coordinate, color, font size and other information of the words.

The flow chart is as follows:

The key problem

According to the above ideas, to achieve a simple word cloud, at least two key problems need to be solved:

Text layout algorithm, which determines the path of the word to try to place, that is, to get the value of the next place coordinate.

Text collision algorithm, overlapping judgment when placing attempts, it determines whether text can be placed.

Text layout algorithm

Under normal circumstances, the layout of the word cloud starts from the center and gradually expands to the periphery in a ring, forming the effect that the weight of the text gradually decreases from the middle to the periphery.

As shown in the figure below, most of the words with significant weight are distributed near the center, and the farther out the words are, the lower the weight is, and the whole expands outward in a circular way.

Spiral of Archimedes

The Archimedes spiral (also known as the “constant speed spiral”) can easily achieve the above layout effect. This spiral starts from the center and rotates outward, and the spacing of each arm is always equal. We can take points on the cantilever as placement coordinates, start from the center point, and place words evenly from the center to the periphery along the cantilever. Its curve is drawn as follows:

The relevant formula

The equation of Archimedes spiral is as follows:

Polar equation: r = a + b theta {\ displaystyle \, r = a + b \ theta} r = a + b theta

Cartesian coordinate formula:

$X = (a + b ∗ theta) ∗ cosine theta$

$Y = (a + b ∗ theta) ∗ sine theta$

Where, A is the distance between the starting point and the polar coordinate center, and b is the pitch between the control spirals. The larger b is, the faster the radius r grows, and the spirals are sparser. By increasing the value of θ, the placement point can be obtained from the inside out of the spiral arm.

Code implementation

Implement archimedeanSpiral to get coordinate points, and the paintSpiral function is used to draw spiral lines to aid observation.

@param {*} size canvas size, [width, height] * @param {*} {step = 0.1, b = 1, @returns */ export function archimedeanSpiral(size, {step = 0.1, b = 1, a = 0 } = {}) { const e = size[0] / size[1]; Return function(t) {return [e * (a + b * (t *= step)) * math.cos (t), (a + b * t) * Math.sin(t)]; }; } @param {*} size, [width, height] @param {*} getPosition layout function, @param {*} Params {showIndex} Whether to display serial number */ export function paintSpiral (size, getPosition, {showIndex = false} = {}) {const points = [] MaxDelta = math.sqrt (size[0] * size[0] + size[1] * size[1]), // x coordinates dy; // Set the step size to step * 1. While (dxdy = getPosition(t += 1)) {dx = dxdy[0] dy = dxdy[1] if (math.min (math.abs (dx), Math.abs(dy)) >= maxDelta) break; // (dx, dy) is farther from the center than maxDelta, Return false points. Push ([dx, dy, Const canvas = document.createElement('canvas') Canvas.width = size[0] Canvas.height = size[1] canvas.style.width = size[0] canvas.style.height = size[1] const ctx = canvas.getContext('2d') ctx.fillStyle = '#f11'; ctx.strokeStyle = 'black'; Let last = [0, 0] for(let points) {ctx.beginPath(); ctx.moveTo(last[0] + size[0] / 2, last[1] + size[1] / 2) ctx.lineTo(point[0] + size[0] / 2, point[1] + size[1] / 2) last = point ctx.stroke(); ctx.beginPath(); ctx.arc(point[0] + size[0] / 2, point[1] + size[1] / 2, 2, 0, 2 * Math.PI, false); ShowIndex && ctx.fillText(point[2], point[0] + size[0] / 2, point[1] + size[1] / 2) ctx.fill() } document.body.append(canvas) }Copy the code

The plot

Call paintSpiral function to draw. The red circular marker points are the placement coordinates we obtain, and the black line is used to connect the placement points to see the shape of the spiral (only the placement points are needed in actual use).

// Canvas width/height const CANVAS_SIZE = [500, 500] // Draw spiral const getPosition = archimedeanSpiral(CANVAS_SIZE, {step: 0.1, b: 1 }) paintSpiral(CANVAS_SIZE, getPosition, { showIndex: false })Copy the code

In order to facilitate observation, increase the pitch and step length, draw a relatively sparse spiral, and mark the placement sequence of points.

const getPosition = archimedeanSpiral(CANVAS_SIZE, { step: 1, b: 10 })

paintSpiral(CANVAS_SIZE, getPosition, { showIndex: true })
Copy the code

It can be seen that when the pitch is increased, the distance between the spiral lines in each turn is further, while when the step size is adjusted, the number of mark points in each turn is reduced. Next, try to arrange the text in the order of placement points.

Implement a drawWords function to place terms according to the layout function.

/ * * * according to Archimedes spiral drawing words * @ param * @ {*} data vocabulary data param {*} getPosition layout function * / const drawWords = (data, the size, getPosition, ) => { let t = 0 const { context, canvas } = createCanvas(size[0], size[1]) data.forEach((word, index) => { const [dx, dy] = getPosition(t += 1) word.x = size[0] / 2 + dx word.y = size[1] / 2 + dy word.fontSize = Math.floor(word.fontSize /  2) word.text = `${index}-${word.text}` drawText(context, word) }) document.body.appendChild(canvas) }Copy the code

Draw spiral and vocabulary

// Draw a spiral for comparison const getPosition = archimedeanSpiral(CANVAS_SIZE, {step: 1, b: 10 }) paintSpiral(CANVAS_SIZE, getPosition, { showIndex: DrawWords (data, size, getPosition) drawWords(data, size, getPosition)Copy the code

Words can now be arranged in a spiral shape, but without collision detection, words placed near each other overlap. Then you just need to know whether the words will overlap and you can try to place them along the spiral until all the words have been tried.

Collision detection algorithm

Collision detection can be implemented in many ways, and we use per-pixel comparison. An array is used to record the occupancy of each pixel in the entire canvas, and each word keeps its own occupancy information when initialized. When placing a word, the pixel occupancy information of the word is compared with the pixel information of the corresponding position on the canvas. After the text is placed, update the pixel occupancy information for the corresponding position of the canvas.

To facilitate comparison and manipulation, a one-dimensional array is used to store pixel information, a board array is initialized globally to store the pixel occupancy of the entire canvas (length is the width and height of the canvas), and a new Sprite array is created for each term to store the pixel occupancy of its own text (length is the width and height of the text).

Collision detection process

Suppose the variable board stores the entire canvas pixel information, and each word uses Sprite to store its own pixel occupancy information.

Below is a simple diagram of placing the “L” word, with the board array on the left and the Sprite array of words on the right. You first need to find the point to place according to the text layout function. As the first point according to the layout function, in the center of the canvas.

The red dot is the position to try to place in the canvas. According to the coordinates placed and the width and height information of the text, the corresponding pixel range in the board (in the green box) can be calculated. The Sprite array is traversed and the pixels in the Sprite and the pixels in the green box of the board are compared one by one. Obviously, the word “L” in the figure below overlapped with the existing “H” in the canvas, so “L” cannot be placed at the red dot. Call the layout function to find the next place to try.

After several failed attempts, the position of the red dot in the figure below is found. After comparison, it is found that there is no overlap, so the word “L” can be placed at the red dot, and the final drawing coordinate X and Y of the word “L” can be updated.

After the “L” coordinate information is updated, meaning that the position of the word “L” has been determined when the canvas is finally drawn, the pixel occupancy information of “L” is updated to the board array. Then proceed to the next word placement attempt until all words have been placed.

How pixel data is stored

Since pixels on the canvas are two-dimensional information, and we use one-dimensional array for storage, width information needs to be recorded during preservation, that is, several elements are in a row to restore their position information on the two-dimensional canvas. 1 means occupied, and 0 means not occupied.

Taking an “L” letter as an example, assuming that the bounding box of the word “L” is 8 in width and 11 in height, we can create a one-dimensional array with a length of 88 (8 * 11) to store the pixel occupancy and record the word width of 8. The diagram below:

In this case, the time complexity of once detection is wordWidth∗wordHeightwordWidth * wordHeightwordWidth∗wordHeight

Use binary to store pixel information

The number of pixels on a canvas can be very large. For example, a canvas with a resolution of 1,500 * 500 has 250,000 pixels. If an integer is used to store information for one pixel, an array with a length of 250,000 is required to store information for the entire canvas. Operating on a large array results in a larger memory footprint and a less efficient traversal.

In the collision detection scenario, we only need to record “occupied” and “not occupied” for each pixel. These two states can be represented as binary ones and zeros, so we can use integers to represent a 32-bit binary number, where 1 means occupied and 0 means not occupied.

For a 500 by 500 canvas, you only need an array of 7812 to save it. Taking the same “L” letter as an example, after optimization, only an array of length 8 is needed to store the Sprite information of the word “L”. The diagram below:

In this case, the time complexity of placement detection is: wordWidth∗wordHeight/32wordWidth * wordHeight/32wordWidth ∗wordHeight/32

Visually view pixel information

To visualize the amount of pixels stored in the array, write a printPixelArray function to print the values of the array in two-dimensional form.

The array print function is implemented as follows:

@param {*} board * @param {*} w * @returns */ export const printPixelArray = (board, w) => { let bitStr = '' let intStr = '' for (let i = 0; i < board.length / w; i++) { for (let j = 0; j < w; j++) { bitStr += `${(board[i * w + j] >>> 0).toString(2).padStart(32,'0')}|` intStr += `${board[i * w + J] toString (10). PadEnd (32)} | `} / / integer format bitStr + = '\ n' / / binary format intStr + = '\ n'} return {bitStr, intStr}}Copy the code

With the word “spiral lion powder”, for example, here is an array “spiral lions powder” Sprite value printed as a result, according to the width of the word wrap, | between every integer division. You can see that the one-dimensional array has been reduced to a two-dimensional plane, and the spiral noodle row uses six integers to record pixel information.

By converting integers into binary formats for display, pixel occupancy can be observed more intuitively.

Converting integers to binary makes it clear how much each pixel is occupied. However, since the area occupied by the string is too large for debugging, we write a paint function to draw the pixel usage of the array to a canvas of equal proportions.

The implementation code is as follows:

/** * Paint canvas * @param {*} board * @param {*} paintSize */ export const paint = (board, paintSize) => { const curSize = paintSize const imageData = new ImageData(curSize[0], curSize[1]); let array = imageData.data for (let i = 0; i < curSize[1]; i++) { for (let j = 0; j < (curSize[0] >> 5); j++) { let value = board[i * (curSize[0] >> 5) + j] for (let k = 0; k < 32; Const MSK = 0b1 << (32-k) if (value & MSK) {// For (let l = 0; l < 4; l++) { array[i * curSize[0] * 4 + j * 32 * 4 + k * 4 + l] = 255; }} else {// Do not occupy pixels, fill black for(let l = 0; l < 3; l++) { array[i * curSize[0] * 4 + j * 32 * 4 + k * 4 + l] = 0; } array[i * curSize[0] * 4 + j * 32 * 4 + k * 4 + 3] = 255; If (k === 0) {array[I * curSize[0] * 4 + j * 32 * 4 + k * 4 + 0] = 255; array[i * curSize[0] * 4 + j * 32 * 4 + k * 4 + 1] = 0; array[i * curSize[0] * 4 + j * 32 * 4 + k * 4 + 2] = 0; } } } } const canvas = document.createElement('canvas') canvas.width = curSize[0] canvas.height = curSize[1] const ctx =  canvas.getContext('2d') ctx.putImageData(imageData, 0, 0) canvas.style.marginRight = '10px' document.body.appendChild(canvas) }Copy the code

// Paint (word.sprite, [word.w, word.h])Copy the code

The drawing effect is shown as follows:

“Occupied” pixels are drawn in white, “unoccupied” pixels are drawn in black, and the red vertical line is the dividing line of each element in the array, that is, the occupied information of 32 pixels stored as an integer between the two red vertical lines.

Initialize pixel information

The global initialization variable board is used to store pixel information for the entire canvas, the length of board is the width * height of the canvas to be drawn, and the initial padding is 0(no words are placed on the canvas).

Const board = new Array(size[0], size[1]).fill(0)Copy the code

In order to obtain the pixel information of the word, it is necessary to calculate the word width and height, draw the word on the canvas, and then use the method ctx.getimageData (Sx, SY, SW, SH) to obtain the pixel information. Its four parameters are starting point X coordinate, starting point y coordinate, intercept width, intercept height.

ImageData’s data uses four integers to represent the color of a pixel. The default values for the parts not drawn are 0, 0, 0, 0. We just need to know if the current pixel is occupied, so we just need to take the value of alpha, 1 for occupied, 0 for occupied.

Measure the width of the text using the CTx. measureText method. To avoid truncation, use the size * 2 as the word height. The width and height of the text determines the size of the Sprite array.

In order to minimize the operation of canvas to save performance, the scheme of obtaining pixel information is similar to that of CSS Sprite diagram. First initialize a large canvas, and then draw text on a large canvas as much as possible ata time. Use ctx.getimageData (0, 0, canvas width, canvas height) to get the pixel information array of the entire canvas, and then draw coordinate and width and height information according to the text. The pixel occupancy information corresponding to the text is captured in the entire canvas array and saved in the Sprite array of the term.

Note that the Sprite of words is not acquired all at once. When you try to place a term, you try to get the corresponding Sprite for that term, and if the Sprite is not initialized, you start a round of Sprite initialization with the current term as the starting index. The initial canvas size is 2048 * 2048, stop drawing when no more can be drawn, update the drawn term Sprite, and then try to place it. The first batch Sprite fetch is made until the Sprite that placed the word does not exist.

Get word pixel occupancy information (Sprite array) flowchart:

Unable to copy content being loaded

Code implementation:

@param {*} contextAndRatio canvas context * @param {*} d word information * @param {*} data all words * @param {*} Index */ function cloudSprite(contextAndRatio, d, data, di) {// If (d.sprite) return; Var c = contextAndRatio. Context, ratio = contextAndRatio. Ratio; c.clearRect(0, 0, (cw << 5) / ratio, ch / ratio); Var x = 0, y = 0, maxh = 0, n = data.length, w, var x = 0, y = 0, maxh = 0, n = data.length, w, var x = 0, y = 0, maxh = 0, n = data. --di; while (++di < n) { d = data[di]; c.save(); c.font = d.style + " " + d.weight + " " + ~~((d.size + 1) / ratio) + "px " + d.font; // set the text attribute w = c.easureText (d.ext + "m").width * ratio; // get the text width h = d.size << 1; // Since there is no API to get the height of text, the default height is fontSize * 2 to ensure the integrity of the pixels captured. In order to improve the width and height of the system, figure out if (D.otate) {var sr = math.sin (D.otate * cloudRadians), cr = math.cos (D.otate * cloudRadians), wcr = w * cr, wsr = w * sr, hcr = h * cr, hsr = h * sr; w = (Math.max(Math.abs(wcr + hsr), Math.abs(wcr - hsr)) + 0x1f) >> 5 << 5; h = ~~Math.max(Math.abs(wsr + hcr), Math.abs(wsr - hcr)); } else { w = (w + 0x1f) >> 5 << 5; } // if (h > maxh) maxh = h; If (x + w >= (cw << 5)) {x = 0; if (x + w >= (cw << 5)) {x = 0; y += maxh; maxh = 0; } if (y + h >= ch) break; C ranslate((x + (w >> 1))/ratio, (y + (h >> 1))/ratio); if (d.rotate) c.rotate(d.rotate * cloudRadians); c.fillText(d.text, 0, 0); if (d.padding) { c.lineWidth = 2 * d.padding; c.strokeText(d.text, 0, 0); } c.restore(); // when the word is drawn, record its relative position and range on the canvas. d.height = h; d.xoff = x; d.yoff = y; X0, x1, y0, y1 are the relative coordinates of the four angles with respect to the central point d.x1 = w >> 1; d.y1 = h >> 1; d.x0 = -d.x1; d.y0 = -d.y1; d.hasText = true; // move x to the right and wait for the next word to draw x += w; Var pixels = c.getimageData (0, 0, (CW << 5)/ratio, ch/ratio).data, Sprite = []; While (--di >= 0) {d = data[di]; if (! d.hasText) continue; w = d.width; w32 = w >> 5; h = d.y1 - d.y0; // Zero the buffer for (i = 0; i < h * w32; i++) sprite[i] = 0; x = d.xoff; if (x == null) return; y = d.yoff; var seen = 0, seenRow = -1; Sprite for (j = 0; Sprite for (j = 0; j < h; j++) { for (i = 0; i < w; // In pixels, only the alpha channel is selected. Var k = w32 * j + (I >> 5), m = Pixels [((y + j) * (CW << 5) + (x + I)) << 2]? 1 << (31 - (i % 32)) : 0; sprite[k] |= m; / / update the Sprite corresponding pixel information seen | = m; If (seen) seenRow = j; if (seen) seenRow = j; Else {// If no shading is found on the current line, omit the line (height --, y coordinate ++, upper left relative coordinate ++) d.y0++; h--; j--; y++; } } d.y1 = d.y0 + seenRow; D.sprite = sprite.slice(0, (d.y1-d.y0) * w32); }} // Get word width and height, left and right boundary coordinates, pixel occupancy, etc. data.forEach((word, index) => cloudSprite(contextAndRatio, word, data, index))Copy the code

The drawn canvas is displayed as follows:

Here is a Sprite array of words drawn using the paint function:

The processed word object is as follows:

const word = { w: number; / / h: wide number; / / high x0: number; // x left offset, <= 0 x1: number; Y0: number; // y <= 0 y1: number; Sprite: number[]; Sprite: number[]; // The array length is w * h / 32 x: number; // Draw coordinate y: number; // Draw coordinates}Copy the code

Offset processing for collision detection

Collision detection calculation

Once you have the Sprite information for the word, you can use it for collision detection with the board.

The important thing to note here is that there are now two concepts of units.

The units of a real canvas, called pixels

The text drawing coordinates obtained through the layout function are in pixels.

The smallest unit of storage, an integer, records 32 pixels

When calculating whether words overlap, the smallest unit is the whole number. When determining whether there are pixels overlapping in two integer units, only the “and” operation is performed on the two integers. If the result is 1, it indicates overlap; if the result is 0, it indicates no overlap.

In collision detection, it is usually necessary to perform bit operation on integers to judge overlap and obtain specific values.

Basic knowledge of bit operation

Next, we may need to use some simple bit operations. Let’s review them first

Suppose we have two binary numbers A and B, 0b being the binary prefix

A = 0b1001

B = 0b0011
Copy the code

The bitwise and (&)

Perform the “and” operation on each digit, and the result is 1 only when the corresponding position is 1; otherwise, the result is 0.

The and operation can be used for pixel comparison.

Bitwise or (|)

Perform the “or” operation on each digit, and the result is 0 if one of the corresponding positions is 0, otherwise it is 1.

The or operation can be used to combine two integers into one.

Left-shift operator (<<)

Each binary all left shift a number of bits, high discard, low fill 0

Right shift operator (>>)

Each binary all right shift several bits, to the unsigned number, high – position complement 0.

Left and right shifts can be used to intercept the left or right portion of the value.

In the above collision detection, it is assumed that the calculation is carried out in the unit of pixels. In the case of pixels, it only needs to be found in the board

Generic processing of lexical collision detection

In actual word placement, word coordinates are in pixels, which will cause that in collision detection, the integer of board and the integer of Sprite are misplaced, and the “and” operation cannot be directly carried out to obtain the collision result.

At this point, pixel information of the corresponding position needs to be extracted to form a new integer to be calculated with the value in the board.

In the figure above, you actually need to compare the pixels within the yellow transparent rectangle (canvas) with the green rectangle (word). For this case, the need for 1, 2, 3 respectively in the column of pixels, because words pixel rectangle with rectangular existence of dislocation of the canvas, with the first 1 as example, the need to remove the word B area of the pixel, on the left zero padding, forming a new integer, then the corresponding position of the integer arithmetic is canvas. For the second column, take the pixels to the right of the first integer of the word and the pixels to the left of the second cell to form an integer calculation.

For a 32-bit binary number, we can conveniently use >> or << to preserve the left part of the information or to preserve the right part of the information, respectively after the calculation of a new or operation to get a new 32 bits.

Get the pseudocode occupied by the first line of pixels in the red transparent rectangle:

// Set wordSpriteLeft as the first value in line 1 // wordSpriteRight as the second value in line 1 // Set offset to x // Get the right part of the first value const leftPartInBoard = wordSpriteLeft < < (32 - x) / / for the second numerical part on the left side of the const rightPartInBoard = wordSpriteRight > > x / / merge of new numerical const newValue = leftPartInBoard | Return onCollide = oncollide & board[I]Copy the code

Collision detection code implementation

@collide (@param {*}) {keep keep keep keep keep keep keep keep keep keep keep keep keep keep keep keep keep keep keep keep keep keep keep keep keep board, sw) { sw >>= 5; Var Sprite = tag. Sprite, w = tag.width >> 5, lx = tag.x - (w << 4), Sx = lx&0x7f, // Word offset (px), MSX = 32-sx, // Number of elements to remove from Sprite h = tag.y1-tag.y0, X = (tag.y + tag.y0) * sw + (lx >> 5); For (var j = 0; j < h; j++) { last = 0; for (var i = 0; i <= w; // (last = Sprite [j * w + I]) >>> sx (last = Sprite [j * w + I]) // Perform "or" operations on the above two parts to form the complete 32-bit pixel information // perform "and" operation on the newly merged number and the corresponding array of board, if the value is 0, it does not overlap, return true, Otherwise it returns false if (((the last < < MSX) | (< I w? (last = sprite[j * w + i]) >>> sx : 0)) & board[x + i]) return true; } x += sw; } return false; }Copy the code

Place word function code implementation

Export const place = (board, word, bounds, size, getPosition) => {const startX = word.x; // The initial value is the midpoint of canvas x const startY = word.y; Const maxDelta = math.sqrt (size[0] * size[0] + size[1] * size[1]) const s = getPosition; // Archimedes spiral function const dt = math.random () <.5? 1:1; let t = -dt; let dxdy, dx, dy; while (dxdy = s(t += dt)) { dx = ~~dxdy[0]; // x offset dy = ~~dxdy[1]; If (math.min (math.abs (dx), math.abs (dy)) >= maxDelta) break; if (math.min (math.abs (dx), math.abs (dy)) >= maxDelta) break word.x = startX + dx; Y = startY + dy; word.y = startY + dy; / / retrieve text words in the y coordinate of the canvas / / beyond the canvas to skip the if (word. X + word. X0 < 0 | | word. The y + word. Y0 < 0 | | word. X + word. X1 > size [0] | | word.y + word.y1 > size[1]) continue; // If (! bounds || ! Cloudmind (word, board, size[0]) {if (! Bounds | | collideRects (word, bounds)) {/ / words of pixel will take the information update to the board, let Sprite = word. The Sprite, w = word. Width > > 5, Sw = size[0] >> 5, lx = word. X - (w << 4), sx = lx & 0x7f, H = word.y1-word.y0, x = (word.y + word.y0) * sw + (lx >> 5), last; // for (let j = 0; j < h; j++) { last = 0; for (let i = 0; i <= w; i++) { board[x + i] |= (last << msx) | (i < w ? (last = sprite[j * w + i]) >>> sx : 0); } x += sw; } word.sprite = null; // Can place the word return true; }}} // This word cannot place return false; }Copy the code

Draw the word cloud

Rendering function

/** * const renderWordCloud = (size, renderWordCloud); data) => { const center = [size[0] / 2, size[1] / 2] const results = [] const board = new Array((size[0] >> 5) * size[1]).fill(0); const getPosition = archimedeanSpiral(size); // const getPosition = archimedeanSpiral(size, {step: 1, b: 10}); let bounds = null let i = 0 // data = data.map((data, i) => {data.text = `${i}${data.text}`; return data}) while (i < data.length) { var d = data[i]; D.x = center[0] d.y = center[1] cloudSprite(cloudSpriteCanvasInfo, d, data, I, size[0] >> 5, size[1]); if (d.hasText && place(board, d, bounds, [...size], getPosition)) { results.push(d); if (bounds) cloudBounds(bounds, d); else bounds = [{x: d.x + d.x0, y: d.y + d.y0}, {x: d.x + d.x1, y: d.y + d.y1}]; // Temporary hack d.x -= size[0] >> 1; d.y -= size[1] >> 1; } i++ } const resultCanvasInfo = createCanvas(size[0], size[1]) results.map(word => { word.x = word.x + center[0]; word.y = word.y + center[1]; return word }).forEach(word => drawText(resultCanvasInfo.context, word)) paint(board, size) document.body.appendChild(resultCanvasInfo.canvas) }Copy the code

A simple word cloud does the trick

Other functions supported

Text size is adaptive

In practical use, the word with the largest weight will be long, and the word cannot be placed on the canvas due to the large size. To solve this problem, you can scale the whole thing by expanding the board array and trying again until the maximum scale ratio is reached or all words can be placed. After obtaining all the word placement coordinates, multiply all word size and coordinate positions by the scaling scale.

Custom word cloud shapes

Users can upload pictures in the way of custom shape, in the implementation, only need to obtain the part of the picture with drawing content, stored as an array. In subsequent collision detection, a comparison with this array can meet the need to place words in the image.

Performance optimization

The word cache

After testing, when there are thousands of words, the calculation time will be very long. After investigation, it is found that most of the time is consumed in the steps of placing words. For this reason, we add a word cache. Under the same rotation Angle, the maximum width and height of the cache can not be placed. When the width and height of the current word are found to be higher than or equal to the minimum width and height of the cached word, the attempt to place the word is skipped. This strategy has obvious optimization effect when there are a large number of words with the same font size and the same number of words.

The resources

Github.com/jasondavies…

www.jasondavies.com/wordcloud/a…

Front-end team of Bytedance data platform, responsible for r&d of big data-related products within the company. We have a strong passion in front-end technology. In addition to research and development related to data products, we have a lot of exploration and accumulation in data visualization, mass data processing optimization, Web Excel, WebIDE, private deployment and engineering tools. If you are interested, please feel free to contact us.