This is the third day of my participation in the August Essay Challenge.
This D3.js data visualization series is written by Gu Liu according to his own logic, so it may differ from the tutorials you find online. How many articles there will be and what they will cover, Gu Liu honestly has no fixed plan. The goal is for beginners to follow along easily, but Gu Liu cannot be sure how it actually reads, so feedback is very welcome and will be used to improve future articles. If the opportunity arises, a video tutorial based on this series may follow, but that is for another day.
The supporting code and data will be open-sourced in this repository. Stars are welcome, and any other questions can be discussed in the group: github.com/DesertsX/d3…
Preface
D3.js Data Visualization Series (1) – Niu Yi Gu Liu – 2021.07.30 and D3.js Data Visualization Series (2) – Niu Yi Gu Liu – 2021.08.10 were mainly meant to get you familiar with using d3.js to draw SVG elements and related operations. Everything else was kept as simple as possible: for the data, for example, a directly generated array of natural numbers was enough. This avoided introducing too many concepts at once and spread the knowledge points across several articles.
const dataset = d3.range(30)
Now that you are familiar with drawing elements on the canvas, Gu Liu will go on to explain how to read a real data set, process the data, draw elements based on the data, and map a category attribute to colors, along with the use of scales, drawing text elements, implementing a legend, and related topics.
Of course, everything still builds on the previous two articles, so the rectangle remains the main visual element this time.
At first, Gu Liu thought it would be best for the data to contain a category attribute. That makes it easier to explain color mapping and a legend showing the count per category, and it also paves the way for later articles.
The original idea was to use a data set of books (or movies), so that when you compile your reading list at the end of the year (assuming you have read quite a few, doge), you could follow the code in this article and visualize which kinds of books you read.
However, Gu Liu could not think of a suitable book data set. Later it occurred to him that the data for the top 100 uploaders (Up masters) of Bilibili (Station B) in 2020 would work, and we could see which partitions (content categories) they belong to. This article does not cover how the data was obtained, since that would take a lengthy explanation; a follow-up article will introduce it.
Here we only need to know that the partition data comes from the “Video” tab under the “Contribution” column of each uploader's homepage, and that we simply take the partition with the most videos as the uploader's partition. This may not be accurate; it only serves as an example for the tutorial.
So let’s first take a look at the final result.
Base code
This time the styling differs slightly from the previous two articles: the div#chart element is centered and the SVG canvas has a fixed width and height. Neither is critical; set them up according to your own needs.
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta http-equiv="X-UA-Compatible" content="IE=edge">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>D3.js Data Visualization Series (3) - Gu Liu</title>
  <style>
    * {
      margin: 0;
      padding: 0;
    }
    body {
      background: #f5e6ce;
      height: 100vh;
      display: flex;
      justify-content: center;
      align-items: center;
    }
  </style>
</head>
<body>
  <div id="chart"></div>
  <script src="./d3.js"></script>
  <script>
    function drawChart() {
      // ...
    }
    drawChart()
  </script>
</body>
</html>
Reading the data
Most of the time, the data used for visualization is stored in CSV or JSON files and can be read directly with d3.csv() or d3.json(). Because reading data is an asynchronous operation, we add the await keyword so that the code that follows runs only after the data has been read, and we add the async keyword in front of the enclosing function. Asynchronous versus synchronous operations, macrotasks and microtasks, and related concepts are not explained here; you can look them up yourself.
async function drawChart() {
let dataset = await d3.json('2020_bilibili_upzhu.json')
console.log(dataset[0])
console.table(dataset)
}
drawChart()
All you need to know is that the older ways of reading data, like the two below, can be replaced with the async/await style above, which is more comfortable to write.
// Callback style (D3 v4 and earlier)
d3.csv("data.csv", function (error, dataset) {
  console.log(dataset)
});
// Promise style with .then()
d3.json('data.json').then(dataset => {
  console.log(dataset[0])
  console.table(dataset)
})
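To make the equivalence between the two styles concrete, here is a minimal sketch that runs without D3. The `fakeJson` helper is a hypothetical stand-in for `d3.json()` (it only parses a JSON string asynchronously), and the other function names are illustrative as well, not part of D3.

```javascript
// Hypothetical stand-in for d3.json(): resolves with parsed data on a later tick.
function fakeJson(text) {
  return new Promise((resolve, reject) => {
    setTimeout(() => {
      try {
        resolve(JSON.parse(text))
      } catch (err) {
        reject(err)
      }
    }, 0)
  })
}

// Promise style: the result is only available inside .then().
function firstNameWithThen(text) {
  return fakeJson(text).then(dataset => dataset[0].name)
}

// async/await style: reads top to bottom, errors are caught with try/catch.
async function firstNameWithAwait(text) {
  try {
    const dataset = await fakeJson(text)
    return dataset[0].name
  } catch (err) {
    return null // loading or parsing failed
  }
}

const sample = '[{"name": "demo", "tlist": 0}]'
firstNameWithThen(sample).then(name => console.log('then:', name))
firstNameWithAwait(sample).then(name => console.log('await:', name))
```

Both calls log the same name; the await version simply keeps the loading step and the code that uses the data in one function body.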
Data format
The data format is shown below. The JSON file contains data for 100 uploaders. For now this article only uses the nickname name and the partition data tlist; two new attributes, field and fieldId, are added during data processing for later use.
[
  {
    name: "Hello, teacher. My name is Mr. He.",
    uid: "163637592",
    tlist: [
      { tid: 160, count: 4, name: "Life" },
      { tid: 188, count: 32, name: "Science and technology" },
      { tid: 217, count: 1, name: "Animal pen" },
      { tid: 36, count: 5, name: "Knowledge" },
    ],
    likes: 28123374,
    view: 216333794,
    desc: "Put 6 million fan ids into a group photo and make a kitten running with 10,000 lines of memos -- he is a student who makes funny videos of thieves on holidays. He is as curious about the past as he is about the future, honing his wild ideas into neat contributions. As a fan of He, I will not only pay attention to the progress of technology, but also care about the human life itself affected by digital.",
    face: "http://i0.hdslb.com/bfs/activity-plat/static/af656f929a9b11da0afaad548cc50dcf/F8frVz9MD.jpg",
    // field: "Science and technology",
    // fieldId: 10,
  },
]
Data processing
field, the primary partition an uploader belongs to, is obtained by sorting the partition array tlist in descending order by count and taking the name of the first partition. The processing is as follows.
dataset.forEach(d => {
  if (d.tlist !== 0) {
    d.tlist.sort((a, b) => b.count - a.count)
  } else {
    // uid: '466272' tlist: 0
    d.tlist = [{ tid: 129, count: 100, name: "Fashion" }]
  }
  d.field = d.tlist[0].name
})
Several of the top 100 uploaders have since fallen from grace, so some special handling is needed. For example, the uploader rendered here as “wit party sister” had all videos removed, so the partition data is unknown and tlist became 0 in the data set Gu Liu crawled. After checking for that case, we manually assign the “Fashion” partition. The count value does not matter, and tid is Bilibili's official partition ID, copied from the data of other fashion uploaders. The entry is stored as an array so that the first partition can be taken by index in the same way for everyone. The data of the other fallen uploaders is still normal, so no extra handling is needed for them.
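The rule described above can be isolated into a small function for testing. This is a sketch, not the article's code: `primaryField` and `fallback` are hypothetical names, and unlike the in-place sort above it copies the array before sorting.

```javascript
// Return the name of the partition with the highest count.
// `fallback` is used when tlist is not a usable array (e.g. the crawled value 0).
function primaryField(tlist, fallback = 'Fashion') {
  if (!Array.isArray(tlist) || tlist.length === 0) return fallback
  // Copy before sorting so the caller's array is left untouched.
  const sorted = [...tlist].sort((a, b) => b.count - a.count)
  return sorted[0].name
}

// Partition list taken from the sample record shown in the data format section.
const tlist = [
  { tid: 160, count: 4, name: 'Life' },
  { tid: 188, count: 32, name: 'Science and technology' },
  { tid: 217, count: 1, name: 'Animal pen' },
  { tid: 36, count: 5, name: 'Knowledge' },
]

console.log(primaryField(tlist)) // "Science and technology"
console.log(primaryField(0))     // "Fashion"
```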
Now that every uploader has a primary partition, let’s count how many uploaders fall into each partition.
let fieldCount = {}
const fields = dataset.map(d => d.field)
fields.forEach(d => {
  if (d in fieldCount) {
    fieldCount[d]++
  } else {
    fieldCount[d] = 1
  }
})
// console.log(fieldCount)
Convert the statistics from an object to an array with Object.entries(), where each element is itself an array of [partition name, count], then sort in descending order by count (fieldCountArray is used again later for the colors and the legend).
let fieldCountArray = Object.entries(fieldCount)
fieldCountArray.sort((a, b) => b[1] - a[1])
// console.log(fieldCountArray)
// fieldCountArray:
// [
//   ["Game", 20],
//   ["Life", 15],
//   ["Food", 11],
//   ["Knowledge", 11],
//   ["Animation", 8],
//   ["Fashion", 7],
//   ["Music", 6],
//   ["Ghost cattle", 5],
//   ["The film and television", 5],
//   ["Dance", 4],
//   ["Science and technology", 4],
//   ["Animal pen", 2],
//   ["China creates", 1],
//   ["Car", 1]
// ]
Finally, for each uploader, look up the index of its field in fieldCountArray and store it on the original dataset as fieldId. The dataset can then be sorted by fieldId, which orders the uploaders by partition size in descending order. Otherwise, with so many partitions and colors, a random arrangement would look too garish and be hard to read.
dataset.forEach(d => {
  d.fieldId = fieldCountArray.findIndex(f => f[0] === d.field)
})
dataset.sort((a, b) => a.fieldId - b.fieldId)
That covers the data processing. The point is to know what format is needed and then shape the data accordingly; how the intermediate code is written is up to you, and everyone has their own way of doing it.
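For reference, the whole counting, ranking, and sorting pipeline can be run on a tiny made-up data set. This is one possible variant under stated assumptions: the six records are invented for illustration, and the `rankByField` lookup Map is a hypothetical addition that replaces the repeated findIndex calls.

```javascript
// Invented sample records; only the `field` attribute matters here.
const dataset = [
  { name: 'a', field: 'Life' },
  { name: 'b', field: 'Game' },
  { name: 'c', field: 'Food' },
  { name: 'd', field: 'Game' },
  { name: 'e', field: 'Life' },
  { name: 'f', field: 'Game' },
]

// 1. Count uploaders per field.
const fieldCount = {}
for (const d of dataset) {
  fieldCount[d.field] = (fieldCount[d.field] || 0) + 1
}

// 2. Convert to [name, count] pairs, largest count first.
const fieldCountArray = Object.entries(fieldCount).sort((a, b) => b[1] - a[1])

// 3. Build a name -> rank lookup once instead of calling findIndex per record.
const rankByField = new Map(fieldCountArray.map(([name], i) => [name, i]))

// 4. Attach fieldId and sort the records by it.
dataset.forEach(d => { d.fieldId = rankByField.get(d.field) })
dataset.sort((a, b) => a.fieldId - b.fieldId)

console.log(fieldCountArray)
console.log(dataset.map(d => `${d.name}:${d.fieldId}`).join(' '))
```

Records with the same fieldId keep their original relative order, because Array.prototype.sort is stable in current JavaScript engines.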
Setting up the canvas
The canvas has a fixed width and height. There is not much to say here; set them according to your actual needs.
One difference this time is that a margin is also set. It reserves space on the top, right, bottom, and left of the drawing area. For example, a chart usually has a y-axis on the left and an x-axis at the bottom, so room is needed for the axes, ticks, labels, and so on, and the left and bottom margins are set larger accordingly.
const width = 1400
const height = 700
const margin = {
  top: 100, right: 320, left: 30
}
const svg = d3.select('#chart')
  .append('svg')
  .attr('width', width)
  .attr('height', height)
  .style('background', '#FEF5E5')
const bounds = svg.append('g')
  .attr('transform', `translate(${margin.left}, ${margin.top})`)
This time the top space is reserved for the title and the right space for the legend, so top and right are larger; the left also gets a little space so the chart does not sit too close to the edge.
After adding the SVG canvas, we append a g (group) element to it and translate it right and down by the corresponding number of pixels. The coordinate origin of elements subsequently drawn inside the g is then the top-left corner of the drawing area shown in the diagram, not the top-left corner of the canvas.
The g element is what designers would call “grouping”: it renders nothing on the page by itself, but it makes it easy to separate different areas of the graphic and to apply operations such as translation to all elements in the group at once. It is a very useful element and will be used often later.
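As a worked example of the margin arithmetic, the sketch below computes the size of the drawing area left inside the margins and the transform string applied to the group. The names `boundedWidth` and `boundedHeight` are assumptions for illustration (the article's code does not define them), and a missing `bottom` margin is treated as 0.

```javascript
const width = 1400
const height = 700
const margin = { top: 100, right: 320, left: 30 }

// Space left for drawing once the margins are subtracted.
// margin.bottom is not set in this article, so fall back to 0.
const boundedWidth = width - margin.left - margin.right
const boundedHeight = height - margin.top - (margin.bottom || 0)

// The same string that is passed to .attr('transform', ...) on the g element.
const transform = `translate(${margin.left}, ${margin.top})`

console.log(boundedWidth, boundedHeight) // 1050 600
console.log(transform)                   // translate(30, 100)
```

A point drawn at (0, 0) inside the group therefore appears at (30, 100) on the canvas.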
Color data
The colors array corresponds, in order, to the partitions counted in fieldCountArray. The color scheme used at the beginning was replaced with this one after many people said the colors did not look good, which will be discussed in a separate side article.
const colors = [
  '#5DCD51', '#51CD82', '#51CDC0', '#519BCD', '#515DCD', '#8251CD', '#CD519B',
  '#CD519B', '#CD515D', '#CD8251', '#CDC051', '#B6DA81', '#D2E8B0', '#A481DA'
]
Add the title
Text in SVG is drawn by adding text elements, and the title is no exception. Here the title is placed at the top left; the x/y coordinates are easy to understand; .text() sets the text content; and font-related CSS such as size and weight is set with .style().
const title = svg.append('text')
  .attr('x', margin.left)
  .attr('y', margin.top / 2)
  .attr('dominant-baseline', 'middle')
  .text('Partitions of the Top 100 Uploaders on Bilibili in 2020')
  .style('font-size', '32px')
  .style('font-weight', 600)
Note that dominant-baseline: middle must be set so that the horizontal center line of the text aligns with the x/y coordinate point. Gu Liu only learned about this attribute recently from Fullstack D3 and is putting it to use right away; the effects of the other values are shown in the figure. Similarly, to align the vertical center line of the text with the coordinate point, set text-anchor: middle. That one is probably used more often, and we will use it later. Link: developer.mozilla.org/en-US/docs/…
Drawing the main chart
Next comes drawing elements based on data, which you should be familiar with from the previous two articles. Each rectangle is rectWidth = 50px wide and rectHeight = 80px tall, with 10px of spacing between rectangles horizontally and vertically, and each row holds at most 17 rectangles. The layout works by computing each rectangle's coordinates from its index: the modulo operation gives the column and integer division gives the row.
Note that the rectangles are added to the bounds element, which has already been translated horizontally and vertically as a whole, not directly to the SVG. A g group is appended first to separate them from other areas. If the rectangles were added directly to bounds, they would collide with the legend's rectangles in bounds.selectAll(‘rect’), and class names would be needed to tell the two sets apart. This is shown when the legend is added below; in any case, a bit more grouping does no harm.
const rectTotalWidth = 60
const rectTotalHeight = 90
const rectPadding = 10
const rectWidth = rectTotalWidth - rectPadding
const rectHeight = rectTotalHeight - rectPadding
const columnNum = 17
const rectsGroup = bounds.append('g')
const rects = rectsGroup.selectAll('rect')
  .data(dataset)
  .join('rect')
  .attr('x', (d, i) => i % columnNum * rectTotalWidth)
  .attr('y', (d, i) => Math.floor(i / columnNum) * rectTotalHeight)
  .attr('width', rectWidth)
  .attr('height', rectHeight)
  .attr('fill', d => colors[fieldCountArray.findIndex(item => item[0] === d.field)])
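The index-to-coordinate mapping is easy to check in isolation. The sketch below pulls it into a hypothetical helper, `gridPosition` (not part of the article's code), using the same constants.

```javascript
const rectTotalWidth = 60
const rectTotalHeight = 90
const columnNum = 17

// Column comes from the remainder, row from the integer quotient.
function gridPosition(i) {
  return {
    x: (i % columnNum) * rectTotalWidth,
    y: Math.floor(i / columnNum) * rectTotalHeight,
  }
}

console.log(gridPosition(0))  // { x: 0, y: 0 }     first cell
console.log(gridPosition(16)) // { x: 960, y: 0 }   last cell of the first row
console.log(gridPosition(17)) // { x: 0, y: 90 }    wraps to the second row
console.log(gridPosition(99)) // { x: 840, y: 450 } the 100th uploader
```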
In addition, when setting each rectangle's color with fill, we look up the index of the uploader's field in fieldCountArray and then take the color at that index from the colors array. This JavaScript idiom may be hard to read for beginners who are not yet familiar with it.
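If the nested findIndex call is hard to read, one alternative is to build a lookup object once and index into it. This is a sketch under assumptions: `colorByField` and `fillOf` are hypothetical names, and only the first three entries of fieldCountArray and colors are reproduced to keep it short.

```javascript
// First three entries of the article's arrays, shortened for the example.
const fieldCountArray = [['Game', 20], ['Life', 15], ['Food', 11]]
const colors = ['#5DCD51', '#51CD82', '#51CDC0']

// Build { fieldName: color } once; position i in one array matches i in the other.
const colorByField = Object.fromEntries(
  fieldCountArray.map(([name], i) => [name, colors[i]])
)

// Equivalent to colors[fieldCountArray.findIndex(item => item[0] === d.field)]
const fillOf = d => colorByField[d.field]

console.log(fillOf({ name: 'demo', field: 'Life' })) // "#51CD82"
```

In the chart code, the callback could then be passed directly as `.attr('fill', fillOf)`.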
Bound data can take many formats
The array bound to elements (that is, to a D3 selection) can take many formats. Just remember that when an attribute set with .attr() or a style set with .style() depends on the data, it is specified with a callback function whose arguments (d, i) are the array element and its index, respectively.
.selectAll('rect')
.data(dataset)
.attr('x', (d, i) => d * 10)
If each item in the array is a number, d is a number; if the array is nested so that each element is an array, d is an array; if the array is filled with objects, d is an object. The callback then reads whatever it needs from d.
dataset => [0, 1, 2, 3] => d is a number
dataset => [['games', 21], ['', 10], ['', ]] => d is an array
dataset => [{ name: '', field: '' }, { name: '', field: '' }, { name: '', field: '' }] => d is an object
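A rough way to build intuition without D3 is to compare the accessor with the callback of Array.prototype.map, which also receives the element and its index. This is only an analogy (D3 additionally passes the group of nodes and sets `this` to the current element), and the three arrays are invented samples.

```javascript
const numbers = [0, 1, 2, 3]
const pairs = [['games', 21], ['life', 10]]
const objects = [{ name: 'a', field: 'games' }, { name: 'b', field: 'life' }]

// d is a number
const xs = numbers.map((d, i) => d * 10)
// d is an array, so read by position
const counts = pairs.map((d, i) => d[1])
// d is an object, so read by property name; i is the index
const labels = objects.map((d, i) => `${i}: ${d.name} (${d.field})`)

console.log(xs)     // [ 0, 10, 20, 30 ]
console.log(counts) // [ 21, 10 ]
console.log(labels) // [ '0: a (games)', '1: b (life)' ]
```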
Displaying the uploader names
Next, add each uploader's name at the center of its rectangle, with both text-anchor and dominant-baseline set to middle so the text is centered. The result is admittedly not good enough, since long names overlap; but this is just a small tutorial example for getting a rough look at who the uploaders are, so it is not optimized further.
const texts = rectsGroup.selectAll('text')
  .data(dataset)
  .join('text')
  .attr('x', (d, i) => rectWidth / 2 + i % columnNum * rectTotalWidth)
  .attr('y', (d, i) => rectHeight / 2 + Math.floor(i / columnNum) * rectTotalHeight)
  .text(d => d.name)
  .attr('text-anchor', 'middle')
  .attr('dominant-baseline', 'middle')
  .attr('fill', '#000')
  .style('font-size', '9.5px')
  .style('font-weight', 400)
  // .style('writing-mode', 'vertical-rl')
Add a legend
Next, draw a legend on the right side of the canvas showing how many of the top 100 uploaders fall into each partition. 320px was reserved on the right, but some space is also left over to the right of the main chart, so the legend's g element is translated horizontally by width - margin.right - legendPadding; the details can be fine-tuned after drawing.
const legendPadding = 30
const legendGroup = bounds.append('g')
  .attr('class', 'legend')
  .attr('transform', `translate(${width - margin.right - legendPadding}, 0)`)
Space is also reserved to the left and right of the legend rectangles for the partition names and the corresponding counts.
To map a partition's count to a pixel width in the legend area, we use a scale, one of the most useful tools in d3.js. A scale is essentially a function, and a linear scale is a linear function. .domain() sets the minimum and maximum of the data: the minimum is 0, and the maximum is computed with d3.max() by telling it to read the second element (the count) of each entry in the nested fieldCountArray. .range() sets the corresponding pixel extent on the canvas: the minimum is again 0, and the maximum is the space on the right minus legendPadding on both sides. Note that both are passed in as arrays. (Scales may not be fully clear from this description; a later article will explain them in more detail.)
const legendWidthScale = d3.scaleLinear()
  .domain([0, d3.max(fieldCountArray, d => d[1])])
  .range([0, margin.right - legendPadding * 2])
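To see what "a scale is a function" means, here is a plain-JavaScript sketch of a linear mapping with the same domain and range as above. `linearScale` is a hypothetical helper written for this example, not D3's implementation; the real d3.scaleLinear() offers much more, such as .invert(), .clamp(), and .ticks().

```javascript
// Returns a function mapping a value in [d0, d1] linearly onto [r0, r1].
function linearScale([d0, d1], [r0, r1]) {
  return value => r0 + ((value - d0) / (d1 - d0)) * (r1 - r0)
}

const margin = { right: 320 }
const legendPadding = 30
const maxCount = 20 // largest count in fieldCountArray ("Game")

const legendWidthScale = linearScale([0, maxCount], [0, margin.right - legendPadding * 2])

console.log(legendWidthScale(20)) // 260
console.log(legendWidthScale(10)) // 130
console.log(legendWidthScale(0))  // 0
```

The largest partition thus gets a 260px bar, and every other bar is proportionally shorter.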
The total height of the legend, legendTotalHeight, is computed from the main chart's layout: the number of rows times rectTotalHeight, minus the rectPadding below the last row. Dividing it by the number of partitions gives legendBarTotalHeight, the height allotted to each legend row, which equals the legend rectangle's height legendBarHeight plus a legendBarPadding above and below.
const legendBarPadding = 3
const legendTotalHeight = (Math.floor(dataset.length / columnNum) + 1) * rectTotalHeight - rectPadding
const legendBarTotalHeight = legendTotalHeight / fieldCountArray.length
const legendBarHeight = legendBarTotalHeight - legendBarPadding * 2
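Plugging in this article's numbers (100 uploaders, 17 columns, 14 partitions) gives the values below. The second half is a side note rather than part of the article's code: `Math.floor(n / columns) + 1` counts one row too many when n is an exact multiple of the column count, and `Math.ceil` avoids that. The value 102 is an invented example, and `itemCount`, `partitionCount`, `rowsFloor`, and `rowsCeil` are hypothetical names.

```javascript
const rectTotalHeight = 90
const rectPadding = 10
const columnNum = 17
const legendBarPadding = 3
const itemCount = 100      // dataset.length
const partitionCount = 14  // fieldCountArray.length

const rows = Math.floor(itemCount / columnNum) + 1                   // 6
const legendTotalHeight = rows * rectTotalHeight - rectPadding      // 530
const legendBarTotalHeight = legendTotalHeight / partitionCount     // about 37.86
const legendBarHeight = legendBarTotalHeight - legendBarPadding * 2 // about 31.86

console.log(rows, legendTotalHeight, legendBarTotalHeight.toFixed(2), legendBarHeight.toFixed(2))

// Side note: row count for an exact multiple of the column count.
const rowsFloor = n => Math.floor(n / columnNum) + 1
const rowsCeil = n => Math.ceil(n / columnNum)
console.log(rowsFloor(102), rowsCeil(102)) // 7 6
console.log(rowsFloor(100), rowsCeil(100)) // 6 6
```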
Finally, draw the legend rectangles, partition names, and counts in turn. Note that the selector passed to .selectAll() includes a class name. This matters especially because there are two groups of text elements, so the class must also be set after the join: .join(‘text’).attr(‘class’, ‘legend-label’).
The width of each legend rectangle is obtained by passing the partition's count to legendWidthScale(), which returns the width in pixels. Most of the other attributes have been covered before; just pay attention to the positions.
const legendBar = legendGroup.selectAll('rect.legend-bar')
  .data(fieldCountArray)
  .join('rect')
  .attr('class', 'legend-bar')
  .attr('x', 30)
  .attr('y', (d, i) => legendBarPadding + legendBarTotalHeight * i)
  .attr('width', d => legendWidthScale(d[1]))
  .attr('height', legendBarHeight)
  .attr('fill', (d, i) => colors[i])
const legendLabel = legendGroup.selectAll('text.legend-label')
  .data(fieldCountArray)
  .join('text')
  .attr('class', 'legend-label')
  .attr('x', 30 - 10)
  .attr('y', (d, i) => legendBarTotalHeight * i + legendBarTotalHeight / 2)
  .style('text-anchor', 'end')
  .attr('dominant-baseline', 'middle')
  .text(d => d[0])
  .style('font-size', '14px')
const legendNumber = legendGroup.selectAll('text.legend-number')
  .data(fieldCountArray)
  .join('text')
  .attr('class', 'legend-number')
  .attr('x', d => 35 + legendWidthScale(d[1]))
  .attr('y', (d, i) => legendBarTotalHeight * i + legendBarTotalHeight / 2)
  .attr('dominant-baseline', 'middle')
  .text(d => d[1])
  .style('font-size', '14px')
  .attr('fill', '#000')
Summary
In this article, Gu Liu walked through drawing a visualization from a real data set and covered more of d3.js along the way. The final result still has plenty of room for improvement; for example, some readers suggested that in the legend, larger values could use darker colors and smaller values lighter ones, which might look better. But preparing this article already took a lot of time, and everything Gu Liu wanted to cover has been covered, so further optimization is left to you.
As usual: posts may not be updated very often, but the visualization exchange group shares excellent works and articles every day, and you are welcome to join. Add Gu Liu on WeChat at “xiaoaizhj” with the note “visual group”. If you have questions about any part of this article, feel free to ask in the group.
You are also welcome to follow Gu Liu’s official account “Niu Yi Gu Liu” and star it so that you receive updates as soon as possible.