This article is a detailed guide to essential front-end data visualization technology, written by Xiaoqing Dong of the Experience Technology Department at Ant Financial.
This is an introductory guide to data visualization development. It introduces the problems that visualization sets out to solve and the tools that can be used directly, using the visualization teams and resources of Ali/Ant as examples. It covers the following topics:
1. What is data visualization?
2. How do you visualize data?
3. Data visualization scenarios and tools
4. Common problems in data visualization
What is data visualization
Data visualization studies how to transform data into interactive graphics or images that can be perceived visually, in order to strengthen people’s cognitive ability and support discovery, interpretation, analysis, exploration, decision-making and learning.
“Data Visualization and Information Visualization are two closely related professional terms. In the narrow sense, data visualization refers to presenting data in the form of statistical charts, while information visualization refers to visualizing non-numeric information. The former is used to convey information, while the latter is used to represent abstract or complex concepts, techniques and information. In the broad sense, data visualization is a general term covering data visualization, information visualization and scientific visualization.” - The Beauty of Data Visualization
Data visualization in the broad sense involves information technology, natural science, statistical analysis, graphics, interaction, geographic information and other disciplines.
“Scientific Visualization, Information Visualization and Visual Analytics are generally regarded as the three main branches of visualization. The integration of these three branches into a new discipline, ‘data visualization’, is a new starting point in the field of visualization research.” - Data Visualization
Here is a brief introduction to scientific visualization, information visualization and visual analytics:
Scientific Visualization is the earliest and most mature interdisciplinary research and application field within visualization [Shi Jiaying 1996]. It mainly serves the natural sciences, such as physics, chemistry, meteorology and climate science, aerospace, medicine and biology, which usually need to interpret, manipulate and process data and models in order to find patterns, characteristics, relationships and anomalies [Schroeder2004].
Information Visualization deals with abstract data sets. It originated in statistical graphics and is related to modern techniques such as information graphics and visual design. Its output is usually presented in two-dimensional space, so the key problem is conveying a large amount of abstract information intuitively within a limited display space. Whereas scientific visualization processes data with a natural geometric structure (such as magnetic field lines or fluid distributions), information visualization focuses on abstract, high-dimensional data. Bar charts, trend charts, flow charts and tree diagrams are the most common visual forms of information visualization; their design turns abstract data concepts into visual information.
Visual Analytics is defined as the science of analytical reasoning based on interactive visual interfaces [Thomas2005]. It combines graphics, data mining and human-computer interaction. Using the visual interface as the channel, it brings human perception and cognition into data processing in visual form, so that human intelligence and machine intelligence complement and reinforce each other. The result is a spiral process of information sharing and knowledge extraction that supports effective analytical reasoning and decision-making.
Scientific visualization, information visualization, and visual analytics have overlapping goals and techniques, and the boundaries between these fields have not yet been clearly agreed upon.
The goal of data visualization
The essence of data visualization is mapping data to graphics through visual channels so that users can understand the data faster and more accurately. The problem to solve is therefore how to express data in a visually perceptible way. At the same time, the design must consider aesthetics and comprehensibility, resolve occlusion, clutter and conflict within the limited display space (the canvas), and let users view the details of the data through interaction.
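As a minimal sketch of what mapping data to graphics through a visual channel can look like, the TypeScript below maps values to the position and height of bars on a canvas. The names linearScale and layoutBars are hypothetical and are not taken from G2 or any other library, and the sketch assumes non-negative values.

```typescript
// Hypothetical helpers for illustration; not part of any chart library.
type Interval = [number, number];

interface Datum {
  label: string;
  value: number; // assumed to be non-negative
}

interface Bar {
  label: string;
  x: number;
  y: number;
  width: number;
  height: number;
}

// Build a function that maps a value in `domain` linearly onto `range`.
function linearScale(domain: Interval, range: Interval): (value: number) => number {
  const [d0, d1] = domain;
  const [r0, r1] = range;
  // Avoid dividing by zero when the domain is a single point.
  const span = d1 - d0 === 0 ? 1 : d1 - d0;
  return (value: number) => r0 + ((value - d0) / span) * (r1 - r0);
}

// Map each datum to a rectangle: the index decides x, the value decides height.
function layoutBars(data: Datum[], canvasWidth: number, canvasHeight: number): Bar[] {
  const maxValue = Math.max(0, ...data.map((d) => d.value));
  const heightOf = linearScale([0, maxValue], [0, canvasHeight]);
  const barWidth = data.length > 0 ? canvasWidth / data.length : 0;
  return data.map((d, index) => {
    const height = heightOf(d.value);
    // Canvas y grows downwards, so a bar starts at canvasHeight - height.
    return { label: d.label, x: index * barWidth, y: canvasHeight - height, width: barWidth, height };
  });
}
```

Here the length channel encodes the value and the position channel encodes the category. A real chart library layers axes, padding, color and shape channels on top of the same idea.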
How do you visualize data
Use a classic diagram to illustrate how data visualization works:
The data visualization process can be divided into the following steps:
1. Define the problem to be solved
2. Determine the data and data structure to be displayed
3. Determine the dimensions (fields) of the data to be displayed
4. Determine the type of chart to use
5. Determine the interactions of the chart
1. Define the problem. First of all, be clear that the purpose of data visualization is to let users read and understand the data. It is therefore important to define the problem to be solved before you begin. For example: I’d like to see how sales changed over the past two weeks. Did they increase or decrease? You can frame your problem in terms of trend, comparison, distribution, flow, time series, space, correlation and so on.
2. Determine the data to be displayed. Data visualization first requires data. Because the canvas size is limited, too much data cannot be displayed directly, so you need to decide what to display:
(1) Has the data been processed, and does it contain empty values?
(2) Is it list data or tree data?
(3) How large is the data?
(4) Should the data be aggregated and displayed in layers?
(5) How is the data loaded into the page, and should it be processed in the front end?
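The first and third checks above (empty values and data size) can be sketched as a small profiling step run before any chart is drawn. This is only an illustration: profileRows, dropIncompleteRows and the definition of "empty" used here (null, undefined, empty string, NaN) are assumptions, not part of any AntV library.

```typescript
// Hypothetical data-profiling helpers for illustration.
type Row = Record<string, unknown>;

interface DataProfile {
  rowCount: number;
  emptyCounts: Record<string, number>; // number of empty values per field
}

// Treat null, undefined, "" and NaN as empty (an assumption of this sketch).
function isEmptyValue(value: unknown): boolean {
  return (
    value === null ||
    value === undefined ||
    value === "" ||
    (typeof value === "number" && isNaN(value))
  );
}

// Count rows and empty values for the fields that will be visualized.
function profileRows(rows: Row[], fields: string[]): DataProfile {
  const emptyCounts: Record<string, number> = {};
  for (const field of fields) {
    emptyCounts[field] = rows.filter((row) => isEmptyValue(row[field])).length;
  }
  return { rowCount: rows.length, emptyCounts };
}

// Keep only rows where every selected field has a value.
function dropIncompleteRows(rows: Row[], fields: string[]): Row[] {
  return rows.filter((row) => fields.every((field) => !isEmptyValue(row[field])));
}
```

Whether to drop, fill or flag incomplete rows depends on the business question; dropping is shown only because it is the simplest option.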
3. Determine the data dimension to be displayed. Select the fields for visualization.
4. Determine the chart type. There are many chart types available, but which one to use depends on the problem you want to solve, the structure of your data, and the data dimensions you selected:
For guidance on choosing a chart type, see:
• AntV chart usage: antvis.github.io/vis/doc/cha…
• Chart guidelines: www.yuque.com/mo-college/…
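One way to picture this dependency is a lookup from the problem category and data structure to candidate chart types. The sketch below is a simplified assumption for illustration: the specific mapping is not taken from the AntV guidelines linked above, and suggestCharts is a hypothetical name.

```typescript
// Problem categories follow the ones listed in step 1 of this article.
type Purpose =
  | "trend"
  | "comparison"
  | "distribution"
  | "flow"
  | "timeSeries"
  | "space"
  | "correlation";

type DataStructure = "list" | "tree";

// Assumed mapping for illustration only; real guidelines are more nuanced.
const CHARTS_BY_PURPOSE: Record<Purpose, string[]> = {
  trend: ["line chart", "area chart"],
  comparison: ["bar chart", "horizontal bar chart"],
  distribution: ["histogram", "box plot"],
  flow: ["sankey diagram", "funnel chart"],
  timeSeries: ["line chart", "area chart"],
  space: ["map"],
  correlation: ["scatter plot", "bubble chart"],
};

// Suggest candidate chart types from the problem and the data structure.
function suggestCharts(purpose: Purpose, structure: DataStructure): string[] {
  if (structure === "tree") {
    // Hierarchical data needs a chart that can show parent-child relations.
    return ["tree diagram", "treemap"];
  }
  return CHARTS_BY_PURPOSE[purpose].slice();
}
```

For the earlier example question about sales over the past two weeks, suggestCharts("trend", "list") would propose a line or area chart; the final choice still depends on the selected fields.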
Data visualization scenarios and tools
At present, Internet companies usually have the following categories of visualization requirements:
1. General reports
2. Mobile charts
3. Large-screen visualization
4. Graph editing and graph analysis
5. Geographic visualization
1. General reports. More than 85% of the requirements encountered during development are general report requirements. General-purpose chart libraries such as Highcharts, ECharts and amCharts can meet these day-to-day needs. AntV has open-sourced G2, a chart library based on the grammar of graphics: github.com/antvis/g2
G2 has the following characteristics: (1) Flexible and freely composable: starting from the data, a few lines of code are enough to produce the desired chart. (2) Vivid and easy to implement: built on extensive product practice, it provides a rendering engine, a complete grammar of graphics and professional design specifications. (3) Rich interaction: it lets you customize interactions based on the grammar of graphics.
Alibaba Group has many chart libraries built by wrapping G2 for specific frameworks and business scenarios, some of which are open source:
• BizCharts (bizcharts.net/index): produced by the Alibaba International UED team, this React wrapper around G2 focuses on chart visualization for e-commerce and consolidates the visualization specifications of the e-commerce business lines. It implements common and custom charts in React projects.
• Viser (viserjs.github.io/): produced by the Ali Data Platform Technology Department, it supports the Vue, React and AngularJS frameworks.
2. Mobile visualization. If your scenario must work on both PC and mobile, use G2 and adapt it to mobile screens. If you are building H5 pages or mini programs inside a mobile app, choose F2 (github.com/antvis/f2). F2 is a mobile-focused, out-of-the-box visualization solution that supports H5 and is compatible with multiple environments (Node, mini programs, Weex). It offers a complete grammar of graphics to cover a wide range of visualization needs, along with a professional mobile design guide.
F2 supports multiple platforms, and other teams in Alibaba Group have built wrappers around it, such as My-F2, a wrapper for mini programs that is now open source:
github.com/antvis/my-f…
3. Large-screen visualization. Large-screen visualization focuses on displays for conferences and exhibitions, business monitoring, risk warning, geographic information analysis and similar scenarios. It places high demands on graphics rendering and visual design. The teams within Alibaba Group currently working on large-screen visualization include:
• Graphics and Art Lab of Ant Financial
• DataV team of Ali Cloud
• Ali Data Technology and Products Department (The Beauty of Data)
Large screens have become almost standard in to-B (business-facing) projects, and their application scenarios keep expanding.
4. Graph editing and graph analysis. Graph visualization has two major areas:
(1) Graph editing: used for modeling diagrams (ER diagrams, UML diagrams), flow charts, mind maps and so on, where users are deeply involved in creating, editing and deleting relationships.
(2) Graph analysis: used for relationship discovery in risk control, security and marketing scenarios, giving a business interpretation to basic graph concepts such as cycles, critical links and flux.
• JointJS focuses on graph editing, including common flow chart and BPMN diagram features. It is easy to use out of the box but very hard to extend.
• D3.js is a very low-level visualization library with many graph analysis examples, but it is costly to learn and there is a large gap between its demos and real business needs.
For graph visualization, AntV has open-sourced G6, a basic graph framework: github.com/antvis/g6. It mainly provides: (1) rendering of nodes and edges, including custom nodes and edges; (2) an event and interaction mechanism with many common interactions built in; (3) common layouts, including tree layouts and force-directed layouts.
On top of G6 we also provide G6-Editor for graph editing and G6-Analyzer for graph analysis.
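To make one of the graph analysis concepts concrete, the sketch below represents a graph as plain nodes and edges and searches for a cycle (a "ring") with a depth-first traversal. The GraphData shape and the findCycle function are assumptions made for illustration; they are not presented as the API of G6 or G6-Analyzer.

```typescript
// Assumed plain-data graph shape for illustration.
interface GraphNode {
  id: string;
}

interface GraphEdge {
  source: string;
  target: string;
}

interface GraphData {
  nodes: GraphNode[];
  edges: GraphEdge[];
}

// Return one directed cycle as a list of node ids (first id repeated at the
// end), or null when the graph has no cycle.
function findCycle(graph: GraphData): string[] | null {
  const adjacency: Record<string, string[]> = Object.create(null);
  for (const node of graph.nodes) {
    adjacency[node.id] = [];
  }
  for (const edge of graph.edges) {
    const targets = adjacency[edge.source];
    // Ignore edges that reference unknown nodes.
    if (targets !== undefined && adjacency[edge.target] !== undefined) {
      targets.push(edge.target);
    }
  }

  const state: Record<string, "visiting" | "done"> = Object.create(null);
  const path: string[] = [];

  function visit(id: string): string[] | null {
    state[id] = "visiting";
    path.push(id);
    for (const next of adjacency[id] ?? []) {
      const nextState = state[next];
      if (nextState === "visiting") {
        // `next` is already on the current path, so the path closes a cycle.
        return path.slice(path.indexOf(next)).concat(next);
      }
      if (nextState === undefined) {
        const cycle = visit(next);
        if (cycle !== null) {
          return cycle;
        }
      }
    }
    path.pop();
    state[id] = "done";
    return null;
  }

  for (const node of graph.nodes) {
    if (state[node.id] === undefined) {
      const cycle = visit(node.id);
      if (cycle !== null) {
        return cycle;
      }
    }
  }
  return null;
}
```

In a risk control scenario, a cycle found this way could be highlighted on the canvas, for example a chain of transfers that returns to its origin. The recursive traversal is adequate for small graphs; very deep graphs would need an iterative version.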
5. Geographic visualization. Geographic data visualization is mainly the visualization of spatial data, and covers three areas: (1) Infographics: mainly used to display location-related reports, infographics, path changes and so on. (2) Large-screen applications: large-screen displays generally use geographic data as the carrier, for example visualizing buildings, roads and trajectories. (3) Geographic analysis applications: these lean towards interactive analysis of massive geographic data, such as business operation systems for location-based user recommendation, user acquisition and activation, or systems for site selection and risk monitoring.
AntV G2 and L7 both provide geographic data visualization solutions:
• G2 supports general charts of geographic data.
• L7 is a more specialized geographic data visualization solution. It uses WebGL rendering, supports visual analysis of massive geographic data, and supports a vector tile scheme with multi-threaded computation. It can meet the needs of large-screen visualization and geographic analysis applications.
Alibaba Group’s other geographic visualization frameworks include:
• AutoNavi’s Loca
• Cainiao’s bird maps
Frequently asked questions
1. Chart misuse. Chart misuse is the most common problem. Consider the following scenarios. Example 1: too many categories. The figure below shows each province’s share of the population. Because the chart contains too many categories, it is difficult to compare the provinces’ shares clearly, so in this case we recommend a horizontal bar chart.
Example 2: comparison by category. When comparing the sales of different game types, use a bar chart rather than a line chart, because the data is being compared by category.
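The two examples can be expressed as simple review rules. The sketch below is hypothetical: it assumes the chart in Example 1 is a pie chart, which the text does not state, and the threshold MAX_PIE_CATEGORIES is an arbitrary value chosen for illustration.

```typescript
type ChartType = "pie" | "line" | "bar" | "horizontalBar";
type AxisKind = "category" | "time" | "number";

interface ChartSpec {
  type: ChartType;
  xKind: AxisKind; // kind of field on the x axis (or the slice field for a pie)
  categoryCount: number;
}

interface ChartReview {
  type: ChartType; // the recommended chart type
  reason: string;
}

// Arbitrary threshold for this sketch; tune it for your own product.
const MAX_PIE_CATEGORIES = 7;

// Apply the two misuse rules from the examples and recommend a chart type.
function reviewChart(spec: ChartSpec): ChartReview {
  if (spec.type === "pie" && spec.categoryCount > MAX_PIE_CATEGORIES) {
    return { type: "horizontalBar", reason: "too many categories to compare shares clearly" };
  }
  if (spec.type === "line" && spec.xKind === "category") {
    return { type: "bar", reason: "data compared by category reads better as bars" };
  }
  return { type: spec.type, reason: "no issue found by these two rules" };
}
```

A rule set like this only catches the misuse patterns it encodes; it does not replace a design review of the chart.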
2. Mobile and PC charts. AntV provides two statistical chart frameworks, G2 and F2. Users often have business on both mobile and PC and must choose between the two. G2 is essentially a chart library designed for traditional mid- and back-office products; beyond general report display, it provides rich interaction and strong analysis capabilities. F2, on the other hand, was developed specifically for mobile, with a focus on code size and performance. We therefore suggest the following:
(1) If your users are mainly on PC, use G2, which supports more chart types and interactions.
(2) If you are building H5 pages or mini programs inside large apps such as Wallet, use F2. We have done extensive compatibility work across mobile platforms.
(3) If you are developing a BI analysis system that needs analysis capabilities in addition to reports, use G2.
(4) If you are developing a monitoring system whose main users are on PC, use G2.
(5) If you are developing a report system whose users mainly view charts on mobile, use F2 (the charts can also be viewed on PC).
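These suggestions can be condensed into a small decision function. This is only a sketch: the ProjectProfile fields and the order in which the rules are checked are assumptions, since the suggestions above do not rank themselves against each other.

```typescript
type ChartLibrary = "G2" | "F2";

// Hypothetical description of a project; field names are illustrative.
interface ProjectProfile {
  primaryUsers: "pc" | "mobile";
  runsInH5OrMiniProgram: boolean; // H5 pages or mini programs inside a mobile app
  needsAnalysis: boolean; // BI-style analysis beyond plain reports
}

function chooseLibrary(project: ProjectProfile): ChartLibrary {
  // Assumed precedence: analysis needs first, then runtime, then audience.
  if (project.needsAnalysis) {
    return "G2";
  }
  if (project.runsInH5OrMiniProgram) {
    return "F2";
  }
  return project.primaryUsers === "mobile" ? "F2" : "G2";
}
```

For a monitoring system used mainly on PC, this returns "G2"; for a report system viewed mainly inside a mobile app, it returns "F2".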
3. Large-scale data. Visualization in the front end generally handles only small-scale data. To visualize very large data sets, you can use: (1) data layering (2) data aggregation (3) data sampling
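Two of these strategies, aggregation and sampling, can be sketched in a few lines. The functions aggregateMean and sampleStride below are hypothetical helpers that assume the points are already sorted by x; they are not part of G2 or F2, and production systems often use more careful methods that preserve peaks.

```typescript
interface Point {
  x: number;
  y: number;
}

// Aggregation: replace each run of `bucketSize` consecutive points by its mean.
// `bucketSize` is assumed to be a positive integer.
function aggregateMean(points: Point[], bucketSize: number): Point[] {
  if (bucketSize < 1) {
    throw new Error("bucketSize must be at least 1");
  }
  const result: Point[] = [];
  for (let start = 0; start < points.length; start += bucketSize) {
    const bucket = points.slice(start, start + bucketSize);
    let sumX = 0;
    let sumY = 0;
    for (const point of bucket) {
      sumX += point.x;
      sumY += point.y;
    }
    result.push({ x: sumX / bucket.length, y: sumY / bucket.length });
  }
  return result;
}

// Sampling: keep every n-th point so that at most `maxPoints` remain.
// `maxPoints` is assumed to be a positive integer.
function sampleStride(points: Point[], maxPoints: number): Point[] {
  if (maxPoints < 1) {
    throw new Error("maxPoints must be at least 1");
  }
  if (points.length <= maxPoints) {
    return points.slice();
  }
  const stride = Math.ceil(points.length / maxPoints);
  return points.filter((_, index) => index % stride === 0);
}
```

Aggregation smooths the series but keeps every point's contribution, while stride sampling keeps original values but may skip short spikes. Data layering would typically precompute several such reduced versions and load finer ones as the user zooms in.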
Conclusion
This article has introduced visualization: the problems it sets out to solve and the tools that can be used directly. If you are interested in visualization, you can follow Mo College: antv.alipay.com/zh-cn/vis/i…