background
Of all the lightweight BI tools I have worked with so far, my favorite is MS Power BI.
The reasons for liking are blunt and direct:
-
It comes with a lightweight ETL tool, Power Query, which is especially useful in data warehouse scenarios.
-
Visual modeling, drag and drop, WYSIWYG;
-
The powerful DAX can not only define measures, but also write advanced queries such as loops.
-
Embed into Web applications by way of Embed, seamless integration in their own applications;
If only for personal use, I think this is a perfect tool. However, the current version of Power BI has a few inconveniences for enterprise applications. For me, the most annoying thing is that the data needs to be fully imported into the PBI file before it can be used (Direct Query mode is officially available and only partial data sources are supported).
Therefore, I have been looking for a lightweight open source BI product that is easy to use. Although tried Superset, Metabase and other well-known products, but always feel all kinds of inconvenient. Recently, I saw a domestically developed product, CBoard. After simple use, I felt quite amazing. Although CBoard still has various imperfections, I still think it is worth a try.
Preparation and installation work
JDK1.8
slightly
Maven 3
The CBoard project needs to be compiled using Maven, so install Maven first
# extract maven
cd /usr/localThe tar ZXVF apache maven - 3.3.9 - bin. Tar. GzSet environment variables
vi /etc/profile Add the following configuration
export MAVEN_HOME=/usr/local/ apache maven -- 3.3.9export PATH=${MAVEN_HOME}/bin:${PATH}
# Configure the Maven repository. Here we recommend a Maven mirror for Aliyun
vi /usr/local/apache-maven-3.3.9/conf/settings. XML <mirror> <id>alimaven</id> <name> Aliyun maven</name> <url>http://maven.aliyun.com/nexus/content/groups/public/</url> <mirrorOf>central</mirrorOf> </mirror>Copy the code
PhantomJS 2
To use kanban export and mail functions, PhantomJS 2.1+ needs to be installed
Fontconfig-devel is the main package to rely on
yum install -y gcc gcc-c++ make git openssl-devel freetype-devel fontconfig-devel
# There may be garbled characters when using PhantomJS to capture Chinese pages. Install the font to solve the problem
yum install -y bitmap-fonts bitmap-fonts-cjk
# decompressionThe tar XVF phantomjs 2.1.1 - Linux - x86_64. Tar..bz2# Create soft links
ln -sf /usr/local/ phantomjs - 2.1.1 - Linux - x86_64 / bin/phantomjs/usr /local/bin/phantomjs
Copy the code
Download the source code
The version used is 0.4.2
git clonehttps://github.com/yzhang921/CBoard.git git checkout branch - 0.4.2Copy the code
Initialize metadata
The current version (0.4) only supports MySQL and Oracle metadata stores, so MySQL is used first. Two sample files are provided to initialize the demo, or you can create it from scratch using a blank database. Because you’ve already tried the demo library, you created it from scratch this time.
create database cboard;
use cboard;
source sql/mysql/mysql.sql;
Copy the code
Modifying a Configuration File
The configuration file can be modified before the code is compiled. The parts that need to be modified include
- Metadata connection information (in this case, MySQL’s Cboard library)
- Mail server configuration (scheduled task for exporting Kanban)
vi src/main/resources/config.properties
Copy the code
The default display language of the platform is Chinese
vi src\main\webapp\org\cboard\Settings.js
Copy the code
Var Settings = {preferredLanguage: "en", preferredLanguage: "en", preferredLanguage: "en"}Copy the code
The JDBC driver for SQLServer is included by default in the source code dependencies, but it does not exist in the public Maven repository. SQLServer database is not involved in my environment, so remove the following dependencies from the pom
vi pom.xml
Copy the code
<dependency>
<groupId>com.microsoft.sqlserver</groupId>
<artifactId>sqljdbc4</artifactId>
<version>4.0</version>
</dependency>
Copy the code
Compile source code
mvn clean package
Copy the code
The deployment of
-
Copy CBoard\target\cboard.war to tomcat’s webApp directory
-
Start tomcat
-
The default user name and password are admin and root123
Use the CBoard
Each step is explained in great detail in the official Chinese language manual. So I’m only going to record part of my testing process here. Here still want to praise, domestic open source software in the documentation can do this degree, really very good.
The data source
CBoard supports multiple data sources (JDBC, Kylin, Saiku, etc.), but JDBC is the one we use most often. Fill in the parameters in the standard JDBC connection string format. In the following figure, I use PostgreSQL as the data warehouse.
The data set
The data set in CBoard is called Cube. So the process of creating a data set is the process of designing dimensions, layers, measures.
Unlike other BI platforms, there is no need to specify dimension tables and fact tables in CBoard. Instead, star model results are obtained directly using query statements. As a lightweight BI, I think this approach is appropriate, with the immediate benefit of not having to rely on the topic model of the data warehouse layer, and even in scenarios without a star model, you can use SQL to fetch data and adjust SQL statements as your needs change.
The steps for designing the dataset are shown below
- Provide SQL statements to get the results of the star model, where I used views directly.
- If a real-time interval is provided, the data set is refreshed automatically, and if I use a time function in SQL to get data within 5 minutes, I need to set a refresh period here.
- All fields in the query result are candidates for dimensions and metrics.
- Dimensions support both a single dimension (such as date in the figure above) and a hierarchical dimension (such as YQMD in the figure above), and anyone who knows OLAP knows the difference.
- There is no need to specify the calculation method in the indicator column, that is, only fields are provided, and aggregation functions are not provided. Here also reflects the lightweight design idea, the action of selecting aggregate function is determined at the time of use, the advantage is that the number of indicators will be smaller.
- Alternative expressions can perform simple calculations such as sum(A)/count(B), but cannot perform complex operations such as year-over-year comparisons.
- Set common filter criteria, which will be handy when designing diagrams later.
Graphic design
Chart design is as simple as designing an EXCEL PivotTable: drag releases, columns, metrics, and then preview tweaks. Experience with pivottables is a quick start, as shown below
Kanban design
In the current version, I think the kanban design is a big flaw. Kanban design is always a bit awkward because there is no wySIWYG drag-and-drop. In CBoard, you divide the page into rows, then divide the columns in each row, and place multiple charts on the same row by specifying the width of the column. This approach can achieve extensive design intent, but for some refined design is powerless.
The following figure shows the kanban in the design phase and the display phase respectively. You can see how the elements on the kanban are placed in the corresponding positions
Timing task
Scheduled tasks allow kanban content to be emailed at a fixed time, which is very useful. One problem I found, however, was that kanban data was sent ignoring kanban level parameters and using only data set and chart level parameters.
The scheduled task setup is very simple, so I won’t take screenshots here.
drawback
In the version I tested (0.4.2), I also found several imperfections that I hope will be improved in future releases.
- The data set cannot be directly set year-on-year comparison and other operations. This is a tall order for lightweight BI, and I currently know of only two methods that can set complex metrics directly in the Cube layer: MDX and DAX.
- Drag and drop is not yet implemented in kanban design, so some refined layouts are not possible. The new full-screen monitor kanban, though, already allows for drag-and-drop layouts (I haven’t used this feature in depth yet).
- Export/mail function, cannot export according to the filter set on kanban. In my tests, each email I exported was a complete set of data contents, and I didn’t find a way to export only a portion of the data. I don’t know if it’s my posture.