background

Of all the lightweight BI tools I have worked with so far, my favorite is MS Power BI.

The reasons for liking are blunt and direct:

  • It comes with a lightweight ETL tool, Power Query, which is especially useful in data warehouse scenarios.

  • Visual modeling, drag and drop, WYSIWYG;

  • The powerful DAX can not only define measures, but also write advanced queries such as loops.

  • Embed into Web applications by way of Embed, seamless integration in their own applications;

If only for personal use, I think this is a perfect tool. However, the current version of Power BI has a few inconveniences for enterprise applications. For me, the most annoying thing is that the data needs to be fully imported into the PBI file before it can be used (Direct Query mode is officially available and only partial data sources are supported).

Therefore, I have been looking for a lightweight open source BI product that is easy to use. Although tried Superset, Metabase and other well-known products, but always feel all kinds of inconvenient. Recently, I saw a domestically developed product, CBoard. After simple use, I felt quite amazing. Although CBoard still has various imperfections, I still think it is worth a try.

Preparation and installation work

JDK1.8

slightly

Maven 3

The CBoard project needs to be compiled using Maven, so install Maven first

# extract maven
cd /usr/localThe tar ZXVF apache maven - 3.3.9 - bin. Tar. GzSet environment variables
vi /etc/profile Add the following configuration

export MAVEN_HOME=/usr/local/ apache maven -- 3.3.9export PATH=${MAVEN_HOME}/bin:${PATH}

# Configure the Maven repository. Here we recommend a Maven mirror for Aliyun
vi /usr/local/apache-maven-3.3.9/conf/settings. XML <mirror> <id>alimaven</id> <name> Aliyun maven</name> <url>http://maven.aliyun.com/nexus/content/groups/public/</url> <mirrorOf>central</mirrorOf> </mirror>Copy the code

PhantomJS 2

To use kanban export and mail functions, PhantomJS 2.1+ needs to be installed

Fontconfig-devel is the main package to rely on
yum install -y gcc gcc-c++ make git openssl-devel freetype-devel fontconfig-devel

# There may be garbled characters when using PhantomJS to capture Chinese pages. Install the font to solve the problem
yum install -y bitmap-fonts bitmap-fonts-cjk

# decompressionThe tar XVF phantomjs 2.1.1 - Linux - x86_64. Tar..bz2# Create soft links
ln -sf /usr/local/ phantomjs - 2.1.1 - Linux - x86_64 / bin/phantomjs/usr /local/bin/phantomjs
Copy the code

Download the source code

The version used is 0.4.2

git clonehttps://github.com/yzhang921/CBoard.git git checkout branch - 0.4.2Copy the code

Initialize metadata

The current version (0.4) only supports MySQL and Oracle metadata stores, so MySQL is used first. Two sample files are provided to initialize the demo, or you can create it from scratch using a blank database. Because you’ve already tried the demo library, you created it from scratch this time.

create database cboard;
use cboard;
source sql/mysql/mysql.sql;
Copy the code

Modifying a Configuration File

The configuration file can be modified before the code is compiled. The parts that need to be modified include

  • Metadata connection information (in this case, MySQL’s Cboard library)
  • Mail server configuration (scheduled task for exporting Kanban)
vi src/main/resources/config.properties
Copy the code

The default display language of the platform is Chinese

vi src\main\webapp\org\cboard\Settings.js
Copy the code
Var Settings = {preferredLanguage: "en", preferredLanguage: "en", preferredLanguage: "en"}Copy the code

The JDBC driver for SQLServer is included by default in the source code dependencies, but it does not exist in the public Maven repository. SQLServer database is not involved in my environment, so remove the following dependencies from the pom

vi pom.xml
Copy the code
<dependency>
    <groupId>com.microsoft.sqlserver</groupId>
    <artifactId>sqljdbc4</artifactId>
    <version>4.0</version>
</dependency>
Copy the code

Compile source code

mvn clean package
Copy the code

The deployment of

  • Copy CBoard\target\cboard.war to tomcat’s webApp directory

  • Start tomcat

  • The default user name and password are admin and root123

Use the CBoard

Each step is explained in great detail in the official Chinese language manual. So I’m only going to record part of my testing process here. Here still want to praise, domestic open source software in the documentation can do this degree, really very good.

The data source

CBoard supports multiple data sources (JDBC, Kylin, Saiku, etc.), but JDBC is the one we use most often. Fill in the parameters in the standard JDBC connection string format. In the following figure, I use PostgreSQL as the data warehouse.

The data set

The data set in CBoard is called Cube. So the process of creating a data set is the process of designing dimensions, layers, measures.

Unlike other BI platforms, there is no need to specify dimension tables and fact tables in CBoard. Instead, star model results are obtained directly using query statements. As a lightweight BI, I think this approach is appropriate, with the immediate benefit of not having to rely on the topic model of the data warehouse layer, and even in scenarios without a star model, you can use SQL to fetch data and adjust SQL statements as your needs change.

The steps for designing the dataset are shown below

  1. Provide SQL statements to get the results of the star model, where I used views directly.
  2. If a real-time interval is provided, the data set is refreshed automatically, and if I use a time function in SQL to get data within 5 minutes, I need to set a refresh period here.
  3. All fields in the query result are candidates for dimensions and metrics.
  4. Dimensions support both a single dimension (such as date in the figure above) and a hierarchical dimension (such as YQMD in the figure above), and anyone who knows OLAP knows the difference.
  5. There is no need to specify the calculation method in the indicator column, that is, only fields are provided, and aggregation functions are not provided. Here also reflects the lightweight design idea, the action of selecting aggregate function is determined at the time of use, the advantage is that the number of indicators will be smaller.
  6. Alternative expressions can perform simple calculations such as sum(A)/count(B), but cannot perform complex operations such as year-over-year comparisons.
  7. Set common filter criteria, which will be handy when designing diagrams later.

Graphic design

Chart design is as simple as designing an EXCEL PivotTable: drag releases, columns, metrics, and then preview tweaks. Experience with pivottables is a quick start, as shown below

Kanban design

In the current version, I think the kanban design is a big flaw. Kanban design is always a bit awkward because there is no wySIWYG drag-and-drop. In CBoard, you divide the page into rows, then divide the columns in each row, and place multiple charts on the same row by specifying the width of the column. This approach can achieve extensive design intent, but for some refined design is powerless.

The following figure shows the kanban in the design phase and the display phase respectively. You can see how the elements on the kanban are placed in the corresponding positions

Timing task

Scheduled tasks allow kanban content to be emailed at a fixed time, which is very useful. One problem I found, however, was that kanban data was sent ignoring kanban level parameters and using only data set and chart level parameters.

The scheduled task setup is very simple, so I won’t take screenshots here.

drawback

In the version I tested (0.4.2), I also found several imperfections that I hope will be improved in future releases.

  • The data set cannot be directly set year-on-year comparison and other operations. This is a tall order for lightweight BI, and I currently know of only two methods that can set complex metrics directly in the Cube layer: MDX and DAX.
  • Drag and drop is not yet implemented in kanban design, so some refined layouts are not possible. The new full-screen monitor kanban, though, already allows for drag-and-drop layouts (I haven’t used this feature in depth yet).
  • Export/mail function, cannot export according to the filter set on kanban. In my tests, each email I exported was a complete set of data contents, and I didn’t find a way to export only a portion of the data. I don’t know if it’s my posture.