This article is published on the Nebula Graph Community public account

Open Source Summer

The Open Source Software Supply Chain Lighting Program – Summer 2021 (hereinafter “Open Source Summer”) is a summer program for college students jointly organized by the Institute of Software, Chinese Academy of Sciences and the openEuler community. It aims to encourage college students to take an active part in developing and maintaining open source software and to promote the growth of excellent open source communities. Together with Nebula Graph and other open source communities in China, the Institute offers development and maintenance projects on key open source software to students from universities around the world. Students choose a project freely, discuss the implementation plan with a community mentor, and write a project proposal. Selected students complete the development work as planned under the guidance of community mentors and contribute the results to the community. Depending on the difficulty and completion of the project, participants receive a project bonus of 6,000 to 12,000 from the sponsor.

Official website: summer.iscas.ac.cn/

The Nebula Graph project in this program: adding JDBC protocol support to Nebula Graph.

Project information

The Nebula Graph database supports the JDBC protocol

Project details

Give Nebula Graph access to the JDBC protocol by implementing the Nebula JDBC Driver and the associated JDBC interfaces. Requirements: users can operate the Nebula service directly through the JDBC driver, and the project repo has automated unit tests.

Nebula Graph description

A reliable distributed graph database with linear scalability and high performance; the world’s only graph database solution capable of holding hundreds of billions of vertices and trillions of edges while providing millisecond query latency.

Nebula Graph characteristics

  • Open source: working with the community to popularize and promote graph databases;
  • Security: role-based permission control, so that only authorized access is allowed;
  • Ecosystem: supports Spark, Hadoop, GraphX, Plato and other surrounding ecosystem tools;
  • High performance: Nebula Graph maintains high throughput while still delivering low-latency reads and writes;
  • Scalability: linear scaling based on a shared-nothing distributed architecture;
  • Compatibility: step-by-step compatibility with openCypher 9;
  • High availability: supports various methods of recovering from abnormal data, ensuring high availability of services in the case of partial failures;
  • Stable release: proven in the production environments of leading Internet companies such as Jingdong, Meituan and Xiaohongshu.

Nebula Graph has an active community and timely technical support, available on the nebula-graph.com.cn website and at the GitHub repository github.com/vesoft-inc/… . You are welcome to become a Nebula Graph contributor!

Project implementation

Plan description

  • Understand Nebula Graph’s features and basic usage;
  • Investigate JDBC driver development, read the JDBC specification documents, and understand the interfaces to be implemented;
  • Study the neo4j-jdbc (github.com/neo4j-contr…) and nebula-java (github.com/vesoft-inc/…) projects, read the source code, and understand the main logic and code style of the projects;
  • Use nebula-java (github.com/vesoft-inc/…) to communicate with the database, write the code that implements the JDBC interfaces for Nebula Graph, and write unit tests.

Implementation description

The idea for this project is clear: implement a set of interfaces from the JDBC specification (mostly in the java.sql package) together with the methods they declare. All the interfaces in the JDBC specification add up to hundreds of methods. Nebula Graph is a next-generation graph database: it lacks some features of the traditional relational databases for which JDBC was primarily designed, and it has some features that relational databases do not. So for Nebula Graph the methods in the JDBC specification are both redundant (some need no real implementation) and insufficient (some methods that are needed are not defined in the relevant interfaces). In the concrete implementation, abstract classes directly implement the main interfaces of the specification, and implementation classes that extend them implement the important methods, so that the implementation classes are not too cluttered to read. The methods are handled as follows:

for (method : interface methods) {
    if (method does not require a specific implementation) {
        // e.g. Statement::getGeneratedKeys()
        Override in the abstract class; the method body throws a SQLFeatureNotSupportedException;
    } else if (method needs to be implemented but is not a core method) {
        // e.g. Statement::isClosed()
        Override in the abstract class;
    } else if (method needs to be implemented and is a core method) {
        // e.g. Statement::execute(String nGql)
        Override in the concrete implementation class;
    } else if (method is not defined in the interface but needs to be implemented) {
        // e.g. NebulaResult::getNode, getEdge, getPath
        Implement in the concrete implementation class;
    }
}
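The layering above can be illustrated with a minimal Java sketch. It is not the project's code: to stay short, a hypothetical three-method interface `MiniStatement` stands in for java.sql.Statement, and the class names `AbstractMiniStatement` and `MiniStatementImpl` are made up for illustration. Only `SQLException` and `SQLFeatureNotSupportedException` are real JDK types.

```java
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;

final class LayeredStatementSketch {

    /** Hypothetical stand-in for java.sql.Statement, reduced to three methods. */
    interface MiniStatement {
        boolean execute(String nGql) throws SQLException;
        boolean isClosed() throws SQLException;
        Object getGeneratedKeys() throws SQLException;
    }

    /** Abstract layer: unsupported methods and simple non-core methods live here. */
    abstract static class AbstractMiniStatement implements MiniStatement {
        protected boolean closed = false;

        @Override
        public boolean isClosed() {
            return closed;
        }

        @Override
        public Object getGeneratedKeys() throws SQLException {
            // Not meaningful for this database, so reject the call explicitly.
            throw new SQLFeatureNotSupportedException("getGeneratedKeys is not supported");
        }
    }

    /** Concrete layer: only the core method remains to be written. */
    static final class MiniStatementImpl extends AbstractMiniStatement {
        String lastStatement; // records what would be sent to the server

        @Override
        public boolean execute(String nGql) throws SQLException {
            if (closed) {
                throw new SQLException("Statement is closed");
            }
            lastStatement = nGql; // a real driver would send this over the network
            return true;
        }

        void close() {
            closed = true;
        }
    }

    public static void main(String[] args) throws SQLException {
        MiniStatementImpl statement = new MiniStatementImpl();
        System.out.println(statement.execute("SHOW SPACES"));
        try {
            statement.getGeneratedKeys();
        } catch (SQLFeatureNotSupportedException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With this split, the concrete class contains only the methods a reader is likely to care about, while the long tail of unsupported or trivial methods stays in the abstract class.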

The main implements and extends relationships in the project are as follows (blue solid lines are extends relationships between classes, green solid lines are extends relationships between interfaces, and green dotted lines are implements relationships between abstract classes and interfaces).

Workflow and analysis of the main methods in each class:

// NebulaDriver holds a NebulaPool as one of its attributes; Sessions for communicating with the database are obtained from it.
// NebulaDriver provides two constructors: the no-argument constructor uses the default NebulaPool configuration,
// and the constructor that takes a Properties argument lets users customize the NebulaPool configuration.
public NebulaDriver() throws SQLException {
    this.setDefaultPoolProperties();
    this.initNebulaPool();
    // Register this driver with DriverManager
    DriverManager.registerDriver(this);
}

public NebulaDriver(Properties poolProperties) throws SQLException {
    this.poolProperties = poolProperties;
    this.initNebulaPool();
    // Register this driver with DriverManager
    DriverManager.registerDriver(this);
}

// After the driver is registered, users can obtain a connection through DriverManager::getConnection(String url).
// The NebulaConnection constructor obtains a Session from the NebulaDriver and then connects to the graph space specified in the URL.
// With the Connection, users can call Connection::createStatement or Connection::prepareStatement to get a Statement or
// PreparedStatement object, and call its execute method to send commands to the database. The results of the command are
// encapsulated in a NebulaResult, whose methods return data of different types.
// NebulaResult currently implements the following data acquisition methods, which correspond to the data types in Nebula Graph:
public String getString();
public int getInt();
public long getLong();
public boolean getBoolean();
public double getDouble();
public java.sql.Date getDate();
public java.sql.Time getTime();
public DateTimeWrapper getDateTime();
public Node getNode();
public Relationship getEdge();
public PathWrapper getPath();
public List getList();
public Set getSet();
public Map getMap();

Project progress

Work done

  • Deployed Nebula Graph and learned its basic usage;
  • Read the JDBC specification documents (download.oracle.com/otn-pub/jcp…) and clarified the implementation requirements;
  • Studied the nebula-java (github.com/vesoft-inc/…) source code.

Completed the following implementation:

Problems encountered and solutions

How to communicate with the database:

During the early stages of the project, I had no idea how to communicate with the database. After studying neo4j-jdbc, the JDBC implementation for Neo4j, I took a rather crude approach and used an HTTP framework to communicate with the database through the Nebula Graph API. When I was done, I asked my mentor whether the idea would work, and he told me that I could communicate with Nebula Graph over RPC using the existing nebula-java library instead of reinventing the wheel.

About getting a Connection:

Some parameters of the NebulaPoolConfig class are configurable. My idea was to specify them in the connection string, for example: “jdbc:nebula://ip:port/graphSpace?maxConnsSize=10&reconnect=true”.
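To make the connection-string idea concrete, here is a minimal sketch of how such a URL could be split into configuration entries. It is an illustration only, not the driver's actual parser: the class `NebulaUrlSketch` and the property keys `host`, `port` and `graphSpace` are hypothetical, and the parameter names are simply taken from the example string.

```java
import java.util.Properties;

final class NebulaUrlSketch {
    static final String PREFIX = "jdbc:nebula://";

    /** Parses "jdbc:nebula://ip:port/graphSpace?key=value&..." into Properties. */
    static Properties parse(String url) {
        if (url == null || !url.startsWith(PREFIX)) {
            throw new IllegalArgumentException("Not a Nebula JDBC URL: " + url);
        }
        String rest = url.substring(PREFIX.length());

        // Split off the optional "?key=value&key=value" part.
        String query = "";
        int questionMark = rest.indexOf('?');
        if (questionMark >= 0) {
            query = rest.substring(questionMark + 1);
            rest = rest.substring(0, questionMark);
        }

        // What remains is "ip:port/graphSpace".
        int slash = rest.indexOf('/');
        if (slash < 0 || slash == rest.length() - 1) {
            throw new IllegalArgumentException("Missing graph space in URL: " + url);
        }
        String hostPort = rest.substring(0, slash);
        int colon = hostPort.lastIndexOf(':');
        if (colon <= 0 || colon == hostPort.length() - 1) {
            throw new IllegalArgumentException("Expected ip:port in URL: " + url);
        }

        Properties props = new Properties();
        props.setProperty("host", hostPort.substring(0, colon));
        props.setProperty("port", hostPort.substring(colon + 1));
        props.setProperty("graphSpace", rest.substring(slash + 1));

        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue; // no query string, or a stray '&'
            }
            int equals = pair.indexOf('=');
            if (equals <= 0) {
                throw new IllegalArgumentException("Malformed parameter: " + pair);
            }
            props.setProperty(pair.substring(0, equals), pair.substring(equals + 1));
        }
        return props;
    }

    public static void main(String[] args) {
        Properties props = parse("jdbc:nebula://127.0.0.1:9669/test?maxConnsSize=10&reconnect=true");
        System.out.println(props.getProperty("graphSpace") + " " + props.getProperty("maxConnsSize"));
    }
}
```

Values are kept as strings here. A real driver would still have to validate them and convert them to the types its pool configuration expects.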

After I consulted my mentor, he suggested supporting two ways for users to obtain a connection: one that uses the default configuration and one that lets users specify the configuration. For example:

// default configuration
DriverManager.getConnection(url, username, password);
​
// customized configuration
DriverManager.getConnection(url, config);

About PreparedStatement:

Relational databases support precompiling query statements. A PreparedStatement can send SQL to the DBMS to be precompiled and pass in the parameters afterwards, which improves performance and prevents SQL injection attacks. Nebula Graph does not have this functionality at present, so the placeholders in the nGQL are parsed locally and the parameters are filled in before the statement is sent, which is essentially the same as using a Statement.
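A minimal sketch of what filling placeholders locally can look like is shown below. It is not the project's implementation: the class `LocalPlaceholderSketch` and its methods are hypothetical, only `?` placeholders with 1-based indexes are handled, and the way strings are quoted (double quotes with backslash escapes) is an assumption about how the values should be written in nGQL.

```java
import java.util.HashMap;
import java.util.Map;

final class LocalPlaceholderSketch {

    /** Replaces each '?' outside of quoted literals with the parameter at that 1-based index. */
    static String fill(String nGql, Map<Integer, Object> params) {
        StringBuilder out = new StringBuilder();
        int index = 0;
        char quote = 0; // the active quote character while inside a literal, otherwise 0
        for (int i = 0; i < nGql.length(); i++) {
            char c = nGql.charAt(i);
            if (quote != 0) {
                out.append(c);
                if (c == '\\' && i + 1 < nGql.length()) {
                    out.append(nGql.charAt(++i)); // copy the escaped character untouched
                } else if (c == quote) {
                    quote = 0; // the literal ends here
                }
            } else if (c == '"' || c == '\'') {
                quote = c;
                out.append(c);
            } else if (c == '?') {
                index++;
                if (!params.containsKey(index)) {
                    throw new IllegalStateException("No value for parameter " + index);
                }
                out.append(render(params.get(index)));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    /** Strings are quoted and escaped; every other value uses its string form. */
    static String render(Object value) {
        if (value instanceof String) {
            String s = (String) value;
            return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
        }
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        Map<Integer, Object> params = new HashMap<>();
        params.put(1, "Tim");
        params.put(2, 42);
        // prints: RETURN "Tim" AS name, 42 AS age
        System.out.println(fill("RETURN ? AS name, ? AS age", params));
    }
}
```

Because the substitution happens on the client, the statement gains none of the performance benefit of server-side precompilation, and protection against injection depends entirely on how carefully the values are escaped.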

nebula-java version problem:

The project initially depended on version 2.0.0, and in one query the result was found to be inconsistent with the result returned by the console. After consulting my mentor, I learned that this was a bug in that version and switched to the latest 2.0.0-SNAPSHOT version.

updateCount problem:

Some methods in the JDBC interfaces must return the number of records affected by the statement. However, the server does not return updateCount statistics to the user. If the user inserts multiple vertices or edges in one insert statement, only part of them may succeed, yet the server only reports that the statement failed, although the user may in fact be able to query some of the data afterwards. For now, updateCount returns 0, and a comment on the interface states that it is not supported.

NebulaPool initialization problem:

At first I initialized NebulaPool and obtained the Session when initializing a Connection, and I confused the configuration of NebulaPool with the configuration of the Session. It makes no sense to reinitialize NebulaPool every time a Connection is obtained. After I submitted the code to GitLab, my mentor reviewed it and pointed out my mistake. He recommended moving the initialization and shutdown of NebulaPool into NebulaDriver, and providing both a default configuration and a custom configuration for initializing NebulaPool.
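The ownership change can be sketched as follows. This is an illustration with made-up types, not the nebula-java API: `FakePool` is a hypothetical stand-in for a connection pool, `DriverSketch` and `ConnectionSketch` are hypothetical counterparts of the driver and connection classes, and the property key `maxConnSize` is an assumption.

```java
import java.util.Properties;

final class PoolOwnershipSketch {

    /** Hypothetical stand-in for a connection pool. */
    static final class FakePool {
        final int maxConnSize;
        int sessionsIssued = 0;
        private boolean open = true;

        FakePool(Properties poolProperties) {
            // Fall back to a default when the user did not configure the size.
            this.maxConnSize = Integer.parseInt(poolProperties.getProperty("maxConnSize", "10"));
        }

        String getSession(String user) {
            if (!open) {
                throw new IllegalStateException("Pool is closed");
            }
            sessionsIssued++;
            return "session-" + sessionsIssued + "-" + user;
        }

        void close() {
            open = false;
        }
    }

    /** The driver creates the pool exactly once and is also the one that shuts it down. */
    static final class DriverSketch {
        final FakePool pool;

        DriverSketch() {
            this(new Properties()); // default pool configuration
        }

        DriverSketch(Properties poolProperties) {
            this.pool = new FakePool(poolProperties); // custom pool configuration
        }

        ConnectionSketch connect(String user, String graphSpace) {
            // Each connection only borrows a session; it never touches pool setup.
            return new ConnectionSketch(pool.getSession(user), graphSpace);
        }

        void close() {
            pool.close();
        }
    }

    /** A connection knows its session and graph space, nothing about pool configuration. */
    static final class ConnectionSketch {
        final String session;
        final String graphSpace;

        ConnectionSketch(String session, String graphSpace) {
            this.session = session;
            this.graphSpace = graphSpace;
        }
    }

    public static void main(String[] args) {
        DriverSketch driver = new DriverSketch();
        driver.connect("root", "space1");
        driver.connect("root", "space2");
        System.out.println(driver.pool.sessionsIssued);
        driver.close();
    }
}
```

Keeping the pool in the driver separates the two kinds of configuration that were confused at first: pool settings are fixed when the driver is constructed, while per-connection details such as the user and the graph space are supplied when a connection is requested.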

Follow-up work plan

  • Complete the methods in the interfaces that should be implemented but are not yet;
  • Improve the code comments;
  • Complete the unit tests;
  • Write the usage documentation.

Acknowledgements

This event promoted the development of open source software and the building of excellent open source communities, increased the activity of open source projects, and advanced the open source ecosystem. Thanks to the organizers of @Open Source Summer for providing the platform and the opportunity.

Thanks to my mentor @Laura.Ding for carefully reviewing my PR code and guiding me patiently, which helped me see my shortcomings. Thanks also to the @Nebula Graph operations staff for sending me Nebula merchandise. LUCKY!

This is an original article by Zheng Dongyang.
