The pain of comparing data
I don’t know if you, like me, have to do some database comparison work while writing code happily.
Admittedly, general data differences, such as the content of each line is different, there are mature comparison tools such as Compare2 in the market.
But if you’re comparing whether each column of the data is the same, this becomes a little more cumbersome.
V1.0 pure manual comparison
When we do some functions such as data migration, we often need to compare whether the data is correct. The most common method is manual comparison column by column.
At the beginning, old Ma is so right with colleagues, after a few days of feeling efficiency is really low, but also easy to see.
So I thought, this thing should be a lot easier to compare programmatically.
V2.0 semi-manual comparison
Insert into XXX (xx, xx, xx) values (xx, xx, xx); The utility class.
Then compare the two sides of the data, this for the comparison of hundreds of fields in a table, some become much faster, the accuracy is also much higher.
Don’t ask me why there are hundreds of fields, they are the treasures of history…
Ps: Whether the insert into statement is manually exported using the database connection tool.
Later, I found another problem: there were too many tables. If I wanted to change a data comparison, it would take me tens of minutes to manually export again. The key was repetition and boredom.
Since it is repetitive, can it be implemented programmatically?
V3.0 compares basic automation
So I stayed up late after work to implement this version: The Java program implements the data export persistence, and then makes changes before and after the difference comparison.
Below I share their ideas, as well as the core source code, at the end of the download benefits.
I hope I can help you with your work and study.
The whole idea
I want this tool to be the MVP concept, moving from simple to complex and gradually enriching features.
To have scalability, currently support mysql/ Oracle/SQL Server and other mainstream databases, users can customize development.
With as few dependencies as possible, using native JDBC without introducing frameworks like Mybatis.
The core depends on
Here are the core dependencies I used:
Fastjson is used to persist data to JSON
Mysql-connector-java database connection driver
Jsqlparser is not required to parse SQL
Implementation approach
-
Automatically selects the corresponding JDBC implementation based on the specified JDBC connection information.
-
Execute the corresponding SQL, parse the result into a map, and carry out JSON persistence
-
The difference comparison of persistent JSON shows the difference results
With this idea, everything becomes unpretentious.
Of course, before this, we need to implement the code, the following to enter the BUG link:
JDBC implementation
The core interface
Considering different database implementations later, we define a unified query interface
/** * JDBC access layer *@authorThe old horse shouts the west wind@date2017/8/1 * /
public interface JdbcMapper {
/** * Execute the query statement *@param querySql
* @return* /
ResultSet query(String querySql);
}
Copy the code
Abstract implementation
The basic abstract implementation is provided here.
Subclasses only need to implement the corresponding connection to get information.
public abstract class AbstractJdbcMapper implements JdbcMapper {
protected JdbcVo jdbcVo;
public AbstractJdbcMapper(JdbcVo jdbcVo) {
this.jdbcVo = jdbcVo;
}
/** * get database connection *@return* /
protected abstract Connection getConnection(a);
@Override
public ResultSet query(String querySql) {
ResultSet rs = null;
Connection connection = getConnection();
try {
Statement stmt = null;
stmt = connection.createStatement();
rs = stmt.executeQuery(querySql);
} catch (Exception e) {
System.out.println("SQL: " + querySql);
throw new ExportdbException(e);
}
returnrs; }}Copy the code
JdbcVo Connection information
This object is mainly a database connection information object:
public class JdbcVo {
/** * Driver class name */
private String driverClassName;
/** * database link */
private String url;
/** * User name */
private String username;
/** * Password */
private String password;
//getter & setter
}
Copy the code
Mysql implementation
Mysql is used as an example:
import com.github.houbb.exportdb.dto.JdbcVo;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
/** * mysql implementation *@authorThe old horse shouts the west wind@date2017/8/1 * /
public class MySqlJdbcMapper extends AbstractJdbcMapper {
public MySqlJdbcMapper(JdbcVo jdbcVo) {
super(jdbcVo);
}
@Override
protected Connection getConnection(a) {
try {
Class.forName(jdbcVo.getDriverClassName());
return DriverManager.getConnection(jdbcVo.getUrl(),
jdbcVo.getUsername(),
jdbcVo.getPassword());
} catch (ClassNotFoundException | SQLException e) {
e.printStackTrace();
}
return null; }}Copy the code
Here is mainly the initialization of the connection, connection to different databases, need to introduce the corresponding data source.
Line data export implementation
Here is the exported core implementation:
The interface definition
public interface IExportdb {
/** * query *@paramContext *@param sql sql
* @returnResults *@since0.0.1 * /
QueryResultVo query(final ExportdbContext context, final String sql);
}
Copy the code
This specifies the SQL that needs to be executed.
Context in order to facilitate later expansion, currently only JdbcMapper.
Return QueryResultVo, which is the query result, as defined below:
public class QueryResultVo {
/** * table name */
private String tableName;
/** * Database name **@sinceHundreds * /
private String databaseName;
/** * result set */
private List<Map<String, Object>> resultMaps;
/** * execute SQL */
private String sql;
//getter & setter
}
Copy the code
The default implementation
The default export implementation is as follows:
import com.github.houbb.exportdb.core.ExportdbContext;
import com.github.houbb.exportdb.core.IExportdb;
import com.github.houbb.exportdb.dal.JdbcMapper;
import com.github.houbb.exportdb.dto.QueryResultVo;
import com.github.houbb.exportdb.exception.ExportdbException;
import com.github.houbb.heaven.util.lang.StringUtil;
import net.sf.jsqlparser.JSQLParserException;
import net.sf.jsqlparser.parser.CCJSqlParserUtil;
import net.sf.jsqlparser.statement.Statement;
import net.sf.jsqlparser.statement.insert.Insert;
import net.sf.jsqlparser.statement.select.PlainSelect;
import net.sf.jsqlparser.statement.select.Select;
import java.sql.ResultSet;
import java.sql.ResultSetMetaData;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
/ * * *@author binbin.hou
* @since0.0.1 * /
public class Exportdb implements IExportdb {
@Override
public QueryResultVo query(ExportdbContext context, String sql) {
try {
final JdbcMapper jdbcMapper = context.jdbcMapper();
ResultSet resultSet = jdbcMapper.query(sql);
List<Map<String, Object>> maps = new ArrayList<>();
String tableName = null;
while (resultSet.next()) {
final ResultSetMetaData metaData = resultSet.getMetaData();
// Set the table name
if(tableName == null) {
tableName = metaData.getTableName(1);
}
Map<String, Object> map = new LinkedHashMap<>();
If the value is null, an error is reported if the value is greater than 1
// Total number of columns
int columnCount = metaData.getColumnCount();
for (int i = 1; i <= columnCount; i++) {
String columnName = metaData.getColumnName(i);
Object value = resultSet.getObject(columnName);
map.put(columnName, value);
}
maps.add(map);
}
if(StringUtil.isEmptyTrim(tableName)) {
Statement statement = CCJSqlParserUtil.parse(sql);
Select select = (Select)statement;
PlainSelect plainSelect = (PlainSelect) select.getSelectBody();
tableName = plainSelect.getFromItem().toString();
}
return QueryResultVo.newInstance().tableName(tableName)
.databaseName("")
.sql(sql)
.resultMaps(maps);
} catch (SQLException | JSQLParserException throwables) {
throw newExportdbException(throwables); }}}Copy the code
In fact, the implementation is very simple, we mainly talk about two points:
(1) Table name
After the test, you can obtain mysql in the following ways:
resultSet.getMetaData();
tableName = metaData.getTableName(1);
Copy the code
When I was testing Oracle, I found it was not available. So sqlParser is used to parse our query.
For the time being, it mainly supports queries, so some of these are fixed and can be optimized later.
if(StringUtil.isEmptyTrim(tableName)) {
Statement statement = CCJSqlParserUtil.parse(sql);
Select select = (Select)statement;
PlainSelect plainSelect = (PlainSelect) select.getSelectBody();
tableName = plainSelect.getFromItem().toString();
}
Copy the code
(2) Column information
For each query, there may be multiple records.
Let’s look at the construction of each record:
while (resultSet.next()) {
final ResultSetMetaData metaData = resultSet.getMetaData();
Map<String, Object> map = new LinkedHashMap<>();
If the value is null, an error is reported if the value is greater than 1
// Total number of columns
int columnCount = metaData.getColumnCount();
for (int i = 1; i <= columnCount; i++) {
String columnName = metaData.getColumnName(i);
Object value = resultSet.getObject(columnName);
map.put(columnName, value);
}
maps.add(map);
}
Copy the code
This often write JDBC partners must be familiar.
You say now all use mybatis, who still write JDBC ah, so low.
So, your own handwriting a Mybatis, these are also will.
Hand-write mybatis (a) MVP version from scratch
Differences in contrast
Use of exports
We can export one row of data, we can export it before and after we modify it.
If you export to different libraries, different tables, then export between different library tables.
Once you export the results, you need to compare them.
Compare the implementation
The interface definition
You can choose what to do with the exported results based on your own situation.
For example, export to CSV/JSON/INSERT, etc., comparison differences can also be customized according to their own requirements.
public interface IQueryResultHandler {
/** * Result processing class *@paramQueryResultVo Query result */
void handler(final QueryResultVo queryResultVo);
}
Copy the code
persistence
Here is a simple and practical way: JSON persistence.
import com.alibaba.fastjson.JSON;
import com.alibaba.fastjson.serializer.SerializerFeature;
import com.github.houbb.exportdb.dto.QueryResultVo;
import com.github.houbb.exportdb.support.result.IQueryResultHandler;
import com.github.houbb.heaven.util.io.FileUtil;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
/ * * *@authorThe old horse shouts the west wind@since0.0.1 * /
public class FileJsonQueryResultHandler implements IQueryResultHandler {
/** * Default file output path ** Automatically set according to the operating system *@since0.0.1 * /
private final String dir;
public FileJsonQueryResultHandler(String dir) {
this.dir = dir;
}
public FileJsonQueryResultHandler(a) {
this("D:\\exportdb\\");
}
/** * Result processing class **@paramQueryResultVo Query result */
@Override
public void handler(final QueryResultVo queryResultVo) {
String path = dir+queryResultVo.tableName()+".edb";
System.out.println("File path:" + path);
List<Map<String, Object>> list = queryResultVo.resultMaps();
List<String> lines = new ArrayList<>(list.size()+1);
lines.add("--"+queryResultVo.sql());
for(Map<String, Object> map : list) { lines.add(JSON.toJSONString(map, SerializerFeature.WriteMapNullValue)); } FileUtil.write(path, lines); }}Copy the code
We will row data persistence to a file, note here specifies the JSON toJSONString (map, SerializerFeature. WriteMapNullValue);
This allows null fields to be printed as well, making it easier to compare.
File difference comparison implementation
Above we assume that the output file to two files, the following file path can be used for comparison:
/** ** *@paramOldPath Original path *@paramNewPath newPath */
public static void differ(final String oldPath, final String newPath) {
List<String> oldLines = FileUtil.readAllLines(oldPath);
List<String> newLines = FileUtil.readAllLines(newPath);
System.out.println(FileUtil.getFileName(oldPath)+"The comparison begins ---------------");
for(int i = 0; i < oldLines.size(); i++) {
String oldL = oldLines.get(i);
String newL = newLines.get(i);
if(oldL.startsWith("--")) {
continue;
}
System.out.println("The first" + (i+1) +"Line comparison:");
differMaps(oldL, newL);
}
System.out.println(FileUtil.getFileName(oldPath)+"End of comparison ---------------");
System.out.println();
}
private static void differMaps(final String oldMap, final String newMap) {
Map<String, Object> om = JSON.parseObject(oldMap);
Map<String, Object> nm = JSON.parseObject(newMap);
for(Map.Entry<String, Object> entry : om.entrySet()) {
String key = entry.getKey();
Object oldV = om.get(key);
Object newV = nm.get(key);
// Skip null comparisons
if(oldV == null && newV == null) {
continue;
}
if(! ObjectUtil.isEquals(oldV, newV)) { System.out.println("Difference column:" + key +", old value:" + oldV + ", new value:"+ newV); }}}Copy the code
The difference content is output directly to the console console.
folder
Of course, we can also compare the contents of the two folders.
The implementation is as follows:
public static void differDir(final String oldDir, final String newDir) {
File[] oldFiles = new File(oldDir).listFiles();
for(File file : oldFiles) { String fileName = file.getName(); String aop = file.getAbsolutePath(); String anp = newDir+fileName; differ(aop, anp); }}Copy the code
The bootstrap class
convenience
We’ve got the core implementation all figured out, but it’s still not user-friendly enough. Because the configuration and so on are not elegant.
So we introduce a bootstrap class to help users quickly use:
/ * * *@authorThe old horse shouts the west wind@since0.0.1 * /
public class ExportdbBs {
private ExportdbBs(a){}
/** * export implementation *@since0.0.1 * /
private final IExportdb exportdb = new Exportdb();
/** * Driver class name */
private String driverName = DriverNameConstant.MYSQL;
/** * database link */
private String url = "jdbc:mysql://localhost:3306/test";
/** * User name */
private String username = "root";
/** * Password */
private String password = "123456";
public static ExportdbBs newInstance(a) {
return new ExportdbBs();
}
public ExportdbBs driverName(String driverName) {
this.driverName = driverName;
return this;
}
public ExportdbBs url(String url) {
this.url = url;
return this;
}
public ExportdbBs username(String username) {
this.username = username;
return this;
}
public ExportdbBs password(String password) {
this.password = password;
return this;
}
/** * query *@param sql sql
* @returnResults *@since0.0.1 * /
public QueryResultVo query(final String sql) {
/ / 1. Construct the vo
JdbcVo jdbcVo = new JdbcVo(driverName, url, username, password);
/ / 2. Get the mapper
final JdbcMapper jdbcMapper = getJdbcMapper(jdbcVo);
//3. Build context
final ExportdbContext context = ExportdbContext.newInstance().jdbcMapper(jdbcMapper);
return this.exportdb.query(context, sql);
}
/** * query and process *@paramQueryResultHandler queryResultHandler *@param sql sql
* @since0.0.1 * /
public void queryAndHandle(final IQueryResultHandler queryResultHandler,
final String sql, final String... otherSqls) {
QueryResultVo queryResultVo = this.query(sql);
queryResultHandler.handler(queryResultVo);
// Do the same with other SQL statements
for(String os : otherSqls) {
QueryResultVo vo = this.query(os); queryResultHandler.handler(vo); }}/** * query and process *@paramQueryResultHandler queryResultHandler *@paramSqlList SQL list *@sinceHundreds * /
public void queryAndHandle(final IQueryResultHandler queryResultHandler,
List<String> sqlList) {
// Do the same with other SQL statements
for(String sql : sqlList) {
System.out.println("Commence execution:" + sql);
QueryResultVo vo = this.query(sql); queryResultHandler.handler(vo); }}private JdbcMapper getJdbcMapper(JdbcVo jdbcVo) {
if(DriverNameConstant.MYSQL.equalsIgnoreCase(driverName)) {
return new MySqlJdbcMapper(jdbcVo);
}
if(DriverNameConstant.ORACLE.equalsIgnoreCase(driverName)) {
return new OracleJdbcMapper(jdbcVo);
}
if(DriverNameConstant.SQL_SERVER.equalsIgnoreCase(driverName)) {
return new SqlServerJdbcMapper(jdbcVo);
}
throw newUnsupportedOperationException(); }}Copy the code
This section provides users with the most basic configuration of mysql and common query processing methods.
test
Here’s how it works:
Direct query
QueryResultVo resultVo = ExportdbBs.newInstance().query("select * from user;");
System.out.println(resultVo);
Copy the code
Query and process
final String sql = "select * from user;";
final IQueryResultHandler handler = new FileJsonQueryResultHandler();
ExportdbBs.newInstance().queryAndHandle(handler, sql);
Copy the code
You can specify file paths for two exports, for example:
D: \ exportdb \ old \ and D: \ exportdb \ new \
Compare the two results
final String oldP = "D:\\exportdb\\old\\";
final String newP = "D:\\exportdb\\new\\";
CompareUtil.differDir(oldP, newP);
Copy the code
The difference result is output to the console.
Everything is going well, but the revolution is not yet successful, students still need to work overtime
Deficiency in
This is a V0.0.1 release, and there’s a lot left to be desired.
Such as:
-
Export to CSV
-
The export is an INSERT/UPDATE statement
-
Name of the exported file User-defined policy
-
You can specify whether multiple SQL is generated in the same file
-
The export path is automatically changed based on the operating system
-
More convenient to use, for example, the page specified data source + SQL, the page shows the corresponding difference results.
But it’s basically usable, in line with our initial vision.
summary
How do you compare data at ordinary times?
If you need this tool, you can pay attention to [Lao Ma Xiao Xifeng], background reply [contrast] can be.
Hope you found this article helpful, and if you have any other ideas, please share them in the comments section.
All geeks’ likes, favorites and forwarding are the biggest motivation for Lao Ma to continue writing!