In system design you should, in fact, try to avoid having a single project access multiple data sources. It is better to converge the users of a database, so as to reduce the risk and difficulty of later data migration and database refactoring. Of course, this is not an absolute rule; as the saying goes, "whatever exists has its reason". On the other hand, using multiple data sources directly can make coding much more convenient: we no longer need Dubbo, Spring Cloud, and the like just to fetch related data from other systems. I recently ran into exactly this scenario at work, and here are the two solutions I considered:

Multi-data source scheme

This is the solution that comes to mind first. We can inject multiple SqlSessionFactory instances into the project (if you use MyBatis) and keep the domain classes and DAOs of each data source in separate packages. By configuring @MapperScan and setTypeAliasesPackage for each SqlSessionFactory, different packages end up talking to different databases. The following code snippet configures one of the data sources:

	/**
	 * db_b2b data source
	 * @return b2b library data source
	 */
	@Bean(name = "b2b")
	public DataSource b2b () throws SQLException {
		MysqlXADataSource mysqlXADataSource=new MysqlXADataSource();
		mysqlXADataSource.setUrl((dbB2BProperties.getUrl()));
		mysqlXADataSource.setPinGlobalTxToPhysicalConnection(true);
		mysqlXADataSource.setPassword((dbB2BProperties.getPassword()));
		mysqlXADataSource.setUser((dbB2BProperties.getUserName()));
		AtomikosDataSourceBean xaDataSource=new AtomikosDataSourceBean();
		xaDataSource.setXaDataSource(mysqlXADataSource);
		xaDataSource.setUniqueResourceName("b2b");
		return xaDataSource;
	}

	@Bean(name = "sqlSessionFactoryB2B")
	public SqlSessionFactory sqlSessionFactoryB2B(@Qualifier("b2b")DataSource dataSource) throws Exception {
		MybatisSqlSessionFactoryBean bean = new MybatisSqlSessionFactoryBean();
		bean.setDataSource(dataSource);
		bean.setTypeAliasesPackage("com.mhc.lite.dal.domain.b2b");
		MybatisConfiguration configuration = new MybatisConfiguration();
		configuration.setJdbcTypeForNull(JdbcType.NULL);
		configuration.setMapUnderscoreToCamelCase(true);
		configuration.setCacheEnabled(false);
		bean.setConfiguration(configuration);
		bean.setPlugins(new Interceptor[]{ new PaginationInterceptor() }); // add pagination
		return bean.getObject();
	}

	@Bean(name = "sqlSessionTemplateB2B")
	public SqlSessionTemplate sqlSessionTemplateB2B(
			@Qualifier("sqlSessionFactoryB2B") SqlSessionFactory sqlSessionFactory) throws Exception {
		return new SqlSessionTemplate(sqlSessionFactory);
	}

For those of you who are unfamiliar with MysqlXADataSource and AtomikosDataSourceBean, these are two classes used in JTA distributed transactions; they are explained at the end of this article. Note that each data source's DataSource and SqlSessionFactory beans must have distinct names and be selected with @Qualifier; otherwise calls to the various data sources will get mixed up.
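To make the package-based separation concrete, here is a minimal sketch of how the b2b mappers could be bound to their own factory. The configuration class and DAO package name are assumptions for illustration; only the @MapperScan attributes are the actual MyBatis-Spring mechanism:

import org.mybatis.spring.annotation.MapperScan;
import org.springframework.context.annotation.Configuration;

// Hypothetical config class: scans only the b2b DAO package and binds it to the b2b template.
@Configuration
@MapperScan(basePackages = "com.mhc.lite.dal.dao.b2b",      // assumed package, mirroring com.mhc.lite.dal.domain.b2b
		sqlSessionTemplateRef = "sqlSessionTemplateB2B") // mappers in this package use the b2b SqlSessionTemplate
public class B2bMapperScanConfig {
}

A sibling configuration class would do the same for the base library's packages, pointing at its own factory and template.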

Dynamic data source scheme

Another way to look at it: using one SqlSessionFactory per data source scales poorly. When there are many data sources, template code like the above fills the whole project and becomes tedious to configure. Also, we do not need to operate on every data source all the time, yet each one keeps a baseline number of idle connections, so precious memory and CPU are wasted. Hence the second scheme: dynamic data sources. A vivid everyday analogy is a worker's drill. You really only need one drill; you just swap in different bits for different wall materials and hole shapes to adapt to different scenarios. What we did in the first scheme was buy two or more drills (a bit of a luxury!). Here's how to do it:

	/**
	 * db_base data source
	 * @return base library data source
	 */
	@Bean(name = "base")
	@ConfigurationProperties(prefix = "spring.datasource.druid.base")
	public DataSource base() {
		return DruidDataSourceBuilder.create().build();
	}

	/**
	 * db_b2b data source
	 * @return b2b library data source
	 */
	@Bean(name = "b2b")
	@ConfigurationProperties(prefix = "spring.datasource.druid.b2b")
	public DataSource b2b() {
		return DruidDataSourceBuilder.create().build();
	}

	/**
	 * Dynamic data source configuration
	 * @return routing data source wrapping the target data sources
	 */
	@Bean
	@Primary
	public DataSource multipleDataSource (@Qualifier("base") DataSource base,
					      @Qualifier("b2b") DataSource b2b ) {
		DynamicDataSource dynamicDataSource = new DynamicDataSource();
		Map< Object, Object > targetDataSources = new HashMap<>();
		targetDataSources.put(DBTypeEnum.DB_BASE.getValue(), base );
		targetDataSources.put(DBTypeEnum.DB_B2B.getValue(), b2b);
		dynamicDataSource.setTargetDataSources(targetDataSources);
		dynamicDataSource.setDefaultTargetDataSource(base);
		return dynamicDataSource;
	}

	@Bean("sqlSessionFactory")
	public SqlSessionFactory sqlSessionFactory() throws Exception {
		MybatisSqlSessionFactoryBean sqlSessionFactory = new MybatisSqlSessionFactoryBean();
		sqlSessionFactory.setDataSource(multipleDataSource(base(),b2b()));
		MybatisConfiguration configuration = new MybatisConfiguration();
		configuration.setJdbcTypeForNull(JdbcType.NULL);
		configuration.setMapUnderscoreToCamelCase(true);
		configuration.setCacheEnabled(false);
		sqlSessionFactory.setConfiguration(configuration);
		sqlSessionFactory.setPlugins(new Interceptor[]{
				// new PerformanceInterceptor(), new OptimisticLockerInterceptor()
				new PaginationInterceptor() // add pagination
		});
		sqlSessionFactory.setGlobalConfig(globalConfiguration());
		return sqlSessionFactory.getObject();
	}
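The configuration above references a DBTypeEnum that is not shown; here is a minimal sketch of what it might look like (the exact keys are assumptions, they only need to match what the routing code looks up):

public enum DBTypeEnum {

	DB_BASE("base"),
	DB_B2B("b2b");

	private final String value;

	DBTypeEnum(String value) {
		this.value = value;
	}

	/** The lookup key registered in DynamicDataSource's targetDataSources map */
	public String getValue() {
		return value;
	}
}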

All we need now is a single SqlSessionFactory (the one "drill"), plus a DynamicDataSource. Think of it as a registry of data sources keyed by name, which we can switch between dynamically later on. So how do we know which database to use, and when? Read on:

public class DynamicDataSource extends AbstractRoutingDataSource {

	/**
	 * Core method: returns the lookup key of the data source for the current context
	 */
	@Override
	protected Object determineCurrentLookupKey() {
		return DbContextHolder.getDbType();
	}
}

public class DbContextHolder {

	private static final ThreadLocal<String> contextHolder = new ThreadLocal<>();

	/** Set the data source for the current thread */
	public static void setDbType(DBTypeEnum dbTypeEnum) {
		contextHolder.set(dbTypeEnum.getValue());
	}

	/** Get the current data source */
	public static String getDbType() {
		return contextHolder.get();
	}

	/** Clear the context */
	public static void clearDbType() {
		contextHolder.remove();
	}
}

@Slf4j
@Aspect
@Order(-100)
@Component
public class DataSourceSwitchAspect {

	@Pointcut("execution(* com.mhc.polestar.dal.manager.*.*(..))")
	private void ownMethod() {}

	// also intercept MyBatis-Plus ServiceImpl methods
	@Pointcut("execution(* com.baomidou.mybatisplus.service.*.*(..))")
	private void mpMethod() {}

	@Before("ownMethod() || mpMethod()")
	public void base(JoinPoint joinPoint) {
		String name = joinPoint.getTarget().getClass().getName();
		if (name.contains("B2b")) {
			log.debug("Switch to b2b data source...");
			DbContextHolder.setDbType(DBTypeEnum.DB_B2B);
		} else {
			log.debug("Switch to base data source...");
			DbContextHolder.setDbType(DBTypeEnum.DB_BASE);
		}
	}
}

Remember the idea of aspect-oriented programming: every time we want to "change the drill bit", that action can be abstracted into an aspect. At the pointcut we decide which data source to switch to based on information such as the name or package path of the Manager being called (some naming rules have to be agreed on in advance), and @Order ensures the aspect runs before everything else. That covers the general idea, but I ran into a problem in practice. Although the approach above achieves zero intrusion into business code, consider the following case:

	@Override
	@ValidateDTO
	@Transactional(rollbackFor = Exception.class)
	public APIResult<Boolean> addPartner(PartnerParamDTO paramDTO) { ... }

We often need to enable transactions to guarantee business consistency. But if the transaction is opened on the Service that calls the Manager, the savepoint is set at the Service layer, which means the underlying connection context is fixed there and can no longer change; that is precisely what lets a rollback restore consistency when an exception occurs. Our aspect, however, sits at the Manager layer, so it can no longer switch the data source, and an error is bound to occur. I later solved this with a custom annotation at the Service layer that switches the data source. That works, but it does intrude on the code: every place that switches data sources now carries an annotation that has nothing to do with the business logic. And if a single Service method needs several data sources, the bad smell spreads further down into the Manager layer, where we have to call DbContextHolder.setDbType() manually. That is as far as I will go in describing this particular problem.
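For reference, here is a minimal sketch of what such an annotation-based switch might look like. @TargetDataSource and TargetDataSourceAspect are illustrative names I am making up here, not the project's actual code:

// TargetDataSource.java -- marks a Service method with the data source it should run against
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

@Target({ElementType.METHOD, ElementType.TYPE})
@Retention(RetentionPolicy.RUNTIME)
public @interface TargetDataSource {
	DBTypeEnum value() default DBTypeEnum.DB_BASE;
}

// TargetDataSourceAspect.java -- switches the context before the transaction advice binds a connection
import org.aspectj.lang.annotation.After;
import org.aspectj.lang.annotation.Aspect;
import org.aspectj.lang.annotation.Before;
import org.springframework.core.annotation.Order;
import org.springframework.stereotype.Component;

@Aspect
@Order(-200) // must run before @Transactional opens the connection
@Component
public class TargetDataSourceAspect {

	@Before("@annotation(targetDataSource)")
	public void switchDataSource(TargetDataSource targetDataSource) {
		DbContextHolder.setDbType(targetDataSource.value());
	}

	@After("@annotation(targetDataSource)")
	public void restoreDataSource(TargetDataSource targetDataSource) {
		DbContextHolder.clearDbType();
	}
}

A Service method annotated with @TargetDataSource(DBTypeEnum.DB_B2B) is then routed to the b2b library before the transaction binds its connection, which is exactly the intrusion into business code described above.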

Distributed transaction

MysqlXADataSource and AtomikosDataSourceBean were mentioned earlier in the multi-data source scheme; they are in fact two key pieces of JTA distributed transactions. As you know, Spring's traditional transaction management can only guarantee consistency for a single data source. Now that we are using multiple or dynamic data sources in the project, we have to consider JTA distributed transactions if we want to keep using transactions. JTA is designed to keep transactions synchronized across multiple data sources; here I use Atomikos for a simple demonstration:

@Configuration
@EnableTransactionManagement
public class TxManagerConfig {

	@Bean(name = "userTransaction")
	public UserTransaction userTransaction() throws Throwable {
		UserTransactionImp userTransactionImp = new UserTransactionImp();
		userTransactionImp.setTransactionTimeout(10000);
		return userTransactionImp;
	}

	@Bean(name = "atomikosTransactionManager", initMethod = "init" , destroyMethod = "close")
	public TransactionManager atomikosTransactionManager() {
		UserTransactionManager userTransactionManager = new UserTransactionManager();
		userTransactionManager.setForceShutdown(false);
		return userTransactionManager;
	}

	@Bean(name = "transactionManager")
	@DependsOn({ "userTransaction"."atomikosTransactionManager" })
	public PlatformTransactionManager transactionManager() throws Throwable {
		return new JtaTransactionManager(userTransaction(), atomikosTransactionManager());
	}

	/**
	 * db_b2b data source
	 * @return b2b library data source
	 */
	@Bean(name = "b2b")
	public DataSource b2b () throws SQLException {
		MysqlXADataSource mysqlXADataSource=new MysqlXADataSource();
		mysqlXADataSource.setUrl((dbB2BProperties.getUrl()));
		mysqlXADataSource.setPinGlobalTxToPhysicalConnection(true);
		mysqlXADataSource.setPassword((dbB2BProperties.getPassword()));
		mysqlXADataSource.setUser((dbB2BProperties.getUserName()));
		AtomikosDataSourceBean xaDataSource=new AtomikosDataSourceBean();
		xaDataSource.setXaDataSource(mysqlXADataSource);
		xaDataSource.setUniqueResourceName("b2b");
		return xaDataSource;
	}

You read that right: in a Spring Boot project, distributed transactions can be enabled with configuration as simple as this. MysqlXADataSource and AtomikosDataSourceBean mark the data sources as XA resources, and transaction management is handed over to JTA's PlatformTransactionManager.
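As a rough usage sketch (the service, mappers, and their method names below are hypothetical, assuming one mapper is bound to each of the two XA data sources), a single @Transactional method can then write to both libraries atomically:

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class PartnerService {

	@Autowired
	private CompanyMapper companyMapper;       // hypothetical mapper bound to the db_base data source
	@Autowired
	private B2bPartnerMapper b2bPartnerMapper; // hypothetical mapper bound to the db_b2b data source

	/**
	 * Both inserts are enlisted in the same JTA transaction; if either fails,
	 * the Atomikos transaction manager rolls back both XA resources.
	 */
	@Transactional(rollbackFor = Exception.class)
	public void addPartner(PartnerParamDTO paramDTO) {
		companyMapper.insert(paramDTO.getCompany());
		b2bPartnerMapper.insert(paramDTO.getPartner());
	}
}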

Trade-offs between the options

1. The multi-data source scheme is simple to configure and minimally intrusive to business code. Its drawback is just as obvious: it permanently holds resources (idle connections for every data source) that are not always needed, which wastes memory and CPU to some extent. If a single piece of business code needs data from several sources at the same time and you need atomicity (transactions) across them, this scheme is the one for you.
2. The dynamic data source scheme looks slightly more complex to configure, but it follows the "take it when needed, return it when done" principle nicely: we treat the data sources as a pool and consume them on demand. Its drawbacks were exposed above: we often have to compromise with transactional requirements, and because the context must be switched per call, resource contention under high concurrency is prone to liveness problems such as deadlock. It is most often used for read/write splitting, where a single piece of business never needs to operate on multiple data sources at once.
3. If you need transactions, remember to replace Spring's default transaction management with distributed transactions; otherwise consistency cannot be guaranteed!

That brings this article to an end. I haven't written for a long time, so some points may not be considered in much detail; thank you for your criticism and corrections!