Technology

Transparent datasource routing in Spring

Pawel Szczerbicki, Sep 03, 2019

Spring framework is said to be extremely flexible. But what does it mean? Authors claim that Spring allows you to adjust almost any piece of behavior to your needs and make the framework works totally for you. Nothing different with DataSource interface which can be overridden in any manner. Question is how to manage multiple datasources when endpoint depends on the user?

Use case

One day I was working on a medium-size SaaS B2B system for clients like banks or call centers. One of our key concerns was data privacy — our users were crazy about it. Most of them couldn’t even accept to mix their data in one database with other clients so logical separation was not an option. We decided to go for physical data separation. Each client had his own database totally separated from others.

In this case, three challenges were identified:

1.How to deal with routed datasources?

2.How to manage them dynamically at runtime when a new client comes?

3.How to make it transparent in code?

Databases that can be added or removed by the user are not a common problem faced by developers but they are extremely important when you deal with multiple DB’s. What is more, we wanted to get rid of the human factor so no single developer should be aware of datasource routing per client while implementing new features. How did I solved that? Let’s dig into it deeper.

Solution

To route between datasources, all clients have to be stored along with a database URL. Datasource has to be extremely fast in reads so we decided to go for DynamoDB with the local cache. DB instances for clients are created in advance. New clients are stored as they register into the app in the following steps:

Get random database from databases pool
Store client with database endpoint
Send event to a router with a new client
Add brand new DB to pool

Then every request is processed as follows:

Add client key (ex. name) to request header
Read client key before Spring controller and store it in thread-local
Process request like always
Get current client domain, retrieve the db connection string and connect to DB in RoutableDatasourcewhich implements standardDatasourceinterface

With his approach developer does not have to be aware which client is currently serviced besides asynchronous calls or actions on more than one client. Let’s see how it’s coded

Transparency

Servlet process requests in a single thread. Spring uses ThreadLocal to store information about the request, so did I. Using @ControllerAdvice and @InitBinderit's easy to save client data.

@ControllerAdvice
public class DomainBinder { @InitBinder
public void bindDomain(@RequestHeader Optional<String> domain) {
domain.ifPresent(ThreadLocalClientResolver::setClientDomain);
}

This example shows how to get client domain from request header, but we can extract this data from path, cookie or any other part of the request as well. @InitBinder allows to inject HttpServletRequestand get any data.

To store client domain I’m using ThreadLocalClientResolverwhich is a simple wrapper for ThreadLocal

public class ThreadLocalClientResolver { private static final ThreadLocal<String> CLIENT_DOMAIN = new ThreadLocal<>();
public Optional<String> resolve() {
String domain = CLIENT_DOMAIN.get();
if (StringUtils.isBlank(domain)) return empty(); return Optional.of(domain);
} public static void setClientDomain(String clientDomain) {
CLIENT_DOMAIN.set(clientDomain);
} public static String getClientDomain() {
return CLIENT_DOMAIN.get();
} public static void cleanClientDomain() {
CLIENT_DOMAIN.remove();
}

Now we can get the domain in almost any part of code BUT NOT in @Async code. The asynchronous function uses a new thread to service request but DOES NOT rewrite ThreadLocalvariables.

Async functions

To use ThreadLocal variables in async code we need to rewrite all variables passing to async function and setting as a thread-local variable. To simplify this operation simply create a function that takes action as a parameter

public static <T> T call(String domain, Callable<T> action) throws Exception {
String old = ThreadLocalClientDomainResolver.getClientDomain();
try {
ThreadLocalClientDomainResolver.setClientDomain(domain);
return action.call();
} finally {
ThreadLocalClientDomainResolver.setClientDomain(old);
}

This approach can be used while executing actions for all clients or for changing the context for a while, for example:

public void runForAllClients(Callable task) {
clientDao.findAll().forEach(c -> call(c.getDomain(), task));
}

Dynamic datasource routing

We know how to set the client before the controller function. To have code fully transparent it’s crucial to provide client datasource just before the database call. My solution is to use AbstractRoutingDataSource from package org.springframework.dbc.datasource.lookup

Initialising is as simple as

public class DataSourceRouter extends AbstractRoutingDataSource { public DataSourceRouter(Map<String, DataSource> clientsDataSources) {
setTargetDataSources((Map) clientsDataSources);
} @Override
protected String determineCurrentLookupKey() {
return ClientDomainResolver.resolve();
}

This class extends AbstractRoutingDataSource which extends AbstractDataSourcewhich implements javax.sql.DataSourceand can be provided like usual datasource for Hibernateor JOOQ which gives you full transparency. The database is determined just before the real call to the endpoint.

Initialize Datasource

Before you will route to any database you need to create router at application start like any other datasource. It can be done as follows:

public DataSourceRouter create(List<Client> clientsData) {
Map<String, DataSource> clientDataSources = clientsData.stream()
.collect(toMap(Map.Entry::getKey, e -> create(e.getValue())));
return new DataSourceRouter(clientDataSources);
}

This approach does not affect the connection pool or any other datasource features. createfunction is using Apache DBCP2 datasource with connection pooling

public static DataSource create(Client c) {
PoolableConnectionFactory factory = new PoolableConnectionFactory(
new DriverManagerConnectionFactory(c.endpoint, c.user, c.password), null);
GenericObjectPool<PoolableConnection> connectionPool = new GenericObjectPool<>(factory);
connectionPool.setConfig(poolConfig(maxTotalConnections, maxIdleConnections));
connectionPool.setAbandonedConfig(abandonedConfig()); factory.setPool(connectionPool);
factory.setMaxOpenPrepatedStatements(maxTotalConnections);
factory.setConnectionInitSql(singletonList(INIT_SQL)); return new PoolingDataSource<>(connectionPool);

Summary

Routing datasources is a feature that comes with Spring out of the box thanks AbtractRoutingDataSourceclass which implements DataSource. To make it transparent for developers add client identifier to thread-local and read it in determineCurrentLookupKeymethod.