If my guess is correct, many data source configurations and parameters in projects currently running in production were written casually. When a failure occurs (database down, slow SQL, network jitter, etc.), such parameters do not protect the service and may even trigger a cascading avalanche.
This article uses Druid 1.2.5 as its example and gives stress-test results under various failure scenarios. The conclusions can also serve as a reference for other types of data sources.
Preface: what a data source is for
In short, a data source stores database connections. The field in Druid that holds the pooled connections is, well, just an array.
/** com.alibaba.druid.pool.DruidDataSource */
private volatile DruidConnectionHolder[] connections;
So, the thing we use it for the most is to get connections:
Connection getConnection() throws SQLException;
Connection getConnection(String username, String password)
throws SQLException;
Maximum number of active connections in the pool: maxActive
The easiest parameter to understand in a data source. It determines the maximum number of connections the data source can create.
Initial number of connections in the pool: initialSize
The initial number of connections created when the data source is initialized.
The code is in com.alibaba.druid.pool.DruidDataSource#init.
asyncInit determines whether the initial connections are created synchronously or asynchronously.
if (createScheduler != null && asyncInit) {
    for (int i = 0; i < initialSize; ++i) {
        submitCreateTask(true);
    }
} else if (!asyncInit) {
    // init connections
    while (poolingCount < initialSize) {
        try {
            PhysicalConnectionInfo pyConnectInfo = createPhysicalConnection();
            DruidConnectionHolder holder = new DruidConnectionHolder(this, pyConnectInfo);
            connections[poolingCount++] = holder;
        } catch (SQLException ex) {
            ...
        }
    }
}
Maximum time to wait for a connection: maxWait
There are two cases when getting a connection from a data source:
- A connection is available in the pool: return it directly.
- No connection is available:
  - The pool is full: wait for other threads to release a connection.
  - The pool is not full:
    - Create a connection synchronously and return it as soon as it is ready.
    - Create a connection asynchronously and wait to be notified when it is ready.
maxWait is designed for the case where no connection is currently available. Let's walk through the detailed process with the code:
Timeout retry count (notFullTimeoutRetryCount)
maxWait is the input to getConnectionDirect, which delegates the main logic to getConnectionInternal. That call is wrapped in a retry loop controlled by an important parameter, notFullTimeoutRetryCount: the number of retries after a connection-acquisition timeout while the pool is not full.
As the following code shows, Druid retries at least once in that situation: the counter starts at 0 and the comparison uses <=, so even a notFullTimeoutRetryCount of 0 allows one retry.
public DruidPooledConnection getConnectionDirect(long maxWaitMillis) throws SQLException {
    int notFullTimeoutRetryCnt = 0;
    for (;;) {
        // handle notFullTimeoutRetry
        DruidPooledConnection poolableConnection;
        try {
            poolableConnection = getConnectionInternal(maxWaitMillis);
        } catch (GetConnectionTimeoutException ex) {
            if (notFullTimeoutRetryCnt <= this.notFullTimeoutRetryCount && !isFull()) {
                notFullTimeoutRetryCnt++;
                continue;
            }
            throw ex;
        }
        ...
    }
    ...
}
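To make the effect of the retry condition concrete, here is a minimal sketch that models only the loop structure from the excerpt. It is not Druid code: RetryLoopSketch, TimeoutEx and attemptsUntilFailure are hypothetical names, and every acquisition attempt is assumed to time out.

```java
/** Toy model of the retry loop in getConnectionDirect; not Druid code. */
class RetryLoopSketch {
    /** Hypothetical stand-in for GetConnectionTimeoutException. */
    static class TimeoutEx extends RuntimeException {}

    /**
     * Counts how many times the inner acquire is attempted when every attempt
     * times out. retryLimit plays the role of notFullTimeoutRetryCount.
     */
    static int attemptsUntilFailure(int retryLimit, boolean poolFull) {
        int attempts = 0;
        int retryCnt = 0;
        for (;;) {
            try {
                attempts++;
                throw new TimeoutEx(); // assumption: every acquire times out
            } catch (TimeoutEx ex) {
                // same shape as the excerpt: "<=" and "pool not full"
                if (retryCnt <= retryLimit && !poolFull) {
                    retryCnt++;
                    continue;
                }
                return attempts; // the real code rethrows here
            }
        }
    }
}
```

In this model a limit of 0 with a non-full pool yields two attempts, so a caller may be blocked for roughly twice maxWait before the timeout exception surfaces; a full pool yields a single attempt.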
Create a connection synchronously
If the connection is created synchronously, the following conditions must be met:
- There are currently no connections available in the connection pool
- The current number of active connections is less than the maximum number of connections
- No connection is currently being created synchronously (that is, only one thread is creating a connection synchronously at a time).
if (createScheduler != null
        && poolingCount == 0
        && activeCount < maxActive
        && creatingCountUpdater.get(this) == 0
        && createScheduler instanceof ScheduledThreadPoolExecutor) {
    ScheduledThreadPoolExecutor executor = (ScheduledThreadPoolExecutor) createScheduler;
    if (executor.getQueue().size() > 0) {
        createDirect = true;
        continue;
    }
}
Due to the above three conditions, only a few threads can enter the synchronous connection creation process.
if (creatingCountUpdater.compareAndSet(this, 0, 1)) {
    PhysicalConnectionInfo pyConnInfo = DruidDataSource.this.createPhysicalConnection();
    holder = new DruidConnectionHolder(this, pyConnInfo);
    ...
    creatingCountUpdater.decrementAndGet(this);
    directCreateCountUpdater.incrementAndGet(this);
    ...
}
An important state flag is flipped during synchronous connection creation; it is covered in the failFast section below.
Create a connection asynchronously
Let's start with two condition variables:
protected Condition notEmpty;
protected Condition empty;
notEmpty notifies waiting threads that the pool contains a connection again; empty notifies the creator that the pool has run out of connections.
When it is decided to create connections asynchronously, two different branches are used depending on whether maxWait is greater than 0:
final long nanos = TimeUnit.MILLISECONDS.toNanos(maxWait);
...
if (maxWait > 0) {
    holder = pollLast(nanos);
} else {
    holder = takeLast();
}
Look at pollLast(nanos) first. If there are no connections in the pool, it calls emptySignal(), which submits an asynchronous task to create a connection:
private void emptySignal() {
    ...
    if (createTaskCount >= maxCreateTaskCount) {
        return;
    }
    if (activeCount + poolingCount + createTaskCount >= maxActive) {
        return;
    }
    submitCreateTask(false);
}
The calling thread then waits on the notEmpty condition:
estimate = notEmpty.awaitNanos(estimate); // signal by recycle or creator
The maximum duration of this wait, estimate, is calculated from maxWait.
Why wait here? Because the thread is woken up when a connection is recycled or created. If the thread wakes up but fails to grab a connection, and the total wait time has not yet reached maxWait, it enters await again for the remaining time.
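The wait-and-recheck pattern described above can be sketched with the standard java.util.concurrent.locks classes. This is a simplified model under stated assumptions, not Druid's implementation: TimedPoolSketch is a hypothetical class, it stores arbitrary items rather than connection holders, and it returns null on timeout where the real pool throws.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Toy pool that waits on a notEmpty condition with a total time budget. */
class TimedPoolSketch<T> {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition notEmpty = lock.newCondition();
    private final Deque<T> items = new ArrayDeque<>();

    /** Returns an item, or null if none became available within maxWaitMillis. */
    T pollLast(long maxWaitMillis) {
        long estimate = TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
        lock.lock();
        try {
            while (items.isEmpty()) {
                if (estimate <= 0) {
                    return null; // total wait budget used up
                }
                // awaitNanos returns the remaining budget, so a thread that wakes
                // up but loses the race only waits for what is left
                estimate = notEmpty.awaitNanos(estimate);
            }
            return items.pollLast();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return null;
        } finally {
            lock.unlock();
        }
    }

    /** Adds an item (recycled or newly created) and wakes one waiting thread. */
    void putLast(T item) {
        lock.lock();
        try {
            items.addLast(item);
            notEmpty.signal();
        } finally {
            lock.unlock();
        }
    }
}
```

Re-checking the emptiness condition in a loop matters because a woken thread is not guaranteed to find an item: another thread may have taken it first.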
takeLast(), which takes no arguments, is much simpler:
The process is almost identical to pollLast(nanos). The only difference is that it waits without a timeout and continues only after being woken up by a connection recycle or by the creation thread.
while (poolingCount == 0) {
    ...
    notEmpty.await(); // signal by recycle or creator
    ...
}
Asynchronous thread creation
Going back to the emptySignal() method: its last line creates a task named CreateConnectionTask and submits it to the thread pool for execution.
try {
    physicalConnection = createPhysicalConnection();
} catch (OutOfMemoryError e) {
    errorCount++;
    if (errorCount > connectionErrorRetryAttempts && timeBetweenConnectErrorMillis > 0) {
        // fail over retry attempts
        setFailContinuous(true);
        if (failFast) {
            lock.lock();
            try {
                notEmpty.signalAll();
            } finally {
                lock.unlock();
            }
        }
    }
    ...
}
setFailContinuous is also related to failFast, which we'll talk about later.
Connection recycling
When DruidPooledConnection's close() method is called, the connection enters the recycling process (so closing a connection obtained from the data source does not really close the physical connection).
protected void recycle(DruidPooledConnection pooledConnection) throws SQLException {
    ...
    result = putLast(holder, currentTimeMillis);
    ...
}
The process is similar to fetching a connection, where the putLast method is called to return the connection.
Similarly, notEmpty.signal() is called to wake up a thread waiting for a connection.
boolean putLast(DruidConnectionHolder e, long lastActiveTimeMillis) {
    ...
    e.lastActiveTimeMillis = lastActiveTimeMillis;
    connections[poolingCount] = e;
    incrementPoolingCount();
    ...
    notEmpty.signal();
    notEmptySignalCount++;
    return true;
}
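The idea that close() on a pooled connection recycles rather than closes can be illustrated with a small wrapper. This is a single-threaded sketch under assumptions: RecycleSketch, Physical and Pooled are hypothetical names, and the locking and signalling shown in putLast are left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of "close means recycle"; not Druid code and not thread-safe. */
class RecycleSketch {
    /** Hypothetical stand-in for a physical database connection. */
    static class Physical {
        boolean closed = false;
    }

    /** Wrapper handed to callers; close() returns the connection to the pool. */
    static class Pooled implements AutoCloseable {
        private final Physical physical;
        private final Deque<Physical> pool;
        private boolean returned = false;

        Pooled(Physical physical, Deque<Physical> pool) {
            this.physical = physical;
            this.pool = pool;
        }

        @Override
        public void close() {
            if (!returned) {            // guard against a double close
                returned = true;
                pool.addLast(physical); // back into the pool, still open
            }
        }
    }

    /** Borrows one connection, closes the wrapper, returns the pool size afterwards. */
    static int borrowAndClose(Deque<Physical> pool) {
        try (Pooled p = new Pooled(pool.pollLast(), pool)) {
            // the caller would run SQL on the connection here
        }
        return pool.size();
    }
}
```

Because the wrapper implements AutoCloseable, ordinary try-with-resources code returns the connection to the pool without knowing that a pool is involved.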
maxWait summary
This parameter is the maximum wait time for a single connection acquisition when the data source currently has no connection available.
This time is spent either creating a connection or waiting for another thread to release one.
Also, if the acquisition times out while the connection pool is not full, there is at least one retry (which cannot be turned off).
Maximum number of waiting threads: maxWaitThreadCount
This is much easier once you understand maxWait.
First, we know that when the pool has no connection available and the total number of connections has not reached maxActive, Druid submits an asynchronous task to create a connection while the calling thread waits.
So, how many threads can be waiting at the same time? That is maxWaitThreadCount.
The relevant code is in com.alibaba.druid.pool.DruidDataSource#getConnectionInternal.
if (maxWaitThreadCount > 0
        && notEmptyWaitThreadCount >= maxWaitThreadCount) {
    connectErrorCountUpdater.incrementAndGet(this);
    throw new SQLException("maxWaitThreadCount " + maxWaitThreadCount + ", current wait Thread count "
            + lock.getQueueLength());
}
That is, if the number of threads currently waiting for a connection has reached maxWaitThreadCount, the acquisition fails immediately instead of waiting for a connection to be created or recycled.
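The boundary behaviour of this guard is easy to miss, so here it is isolated as a pure function. WaitLimitSketch and shouldReject are hypothetical names; the condition itself is copied from the excerpt above.

```java
/** Isolates the maxWaitThreadCount guard as a pure function; not Druid code. */
class WaitLimitSketch {
    /**
     * Returns true when a new acquisition should be rejected immediately.
     * A limit of 0 or less disables the check entirely.
     */
    static boolean shouldReject(int maxWaitThreadCount, int waitingThreads) {
        return maxWaitThreadCount > 0 && waitingThreads >= maxWaitThreadCount;
    }
}
```

Two details follow from the condition: a non-positive maxWaitThreadCount means unlimited waiters, and the comparison is >=, so with a limit of 10 the request that would become the eleventh waiter is the first to be rejected.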
Fail fast while waiting: failFast
As the name suggests, fail fast. This configuration item is a switch whose value is true or false.
It takes effect when there is no connection in the pool, in the following code:
private DruidConnectionHolder pollLast(long nanos) throws InterruptedException, SQLException {
    long estimate = nanos;
    for (;;) {
        if (poolingCount == 0) {
            emptySignal(); // send signal to CreateThread create connection
            if (failFast && isFailContinuous()) {
                throw new DataSourceNotAvailableException(createError);
            }
            ...
        }
        ...
    }
}
takeLast contains the same logic.
DruidConnectionHolder takeLast() throws InterruptedException, SQLException {
    ...
    while (poolingCount == 0) {
        emptySignal(); // send signal to CreateThread create connection
        if (failFast && isFailContinuous()) {
            throw new DataSourceNotAvailableException(createError);
        }
        ...
    }
    ...
}
As you will notice, there is another key parameter to determine whether this request fails quickly: isFailContinuous().
isFailContinuous
This flag is updated atomically between 0 and 1 and indicates whether the fail-fast condition currently holds.
In the catch block of the connection-creation code it is flipped to 1, meaning the fail-fast condition is met.
From then on, when failFast is enabled, every request that would otherwise go through the asynchronous creation path fails immediately instead of entering the asynchronous wait.
When will this flag be flipped to 0?
Of course, after a thread has successfully created a connection.
The relevant code is in com.alibaba.druid.pool.DruidAbstractDataSource#createPhysicalConnection.
public PhysicalConnectionInfo createPhysicalConnection() throws SQLException {
    ...
    conn = createPhysicalConnection(url, physicalConnectProperties);
    initPhysicalConnection(conn, variables, globalVariables);
    ...
    initedNanos = System.nanoTime();
    validateConnection(conn);
    validatedNanos = System.nanoTime();
    setFailContinuous(false);
    setCreateError(null);
    ...
}
So, once fail-fast is in effect, what opportunities are there to flip the state back?
- Asynchronous connection creation: don't forget that even when a request fails fast, emptySignal() is called before the exception is thrown, so a background creation task can still be submitted.
- Synchronous connection creation: few connections are created synchronously, but some are.
failFast summary
failFast works together with maxWait. Once connection creation has been failing, a thread that finds the pool empty and would otherwise wait on asynchronous creation no longer waits for up to maxWait; it fails immediately.
The fail-fast state is cleared as soon as any thread successfully creates a connection.
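The interplay between the failFast switch and the fail-continuous flag can be modelled in a few lines. This sketch simplifies deliberately: FailFastSketch and its methods are hypothetical names, and a single failed creation sets the flag, whereas the excerpt shows Druid setting it only after errorCount exceeds connectionErrorRetryAttempts.

```java
import java.util.concurrent.atomic.AtomicBoolean;

/** Toy model of the failFast switch plus the fail-continuous flag; not Druid code. */
class FailFastSketch {
    private final boolean failFast; // the configuration switch
    private final AtomicBoolean failContinuous = new AtomicBoolean(false);

    FailFastSketch(boolean failFast) {
        this.failFast = failFast;
    }

    /** Called by whichever thread attempted to create a physical connection. */
    void onCreateResult(boolean success) {
        // simplification: one failure sets the flag, one success clears it
        failContinuous.set(!success);
    }

    /** True if an acquisition on an empty pool should throw instead of waiting. */
    boolean rejectsImmediately() {
        return failFast && failContinuous.get();
    }
}
```

The point of the model is that the switch alone does nothing: requests are rejected only while creation is known to be failing, and the first successful creation restores normal waiting.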
Physical connection timeout duration: phyTimeoutMillis
The maximum time a single physical connection may be held.
Because the data source holds and caches database connections, it generally does not close its connections to the DB on its own initiative. However, if phyTimeoutMillis is set to a value greater than 0, a physical connection is closed once it has been open longer than the configured time (and only while it is not in use).
if (phyTimeoutMillis > 0) {
    long phyConnectTimeMillis = currentTimeMillis - holder.connectTimeMillis;
    if (phyConnectTimeMillis > phyTimeoutMillis) {
        discardConnection(holder);
        return;
    }
}
Maximum number of asynchronous connection creation tasks: maxCreateTaskCount
When connections are insufficient and need to be created asynchronously, there are two situations in which no new asynchronous creation task is submitted:
- Number of active connections + number of connections in the pool + number of pending creation tasks >= maxActive.
- Number of pending creation tasks >= maxCreateTaskCount.
However, even when the calling thread does not submit an asynchronous creation task, it still waits for a connection on notEmpty.await.
The upper-level caller does not care whether a creation task was actually submitted.
private void emptySignal() {
    ...
    if (createTaskCount >= maxCreateTaskCount) {
        return;
    }
    if (activeCount + poolingCount + createTaskCount >= maxActive) {
        return;
    }
    submitCreateTask(false);
}
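The two guards in emptySignal() can be tried out with concrete numbers by lifting them into a pure function. CreateTaskGateSketch and shouldSubmitCreateTask are hypothetical names; the two comparisons are taken from the excerpt, and the elided part of the method is not modelled.

```java
/** The two emptySignal() guards as a pure function; not Druid code. */
class CreateTaskGateSketch {
    static boolean shouldSubmitCreateTask(int activeCount, int poolingCount, int createTaskCount,
                                          int maxActive, int maxCreateTaskCount) {
        if (createTaskCount >= maxCreateTaskCount) {
            return false; // enough creation tasks are already pending
        }
        if (activeCount + poolingCount + createTaskCount >= maxActive) {
            return false; // pending tasks would already fill the pool
        }
        return true;
    }
}
```

For example, with maxActive = 10 and maxCreateTaskCount = 3, a pool with 5 active connections and 2 pending tasks accepts another task, while 8 active connections and 2 pending tasks do not, because the pending tasks alone would bring the pool to its limit.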
Connection keep-alive: keepAlive & keepAliveBetweenTimeMillis
The keepAlive option in Druid is similar to keep-alive in TCP: it prevents a database connection from sitting unused for so long that some underlying service closes it.
If keepAlive is enabled, a connection that has been idle for at least keepAliveBetweenTimeMillis is validated by running the validationQuery.
if (keepAlive && idleMillis >= keepAliveBetweenTimeMillis) {
    keepAliveConnections[keepAliveCount++] = connection;
}
...
for (int i = keepAliveCount - 1; i >= 0; --i) {
    DruidConnectionHolder holer = keepAliveConnections[i];
    Connection connection = holer.getConnection();
    holer.incrementKeepAliveCheckCount();
    boolean validate = false;
    try {
        this.validateConnection(connection);
        validate = true;
    } catch (Throwable error) {
        if (LOG.isDebugEnabled()) {
            LOG.debug("keepAliveErr", error);
        }
        // skip
    }
    ...
}
If this validationQuery execution fails, the connection is closed and discarded.
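A keep-alive pass can be summarised as "validate what has been idle too long, drop what fails". The sketch below models only that rule. KeepAliveSketch and survivors are hypothetical names, connections are represented by their index and idle time, and the validator stands in for running the validationQuery.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Toy model of one keep-alive pass over the idle connections; not Druid code. */
class KeepAliveSketch {
    /**
     * idleMillis[i] is the idle time of pooled connection i; validator.test(i)
     * stands in for running the validation query on it. Returns the indices
     * that remain in the pool after the pass.
     */
    static List<Integer> survivors(long[] idleMillis, long keepAliveBetweenTimeMillis,
                                   IntPredicate validator) {
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < idleMillis.length; i++) {
            boolean needsCheck = idleMillis[i] >= keepAliveBetweenTimeMillis;
            if (!needsCheck || validator.test(i)) {
                kept.add(i); // recently used, or validation succeeded
            }
            // otherwise validation failed and the connection is discarded
        }
        return kept;
    }
}
```

Connections idle for less than the threshold are never validated in this model, which keeps the cost of a pass proportional to the number of connections that are actually at risk.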
Data source shrinking: timeBetweenEvictionRunsMillis & minEvictableIdleTimeMillis & maxEvictableIdleTimeMillis
When the Druid data source is initialized, a scheduled DestroyTask is created.
Its main purpose is to close idle connections that meet the eviction conditions.
if (idleMillis >= minEvictableIdleTimeMillis) {
    if (checkTime && i < checkCount) {
        evictConnections[evictCount++] = connection;
        continue;
    } else if (idleMillis > maxEvictableIdleTimeMillis) {
        evictConnections[evictCount++] = connection;
        continue;
    }
}
...
if (evictCount > 0) {
    for (int i = 0; i < evictCount; ++i) {
        DruidConnectionHolder item = evictConnections[i];
        Connection connection = item.getConnection();
        JdbcUtils.close(connection);
        destroyCountUpdater.incrementAndGet(this);
    }
    Arrays.fill(evictConnections, null);
}
As you can see from the above code, the selection logic for idle connections to close is as follows:
if (checkTime && i < checkCount): among connections idle for at least minEvictableIdleTimeMillis, only the first poolingCount - minIdle are closed; the connections after them are not affected. (checkCount is the number of connections in the pool beyond minIdle.)
Connections idle for longer than maxEvictableIdleTimeMillis are closed directly.
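The selection rule is easier to follow with concrete numbers. The sketch below reproduces only the branch structure of the excerpt: EvictionSketch and selectForEviction are hypothetical names, checkTime is assumed to be true, and the elided parts of the real loop are not modelled.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the idle-connection selection rule; not Druid code. */
class EvictionSketch {
    /** Returns the indices of the pooled connections that would be closed. */
    static List<Integer> selectForEviction(long[] idleMillis, int minIdle,
                                           long minEvictableIdleTimeMillis,
                                           long maxEvictableIdleTimeMillis) {
        int checkCount = idleMillis.length - minIdle; // connections beyond minIdle
        List<Integer> evicted = new ArrayList<>();
        for (int i = 0; i < idleMillis.length; i++) {
            long idle = idleMillis[i];
            if (idle >= minEvictableIdleTimeMillis) {
                if (i < checkCount) {
                    evicted.add(i); // surplus connection that has idled long enough
                } else if (idle > maxEvictableIdleTimeMillis) {
                    evicted.add(i); // inside the minIdle reserve, but idle far too long
                }
            }
        }
        return evicted;
    }
}
```

With idle times {900, 800, 700, 100} ms, minIdle = 2, a minimum of 500 ms and a maximum of 850 ms, the first two connections are closed; the third has idled past the minimum but is protected by minIdle and has not reached the maximum.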
Finally, timeBetweenEvictionRunsMillis is the interval between runs of the scheduled shrink task:
long period = timeBetweenEvictionRunsMillis;
if (period <= 0) {
    period = 1000;
}
destroySchedulerFuture = destroyScheduler.scheduleAtFixedRate(destroyTask, period, period,
        TimeUnit.MILLISECONDS);
Meanwhile, minEvictableIdleTimeMillis is the minimum idle time before a connection becomes eligible to be closed.
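The scheduling code above uses the standard ScheduledExecutorService API, so its behaviour can be reproduced without Druid. In this sketch ShrinkScheduleSketch, effectivePeriod and runsPeriodically are hypothetical names, and the scheduled task is a no-op counter rather than a real DestroyTask.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Toy model of how the shrink task is scheduled; not Druid code. */
class ShrinkScheduleSketch {
    /** Same fallback as the excerpt: a non-positive interval becomes 1000 ms. */
    static long effectivePeriod(long timeBetweenEvictionRunsMillis) {
        return timeBetweenEvictionRunsMillis <= 0 ? 1000 : timeBetweenEvictionRunsMillis;
    }

    /** Schedules a no-op "shrink" task and reports whether it fired 'runs' times. */
    static boolean runsPeriodically(long periodMillis, int runs) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        CountDownLatch latch = new CountDownLatch(runs);
        try {
            long period = effectivePeriod(periodMillis);
            // initial delay equals the period, as in the excerpt
            scheduler.scheduleAtFixedRate(latch::countDown, period, period, TimeUnit.MILLISECONDS);
            return latch.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            scheduler.shutdownNow(); // stop the periodic task
        }
    }
}
```

Since the initial delay equals the period, the first shrink pass happens one full interval after initialization, and a misconfigured value of 0 or less silently turns into one pass per second.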