Background
Kerberos is a network authentication protocol designed to provide strong authentication for client/server applications by means of secret-key cryptography. As a trusted third-party authentication service, Kerberos uses conventional cryptographic techniques (such as shared keys) to perform authentication and is trusted by both Client and Server. The KDC (Key Distribution Center) is the concrete implementation of this third-party authentication service in the protocol. As one of the core services of the Meituan-Dianping data platform, the KDC is widely used for authentication in open source components such as Hive, HDFS, and YARN. Authentication keys are deployed on the cluster nodes in advance; when the cluster or a new node starts, the node uses its key to authenticate itself, and only authenticated nodes are accepted by the cluster. A node attempting to impersonate another cannot communicate with the nodes inside the cluster because it lacks the relevant key material, which effectively ensures the security of the data and the trustworthiness of the nodes.
However, with the rapid growth of platform business, the insufficient processing capacity and unreliable monitoring of the current online KDC have become increasingly serious problems: What is the maximum QPS a single online KDC server can sustain? Which KDC service is about to be overloaded? Why is the KDC overloaded while machine resources sit idle? How can it be optimized, and where are the bottlenecks after optimization? How can the comprehensiveness, reliability, and accuracy of the monitoring metrics be guaranteed? These are the questions this article sets out to answer. As a result of this optimization work, the processing performance of a single server improved by about 16 times. In addition, an interface was designed to expose the core metrics of the KDC through shared memory, further improving the availability of the service.
For convenience, Table 1 summarizes the abbreviations of the technical terms used later in this article.
Figure 1 shows the current service architecture of the Meituan-Dianping data platform KDC. At present, the entire KDC service is deployed in a single IDC.
Principles of the KDC
The authentication phase consists of three interactions: between the Client and the AS, between the Client and the TGS, and between the Client and the Server. The three interactions are described in detail below. In Figure 2, steps 1-2, 3-4, and 5-6 correspond to parts A, B, and C respectively.
A. Interaction between Client and AS
1. The user sends his/her own information and the TGT Principal (the KDC's default TGT principal: krbtgt/REALM@REALM) to the AS service in plain text.
2. After verifying that the user exists in the database, the AS service returns two pieces of information to the Client:
- 1) A Session Key generated for the user, encrypted with the user's own key. After receiving it, the user decrypts it with his own key to obtain the Session Key (hereafter referred to as SKCandK).
- 2) The TGT, which contains SKCandK and the user's own information, encrypted with the KDC's own key ({TGT}Ktgs for short). The Client stores this information locally.
B. Interaction between Client and TGS
1. To obtain a ticket for accessing some Server on the network, the Client accesses the TGS: it sends the locally stored {TGT}Ktgs from part A, together with its own information encrypted with SKCandK, to the TGS module of the KDC.
2. After receiving the request, the TGS checks that the requested Server exists in the database, decrypts the TGT with its own key to obtain SKCandK, then decrypts the user's information and verifies its validity. Once verification passes, the TGS generates a new Session Key (SKCandS for short) and returns two pieces of information: 1) SKCandS encrypted with SKCandK; 2) the ticket for accessing the Service, i.e., the Client information, SKCandS, and other information encrypted with the Service's key ({TService}KService for short).
C. Interaction between Client and Server
1. Holding the ticket to access the Service, the Client sends a request to the Server containing: 1) the Client information encrypted with SKCandS; 2) the {TService}KService returned by the TGS in the second part.
2. After receiving the request from the Client, the Server decrypts TService with its own key, then uses the SKCandS inside it to decrypt the Client information and compares it with the Client information in TService. Once the comparison passes, the whole KDC authentication process ends and the Client communicates normally with the Service.
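The three exchanges above can be condensed into a toy simulation. This is only a sketch of the message flow, not real cryptography: a SHA-256 counter keystream XOR stands in for encryption, and all names (`as_exchange`, `K_TGS`, etc.) are made up for illustration.

```python
import hashlib, json, os

def _stream(key: bytes, n: int) -> bytes:
    """Toy SHA-256 counter keystream -- stands in for real encryption."""
    out, ctr = b"", 0
    while len(out) < n:
        out += hashlib.sha256(key + ctr.to_bytes(4, "big")).digest()
        ctr += 1
    return out[:n]

def enc(key: bytes, msg: dict) -> bytes:
    raw = json.dumps(msg, sort_keys=True).encode()
    return bytes(a ^ b for a, b in zip(raw, _stream(key, len(raw))))

def dec(key: bytes, blob: bytes) -> dict:
    return json.loads(bytes(a ^ b for a, b in zip(blob, _stream(key, len(blob)))))

# Long-term keys held by the three parties (illustrative values).
K_CLIENT, K_TGS, K_SVC = b"client-key", b"krbtgt-key", b"service-key"

def as_exchange(user: str):
    """Part A: the AS returns {SKCandK}Kclient and {TGT}Ktgs."""
    sk_c_k = os.urandom(16).hex()
    return enc(K_CLIENT, {"sk": sk_c_k}), enc(K_TGS, {"user": user, "sk": sk_c_k})

def tgs_exchange(tgt_blob: bytes, authn_blob: bytes):
    """Part B: the TGS opens the TGT, checks the authenticator, then
    returns {SKCandS}SKCandK and the service ticket {TService}KService."""
    tgt = dec(K_TGS, tgt_blob)
    assert dec(bytes.fromhex(tgt["sk"]), authn_blob)["user"] == tgt["user"]
    sk_c_s = os.urandom(16).hex()
    return (enc(bytes.fromhex(tgt["sk"]), {"sk": sk_c_s}),
            enc(K_SVC, {"user": tgt["user"], "sk": sk_c_s}))

def ap_exchange(ticket_blob: bytes, authn_blob: bytes) -> bool:
    """Part C: the Server opens the ticket with its own key and compares
    the client identity inside it with the SKCandS-encrypted one."""
    tkt = dec(K_SVC, ticket_blob)
    return dec(bytes.fromhex(tkt["sk"]), authn_blob)["user"] == tkt["user"]
```

A full client pass chains the three calls: decrypt SKCandK from the AS reply, use it to build the authenticator for the TGS, then present the returned service ticket to the Server.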
Major optimization work
From the analysis of the KDC principles, it is clear that only the first two parts can directly put pressure on the KDC service, so the work in this article focuses on those two parts. This optimization work uses Grinder, an open source load-testing tool, to stress-test the AS and the TGS separately under different scenarios, on machines of the same model to ensure consistent hardware.
Before optimization, the online KDC service ran as a single process. To merge the Meituan and Dianping data with the lowest risk, the keytabs in the KDC were given the PREAUTH attribute, and some servers hosting the KDC service were not configured with RAID. When the KDC service was overloaded, the rest of the machine's resources sat idle, suggesting that the processing capacity of a single process had reached its upper limit. The PREAUTH attribute further improves the security of the KDC service, but may introduce some performance overhead. And if only a small amount of keytab information is cached in memory on the online server, data not yet loaded into memory must be read from disk, causing a certain amount of IO consumption. Therefore, the following three conditions were varied and tested: 1. whether the physical machine hosting the KDC service uses RAID 10; 2. whether the requested keytab has the PREAUTH attribute in the database; 3. whether the KDC runs multiple processes (with the number of processes equal to the number of physical cores). (Several rounds of tests were carried out in the actual test work.)
A. Pressure test of the interaction between Client and AS
Table 2 shows an average set of test data for the AS pressure test, using a physical machine with 40 cores, so the multi-process test started 40 processes.
Analyzing the data in Table 2, it is easy to raise the following questions that require further exploration:
1. Comparing lines 1 and 2, and lines 3 and 4, in Table 2: why does RAID make so little difference to the result?
All four groups of data (the QPS results of 49, 53, 100, and 104 in Table 2) began to fail authentication after running at the processing-capacity limit for a period of time. Analysis of the machine performance data shows that memory, network adapter, and disk resources were not bottlenecks, and CPU resources were idle except for a single CPU that was occasionally saturated. Analysis of the client and server authentication logs shows no obvious exceptions on the server, but a large number of Socket Timeout errors on the client (the socket timeout was set to 30s). During the test, the client's request rate was always greater than the KDC's maximum processing capacity, so the AS on the KDC stayed fully loaded and requests that could not be processed immediately had to queue. When a queued request waited longer than the configured 30s, authentication began to time out, and one CPU of the machine was saturated (Figure 3). Clearly, the single-process KDC service had hit its bottleneck, and that bottleneck was the processing capacity of a single CPU core, so we decided to pursue optimization in the direction of multiple processes.
Figure 4 shows a general model of this pressure test. Assume the KDC's maximum processing capacity per unit time is A and the request rate from the client is stable at B, with B > A. The yellow area in the figure represents the number of queued requests; any request queued for more than 30 seconds produces a Socket Timeout error.
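Under the assumptions of Figure 4, the growth of the queue and the onset of timeouts can be modeled in a few lines. This is a deterministic sketch with illustrative names, counting second by second.

```python
def count_timeouts(capacity_a: int, rate_b: int, seconds: int,
                   timeout: float = 30.0) -> int:
    """Count client requests whose queueing delay exceeds `timeout`
    when the arrival rate B is stably above the KDC capacity A
    (both in requests/second)."""
    backlog = 0          # queued requests: the yellow area in Figure 4
    timed_out = 0
    for _ in range(seconds):
        backlog += rate_b - capacity_a
        # a request arriving this second waits behind the whole backlog
        if backlog / capacity_a > timeout:
            timed_out += rate_b
    return timed_out
```

With A = 100 req/s and B = 120 req/s, the backlog grows by 20 requests per second, so the queueing delay crosses the 30 s timeout after 150 seconds of sustained overload, and from then on every arriving request times out.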
2. Comparing lines 1 and 3, 2 and 4, and 7 and 8 in Table 2: why is the authentication QPS with PREAUTH roughly half of that without it?
If the Client's keytab does not have the PREAUTH attribute in the KDC database, the Client sends a request, the AS module of the KDC verifies its validity, and the correct result is returned: the whole process takes only one round trip. The PREAUTH attribute indicates that the keytab's authentication uses the pre-authentication concept introduced in Kerberos 5. When such a Client first sends a request, the KDC returns an error packet telling the Client that the AS requires pre-authentication. The Client then takes its own server's timestamp, encrypts it with its own key, and sends it to the KDC. The KDC decrypts it and compares the timestamp with its own server time; if the difference is within the tolerated range, it returns the correct TGT response packet to the Client. So authentication with PREAUTH takes two round trips instead of one, which explains the roughly halved QPS. The process is shown in Figure 5.
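The two-round-trip pre-authentication logic can be sketched as follows. This is a simplified model, not the krb5 implementation: the timestamp is shown already decrypted (the real exchange carries it encrypted with the client's key), the 300-second skew matches krb5's default clock-skew tolerance, and the function names are made up.

```python
CLOCK_SKEW = 300.0   # seconds; krb5 tolerates 5 minutes of drift by default

def preauth_ok(client_ts: float, kdc_now: float,
               skew: float = CLOCK_SKEW) -> bool:
    """AS-side check on the (already decrypted) pre-auth timestamp."""
    return abs(kdc_now - client_ts) <= skew

def as_handle(request: dict, kdc_now: float) -> dict:
    """With PREAUTH, the first request (no pre-auth data) draws an
    error packet; only the retry carrying the timestamp gets a TGT."""
    if "pa_enc_timestamp" not in request:
        return {"error": "KDC_ERR_PREAUTH_REQUIRED"}   # round trip 1
    if not preauth_ok(request["pa_enc_timestamp"], kdc_now):
        return {"error": "KDC_ERR_PREAUTH_FAILED"}
    return {"tgt": "issued"}                           # round trip 2
```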
3. According to the analysis in question 2, the ratio of the values in lines 5 and 7 of Table 2 should be about 1:2. Why is the value in line 5 only 115?
When the client's keytab has the PREAUTH attribute enabled in the KDC database, each authentication causes the KDC to write the authentication timestamp and other information to the log of the BDB database on disk. With the PREAUTH attribute turned off, each authentication only needs to read data from the database; as long as the memory allocated to the BDB database is large enough, interaction with the disk can be minimized. With 40 KDC processes started and PREAUTH enabled, the AS QPS was only 115. Analysis of the machine performance metrics showed that the bottleneck was indeed the IO of a single disk, as shown in Figure 6. Measured with the tools provided by BDB, the BDB cache hit ratio of the Meituan-Dianping data platform's KDC service is 99%, as shown in Figure 7.
4. With multiple processes and RAID, and regardless of whether the PREAUTH attribute is present, does the KDC AS service still have a bottleneck? If so, where?
After several experiments, the AS processing capacity of the KDC is limited by the CPU processing capacity of the current physical machines. Figure 8 shows a screenshot of CPU usage with the PREAUTH attribute; the results without the PREAUTH attribute are consistent.
B. Pressure test of the interaction between Client and TGS
Table 3 shows an average set of test data for the TGS pressure test:
Analyzing the data in Table 3 shows that the KDC's capacity to process TGS requests is independent of whether the host uses RAID. Combined with the principle of TGS requests in the KDC, this is easy to understand: as long as the BDB cache hit ratio is high enough, a TGS request does not need to touch the disk. Further experiments fully verified this point: the machine's disk IO did not change significantly during the whole test, as shown in Figure 9, and the occasional IO generated by the operating system itself did not constitute a bottleneck for the KDC service. In the comparison of single process versus multiple processes, the processing bottleneck of the TGS is the same as that of the AS: both are limited by CPU processing capacity (a single process saturates one CPU, while multiple processes occupy almost all the CPU resources of the machine). From the design principles of Kerberos it is easy to see that whether a keytab in the KDC database has PREAUTH or not has little impact on the TGS processing logic, and the pressure-test results confirm this in practice.
C. Other issues
The Client can use either TCP or UDP to interact with the KDC. Under good network conditions, the test results for the two protocols are almost identical, both in theory and in practice. However, with the native code and the TCP protocol, after the Client had put pressure on the KDC for about 6 seconds, authentication errors appeared on the Client side: far from reaching the timeout, the Client reported "socket reset" errors. On the KDC machine, dmesg showed "Possible SYN flooding on port 8089 (the KDC service port). Sending cookies.", and netstat -s showed that XXXX times the listen queue of a socket overflowed. The main principle is shown in Figure 10:
The length of the SYN (half-open) queue is controlled by /proc/sys/net/ipv4/tcp_max_syn_backlog, whose value here is 2048. The length of the accept (full) queue is the minimum of the system parameter /proc/sys/net/core/somaxconn (65535 here) and the backlog value passed to the listen() function, which is hard-coded to 5 in the KDC source code. The fix is therefore to make the second (backlog) parameter of the listen() function configurable.
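A minimal sketch of the fix, in Python rather than the KDC's C source (the function name is hypothetical): the listener takes the backlog as a parameter instead of hard-coding it, and the kernel then caps the effective accept-queue length at net.core.somaxconn.

```python
import socket

def kdc_listener(port: int, backlog: int) -> socket.socket:
    """TCP listener with a configurable accept-queue backlog.

    The kernel uses min(backlog, net.core.somaxconn) as the effective
    accept-queue length, so a hard-coded listen(fd, 5) wastes a
    somaxconn of 65535 and overflows under a burst of connections."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    s.bind(("127.0.0.1", port))
    s.listen(backlog)   # the tunable that was fixed at 5 in the KDC
    return s
```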
KDC monitorable design and implementation
The open source community does not provide any monitoring interface for the Kerberos KDC. In the initial online deployment, the relevant metrics were monitored mainly through log retrieval, which makes it difficult to accurately track the service QPS and the various kinds of errors. To achieve accurate and comprehensive monitoring of the KDC, we carried out secondary development of the KDC and designed an interface for obtaining monitoring metrics. The monitoring design considers mainly the following three aspects.
A. Design tradeoffs
1. The monitoring design should minimize the impact on the online service in any scenario. In the end, a block of shared memory is established to record the information of each KDC process; the architecture is shown in Figure 11. Each KDC process corresponds to one area of the shared memory: n Slot arrays store the service metrics of the n KDC processes. After a KDC process handles a request, it updates the effect of that request on the monitoring metrics directly in its own Slot array. Because each process writes only to its own slot, updates involve no lock waiting, and each instrumentation point in the KDC is just an update to a memory block, so the impact on the service is almost negligible. Compared with other approaches, this implementation is also simpler and easier to understand.
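The per-process slot layout can be sketched as follows. This is an illustrative model in Python (the real implementation is inside the KDC's C code); the constants, layout, and function names are assumptions, but the key property matches the design above: each process owns its slot exclusively, so no locking is needed.

```python
import struct
from multiprocessing import shared_memory

N_PROCS = 4                 # number of KDC worker processes (illustrative)
N_METRICS = 2               # per-slot metrics, e.g. (requests_ok, requests_failed)
SLOT_BYTES = N_METRICS * 8  # one 64-bit counter per metric

def bump(shm: shared_memory.SharedMemory, proc_id: int, metric: int) -> None:
    """Lock-free instrumentation point: each process writes only to its
    own slot, so no other writer ever touches these bytes."""
    off = proc_id * SLOT_BYTES + metric * 8
    (v,) = struct.unpack_from("<Q", shm.buf, off)
    struct.pack_into("<Q", shm.buf, off, v + 1)

def totals(shm: shared_memory.SharedMemory) -> list:
    """Reader side: aggregate every process's slot into one view,
    which is what an external metrics tool would report."""
    agg = [0] * N_METRICS
    for p in range(N_PROCS):
        for m in range(N_METRICS):
            (v,) = struct.unpack_from("<Q", shm.buf, p * SLOT_BYTES + m * 8)
            agg[m] += v
    return agg
```

Keeping the per-process slots separate also preserves the per-process view described in point 2 below: the reader can inspect any single slot instead of only the aggregate.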
2. Recording the service state of each KDC process makes it possible to see precisely how each process is handling requests, which helps locate faults in various situations and shortens fault-locating time. For example, it accurately reflects whether requests are evenly distributed across processes and pinpoints exceptions in request processing.
B. Extensibility of the program
Metric collection always evolves with demand; if the program is not designed for extensibility, later metric additions become very troublesome. The first version of KDC metric collection only distinguished successful requests from failed ones. All keytabs in the KDC database of the Meituan-Dianping data platform have the PREAUTH attribute, and as described above, removing the PREAUTH attribute can double the QPS of AS requests. As the service scale grows further, the PREAUTH attribute will be removed if AS processing capacity becomes a bottleneck. To accurately monitor whether, and how many, requests fail during the removal of the PREAUTH attribute, the monitoring metrics had to be extended, hence the second version of KDC monitoring. The whole change required modifying only three places to implement two functions: 1. add the new metrics; 2. add the instrumentation logic.
The whole modification process is simple and clear, which shows that the KDC monitoring program has very good extensibility. Figure 12 lists and annotates the monitoring metrics.
C. Design of interface tool Kstat
The interface tool for obtaining KDC monitoring metrics works in two modes:
1. Obtain the cumulative value of each metric for each KDC process. Combined with the company's monitoring platform Falcon, this makes it convenient to report metrics and process both cumulative values and minute-level rates.
2. Obtain the instantaneous rate of each process's metrics within a specified time interval, with a minimum statistical interval at the level of seconds. This lets operations staff log in to the machine and check the current state of the KDC service without delay, so that service problems can still be analyzed when the company's monitoring system is unavailable. See Figure 13 for usage.
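The second mode reduces to sampling the cumulative counters twice and dividing the deltas by the interval. A minimal sketch (the function name and counter layout are illustrative, not the actual kstat implementation):

```python
import time

def instant_rate(read_counters, interval: float = 1.0) -> list:
    """Second-level rates: sample the cumulative per-metric counters
    twice, `interval` seconds apart, and report the per-second deltas.

    `read_counters` is any callable returning the current cumulative
    values, e.g. a reader over the shared-memory slots."""
    before = read_counters()
    time.sleep(interval)
    after = read_counters()
    return [(b - a) / interval for a, b in zip(before, after)]
```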
Conclusion
From the pressure tests and analysis of the KDC service, the performance-tuning recommendations for the KDC can be summarized as follows:
1. The KDC service itself should enable multiple processes to make full use of the CPU resources of multi-core machines, and enough memory should be reserved for BDB to keep its cache hit ratio high (the higher the better; otherwise database queries will generate a large number of disk read IOs);
2. The physical machine should be configured with RAID; otherwise, if the keytabs in the database have the PREAUTH attribute, the resulting heavy write load can become the performance bottleneck of the KDC. By establishing a block of lock-free shared memory, metric collection for the multi-process KDC is realized; its good extensibility and data accuracy greatly improve the reliability of the KDC service.
Compared with the original single-process online deployment, the processing performance of a single server has now increased by more than 10 times. This article does not discuss in detail how to tune the TCP half-open and accept queue parameters for the best results in combination with the service itself, or what effect changing each parameter has. Given the complexity of TCP itself, we will discuss this in more detail in a future article.
Reference documentation
http://blog.csdn.net/m1213642578/article/details/52370705
http://grinder.sourceforge.net/
http://www.cnblogs.com/Orgliny/p/5780796.html
http://blog.csdn.net/wulantian/article/details/42418231
Author’s brief introduction
Peng Fei, head of the Big Data SRE group and the offline computing SRE group of the Data Platform in the Basic Data Department of Meituan-Dianping; joined Meituan-Dianping in November 2015.
If you are interested in how to ensure the stability of massive data services and in the large-scale operation and maintenance of huge numbers of servers, and want to experience the explosive growth of Internet big data, you are welcome to join the Big Data SRE group of Meituan-Dianping. Interested candidates can send their resumes to chenpengfei#meituan.com. If you are interested in our team, you can also follow our column.