Foreword

Technology selection is a trade-off between technical direction and business scenario, and it is meaningless when discussed apart from the business scenario. This article therefore only describes the database selection process of the Companion Fish technical team. It is not a direct comparison of MySQL, MongoDB and TiDB, and it shows only that TiDB is the better fit for Companion Fish's business scenarios and technical roadmap. In addition, because TiDB is a very new database technology, the choice also reflects the team's attitude toward new technology, its understanding of the late-mover advantage in technology, its weighing of cost against efficiency, and its thinking about technology ecosystems and dividends.

Why we gave up MongoDB

Companion Fish was founded in 2015, when NoSQL was in its heyday and relational databases could only cope with large data volumes through intrusive sharding across databases and tables. Although Google had published the paper on its NewSQL database Spanner in 2012, the industry did not yet have a NewSQL database that could actually be used. Given the circumstances at that time, Companion Fish chose MongoDB.

Between 2015 and 2017, MongoDB was indeed a good choice for Companion Fish, mainly for the following reasons:

  • More efficient development: The company was in its early exploratory stage and the product iterated very quickly. MongoDB is a NoSQL database, so no DDL operations such as creating databases and tables are needed, which is especially efficient when fields must be added or removed frequently during rapid iteration. The trade-off is that MongoDB does little to check whether what is written is valid, and the data schema is interpreted in application code, so there are no guaranteed constraints on written data.
  • More efficient operations: The R&D team was very small at the time. There were only two back-end engineers and no full-time operations engineer or DBA. MongoDB's single-node performance was much higher than MySQL's, and its operational cost was much lower. In addition, MongoDB could absorb the traffic directly, which eliminated the intermediate cache layer and made development and operations more efficient.
  • Few scenarios required transactions: MongoDB 2.x and 3.x were in use at the time, which offered only a choice of data consistency level (strong, monotonic and eventual consistency) and atomic operations. In the few scenarios that needed transactions, such as trading, an MVCC-like mechanism was implemented at the application layer on top of strong consistency and atomic operations, which was enough for simple transaction requirements.
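The application-layer mechanism described in the last bullet can be sketched roughly as a version check combined with an atomic conditional update. This is only an illustration under stated assumptions: VersionedStore is a hypothetical in-memory stand-in for a document store, and update_if_version and add_balance are made-up names, not MongoDB APIs or the team's actual code.

```python
import threading


class VersionedStore:
    """Hypothetical in-memory stand-in for a store with an atomic conditional update."""

    def __init__(self):
        self._docs = {}
        self._lock = threading.Lock()

    def insert(self, key, fields):
        with self._lock:
            self._docs[key] = {"version": 0, **fields}

    def read(self, key):
        # Return a copy so callers work on a snapshot of the document.
        with self._lock:
            return dict(self._docs[key])

    def update_if_version(self, key, expected_version, fields):
        """Apply fields only if the stored version still matches (atomic)."""
        with self._lock:
            doc = self._docs[key]
            if doc["version"] != expected_version:
                return False  # another writer got there first
            doc.update(fields)
            doc["version"] = expected_version + 1
            return True


def add_balance(store, key, delta, max_retries=10):
    """Read-modify-write guarded by the version field, retried on conflict."""
    for _ in range(max_retries):
        doc = store.read(key)
        new_balance = doc["balance"] + delta
        if new_balance < 0:
            raise ValueError("insufficient balance")
        if store.update_if_version(key, doc["version"], {"balance": new_balance}):
            return new_balance
    raise RuntimeError("too many conflicting writers")
```

A caller would insert a document with a balance field and then call add_balance(store, key, delta). Every piece of business code that touches the document has to follow the same read, check and retry discipline, which is part of why this approach becomes hard to keep correct as concurrency grows.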

Overall, during the product's exploration period it was worthwhile to trade some data constraints and transaction capability for efficiency. However, once the product direction became clear at the end of 2017, the business moved from exploration to rapid growth, and the requirements on the database changed from efficiency first to a balance of efficiency, transaction capability and ecosystem:

  • Scenarios requiring transactions surged: Money-related transaction scenarios expanded from the original trading to virtual currencies and more. As concurrency grew, race conditions appeared more and more often in scenarios that previously had no transaction protection. Implementing simple transactions through MVCC at the application layer was very inefficient, and their correctness was hard to guarantee. (An interesting anecdote: Jeff Dean once said that one of his biggest regrets about Bigtable was that it did not support cross-row transactions, so businesses tried to implement transactions themselves on top of it, and most of those business-implemented distributed transactions were wrong. The later Spanner database therefore provided distributed transaction support directly.)
  • Demand for the big data ecosystem grew rapidly: There was already strong demand for data analysis during the exploration period, but the data volume was small then, and analyzing directly on MongoDB's hidden secondaries was sufficient. During the rapid-growth period the data volume rose sharply, and running OLAP workloads on an OLTP database was no longer adequate. Analyzing MongoDB data through the big data ecosystem runs into a harsh reality: almost the entire big data ecosystem is built around MySQL, so bringing in MongoDB data means reinventing a large number of wheels.
  • Higher requirements for data constraints: As the business grew rapidly, a service might be maintained by and handed over between several people. If stored data has no constraints, its schema is uncontrollable, and engineers who join later easily fall into pitfalls. Data constraints therefore became a higher-priority requirement, and the schema-on-write model of relational databases became the better choice.

By the rapid-growth period, the business's requirements on the database had changed a great deal, so the technical team began to think seriously about database selection. Our ideal database looked like this:

  • High availability;
  • High throughput;
  • ACID transaction support;
  • Friendly to the big data ecosystem;
  • Horizontally scalable, with as little intrusion into business code as possible.

Based on these requirements, we set out to re-select our database.

Getting to know TiDB

I had been following distributed databases closely since as early as 2015. Having worked on high-concurrency, high-QPS scenarios, I knew that handling stateless high-concurrency, high-QPS workloads with a distributed architecture is not complicated, but distributed storage is very challenging because of consistency, and supporting ACID transactions on top of that is harder still. I was therefore particularly interested in distributed database technology at the time, and began to follow OceanBase and to collect and study the related theory and architecture documents.

Later, a colleague recommended a database called TiDB, which some companies were already using with good feedback, so we decided to investigate it. The documentation on the TiDB website is very friendly, and both the theory and architecture articles are very complete; I read almost all of them in one sitting (there were fewer articles then than now). The solid theoretical foundation, the elegant architecture and a design lineage shared with Google Spanner made us very optimistic about TiDB's prospects, and functionally it fully met our requirements. So we decided to follow TiDB over the long term and prepared a preliminary validation.

Preliminary validation

Through our investigation we found that TiDB uses the Raft protocol to keep multiple replicas of the data consistent (the C in ACID), the 2PC protocol to guarantee transaction atomicity (the A in ACID), and optimistic locking plus MVCC to provide the repeatable read isolation level (the I in ACID). This means that each transaction in TiDB costs much more than in MySQL, especially when transactions conflict (because of optimistic locking), so performance was the key point to validate.
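To make the atomicity point concrete, here is a textbook two-phase commit in miniature. It is a sketch for illustration only and does not reflect TiDB's actual implementation; Participant and two_phase_commit are hypothetical names, and failure handling such as coordinator crashes or timeouts is deliberately left out.

```python
class Participant:
    """Hypothetical storage node that can stage writes before committing them."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.staged = None
        self.data = {}

    def prepare(self, writes):
        # Phase 1: vote yes only if the writes can be staged.
        if not self.can_commit:
            return False
        self.staged = dict(writes)
        return True

    def commit(self):
        # Phase 2: make the staged writes visible.
        self.data.update(self.staged)
        self.staged = None

    def abort(self):
        self.staged = None


def two_phase_commit(writes_by_participant):
    """Commit all writes or none. Takes a list of (participant, writes) pairs."""
    prepared = []
    for participant, writes in writes_by_participant:
        if participant.prepare(writes):
            prepared.append(participant)
        else:
            # One "no" vote aborts everyone that already prepared.
            for p in prepared:
                p.abort()
            return False
    for participant, _ in writes_by_participant:
        participant.commit()
    return True
```

The extra round trip of the prepare phase is one reason a distributed transaction costs more than a single-node one, which is the concern the paragraph above raises.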

At that time, all of Companion Fish's services were deployed on Aliyun (we now also have a self-built data center), so we purchased machines directly on Aliyun according to TiDB's configuration requirements and installed TiDB 1.x. Because the TiDB website already published Sysbench stress test data that met our requirements, we decided to run a long-term test that fully simulated one of our own business scenarios. Companion Fish's IM service, with its high concurrency and write fan-out design, was well suited to this validation. We dual-wrote the INBOX table of the IM service: writes went to MongoDB synchronously and to TiDB asynchronously, while reads went only to MongoDB. That way, a problem in TiDB would not affect online business.
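The dual-write arrangement can be sketched as follows. This is a minimal illustration under assumptions, not the team's actual code: DualWriter is a hypothetical name, and plain dictionaries stand in for the MongoDB and TiDB clients.

```python
import queue
import threading


class DualWriter:
    """Write to the primary store synchronously and to the secondary asynchronously."""

    def __init__(self, primary, secondary):
        self.primary = primary        # stand-in for the store serving production traffic
        self.secondary = secondary    # stand-in for the store under validation
        self.secondary_errors = 0
        self._queue = queue.Queue()
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def write(self, key, value):
        self.primary[key] = value       # synchronous: failures surface to the caller
        self._queue.put((key, value))   # asynchronous: never blocks the request path

    def read(self, key):
        # Reads touch only the primary, so the secondary cannot affect users.
        return self.primary[key]

    def _drain(self):
        while True:
            key, value = self._queue.get()
            try:
                self.secondary[key] = value
            except Exception:
                # A secondary failure is counted, not propagated.
                self.secondary_errors += 1
            finally:
                self._queue.task_done()

    def flush(self):
        """Block until all queued secondary writes have been attempted."""
        self._queue.join()
```

Swapping the two arguments gives the later stage of the rollout, in which the new store takes synchronous reads and writes and the old one only receives asynchronous writes.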

When dual writes were enabled for the IM service during off-peak hours, the 999 line and 99 line (99.9th and 99th percentile latency) monitored on TiDB met our requirements, but the I/O utilization of all TiKV nodes hovered around 90% the whole time, which would certainly have caused problems at peak. After re-reading the TiKV configuration and changing the configuration item to sync-log = false, TiKV I/O utilization stayed below 5%, and everything was normal at that day's peak.
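For readers unfamiliar with the terms, the "99 line" and "999 line" are the latencies below which 99% and 99.9% of requests complete. A minimal sketch of the nearest-rank calculation follows; latency_line is a hypothetical helper name, and real monitoring systems usually estimate these values from histograms rather than raw samples.

```python
def latency_line(samples_ms, per_mille):
    """Nearest-rank percentile; per_mille=990 is the 99 line, 999 the 999 line."""
    if not samples_ms:
        raise ValueError("no samples")
    if not 0 < per_mille <= 1000:
        raise ValueError("per_mille must be in (0, 1000]")
    ordered = sorted(samples_ms)
    # Integer ceiling of per_mille / 1000 * n, avoiding floating-point rounding.
    rank = (per_mille * len(ordered) + 999) // 1000
    return ordered[rank - 1]
```

Because the 999 line is driven by the slowest one request in a thousand, it is much more sensitive to occasional stalls (such as disk I/O saturation) than an average would be.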

After that, we observed the IM dual writes for 2-3 months. Once we had confirmed that everything was normal, we switched to reading and writing TiDB synchronously and writing MongoDB asynchronously. Everything remained normal, and we continued to observe.

The sync-log configuration controls how TiKV behaves when it replicates data to multiple replicas through the Raft protocol. If sync-log = false, an ack is returned as soon as the write has been processed in memory. With three replicas, a single node failure does not lose data; data may be lost only if two replicas of the same Raft group fail at the same time. This is acceptable for business scenarios other than those with very high data safety requirements, such as finance, and the clustering solutions of other databases such as MySQL have bigger problems when the master node fails.
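The three-replica reasoning above is majority arithmetic, and a simplified model makes it explicit. The sketch assumes an acknowledged write reached exactly a majority of replicas and that every failed node held the write and lost its un-synced data; quorum and surviving_copies are hypothetical helper names, not TiKV functions.

```python
def quorum(replicas):
    """Smallest majority of a Raft group with the given number of replicas."""
    return replicas // 2 + 1


def surviving_copies(replicas, failed):
    """Worst-case number of replicas still holding an acknowledged write.

    Assumes the write reached exactly a quorum and that each failed node
    was one of the holders and lost its un-synced data.
    """
    return max(quorum(replicas) - failed, 0)
```

With 3 replicas the quorum is 2, so one failure leaves at least one copy in the worst case, while two simultaneous failures can leave none.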

In-depth communication

In the preliminary validation, a seemingly serious problem was solved by changing a single configuration item, which made us realize that our understanding and control of TiDB were still insufficient. Beyond understanding and studying each configuration item, there were questions we cared a great deal about that had no official answers. Without those answers, using TiDB directly would have been risky, so we decided to have an in-depth conversation with the TiDB team.

The questions we cared most about at the time were:

  • How well does TiKV scale linearly?
  • In a two-region, three-data-center architecture, how much latency between data centers can TiDB tolerate?
  • For the largest TiDB cluster in the industry today, what are the numbers of TiKV and TiDB nodes, the data volume and the QPS?
  • Which TiDB configurations need special attention and adjustment? …

I collected more than 20 questions. Fortunately, Companion Fish and the TiDB team are both in Beijing and very close to each other. After getting in touch online and making an appointment, I had my first in-depth conversation with the TiDB team.

One day in the first half of 2018, I spent a whole morning talking with three or four members of the TiDB team, mostly raising the questions I had collected one by one for discussion. The conversation resolved many of our doubts and showed us how TiDB was being used across the industry. This greatly increased our confidence in TiDB, which is critical in database selection.

Special thanks to the TiDB colleagues who talked with me at that time: Fang Xiaole and two or three others whose names I do not know (my sincere apologies).

Why we did not choose MySQL

After investigating, trialing and discussing TiDB in depth, we had to choose between the traditional relational database MySQL and the NewSQL database TiDB. This was more than a choice between two databases. It also reflected Companion Fish's attitude toward new technology, its understanding of the late-mover advantage in technology, its weighing of cost against efficiency, and its thinking about technology ecosystems and dividends.

Attitude toward new technology

Companion Fish takes a very positive attitude toward new technology. If a business scenario calls for a new technology, we will learn it, study it and master it, and we trust our ability to judge and control where a new technology is heading. In choosing between TiDB and MySQL, MySQL was indeed the very safe option, and it has ready-made, if not very elegant, solutions for our current requirements such as high availability and horizontal scaling. TiDB, however, is a generation ahead of MySQL at both the theoretical and the architectural level. (MySQL was designed as a single-node database and is excellent in that field. TiDB is better on the distributed dimension, but that is not a flaw in MySQL, because the two have different design goals.) In stability and maturity, TiDB was somewhat behind MySQL. In the end we chose to trust our judgment and control over the direction of NewSQL technology, to trust TiDB's ability to evolve, to trust that time was on our side, and to "let the bullets fly" a little longer.

Understanding of technological late-mover advantage

Companion Fish had previously used MongoDB and had used neither MySQL nor TiDB. If we judged TiDB to be the more future-oriented database, should we start with MySQL, walk the whole MySQL path, and migrate to TiDB in the foreseeable future? Or should we study and master TiDB directly and go all in on TiDB?

Start-ups lag far behind mature companies in accumulated technical experience; that is the mature companies' first-mover advantage in technology. When there is no technological change, we have no alternative. But when a major technological change occurs and we still make the same technology choices, it takes us the same time and cost to reach the level of a mature company, and then, when everyone starts migrating to the new technology, that accumulated experience can turn into technical debt.

Start-ups should therefore anticipate technology trends, choose future-oriented technology, overtake on the corners of technological change, and avoid accumulating technical debt of their own. This is how the Companion Fish technical team understands the late-mover advantage in technology.

Weighing cost against efficiency

Cost and efficiency are key considerations in technology selection, especially for databases, because the machines and other resources a database needs account for a large share of total resource cost. The Companion Fish technical team therefore evaluated cost and efficiency in depth when choosing between TiDB and MySQL.

The Unix philosophy is a set of design principles tempered by time and practice, and our technical team applies it often, for example the Rule of Economy: "Programmer time is expensive; conserve it in preference to machine time." In technology selection we always want the infrastructure software to do more and business developers to do less. If business developers have to do at the business layer what the infrastructure software should do, the infrastructure software's abstraction is leaking. Once the underlying abstraction leaks, the business layer is bound to solve the same problem over and over, which is a very large hidden cost.

Compared with TiDB, MySQL's cluster high availability and its need to shard large tables across databases and tables are abstraction leaks relative to our current requirements. MySQL cluster high availability costs the DBA and infrastructure teams effort to solve, and MySQL's large-table sharding costs the DBA, infrastructure and business R&D teams effort to solve. These are hidden costs, though, and therefore easy to overlook, unlike the simple and visible fact that a TiDB cluster may need more machines than a MySQL one.
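A small sketch shows what the sharding leak looks like from the business layer. Everything here is hypothetical: the shard counts, the im_db_N and inbox_N naming, and the helper names are made up for illustration and are not Companion Fish's schema.

```python
import zlib


def shard_for(user_id, db_count=4, tables_per_db=8):
    """Map a user to a (database, table) pair by hashing the sharding key."""
    slot = zlib.crc32(str(user_id).encode()) % (db_count * tables_per_db)
    return f"im_db_{slot // tables_per_db}", f"inbox_{slot % tables_per_db}"


def sharded_inbox_query(user_id):
    # With manual sharding, every query must first resolve its shard, and
    # queries that do not include the sharding key need special handling.
    db, table = shard_for(user_id)
    return f"SELECT * FROM {db}.{table} WHERE user_id = %s"


def single_table_inbox_query():
    # With a horizontally scalable database, the business layer sees one table.
    return "SELECT * FROM inbox WHERE user_id = %s"
```

The routing function is small, but it has to be kept consistent across every service, migration script and analytics job that touches the data, and changing the shard count later means moving data. That recurring effort is the hidden cost the paragraph above refers to.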

Therefore, in weighing cost against efficiency, the Companion Fish technical team pays more attention to engineers' efficiency and to engineers' morale (repeatedly working around leaky abstractions of the underlying software at the business layer is very demoralizing), and more attention to hidden costs rather than only the resource figures that can be compared clearly on the books, especially given the trend of machines becoming cheaper and talent becoming more valuable.

Thinking about technology ecosystems and dividends

When we choose a technology, we also choose its ecosystem. A mature ecosystem gets twice the result with half the effort and greatly improves development efficiency. TiDB has done very well here: it is compatible with the MySQL protocol, so TiDB users get the capabilities of NewSQL while still enjoying the MySQL ecosystem. This was the right decision, because the MySQL ecosystem is the product of decades of accumulation and cannot be built overnight.

On the other hand, consider the choice between a future-oriented, elegant and efficient solution and a mature but less elegant and less efficient one. With the mature solution, our control over the technology is higher, but efficiency stays low. With the future-oriented solution, mastering the new technology takes time and effort, but it then solves the problem elegantly and efficiently, which we regard as the technology dividend. Take large tables as an example: MySQL's answer is sharding across databases and tables, which business developers and DBAs must solve together, very inefficiently. With the NewSQL database TiDB, a single table can be treated as almost unlimited in size (there are tables with more than 10 billion rows in the industry), which solves the problem at the root. All of Companion Fish's large tables have now been migrated from MongoDB to TiDB, and business developers and DBAs no longer have to keep sharding as data grows, which is a huge technology dividend.

Therefore, based on the discussion above, Companion Fish decided to go all in on TiDB. No new databases or tables would be created on MongoDB, services already using MongoDB would continue to use it, and large tables on MongoDB would be migrated in a planned way to avoid having to shard them.

Pitfalls we hit

Enjoying the benefits of a new technology before fully mastering it comes at a cost, especially since Companion Fish decided to go all in on TiDB relatively early. This tests the technical team's ability to learn and evolve, the new technology's community, and the official technical support. In TiDB's case, both our technical team and TiDB's technical support did well, but we still hit some pitfalls on the way from TiDB 1.x to the current 3.x:

Optimizer index selection problems

  • A single table held 300,000+ rows and query concurrency was about 10+. When a service release added an index, the optimizer picked the wrong index for an existing query, and the CPU of the machine hosting the TiKV instance was quickly saturated, causing an outage.
  • On a large online table with a high request rate, queries with certain conditions occasionally failed to hit the index and fell back to full table scans, causing interface response time jitter that affected the service.
  • On a large online table with 1.4 billion rows and highly selective query conditions, one day a specific condition suddenly stopped hitting the index, and the resulting full table scan caused an outage. The TiDB engineers investigated and found that it was caused by a bug.

From TiDB 1.x to 3.x, the optimizer has kept getting better.

Big data synchronization problems

  • For data analysis, we replicate every upstream TiDB cluster into a single TiDB cluster through Pump/Drainer for big data analysis. In the process we ran into data inconsistency, slow synchronization and coding failures.

As Companion Fish's DBA team has studied TiDB in depth and kept up close communication with the TiDB engineers, our control over TiDB has grown steadily, and the big data synchronization problems have been solved.

Current status

At present, Companion Fish runs 10 TiDB clusters with 110+ database instances. 6 core clusters exceed 10,000 TPS, and the 999 line (99.9th percentile latency) generally stays at about 16 ms. Response time and stability are as expected. Judging from where we are now, choosing TiDB was very much the right decision for Companion Fish. We overtook on the corner in database technology, avoided rebuilding and accumulating MySQL expertise, and enjoy the technology dividends of the NewSQL database TiDB in high availability, horizontal scaling and more, which has greatly improved the efficiency of business development and the DBAs. Of course, this is the result of the joint efforts of the Companion Fish technical team (especially the DBAs) and the TiDB technical team.

In particular, every TiDB upgrade brings a pleasant surprise, which is a technology dividend we can keep enjoying.

Afterword

Today, with Moore's law failing, businesses demanding high availability, and costs needing optimization, distributed architecture is the clear direction of technology. Traffic routing policies plus multi-replica deployment (microservices being one such architectural form) solve the distributed architecture problem for stateless services, Redis Cluster and Codis solve it for caches, and Kubernetes has completed the distributed evolution of the operating system. The database field will be no exception: its move to distributed architecture is unstoppable. It should be noted that solving the problem here means solving it systematically. MySQL's business-intrusive sharding across databases and tables is indeed a distributed architecture solution, but it requires business developers to solve the problem one business scenario at a time, so it cannot be called a systematic solution. We believe that NewSQL is the systematic solution to intrusive sharding, which leaks the large-table abstraction that the database should handle into the business layer, and that TiDB is a very good choice right now.

In addition, note that this is an article about database selection, so it records only what is relevant to that topic. For example, it describes in detail the pitfalls the Companion Fish technical team hit after migrating to TiDB, because they are the price we paid for choosing TiDB and must be recorded in detail. The absence of pitfalls for other databases does not mean we hit none. We also hit some while using MongoDB, but they were not what led us to re-select our database (for the reasons, see the section on why we gave up MongoDB), so they are not recorded in this article.

References

  • Spanner: Google’s Globally Distributed Database
  • Martin Kleppmann. Designing Data-Intensive Applications
  • PingCAP blog
  • Ask TiDB User Group
  • Eric S. Raymond. The Art of UNIX Programming
  • The Law of Leaky Abstractions
  • Companion Fish technical team | Database selection