The Right Way to Use Open Source Projects: Lessons Summed Up in Blood and Tears!

Abstract:

There is a popular principle in software development: DRY, Don’t Repeat Yourself, or more figuratively, don’t reinvent the wheel. The main purpose of open source projects is sharing, which is precisely what spares people from reinventing the wheel. In a field that moves as fast as the Internet, speed is life, and adopting open source projects saves a great deal of manpower and time and greatly speeds up business development. Why not use them?

But reality is often less rosy. Open source projects save a lot of manpower and time, but they also bring plenty of problems. I believe the vast majority of readers have fallen into a pit dug by open source software: a small incident may be a half-hour outage, a bigger one may lose hundreds of thousands of records, and a truly disastrous one loses all the data.

In addition, although the DRY principle exists, open source projects are actually the worst at following it. There are many duplicate wheels, some of them crooked. Whatever the problem, there are competing solutions: you have MySQL, I have PostgreSQL; you have MongoDB, I have Cassandra; you have Memcached, I have Redis; you have Gson, I have Jackson; you have Angular, I have React. In short, there are many similar wheels, and choosing among them is a headache.

What should we do? Avoiding open source projects altogether is almost impossible, so we need to be smarter about how we choose and use them. Figuratively: don’t reinvent the wheel, but find the right wheel! If you drive a Porsche, don’t fit it with a tractor wheel.

Next, I will summarize some experiences and lessons on “how to use open source projects properly”, based on my five years at UC. Some of these projects I worked on myself, some I had contact with, and some I only observed, so some details may not be completely accurate. Feel free to bring in your own experience and join the discussion.

The content below is organized into three parts: “choose”, “use” and “change”.

Choose: How do you choose an open source project?

Focus on whether it meets the business need

One of the headaches in choosing an open source project is that there are many similar solutions, and each newcomer claims to be better than its predecessors. We get confused when choosing, always worried that picking plan A means missing out on plan B, or vice versa. The lesson here is to focus on whether a solution satisfies the business rather than on which solution is superior.

Case: When we were trying out a social business, we found an open source solution called TT (Tokyo Tyrant), which we thought could replace Memcached for caching and MySQL for persistent storage. It seemed very cool and elegant, so we used it extensively in our business. But using it later turned out to be very painful, mainly because:

1) It could not completely replace MySQL, so we ended up with two storage systems, and every design required a discussion and a decision about which one to use.

2) The features looked impressive, but there were correspondingly many bugs, some of them fatal, such as all data becoming unreadable. We later had to study the source code ourselves and write a tool to restore part of the data.

3) The features were indeed cool, but it took a long time to become familiar with the details.

Looking back, we concluded that Memcached + MySQL would have satisfied the business at the time, and everyone was already familiar with them. The business did not need TT.

Simply put: if your business requires 1,000 TPS, there is no difference to you between a 20,000 TPS solution and a 50,000 TPS solution. Some people may worry: what if my TPS keeps rising? Don’t worry. The architecture will evolve, and we can refactor when the load really gets that high. Remember: don’t optimize too early; premature optimization is the root of all evil.

Focus on maturity

Many new open source projects claim to be better than their predecessors: higher performance, more features, new concepts. It all looks tempting, but, consciously or not, they all hide one drawback: they are less mature! No matter how good the programmers are, the projects they write will have bugs. Don’t assume that a strong author means no bugs: the developers of Windows, Linux, and MySQL are top-notch, and those systems still have plenty of bugs.

Immature open source projects carry great risk in production. In mild cases the system goes down; in worse cases it cannot recover after a restart; in the worst cases data is lost and cannot be recovered. Take the TT mentioned above as an example again: we really did hit an abnormal power failure that corrupted the data file, and restarting could not recover it. Fortunately we made daily backups, but that meant we could only restore the data from one day earlier, and all the data from that day was lost. Later we spent a lot of time and manpower reading the source code and writing our own tool to restore part of the data. Luckily the data was not financial, so losing part of it was not a big problem; otherwise we would have been in serious trouble.

Therefore, when choosing open source projects, try to choose mature open source projects to reduce risks.

Maturity can be evaluated from the following aspects:

1) Version number: prefer a 1.x or later version over a 0.x version; the higher the version, the better.

2) Number of adopters: open source projects usually list the companies that use them on their home page. The larger and more numerous the companies, the better.

3) Community activity: check whether the community is active, looking at the number of posts and replies, how quickly issues are handled, and so on.
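The three criteria above can be written down as a rough checklist. The sketch below is only an illustration: the `ProjectInfo` fields are hypothetical, you would fill them in by hand, and the thresholds (5 adopters, 7 days to respond) are assumptions chosen for the example, not values from this article.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ProjectInfo:
    """Hypothetical hand-collected facts about an open source project."""
    version: str                        # e.g. "3.0.2"
    known_adopters: int                 # companies listed as users
    median_issue_response_days: float   # how quickly issues get a reply


def maturity_concerns(info: ProjectInfo,
                      min_adopters: int = 5,
                      max_response_days: float = 7.0) -> List[str]:
    """Return a list of maturity concerns; an empty list means none found."""
    concerns = []
    # Only the major version matters here: 0.x is treated as immature.
    major = int(info.version.lstrip("v").split(".")[0])
    if major < 1:
        concerns.append("version is still 0.x")
    if info.known_adopters < min_adopters:
        concerns.append("few known adopters")
    if info.median_issue_response_days > max_response_days:
        concerns.append("community responds slowly")
    return concerns
```

A non-empty result is not a verdict, only a prompt to look more closely before putting the project into production.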

Focus on operation and maintenance capabilities

When choosing open source projects, we tend to focus on technical metrics such as performance, reliability, and functionality, and pay little attention to operation and maintenance capabilities. However, if the solution is going to run in production, operation and maintenance capabilities are essential. Otherwise, once something goes wrong, operations, development, and testing can only stare helplessly and pray for the Buddha’s blessing!

Operation and maintenance capabilities can be evaluated from the following aspects:

1) Whether the logs are complete: some open source solutions log only a few lines at start and stop, which makes troubleshooting impossible.

2) Whether there are maintenance tools, such as a command line or a management console, that show how the system is running.

3) Whether there are fault detection and recovery capabilities, such as alerting and failover.

Use: How do you use an open source solution?

Research deeply and test carefully

Many people use open source projects purely by “taking and using”: they read a few demos, get the program running, and deploy it to production. That is like reading the driving manual, learning that the steering wheel turns, the accelerator speeds up, and the brake slows down, and then heading straight onto the road. It is actually very dangerous.

Case 1: We had a team that used ElasticSearch more or less straight out of the box. They were not even clear on what an inverted index was, and every setting was left at its default value. Once it went live, the node ping timeout turned out to be too long, so abnormal nodes were removed too slowly, and access to the whole site failed.

Case 2: Many teams did little research on MySQL when they first started using it, and business departments often complained that MySQL was too slow. In fact, the most critical parameters (e.g. innodb_buffer_pool_size, sync_binlog, innodb_log_file_size) were either not configured or configured incorrectly, so of course performance was poor.
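A check like the one in Case 2 is easy to automate. The following is a minimal sketch, assuming the configuration is a simple my.cnf-style text with a [mysqld] section and no include directives; the sample values are placeholders, not recommendations.

```python
import configparser

# Parameters named in Case 2 above; the list is not exhaustive.
KEY_PARAMETERS = ("innodb_buffer_pool_size", "sync_binlog", "innodb_log_file_size")


def missing_key_parameters(cnf_text: str, section: str = "mysqld") -> list:
    """Return the key parameters that are not set explicitly in cnf_text."""
    parser = configparser.ConfigParser(allow_no_value=True, strict=False,
                                       interpolation=None)
    parser.read_string(cnf_text)
    if not parser.has_section(section):
        return list(KEY_PARAMETERS)
    # MySQL accepts both "-" and "_" in option names, so normalize them.
    configured = {name.replace("-", "_") for name in parser.options(section)}
    return [p for p in KEY_PARAMETERS if p not in configured]


# Placeholder configuration used only to demonstrate the check.
sample = """
[mysqld]
innodb_buffer_pool_size = 4G
sync-binlog = 1
"""
print(missing_key_parameters(sample))  # prints ['innodb_log_file_size']
```

This only tells you that a parameter was left at its default. Whether the configured value is right for your workload still requires the research described above.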

Research and testing can be carried out from the following aspects:

1) Read through design documents or white papers of open source projects to understand their design principles;

2) Check the functions and impacts of each configuration item and identify key configuration items;

3) Perform performance tests in various scenarios;

4) Run stress tests continuously for several days and observe fluctuations in CPU, memory, disk I/O, and other metrics;

5) Run fault tests: kill the process, cut the power, unplug the network cable, restart more than 100 times, trigger failover, etc.
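The repeated kill-and-restart test in item 5 can be scripted rather than done by hand. Below is a minimal harness sketch. The `kill`, `start`, and `is_healthy` callables are assumptions that you would implement for your own system, and `FakeService` is a stand-in used only to show how the harness behaves.

```python
from typing import Callable, List


def run_restart_cycles(kill: Callable[[], None],
                       start: Callable[[], None],
                       is_healthy: Callable[[], bool],
                       cycles: int = 100) -> List[int]:
    """Kill and restart a service `cycles` times.

    Returns the 1-based cycle numbers after which the service was unhealthy.
    """
    failed = []
    for cycle in range(1, cycles + 1):
        kill()
        start()
        if not is_healthy():
            failed.append(cycle)
    return failed


class FakeService:
    """Stand-in for a real service; fails to come back on every 40th start."""

    def __init__(self) -> None:
        self.starts = 0
        self.running = False

    def kill(self) -> None:
        self.running = False

    def start(self) -> None:
        self.starts += 1
        self.running = self.starts % 40 != 0

    def is_healthy(self) -> bool:
        return self.running


service = FakeService()
print(run_restart_cycles(service.kill, service.start, service.is_healthy))
# prints [40, 80]
```

For a real system, `is_healthy` should also verify that previously written data can still be read, since failing to recover after a restart is exactly the kind of problem this test is meant to expose.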

Apply carefully, release in grayscale

If we have done the “in-depth research and careful testing” above and found no problems, can we boldly go live with peace of mind? Not so fast. However deep the research and however careful the testing, they can only reduce the risk; they cannot cover every scenario in production, so you still have to be careful.

Case: Take TT again. Before adopting it, we actually assigned a top engineer to read the source code and test it for about a month, yet we still ran into all kinds of problems. The complexity of a production environment really is beyond what testing can cover, so you have to be careful.

So, no matter how thorough your research and testing, and no matter how confident you are, always stay in awe of production. Our experience is to start with non-core businesses and expand gradually as experience accumulates.
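Beyond starting with non-core businesses, one common way to expand gradually within a single business is to route only a fixed percentage of users to the new component. This is an illustrative sketch of that technique, not something described in the cases above; the function name and the `salt` value are hypothetical.

```python
import hashlib


def in_gray_release(user_id: str, percent: int, salt: str = "new-storage") -> bool:
    """Return True if user_id falls into the first `percent` of 100 buckets.

    The bucket depends only on the salt and the user ID, so a user who is
    included at 10% stays included when the rollout grows to 50%.
    """
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 100
    return bucket < percent
```

A request handler would call `in_gray_release(user_id, 10)` to decide whether to use the new open source component or the existing path, and the percentage would be raised step by step as confidence grows.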

Just in case

Even if all the work above has been done thoroughly, we cannot assume everything will be fine, especially when using an open source project for the first time. With bad luck, you may hit a bug that no other user in the world has encountered, leaving the business unrecoverable. This is especially true for storage, where unrecoverable data can be a fatal blow.

Case (I only heard about this one): a certain business used MongoDB, but some data was lost after an outage and could not be restored, and there was no other backup. With no way to repair the data, even manually, the team had to handle user complaints one by one. As a result, the DBAs and operations staff objected to any further use of MongoDB, even on a trial basis.

While it would be a bit of an overreaction to completely reject the attempt based on a single failure, it does serve as a reminder that when using an open source project for critical business or data, it’s best to have another, more mature solution to back it up, especially data storage. For example, if you want to use MongoDB or Redis, you can use MySQL as backup storage. This may be more complicated and expensive, but it can save lives at a critical time!
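The backup idea can be sketched as a thin dual-write wrapper. This is a simplified illustration: the two stores are modeled as dict-like objects, whereas a real implementation would wrap actual client libraries and decide how to handle a write that succeeds in one store but fails in the other.

```python
from collections.abc import MutableMapping
from typing import Optional


class DualWriteStore:
    """Writes go to both stores; reads fall back to the backup."""

    def __init__(self, primary: MutableMapping, backup: MutableMapping) -> None:
        self.primary = primary   # the newer store, e.g. MongoDB or Redis
        self.backup = backup     # the mature store, e.g. MySQL

    def put(self, key: str, value: str) -> None:
        # Write the mature store first: if this raises, the error propagates
        # and nothing is written to the primary.
        self.backup[key] = value
        self.primary[key] = value

    def get(self, key: str) -> Optional[str]:
        try:
            return self.primary[key]
        except Exception:
            # The primary is down or has lost the key: serve from the backup.
            return self.backup.get(key)
```

With plain dicts standing in for the two stores, clearing the primary after a `put` simulates data loss, and `get` still returns the value from the backup.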

Change: How do you do secondary development on an open source project?

Keep it pure, add wrappers

When something about an open source project doesn’t meet our needs, there is a natural impulse to change it, but how to change it takes real know-how. One approach is to assign a few people to overhaul it from the inside out until it fits our business needs exactly. But this approach has several serious problems:

1) Too much investment. Generally speaking, to truly take on modifying an open source project at the level of Redis, at least 2 people need to invest more than 1 month.

2) Losing the ability to evolve with the original project: if we change too much, then even as the original open source project continues to evolve, we cannot merge its changes because the differences are too great.

So our recommendation is not to change the original system but to build auxiliary systems around it: monitoring, alerting, load balancing, management, and so on. Take Redis as an example. If you want to add clustering, don’t change the implementation of Redis itself; add a proxy layer instead. Twitter’s Twemproxy does exactly this. Once Redis itself provided clustering in 3.0, a setup built this way could simply switch to Redis 3.0. (For details, see http://www.cnblogs.com/gomysql/p/4413922.html)

What if you really do need to change the original system? Our suggestion is to submit the requirement or bug directly to the open source project. The drawback is that the response can be slow, so it depends on how urgent the business need is. If it is very urgent, we can only make the change ourselves; if it is not, we suggest simply preparing backup or emergency measures.
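The proxy-layer idea can be illustrated with a small sketch. This is not how Twemproxy is implemented; it only shows the principle of adding sharding in front of unmodified backends. The backends are modeled as dict-like objects, and the modulo routing is an assumption chosen for brevity.

```python
import zlib


class ShardingProxy:
    """Routes each key to one of several unmodified backends."""

    def __init__(self, backends) -> None:
        self.backends = list(backends)
        if not self.backends:
            raise ValueError("at least one backend is required")

    def _backend_for(self, key: str):
        # crc32 is stable across processes, unlike Python's built-in hash().
        index = zlib.crc32(key.encode("utf-8")) % len(self.backends)
        return self.backends[index]

    def set(self, key: str, value: str) -> None:
        self._backend_for(key)[key] = value

    def get(self, key: str):
        return self._backend_for(key).get(key)
```

Because callers only see `set` and `get`, the routing can later be replaced, or the whole proxy dropped in favor of native clustering, without touching the backend's own code.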

Invent the wheel you want

This will probably surprise many people: after all that talk, how did we end up back at “invent the wheel you want”?

In fact, whether or not to choose an open source project comes down to cost and benefit. Choosing an open source project is not always the optimal solution. The main problem is that no wheel fits you perfectly!

The biggest difference between software and hardware is that software has no absolute industry standards; everyone does as they please. In hardware, if you build a wheel of a non-standard size, no car can use it, and however advanced its technology and quality, it is all in vain. In software, many similar wheels can be built, and they can be used almost anywhere. For example, switching your cache from Memcached to Redis will not cause major problems.

In addition, to be adopted at scale, open source projects aim for general-purpose solutions. But businesses differ greatly, and a general-purpose solution may not fit a particular business perfectly. Memcached, for example, achieves clustering through consistent hashing, but for some of our businesses, if a single cache node goes down, the whole business can slow to a crawl. That requires cache backup, which Memcached did not provide, and Redis had no clustering at the time. So we invested 2 to 4 people for about 2 months and built a cache framework on top of LevelDB that supports storage, backup, and clustering. Later we added cross-data-center synchronization on top of this framework, which greatly improved business availability. Had we relied purely on open source solutions, we could not have moved this fast, and the open source projects might not have supported our requirements at all.

So, if you have the money and the time, investing manpower to build a wheel that perfectly fits your business is also a good choice! After all, many of the deep-pocketed giants (BAT, Facebook, Google...) do exactly that; otherwise we would not have so many good open source projects.
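To make the combination of consistent hashing and cache backup concrete, here is a minimal sketch of a hash ring that returns an owner node plus a backup node for each key. It is not the LevelDB-based framework described above, whose design is not covered here; the class name, node names, and the number of virtual nodes are all assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring that can return backup nodes for a key."""

    def __init__(self, nodes, vnodes: int = 100) -> None:
        # Each node is placed on the ring at `vnodes` pseudo-random points.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(text: str) -> int:
        return int(hashlib.sha256(text.encode("utf-8")).hexdigest(), 16)

    def nodes_for(self, key: str, copies: int = 2) -> list:
        """Return up to `copies` distinct nodes: the owner first, then backups."""
        start = bisect.bisect(self._points, self._hash(key))
        found = []
        # Walk clockwise from the key's position, collecting distinct nodes.
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in found:
                found.append(node)
                if len(found) == copies:
                    break
        return found
```

A cache client built on this would write each entry to every node returned by `nodes_for` and read from the first one that responds. In this sketch, when the owner node is removed from the ring, the key falls to the node that was already holding its backup copy, so the cache stays warm.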

The author:
Effect of cloud platform
