Dongxu Huang is co-founder and CTO of PingCAP
A few days ago, over coffee with friends, I ended up talking about the history of PingCAP and TiDB and about what I see as the core competitiveness of open source software companies. Looking back on my entrepreneurial career and the growth of the TiDB community over the past few years, it all feels like a huge and ongoing sociological experiment. So I thought I’d write an article summarizing what community means to open source software and to open source companies.
Network effects everywhere
Two kinds of network effects
Many people have heard of network effects (Metcalfe’s law: the value of a network is proportional to the square of the number of users connected to it), and many great products and companies have built strong moats through them. The classic example is in communications, such as mobile phones: each additional user makes the network more valuable to all users. Although people have no intention of creating value for others, once they start using the network, their behavior helps create value for it. Many familiar to-C (consumer) companies, especially social networking and IM (instant messaging) products, have built high barriers through this effect. NfX Venture catalogs the different kinds of network effects in one of their blog posts (www.nfx.com/post/networ…). Before introducing the community, I’d like to highlight two of these network effects related to open source software.
- Network effects based on herd psychology
These network effects usually start with a few opinion leaders, maybe industry heavy hitters, maybe social trendsetters, and often appear when a new product attacks an established market. Although the new product is less mature than the market leader, it usually brings some distinctive features or more forward-looking ideas. These attract opinion leaders who are dissatisfied with the “mainstream” or who want to show off their forward-looking views, and their support creates the feeling that “all the cool people are using it, and if you don’t, you’ll be left behind.”
This feeling can create a network effect of conformity as new users join, but it doesn’t last very long. If you think about it for a moment, you can see that if early opinion leaders joined simply to be “different,” they have no reason to stay after the community goes mainstream, and the fans who followed them in may follow them out. In addition, new products are often less polished than older ones, so both good and bad reviews arrive early on. If the network effect is not used at this point to speed up iteration and polish the product quickly, it will not be solid. The one advantage of this kind of network effect is that it works best in the early days.
Looking back at the early days of the TiDB community, the earliest endorsement came from several of the founders’ previous work on Codis, the reputation they had accumulated in the domestic basic software circle, and the support of some friends in the Internet technology circle.
- Faith-based network effects
**The so-called “belief” means joining because you identify with an idea, which in turn forms a network effect.** This is common in software as well; the free software movement and the open source movement are both good examples. People always have to believe in something. **The moat of such network effects is extremely deep, and they have a high tolerance for product defects.** Belief is a long-term idea. For TiDB, it means believing that distributed systems are the future and that businesses in the cloud era need databases like TiDB. The goal is hard, but challenging enough to be worth the long-term effort.
In their earliest stages, faith-based network effects may resemble herd-based ones. The key is whether the core community firmly believes in the idea behind the product. Simply showing off superiority, on the other hand, will not last: as interest wanes, the network effect collapses.
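To make the Metcalfe-style claim at the start of this section concrete, here is a small worked form. The constant $k$ and the function $V$ are notation introduced here for illustration only, and the square law itself is a rough model rather than a measured fact.

$$V(n) = k\,n^{2}, \qquad V(n+1) - V(n) = k\,(2n + 1)$$

Under this simplified model, the value added by one more user grows with the size of the network: the 101st user adds $201k$, while the 1,001st adds $2001k$. This is the sense in which each new user is valuable to every existing user.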
What network effects mean for basic software
When it comes to basic software, I’ve always had two things to say:
- Basic software is “used”, not “written”.
- Iteration and speed of evolution are the core competencies of this type of software.
These two points are exactly what network effects can bring. Although the value chain is not as obvious as in IM, a network effect exists whenever new users bring additional value to existing users. For basic software, that value shows up in the following ways:
- Controllable risk (stability)
- More scenario adaptation (discover new scenarios and continuously improve performance)
- Good ease of use
For risk control, the more people use the software, the more people share the risk. The underlying assumption is: I am not special, so any problem I run into has probably been hit by others, and someone will likely find and fix it before I do. This assumption holds in a mature and active basic software community, because the boundaries of the software’s scenarios are relatively clear and the usage paths within that scope are roughly the same. As long as one person hits a pitfall and reports it to the community, the fix benefits all other users, no matter who ends up making it.
The same logic holds for scenario adaptation. Any individual’s perspective is limited, and even the founding team of a project may not understand a specific application scenario deeply. The community’s creativity, by contrast, is practically unlimited, and some usage paths that were never designed for turn out to be surprisingly useful and develop into new winning scenarios. Likewise, a single success story increases the value of the software for other users in similar scenarios; the real-time HTAP data processing solution built with TiDB and Flink is one example.
The logic for usability improvements is similar to that for stability, so I won’t go into it here. Using the flywheel of network effects to improve software is an idea I discussed in my article “Cathedrals May Fall, but markets Remain.”
The maturity curve and necessary stages of the community
Birth of community
Publishing your source code on GitHub, or even adopting an open Git workflow, is not the moment a community is born. A community is really born when a third party, someone other than you and your code, steps in and makes a connection. It could be the first external PR or the first external issue. Community begins with, and results from, connectivity. Making source code public is not the same as being open source, and it is a shame that many teams and projects spend so much time on releasing code that they neglect to build a community around the code and the team behind it.
The death chasm and the slope of hope
As with the technology adoption curve described in the book Crossing the Chasm, open source software has its own life cycle curve, which is closely tied to the community.
The gap in the figure appears because product maturity has not kept up: users arrive and find pitfalls everywhere. The resulting bad reviews wear down the early backers and founders, who may even lose interest.
**For open source software, the gap may look like rapid growth in the early stage followed by a quiet period of 1 to 2 years in which growth almost stops.** During this period, almost all of the community’s effort goes into fixing pitfalls for early adopters, while growth is only organic and attrition is high. This phase is a huge drain on resources and exhausts the core contributors; a project that doesn’t make it through dies here, hence the “death chasm.”
The good news is that this phase will eventually pass. During it, the product gradually finds its own positioning and best practices, and it becomes more stable and focused along the path those best practices define. If the positioning matches real market demand, **the project will enter a phase of rapid growth (maturity), and the community’s ecosystem will develop faster as the product gains popularity.** The outline of this chasm can be seen in the Kubernetes and TiDB search indices above.
Community endgame
What is the endgame of a good open source community? There are plenty of examples, such as GNU/Linux, Hadoop, Spark, and MySQL. In my opinion, whether an open source project and its community were started and grown by a commercial company or by some other means, it is natural for a neutral organization, independent of the company, to eventually take over the community.
Company-led open source projects, in particular, face neutrality problems at a later stage. Because customer success is the most important thing for a company, the drive to commercialize is bound to affect the functionality and development priorities of the open source software. Those priorities often change (and may become more urgent and specific), and such changes may conflict with the community’s development rhythm. Still, I don’t think the two are irreconcilable, as I’ll explain below.
**Among open source projects that later succeeded both in user scenarios and in business terms, there are far more successful examples where a neutral committee balances the interests of all parties and oversees their responsibilities.** When such a body starts to form, it also indicates that the project is mature and has a healthy ecosystem. For open source software that hasn’t reached this stage, the community is mostly led by the company behind the project. Before the project matures, the focus should be on keeping the flywheel turning by continually optimizing customer success and winning scenarios. When members of the leading company think more and more about governance and put its rules into practice, that is a positive signal.
How can community and commercialization coexist
Farming and cooking & people on rivers and banks
One problem left open above is the tension between open source and commercialization. No matter how you explain it, open source is in essence in conflict with the traditional software sales model.
Let me give an example that is easy to understand. If open source is compared to growing vegetables, the source code of open source software is the seeds, and business success is a good harvest. The traditional software business model is similar to selling seeds, while the planting (hosting) is done by the customers themselves. With open source, the seeds are free, the land is owned by the customer, and the people working the land are also the customer’s people, so open source vendors can probably only provide guidance on planting. When some seeds are not very easy to grow, that guidance service makes sense. But think about it: as the seeds improve (in performance, stability, ease of use, and so on) to the point where you can just drop them in the ground, there is no need for a professional planting service. So the company has to sell some extra value, such as insurance: at the very least, a team of experts to help out in case of extreme weather. The business model is still awkward, though, because most of the value chain is on the customer’s side. And if a vendor sees the community only as a pool of potential customers, it is hard to make a good product, because there is no intrinsic incentive to continuously optimize the software.
A better perspective is to take a step back. Here is another example that is easy to understand: think of the community as a river that belongs to no one. Everyone keeps the water clear and flowing, and nobody overfishes. Different groups and individuals can build their own ecosystems along the river. How to make money on the banks is another question, which I’ll come back to later.
Customer success and user experience: Internal consistency
While the primary goal of an open source software business is customer success, this is not inconsistent with doing a good job in the community. **A common misconception is that within open source software companies, the two teams are pitted against each other.** The business team believes that the community exists to farm fish for commercialization, to be harvested once fattened, and at the extreme even argues for going closed source. The community team believes that commercialization slows down the spread of the ecosystem and raises the barrier to use, and at the extreme develops anti-commercialization tendencies. If each side looks only from its own position, then of course both are right. So what is wrong?
The problem comes down to “phases” and “customer choice.” The lifecycle of open source software can be completely different for community users and commercial users. A typical open source software company has two funnels, which I call the community funnel and the commercial funnel. Some say that the community funnel sits on top of the commercial funnel, and I used to believe that, but after years of practice I’ve come to realize that it doesn’t. The two are separate, and treating them as a single stacked funnel raises a lot of questions, like the classic one: what is the value of community users who never flow into the commercial funnel? So it is definitely not one funnel; rather, the two have deep internal connections.
What connections? For ease of understanding, let me use the vegetable analogy again. The things incubated by the open source community, such as user success stories, community contributions that polish the product, and newly explored application scenarios, are like raw ingredients. The customer, however, wants a plate of fish-flavored shredded pork and doesn’t care where the meat and vegetables on the plate came from. See the key point? The commercial team is like the chef, and the community operations team is like the farmer. The chef focuses on how to cook good dishes, and the farmer focuses on how to grow good, and ever better, ingredients. It is a long way from ingredients to a dish, but without good ingredients, even the most capable chef will struggle to make a good one.
**For open source software companies, the community and commercial teams are internally consistent: both need a good product and winning scenarios.** Based on our experience, it is better for community teams to focus on two key points:
- Community polishing of the product (in the winning scenario);
- Discover more winning scenarios.
These two key points form a closed loop: the community team continuously produces the ingredients (winning scenarios and an evolving product), the business team focuses on further processing of winning scenarios and customer journey optimization, and together the two teams drive the cycle for the whole company and project. For example, the scenarios and solutions used by TiDB’s commercial users were mostly born and matured among community users. Although the two user groups may be completely different, together they form one large cycle from ecosystem to commercialization around TiDB, with PingCAP as the bridge in the middle. In addition, the community and commercial teams share a common North Star metric: user experience.
The only way to scale: the cloud
A good business should be able to scale. The problem with the business model of traditional open source software companies is that scaling requires people, in sales, pre-sales, after-sales delivery, and so on, and a people-based business cannot scale. This problem was unsolved before the cloud. Open source software companies needed to find a software business model that didn’t depend on open source (it sounds weird, but if you think about it, it’s true), and the cloud, which is essentially a resource rental business, provides one.
Take vegetable growing as an example again. Under the traditional business model, open source software companies were in an awkward position because the land and the growers belonged to the customers. On the cloud, however, the business model of basic software is essentially a hosting service. Hosting resources and infrastructure, the most important part of the original value chain, are now in the hands of the vendor. This is also good for users: after all, managing the “land” is laborious, and it is hard to buy land on demand. The point is that users just want a good dish, and that has nothing to do with whether the ingredients (the vegetables) are open source. Open source or not, users pay management and rental fees. Even if the seeds and ingredients are free, customers who eat at a restaurant still pay for the meal, because what they are buying is good food and a good service experience.
In addition, **many people think that the open source community is a barrier to competition, but it is not. The real barrier is the ecosystem, and the open source community is an efficient way to build one. If a product can build an ecosystem without open source, the effect is the same.** A good example is Snowflake. Although Snowflake is not open source, when it was founded in 2012 it had almost no competitors in the cloud data warehouse market, which gave it plenty of time to build its ecosystem through differentiation and a great user experience. Riding the rise of the cloud and economies of scale, it has achieved great success.
How to build a good community
I’ve talked a lot about philosophy in the abstract, so let’s talk about practice. There is a method for doing open source well, but only if you start from the right way of thinking and the right perspective. Otherwise, in practice, there are countless things you could do, but you won’t know which ones matter more, and even worse, you won’t be able to tell right from wrong. Here are some of my perspectives and some key metrics for community operators to consider.
Who are you? What problem did you solve? Why you?
The foundation of a good community must be a good product, and the question “Who are you?” must be answered through “What problem did you solve?”, which is very different from operating to-C products. The most common mistake is that community operators shift their attention to all kinds of events and promotions while exaggerating the product’s capabilities, creating a mismatch with reality.
Friends who work in community operations often come to me and say: I’ve run a lot of events and written a lot of articles, so why does it seem to have no effect? At this point I usually ask: Can you explain in one sentence what your product does? What problem does it solve? Is it a common problem? Does it have to be your product that solves it? Then they understand. There is no perfect product, and a good product must stick to the scenarios where it has an advantage. For example, Redis obviously cannot be used for core financial transaction scenarios, but no one will deny that Redis is the de facto standard for caching. There are many similar examples, such as Spark and ClickHouse. So the operations team should think through these four questions before making any moves.
The ease of use determines the conversion rate of the funnel
Is finding the winning scenario enough? Of course not. If you think of the entire user journey as a funnel, finding the winning scenario at best gives you the right entry point. Once users are inside the funnel, what matters is increasing the conversion rate at every stage. Many to-B teams subconsciously ignore this, usually for one of two reasons:
- The project team even encourages users to contact it directly, because in the early stage it is very important to know that someone is using the software. Meanwhile, service and support for commercial customers are handled entirely by the official team, so customers never feel the friction, and the company has no motivation to optimize.
- Many products start out as life-saving products, and users have no other choice. For example, in the early days of TiDB, when users urgently needed to scale out MySQL, they cared most about solving the problem immediately. Kernel capability mattered most, and everything else could wait or be put up with.
The result of these two factors is a lack of focus on ease of use and user experience, a mistake that stays hidden in the early stages of market competition. In those stages, the number of leads flowing into the funnel is small, so hands-on manual support is acceptable, and competition is not fierce, so users have no other choice. But imagine the point when the market has matured and a large number of educated customers flow into the funnel: the team’s support bandwidth will not be enough. And because the market has been educated, there will be competitors who can do similar things. When you are no longer the only life-saving option, users will choose whichever product is convenient and easy to use, which is understandable. This is why ease of use and user experience become paramount in the middle and late stages of open source software competition.
Users want a product they can trust (“rest assured”) and one that saves them trouble (“worry-free”). Suppose trust has already been established through a mature ecosystem and case endorsements. On being worry-free, open source software born in China still lags far behind the world’s most advanced level. After all, competition among open source software overseas is fiercer than at home, because the overseas open source market has existed for a long time, and because the demand for basic software in business scenarios there is not as extreme as in China. Usually several products can solve the same scenario, so they naturally compete on ease of use (worry-free) and ecosystem (rest assured).
As the product owner of an open source project, there are a few questions you can ask yourself: What is the definition of “good” in your product domain? What are the best practices? What is the most user-friendly peer product in the world like? I believe that thinking about these questions will help product development. A good lens on ease of use is the level of self-service, for which there are many indicators, such as the percentage of the product life cycle that can be serviced on the cloud, the percentage of open source community users who never need to ask a question while getting to know and using the product, and the number of active contributors in the open source community.
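The funnel and self-service ideas in this section can be turned into simple numbers. The sketch below is only an illustration: the stage names, the user counts, and the function names are all hypothetical and are not taken from TiDB or any real project.

```python
from typing import List, Tuple

Stage = Tuple[str, int]  # (stage name, number of users who reached it)


def stage_conversion_rates(funnel: List[Stage]) -> List[Tuple[str, float]]:
    """Conversion rate from each funnel stage to the next one."""
    rates = []
    for (name, count), (next_name, next_count) in zip(funnel, funnel[1:]):
        rate = next_count / count if count else 0.0
        rates.append((f"{name} -> {next_name}", rate))
    return rates


def overall_conversion(funnel: List[Stage]) -> float:
    """Share of users entering the funnel who reach the last stage."""
    if not funnel or funnel[0][1] == 0:
        return 0.0
    return funnel[-1][1] / funnel[0][1]


def self_service_ratio(total_users: int, users_who_asked: int) -> float:
    """Share of users who got through without asking anyone for help."""
    if total_users == 0:
        return 0.0
    return (total_users - users_who_asked) / total_users


# Hypothetical numbers, for illustration only.
funnel = [
    ("read the docs", 1000),
    ("installed", 400),
    ("ran a first workload", 200),
    ("went to production", 50),
]

for step, rate in stage_conversion_rates(funnel):
    print(f"{step}: {rate:.0%}")
print(f"overall: {overall_conversion(funnel):.1%}")
print(f"self-service: {self_service_ratio(400, 120):.0%}")
```

Looking at the per-stage rates rather than only the overall figure shows where users drop out, which is usually where an ease-of-use problem is hiding.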
Secondary propagation is the key to achieving network effects
As mentioned above, the premise of a network effect is that each new user adds value for existing users. So think about it: if a community user uses the software silently, silently reads the documentation and best practices, and even silently fixes the bugs they find (without contributing back), are they valuable to the community and the product?
I don’t think so.
I know such users will always exist, as the silent majority. For community operators, **the most important task is not to grow the number of silent users or to deepen their usage, but to get them to build more connections with other users in the network**, for example by sharing experience (writing case studies), becoming contributors, and actively reporting problems they run into back to the community. Make sure that this content reaches the other nodes of the network, so that value is actually generated. For example, one user’s usage scenario helps another user with technology selection, and one user’s feedback helps the product find and fix a bug. These are examples of value being generated. Don’t let users become islands. A community operator who fails to see this key point may end up chasing numbers (usage) for their own sake, doing a lot of work without seeing overall progress.
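One way to keep an eye on “islands” is to track how many users have at least one connection to another user. The sketch below assumes you can export a list of users and a list of interactions (for example, one user answering another’s question); the data shape, the names, and both functions are hypothetical.

```python
from typing import Iterable, List, Set, Tuple


def connected_users(
    users: Iterable[str], interactions: Iterable[Tuple[str, str]]
) -> Set[str]:
    """Users with at least one interaction with a different known user."""
    known = set(users)
    connected: Set[str] = set()
    for a, b in interactions:
        if a != b and a in known and b in known:
            connected.update((a, b))
    return connected


def find_islands(
    users: Iterable[str], interactions: Iterable[Tuple[str, str]]
) -> List[str]:
    """Users with no connection to anyone else, sorted by name."""
    known = set(users)
    return sorted(known - connected_users(known, interactions))


# Hypothetical data, for illustration only.
users = ["alice", "bob", "carol", "dave", "erin"]
interactions = [("alice", "bob"), ("carol", "alice")]

islands = find_islands(users, interactions)
share = len(connected_users(users, interactions)) / len(users)
print(f"islands: {islands}")
print(f"connected share: {share:.0%}")
```

Tracking the connected share over time, rather than raw usage, keeps the focus on whether value is actually flowing between users.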
Transfer of network effects
The highest level of community operations is to convert the user network effect into a faith-based network effect, and to move the center of the community from inside the open source company to outside it, to gain more potential energy. Neither is easy. The former is mostly about abstracting and distilling the ideas, keeping the long-term insight right, and finding the right group of evangelists. For the latter, as long as enough success stories and winning scenarios are accumulated in the company-centered stage and resources are invested in educating the market, the rest is left to time, and the focus is on brand power. Operating an open source software community is a game played on an exponential curve, so approach the work with a long-term mindset.
Finally, I’d like to close with a question: what does a great open source infrastructure software product look like?
I see a great basic software product as not just solving a specific problem at hand, but opening up new horizons, new perspectives, and new possibilities. The smartphone, for example, became a platform that gave birth to great apps like WeChat and opened up a whole new world. The arrival of the cloud, S3, and EBS gave developers new ways to design systems, spawned new species like Snowflake, and revolutionized the way people use and analyze data. And the open source community is the perfect home for this kind of great foundational software, like water to fish.
I don’t know what the community will bring, and I dare not overestimate my ability. After all, the power of an individual is always insignificant in front of the wisdom of a group.