Introduction: At DTCC 2021, Wang Weimin (alias: Weimin), General Manager of the Product and Solution Department of the Alibaba Cloud Database Division, delivered the keynote speech “Cloud Native Database 2.0: One-Stop, Full-Link Data Management and Services” and gave an exclusive interview to Lao Yu, executive editor of IT168 Enterprise & ITPUB. This article is compiled from the on-site interview.
Interviewee: Wang Weimin (alias: Weimin), General Manager of the Product and Solution Department, Alibaba Cloud Database Division
Interviewer: Lao Yu, executive editor of IT168 Enterprise & ITPUB
(This article is based on an interview conducted at the DTCC 2021 conference.)
Reporter: Please introduce your day-to-day work at Alibaba Cloud Database.
**Wang Weimin:** First, let me introduce myself. I have worked at Oracle, Microsoft, Huawei, and other companies, always in database-related work and in a variety of roles.
I joined Alibaba Cloud this year, and I am now mainly responsible for four areas.
The first is product management, which is the same as my previous work: using market and customer demand to decide which products we need to build.
The second is solutions, which is responsible for the commercial success of the products across industries, regions, and international markets.
The third is product experience, including documentation and experience design, so that users have a better experience and can use our products and services more efficiently.
The fourth is brand and ecosystem. On the brand side, we continue to build our products' influence in the industry. On the ecosystem side, a database is basic software that sits in the PaaS layer. It is unlike computing, networking, storage, security, and other general capabilities that everyone needs, and it has no traffic entry point of its own the way SaaS applications do. Promoting a database on its own is therefore not very efficient. We need to market it through an ecosystem and work with partners to bring it to more application scenarios, and replicate it there, more quickly.
That is my current work at Alibaba Cloud.
Reporter: So for products such as OLAP, OLTP, and NoSQL, R&D builds each product and your team handles productization?
**Wang Weimin:** Yes, all customer requirements come to our team. We plan the product; R&D takes on the requirements, commits resources, and drives the product to market according to the roadmap in the product plan. Our team is also responsible for go-to-market (GTM) work such as business models and pricing.
Reporter: The topic of your speech today is “Cloud Native Database 2.0: One-Stop, Full-Link Data Management and Services”. Why did you choose this theme, and what do you think participants can take away from it?
**Wang Weimin:** We first emphasize “one-stop”, meaning “One Stop”. The word is “Stop”, not the “Stack” of “technology stack”. What we provide now is cloud services, and we offer a diverse supply of engines. Each product or engine is customized and optimized for a specific user scenario, which is the classic trade-off between General Purpose (GP) and Special Purpose (SP). Many companies try to solve every problem with a general-purpose product, but so far such a product may be optimal in some scenarios and suboptimal in the rest. Of course, a general-purpose product offers the highest ROI for the product development team, which can invest continuously in a single product that tries to cover all scenarios.
But there is a problem. We see many products designed for dedicated scenarios, such as caching, embedded IoT, edge, and multi-model, which GP products do not handle well. **Today thousands of industries are going through digital transformation and moving to the cloud. Trying to solve every problem with one product is, philosophically speaking, too idealistic and cannot be achieved.** We want to use products and product combinations to meet the needs of different business scenarios, which amounts to a GP+SP model. That is why we chose this topic.
Second, we emphasize “full link”. Our surveys found that most enterprises run a very complicated mix of products and solutions, and there are still big problems in how those products connect and coordinate. Moving to the cloud may appear to solve the “chimney” (silo) problem, but in fact it only removes the resource silos; business and data are still not connected. Since data is a new means of production, we need a data operating system, and we want to use DMS to realize DataOS. That is our thinking, and that is what we have been working toward.
Our biggest insight so far is that the rich variety of business scenarios, and the different workload characteristics within them, make it difficult to solve every problem with one product. So we want to make a real difference here.
Reporter: The idea you just mentioned matches the trend I wanted to ask about next. On dedicated versus multi-model databases, your dedicated-database approach is clearly closer to Amazon's than to Microsoft's “one database for everything” approach, with one set of products for all scenarios. Now all vendors are talking about cloud native. As it evolves, how does your cloud native architecture differ from that of your competitors?
**Wang Weimin:** First, Microsoft and Oracle do have focused, unified brands, but they are also a combination of GP and SP. Oracle has many brands, such as Oracle TimesTen for caching scenarios and Oracle RAC for high-availability scenarios, so Oracle has a tier-1 brand and tier-2 brands. Similarly, Microsoft has SSIS, SSAS, and SSRS for integration, analysis, and reporting, and Azure Cosmos DB for multi-model.
So I think the core reason we are all “moving toward cloud native 2.0” is that we all see how challenging the design of the underlying software is, given the diversity of workloads. It is impossible to cover all of them with one architecture and one code base.
As for differences from our peers: in the cloud native 1.0 era, we essentially completed productization and the transformation into services. Infrastructure pooling is basically done, along with multi-tenant control, on-demand use, and metering by usage, and products can scale out and in to varying degrees.
In the cloud native 2.0 era, the linkage between products and services matters more. In my earlier talk (at the 12th China Database Technology Conference), I mentioned seven customer cases, and none of them was solved with a single product. To meet their transformation needs, customers basically need transaction processing, analytics, real-time data warehousing, lake and warehouse integration, and so on, and all of them need data discovery, insight, and value realization.
In our view, the cloud native 2.0 era may evolve in two directions: first, continuously consolidating and strengthening the differentiated competitiveness of individual products; second, efficient collaboration and a better experience across multiple products.
Reporter: So individual products should keep strengthening their own advantages, while overall you also emphasize the benefits of a holistic solution, in which all products, whether OLTP, OLAP, or NoSQL, work together better.
**Wang Weimin:** We now have the idea of productizing solutions. For example, if a customer wants an online e-commerce system, we can directly provide a system with caching and transactions, with a link in between, plus something like a small data warehouse that is ready to use. Because there may not be much data at the start, the warehouse is scalable.
It runs in the cloud, but it is also vertical: the business is the silo, while resources are connected horizontally. That is what we are trying to do.
Many companies in the industry have already productized solutions. We may only be able to do it for general-purpose solutions; combining it with industry know-how may be difficult.
Reporter: If you had to describe cloud native database 2.0 in a sentence or two, so that others understand the concept immediately, how would you describe it?
**Wang Weimin:** “One-stop, full-link” is still a fairly technical term. In business terms, it is an efficient combination of multiple engines that meets customers' diverse workloads. Other terms are “full scenario” or “full data life cycle”, but those are more marketing terms.
Reporter: Yes, it is still a little hard for non-technical people to understand. The next question comes from a view expressed by Huang Dongxu of TiDB. He thinks that many so-called cloud native databases are not really cloud native, and that anything that can go back to offline deployment should not be called cloud native. What do you think?
**Wang Weimin:** This idea is quite new; I had not heard it before. We have seen a class of customers who move to the cloud and ask that their data be able to flow back, so that they are ready to leave the cloud at any time. The core reason is that they worry the cloud is a super lock-in: “Once I am on your cloud, I will be on your cloud for the rest of my life, unless I die.”
In talking with customers, we find that some resist differentiating features for fear of becoming dependent on one vendor. Of course, there are many schools of thought here. Some customers write SQL against an abstraction layer in the middle, which can be adapted to MySQL underneath, or switched to SQL Server or Oracle, without the application noticing. There are also products, such as Source Pro, that mask the differences between underlying databases for upper-layer applications. So I do not quite agree with this argument.
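To make the abstraction-layer idea concrete, here is a minimal sketch of how a middle layer might render one logical paging query for different underlying databases. It is only an illustration under stated assumptions: the `PageQuery` class, the `render` function, and the dialect names are hypothetical and are not taken from any product mentioned in the interview. The only database-specific facts it relies on are that MySQL accepts `LIMIT ... OFFSET ...`, while SQL Server (2012 and later) and Oracle (12c and later) accept the `OFFSET ... ROWS FETCH NEXT ... ROWS ONLY` form.

```python
from dataclasses import dataclass

# Hypothetical dialect identifiers used only by this sketch.
MYSQL = "mysql"
SQLSERVER = "sqlserver"
ORACLE = "oracle"


@dataclass(frozen=True)
class PageQuery:
    """A database-neutral description of 'one page of rows from a table'."""
    table: str
    order_by: str
    limit: int
    offset: int = 0


def render(query: PageQuery, dialect: str) -> str:
    """Render the neutral query as SQL text for one specific dialect.

    Identifiers are interpolated directly for brevity, so this sketch
    must not be fed untrusted table or column names.
    """
    base = f"SELECT * FROM {query.table} ORDER BY {query.order_by}"
    if dialect == MYSQL:
        return f"{base} LIMIT {query.limit} OFFSET {query.offset}"
    if dialect in (SQLSERVER, ORACLE):
        # SQL Server 2012+ and Oracle 12c+ both accept OFFSET/FETCH.
        return (
            f"{base} OFFSET {query.offset} ROWS "
            f"FETCH NEXT {query.limit} ROWS ONLY"
        )
    raise ValueError(f"unsupported dialect: {dialect}")
```

The application only ever builds a `PageQuery`; which database sits underneath is decided in one place, which is the sense in which the switch is invisible to the application.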
A product can be provisioned on-premises or in the cloud. In the cloud it can take advantage of many cloud-specific capabilities, such as resource elasticity, but it does not have to run only in the cloud.
For example, can RDS MySQL be taken off the cloud? It certainly can, so by that standard it is not a cloud native database. But once compute and storage are decoupled, as in AWS Aurora, it becomes very hard to take off the cloud.
Whether something is “cloud native” should be judged by customers' business scenarios and requirements rather than by technical implementation. After all, customers have different requirements. It is like carrying a bag and checking into a hotel: you check into this hotel today and that hotel tomorrow, and the hotel cannot stop the guest from leaving.
Reporter: Do you think his statement is a conclusion drawn from foreign markets? Is being able to move both onto and off the cloud, as you said, a demand specific to the domestic market, or a global one? In fact, I recently read an article whose view was roughly that distributed databases are not used abroad and are only used domestically.
**Wang Weimin:** You asked whether this is specific to the domestic market. In my opinion, when public cloud becomes mainstream, the question of how users think about platform neutrality and cloud neutrality will certainly arise. We already see many such requirements. In one pattern, the customer's primary site is in cloud A and its disaster recovery site is in cloud B. In another, the two are interleaved: some services run production in A with backup or standby in B, while other services are active in B with standby in A. **Our customers' needs may keep changing, and our philosophy is “customer first”, so this is something we must consider.** Dongxu's idea is new, but can TiDB be deployed offline? It certainly can.
InfoQ only recently split its underlying database into multiple databases; before that it ran a single database. As for how products and architectures are realized, cloud vendors choose their own target markets. AWS, for example, does not pursue the offline market or on-premises deployment. At most it offers Outposts for edge deployment, which provides only limited capabilities. That may simply be a matter of which customers they choose to serve.
Reporter: Dongxu's view is fairly forward-looking. Many of their customers are abroad, where the market embraces public cloud to a high degree. In the domestic market, though, offline deployment, hybrid cloud, and private cloud are clearly embraced more than public cloud, at least in high-contributing industries such as finance and telecommunications.
**Wang Weimin:** On the general trend toward public cloud, my judgment is the same as Dongxu's. In my opinion there are two likely reasons why the large workloads of the domestic financial and telecom industries are not on the public cloud: first, domestic policy requirements on information security, industry regulation, and compliance; second, customers' demand for independence and control over the operation and management of system software.
You might ask whether the public cloud is unsafe. Not necessarily. The US Department of Defense's $10 billion JEDI project is on the public cloud, and Capital One, one of the largest US banks, is 100% on the public cloud with no private cloud. I am relatively optimistic about the future of the public cloud. Many industries need time; moving to the cloud may be a process of more than ten years. It is a very long track.
Reporter: So because distributed databases are not used abroad, do they have no meaning or application value?
**Wang Weimin:** This is another issue that is particularly worth discussing. I think complexity is conserved. For example, GitHub did not use a distributed database before, but that does not mean the work went away; it may have been done in the application layer and the middleware layer. **From a database perspective, we want to leave the complexity to the database and the simplicity to the application.** We hope a distributed database lets teams do application development, management, and operations just as they would on a single machine, so that the application does not need to care about the underlying database's capacity, transactions, scaling, consistent backup, and similar problems.
In fact, databases abroad do not face concurrency and user scale as extreme as those at home; after all, we have the demographic dividend. For example, could Alibaba use Oracle to support its payment system during the Double 11 shopping festival on November 11? Sure, it would be possible to replace the underlying OceanBase with Oracle, but the TCO would be too high.
So whether to solve the problem with a distributed product or with a distributed solution is a separate question. For example, an application connected to distributed middleware such as ShardingSphere, with standalone databases underneath, can do the same thing.
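The middleware approach described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how ShardingSphere itself is implemented: the `ShardRouter` class and its methods are hypothetical, each in-memory SQLite connection merely stands in for one standalone database, and routing is a simple hash of the sharding key modulo the number of shards.

```python
import sqlite3
import zlib


class ShardRouter:
    """Hypothetical sharding layer over independent single-node databases."""

    def __init__(self, shard_count: int) -> None:
        # Each in-memory SQLite connection stands in for one standalone database.
        self.shards = [sqlite3.connect(":memory:") for _ in range(shard_count)]
        for conn in self.shards:
            conn.execute("CREATE TABLE orders (user_id TEXT, amount INTEGER)")

    def shard_for(self, user_id: str) -> int:
        # crc32 is stable across processes, unlike Python's built-in hash() for str.
        return zlib.crc32(user_id.encode("utf-8")) % len(self.shards)

    def insert_order(self, user_id: str, amount: int) -> None:
        # Single-key writes go to exactly one shard.
        conn = self.shards[self.shard_for(user_id)]
        conn.execute("INSERT INTO orders VALUES (?, ?)", (user_id, amount))
        conn.commit()

    def total_for_user(self, user_id: str) -> int:
        # Single-key reads also touch only the owning shard.
        conn = self.shards[self.shard_for(user_id)]
        row = conn.execute(
            "SELECT COALESCE(SUM(amount), 0) FROM orders WHERE user_id = ?",
            (user_id,),
        ).fetchone()
        return row[0]

    def total_all(self) -> int:
        # Queries without a sharding key fan out to every shard and merge results.
        return sum(
            conn.execute("SELECT COALESCE(SUM(amount), 0) FROM orders").fetchone()[0]
            for conn in self.shards
        )
```

Even this toy version shows where the complexity goes: the routing rule, the fan-out query, and anything that spans shards (such as a transaction touching two users) now live outside the database, which is the trade-off between a distributed product and a distributed solution.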
I do not think that “it is not used abroad” means it is meaningless; it depends on the layer at which the investment is made. From a product perspective, marginal cost falls as usage grows, so I think this work is worth doing. **If applications no longer need to care about flexible transactions or business-side waiting because that complexity is handled at the database level, then a distributed database is very meaningful.** You may not need a distributed database abroad, but you can degrade a distributed database to a single instance, just as you can use only one node of an Oracle RAC.
Reporter: In the past, I always thought Huawei had the best database ecosystem among domestic cloud vendors; I often see news of who has joined its partner circle. Alibaba Cloud was the first mover (“the first to eat crab”), so it retains a certain first-mover advantage, and Alibaba also has extreme scenarios such as Double 11 that build up its technology. Of course, this is just my opinion. From your point of view, what advantages does Alibaba Cloud's database business have over competitors in terms of technology, product, business, and ecosystem?
**Wang Weimin:** First, I think the first advantage of our product and R&D team is a high degree of organizational unity and consistency, which lets us make a difference. Li Feifei is the head of the Alibaba Cloud Database Division and also the head of the DAMO Academy Database and Storage Laboratory, so he represents and leads all of Alibaba's database systems, products, R&D, and pre-research teams. As a result, our product strategy, communication, and market strategy are highly consistent. That is the first advantage.
Second, Alibaba does have a first-mover advantage. Many of our customers are not recent arrivals who have just moved from an IDC to the cloud. They have been on the cloud since “day one”: born in the cloud and grown up in the cloud.
Third, we focus on products. We do not have many products, and they fall into three categories: managed open source, managed commercial, and self-developed. The core is to focus on these three categories, with a different strategy for each.
For managed open source products, we focus on ease of use, security and reliability, and cost-effectiveness. Commercial products mainly draw on their open ecosystems; here we mainly address diverse business needs and compliance. We have direct commercial partnerships for SQL Server and MongoDB, which require Alibaba Cloud to remain compliant on an ongoing basis. For self-developed products, we focus on four major lines, PolarDB, AnalyticDB, Lindorm, and Tair, and do not spread ourselves thin.
When it comes to ecosystem, each vendor goes about it differently. Some peers build databases but do not treat the database as an independent business. Building a “second plane” to solve the chokepoint problem (being cut off from critical technology) is admirable, and we need that technology.
Alibaba's approach to ecosystem is to do it in a more integrated way. For an open and thriving ecosystem, the most important thing is being able to share benefits with partners; if that cannot be done, I do not think it will work. So we have been exploring how to get ecosystem participants to work with us to make the pie bigger, and how to better look after all sides.
We also continue to invest in the ecosystem, including distribution partners, SaaS integration, ISVs, delivery partners, and training and certification, though admittedly our investment in this area needs to keep growing.
Reporter: Last month I interviewed Li Feifei at Alibaba's media briefing. He mentioned that you would enter the offline market, because he thought the online landscape was basically settled. For high-end customers in the offline database market, such as finance, government, and enterprises, don't some of your peers hold more advantages than Alibaba? Alibaba is now going after the offline market. What are you going to do, and what is your differentiating advantage?
**Wang Weimin:** First, offline market demand is rigid; that is an objective fact. We have already accumulated many customers, including telecom operators, the financial industry, and government. Everyone is discussing one question: if we were to do it again, should we replace the platform like-for-like, or transform and upgrade on the existing base?
We see a common demand from customers, including operators. Today I also shared the case of an operator whose business systems (including the BOSS system) moved directly onto the cloud. In the process, they had to combine many things, including moving to the cloud, “swim-lane” partitioning of upper-layer applications, microservices, and the whole of operations and development. I think this kind of replacement is not simply swapping one product for another, but replacing it with a solution.
Second, in the government and enterprise market, competing head-to-head with Oracle on, say, performance may be misleading. Performance is not the only factor; there are many others. We want to use DBStack for agile delivery and seamless integration: a lightweight cloud that can be deployed at the customer's site, carrying all the data engines the customer might want to use. Customers such as Guangdong Mobile and Anhui Mobile are already using these products.
Our peers may have their flexibility, and we have our own differentiating features. For example, if you make it clear that you do not want to do private cloud, you give up a large part of the market, and many peers will fill the gap.
As for the point Li Feifei made, I understand there are two core reasons for us to do this. First, this is the high-end market; it is a benchmark, and the demand is plainly visible. Second, these business scenarios and workloads are very helpful in driving product R&D and in testing and continuously improving product stability and reliability. Given these two factors, we will pursue it unswervingly.
Reporter: As you said about “one-stop, full-link data management and services in the cloud native database 2.0 era”, this matches what we see: different databases are needed for different scenarios, and no single database can “conquer the world”. Yet the trends we see, such as cloud native plus distributed, hardware-software integration, lake and warehouse integration, and HTAP, are all forms of integration, driven by users' demand for simplification. For users, managing one database is very different in complexity and difficulty from managing four at once. The trend is that people want simplification, yet we still see dedicated databases for dedicated scenarios. Isn't that a little paradoxical?
**Wang Weimin:** Many of the problems we have to solve are multivariable. First, it is very challenging to consider all the variables, such as reliability, development, management, and operations, and how to prioritize them.
Second, since general-purpose products cannot meet every requirement, how do we use a “general-purpose + dedicated” combination while reducing complexity? We use technical means such as DAS and DMS. We hope DAS can use AI to take over the routine work of the traditional DBA, such as patch upgrades, daily inspections, parameter configuration and tuning, alerting, and backups. One particularly good trend is that we have accumulated an enormous body of best practices on database load, watermarks, slow SQL governance, and so on to work with. It is similar to deep learning: when Yann LeCun worked on it in the 1990s, it could only recognize postal codes for the US Postal Service. Today it has a very wide range of uses, mainly thanks to data and computing power.
Returning to the question, this contradiction might have been insurmountable in the past. Today we see an opportunity to resolve it with AI4DB. People can then focus on high-value tasks such as application DBA work, logical design, and data placement, while routine tasks are automated by programs, combining the two organically.
Reporter: The press release mentioned that Alibaba Cloud will fully enter the offline market. There, everyone is paying close attention to the financial industry, especially banks, and I have observed that distributed transformation is an inevitable trend for banks. Li Feifei also mentioned that at this cloud conference you are open sourcing PolarDB-X, a distributed database. Can I understand this as part of your combination of moves for entering the offline market?
**Wang Weimin:** Yes, this is one of the key actions. Domestic database players are full of enthusiasm, and there are a great many vendors. But many companies worry: how big are you? Are we going to end up contributing 50 to 60 percent of your revenue? They are really worried about the sustainability of the database company, which comes down to trust. In my opinion, open source has two aspects: first, it tackles open problems and challenges with an open mind and attracts more people to participate; second, it is a form of intellectual crowdfunding.
In addition, open source shows confidence and gives users a way to examine quality, which greatly reduces the difficulty of building trust.
Finally, of course, we also hope to build an ecosystem this way, including an application ecosystem and a talent ecosystem.
Reporter: So open source is just a strategy, not a necessary condition for database companies to stand out from the competition in the future?
**Wang Weimin:** Right, it is not a necessary condition. Many open source products are popular but not necessarily commercially successful. There are also many products that were never popular, could not make it, and had to go open source. Wanting to keep doing open source does not mean it will succeed, because too many open source projects never become commercially successful. This is true not only of databases but of other areas too: there was OpenOffice, and there was OpenSolaris. Even if we take open source seriously, there is a lot to learn, a lot to avoid, and a lot to do.
Reporter: Domestic databases are being deployed more and more, but mostly in peripheral business systems and relatively rarely in core ones. Many enterprises deploy domestic databases while still renewing Oracle, and some even run the two side by side to this day. Is this because customers worry that domestic databases cannot hold up, or because domestic databases really do have problems in high-end, core business?
**Wang Weimin:** Both factors are present, and they are not the only ones. Many customers are indeed trying to use more suppliers, including domestic database vendors, while continuing to use commercial database products. That is the current situation, and there are several reasons.
First, commercial databases have been through decades of development and long-term testing under real workloads from a massive user base (millions), so their enterprise features, stability, business model, and service are all proven. Whichever of these products is used, the user's business systems run efficiently.
There is still a gap between many domestic vendors and commercial databases in R&D investment, product maturity, product documentation, ecosystem, and talent. This is an objective fact, and we should face it squarely. Others have been at it for 50 years, while we have been at it for 10 with much less investment. It is not realistic for us to have surpassed them already; that would go against objective laws.
Second, many business scenarios do not require that kind of high-end product. This has two meanings. First, many products currently lag commercial databases in enterprise functions, features, completeness, performance, stability, and so on, but the gap is not insurmountable; it just takes time to close. Second, in many business scenarios, many functions and features of commercial databases are not used and do not need to be, as statistics show.
Reporter: My understanding is that the vast majority of companies using Oracle probably do not use even half of its functionality or half of its performance, and for many companies performance problems can be solved through architecture.
**Wang Weimin:** To a large extent, our products are adequate for many core business scenarios and typical workloads. However, in stability and some other respects we may still have a long way to go, and we still need to go through long-term, continuous testing under load from massive numbers of users.
Reporter: Can I put it this way: domestic databases are usable today, but still some distance from being good to use? I have read that the goals of domestic databases divide into technical goals and market goals. The first market goal is independent, controllable localization, and the second is going global. The technical goals come in four levels: others have it and we do not; others have it and we have it too; others have it and ours is better; others lack it and we have it. Which stage do you think the domestic database market as a whole has reached on the technical and market goals respectively?
**Wang Weimin:** From the perspective of technical goals and features, there may be a few features where “others lack it and we have it”. But for many universal, common requirements, we are at the catching-up and filling-in stage: “others have it and we have part of it”, or in a small number of cases “others have it and ours is better”. Vendors differ in how complete their lists are.
The market picture is even worse. Counting both the online and offline markets, Alibaba Cloud has reached first place in China, surpassing Oracle. From a global perspective, however, according to Gartner's 2020 statistics, Microsoft is first, Oracle is second, and Alibaba Cloud ranks seventh. In the To B, cloud, and database markets, there is still a big gap between us and the leading companies.
Oracle, for example, makes almost $10 billion in net profit in a quarter. Even if databases account for only 40% of that, it is probably billions of dollars excluding standard services, which is a huge number. The combined net profits of China's top 100 software companies may not be that large.
Reporter: What challenges do you face in helping customers move off Oracle, MySQL, Teradata, and so on? How do you usually help them?
**Wang Weimin:** The biggest challenge is the ecosystem. No customer's business system can be swapped out as a drop-in replacement. There may be applications on the northbound side, data integration alongside, and a whole series of operating system and hardware adaptations on the southbound side. When you want to replace something, many customers will say, “You must not only replace the product but also preserve the existing management, development, operations, applications, and processes.” So the ecosystem is a big challenge.
The second is talent. Enterprises have built up professional expertise in their people, and changing products requires a learning process. The learning curve should be gentle and the learning cost not too high, so we have a lot of work to do on compatibility. Compatibility is also a moving target: once you reach a certain point today, the product you are being compatible with moves on, and you have to keep going. In compatibility work, there is not much difference between 80 and 90, but getting from 90 to 95 is a huge amount of work, and getting from 95 to 100 is almost impossible. So we keep asking whether we should follow, what we should follow, what we should replace by other means, or whether to set compatibility aside altogether. These trade-offs are very difficult.
Reporter: That is all for today's interview. Thank you, Mr. Wang, for accepting the interview.
This article is original content from Alibaba Cloud and may not be reproduced without permission.