Why does a website’s technical architecture evolve?
I see two driving forces behind the evolution of a website’s technical architecture:
1. Internal drive: we want to do better in our current business and develop new business
2. External drive: growth in the number of users and diversification of user types
The two forces are not mutually exclusive; more often they act in parallel. I think Taobao is the result of both acting at once.
The reason for evolving is simple. But when should we evolve the technical architecture of a website, and how? To be honest, I have no experience in dealing with these problems, and in reality every enterprise faces different problems at each point in time. So it is difficult for me to summarize from experience when the right time to evolve is.
But I can approach the problem from another angle: look at the internal and external structure of the site and find out where those structures might go wrong. Once you know or can anticipate the problem points, you know how to evolve. Similarly, if you understand the structure of a PC, you know when to add memory and when to add a hard disk.
So let’s first look at the external structure of the site:
The external structure is composed of the following parts:
U: the user group. How does our site evolve as our user base changes? When analyzing user groups, I can think of the following dimensions: number, type, and geographic location (region).
N: the network environment. The network environment is different in every region, which is why we need a CDN. How do we evolve our site when we expect users in every region to have a good experience?
S: security. How safe do we want to be? This depends on the current stage of the site and on the nature of the site.
C: our website itself, which is covered by the internal structure below.
Internal structure of the site:
Composition of internal structure:
A: Application services.
D: Data service
To sum up, these components provide a baseline for thinking about whether or how the site should evolve.
So why don’t we just design the site “big” from the start? “Do not attempt to design a large website,” Li wrote in an afterword. “The reason is that the Internet operates according to its own laws, and the short history of the Internet has repeatedly proved that such attempts do not work.” “Large websites are not designed, they evolve,” he said. One caution about this last statement: “not designed” does not mean “no design is needed”.
As for “large website design”, my personal view is this: now that we have the “cloud” and computing capacity can be bought, can we start designing for a large website from the beginning, as long as our design adapts to the “cloud”?
What are the problems encountered in the process of evolution
– The initial stage
Start with a small website. One server is enough.
– Data services are separated from application services
More users mean more data, eventually more than a single server can handle. We separate the data service from the application service: the application server gets a better CPU and more memory, and the data server gets better and bigger hard drives.
– Use cache
Because 80 percent of business access is focused on 20 percent of data, if we can cache that data, performance improves immediately. There are two types of caches: local caches and remote distributed caches. Which one do you use? Or both? I don’t know yet.
Here’s a question: What data should be cached? There should be some rules.
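As a rough illustration of the local-cache case, here is a minimal cache-aside sketch with a time-to-live. The class name `TTLCache`, the `load_product` function, and the 60-second expiry are assumptions made up for this example, not something from a specific library. A common rule of thumb, offered here only as a starting point, is to cache data that is read often and changed rarely.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """A tiny in-process cache with per-entry expiry (cache-aside pattern)."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        # key -> (expiry time, cached value)
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get_or_load(self, key: str, loader: Callable[[str], Any]) -> Any:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # cache hit: the database is not touched
        value = loader(key)  # miss or expired: fall through to the database
        self._entries[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: str) -> None:
        """Drop an entry, for example right after the underlying row changes."""
        self._entries.pop(key, None)


# Hypothetical stand-in for a database query; it records every call it gets.
db_calls = []


def load_product(key: str) -> dict:
    db_calls.append(key)
    return {"id": key, "name": f"product-{key}"}


cache = TTLCache(ttl_seconds=60)
first = cache.get_or_load("42", load_product)   # goes to the "database"
second = cache.get_or_load("42", load_product)  # served from the cache
```

A remote distributed cache follows the same read path; the dictionary is replaced by a network call to the cache cluster, which trades some latency for a cache that all application servers share.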
– Use a server cluster
When a single application server reaches its maximum capacity, it becomes a bottleneck. You can buy more powerful hardware, but there is always a ceiling. At this point we need a cluster of servers, and with it something new: a load-balancing scheduling server.
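The simplest scheduling policy such a load balancer can apply is round-robin. The sketch below only shows the selection logic; the server names are placeholders, and a real load balancer would also need health checks and connection handling.

```python
import itertools
from typing import List


class RoundRobinBalancer:
    """Hands out application servers in a fixed rotating order."""

    def __init__(self, servers: List[str]):
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(list(servers))

    def pick(self) -> str:
        return next(self._cycle)


# Hypothetical server names, for illustration only.
balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
picks = [balancer.pick() for _ in range(5)]
# picks == ["app-1", "app-2", "app-3", "app-1", "app-2"]
```

Because consecutive requests from the same user land on different servers, any state kept in one server’s memory is not visible to the next request.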
However, there is one issue to consider when using server clusters: Session management. Session management can be done in the following ways:
Session sticky: the load balancer always sends a given user to the same server. It is like wanting to use our own dishes every time we eat out: that works if we keep them at one restaurant and always eat there.
The problem with this approach:
1. When a server restarts, all sessions on it are lost
2. The load balancer becomes a stateful machine, which makes disaster recovery (DR) difficult to implement
Session replication: like keeping a copy of our own dishes at every restaurant. Suitable when there are only a few machines, not for large-scale clusters
Problems with this scheme:
1. Synchronizing sessions consumes bandwidth between the application servers
2. With a large number of online users, the sessions occupy too much memory, because every server holds every session
Cookie-based: like bringing your own dishes and chopsticks every time you eat out
Problems with this scheme:
1. Cookie length limit
2. Security: the session data is held by the client
3. External bandwidth consumption of the data center
4. Performance impact: the server has more data to process on every request
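To make the cookie-based scheme and its security concern more concrete, here is a minimal sketch that stores the session in the cookie value and signs it with HMAC so the server can detect tampering. The function names and `SECRET_KEY` are assumptions for this example. The signature only protects integrity; the payload is still readable by the client, so nothing secret should go in it.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional

# Assumption: a server-side secret shared by all application servers.
SECRET_KEY = b"replace-with-a-real-secret"


def encode_session(data: dict) -> str:
    """Serialize the session and append an HMAC-SHA256 signature."""
    payload = base64.urlsafe_b64encode(
        json.dumps(data, sort_keys=True).encode("utf-8"))
    signature = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode("ascii") + "." + signature


def decode_session(cookie: str) -> Optional[dict]:
    """Return the session data, or None if the cookie was altered."""
    payload, sep, signature = cookie.rpartition(".")
    if not sep:
        return None
    expected = hmac.new(SECRET_KEY, payload.encode("utf-8"),
                        hashlib.sha256).hexdigest()
    # Constant-time comparison, to avoid leaking how many characters matched.
    if not hmac.compare_digest(signature.encode("utf-8"),
                               expected.encode("utf-8")):
        return None
    return json.loads(base64.urlsafe_b64decode(payload))


cookie = encode_session({"user_id": 7, "cart": [101, 102]})
session = decode_session(cookie)
```

Every request carries this whole string, which is where the length limit (browsers typically cap a single cookie at around 4 KB) and the bandwidth cost in the list above come from.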
Session server: sessions are stored on a dedicated server, which can itself be clustered. This mode suits a large number of sessions and web servers
Considerations for such a scheme are:
1. Ensure the availability of the session server
2. We need to make adjustments when we write the application. I don’t know if the application server can make this part of the logic transparent
– Database read/write separation
A portion of the reads (uncached data, expired cache entries) and all writes still have to go through the database. When the number of users reaches a certain level, the database becomes the bottleneck. Here we use the hot standby (replication) function provided by the database to direct read operations to the standby server. Note: read/write separation addresses read load, not write load.
Because database reads and writes are separated, our application has to change accordingly. We implement a data access module so that the people writing upper-level code do not need to know about read/write separation. One open question for me: if I use an ORM, how do I implement read/write separation?
Database read/write separation may encounter the following problems:
- Data replication problem: consider replication delay, what the database supports, and the conditions under which replication works. Don’t forget that across multiple server rooms this is even more of a problem.
- The problem of routing the application to the right data source
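The routing part of such a data access module can be sketched as follows. This is an illustration under assumptions, not a definitive design: `ReadWriteRouter` and the data source names are made up, and a real module would return connections rather than names.

```python
import itertools
from typing import List


class ReadWriteRouter:
    """Chooses a data source name for each SQL statement."""

    def __init__(self, primary: str, replicas: List[str]):
        self._primary = primary
        # With no replica configured, every statement goes to the primary.
        self._replicas = itertools.cycle(replicas or [primary])

    def route(self, sql: str, in_transaction: bool = False) -> str:
        parts = sql.split()
        verb = parts[0].upper() if parts else ""
        # Writes, and anything inside a transaction, must see the latest
        # data, so they always go to the primary.
        if in_transaction or verb != "SELECT":
            return self._primary
        return next(self._replicas)


# Hypothetical data source names.
router = ReadWriteRouter("primary", ["replica-1", "replica-2"])
targets = [
    router.route("SELECT * FROM users WHERE id = 1"),
    router.route("select name from users"),
    router.route("UPDATE users SET name = 'a' WHERE id = 1"),
    router.route("SELECT * FROM users WHERE id = 1", in_transaction=True),
]
# targets == ["replica-1", "replica-2", "primary", "primary"]
```

Because of replication delay, a read that must see a value written a moment ago should also be sent to the primary; the `in_transaction` flag is one simple way for the caller to ask for that.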
– Use a reverse proxy and a CDN to speed up website response
A CDN addresses the difference in access speed between regions. A reverse proxy caches the resources users request inside our own server room.
– Use a distributed file system
– A dedicated database for each business: data is split vertically
This relieves some of the write pressure.
Problems encountered when splitting a database vertically:
- Cross-business transactions
- Too many configuration items in the application
There are several approaches to the transaction problem:
- Use distributed transactions
- Remove transactions, or do not insist on strong transactions
– A table’s data volume or update rate hits the limit of a single database: horizontal data split
Split the data of the same table across two databases.
Problems encountered in horizontal data splitting:
- SQL routing: we need to know which database a given user’s data is on.
- Primary keys need a different generation strategy, because an auto-increment key in one database is no longer unique across databases.
- Query performance, for example paging across databases
- Use search engines: solves data query problems
- Some scenarios can use NoSQL to improve performance
- Develop a unified data access module: solves the data source problem for upper-level application development
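One simple way to handle both SQL routing and primary keys, sketched here under assumptions, is to let each database issue ids from its own arithmetic sequence, so that `id % shard_count` recovers the owning database. `ShardedIds`, `shard_for`, and the database names are hypothetical, and this is only one of several possible key strategies.

```python
from typing import List


class ShardedIds:
    """Shard k issues ids k, k + n, k + 2n, ... where n is the shard count."""

    def __init__(self, shard_count: int):
        self._shard_count = shard_count
        self._next = list(range(shard_count))  # shard k starts at id k

    def next_id(self, shard_index: int) -> int:
        new_id = self._next[shard_index]
        self._next[shard_index] += self._shard_count
        return new_id


def shard_for(user_id: int, shards: List[str]) -> str:
    """Route a user id to the database that owns it."""
    return shards[user_id % len(shards)]


# Hypothetical database names.
SHARDS = ["user_db_0", "user_db_1"]
ids = ShardedIds(len(SHARDS))
a = ids.next_id(0)  # 0
b = ids.next_id(1)  # 1
c = ids.next_id(0)  # 2
owners = [shard_for(i, SHARDS) for i in (a, b, c)]
# owners == ["user_db_0", "user_db_1", "user_db_0"]
```

The weakness of modulo routing is that changing the number of databases changes the owner of most ids, so existing rows have to be moved.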
– Service splitting and application splitting
As websites become more and more complex, it becomes impractical to build a single large application that does everything, and such an application is also hard to manage. However, it is difficult to find a general model for splitting up the business, because this mixes enterprise management issues with technical issues and depends on the specific situation of each enterprise.
How to implement SOA is a big topic that is beyond the scope of this article.
I took a screenshot from Cheng Li’s 2008 talk to illustrate what a post-SOA architecture might look like:
– Non-functional issues
– Security and monitoring problems
– Release issue: A new architecture means a new release method
– The server room (data center)
– Organizational structure changes
Changes in our technical architecture will inevitably lead to changes in our organizational architecture, and vice versa.
It may seem that this part is not our responsibility, but I think technical staff should also take part in designing the organizational structure. For example, an organizational structure has to deal with performance evaluation, and performance rules sometimes resemble the laws of a country. What happens if a country’s laws are not sound? You know.
We also have to consider the cost of learning the new architecture.
I am currently reading books on this topic, but I do not yet have a systematic understanding of it.
Conclusion:
– About the order of evolution
In reality, the evolution of a technical architecture does not necessarily follow the order outlined above from beginning to end; it depends on the circumstances.
– About traditional evolution versus modern evolution in a cloud environment
Unfortunately, only Li Zhihui talked about the cloud, and he only touched on it briefly: “More and more websites are now built, from the very beginning, on cloud computing services provided by large websites. All the resources they need (computing, storage, network) can be bought on demand and scaled linearly, so they no longer need to piece together resources bit by bit and combine a variety of technical solutions to gradually improve their own architecture.”
I have not worked with the “cloud” long enough to say how the evolution of a cloud architecture differs from that of a traditional, non-cloud architecture.
When it comes to traditional architectural evolution, my own conclusions and reflections are as follows:
There are two main dimensions to consider when adjusting the architecture of a website: data services and application services. During the adjustment, we need to identify which point is the bottleneck and which point has the highest priority for optimization. The most important point: although we are technical people, we should also learn the business, so that we can tell business problems from technical problems and apply the right remedy. Some problems are solved more effectively with a business approach than with a technical one. 12306’s practice of selling tickets in separate time slots is a case in point.
If anything in the above summary and reflections is wrong, corrections are welcome. Thank you very much. You are also welcome to visit my home page.