Hi, everyone. I am honored to have this opportunity to share, in a blog post, the experience and design ideas I have accumulated in back-end development.
Modular design
Split services into independent modules according to business scenarios, and have each module expose its capabilities through interfaces. This reduces system complexity and coupling, and makes the system reusable, easy to maintain, and easy to extend.
Practical examples in the project:
Before:
The return purchase APP has a “My Red Envelope” feature. The user’s red envelope data comes from several businesses, for example: a 100 yuan red envelope for inviting new users to register, doubled red envelopes for promotional activities, and red envelopes from other activities. Each activity implemented its own set of red envelope rules and its own reward mechanism, so red envelopes could not be managed centrally, could not be reused, and were difficult to maintain and extend.
After:
- Rebuild the red envelope service
- Red envelopes can be managed from the back office
- Red envelope information management: add, edit, configure usage rules, and manage users’ red envelopes
- Red envelope rewards are handled in one place
- Application services that integrate with it only need to focus on sending red envelopes to users
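The refactored shape can be sketched as follows. This is only a minimal illustration under assumptions: the names `RedEnvelopeRule`, `RedEnvelopeService`, `configure`, and `send`, and every field, are hypothetical and do not come from the project, and an in-memory list stands in for the real storage.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RedEnvelopeRule:
    """A rule maintained in the back office (all field names are hypothetical)."""
    code: str          # identifies the business scenario, e.g. "INVITE_REGISTER"
    amount_yuan: int   # face value of the red envelope
    valid_days: int    # number of days the envelope stays usable


class RedEnvelopeService:
    """Single entry point that every activity calls to reward a user."""

    def __init__(self):
        self._rules = {}          # rule code -> RedEnvelopeRule
        self.user_envelopes = []  # stand-in for a user red envelope table

    def configure(self, rule):
        """Back-office operation: add a rule or overwrite an existing one."""
        self._rules[rule.code] = rule

    def send(self, user_id, rule_code, today):
        """Grant an envelope; amount and validity come from the managed rule."""
        rule = self._rules[rule_code]  # raises KeyError for an unknown rule
        record = {
            "user_id": user_id,
            "rule_code": rule.code,
            "amount_yuan": rule.amount_yuan,
            "expires_on": today + timedelta(days=rule.valid_days),
        }
        self.user_envelopes.append(record)
        return record


service = RedEnvelopeService()
service.configure(RedEnvelopeRule("INVITE_REGISTER", amount_yuan=100, valid_days=30))
envelope = service.send(user_id=7, rule_code="INVITE_REGISTER", today=date(2024, 1, 1))
```

The point of the design is that an activity never decides the amount or the validity period itself; it only names the rule and the user.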
Design overview diagram: Before vs. After
Product requirements sometimes do not take this aspect into account. During requirement discussions, combine the scenario with future development needs and propose a modular design; this can also assist the product design.
Extracting common services
During project development we often encounter similar functions that different developers have each implemented in their own way, or that were developed again because the existing implementation could not be reused. Iteration after iteration, this produces several implementations of the same function. We need to extract such capabilities into independent services so that they can be reused and extended over time, which saves subsequent development cost, improves development efficiency, and makes them easier to maintain and extend.
Practical examples in the project:
Before
In our business, users often need to be notified, for example through scheduled SMS notifications, APP push messages, WeChat notifications, and so on.
When a developer builds a notification feature without considering later development, they simply integrate the third-party notification platform and wrap the call in a simple notification method. When a similar requirement comes up later, another developer finds that this method does not meet the current need and integrates the third-party platform again to wrap another notification method. Or a later requirement adds scheduled notifications, and a developer implements scheduling for one business only, so other businesses cannot use it. Nobody extracts the function, and over time it turns into repeated development that is hard to maintain and extend.
After
When we meet a requirement that could be extracted into a common service, we confirm with the product team whether similar needs will arise in the future, and then propose extracting that part of the requirement into a common service so that it is easier to maintain and extend later.
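One way to picture such a common notification service is a single entry point with pluggable channels. The sketch below is an assumption, not the project's implementation: `NotificationService`, `register`, and `notify` are hypothetical names, and the senders just append to a list instead of calling a real SMS, push, or WeChat platform.

```python
class NotificationService:
    """Common service: businesses call notify() and never touch a vendor SDK."""

    def __init__(self):
        self._channels = {}  # channel name -> callable(recipient, message)

    def register(self, channel, sender):
        """Plug in a channel once; every business can then reuse it."""
        self._channels[channel] = sender

    def notify(self, channel, recipient, message):
        if channel not in self._channels:
            raise ValueError(f"unknown channel: {channel}")
        self._channels[channel](recipient, message)


# Fake senders that record what would have been sent (stand-ins for real platforms).
outbox = []
notifier = NotificationService()
notifier.register("sms", lambda to, msg: outbox.append(("sms", to, msg)))
notifier.register("push", lambda to, msg: outbox.append(("push", to, msg)))

notifier.notify("sms", "13800000000", "Your red envelope has arrived")
notifier.notify("push", "device-42", "Check-in reminder")
```

Adding a new channel or a scheduling layer then changes this one service rather than every business that sends notifications.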
Design overview diagram: Before vs. After
Architecting standalone services
Some requirements that arise during project development are unrelated to the project’s core business, such as collecting user behaviors and habits, collecting commodity exposure and click data, collecting data for BI to produce statistical reports, and promoting public business (Grapefruit Street and return to public). For such needs we weigh the application scenario, the independence of the service, and future extension needs, and build an independent project to maintain them. It is deployed independently, in distributed mode, on its own servers, so it does not consume the resources of the existing primary service servers.
Practical examples in the project:
Architecting a standalone user behavior tracking service. The request volume for this service was estimated before development, and a relatively high number of concurrent requests was expected.
Architecture scheme:
- Node.js is used as the server for the project
  - Single-threaded, event-driven, with non-blocking I/O, which suits handling many concurrent I/O-bound requests
  - Load balancing: cluster module / PM2
- Architecture of the standalone Node.js service
  - Provide a service interface to the client
    - The interface does not operate on the DB directly, which keeps it stable under concurrent load
  - Data is imported into the database asynchronously
    - A program moves the data: message queue => MySQL
  - nodejs+express+redis(list)/mq+mysql
Service architecture diagram for user behavior tracking services
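The write path of this scheme (the interface only enqueues, and a separate program moves data into the database) can be sketched as below. The service itself is built on Node.js; Python is used here only for illustration. As assumptions for the sketch, a `collections.deque` stands in for the Redis list or MQ, an in-memory SQLite database stands in for MySQL, and the names `track`, `flush`, and `user_behavior` are hypothetical.

```python
import json
import sqlite3
from collections import deque

event_queue = deque()  # stand-in for a Redis list or an MQ topic

conn = sqlite3.connect(":memory:")  # stand-in for MySQL
conn.execute("CREATE TABLE user_behavior (user_id INTEGER, action TEXT)")


def track(event):
    """Interface handler: enqueue the event and return; never touch the DB."""
    event_queue.append(json.dumps(event))
    return {"ok": True}


def flush(conn, batch_size=100):
    """Worker: drain up to batch_size events and bulk-insert them."""
    rows = []
    while event_queue and len(rows) < batch_size:
        event = json.loads(event_queue.popleft())
        rows.append((event["user_id"], event["action"]))
    if rows:
        conn.executemany(
            "INSERT INTO user_behavior (user_id, action) VALUES (?, ?)", rows
        )
        conn.commit()
    return len(rows)


for action in ("view", "click", "share"):
    track({"user_id": 1, "action": action})
first_batch = flush(conn, batch_size=2)
```

Batching in the worker keeps the number of database writes independent of the request rate on the interface.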
High concurrency optimization
Besides scaling servers vertically and horizontally, back-end developers can optimize for high concurrency in the following ways to keep services stable under load, avoiding the losses caused by a stalled service and the bad experience it brings to users.
Cache:
- Server-side cache
  - In-memory database
    - redis
    - memcache
  - Approach
    - Cache first
      - The DB penetration problem
    - Read-only cache
      - Update the entry, or delete it when it becomes invalid
  - Notes
    - The memory allocated to an in-memory database is limited, so plan its use properly; abusing it eventually exhausts the memory space
    - Set an expiration time on cached data so that invalid or unused data expires automatically
    - Keep cached data compact: do not put unused fields into the cache
    - Deploy cache servers in distributed mode, split by service
- Client cache
  - Approach
    - The client requests the data interface, caches the data together with its version number, and sends the cached version number with each request
    - The server compares the reported version number with the current data version
    - If the version numbers match, the data list is not returned; if they differ, the latest data and the latest version number are returned
  - Scenario:
    - Data that is updated infrequently
Server cache architecture diagram
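The "cache first" approach with expiration and delete-on-update can be sketched as follows. This is a minimal illustration under assumptions: `TTLCache` is an in-process stand-in for Redis or Memcached, a plain dict stands in for the database, and caching a short-lived "missing" marker is just one common way to soften the DB penetration problem, not necessarily what the project does.

```python
import time

_MISSING = object()  # marker cached for ids that do not exist in the DB


class TTLCache:
    """Tiny in-process stand-in for Redis/Memcached with per-key expiry."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        item = self._data.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self._clock() >= expires_at:
            del self._data[key]  # expired entries are dropped lazily
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, self._clock() + ttl_seconds)

    def delete(self, key):
        self._data.pop(key, None)


db = {1: {"name": "coupon A"}}  # stand-in for the database
stats = {"db_reads": 0}


def load_item(cache, item_id, ttl_seconds=60, miss_ttl_seconds=5):
    """Read through the cache; only fall back to the DB on a cache miss."""
    key = f"item:{item_id}"
    cached = cache.get(key)
    if cached is _MISSING:
        return None
    if cached is not None:
        return cached
    stats["db_reads"] += 1
    row = db.get(item_id)
    if row is None:
        cache.set(key, _MISSING, miss_ttl_seconds)  # absorb repeated misses
    else:
        cache.set(key, row, ttl_seconds)
    return row


def update_item(cache, item_id, row):
    """Write to the DB, then delete the cache entry so the next read reloads."""
    db[item_id] = row
    cache.delete(f"item:{item_id}")


cache = TTLCache()
```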
Asynchronous
Asynchronous programming
- Method:
  - Multithreaded programming
  - Node.js asynchronous programming
- Scenario:
  - Sending an SMS notification after successful participation
  - Other auxiliary operations that the main business logic does not depend on and that can be processed asynchronously
Asynchronous service processing
- Approach
  - The business interface pushes the data reported by the client to a message queue (MQ middleware) and immediately returns a result to the user
  - A separate program subscribes to the message queue and processes the business asynchronously
- Scenario:
  - Grabbing limited red envelopes on the hour
    - After successful participation, show a gentle notice: the red envelope is expected to be issued in X days
  - Services with a large amount of concurrency for which no better optimization is available and asynchronous processing is acceptable
- Note:
  - Monitor and control the progress of queue consumption
  - Ensure idempotency and eventual consistency of the data
- Drawback:
  - Sacrifices some user experience
Architecture diagram of Service asynchronous processing
Besides high-concurrency services, this business asynchronous processing architecture is also used in the design of the common services described above.
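The idempotency requirement noted above can be sketched with a consumer that ignores a message it has already handled. Everything here is an assumption for illustration: `queue.Queue` stands in for the MQ middleware, a set stands in for a table with a unique key on the request ID, and the names `submit`, `consume_all`, and `request_id` are hypothetical.

```python
import queue

mq = queue.Queue()     # stand-in for MQ middleware
processed_ids = set()  # stand-in for a table with a unique key on request_id
balances = {}          # user_id -> red envelope amount received


def submit(request_id, user_id, amount):
    """Business interface: enqueue and answer the user right away."""
    mq.put({"request_id": request_id, "user_id": user_id, "amount": amount})
    return {"status": "accepted"}


def consume_all():
    """Subscriber: process queued messages, skipping duplicate deliveries."""
    handled = 0
    while True:
        try:
            msg = mq.get_nowait()
        except queue.Empty:
            break
        if msg["request_id"] in processed_ids:
            continue  # already applied once; applying again would double-pay
        processed_ids.add(msg["request_id"])
        balances[msg["user_id"]] = balances.get(msg["user_id"], 0) + msg["amount"]
        handled += 1
    return handled


submit("req-1", user_id=7, amount=5)
submit("req-1", user_id=7, amount=5)  # simulated retry or duplicate delivery
submit("req-2", user_id=8, amount=5)
handled = consume_all()
```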
Rate limiting
In flash-sale (seckill) activities, limiting the number of requests avoids overselling, over-claiming, and other high-concurrency problems. Controlling traffic on the front end spreads requests out over time and reduces concurrency.
- Rate limiting on the server
  - Redis counter
    - For example: flash-sale activities
- Client-side flow control
  - Gate participation behind an activity game
    - Red envelope rain, mini games, and so on
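A counter-based limiter can be sketched as a fixed-window counter. This is an in-memory illustration under assumptions, not Redis code: incrementing the count plays the role of Redis `INCR`, and resetting it when the window changes plays the role of a key expiring. The class name `FixedWindowLimiter` and its parameters are hypothetical, and unlike an atomic Redis counter this sketch is not safe across processes.

```python
import time


class FixedWindowLimiter:
    """Allow at most `limit` requests per key in each time window."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self._limit = limit
        self._window_seconds = window_seconds
        self._clock = clock
        self._counters = {}  # key -> (window index, count)

    def allow(self, key):
        window = int(self._clock() // self._window_seconds)
        last_window, count = self._counters.get(key, (window, 0))
        if last_window != window:
            count = 0  # the previous window is over, like an expired key
        count += 1     # corresponds to incrementing the counter
        self._counters[key] = (window, count)
        return count <= self._limit


# A controllable clock makes the behavior easy to follow.
now = [0.0]
limiter = FixedWindowLimiter(limit=2, window_seconds=1, clock=lambda: now[0])
same_window = [limiter.allow("user:1") for _ in range(3)]
now[0] = 1.5
next_window = limiter.allow("user:1")
```

Requests over the limit are rejected before they reach the inventory or reward logic, which is what keeps a flash sale from overselling under load.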
Service degradation
When server resource consumption reaches a certain level, we have to sacrifice the pawn to save the rook: give up non-core functions so that the core business keeps running normally. Service degradation is the last resort; it avoids the losses of a service outage caused by server downtime and the bad experience that brings to users.
- Business degradation
  - From a complex service to a simple service
  - From dynamic interaction to static pages
- Fall back to the CDN
  - Pull JSON data prepared in advance from the CDN
  - Redirect to a static page on the CDN
- Stop the service
  - Stop non-core business and show a gentle notice
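The "dynamic to static" fallback can be sketched as a small switch. The function name `get_home_feed`, the `load_ratio` input, and the 0.8 threshold are all assumptions for illustration; in practice the static data would be the JSON prepared in advance on the CDN.

```python
def get_home_feed(load_ratio, build_dynamic_feed, static_fallback, threshold=0.8):
    """Serve dynamic data normally; degrade to prepared static data under load."""
    if load_ratio >= threshold:
        # Resources are nearly exhausted: skip the expensive dynamic path.
        return {"degraded": True, "data": static_fallback}
    try:
        return {"degraded": False, "data": build_dynamic_feed()}
    except Exception:
        # The dynamic path failed: still answer with the static data.
        return {"degraded": True, "data": static_fallback}


prepared = ["static item 1", "static item 2"]  # stand-in for CDN JSON
normal = get_home_feed(0.3, lambda: ["live item"], prepared)
overloaded = get_home_feed(0.95, lambda: ["live item"], prepared)
```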
Summary diagram of high concurrency optimization
Anti-brushing / anti-wool-party protection
In most companies, product designers and programmers have little awareness of brushing (bulk abuse of promotional activities) when working on promotional business. Activities are designed and developed without anti-brushing measures, which leaves plenty of loopholes for people who like to brush activities. By the time you discover that you have been brushed, the loss is already large, ranging from tens of thousands to hundreds of thousands.
Lured by the profit, a new occupation has emerged: the professional brusher. They make a living by brushing Internet activities, keeping N mobile phones + N mobile phone numbers + N WeChat accounts. Bonus money obtained this way can be withdrawn, and products obtained this way can be resold at a low price, which has opened up a new grey industry chain.
We have to take up arms (code) in self-defense: apply risk control and raise the threshold, using verification and restrictions to reduce the likelihood of abuse and to limit the loss when it does occur.
Common techniques are listed here (apply them according to the specific business scenario):
Verify request validity
- Check the validity of request parameters
- Check request headers
  - user-agent
  - referer
  - ...
- Signature verification
  - Sign the request parameters
- Device restrictions
- IP restrictions
- Validate the WeChat UnionID/OpenID
- CAPTCHA or SMS verification code
  - Sacrifices user experience
- Filter through a self-built blacklist system
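Signing the request parameters can be sketched with an HMAC over a canonical parameter string. The scheme below (sorted `key=value` pairs, HMAC-SHA256, a `timestamp` parameter checked against a maximum age) is an assumed example, not the signature rule used in the project, and the secret shown is a placeholder.

```python
import hashlib
import hmac

SECRET = b"demo-secret"  # placeholder; a real secret must not live in source code


def sign(params, secret=SECRET):
    """Build a canonical string from sorted parameters and HMAC it."""
    canonical = "&".join(f"{key}={params[key]}" for key in sorted(params))
    return hmac.new(secret, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(params, signature, now, max_age_seconds=300, secret=SECRET):
    """Reject stale or tampered requests."""
    try:
        sent_at = float(params["timestamp"])
    except (KeyError, TypeError, ValueError):
        return False
    if abs(now - sent_at) > max_age_seconds:
        return False  # too old or from the future: limits replay of a captured request
    expected = sign(params, secret)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


request_params = {"user_id": "7", "activity": "signin", "timestamp": "1000"}
request_signature = sign(request_params)
```

The server recomputes the signature from the parameters it received, so changing any parameter without knowing the secret makes verification fail.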
Business risk control
- Limit the number of participations per device/WeChat account
- Limit the maximum number of rewards
- Limit the prize pool
- Design further limits based on the specific business scenario...
Roles to defend against
- Ordinary users
- Technical users
- Professional brushers
  - There is no good way to restrict them
Anti-brushing / anti-wool-party overview diagram
Additional note
- In an APP/H5 setup, the signature rules should be implemented in the native client and exposed through an extended API that the front-end JS can call. When an H5 page initiates an interface request, it calls the client’s signature API. This avoids building the signature rules in front-end JS, where they can be read and cracked.
Concurrency issues
Duplicate operations
- Scenario:
When the same user triggers several clicks, or simulates concurrent requests, the operation may be performed more than once. Take a check-in function: a user may check in only once a day and earns 1 point, but under concurrency the user can earn more points.
- Analysis:
The simplified check-in logic looks like this:
Query whether a check-in record exists -> No -> add today’s check-in record -> add user points -> check-in succeeds
Query whether a check-in record exists -> Yes -> already checked in today
Suppose user A sends two check-in requests concurrently. Both enter [query whether a check-in record exists] at the same time and both get "No", so two check-in records are added and the points are added twice.
- Solution:
The simplest and most ideal solution is to add a composite unique index on [check-in date] + [user ID] to the check-in record table. Under concurrency only one insert can succeed; the others fail on the unique constraint, and points are added only after a successful insert.
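The unique-index solution can be sketched with an in-memory SQLite database standing in for the real one. The table and column names (`checkin`, `user_points`, `checkin_date`) and the function `check_in` are hypothetical; the idea carries over to any database that enforces unique indexes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE checkin (user_id INTEGER NOT NULL, checkin_date TEXT NOT NULL);
CREATE UNIQUE INDEX uk_checkin_user_date ON checkin (user_id, checkin_date);
CREATE TABLE user_points (user_id INTEGER PRIMARY KEY, points INTEGER NOT NULL);
INSERT INTO user_points VALUES (1, 0);
""")


def check_in(conn, user_id, day):
    """Insert first; the unique index decides who wins. No SELECT beforehand."""
    try:
        with conn:  # one transaction: commit on success, roll back on error
            conn.execute(
                "INSERT INTO checkin (user_id, checkin_date) VALUES (?, ?)",
                (user_id, day),
            )
            conn.execute(
                "UPDATE user_points SET points = points + 1 WHERE user_id = ?",
                (user_id,),
            )
        return True
    except sqlite3.IntegrityError:
        return False  # a record for this user and day already exists


first = check_in(conn, 1, "2024-01-01")
second = check_in(conn, 1, "2024-01-01")  # duplicate request for the same day
```

Because the points update sits in the same transaction as the insert, a rejected duplicate can never add points.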
Negative inventory
- Scenario:
When multiple users click concurrently to take part in an activity such as a lottery and only one prize is left in stock, in theory only one user can win it. Under concurrency, however, several users often win at once, so more prizes are paid out than were stocked and the cost of the activity increases.
- Analysis:
The problematic logic generally looks like this:
Win -> query prize inventory -> in stock -> update prize inventory -> add winning record -> tell the user they won
Win -> query prize inventory -> out of stock -> tell the user they did not win
Suppose prize A has one item left in stock, and users A, B, and C take part at the same time and all draw prize A. Each query sees one item in stock, so each request updates the inventory and adds a winning record, and all three users win.
- Solution:
Ideally, do not SELECT the prize inventory to check stock at all. Just run UPDATE inventory = inventory - 1 WHERE inventory >= 1. If the UPDATE succeeds (affects a row), there was stock and the follow-up operations can proceed. Under concurrency the UPDATE succeeds for only one user.
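The conditional UPDATE can be sketched as follows, again with in-memory SQLite as a stand-in. The names `prize`, `stock`, `winning_record`, and `try_win` are hypothetical. The calls here run one after another, which is enough to show that the affected-row count, not a prior SELECT, decides who wins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE prize (id INTEGER PRIMARY KEY, stock INTEGER NOT NULL);
CREATE TABLE winning_record (user_id INTEGER NOT NULL, prize_id INTEGER NOT NULL);
INSERT INTO prize VALUES (1, 1);
""")


def try_win(conn, user_id, prize_id):
    """Decrement stock only if some is left; the row count tells us who won."""
    with conn:  # one transaction for the decrement and the winning record
        cursor = conn.execute(
            "UPDATE prize SET stock = stock - 1 WHERE id = ? AND stock >= 1",
            (prize_id,),
        )
        if cursor.rowcount != 1:
            return False  # no row updated: the prize is out of stock
        conn.execute(
            "INSERT INTO winning_record (user_id, prize_id) VALUES (?, ?)",
            (user_id, prize_id),
        )
        return True


results = [try_win(conn, user_id, 1) for user_id in (1, 2, 3)]
```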
Conclusion:
When developing business interfaces, take both same-user and multi-user concurrency scenarios into account to avoid data anomalies under concurrency that lead to high costs. The following tools can be used to simulate concurrent requests for testing:
- Apache JMeter
- Charles Advanced Repeat
- Visual Studio performance and load testing
Data collection techniques (extra)
Common approach
- Find the platform’s data interface
- Simulate the interface request
- Parse and filter the data
- Structure the data and store it in the database
Using the selenium + headless automated testing framework
- Development
  - Python is recommended
    - python+selenium+headless
    - Control the request frequency to avoid being restricted by the platform
    - Use proxy IPs to get around per-IP request limits
- Advantages
  - No need to simulate interface requests
    - Useful when the data interface request cannot be simulated (encrypted signatures, etc.)
    - Useful when the interface version changes frequently (each change would require re-investigation)
  - When the platform interface or page version changes, the collector can be adjusted quickly
    - Just adjust the locators (class/ID) of the HTML elements from which the data is collected
  - It can perform user operations: select, click, simulate login, and so on
    - It can log in again after the login expires
    - The login QR code can be sent to DingTalk to be scanned
- Application scenarios:
  - Collecting competing-product data
  - Monitoring Taobao commodity prices against the prices in the self-built commodity database back office
  - Monitoring Taobao voucher amounts against the voucher amounts in the self-built commodity database back office
  - ...
Countering anti-crawler measures
During data collection, some platforms apply anti-crawler strategies to important data requests, both to keep competitors from mining and exploiting the data and to keep crawlers from consuming so many resources that the server goes down. Crawling and anti-crawling is a contest between technologies, and this war without gunfire will never stop. (Why should programmers make things hard for programmers?)
Anti-crawler measures fall into the following two categories
- Server-side restrictions
  - The server restricts requests to prevent crawlers from requesting data
- Front-end restrictions
  - The front end obfuscates key data with CSS and HTML tags to prevent crawlers from acquiring it easily
Cracking server restrictions:
- Set simulated request headers
  - Referer
  - User-Agent
  - Authorization
  - ...
- Crack the signature
  - Signature rule
    - Find the signature rule in the JS
- Control the request rate
  - Adjust request timing and delay requests
- Proxy IPs
  - Switch the proxy IP address used for requests, self-built or third-party
- Login restrictions
  - Send the Cookie/Authorization obtained after a successful login
- CAPTCHA restrictions
  - Image recognition, using a library or a third-party service
- Data poisoning
  - To guard against poisoned data, sample the collected data and verify it
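Controlling the request rate and switching proxies can be sketched as a small pacing helper that only plans requests and performs no network I/O. The class `RequestPacer`, its parameters, and the proxy addresses are hypothetical; the actual HTTP call is left to whatever client is in use.

```python
import itertools
import random
import time


class RequestPacer:
    """Wait a randomized delay before each request and rotate through proxies."""

    def __init__(self, min_delay, max_delay, proxies, sleep=time.sleep):
        self._min_delay = min_delay
        self._max_delay = max_delay
        self._proxies = itertools.cycle(proxies)
        self._sleep = sleep  # injectable so the delay can be observed in tests

    def next_request(self, url, headers):
        """Pause, then return the settings to use for the next request."""
        delay = random.uniform(self._min_delay, self._max_delay)
        self._sleep(delay)
        return {"url": url, "headers": dict(headers), "proxy": next(self._proxies)}


# Record the delays instead of really sleeping, to show the behavior.
delays = []
pacer = RequestPacer(
    min_delay=1.0,
    max_delay=2.0,
    proxies=["http://proxy-a.example:8080", "http://proxy-b.example:8080"],
    sleep=delays.append,
)
plans = [
    pacer.next_request("https://example.com/items", {"User-Agent": "demo-agent"})
    for _ in range(3)
]
```

A randomized delay spreads requests out instead of producing a fixed, easily detected rhythm, and it also keeps the load on the target platform low.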
Cracking front-end restrictions:
- font-face: custom font interference
  - Find the TTF font file URL, download it, parse the TTF file with a font parsing package, and map the character codes in the text to the corresponding Chinese characters
- Pseudo-element hidden content
  - Find the Chinese text in rules such as XXXX::before { content: “Chinese”; }
- background-image offset
  - Map the background image’s position offset to the content shown in the image
- HTML tag interference
  - Filter out the confusing HTML tags, or read only the valid HTML tags
Conclusion
As a back-end developer, development is not just about completing the required features. Design and architect reasonably for the business scenario and its future, load-test and optimize the core business so that it runs normally under concurrency, and consider security and protection against brushing and the wool party. When coding, avoid code smells, program against abstractions, use design patterns where appropriate, and avoid technical debt.
Things to keep in mind during development:
- The value of technology lies in driving business growth and, through it, the growth of company profit
- There is no best architecture, only the most suitable architecture
- A development language is just a tool; use the right tool for the right scenario
- Abstract thinking means seeing the essence of different problems and understanding the deep model behind product requirements, so that you treat the root cause rather than the symptoms
- Knowledge is very important. It cannot directly give you wealth, but it can give you many opportunities; you are never too old to learn
You are welcome to star [Big Talk WEB development]