1. DNS mapping

Whether the app uses HTTP or a long-lived socket connection, the first step is to resolve the domain name to an IP address through DNS and then fetch the resource from that address. If the LocalDNS already holds the IP address for the domain name, it returns it directly, much like an in-app cache. If it does not, the LocalDNS queries the authoritative DNS for the address and caches the result.

On iOS, however, almost every network disconnect and reconnect, and every device restart, invalidates the DNS cache and triggers a new query. Caching also has a cost of its own: if api.weibo.cn has previously been resolved through China Unicom, a new user's lookup is answered from the LocalDNS cache rather than by Sina's authoritative DNS, so the returned IP is the China Unicom one, which can slow down that user's access.

Caching causes another problem: when the authoritative DNS changes the mapping between a domain name and its IP, the LocalDNS cache may not be updated in time, so users reach the wrong server or the wrong resources. In addition, many third- and fourth-tier carriers point domain resolution at their own caching servers and replace the ads on web pages with their own, or inject additional ads.

HttpDNS, a traffic scheduling solution that performs domain name resolution over HTTP, can largely prevent the problems above.

HttpDNS principle:

A. The client calls the HttpDNS interface directly to obtain the IP address with the best access latency, as configured in the domain name management system. (For fault tolerance, the app must still keep the carrier's LocalDNS resolution as a fallback.)

B. After obtaining the IP address, the client sends its business protocol request directly to that IP. Taking HTTP as an example, the client sends a standard HTTP request to the IP returned by HttpDNS and sets the Host field in the request header to the original domain name.
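Step B can be illustrated with a small sketch. It only shows how the URL and the Host header are rearranged; the function name rewrite_for_ip and the example domain and IP are made up for illustration, an IPv4 address is assumed, and only plain HTTP is covered.

```python
from typing import Dict, Tuple
from urllib.parse import urlsplit, urlunsplit


def rewrite_for_ip(url: str, ip: str) -> Tuple[str, Dict[str, str]]:
    """Point a URL at an IP from HttpDNS and keep the domain in the Host header.

    Hypothetical helper: `ip` is assumed to be an IPv4 address that the
    app obtained from its HttpDNS service.
    """
    parts = urlsplit(url)
    if parts.hostname is None:
        raise ValueError("URL has no host: " + url)

    # Keep an explicit port if the original URL had one.
    port_suffix = "" if parts.port is None else ":" + str(parts.port)

    # The connection goes to the IP, so the IP replaces the domain in the URL.
    ip_url = urlunsplit(
        (parts.scheme, ip + port_suffix, parts.path, parts.query, parts.fragment)
    )

    # The server still needs the original domain to pick the right site.
    headers = {"Host": parts.hostname + port_suffix}
    return ip_url, headers
```

The returned URL and headers can then be handed to whatever HTTP client the app already uses.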

HttpDNS-based extensions:

A. Maintain a Server IP List in the app. Store every IP address the app retrieves from HttpDNS in this list and give each one a weight; in principle, an IP freshly resolved from HttpDNS gets the highest weight. Update the list when the app starts. For data initialization at startup, take the highest-weighted IP from the locally cached list; if no list exists yet on first launch, fall back to LocalDNS resolution. Whenever an IP is taken from the list it should be the one with the highest weight, and the weights must be updated dynamically according to whether connections and requests to each IP succeed or fail. This way, even if DNS resolution fails, the user can still obtain a suitable IP address after a period of time.

B. Collect statistics per IP. Across all installations of the app, record the average, longest, and shortest request time and the number of successful and failed requests for each IP address. Statistics must be kept separately for each network environment, such as Wi-Fi, 4G, and 3G; the IPs that perform best in each environment are stored and delivered to the app. Each time the app starts, it can then speed-test the delivered IPs under its current network environment and choose the best one for its requests. The speed test must be repeated whenever the network environment changes. This saves DNS resolution time and avoids hijacking problems.

C. Serve images, audio, and other media from a separate server, apart from other resources. There are three reasons. First, multiple domain names increase the number of concurrent downloads: the client limits concurrent downloads per domain name, so spreading resources across domains speeds up loading. Do not use too many subdomains, though, because each additional domain costs DNS resolution time. Second, management is easier: images generally consume the most bandwidth when a site loads, so a dedicated server simplifies later management; images can also be loaded asynchronously to improve the user experience, and because they are mostly static content they benefit more from CDN acceleration. Third, a separate server can have its own security settings, which is convenient.

D. To prevent hijacking, remove file extensions such as .mp3 and .json from resource names so that requests do not match the carriers' interception rules.
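Extension A above, the weighted Server IP List, could be sketched roughly as follows. The class name, the weight values, and the bonus and penalty sizes are assumptions chosen for illustration; the article does not specify them.

```python
from typing import Callable, Dict, Optional


class ServerIpList:
    """Weighted IP list; all numeric values here are illustrative assumptions."""

    MAX_WEIGHT = 100       # weight given to an IP freshly resolved by HttpDNS
    SUCCESS_BONUS = 5      # added after a successful connection or request
    FAILURE_PENALTY = 20   # subtracted after a failed connection or request

    def __init__(self) -> None:
        self._weights: Dict[str, int] = {}

    def add_resolved(self, ip: str) -> None:
        # IPs that come straight from HttpDNS start with the highest weight.
        self._weights[ip] = self.MAX_WEIGHT

    def best(self) -> Optional[str]:
        # Always hand out the IP with the highest weight, or None if empty.
        if not self._weights:
            return None
        return max(self._weights, key=lambda ip: self._weights[ip])

    def report(self, ip: str, ok: bool) -> None:
        # Adjust the weight according to the outcome, clamped to [0, MAX_WEIGHT].
        if ip not in self._weights:
            return
        delta = self.SUCCESS_BONUS if ok else -self.FAILURE_PENALTY
        self._weights[ip] = min(self.MAX_WEIGHT, max(0, self._weights[ip] + delta))


def pick_ip(ip_list: ServerIpList, local_dns: Callable[[str], str], host: str) -> str:
    """Use the list when it has entries, otherwise fall back to LocalDNS."""
    ip = ip_list.best()
    return ip if ip is not None else local_dns(host)
```

In a real app the list would also be saved to local storage so that it survives restarts; that part is left out to keep the sketch short.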

In short, resolving domain names through HttpDNS bypasses the resolution problems introduced by third- and fourth-tier carriers. Once HttpDNS returns the correct IP, the app sends its HTTP requests directly to that IP and only needs to take care of the security of the communication content.
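The per-IP statistics of extension B could be kept in a structure like the one below. This is only a sketch: the names are hypothetical, and the rule for "best" (lowest average time among IPs whose success rate is at least 80%) is an assumption, since the article does not define how the best IP is chosen.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class IpStats:
    successes: int = 0
    failures: int = 0
    total_ms: float = 0.0
    shortest_ms: float = float("inf")
    longest_ms: float = 0.0

    def record(self, elapsed_ms: float, ok: bool) -> None:
        if not ok:
            self.failures += 1
            return
        # Timing figures are only updated for successful requests.
        self.successes += 1
        self.total_ms += elapsed_ms
        self.shortest_ms = min(self.shortest_ms, elapsed_ms)
        self.longest_ms = max(self.longest_ms, elapsed_ms)

    @property
    def average_ms(self) -> float:
        return self.total_ms / self.successes if self.successes else float("inf")

    @property
    def success_rate(self) -> float:
        total = self.successes + self.failures
        return self.successes / total if total else 0.0


class IpStatsTable:
    """Statistics keyed by (network environment, IP), e.g. ("wifi", "203.0.113.7")."""

    def __init__(self) -> None:
        self._stats: Dict[Tuple[str, str], IpStats] = {}

    def record(self, network: str, ip: str, elapsed_ms: float, ok: bool) -> None:
        self._stats.setdefault((network, ip), IpStats()).record(elapsed_ms, ok)

    def best_ip(self, network: str, min_success_rate: float = 0.8) -> Optional[str]:
        # Assumed rule: fastest average among sufficiently reliable IPs.
        candidates = [
            (stats.average_ms, ip)
            for (net, ip), stats in self._stats.items()
            if net == network and stats.success_rate >= min_success_rate
        ]
        return min(candidates)[1] if candidates else None
```

Because the key includes the network environment, switching from Wi-Fi to 4G naturally leads to a different lookup, which matches the requirement to re-test when the network changes.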

2. Resource optimization

Resource optimization essentially means reducing the size of the transmitted data as much as possible and choosing an appropriate data format.

1. Image size. Use WebP in place of JPG and PNG images where possible.

2. Use Protocol Buffers instead of JSON: Protocol Buffers messages are smaller than JSON, cross-platform, and easy to serialize and deserialize.

3. Request compression

A DNS query is followed by a TCP handshake to establish the connection, after which the request data is sent. For TCP, the amount of data carried in a single IP packet is limited by the MSS value. In the network environments most users are in, each packet is about 1.5 KB. Because of TCP slow start, on a new HTTP connection some packets are held back locally for a while instead of being sent at once, which increases the overall request latency. We should therefore compress request business data as much as possible to reduce the number of IP packets per request, which can spare the user some round trips (RTTs) and make request latency less noticeable.
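The effect of compression on packet count can be estimated with a short sketch. It assumes an MSS of 1460 bytes (a 1500-byte packet minus 40 bytes of IPv4 and TCP headers) and uses gzip as the compressor; the sample payload is invented for illustration.

```python
import gzip
import json
import math

MSS_BYTES = 1460  # assumed: 1500-byte packet minus 40 bytes of IPv4 + TCP headers


def packets_needed(payload: bytes, mss: int = MSS_BYTES) -> int:
    """Rough number of TCP segments needed to carry the payload."""
    return max(1, math.ceil(len(payload) / mss))


def compress_request(body: bytes) -> bytes:
    """Compress a request body with gzip before sending it."""
    return gzip.compress(body)


# Invented sample: a repetitive JSON body, which is typical for API traffic.
sample_body = json.dumps(
    [{"event": "page_view", "page": "/item/%d" % i, "duration_ms": 1200} for i in range(200)]
).encode("utf-8")
sample_compressed = compress_request(sample_body)

print("packets before:", packets_needed(sample_body))
print("packets after: ", packets_needed(sample_compressed))
```

The server must of course know that the body is compressed, for example through a Content-Encoding header agreed between client and server.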

4. Request merge

For non-critical business data, or requests with low real-time requirements, merging requests reduces the number of interactions with the server. On the one hand this lowers server load; on the other hand, merging requests and then compressing them saves client data usage. This kind of merging is commonly seen in non-business requests such as SDK and crash log collection.
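A minimal sketch of merging followed by compression might look like this. The class name RequestBatcher, the batch size of 20, and the JSON-array wire format are assumptions for illustration.

```python
import gzip
import json
from typing import Callable, List


class RequestBatcher:
    """Collects low-priority items and sends them as one compressed request."""

    def __init__(self, send: Callable[[bytes], None], max_items: int = 20) -> None:
        self._send = send            # hypothetical function that performs the upload
        self._max_items = max_items  # assumed batch size
        self._pending: List[dict] = []

    def add(self, item: dict) -> None:
        self._pending.append(item)
        if len(self._pending) >= self._max_items:
            self.flush()

    def flush(self) -> None:
        """Send everything collected so far as a single gzip-compressed JSON array."""
        if not self._pending:
            return
        batch, self._pending = self._pending, []
        self._send(gzip.compress(json.dumps(batch).encode("utf-8")))
```

An app would typically also call flush() on a timer or when it moves to the background, so that a partly filled batch does not wait indefinitely.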

5. Request security

Use HTTPS for basic network security. For sensitive data, additionally apply hashing and encryption algorithms such as MD5, AES, DES, and RSA to secure the data in transit and prevent it from being intercepted or tampered with.
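As one illustration of tamper detection, the sketch below signs a request body with a keyed hash. Note the substitution: it uses HMAC-SHA256 from the standard library rather than any of the algorithms named above, and the way client and server come to share the key is assumed and not shown.

```python
import hashlib
import hmac


def sign(body: bytes, key: bytes) -> str:
    """Return a hex signature of the body under a shared secret key."""
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def verify(body: bytes, key: bytes, signature: str) -> bool:
    """Check a received body against its signature in constant time."""
    return hmac.compare_digest(sign(body, key), signature)
```

The client would send the signature alongside the body, and the server would reject any request for which verify returns False. This only detects tampering; hiding the content still requires encryption.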

6. Reasonable concurrency

In some business scenarios a burst of requests is issued at the same moment. In this case you need to set a reasonable limit on concurrent requests. If the limit is too low, "bad" (slow) requests block "good" ones. If the limit is too high and bandwidth is limited, the overall latency of all requests increases.
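One simple way to cap concurrency is a fixed-size worker pool, sketched below. The function name and the default limit of 4 are assumptions; the right limit depends on the app and the network.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def run_with_limit(tasks: Iterable[Callable[[], T]], max_concurrent: int = 4) -> List[T]:
    """Run request callables with at most `max_concurrent` in flight at once.

    Results are returned in the order the tasks were given.
    """
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]
```

Requests beyond the limit wait in the pool's queue until a worker is free, so a burst is smoothed out instead of competing for bandwidth all at once.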

7. Data caching

1. Use the HTTP cache to reduce both the number of requests and the amount of data transferred (if the body does not need to be sent, do not send it). An HTTP cache reduces the number of requests made to the server: when a request finishes downloading its response, a copy of the response is saved locally, and the next time the same request is made the saved response is returned immediately without connecting to the server. NSURLCache does exactly this, and it can cache both in memory and on disk.

2. Store HTML, JS, CSS, and other static web files on the phone, tag them with a version, and establish a suitable update mechanism. Use NSURLProtocol to cache web pages after the first load, turning network I/O into local I/O and improving the H5 experience.
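The caching behaviour described in item 1 can be shown with a small in-memory sketch. It is a simplification under stated assumptions: entries are keyed by URL only, freshness is a fixed maximum age rather than real HTTP cache headers, and download stands for whatever function actually performs the request. It is not how NSURLCache is implemented.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class ResponseCache:
    """In-memory response cache with a fixed maximum age (an assumed policy)."""

    def __init__(self, max_age_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._max_age = max_age_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, bytes]] = {}

    def get(self, url: str) -> Optional[bytes]:
        entry = self._entries.get(url)
        if entry is None:
            return None
        stored_at, body = entry
        if self._clock() - stored_at > self._max_age:
            # Too old: drop it so the caller goes back to the server.
            del self._entries[url]
            return None
        return body

    def put(self, url: str, body: bytes) -> None:
        self._entries[url] = (self._clock(), body)


def fetch(url: str, cache: ResponseCache, download: Callable[[str], bytes]) -> bytes:
    """Return the cached response if it is fresh, otherwise download and store it."""
    cached = cache.get(url)
    if cached is not None:
        return cached
    body = download(url)
    cache.put(url, body)
    return body
```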

8. Reliability assurance

Reliability assurance is easily overlooked. Before going into detail, requests can first be classified by their business attributes.

  • Type 1: critical core business data, which is expected to reach the server 100% of the time.
  • Type 2: important content requests, which need a high success rate.
  • Type 3: ordinary content requests, with no particular requirement on success rate.

Requests are divided into three types so that each can receive a different level of reliability guarantee. In theory we should aim for the highest possible success rate for every request. However, client data usage, bandwidth, battery power, and server load are all limited resources, so the strategy is to guarantee high reliability only for critical network requests.

A typical example of the first type is sending a message in WeChat. Once the message leaves the input box, from the user's point of view it must reach the other party. If the network environment is poor, the network module silently retries for a period of time, and only if sending still fails is the user informed through the product's UI. Even on failure, the request data (the message itself) remains on the client.

To handle this type of request, the first step is not to send it over the network but to persist it to a database. Once it is in the database, the request data survives a lost connection, a power-off, or an app restart; the app only needs to restore the request data when it restarts and send it again. The second step is to send the request. If sending fails, the request is added to the retry queue; if it succeeds, it is removed from the queue. The retry queue also needs some general policies, such as how often to retry and after how many attempts to give up. In the worst case, the app is killed after a request has failed; all failed requests must then be reloaded from the database after the app restarts and tried again. For type 1 requests these steps largely ensure reliability, but 100% is hard to achieve: if the user reinstalls the app while a request is still failing, all persisted data, including the request, is lost. This extreme scenario is very rare.
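The persist-then-send flow can be sketched with SQLite as the local database. The table layout, the class name PersistentOutbox, and the limit of five attempts are assumptions for illustration; send stands for the app's real upload function and is expected to return True on success.

```python
import sqlite3
from typing import Callable, List, Tuple


class PersistentOutbox:
    """Stores requests in SQLite first, then sends and retries them."""

    def __init__(self, db_path: str = ":memory:", max_attempts: int = 5) -> None:
        # A real app would pass a file path so the data survives restarts.
        self._db = sqlite3.connect(db_path)
        self._max_attempts = max_attempts
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS outbox ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "body BLOB NOT NULL, "
            "attempts INTEGER NOT NULL DEFAULT 0)"
        )
        self._db.commit()

    def enqueue(self, body: bytes) -> int:
        """Step 1: persist the request before touching the network."""
        cursor = self._db.execute("INSERT INTO outbox (body) VALUES (?)", (body,))
        self._db.commit()
        return cursor.lastrowid

    def pending(self) -> List[Tuple[int, bytes]]:
        """Requests that have not yet used up their attempts, oldest first."""
        rows = self._db.execute(
            "SELECT id, body FROM outbox WHERE attempts < ? ORDER BY id",
            (self._max_attempts,),
        ).fetchall()
        return [(row[0], bytes(row[1])) for row in rows]

    def send_pending(self, send: Callable[[bytes], bool]) -> None:
        """Step 2: try to send; delete on success, count the attempt on failure."""
        for row_id, body in self.pending():
            if send(body):
                self._db.execute("DELETE FROM outbox WHERE id = ?", (row_id,))
            else:
                # Given-up rows stay in the table, so the data remains on the client.
                self._db.execute(
                    "UPDATE outbox SET attempts = attempts + 1 WHERE id = ?", (row_id,)
                )
            self._db.commit()
```

Calling send_pending once at app start covers the "killed after a failed send" case, and calling it again on a timer covers ordinary retries.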

An example of the second type is the home page users see when the app starts. The home page content comes from the server, and if the first request fails the experience is poor. In general, three retries are enough to ride out network jitter. After three failures the request is considered unsuccessful and the user is informed through the product's UI.
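A bounded retry for this second type of request could look like the sketch below. It reads "three retries" as one initial attempt plus up to three retries, which is an interpretation; the half-second delay and the choice to treat OSError as a network failure are also assumptions.

```python
import time
from typing import Callable, Optional


def fetch_with_retries(
    request: Callable[[], bytes],
    retries: int = 3,
    delay_seconds: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[bytes]:
    """Try the request, retrying up to `retries` times on network errors.

    Returns the response body, or None if every attempt failed, in which
    case the caller should show the failure to the user.
    """
    for attempt in range(1 + retries):
        try:
            return request()
        except OSError:
            # Wait briefly before the next attempt, but not after the last one.
            if attempt < retries:
                sleep(delay_seconds)
    return None
```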

The third type of request is the least important, for example UV (unique visitor) tracking when a Controller is entered. This type of request only needs to be attempted once, and a failure has no negative impact on the product experience.

9. Multi-channel

Many teams with the technical means now run their own long-lived TCP channels, and even UDP channels, which can greatly improve the success rate in network environments with high packet loss. If HTTP, TCP, and UDP channels are all available, then in scenarios where data usage is not a concern (such as on Wi-Fi), a single network request can be sent over two or three channels at once. This noticeably improves both the speed and the reliability of the request, but it requires the client and the server to deduplicate requests for the business scenario. UDP is mostly used in VoIP services, though large companies such as Taobao are also said to have partially enabled it.
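The idea of sending one request over several channels and deduplicating on the receiving side can be sketched as follows. Everything here is hypothetical: each channel is modelled as a plain function, the request carries a client-generated ID, and the receiver class stands in for the server-side logic.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, List, Optional, Set, Tuple

Channel = Callable[[str, bytes], bytes]  # (request_id, body) -> response


def send_on_all_channels(
    channels: Dict[str, Channel], request_id: str, body: bytes
) -> Optional[Tuple[str, bytes]]:
    """Send the same request on every channel; return the first success.

    The result is (channel name, response), or None if all channels failed.
    Leaving the `with` block waits for the slower channels to finish too.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(channels))) as pool:
        futures = {
            pool.submit(send, request_id, body): name
            for name, send in channels.items()
        }
        for future in as_completed(futures):
            try:
                return futures[future], future.result()
            except Exception:
                continue  # this channel failed; wait for the others
    return None


class DedupingReceiver:
    """Server-side stand-in that processes each request ID only once."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._seen: Set[str] = set()
        self.processed: List[bytes] = []

    def receive(self, request_id: str, body: bytes) -> bytes:
        with self._lock:
            if request_id not in self._seen:
                self._seen.add(request_id)
                self.processed.append(body)
        return b"ok"
```

Without the request ID, a message sent over two channels would be processed twice, which is exactly the duplication the text warns about.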

10. Network environment monitoring

Although the network environment keeps improving, and Wi-Fi, 4G, and 3G are widespread in first- and second-tier cities, many everyday situations still cause the network to deteriorate suddenly: stepping into an elevator, riding a train, being in a crowded venue, or losing the office Wi-Fi on the way home from work. These situations are not uncommon. A robust network module needs to detect network changes carefully and retry requests accordingly.
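How a module might react to network changes is sketched below. The classes are hypothetical, network states are plain strings such as "wifi", "4g", and "none", and how the platform actually reports a change is outside the sketch: something is assumed to call update() when it happens.

```python
from typing import Callable, List

Listener = Callable[[str, str], None]  # (old_state, new_state)


class NetworkMonitor:
    """Tracks the current network state and notifies listeners on change."""

    def __init__(self, initial: str = "none") -> None:
        self._state = initial
        self._listeners: List[Listener] = []

    def on_change(self, listener: Listener) -> None:
        self._listeners.append(listener)

    def update(self, new_state: str) -> None:
        """Called by platform-specific code whenever the network changes."""
        if new_state == self._state:
            return
        old_state, self._state = self._state, new_state
        for listener in list(self._listeners):
            listener(old_state, new_state)


class RetryOnReconnect:
    """Holds failed requests and re-runs them once a network is available."""

    def __init__(self, monitor: NetworkMonitor) -> None:
        self._waiting: List[Callable[[], None]] = []
        monitor.on_change(self._handle_change)

    def defer(self, request: Callable[[], None]) -> None:
        self._waiting.append(request)

    def _handle_change(self, old_state: str, new_state: str) -> None:
        if new_state == "none":
            return  # still offline, keep waiting
        waiting, self._waiting = self._waiting, []
        for request in waiting:
            request()
```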

11. Monitor the success rate of requests

The network module should monitor the app's current request success rate. For requests with a high failure rate, it should attach the business data, the phone's network environment, system parameters, and so on, then package and report them to the server while the user is inactive. This serves three purposes: first, it reveals more business scenarios worth optimizing; second, it allows the health of the server side to be monitored in real time; third, it shows at the data level whether each network optimization was actually effective.
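A minimal sketch of such monitoring is shown below. The class name, the 20% failure threshold, the minimum of ten samples, and the shape of the report are all assumptions; a real report would also carry the business data and system parameters mentioned above.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


class SuccessRateMonitor:
    """Counts request outcomes per (endpoint, network) and flags high failure rates."""

    def __init__(self, failure_threshold: float = 0.2, min_samples: int = 10) -> None:
        self._threshold = failure_threshold
        self._min_samples = min_samples
        # Value is [total requests, failed requests].
        self._counts: Dict[Tuple[str, str], List[int]] = defaultdict(lambda: [0, 0])

    def record(self, endpoint: str, network: str, ok: bool) -> None:
        counts = self._counts[(endpoint, network)]
        counts[0] += 1
        if not ok:
            counts[1] += 1

    def report(self) -> List[dict]:
        """Rows worth uploading: enough samples and a failure rate at or above the threshold."""
        rows = []
        for (endpoint, network), (total, failed) in sorted(self._counts.items()):
            if total < self._min_samples:
                continue
            failure_rate = failed / total
            if failure_rate >= self._threshold:
                rows.append({
                    "endpoint": endpoint,
                    "network": network,
                    "total": total,
                    "failed": failed,
                    "failure_rate": failure_rate,
                })
        return rows
```

The app could call report() when it goes idle and upload the rows in one merged request, which ties in with the request merging described earlier.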