Wandyers @Daily Youxian

Copyright belongs to the author. For commercial reprints, please contact this account for authorization. For non-commercial reprints, please credit the Daily Youxian Big Front-End team and include the original address.

Note: This article is part of a series that will cover the scheme overview, the rollout plan, analysis of the specific schemes, technical implementation details, and more. Stay tuned.

Background

Due to App domain name resolution problems in some regions, some B-side Apps (hereinafter referred to as Apps) experienced network request timeouts, host-not-found errors, or other abnormal errors. As a result, some delivery Apps could not be logged in to or used, and some To B services were affected.

Goals

The goal is to eliminate, as far as possible, App network exceptions caused by external network factors such as abnormal external DNS resolution and unreachable domain names. After investigation, we summarized the schemes below to prevent similar problems from recurring. After comparing them, we selected some of the strategies to roll out in stages (the steps are explained in detail later). We want the App to avoid DNS hijacking, reduce access latency, lower the user connection failure rate, and improve the user experience, and we want the SDK to be highly extensible and reusable.

Scheme overview

The following is a brief overview of the various implementation strategies and a simple comparison of their advantages and disadvantages.

Schemes in detail

This chapter elaborates on the strategies mentioned above, including the implementation process and some key details.

1. System DNS cache policy

This is the strategy currently in use. It requires no additional development: requests use domain names and rely on the system DNS and the network service provider's DNS resolution. It is not described further here.

2. Direct connection strategy

The direct connection scheme is simple. When an exception occurs, such as a host-not-found error or a request timeout, the App switches directly to the IP address and retries the request. The App then keeps using this IP address for network requests until its next startup.

The priority of the IP address or domain name is LocalDNS (domain name) > built-in fixed IP address.

The flowchart is shown in the figure below:

This scheme is simple to implement and can serve as a stopgap in an emergency. However, it is not recommended for long-term use, because the IP addresses built into the App cannot be updated dynamically, which increases risk in the future.
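The retry-and-pin behaviour described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the team's implementation: `BUILT_IN_IPS`, `DirectConnectionClient`, and the injected `send` callable are hypothetical names, and the real request code would live in the App's own network layer.

```python
import socket
from typing import Callable, Dict

# Hypothetical built-in fallback table shipped with the App (illustrative values).
BUILT_IN_IPS: Dict[str, str] = {"a.test.cn": "192.168.1.23"}


class DirectConnectionClient:
    """Tries the domain name first; after a failure, pins the built-in IP."""

    def __init__(self, send: Callable[[str, str], str]):
        # send(target, host) performs the real request. It is injected so the
        # sketch does not depend on any particular HTTP library.
        self._send = send
        # host -> pinned IP, held in memory only, so it resets on the next startup.
        self._pinned: Dict[str, str] = {}

    def request(self, host: str) -> str:
        pinned = self._pinned.get(host)
        if pinned is not None:
            return self._send(pinned, host)
        try:
            return self._send(host, host)
        except (socket.gaierror, socket.timeout, TimeoutError):
            ip = BUILT_IN_IPS.get(host)
            if ip is None:
                raise  # no built-in IP for this host, nothing to fall back to
            self._pinned[host] = ip
            return self._send(ip, host)
```

Keeping the pinned address in memory only is what gives the "until the next startup" behaviour: a fresh process starts again from the domain name.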

3. Configuration file policy

When the App is first started (including a cold start), it requests the back-end interface to deliver the configuration file. The data format is defined in the appendix. The data is persistently cached on the local App client for later use.

When the App starts, the configuration is read from the cache and checked for validity. If every entry has expired, the App falls back to the domain name (system DNS) or the built-in IP address. If some or all of the IP addresses are still valid, requests are sent directly to those IP addresses in priority order. (In practice, an IP address usually remains usable for some time after it has expired.)

The priority of the IP or domain name is configuration file IP > LocalDNS (domain name) > built-in fixed IP address.
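The selection order above can be sketched as follows. This is only an illustration: it assumes the cached data uses the field names from the appendix (`hostName`, `ips`, `priority`, `ipAddr`, `TTL`), that `TTL` is an absolute Unix expiry time in seconds, and that a smaller `priority` number means higher priority. None of these assumptions is confirmed by the article, and `pick_targets` is a hypothetical helper name.

```python
import json
import time
from typing import List, Optional


def pick_targets(cached_json: str, host: str, built_in_ip: str,
                 now: Optional[float] = None) -> List[str]:
    """Return request targets in the order:
    valid configuration-file IPs > domain name (LocalDNS) > built-in fixed IP."""
    now = time.time() if now is None else now
    try:
        entries = json.loads(cached_json)
    except (TypeError, ValueError):
        entries = []  # missing or corrupted cache: behave as if nothing is cached
    if not isinstance(entries, list):
        entries = []

    valid = []
    for entry in entries:
        if entry.get("hostName") != host:
            continue
        for ip in entry.get("ips", []):
            # Assumption: TTL is an absolute expiry timestamp (seconds).
            if ip.get("TTL", 0) > now:
                valid.append((ip.get("priority", 0), ip["ipAddr"]))

    # Assumption: a smaller number means higher priority. sort() is stable,
    # so entries with equal priority keep their order in the file.
    valid.sort(key=lambda item: item[0])
    return [addr for _, addr in valid] + [host, built_in_ip]
```

The caller would try each returned target in turn and move on to the next one when a request fails.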

Compared with the direct connection solution, this solution is more flexible. The IP address can be changed at any time without affecting the production client.

However, if the domain name is unreachable for a long period, or the user has not opened the App for a long time and the domain name happens to be unreachable when the App is next opened, every entry in the cached configuration file will have expired, and the scheme degrades to the direct connection scheme. In short, the TTL threshold of the configuration file is hard to tune and inflexible.

Because the configuration file is requested and the local cache is refreshed as a whole every time the App is launched, individual IP addresses cannot be updated flexibly based on each IP's own TTL.

This scheme is also tightly coupled to the project and to the actual business logic, which makes it hard to extend: each client platform has to implement the logic itself.

A security check of the cached files or data is also essential to prevent anomalies caused by malicious tampering by a third party (this part will be explained in detail in a later article).
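The article defers its actual checking mechanism to a later installment. As one common option, and not necessarily the one the team uses, the cached payload can be stored together with an HMAC-SHA256 signature and verified before use. The helper names `sign` and `load_verified` are hypothetical, and how the key is provisioned and protected on the device is out of scope here.

```python
import hashlib
import hmac
from typing import Optional


def sign(payload: bytes, key: bytes) -> str:
    """Compute a hex HMAC-SHA256 signature over the cached payload."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def load_verified(payload: bytes, signature: str, key: bytes) -> Optional[bytes]:
    """Return the payload only if its signature matches, otherwise None."""
    expected = sign(payload, key)
    # compare_digest runs in constant time to avoid leaking timing information.
    if hmac.compare_digest(expected, signature):
        return payload
    return None
```

When verification fails, the client would discard the cache and continue with the domain name or the built-in IP address.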

The reference process is shown in the figure below:

The following two strategies are based on HttpDNS. HttpDNS provides domain name resolution over HTTP or HTTPS. Compared with the Local DNS of traditional carriers, it has the following advantages:

  1. Prevents DNS hijacking

  2. Enables precise scheduling, so that clients access the nearest service node

  3. 0 ms resolution latency

  4. Changes take effect quickly

  5. Highly extensible

4. Third-party HttpDNS policy

Vendors such as Tencent Cloud HttpDNS, Ali Cloud HttpDNS, and DNSPod provide HttpDNS solutions. Client integration is convenient, requires little development, and is relatively stable.

However, because this scheme depends on a third-party service, some rules, such as traffic splitting, cannot be customized. In addition, the third-party service itself may fail, leaving the network status outside our control.

The topology is shown in the following figure [image sourced from the Internet]:

The flow chart is similar to the self-built HttpDNS solution. For details, see the flow chart in the “Self-built HttpDNS” section.

5. Self-built HttpDNS policy

The self-built HttpDNS provides a private DNS resolution service over HTTP or HTTPS. Strictly speaking, this is not a fallback strategy. Instead, it replaces the system DNS with the self-built DNS to obtain IP addresses for direct network requests, and uses the system-level DNS cache as the final fallback.

An independent domain name or independent IP address provides the HttpDNS resolution service, and a self-developed HttpDNS SDK communicates with that server. Each App integrates the SDK, performs HttpDNS resolution, and obtains some or all of the available IP addresses for a domain name. The App no longer relies on the existing domain name to reach the back-end service and instead accesses it by IP address every time.

The topology is shown in the following figure [image sourced from the Internet]:

When the App starts, the SDK is initialized according to the process below. Service interfaces are then accessed directly by IP address, which reduces latency, avoids problems such as DNS hijacking, and makes the SDK easy to integrate into other Apps.

The priority of the IP address or domain name is: self-built HttpDNS > LocalDNS (domain name) > built-in fixed IP address.
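A rough sketch of how an SDK might apply this priority while tracking the life cycle of each IP address is shown below. Everything here is an assumption for illustration: `HttpDnsResolver`, the injected `fetch` callable (standing in for the call to the self-built HttpDNS server), and the `(ip, expires_at)` record shape are hypothetical, and expiry is treated as an absolute Unix timestamp in line with the appendix.

```python
import time
from typing import Callable, Dict, List, Tuple

Record = Tuple[str, float]  # (ip address, absolute expiry time in seconds)


class HttpDnsResolver:
    """Orders targets as: self-built HttpDNS IPs > domain name > built-in IP."""

    def __init__(self, fetch: Callable[[str], List[Record]],
                 built_in: Dict[str, str],
                 clock: Callable[[], float] = time.time):
        self._fetch = fetch        # hypothetical call to the HttpDNS server
        self._built_in = built_in  # host -> fixed IP shipped with the App
        self._clock = clock
        self._cache: Dict[str, List[Record]] = {}

    def refresh(self, host: str) -> None:
        """Query HttpDNS; on any failure keep whatever is already cached."""
        try:
            self._cache[host] = list(self._fetch(host))
        except Exception:
            pass

    def targets(self, host: str) -> List[str]:
        now = self._clock()
        # Each IP expires on its own, so one stale record does not
        # invalidate the others.
        live = [ip for ip, expires_at in self._cache.get(host, [])
                if expires_at > now]
        result = live + [host]  # the domain name goes through LocalDNS
        if host in self._built_in:
            result.append(self._built_in[host])
        return result
```

The App would call `refresh` during SDK initialization and then only ask `targets` for an ordered list, without managing IP addresses itself.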

The main flow chart is shown in the figure below:

The self-built HttpDNS policy differs from the configuration file policy in three ways. First, it is delivered as an SDK that multiple Apps can integrate quickly. Second, it can manage the life cycle of each individual IP address. Third, the integrating App does not need to care about IP management: it only makes network requests using the IP address or domain name returned by the SDK (the SDK automatically switches to the domain name when no valid IP address is available).

Points to note

  1. Whichever scheme is adopted, the Host header configuration, HTTPS certificate verification, SNI, and similar issues must be considered, as must network requests made by web pages embedded in the App.
  2. Whichever scheme is adopted, the fixed IP address built into the App is used as the final disaster recovery solution by default.
  3. If multiple IP addresses of the same priority are valid, round-robin polling or priority scheduling is used.
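To illustrate the Host header issue in note 1: when the URL is rewritten to point at an IP address, the original domain name has to be carried in the `Host` header so that the server can still route the request. The sketch below covers only that rewrite. `to_ip_request` is a hypothetical helper and it handles IPv4 addresses only (an IPv6 literal would need brackets in the URL).

```python
from typing import Dict, Tuple
from urllib.parse import urlsplit, urlunsplit


def to_ip_request(url: str, ip: str) -> Tuple[str, Dict[str, str]]:
    """Rewrite a URL to target an IPv4 address, keeping the original host
    in the Host header."""
    parts = urlsplit(url)
    host = parts.hostname
    if host is None:
        raise ValueError("URL has no host: " + url)
    port = parts.port
    netloc = ip if port is None else f"{ip}:{port}"
    host_header = host if port is None else f"{host}:{port}"
    new_url = urlunsplit(
        (parts.scheme, netloc, parts.path, parts.query, parts.fragment))
    return new_url, {"Host": host_header}
```

For example, `to_ip_request("https://a.test.cn/api/login?x=1", "192.168.1.23")` yields the URL `https://192.168.1.23/api/login?x=1` together with `{"Host": "a.test.cn"}`. For HTTPS, the client must additionally present the original domain name as SNI and verify the certificate against it rather than against the IP address. That part depends on the HTTP stack in use and is not shown here.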

The issues above will be explained in later articles on the corresponding implementations.
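As a small illustration of note 3, round-robin polling among the IP addresses that share the best priority might look like the sketch below. It assumes that a smaller `priority` number means higher priority, which the article does not state, and `RoundRobinPicker` is a hypothetical name.

```python
import itertools
from typing import List, Tuple


class RoundRobinPicker:
    """Cycles through the addresses that share the best priority."""

    def __init__(self, ips: List[Tuple[int, str]]):
        # ips holds (priority, address) pairs.
        if not ips:
            raise ValueError("no IP addresses to schedule")
        # Assumption: the smallest number is the highest priority.
        best = min(priority for priority, _ in ips)
        self._cycle = itertools.cycle(
            [addr for priority, addr in ips if priority == best])

    def next_ip(self) -> str:
        return next(self._cycle)
```

Lower-priority addresses are ignored by this picker. A fuller implementation would move on to them once every address at the best priority has failed.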

Recommendation

Based on the description above and the current state of the App, we suggest temporarily adopting the direct connection policy (DS001) or the configuration file policy (DS002) in the initial stage. This reduces production environment problems and keeps the App accessible.

As a second step, depending on the actual situation, we recommend integrating a third-party HttpDNS scheme (DS003) or implementing the self-built HttpDNS scheme (DS004) as soon as possible, and building our own SDK for use across multiple client platforms. This scheme has advantages over the others in flexibility, scalability, maintainability, and controllability.

Appendix

Appendix A

    [
      {
        "hostName": "a.test.cn",          // domain name
        "ips": [                          // list of IP addresses for the domain name
          {
            "priority": 1,                // priority
            "ipAddr": "192.168.1.23",     // IP address
            "TTL": 1568859453             // timeout
          },
          {
            "priority": 1,
            "ipAddr": "192.168.1.23",
            "TTL": 1568859453
          }
        ]
      },
      {
        "hostName": "b.test.cn",
        "ips": [
          {
            "priority": 1,
            "ipAddr": "192.168.1.23",
            "TTL": 1568859453
          }
        ]
      }
    ]