We have developed an intranet penetration tool, Notr, which has been evolving for nearly a year. Recently the software has been largely stable and users have not reported any bugs, which is very gratifying. Since I am currently unemployed and have plenty of time, I want to record how the software has changed from version to version.

Origin of the project

Notr grew out of gTUN, an open source project I wrote in my spare time. gTUN is in essence a tiny VPN, and I wrote it mainly to solve two problems:

  1. Easy access to the company network from home
  2. Internet access at home

At present my family uses gTUN to access the Internet, with a Raspberry Pi acting as the gateway, so basically every device at home can use it.

As for accessing the company intranet, this is relatively risky behavior, but our small company is loosely managed, so we still use it occasionally when we need to.

After writing the first version of gTUN, I came across the term "intranet penetration", and realized that gTUN's support for it was not thorough enough. After some reflection, I concluded that adding full intranet penetration to gTUN would only make it lose sight of its purpose and become bloated. So I took this functionality out and wrote a separate project dedicated to intranet penetration: Notr.

The initial positioning of Notr has two main points:

  • Notr provides intranet penetration as a service, so it is a product
  • The Notr software should stay as simple as possible, without bells and whistles; all the user should need to know is which port on their machine they want to expose.

Based on these two points, we added dynamic domain name resolution (DDNS) and service registration on top of gTUN's intranet penetration. The final result presented to the user is as follows:

Notr technology changes

Having covered the origins of the project, let's walk through its technical changes step by step and see what adjustments it underwent.

The initial release

The initial version was very simple: just two programs, a client and a server, which is also the smallest model of the whole system. The server has two functions:

  1. Serve clients (authentication, session storage, IP address assignment, etc.)
  2. Open a local port for each port that needs to be exposed

Major issues with this release:

  1. The client needs to know the server's IP address, port, and other details, and we cannot ask users to specify which server to use
  2. The server-side port is not fixed, which is a particular problem for HTTP and HTTPS; for example, debugging a WeChat official account ("public number") does not work without ports 80 and 443
  3. Each user's authentication key is configured manually on the server node
  4. What the user finally gets is the public IP address that the port is mapped to

All in all, this version was not a product but a pure demo: fine for our own use, but clearly not something to offer to others as a service.

The version with a service registry node

To solve the first problem of the previous version, namely that the client had to specify the server's IP address and port, we developed an interface (the module behind it is called Registry, or Controller) that the client calls before connecting to a server in order to obtain that server's IP and port. In other words, this interface service has a global view and stores information about all server nodes. That raises a question: how does Registry learn that a server node exists, and whether it is still working?

The initial solution was for the server to call Registry's register interface over HTTP at startup and its unregister interface at exit. However, the fact that unregister has not been called does not mean the server is still running (it may have crashed, for example). So the communication between Server and Registry was changed from HTTP to a long-lived TCP connection.
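The idea of tying a node's registration to the lifetime of its TCP connection can be sketched as follows. This is only an illustration, not Notr's actual code: the `Registry` and `ServerNode` classes, the `conn_id` handle, and the addresses are all hypothetical. A real TCP server would call `on_disconnect` whenever a read on the connection returns EOF or an error.

```python
from dataclasses import dataclass, field


@dataclass
class ServerNode:
    ip: str
    port: int


@dataclass
class Registry:
    """Tracks server nodes by the lifetime of their TCP connection."""

    # conn_id -> node; conn_id is a hypothetical handle for one TCP connection.
    nodes: dict = field(default_factory=dict)

    def on_connect(self, conn_id, node):
        # The server registers itself right after the connection is opened.
        self.nodes[conn_id] = node

    def on_disconnect(self, conn_id):
        # Called on EOF or error, so a crashed server is removed even though
        # it never sent an explicit unregister request.
        self.nodes.pop(conn_id, None)

    def alive_nodes(self):
        return list(self.nodes.values())
```

With this shape, no separate unregister call is needed: the node disappears from the list as soon as its connection goes away.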

To solve the second problem, namely that HTTP and HTTPS ports cannot be reused and multiple users cannot share ports 80 and 443, we introduced nginx as the HTTP and HTTPS proxy. The idea is simple: nginx acts as the reverse proxy, we no longer listen on those ports ourselves, and nginx tells users apart by the Host header.
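To make "distinguishing by Host" concrete, here is a small sketch of the lookup that a reverse proxy performs. In Notr this job is done by nginx, not by code like this; the function names, domains, and virtual IP addresses below are made up for illustration, and IPv6 literal hosts are not handled.

```python
def host_of(request_head: bytes) -> str:
    """Return the lowercased Host header value without any port suffix."""
    for line in request_head.split(b"\r\n")[1:]:  # skip the request line
        if not line:  # blank line marks the end of the headers
            break
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"host":
            host = value.strip().decode("ascii").lower()
            return host.rsplit(":", 1)[0] if ":" in host else host
    raise ValueError("request has no Host header")


def pick_upstream(request_head: bytes, table: dict):
    """Map the requested host name to an upstream (ip, port), or None."""
    return table.get(host_of(request_head))
```

Because the choice is made per request from the Host header, any number of users can share ports 80 and 443 on the same public address.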

To address the third problem, we started adding features to the Registry module, which was the most important step in Notr's transition from a demo to a production service. We added a user module to Registry; all user registration, login, and key management are implemented there.

To solve the fourth problem, we added a domain-related module, DDNS. Its rules are:

  • If the user is registered, we generate a fixed domain name based on the user name
  • If the user is not registered, we generate a random domain name, so that unregistered users can try it too

In this way, however the underlying server node is switched, the change is transparent to the user: only the DNS record is modified in the background.
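The two naming rules can be sketched like this. The base domain, the `tmp-` prefix, and the length of the random label are assumptions made for the example, not Notr's real scheme.

```python
import secrets

BASE_DOMAIN = "example.com"  # hypothetical base domain


def domain_for(username=None):
    """Fixed name for registered users, random name for anonymous ones."""
    if username:
        label = username.lower()  # the same user always gets the same domain
    else:
        label = "tmp-" + secrets.token_hex(4)  # 8 random hex characters
    return f"{label}.{BASE_DOMAIN}"


def repoint(records: dict, domain: str, server_ip: str):
    """Switching server nodes only rewrites the record behind the name."""
    records[domain] = server_ip
```

For example, `domain_for("alice")` yields `alice.example.com` on every call, while `domain_for()` yields a fresh name each time.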

The sequence diagram for this version is shown below:

At this point the version was almost ready for others to try, and that is when problems with the software began to surface:

  • Local services need to listen on 0.0.0.0, because traffic actually arrives at the IP address of the local virtual network card, whereas many web containers listen on 127.0.0.1 by default
  • Windows requires a TAP driver, which puts many people off
  • The program needs administrator rights to start; many users are very security-conscious, and this put off another portion of them

These problems persisted for a long time. Finally, during the Spring Festival holiday, I reworked the program and eliminated them completely.

The version that does not use a virtual NIC

All the problems of the previous version came from using a virtual network card to build a virtual LAN. Because of the virtual network card, Windows needed the TAP driver installed, and administrator rights were required to start the program.

In the end we decided to remove the virtual network card while changing almost nothing externally. Although the virtual network card is gone, internal communication from server to client can still use the virtual IP address, so the nginx reverse proxy layer needs no changes. Compared with the previous version, the client no longer needs to install a virtual network card or run with administrator rights, which should be more convenient for users.

In technical terms, the key change is that we use DNAT to direct all of nginx's reverse proxy traffic to a single TCP port that our server listens on. The server then finds the client that each data flow belongs to and forwards the data to that client. The previous version relied entirely on routing, and this is the biggest difference between the two solutions.
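The text does not say how the server maps a DNAT-ed connection back to a client. One common approach on Linux, assumed here for illustration, is to read the connection's original destination with the netfilter `SO_ORIGINAL_DST` socket option and look up the client session by that virtual IP. The `sessions` table and function names are hypothetical.

```python
import socket
import struct

# Linux netfilter socket option number (from linux/netfilter_ipv4.h).
SO_ORIGINAL_DST = 80


def parse_sockaddr_in(raw: bytes):
    """Decode a 16-byte IPv4 sockaddr_in into (ip, port)."""
    # Layout: 2-byte family (skipped), 2-byte port and 4-byte address in
    # network byte order, then 8 bytes of padding.
    port, addr = struct.unpack("!2xH4s8x", raw[:16])
    return socket.inet_ntoa(addr), port


def original_dst(conn: socket.socket):
    """Linux only: the address the peer dialed before DNAT rewrote it."""
    raw = conn.getsockopt(socket.SOL_IP, SO_ORIGINAL_DST, 16)
    return parse_sockaddr_in(raw)


def find_client(sessions: dict, virtual_ip: str):
    """Look up the client session that owns this virtual IP, if any."""
    return sessions.get(virtual_ip)
```

Under this assumption the server keeps one listening port, and the virtual IP that nginx dialed is enough to decide which client tunnel should receive the data.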

Load Balancing Policies (2019.06.11)

Currently all Notr service nodes are deployed in mainland China. Recently an overseas user, whose client runs in Singapore, wanted to use the service and ran into this problem:

  • When overseas clients communicate with Mainland China, long-distance TCP packet loss is serious and delay is high

According to the user's feedback, latency exceeded 200 milliseconds and SSH sessions lagged noticeably. We urgently brought up a Hong Kong node. However, under the original load balancing strategy the user was assigned to the Hong Kong node only with some probability, with no guarantee. To address this, Notr's load balancing policy gained a new mechanism:

  • Determine the region to which the client belongs based on the client IP address and select the service node in the region

With this change, latency for the overseas user connecting to the overseas node dropped from 245 ms to 43 ms, and the service feels smoother to use.
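A region-aware selection of this kind might look like the sketch below. The prefix table uses documentation address ranges and the node names are invented; a real deployment would more likely consult a GeoIP database. Falling back to all nodes when no regional node exists is also an assumption.

```python
import ipaddress
import random

# Hypothetical prefix-to-region table, filled with documentation ranges.
REGION_PREFIXES = {
    "overseas": [ipaddress.ip_network("203.0.113.0/24")],
    "mainland": [ipaddress.ip_network("198.51.100.0/24")],
}


def region_of(client_ip: str):
    """Return the region whose prefixes contain the client IP, else None."""
    addr = ipaddress.ip_address(client_ip)
    for region, networks in REGION_PREFIXES.items():
        if any(addr in net for net in networks):
            return region
    return None


def pick_node(client_ip: str, nodes: list, rng=random):
    """Prefer nodes in the client's region; otherwise choose among all."""
    region = region_of(client_ip)
    local = [n for n in nodes if n["region"] == region]
    # Assumes nodes is non-empty; rng.choice raises IndexError otherwise.
    return rng.choice(local or nodes)
```

The random choice is kept inside the region, so a Singapore client is always sent to an overseas node whenever one is available.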

I had thought about adding this feature long ago, but at the time most users were in China and there were few foreign users, so the existing behavior was acceptable to most of them. I did not expect overseas users to want the service, so I never developed it. Now the feature has officially launched.

Closing thoughts

There are still many visible problems to solve in Notr, but progress will be relatively slow because my energy is limited.

Some time ago a user sent me a screenshot of his account information. I checked it and found that the account had expired in August 2018. At that time the trial period for each user was one month, so this was one of the first users, from July 2017. I feel happy that I have been able to keep doing this for so long, although I don't know what the result will be.