[Disclaimer: This content is entirely a summary of personal experience. Any similarity is pure plagiarism.]

I. Problem scenario

1. A user back-end service sits behind Nginx for load balancing and is deployed on two servers, A and B. Both servers can handle profile picture uploads.

2. On the first request, the user uploads a profile picture through server A, and the file is saved on the disk of server A, as shown in Figure A-1.

3. On the second request, the user is routed to server B to load the profile picture. The request fails because the file does not exist on the disk of server B. This is a typical stateful-service problem, as shown in Figure A-2.

Figure A-1

Figure A-2
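
The failure described above can be reproduced with a small simulation. This is only a sketch under stated assumptions: the Server class, the file name, and the use of itertools.cycle as a stand-in for Nginx round-robin forwarding are all hypothetical and not part of the original service.

```python
import itertools
import tempfile
from pathlib import Path
from typing import Optional, Tuple


class Server:
    """A toy back-end node that keeps avatars in its own disk directory."""

    def __init__(self, name: str, disk_dir: Path) -> None:
        self.name = name
        self.disk_dir = disk_dir

    def upload(self, filename: str, data: bytes) -> None:
        (self.disk_dir / filename).write_bytes(data)

    def load(self, filename: str) -> Optional[bytes]:
        path = self.disk_dir / filename
        return path.read_bytes() if path.exists() else None


def demo_local_disks() -> Tuple[str, Optional[bytes]]:
    """Upload through one node, then load through the next node in rotation."""
    with tempfile.TemporaryDirectory() as disk_a, \
            tempfile.TemporaryDirectory() as disk_b:
        servers = [Server("A", Path(disk_a)), Server("B", Path(disk_b))]
        # Hypothetical stand-in for the load balancer: alternate A, B, A, B...
        round_robin = itertools.cycle(servers)

        next(round_robin).upload("avatar.png", b"fake image bytes")  # lands on A
        second = next(round_robin)                                   # lands on B
        return second.name, second.load("avatar.png")


if __name__ == "__main__":
    name, data = demo_local_disks()
    print(f"server {name} returned: {data!r}")
```

Because each node only looks at its own directory, the second request comes back empty even though the upload succeeded.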

II. Solution

Solution 1: Keep a full, consistent copy of the files on both servers.

Solution 2: Store and retrieve all files through the same storage node.

If the first approach is adopted, the disk data of servers A and B must be synchronized with each other. However, this approach is inefficient, introduces delays, and can cause synchronization storms. For example, with 10 service nodes, it becomes a disaster if all 10 nodes need to synchronize with each other.
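
To put a rough number on the 10-node example, the sketch below counts one-way synchronization channels. It assumes a full mesh in which every node pushes its changes directly to every other node; the text does not specify the topology, so this is an illustration rather than a measurement.

```python
def sync_links(nodes: int) -> int:
    """One-way sync channels when every node pushes to every other node."""
    return nodes * (nodes - 1)


if __name__ == "__main__":
    for n in (2, 3, 10):
        print(f"{n} nodes -> {sync_links(n)} one-way sync channels")
```

Under this assumption, two nodes need only 2 channels, but 10 nodes need 90, and every uploaded file is also stored 10 times.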

If you take the second approach, there are four options:

1. Encode the avatar as Base64 and store it directly in the database. However, this takes up too much database storage and makes migration and backup inconvenient.

2. Modify the Nginx forwarding policy so that all file upload and file load requests are forwarded to the same server. However, that single node then carries too much load, which defeats the purpose of load balancing and high availability.

3. Share a storage node. The disk directories of servers A and B are mapped to the same storage device.

4. Share a storage service. The idea is to replace the shared storage device with a shared storage service, such as the OSS object storage service.
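
One concrete reason option 1 uses so much database storage is that Base64 turns every 3 bytes of input into 4 output characters, so the encoded avatar is about a third larger than the original file. A minimal sketch follows; the 300,000-byte size is an arbitrary example, not a figure from the original service.

```python
import base64


def base64_size(raw_size: int) -> int:
    """Length of the Base64 encoding of raw_size bytes of data."""
    return len(base64.b64encode(bytes(raw_size)))


if __name__ == "__main__":
    raw = 300_000  # arbitrary example size in bytes
    encoded = base64_size(raw)
    print(f"raw: {raw} bytes, Base64: {encoded} bytes, ratio: {encoded / raw:.2f}")
```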

III. Implementation Plan

Before explaining the solution, you need to know what NAS and NFS are. The following definitions are from Baidu Encyclopedia.

NAS (Network Attached Storage) is a device that is connected to the network and stores data. In effect, it is a dedicated data storage server. It is data-centric: it separates storage devices from servers and manages data centrally, which frees up bandwidth, improves performance, lowers total cost of ownership, and protects investment. Its cost is much lower than that of server-attached storage, and its efficiency is much higher. Well-known international NAS vendors include NetApp, EMC, and OUO.

NFS (Network File System) is a UNIX presentation-layer protocol developed by Sun Microsystems. It lets users access files over the network as if the files were on their own computer.

Options 3 and 4 are clearly the better choices, and option 3 is the cheaper of the two.

We use option 3, a shared storage node. As shown in the following figure, NFS is used to mount a directory on the NAS storage at a local directory on servers A and B.

In this case, service A and service B still operate on what looks like their own disk directory when uploading files. The reads and writes actually go over the network to the NAS storage, so both servers see the same files. In an intranet environment, the added delay is usually negligible.

The advantage of this approach is that it is transparent to the operating system, transparent to the developer, and does not require any changes to the program.

The developer is completely unaware of the existence of NFS and simply works with what appear to be local files, while the files end up shared between the servers.
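
The sketch below illustrates that transparency. It is a simulation under stated assumptions: a single temporary directory stands in for the NAS export that both servers would mount at the same local path, and save_avatar and load_avatar are hypothetical helpers, not functions from the original service.

```python
import tempfile
from pathlib import Path
from typing import Optional


def save_avatar(upload_dir: Path, filename: str, data: bytes) -> None:
    """Plain local-file write; nothing here knows about NFS."""
    (upload_dir / filename).write_bytes(data)


def load_avatar(upload_dir: Path, filename: str) -> Optional[bytes]:
    """Plain local-file read; returns None when the file is missing."""
    path = upload_dir / filename
    return path.read_bytes() if path.exists() else None


def demo_shared_directory() -> Optional[bytes]:
    # One temporary directory stands in for the NAS export that both
    # servers would mount at the same local path.
    with tempfile.TemporaryDirectory() as shared:
        dir_on_a = Path(shared)  # what server A sees
        dir_on_b = Path(shared)  # what server B sees: the same files
        save_avatar(dir_on_a, "avatar.png", b"fake image bytes")
        return load_avatar(dir_on_b, "avatar.png")


if __name__ == "__main__":
    print(demo_shared_directory())
```

The application code is ordinary file I/O, and only the directory behind the path changes. One design consideration, offered as a suggestion rather than part of the original setup: if the mount is missing, the same code would silently write to the local disk again, so a startup check such as os.path.ismount on the upload directory may be worth adding.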

There are many more uses for NFS, which we’ll talk about next time.