An overview of distributed file systems
A distributed file system (DFS), also known as a network file system, is a file system that allows files to be shared across a network among multiple hosts, so that users on multiple machines can share files and storage space.
FastDFS is an open source distributed file system written in C. It fully considers redundant backup, load balancing, and linear scaling, and pays attention to high availability, high performance, and similar metrics. Its features include file storage, file synchronization, and file access (such as file upload and download), solving the problems of large-capacity storage and load balancing. It is especially suitable for small and medium files (recommended range: 4KB < file_size < 500MB) and for online file-based services such as photo album websites and video websites.
FastDFS architecture
The FastDFS architecture includes the Tracker Server and the Storage Server. A client requests the Tracker Server for file uploads and downloads; through the tracker's scheduling, a Storage Server ultimately completes the upload or download.
The Tracker Server
The Tracker Server mainly performs scheduling and plays a load-balancing role. It is responsible for managing all storage servers and groups. After each storage server starts, it connects to the tracker, reports its group and other information, and maintains a periodic heartbeat. Based on the storage heartbeat information, the tracker builds a mapping table of group ==> [storage server list].
A tracker manages very little meta information, all of which is kept in memory. Moreover, the meta information on the tracker is generated from the information reported by the storage servers, so the tracker itself does not need to persist any data, which makes it very easy to scale: simply adding tracker machines expands the service into a tracker cluster. Every tracker in the cluster is completely equivalent; all trackers accept the storage servers' heartbeat information, generate metadata, and provide read and write services.
Storage Server
The Storage Server mainly provides capacity and backup services, organized in units of groups. Each group can contain multiple storage servers that back each other up. Group-based storage facilitates application isolation, load balancing, and customization of the number of replicas (the number of storage servers in a group equals the number of replicas in that group). For example, data from different applications can be isolated by storing it in different groups, and applications can be assigned to different groups for load balancing based on their access characteristics. The drawback is that a group's capacity is limited by the storage capacity of a single machine, and when a machine in a group fails, data recovery can only rely on the other machines in the group, which can take a long time.
Each storage server in a group relies on the local file system and can be configured with multiple data storage directories. For example, if 10 disks are mounted at /data/disk1 through /data/disk10, all 10 directories can be configured as the storage server's data directories. When a storage server receives a file write request, it selects one of the storage directories according to the configured rules. To avoid putting too many files in a single directory, when the storage server starts for the first time it creates two levels of subdirectories in each data directory, 256 at each level, 65536 subdirectories in total. A newly written file is routed by hash to one of these subdirectories, and the file data is then stored as a local file in that directory.
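The hash-based routing described above can be sketched as follows. This is an illustration, not FastDFS's actual implementation: the two-hex-digit directory names follow the file paths shown later in this article (e.g. M00/00/00/...), and the choice of CRC32 as the hash is an assumption for the sketch.

```java
import java.util.zip.CRC32;

public class SubdirRouter {
    // Map a file identifier onto one of the 256 * 256 two-level
    // subdirectories, mimicking (not reproducing) FastDFS's routing.
    public static String route(String fileId) {
        CRC32 crc = new CRC32();
        crc.update(fileId.getBytes());
        long h = crc.getValue();
        int level1 = (int) (h % 256);          // first-level directory
        int level2 = (int) ((h / 256) % 256);  // second-level directory
        return String.format("%02X/%02X", level1, level2);
    }

    public static void main(String[] args) {
        // The same id always routes to the same subdirectory.
        System.out.println(route("rBD8EFqVACuAI9mcAAC_ornlYSU088"));
    }
}
```

Because the mapping is deterministic, a file can later be located from its name alone, without any lookup table.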
FastDFS storage policy
To support large capacity, storage nodes (servers) are organized into volumes (or groups). A storage system consists of one or more volumes whose files are independent of each other. The file capacity of all volumes is the total file capacity of the entire storage system. A volume can be composed of one or more storage servers. All files on the storage servers under a volume are the same. Multiple storage servers in a volume provide redundant backup and load balancing.
When a server is added to a volume, the system automatically synchronizes existing files. After the synchronization is complete, the system automatically switches the new server to online services. When the storage space is insufficient or about to be used up, you can dynamically add volumes. You only need to add one or more servers and configure them as a new volume, thus increasing the capacity of the storage system.
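The capacity model described in this section can be made concrete with a small sketch: every server in a group holds a full copy of the group's files, so a group's usable capacity is bounded by its smallest member, while groups are independent and their capacities add up. The numbers below are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class CapacityModel {
    // A group's capacity is limited by its smallest server, because
    // every server in the group holds a full copy of the group's files.
    public static long groupCapacity(long[] serverCapacitiesGB) {
        return Arrays.stream(serverCapacitiesGB).min().orElse(0);
    }

    // The system's capacity is the sum of its groups' capacities,
    // since files in different groups are independent of each other.
    public static long systemCapacity(List<long[]> groups) {
        return groups.stream().mapToLong(CapacityModel::groupCapacity).sum();
    }

    public static void main(String[] args) {
        List<long[]> groups = Arrays.asList(
                new long[]{500, 500},   // group1: two 500 GB servers -> 500 GB
                new long[]{1000, 800}); // group2: limited by the 800 GB server
        System.out.println(systemCapacity(groups)); // prints 1300
    }
}
```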
FastDFS upload process
FastDFS provides users with basic file access interfaces, such as Upload, Download, Append, and Delete, in the form of client libraries.
Each Storage Server periodically sends its status information to the Tracker Server. If there is more than one tracker in the tracker cluster, the relationship between the trackers is fully peer-to-peer, so the client can select any tracker when uploading.
When the tracker receives a client's request to upload a file, it first assigns a group for the file. After the group is selected, the tracker decides which storage server in the group to assign to the client. After a storage server is allocated, the client sends a file write request to it. The storage server allocates a data storage directory for the file, then assigns a fileid, and finally generates a file name based on the above information to store the file.
Select the tracker server
When there is more than one tracker server in the cluster, the client can choose any tracker when uploading a file, because the relationship between trackers is completely peer-to-peer.
Select the group for the storage
When the tracker receives an upload request, it assigns the file a group that can store it. The following group-selection rules are supported: 1. Round robin: take turns among all groups. 2. Specified group: always use a designated group. 3. Load balance: prefer the group with the most free storage space.
Choose the storage server
After the group is selected, the tracker selects a storage server within the group for the client. The following storage-selection rules are supported: 1. Round robin: take turns among the servers in the group. 2. First server ordered by IP: sort by IP address and choose the first. 3. First server ordered by priority: sort by priority (the priority is configured on the storage server).
Choose the storage path
After a storage server is allocated, the client sends a file write request to it, and the storage server allocates a data storage directory for the file. The following rules are supported: Round robin: take turns among the multiple storage directories. Most free space first: the directory with the most free storage space takes precedence.
Generate a fileid
After the storage directory is selected, the storage server generates a fileid for the file by concatenating the storage server's IP address, the file creation time, the file size, the file's CRC32 checksum, and a random number. The resulting binary string is then base64-encoded into a printable string.
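The composition of the fileid can be sketched as follows. Note that real FastDFS packs these fields in a binary layout before encoding; the textual concatenation and the URL-safe base64 variant here are simplifications for illustration.

```java
import java.util.Base64;
import java.util.zip.CRC32;

public class FileIdSketch {
    // Illustrative only: join the ingredients named in the article
    // (IP, creation time, size, CRC32, random number) and base64-encode
    // them into a printable, filesystem-safe string.
    public static String makeFileId(String storageIp, long createTimeSec,
                                    long fileSize, byte[] content, int random) {
        CRC32 crc = new CRC32();
        crc.update(content);
        String raw = storageIp + "|" + createTimeSec + "|" + fileSize
                + "|" + crc.getValue() + "|" + random;
        // URL-safe base64 avoids '/' characters, which would break paths.
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(raw.getBytes());
    }

    public static void main(String[] args) {
        byte[] data = "hello".getBytes();
        System.out.println(makeFileId("192.168.1.190", 1519000000L,
                data.length, data, 42));
    }
}
```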
Select a two-level directory
After selecting a storage directory, the storage assigns a fileID to the file. Each storage directory has two levels of 256 x 256 subdirectories. The storage hashes the file twice based on the fileID, routes the file to one of the subdirectories, and stores the file to the subdirectory with the fileID as the file name.
Generate file name
After a file is stored in a subdirectory, it is considered that the file is successfully stored. Then, a file name is generated for the file. The file name is a combination of group, storage directory, two-level subdirectories, FileID, and file name extension (specified by the client to distinguish file types).
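Putting the pieces together, the final file name can be sketched like this; the M00 store-path marker follows the upload output shown later in this article, and the helper itself is hypothetical.

```java
public class FileNameSketch {
    // Compose the access path the client gets back after an upload:
    // group / store-path marker / level-1 dir / level-2 dir / fileid.ext
    public static String fileName(String group, int storePathIndex,
                                  String level1, String level2,
                                  String fileId, String ext) {
        return String.format("%s/M%02d/%s/%s/%s.%s",
                group, storePathIndex, level1, level2, fileId, ext);
    }

    public static void main(String[] args) {
        System.out.println(fileName("group1", 0, "00", "00",
                "rBD8EFqVACuAI9mcAAC_ornlYSU088", "jpg"));
        // prints group1/M00/00/00/rBD8EFqVACuAI9mcAAC_ornlYSU088.jpg
    }
}
```

Every component of the name is derivable on the storage server itself, which is why FastDFS needs no central index of file locations.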
FastDFS file synchronization
When a file is written to a storage server in a group, the client considers that the file is successfully written. After the storage Server writes the file, the background thread synchronizes the file to other storage servers in the same group.
After each storage writes a file, it also writes a binlog. The binlog does not contain file data, but only file name and other meta information. This binlog is used for background synchronization. Progress is recorded as a timestamp, so it is best to keep the clocks of all servers in the cluster in sync.
The synchronization progress of each storage server is reported to the tracker as part of its metadata, and the tracker uses this progress as a reference when selecting a storage server to serve reads.
For example, suppose a group has three storage servers A, B, and C. A reports that it has synchronized to C all files written before T1, and B reports that it has synchronized to C all files written before T2 (T2 > T1). When the tracker receives this progress information, it takes the smallest value as C's synchronization timestamp, in this case T1 (meaning all data written before T1 has been synchronized to C). Following the same rule, the tracker also derives synchronization timestamps for A and B.
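The rule in this example, where the tracker takes the minimum of the progress values reported by a server's group peers, can be sketched as:

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class SyncTimestamp {
    // For a target server, take the minimum timestamp among the progress
    // values its group peers report for it: everything written before
    // that instant is guaranteed to have reached the target.
    public static long syncTimestampFor(Map<String, Long> peerProgress) {
        return Collections.min(peerProgress.values());
    }

    public static void main(String[] args) {
        Map<String, Long> progressToC = new HashMap<>();
        progressToC.put("A", 100L); // A synced files written before t=100 to C
        progressToC.put("B", 150L); // B synced files written before t=150 to C
        System.out.println(syncTimestampFor(progressToC)); // prints 100
    }
}
```

Taking the minimum is the conservative choice: the tracker can only promise a file is on C if every peer has pushed its writes up to that point.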
FastDFS file download
After a client uploads a file successfully, it receives the file name generated by the storage server, and can then access the file by that name.
As with uploads, the client can select any tracker server when downloading. The client sends a download request to a tracker, specifying the file name. From the file name the tracker parses information such as the file's group, size, and creation time, and then selects a storage server to serve the read request.
FastDFS installation
| Package | Version |
| --- | --- |
| FastDFS | v5.05 |
| libfastcommon | v1.0.7 |
Download and install libfastcommon

- Download:

wget https://github.com/happyfish100/libfastcommon/archive/V1.0.7.tar.gz

- Unpack:

tar -xvf V1.0.7.tar.gz
cd libfastcommon-1.0.7

- Compile and install:

./make.sh
./make.sh install

- Create soft links:

ln -s /usr/lib64/libfastcommon.so /usr/local/lib/libfastcommon.so
ln -s /usr/lib64/libfastcommon.so /usr/lib/libfastcommon.so
ln -s /usr/lib64/libfdfsclient.so /usr/local/lib/libfdfsclient.so
ln -s /usr/lib64/libfdfsclient.so /usr/lib/libfdfsclient.so
Download and install FastDFS
- Download FastDFS:

wget https://github.com/happyfish100/fastdfs/archive/V5.05.tar.gz

- Unpack:

tar -xvf V5.05.tar.gz
cd fastdfs-5.05

- Compile and install:

./make.sh
./make.sh install
Configuring the Tracker Service
After the above installation succeeds, there will be an fdfs directory under /etc containing sample configuration files. Copy the tracker.conf.sample file to tracker.conf and modify it:
cp tracker.conf.sample tracker.conf
vi tracker.conf
Edit tracker.conf:

# Whether this config file is disabled; false means it takes effect
disabled=false
# Port on which the tracker service listens
port=22122
# Tracker data and log directory
base_path=/home/data/fastdfs
# HTTP service port
http.server_port=80
Create the base data directory for tracker, the directory corresponding to base_path
mkdir -p /home/data/fastdfs
Use ln -s to establish a soft link
ln -s /usr/bin/fdfs_trackerd /usr/local/bin
ln -s /usr/bin/stop.sh /usr/local/bin
ln -s /usr/bin/restart.sh /usr/local/bin
Start the service
service fdfs_trackerd start
Check the listening ports:

netstat -unltp | grep fdfs

If port 22122 is being listened on, the Tracker service started successfully.
Tracker server directory and file structure: after the Tracker service starts successfully, the data and logs directories are created under base_path. The directory structure is as follows:
${base_path}
|__data
|   |__storage_groups.dat: storage group information
|   |__storage_servers.dat: storage server list
|__logs
|   |__trackerd.log: tracker server log file
Configuring Storage Services
Go to the /etc/fdfs directory, copy the FastDFS storage sample configuration file storage.conf.sample, and name it storage.conf
# cd /etc/fdfs
# cp storage.conf.sample storage.conf
# vi storage.conf
Edit storage.conf:

# Whether this config file is disabled; false means it takes effect
disabled=false
# Storage server group (volume) name
group_name=group1
# Storage server service port
port=23000
# Heartbeat interval, in seconds (the storage actively sends heartbeats to the tracker server)
heart_beat_interval=30
# Base data directory; it must exist, and subdirectories are created automatically
base_path=/home/data/fastdfs/storage
# Number of base paths for storing files; usually only one directory is configured
store_path_count=1
# Configure store_path_count paths one by one, with 0-based index numbers.
# If store_path0 is not configured, it defaults to base_path.
store_path0=/home/data/fastdfs/storage
# FastDFS uses two levels of directories to store files.
# If set to N (e.g. 256), the storage server automatically creates N * N
# subdirectories under store_path for storing files when it first runs.
subdir_count_per_path=256
# tracker_server list; the storage actively connects to each tracker server.
# If there are multiple tracker servers, write one line per server.
tracker_server=192.168.1.190:22122
# Time window during which synchronization is allowed (default: all day),
# used to avoid synchronization during peak hours.
sync_start_time=00:00
sync_end_time=23:59
Use ln -s to establish a soft link
ln -s /usr/bin/fdfs_storaged /usr/local/bin
Start the service
service fdfs_storaged start
Check the listening ports:

netstat -unltp | grep fdfs

Make sure the Tracker is running before starting Storage. On the first successful start, the data and logs directories are created under /home/data/fastdfs/storage. If port 23000 is being listened on, the Storage service started successfully.
Check whether the Storage and Tracker are communicating
/usr/bin/fdfs_monitor /etc/fdfs/storage.conf
FastDFS configures the Nginx module
| Package | Version |
| --- | --- |
| openresty | v1.13.6.1 |
| fastdfs-nginx-module | v1.1.6 |
FastDFS uses the Tracker server to store files on the Storage server. However, files need to be replicated between Storage servers in the same group, resulting in synchronization delay.
Suppose the tracker directs an upload to the storage server 192.168.1.190 and the file ID is returned to the client. FastDFS's storage cluster mechanism then synchronizes the file to the other storage servers in the group in the background. If a client uses this file ID to fetch the file from a server that has not yet finished synchronizing, the file cannot be accessed. fastdfs-nginx-module redirects such a request to the source server to retrieve the file, avoiding file-access failures on the client caused by replication delay.
Install nginx and fastdfs-nginx-module:
It is recommended that you install the following development libraries using yum:
yum install readline-devel pcre-devel openssl-devel -y
Download the latest versions and unpack them:
wget https://openresty.org/download/openresty-1.13.6.1.tar.gz
tar -xvf openresty-1.13.6.1.tar.gz
wget https://github.com/happyfish100/fastdfs-nginx-module/archive/master.zip
unzip master.zip
Install nginx and add the fastdfs-nginx-module module:
./configure --add-module=../fastdfs-nginx-module-master/src/
Compile, install:
make && make install
View Nginx modules:
/usr/local/openresty/nginx/sbin/nginx -V
If the fastdfs-nginx-module path appears in the configure arguments of the output, the module was added successfully.
Copy the fastdfs-nginx-module configuration file to /etc/fdfs and modify it:
cp /fastdfs-nginx-module/src/mod_fastdfs.conf /etc/fdfs/
# Connection timeout
connect_timeout=10
# Tracker server address
tracker_server=192.168.1.190:22122
# Storage server default port
storage_server_port=23000
# Set to true if the URI of the file ID contains /group**
url_have_group_name = true
# Must be the same as the store_path0 configured in storage.conf
store_path0=/home/data/fastdfs/storage
Copy some FastDFS configuration files to /etc/fdfs directory:
cp /fastdfs-nginx-module/src/http.conf /etc/fdfs/
cp /fastdfs-nginx-module/src/mime.types /etc/fdfs/
Configure nginx, modify nginx.conf:
location ~/group([0-9])/M00 {
ngx_fastdfs_module;
}
Nginx start:
[root@iz2ze7tgu9zb2gr6av1tysz sbin]# ./nginx
ngx_http_fastdfs_set pid=9236
Test upload:
[root@iz2ze7tgu9zb2gr6av1tysz fdfs]# /usr/bin/fdfs_upload_file /etc/fdfs/client.conf /etc/fdfs/4.jpg
group1/M00/00/00/rBD8EFqVACuAI9mcAAC_ornlYSU088.jpg
Deployment structure diagram:
Java client integration
Add the dependency to pom.xml:
<!-- fastdfs -->
<dependency>
    <groupId>org.csource</groupId>
    <artifactId>fastdfs-client-java</artifactId>
    <version>1.27</version>
</dependency>
fdfs_client.conf configuration:

# Timeout for connecting to the tracker server, in seconds
connect_timeout = 2
# Socket (network) timeout, in seconds
network_timeout = 30
# File content charset
charset = UTF-8
# Tracker server HTTP port
http.tracker_http_port = 8080
http.anti_steal_token = no
http.secret_key = FastDFS1234567890
# Tracker server IP address and port
tracker_server = 192.168.1.190:22122
FastDFSClient upload class:
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

import org.apache.commons.io.IOUtils;
import org.csource.common.MyException;
import org.csource.common.NameValuePair;
import org.csource.fastdfs.ClientGlobal;
import org.csource.fastdfs.StorageClient;
import org.csource.fastdfs.StorageServer;
import org.csource.fastdfs.TrackerClient;
import org.csource.fastdfs.TrackerServer;

public class FastDFSClient {
private static final String CONFIG_FILENAME = "D:\\itstyle\\src\\main\\resources\\fdfs_client.conf";
private static final String GROUP_NAME = "market1";
private TrackerClient trackerClient = null;
private TrackerServer trackerServer = null;
private StorageServer storageServer = null;
private StorageClient storageClient = null;
static{
try {
ClientGlobal.init(CONFIG_FILENAME);
} catch (IOException e) {
e.printStackTrace();
} catch (MyException e) {
e.printStackTrace();
}
}
public FastDFSClient() throws Exception {
trackerClient = new TrackerClient(ClientGlobal.g_tracker_group);
trackerServer = trackerClient.getConnection();
storageServer = trackerClient.getStoreStorage(trackerServer);
storageClient = new StorageClient(trackerServer, storageServer);
}
/**
* Upload a file
* @param file file object
* @param fileName the file name
* @return
*/
public String[] uploadFile(File file, String fileName) {
return uploadFile(file,fileName,null);
}
/**
* Upload a file
* @param file file object
* @param fileName the file name
* @param metaList file metadata
* @return
*/
public String[] uploadFile(File file, String fileName, Map<String,String> metaList) {
try {
byte[] buff = IOUtils.toByteArray(new FileInputStream(file));
NameValuePair[] nameValuePairs = null;
if (metaList != null) {
nameValuePairs = new NameValuePair[metaList.size()];
int index = 0;
for (Iterator<Map.Entry<String,String>> iterator = metaList.entrySet().iterator(); iterator.hasNext();) {
Map.Entry<String,String> entry = iterator.next();
String name = entry.getKey();
String value = entry.getValue();
nameValuePairs[index++] = new NameValuePair(name,value);
}
}
return storageClient.upload_file(GROUP_NAME,buff,fileName,nameValuePairs);
} catch (Exception e) {
e.printStackTrace();
}
return null;
}
/**
* Get file metadata
* @param fileId the file ID
* @return
*/
public Map<String,String> getFileMetadata(String groupname,String fileId) {
try {
NameValuePair[] metaList = storageClient.get_metadata(groupname,fileId);
if (metaList != null) {
HashMap<String,String> map = new HashMap<String, String>();
for (NameValuePair metaItem : metaList) {
map.put(metaItem.getName(),metaItem.getValue());
}
return map;
}
} catch (Exception e) {
e.printStackTrace();
}
return null;
}
/**
* Delete a file
* @param fileId the file ID
* @return -1 on delete failure, 0 otherwise
*/
public int deleteFile(String groupname,String fileId) {
try {
return storageClient.delete_file(groupname,fileId);
} catch (Exception e) {
e.printStackTrace();
}
return -1;
}
/**
* Download a file
* @param fileId the file ID (returned after a successful upload)
* @param outFile the download destination
* @return
*/
public int downloadFile(String groupName,String fileId, File outFile) {
FileOutputStream fos = null;
try {
byte[] content = storageClient.download_file(groupName,fileId);
fos = new FileOutputStream(outFile);
InputStream ips = new ByteArrayInputStream(content);
IOUtils.copy(ips,fos);
return 0;
} catch (Exception e) {
e.printStackTrace();
} finally {
if (fos != null) {
try {
fos.close();
} catch (IOException e) {
e.printStackTrace();
}
}
}
return -1;
}
public static void main(String[] args) throws Exception {
FastDFSClient client = new FastDFSClient();
File file = new File("D:\\23456.png");
String[] result = client.uploadFile(file, "png");
System.out.println(result.length);
System.out.println(result[0]);
System.out.println(result[1]);
}
}
Executing the main method returns:
2
group1
M00/00/00/rBD8EFqTrNyAWyAkAAKCRJfpzAQ227.png
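The two strings returned by the upload (the group name and the remote file name) are typically joined with the host running the fastdfs-nginx-module to form a download URL. A minimal sketch, assuming the nginx host from this article's configuration:

```java
public class DownloadUrl {
    // upload_file returns {groupName, remoteFileName}; join them with the
    // nginx host serving the fastdfs-nginx-module to get a download URL.
    public static String toUrl(String nginxHost, String[] uploadResult) {
        return "http://" + nginxHost + "/" + uploadResult[0] + "/" + uploadResult[1];
    }

    public static void main(String[] args) {
        String[] result = {"group1", "M00/00/00/rBD8EFqTrNyAWyAkAAKCRJfpzAQ227.png"};
        System.out.println(toUrl("192.168.1.190", result));
        // prints http://192.168.1.190/group1/M00/00/00/rBD8EFqTrNyAWyAkAAKCRJfpzAQ227.png
    }
}
```

This matches the nginx location block configured earlier, which routes /group([0-9])/M00 requests to ngx_fastdfs_module.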
Source:
https://gitee.com/52itstyle/spring-boot-fastdfs