FastDFS overview

FastDFS is an open-source, lightweight distributed file system written in C that runs on Unix-like operating systems such as Linux and BSD. Note that FastDFS is not a general-purpose file system: it can only be accessed through its dedicated API. Tailor-made for Internet applications, FastDFS addresses large-capacity file storage and aims for high performance and high scalability. The main concepts of FastDFS:

  • Tracker server: the tracking server. It tracks files and is mainly responsible for scheduling. It keeps the status of all storage groups and storage servers in memory and acts as the hub between clients and storage servers. It is leaner than GFS because it records no file index.

  • Storage server: the storage server. It stores the files and uses the operating system's file system directly to manage and organize them.

  • Group: a group, also called a volume. A group contains multiple servers that all store exactly the same files, and the servers within a group are peers. File operations can be performed on any server in the group.

  • Metadata: information about a file, stored as key-value pairs.



Comparison of storage systems

As the saying goes, shortcomings only show up in comparison. FastDFS is not a panacea; choose a storage system that fits your business needs.

| Storage system | Suitable file types | File distribution | System performance | Complexity | FUSE (user-space file system) | POSIX | Backup mechanism | Communication protocol / interface | Community | Implementation language |
|---|---|---|---|---|---|---|---|---|---|---|
| FastDFS | 4 KB to 500 MB | Small files merged for storage | High | Simple | Not supported | Not supported | Redundant backup within a group | HTTP API | Mainly users in China | C |
| TFS | All files | Small files merged and organized into blocks | - | Complex | Not supported | Not supported | Block storage with primary/standby disaster recovery | HTTP API | Small | C++ |
| MFS | Larger than 64 KB | Sharded storage | Master node uses a lot of memory | - | Supported | Supported | Multi-point backup, dynamic redundancy | Mounted via FUSE | Large | Perl |
| HDFS | Large files | Large files split into blocks | - | Simple | Supported | Supported | Multiple replicas | Native API | Large | Java |
| Ceph | Objects, large files | OSD with one primary and multiple secondaries | - | Complex | Supported | Supported | Multiple replicas | Native API | Small | C++ |
| MogileFS | Large numbers of small images | - | High | Complex | Supported | Not supported | Dynamic redundancy | Native API | Sparse documentation | Perl |
| ClusterFS | Large files | - | - | Simple | Supported | Supported | - | - | Large | C |

FastDFS client-server interaction



FastDFS + Nginx integration

  • Architecture diagram



  • Install FastDFS
mkdir /source
cd /source
yum install -y gcc gcc-c++ make cmake wget libevent
wget https://github.com/happyfish100/libfastcommon/archive/V1.0.35.tar.gz
wget https://github.com/happyfish100/fastdfs/archive/V5.10.tar.gz
tar -zxvf V1.0.35.tar.gz
tar -zxvf V5.10.tar.gz
cd libfastcommon-1.0.35
./make.sh
./make.sh install
ln -s /usr/lib64/libfastcommon.so /usr/local/lib/libfastcommon.so
cd ../
cd fastdfs-5.10/
./make.sh
./make.sh install
cd ../
rm -rf libfastcommon-1.0.35
rm -rf fastdfs-5.10
cp /etc/fdfs/tracker.conf.sample /etc/fdfs/tracker.conf
cp /etc/fdfs/storage.conf.sample /etc/fdfs/storage.conf
cp /etc/fdfs/client.conf.sample /etc/fdfs/client.conf
mkdir -p /data/fdfs/tracker
mkdir -p /data/fdfs/storage
ln -s /usr/bin/stop.sh /usr/local/bin/stop.sh
ln -s /usr/bin/restart.sh /usr/local/bin/restart.sh
  • Modifying a Configuration File

Edit the tracker configuration file (/etc/fdfs/tracker.conf):

base_path=/data/fdfs/tracker

Edit the storage configuration file (/etc/fdfs/storage.conf):

base_path=/data/fdfs/storage
store_path0=/data/fdfs/storage
tracker_server=192.168.80.3:22122

Edit the client configuration file (/etc/fdfs/client.conf):

base_path=/data/fdfs/client
tracker_server=192.168.80.3:22122
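
After editing the three files it can help to print the values that are actually in effect. The following is only a convenience sketch, not part of FastDFS: `conf_get` is a hypothetical helper name, and it assumes plain `key = value` lines, skipping every line that starts with `#`.

```shell
# conf_get KEY FILE
# Print the value of the first "KEY = value" line in a FastDFS-style config file.
# Hypothetical helper for illustration; it is not shipped with FastDFS.
conf_get() {
    awk -v key="$1" '
        /^[ \t]*#/ { next }                 # skip comments and #include directives
        index($0, "=") == 0 { next }        # skip lines without "="
        {
            k = substr($0, 1, index($0, "=") - 1)
            v = substr($0, index($0, "=") + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", k)  # trim whitespace around the key
            gsub(/^[ \t]+|[ \t]+$/, "", v)  # trim whitespace around the value
            if (k == key) { print v; exit }
        }
    ' "$2"
}

# Usage, with the paths configured above:
#   conf_get tracker_server /etc/fdfs/storage.conf
#   conf_get base_path /etc/fdfs/tracker.conf
```

If the key is missing the helper prints nothing, which makes it easy to compare `tracker_server` across storage.conf and client.conf in a script.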
  • Start the services
/etc/init.d/fdfs_trackerd start
/etc/init.d/fdfs_storaged start
netstat -tunlap | grep :22122
tcp        0      0 0.0.0.0:22122           0.0.0.0:*               LISTEN      7247/fdfs_trackerd
tcp        0      0 192.168.80.3:22122      192.168.80.3:39318      ESTABLISHED 7247/fdfs_trackerd
tcp        0      0 192.168.80.3:39318      192.168.80.3:22122      ESTABLISHED 7444/fdfs_storaged

Check the service status after startup. If active (exited) is displayed, restart the service.
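
The netstat check above can also be done in a script rather than by eye. This is a minimal sketch, assuming the `netstat -tunlap` column layout shown above (local address in column 4, state in column 6); `port_is_listening` is a hypothetical helper name.

```shell
# port_is_listening PORT
# Reads `netstat -tunlap` output on stdin and succeeds if some socket is in
# LISTEN state on PORT. Hypothetical helper; assumes the usual netstat columns.
port_is_listening() {
    awk -v port="$1" '
        $6 == "LISTEN" && $4 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# Usage:
#   netstat -tunlap | port_is_listening 22122 && echo "tracker is listening"
```

An ESTABLISHED connection to port 22122 is deliberately not counted, so the check only passes when the tracker itself is accepting connections.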

  • Test
/usr/bin/fdfs_upload_file /etc/fdfs/client.conf /source/FastDFS_v5.05.tar.gz
group1/M00/00/00/wKhQA1ysjSGAPjXbAAVFOL7FJU4.tar.gz

The file is stored at:

ll /data/fdfs/storage/data/00/00/wKhQA1ysjSGAPjXbAAVFOL7FJU4.tar.gz
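
Comparing the returned file ID with the on-disk location shows how the two relate: the group name and the `M00` marker are dropped, and the rest is appended to the `data` directory under the store path. The sketch below encodes exactly that observation. `fdfs_local_path` is a hypothetical helper, and it assumes that `M00` corresponds to `store_path0`, as it does in this single-path setup; with several store paths the caller has to pass the matching one.

```shell
# fdfs_local_path FILE_ID STORE_PATH
# Map a file ID such as group1/M00/00/00/<name> to its path on the storage server.
# Hypothetical helper; assumes STORE_PATH is the store path the M00 marker refers to.
fdfs_local_path() {
    local file_id="$1" store_path="$2"
    local rest="${file_id#*/}"   # drop the group name  -> M00/00/00/<name>
    rest="${rest#*/}"            # drop the M00 marker  -> 00/00/<name>
    printf '%s/data/%s\n' "$store_path" "$rest"
}

# Usage:
#   fdfs_local_path group1/M00/00/00/wKhQA1ysjSGAPjXbAAVFOL7FJU4.tar.gz /data/fdfs/storage
```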
  • Install Nginx and configure the module
# Install the PCRE (Perl-compatible regular expressions) library, which Nginx needs so that the rewrite module can provide URL rewriting.
yum install pcre pcre-devel perl-ExtUtils-Embed -y
# Install openssl-devel so that Nginx can serve HTTPS.
yum install openssl-devel -y
# Download the source packages
cd /source
wget http://59.80.44.46/nginx.org/download/nginx-1.14.2.tar.gz
wget http://nchc.dl.sourceforge.net/project/fastdfs/FastDFS%20Nginx%20Module%20Source%20Code/fastdfs-nginx-module_v1.16.tar.gz
# Unpack the archives
tar xvf nginx-1.14.2.tar.gz
tar xvf fastdfs-nginx-module_v1.16.tar.gz
# Create the necessary symbolic links
ln -s /usr/include/fastdfs/ /usr/local/include/fastdfs
ln -s /usr/include/fastcommon/ /usr/local/include/fastcommon
cp FastDFS/conf/http.conf /etc/fdfs/
cp FastDFS/conf/mime.types /etc/fdfs/
cp fastdfs-nginx-module/src/mod_fastdfs.conf /etc/fdfs/
mkdir -p /data/fdfs/fastdfs-nginx-module

Edit the configuration file /etc/fdfs/mod_fastdfs.conf:

connect_timeout=10
base_path=/data/fdfs/fastdfs-nginx-module
# Tracker server address
tracker_server=192.168.80.3:22122
url_have_group_name=true
store_path0=/data/fdfs/storage

Build and install Nginx with the FastDFS module:

mkdir /applications
mkdir /tmp/nginx
useradd nginx -s /sbin/nologin -M
cd nginx-1.14.2
./configure \
  --user=nginx \
  --group=nginx \
  --prefix=/applications/nginx-1.14.2 \
  --with-http_ssl_module \
  --with-http_gzip_static_module \
  --http-client-body-temp-path=/tmp/nginx/client \
  --http-proxy-temp-path=/tmp/nginx/proxy \
  --http-fastcgi-temp-path=/tmp/nginx/cgi \
  --with-poll_module \
  --with-file-aio \
  --with-http_realip_module \
  --with-http_addition_module \
  --with-http_random_index_module \
  --with-pcre \
  --with-http_stub_status_module \
  --with-stream \
  --add-module=/source/fastdfs-nginx-module/src
make
make install
ln -s /applications/nginx-1.14.2 /applications/nginx
chown -R nginx.nginx /applications/nginx
chown -R nginx.nginx /applications/nginx-1.14.2/
chown -R nginx.nginx /tmp/nginx/

Edit the configuration file /applications/nginx/conf/nginx.conf and add the following server block inside the http block:

server {
        listen       8000;
        server_name  media;

        location / {
            root   html;
            index  index.html index.htm;
        }

        # Intercept file requests and hand them to the FastDFS module
        location ~/group[0-9] {
            ngx_fastdfs_module;
        }

        error_page   500 502 503 504  /50x.html;
        location = /50x.html {
            root   html;
        }
}

Check that the configuration file is valid, then start Nginx:

/applications/nginx/sbin/nginx -t
/applications/nginx/sbin/nginx

Delete the temporary files:

rm -rf FastDFS
rm -rf fastdfs-nginx-module
rm -rf libfastcommon-1.0.35
rm -rf nginx-1.14.2

To test file download through Nginx, open the following URL in a browser: http://192.168.80.3:8000/group1/M00/00/00/wKhQA1ysjSGAPjXbAAVFOL7FJU4.tar.gz

A single Nginx instance can serve the data of only one storage server. With several storage servers you therefore need one Nginx instance per storage server, and requests must then be routed to the right instance based on the group ID in the request path.
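
The routing decision itself is small: take the first path segment and look up which backend holds that group. The sketch below illustrates only that decision. In a real deployment it would live in the front-end proxy configuration rather than in a shell function; `backend_for_path` is a hypothetical name, and the address for `group2` is a placeholder that does not appear in this setup.

```shell
# backend_for_path REQUEST_PATH
# Print the backend that should serve a request, chosen by the group in the path.
# Hypothetical helper; the group2 address is a made-up placeholder.
backend_for_path() {
    local path="${1#/}"          # strip the leading slash
    local group="${path%%/*}"    # first path segment, e.g. group1
    case "$group" in
        group1) echo "192.168.80.3:8000" ;;
        group2) echo "192.168.80.4:8000" ;;   # placeholder second storage server
        *)      return 1 ;;                   # unknown group: no backend
    esac
}

# Usage:
#   backend_for_path /group1/M00/00/00/wKhQA1ysjSGAPjXbAAVFOL7FJU4.tar.gz
```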

A simple script to start and stop the service:

# start-service
# usage: ./start-service
echo "============================================Sync datetime==========================================="
ntpdate time7.aliyun.com
echo "===========================================Getting status==========================================="
/etc/init.d/fdfs_trackerd status
/etc/init.d/fdfs_storaged status
echo "==========================================Starting service=========================================="
/etc/init.d/fdfs_storaged start
/etc/init.d/fdfs_trackerd start
echo "===========================================Getting status==========================================="
/etc/init.d/fdfs_trackerd status
/etc/init.d/fdfs_storaged status
echo "=========================================Testing config file========================================"
/applications/nginx/sbin/nginx -t
echo "=========================================Starting web server========================================"
/applications/nginx/sbin/nginx
echo "=======================================Getting network status======================================="
sleep 5s
netstat -tunlap | grep :22122
netstat -tunlap | grep :8000

# stop-service
# usage: ./stop-service
echo "===========================================Getting status==========================================="
/etc/init.d/fdfs_trackerd status
/etc/init.d/fdfs_storaged status
echo "==========================================Stopping service=========================================="
/etc/init.d/fdfs_storaged stop
/etc/init.d/fdfs_trackerd stop
echo "========================================Stopping web server========================================="
kill `cat /applications/nginx/logs/nginx.pid`
echo "===========================================Getting status==========================================="
/etc/init.d/fdfs_trackerd status
/etc/init.d/fdfs_storaged status
echo "=======================================Getting network status======================================="
sleep 5s
netstat -tunlap | grep :22122
netstat -tunlap | grep :8000
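
Both scripts wait a fixed five seconds before checking the ports. One possible refinement is to poll until the check succeeds instead. This is a sketch under that assumption, and `wait_for` is a hypothetical helper that is not part of the original scripts.

```shell
# wait_for ATTEMPTS DELAY COMMAND [ARGS...]
# Run COMMAND until it succeeds, at most ATTEMPTS times, sleeping DELAY seconds
# between tries. Returns 0 on the first success, 1 if every attempt failed.
wait_for() {
    local attempts="$1" delay="$2"
    shift 2
    local i=1
    while [ "$i" -le "$attempts" ]; do
        "$@" && return 0
        i=$((i + 1))
        [ "$i" -le "$attempts" ] && sleep "$delay"
    done
    return 1
}

# Usage in start-service, in place of the fixed sleep:
#   wait_for 10 1 sh -c 'netstat -tunlap | grep -q ":22122 .*LISTEN"'
```

Polling returns as soon as the port is up and fails visibly when it never comes up, which a fixed sleep cannot do.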

Token-based hotlink protection

FastDFS uses tokens to prevent hotlinking. A token is time-limited, and a protected URL carries the file ID, a timestamp (ts), and the token itself. Resources in FastDFS are requested with URLs that include ts and token. FastDFS provides the algorithm for generating tokens, and the extension module validates them. Because tokens are both generated and verified on the server side, the secret key is never exposed to clients. Example link:

http://192.168.1.15:8080/group1/M01/01/01/wKgBD01c15nvKU1cAABAOeCdFS466570.c?token=b32cd06a53dea4376e43d71cc882f9cb&ts=1297930137
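
For quick tests from the command line, a link of this shape can be assembled in the shell. The sketch below rests on an assumption that should be verified against the FastDFS version you deploy: that the token is the MD5 hex digest of the file ID without the group name, followed by the secret key, followed by the timestamp. `fdfs_token` and `fdfs_signed_url` are hypothetical helper names, and `md5sum` from coreutils is assumed to be installed.

```shell
# fdfs_token FILE_ID_WITHOUT_GROUP TS SECRET_KEY
# Print md5(file_id + secret_key + ts) as lowercase hex.
# Assumed token layout; check it against your FastDFS version.
fdfs_token() {
    printf '%s%s%s' "$1" "$3" "$2" | md5sum | cut -d' ' -f1
}

# fdfs_signed_url BASE_URL FILE_ID TS SECRET_KEY
# FILE_ID includes the group name; the group is stripped before hashing.
fdfs_signed_url() {
    local base="$1" file_id="$2" ts="$3" key="$4"
    local token
    token=$(fdfs_token "${file_id#*/}" "$ts" "$key")
    printf '%s/%s?token=%s&ts=%s\n' "$base" "$file_id" "$token" "$ts"
}

# Usage (the secret key is whatever the server is configured with):
#   fdfs_signed_url http://192.168.80.3:8000 group1/M00/00/00/<name> "$(date +%s)" "$SECRET_KEY"
```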

Edit /etc/fdfs/http.conf:

# Enable token verification
http.anti_steal.check_token=true
# Token lifetime (TTL) is 240 seconds
http.anti_steal.token_ttl=240
# Secret key; one can be generated with: openssl rand -base64 64
http.anti_steal.secret_key=2scPwMPctXhbLVOYB0jyuyQzytOofmFCBIYe65n56PPYVWrntxzLIDbPdvDDLJM8QHhKxSGWTcr+9VdG3yptkw
# File returned when token verification fails
http.anti_steal.token_check_fail=/data/fdfs/error.svg
  • Verify with the Java client:

Install the client to the local repository:

git clone https://github.com/happyfish100/fastdfs-client-java.git
cd fastdfs-client-java
mvn clean install

Create an ordinary Maven project and add the dependency to the POM file:

<dependencies>
    <dependency>
        <groupId>org.csource</groupId>
        <artifactId>fastdfs-client-java</artifactId>
        <version>1.27-SNAPSHOT</version>
    </dependency>
</dependencies>

Create FastDFS config file fastdfs-client.properties in resources directory:

fastdfs.connect_timeout_in_seconds = 5
fastdfs.network_timeout_in_seconds = 30
fastdfs.charset = UTF-8
fastdfs.http_anti_steal_token = true
fastdfs.http_secret_key = 2scPwMPctXhbLVOYB0jyuyQzytOofmFCBIYe65n56PPYVWrntxzLIDbPdvDDLJM8QHhKxSGWTcr+9VdG3yptkw
fastdfs.http_tracker_http_port = 8080
fastdfs.tracker_servers = 192.168.80.3:22122

Create the class com.bluemiaomiao.Demo (Demo.java):

package com.bluemiaomiao;

import org.csource.common.MyException;
import org.csource.fastdfs.ClientGlobal;
import org.csource.fastdfs.ProtoCommon;

import java.io.IOException;
import java.security.NoSuchAlgorithmException;
import java.util.Properties;

public class Demo {
    public static void main(String[] args) throws IOException, MyException, NoSuchAlgorithmException {
        // Load the configuration file
        Properties prop = new Properties();
        prop.load(Demo.class.getResourceAsStream("/fastdfs-client.properties"));
        ClientGlobal.initByProperties(prop);
        System.out.println(ClientGlobal.configInfo());

        // File ID returned by the upload tool; normally this is stored in a database
        String remoteFileName = "group1/M00/00/00/wKhQA1ysjSGAPjXbAAVFOL7FJU4.tar.gz";
        // Current timestamp in seconds
        int ts = (int) (System.currentTimeMillis() / 1000);
        // Generate the token; the file ID passed in must not include the group name
        String token = ProtoCommon.getToken("M00/00/00/wKhQA1ysjSGAPjXbAAVFOL7FJU4.tar.gz", ts, prop.getProperty("fastdfs.http_secret_key"));
        // Open the printed URL in a browser
        System.out.println("http://192.168.80.3:8000/" + remoteFileName + "?token=" + token + "&ts=" + ts);
    }
}

If the hotlink-protection image is shown when you open the URL, the clocks of the test client and the server are probably out of sync; the difference between the two hosts must not reach the order of minutes. You can synchronize the server clock as follows:

# Install the time synchronization client and sync with the time server. A Windows client must synchronize with the same server:
# Control Panel -> Clock and Region -> Set the time and date -> Internet Time -> Change settings
yum install ntpdate
ntpdate time7.aliyun.com
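
Why clock skew matters can be seen with a little arithmetic. The sketch below assumes the server rejects a token once the server time minus `ts` exceeds the configured TTL (240 seconds in http.conf above); the exact server-side check is not shown in this article, so treat the comparison as an assumption. `token_is_fresh` is a hypothetical helper.

```shell
# token_is_fresh TS NOW TTL
# Succeed if a token stamped TS is still inside the TTL window at time NOW.
# All values are in seconds. Assumed check: NOW - TS must not exceed TTL.
token_is_fresh() {
    local ts="$1" now="$2" ttl="$3"
    local age=$((now - ts))
    [ "$age" -le "$ttl" ]
}

# Usage:
#   token_is_fresh "$ts" "$(date +%s)" 240 || echo "token expired"
```

Under this assumption, a client whose clock runs five minutes behind the server produces a `ts` that is already 300 seconds old when the request arrives, so the token is rejected immediately.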

Integrate FastDHT for data de-duplication

FastDHT is a distributed hash table (DHT) system that uses Berkeley DB for data storage and libevent for network I/O. It depends on the libfastcommon library.

  • Download

Download Berkeley DB from the Oracle website and the FastDHT source package from the FastDFS GitHub page. Since libevent and libfastcommon were installed earlier, only the database and FastDHT need to be installed.

  • Install and configure Berkeley DB and FastDHT
tar xvf db-6.2.23.tar.gz
cd db-6.2.23/build_unix/
../dist/configure --prefix=/applications/db-6.2.23
# Build and install into the prefix
make
make install
cd ../../
rm -rf db-6.2.23
unzip fastdht-master.zip
cd fastdht-master
vim make.sh

Modify line 27:

CFLAGS='-Wall -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE -I /applications/db-6.2.23/include/ -L /applications/db-6.2.23/lib/'

-I: specifies the directory of the header files provided by the database

-L: specifies the directory of the library files provided by the database

./make.sh
./make.sh install
cd ../
rm -rf fastdht-master
  • Configuration
mkdir /data/fdht

Edit /etc/fdht/fdht_client.conf:

base_path=/data/fdht
#include /etc/fdht/fdht_servers.conf

Edit /etc/fdht/fdht_servers.conf:

group0 = 192.168.80.3:11411

Edit /etc/fdht/fdhtd.conf:

base_path=/data/fdht
#include /etc/fdht/fdht_servers.conf

Edit /etc/fdfs/storage.conf:

check_file_duplicate=1
key_namespace=FastDFS
keep_alive=1
#include /etc/fdht/fdht_servers.conf

The server list file must be included; the #include line is a directive here, not a comment.

ln -s /applications/db-6.2.23/lib/libdb-6.2.so /usr/lib/libdb-6.2.so
ln -s /applications/db-6.2.23/lib/libdb-6.2.so /usr/lib64/libdb-6.2.so
  • Start and test
fdhtd /etc/fdht/fdhtd.conf

To restart, use:

fdhtd /etc/fdht/fdhtd.conf restart

Check the result:

netstat -tunlap | grep :11411
tcp        0      0 0.0.0.0:11411           0.0.0.0:*               LISTEN      20605/fdhtd
  • FastDFS was stopped while FastDHT was being installed, so start it again
./start-service

Modify the earlier start-service script and add the following before the tracker and storage are started:

fdhtd /etc/fdht/fdhtd.conf

Add the following before the tracker and Nginx network status checks:

netstat -tunlap | grep :11411

Testing:

fdfs_upload_file /etc/fdfs/client.conf /source/db-6.2.23.tar.gz
group1/M00/00/00/wKhQA1yu2L6APTk-AqQOLABfhaQ.tar.gz
fdfs_upload_file /etc/fdfs/client.conf /source/db-6.2.23.tar.gz
group1/M00/00/00/wKhQA1yu2MKAOmIiAqQOLHUWXfw.tar.gz
ll /data/fdfs/storage/data/00/00/
total 45268
-rw-r--r-- 1 root root 44305964 Apr 11 14:03 wKhQA1yu2L6AM0aiAqQOLKFBFuc.tar.gz
lrwxrwxrwx 1 root root       64 Apr 11 14:03 wKhQA1yu2L6APTk-AqQOLABfhaQ.tar.gz -> /data/fdfs/storage/data/00/00/wKhQA1yu2L6AM0aiAqQOLKFBFuc.tar.gz
lrwxrwxrwx 1 root root       64 Apr 11 14:03 wKhQA1yu2MKAOmIiAqQOLHUWXfw.tar.gz -> /data/fdfs/storage/data/00/00/wKhQA1yu2L6AM0aiAqQOLKFBFuc.tar.gz
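
The listing shows the effect of de-duplication: the content is stored once and every file ID for the same content is a symbolic link to it. The idea can be sketched in a few lines of shell. This is only an illustration of the principle, not how FastDFS and FastDHT implement it internally: `dedup_store` is a hypothetical function, a plain `.index` directory stands in for the FastDHT key-value store, and `md5sum` stands in for whatever content signature FastDFS actually uses.

```shell
# dedup_store SRC STORE_DIR NAME
# Store SRC as STORE_DIR/NAME. If identical content was stored before,
# create a symlink to the first copy instead of writing the data again.
dedup_store() {
    local src="$1" store="$2" name="$3"
    local index="$store/.index"               # stand-in for the FastDHT key-value store
    local hash
    hash=$(md5sum < "$src" | cut -d' ' -f1)   # content signature used as the lookup key
    mkdir -p "$index"
    if [ -f "$index/$hash" ]; then
        # Duplicate content: link to the copy recorded in the index
        ln -s "$(cat "$index/$hash")" "$store/$name"
    else
        # First occurrence: store the data and record where it lives
        cp "$src" "$store/$name"
        printf '%s\n' "$store/$name" > "$index/$hash"
    fi
}

# Usage:
#   dedup_store /source/db-6.2.23.tar.gz /tmp/store first.tar.gz
#   dedup_store /source/db-6.2.23.tar.gz /tmp/store second.tar.gz   # becomes a symlink
```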

Custom fastdfs-spring-boot-starter

To set up Spring Boot projects quickly, we can write a custom starter. Details: github.com/bluemiaomia… .

Let’s build a sample project using SpringBoot:

  • Add dependencies:
<dependency>
    <groupId>com.bluemiaomiao</groupId>
    <artifactId>fastdfs-spring-boot-starter</artifactId>
    <version>1.0-SNAPSHOT</version>
</dependency>
  • Add annotations to the main configuration class:

@EnableFastdfsClient
@SpringBootApplication
public class DemoApplication {

    @Autowired
    private FastdfsClientService fastdfsClientService;

    public static void main(String[] args) {
        SpringApplication.run(DemoApplication.class, args);
    }
}

The global client is automatically initialized.
