PostgreSQL 11 and the Chinese word parser zhparser-1
PostgreSQL’s default word segmentation splits text on spaces and punctuation marks, which does not suit Chinese, where words are not separated by spaces. Zhparser is a PostgreSQL extension for Chinese full-text search. It implements a Chinese parser based on SCWS, so SCWS must be installed before Zhparser.
SCWS stands for Simple Chinese Word Segmentation. It is a mechanical Chinese word segmentation engine based on a word-frequency dictionary, and it can split a passage of Chinese text into words with reasonable accuracy.
-
Install SCWS
-
Use wget to download the SCWS package
wget http://www.xunsearch.com/scws/down/scws-1.2.2.tar.bz2
-
Extract the archive and enter the directory
tar -jxvf scws-1.2.2.tar.bz2
cd scws-1.2.2
-
Compile and install
configure is an executable script that generates the Makefiles; --prefix sets the installation path. By default, executables are installed to /usr/local/bin, library files to /usr/local/lib, configuration files to /usr/local/etc, and other resource files to /usr/local/share.
./configure --prefix=/opt/scws-1.2.2
make is a command-line tool that interprets the instructions in the Makefile.
make install performs the installation; it also reads its instructions from the Makefile and installs to the specified location.
./configure --prefix=/opt/scws-1.2.2 && make && make install
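Before moving on, it can help to confirm that the install prefix actually contains what a build against SCWS needs. The following is a minimal sketch, not part of SCWS itself: check_scws_install is a hypothetical helper name, and the header and library locations (include/scws/scws.h and lib/libscws.*) are assumptions about the default SCWS install layout.

```shell
# Sketch: verify that an SCWS install prefix contains a header and a library.
# check_scws_install is a hypothetical helper; the paths are assumed defaults.
check_scws_install() {
    prefix="$1"
    missing=0
    # Header that code compiled against SCWS is expected to include.
    if [ ! -f "$prefix/include/scws/scws.h" ]; then
        echo "missing: $prefix/include/scws/scws.h"
        missing=1
    fi
    # Shared or static library (libscws.so, libscws.a, ...).
    if ! ls "$prefix"/lib/libscws.* >/dev/null 2>&1; then
        echo "missing: $prefix/lib/libscws.*"
        missing=1
    fi
    if [ "$missing" -eq 0 ]; then
        echo "ok: SCWS found under $prefix"
    fi
    return "$missing"
}

# Check the prefix used in this walkthrough; the message depends on the machine.
check_scws_install /opt/scws-1.2.2 || true
```

If either file is reported missing, rerunning make install and checking its output is usually the quickest way to find out why.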
-
Install Zhparser
-
Download Zhparser from Github.com/amutu/zhpar… and upload it
-
Set environment variables
export PATH=/usr/pgsql-11/bin/:$PATH
-
Install postgresql11-devel
If this package is not installed, compiling zhparser fails with the error "pg_config: Command not found".
yum install postgresql11-devel.x86_64
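The two steps above (setting PATH and installing the devel package) both exist to make pg_config reachable. A small pre-flight check along these lines can confirm that before compiling. This is only a sketch, and require_pg_config is a hypothetical helper name rather than anything shipped with PostgreSQL or zhparser.

```shell
# Sketch: confirm pg_config is reachable before building zhparser.
# require_pg_config is a hypothetical helper name.
require_pg_config() {
    if command -v pg_config >/dev/null 2>&1; then
        echo "pg_config found: $(command -v pg_config)"
        return 0
    fi
    echo "pg_config not found; install postgresql11-devel and add /usr/pgsql-11/bin to PATH"
    return 1
}

# Run the check; the output depends on what is installed on this machine.
require_pg_config || true
```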
-
Compile and install
SCWS_HOME=/opt/scws-1.2.2 make && make install
-
Connect to the database and install the extension
-
Create an extension
create extension zhparser;
-
Check whether the parser was created successfully
select * from pg_ts_parser;
-
Create a text search configuration that uses Zhparser as the parser
CREATE TEXT SEARCH CONFIGURATION testzhcfg (PARSER = zhparser);
-
Add a token mapping
ALTER TEXT SEARCH CONFIGURATION testzhcfg ADD MAPPING FOR n,v,a,i,e,l WITH simple;
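The letters in the mapping are zhparser token types, each standing for a part of speech. The lookup below is a rough sketch of what the six mapped letters mean, based on my reading of zhparser's token table; describe_token_type is a hypothetical helper, and running \dFp+ zhparser in psql shows the authoritative list for the installed version.

```shell
# Sketch: describe the token types mapped in the ALTER statement above.
# describe_token_type is a hypothetical helper; the meanings are my reading
# of zhparser's token table and should be checked with \dFp+ zhparser.
describe_token_type() {
    case "$1" in
        n) echo "noun" ;;
        v) echo "verb" ;;
        a) echo "adjective" ;;
        i) echo "idiom" ;;
        e) echo "exclamation" ;;
        l) echo "temporary idiom" ;;
        *) echo "not mapped in this configuration" ;;
    esac
}

# Print one line per mapped token type.
for t in n v a i e l; do
    echo "$t: $(describe_token_type "$t")"
done
```

Token types that are not mapped, such as punctuation, are dropped when text is converted for search, which is usually what a search index wants.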
-
Test
The G7 foreign ministers' meeting in London on Friday was overshadowed by the novel coronavirus test positive of the Indian delegation attending the meeting. The incident highlights the severity of the current COVID-19 situation. This is the G7 offline "sheng" held for the first time in two years, and epidemic situation and related problems which should be important issues, because the establishment of the G7 is for the coordination between the major industrial powers in the world economy policy, however, whether before the G7 foreign ministers meeting in London, or on the hot topic of discussion during the meeting time, all show China issues are placed in the top of the agenda. In the words of one US think-tank scholar, "the broader context of the meeting is China". ');
The G7 foreign ministers' meeting in London on Friday was overshadowed by the novel coronavirus test positive of the Indian delegation attending the meeting. The incident highlights the severity of the current COVID-19 situation. This is the offline "event" of the G7 held for the first time in two years, and epidemic situation and related problems which should be important issues, because the establishment of the G7 is for the coordination between the major industrial powers in the world economy policy, however, whether before the G7 foreign ministers meeting in London, or on the hot topic of discussion during the meeting time, all show China issues are placed in the top of the agenda. In the words of one US think-tank scholar, "the broader context of the meeting is China". ');
SELECT to_tsquery('testzhcfg', 'this event highlights the severity of the current epidemic situation.');
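Since zhparser is meant for Chinese text, a quick check with a short Chinese string is often easier to read than a long passage. The sketch below only prints two SQL statements so they can be piped into psql. The sample strings, the helper name zhparser_smoke_sql, and the database name mydb are all illustrative assumptions, not taken from the original test.

```shell
# Sketch: emit a tiny smoke test for the testzhcfg configuration.
# zhparser_smoke_sql is a hypothetical helper; the Chinese sample strings
# are illustrative and can be replaced with any text.
zhparser_smoke_sql() {
    cat <<'SQL'
SELECT to_tsvector('testzhcfg', '南京市长江大桥');
SELECT to_tsquery('testzhcfg', '长江大桥');
SQL
}

# Print the statements. To run them, pipe into psql against a database
# where the extension and configuration exist (mydb is a placeholder):
#   zhparser_smoke_sql | psql -d mydb
zhparser_smoke_sql
```

If the configuration works, the first statement should return several lexemes rather than one long token, which shows that the text was segmented into words.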