This is the 27th day of my participation in the August More Text Challenge
For article search I currently use MySQL fuzzy matching (LIKE) on title keywords.
I used a full-text index before, but its efficiency was relatively low, so I later stopped matching against article content.
Later, when I came across Chinese word segmentation, I felt it could solve exactly this problem. The options that currently support PHP are Solr (developed in Java) and Sphinx (developed in C++).
Solr requires a Java environment to run, which I don't really like, so I ruled it out.
Sphinx is the better choice.
However, Sphinx does not support Chinese word segmentation, so most of the results on Baidu use Coreseek, which is built on the Sphinx core, together with the MMSeg word segmenter to get Chinese word segmentation plus full-text retrieval.
But there's a problem: Coreseek is no longer maintained.
The official website is no longer accessible: www.coreseek.cn/
The latest version I could find is Coreseek 4.1.
I could not get Coreseek 4.1 to compile and install on CentOS 7.8 on Alibaba Cloud, so I recommend Coreseek 3.2 (based on Sphinx 0.9.9), which is a bit old.
Download address:
Gitee.com/sdagfsdh/co…
Here I mainly use the archive marked with the red box.
One: Install the compilation environment
yum -y install gcc gcc-c++ autoconf python python-devel libiconv libtool
Skip this step if these packages are already installed.
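To decide whether this step can be skipped, a small check like the following may help. It is only a sketch: `missing_tools` is a hypothetical helper name, and the command names passed to it are my assumption about what the packages above provide.

```shell
#!/bin/sh
# Print each command from the argument list that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Assumed command names for the build tools; adjust to your system.
missing_tools gcc g++ autoconf libtoolize python
```

If nothing is printed, the listed tools are already available.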
Two: Install mmseg3
My software packages are in the /usr/local/download directory.
cd /usr/local/download/coreseek-3.2.14
cd mmseg-3.2.14
# give the configure file execute permission
chmod -R 777 ./configure
# install into /usr/local/mmseg3
./configure --prefix=/usr/local/mmseg3
make && make install
1: Possible errors
(1): config.status: error: cannot find input file: src/Makefile.in
Solution:
yum -y install libtool
aclocal
libtoolize --force
automake --add-missing
autoconf
autoheader
make clean
./configure --prefix=/usr/local/mmseg3
make && make install
2: Output indicating that configure succeeded:
------------------------------------------------------------------------
Configuration:
Source code location: .
Compiler: gcc
Compiler flags: -g -O2
Host System Type: x86_64-redhat-linux-gnu
Install path: /usr/local/mmseg3
See config.h for further configuration information.
------------------------------------------------------------------------
Three: Install Coreseek
1: Install dependencies
yum -y install expat expat-devel
2: Enter the directory
# enter the directory
cd /usr/local/download/coreseek-3.2.14/csft-3.2.14
# give the configure file execute permission
chmod -R 777 ./configure
Adjust the directories in the configure command to match your own installation; /usr/local/mariadb is my MySQL (MariaDB) installation directory.
./configure --prefix=/usr/local/coreseek --without-unixodbc --with-mmseg --with-mmseg-includes=/usr/local/mmseg3/include/mmseg/ --with-mmseg-libs=/usr/local/mmseg3/lib/ --with-mysql=/usr/local/mariadb
Output when configure succeeds:
generating configuration files
------------------------------
configure: creating ./config.status
config.status: creating Makefile
config.status: creating src/Makefile
config.status: creating libstemmer_c/Makefile
config.status: creating sphinx.conf.dist
config.status: creating sphinx-min.conf.dist
config.status: creating config/config.h
config.status: executing depfiles commands
configuration done
------------------
Run the installation:
make && make install
1: Possible errors
make[2]: *** [sphinxexpr.o] Error 1
make[2]: Leaving directory `/usr/local/download/coreseek-3.2.14/csft-3.2.14/src'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/usr/local/download/coreseek-3.2.14/csft-3.2.14/src'
make: *** [all-recursive] Error 1
Solution:
In the src/sphinxexpr.cpp file, replace the calls to "ExprEval" with "this->ExprEval" (there are many such lines), then rerun ./configure with the same options as before and compile and install again:
make && make install
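Since the ExprEval replacement touches many lines, it can be scripted instead of edited by hand. The following is a minimal sketch, assuming the failing calls have the form `= ExprEval (` in the source; `patch_expreval` is a hypothetical helper name, and the pattern should be checked against your copy of the file before relying on it.

```shell
#!/bin/sh
# Prefix unqualified ExprEval calls with this->, keeping a .bak backup.
# Assumption: the calls to fix look like "= ExprEval (" in the source file.
patch_expreval() {
    sed -i.bak 's/= ExprEval (/= this->ExprEval (/g' "$1"
}

# Hypothetical usage from inside the csft-3.2.14 source directory:
# patch_expreval src/sphinxexpr.cpp
```

Matching only the call form leaves the declaration of ExprEval itself untouched, and running the helper a second time changes nothing because the patched lines no longer match.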
Output when the installation succeeds:
make[2]: Leaving directory `/usr/local/download/coreseek-3.2.14/csft-3.2.14/src'
make[1]: Leaving directory `/usr/local/download/coreseek-3.2.14/csft-3.2.14/src'
Making all in test
make[1]: Entering directory `/usr/local/download/coreseek-3.2.14/csft-3.2.14/test'
make[1]: Nothing to be done for `all'.
make[1]: Leaving directory `/usr/local/download/coreseek-3.2.14/csft-3.2.14/test'
make[1]: Entering directory `/usr/local/download/coreseek-3.2.14/csft-3.2.14'
make[1]: Nothing to be done for `all-am'.
make[1]: Leaving directory `/usr/local/download/coreseek-3.2.14/csft-3.2.14'
At this point, the compilation and installation are successful.
Four: Fixing startup errors
Start Coreseek with the following command:
/usr/local/coreseek/bin/searchd
Error:
/usr/local/coreseek/bin/searchd: error while loading shared libraries: libmariadb.so.3: cannot open shared object file: No such file or directory
Solution:
ln -s /usr/local/mariadb/lib/libmariadb.so.3 /usr/lib64/libmariadb.so.3
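Before creating symlinks one at a time, it can be useful to list every shared library that searchd fails to resolve. This is a small sketch using `ldd`; `missing_libs` is a hypothetical helper name, and the binary path assumes the install prefix used in this article.

```shell
#!/bin/sh
# List shared libraries that the dynamic loader cannot resolve for a binary.
# ldd marks them as "<name> => not found"; print only the library name.
missing_libs() {
    ldd "$1" 2>/dev/null | awk '/not found/ { print $1 }'
}

# Path assumes the /usr/local/coreseek prefix used in this article.
missing_libs /usr/local/coreseek/bin/searchd
```

Each name printed is a library that still needs a symlink (or an entry in the loader's search path) before searchd can start.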
Start again:
/usr/local/coreseek/bin/searchd
Error:
Coreseek Fulltext 3.2 [Sphinx 0.9.9-Release (R2117)]
Copyright (C) 2007-2011,
Beijing Choice Software Technologies Inc (http://www.coreseek.com)
FATAL: no readable config file (looked in /usr/local/coreseek/etc/csft.conf, ./csft.conf).
There is no configuration file. Solution:
cp /usr/local/coreseek/etc/sphinx-min.conf.dist /usr/local/coreseek/etc/csft.conf
Start again:
/usr/local/coreseek/bin/searchd
Error:
Coreseek Fulltext 3.2 [Sphinx 0.9.9-Release (R2117)]
Copyright (C) 2007-2011,
Beijing Choice Software Technologies Inc (http://www.coreseek.com)
using config file '/usr/local/coreseek/etc/csft.conf'...
listening on all interfaces, port=9312
WARNING: index 'test1': preload: failed to open /usr/local/coreseek/var/data/test1.sph: No such file or directory; NOT SERVING
FATAL: no valid indexes to serve
The index files cannot be found because no index has been built yet.
Let's configure the csft.conf file:
#
# Minimal Sphinx configuration sample (clean, simple, functional)
#
source src1
{
type = mysql
# your database connection settings
sql_host = localhost
sql_user = mysql
sql_pass =
sql_db = test
sql_port = 3306 # optional, default is 3306
sql_query = \
SELECT id, group_id, UNIX_TIMESTAMP(date_added) AS date_added, title, content \
FROM documents
sql_attr_uint = group_id
sql_attr_timestamp = date_added
sql_query_info = SELECT * FROM documents WHERE id=$id
}
index test1
{
source = src1
# make sure this path exists; it is not created in advance
path = /usr/local/coreseek/var/data/test1
docinfo = extern
charset_type = sbcs
}
indexer
{
mem_limit = 32M
}
searchd
{
port = 9312
# make sure this path exists; it is not created in advance
log = /usr/local/coreseek/var/log/searchd.log
# make sure this path exists; it is not created in advance
query_log = /usr/local/coreseek/var/log/query.log
read_timeout = 5
max_children = 30
# make sure this path exists; it is not created in advance
pid_file = /usr/local/coreseek/var/log/searchd.pid
max_matches = 1000
seamless_rotate = 1
preopen_indexes = 0
unlink_old = 1
}
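The configuration above refers to directories under var/data and var/log that have to exist before indexer and searchd run. A minimal sketch for creating them, assuming the /usr/local/coreseek prefix from this article (`ensure_coreseek_dirs` is a hypothetical helper name):

```shell
#!/bin/sh
# Create the data and log directories referenced by csft.conf under a prefix.
# The default prefix is the install directory used in this article.
ensure_coreseek_dirs() {
    prefix=${1:-/usr/local/coreseek}
    mkdir -p "$prefix/var/data" "$prefix/var/log"
}

# Usage: ensure_coreseek_dirs              (uses /usr/local/coreseek)
#        ensure_coreseek_dirs /opt/coreseek (hypothetical alternative prefix)
```

`mkdir -p` is safe to repeat: it does nothing when the directories already exist.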
Import example.sql from /usr/local/coreseek/etc (under the installation directory) into the database:
# use the test database
MariaDB [(none)]> use test;
Database changed
# import the SQL file
MariaDB [test]> source /usr/local/coreseek/etc/example.sql
Query OK, 0 rows affected, 1 warning (0.018 sec)
Query OK, 0 rows affected (0.011 sec)
Query OK, 4 rows affected (0.003 sec)
Records: 4  Duplicates: 0  Warnings: 0
Query OK, 0 rows affected, 1 warning (0.002 sec)
Query OK, 0 rows affected (0.010 sec)
Query OK, 10 rows affected (0.001 sec)
Records: 10  Duplicates: 0  Warnings: 0
Create the index:
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft.conf --all --rotate
Output when the index is created successfully:
Coreseek Fulltext 3.2 [Sphinx 0.9.9-Release (R2117)]
Copyright (C) 2007-2011,
Beijing Choice Software Technologies Inc (http://www.coreseek.com)
using config file '/usr/local/coreseek/etc/csft.conf'...
indexing index 'test1'...
collected 4 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 4 docs, 193 bytes
total 0.003 sec, 56581 bytes/sec, 1172.67 docs/sec
total 2 reads, 0.000 sec, 0.1 kb/call avg, 0.0 msec/call avg
total 7 writes, 0.000 sec, 0.1 kb/call avg, 0.0 msec/call avg
WARNING: failed to scanf pid from pid_file '/usr/local/coreseek/var/log/searchd.pid'.
WARNING: indices NOT rotated.
The last two warnings appear because searchd is not running yet, so there is no pid file to read and no running instance to rotate.
Don't create the file yourself; restart the server and then start Coreseek again.
Five: Common Coreseek commands
1: Start
/usr/local/coreseek/bin/searchd
2: Stop
/usr/local/coreseek/bin/searchd --stop
3: Create the index
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft.conf --all --rotate
4: Search test
/usr/local/coreseek/bin/search -c /usr/local/coreseek/etc/csft.conf -a abc
5: If the index is created while Coreseek is running, add the --rotate argument so that the new index takes effect immediately
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft.conf --all --rotate
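Whether --rotate is appropriate depends on whether searchd is actually running. A small sketch of that decision, assuming the pid_file path from the configuration in this article (`indexer_flags` is a hypothetical helper name; `kill -0` only tests that the process exists and that you may signal it, it sends nothing):

```shell
#!/bin/sh
# Print the indexer flags to use: add --rotate only when the pid file
# points at a live process, i.e. searchd appears to be running.
indexer_flags() {
    pid_file=${1:-/usr/local/coreseek/var/log/searchd.pid}
    if [ -r "$pid_file" ] && kill -0 "$(cat "$pid_file")" 2>/dev/null; then
        echo "--all --rotate"
    else
        echo "--all"
    fi
}

# Hypothetical usage (word splitting of the flags is intended):
# /usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/csft.conf $(indexer_flags)
```

Run it as the same user that starts searchd; otherwise `kill -0` may fail on a live process and the helper will fall back to plain `--all`.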
For other usage, see the Sphinx documentation.
If you have good suggestions, please leave them in the comments below.
Welcome to my blog guanchao.site