This is day 5 of my participation in the August More Text Challenge.
The term “point-and-shoot” is borrowed from the point-and-shoot camera, also known as a compact camera or fully automatic camera: a small, fully automatic camera designed for easy operation and aimed at the average person.
Find interesting, entry-level open source projects on HelloGitHub. ElasticSearch is the first name that comes to mind for open source search, but ES is a bit heavy for personal projects.
MeiliSearch is a lightweight, open source, point-and-shoot search engine that anyone can use:
Github.com/meilisearch…
Before I talk about what MeiliSearch does, I’d like to talk about how I found it and liked it.
I don’t ask for much
I developed the HelloGitHub mini program, which supports keyword search across the open source projects featured in past monthly issues.
The mini program's search was built on Sonic, an open source search engine written in Rust. It is fast, but while using it I ran into several problems:
- Chinese word segmentation is not supported, so search results are poor
- There is no official Python client, and the third-party open source client has problems
- Only IDs are returned, so the remaining fields have to be looked up in the database
These problems directly hurt the search experience, which bothered me a lot. So while reading up on search, I was also looking for a new open source solution, namely:
A lightweight open source search engine that is simple to deploy and configure, supports Chinese word segmentation, and searches fast.
In other words: a point-and-shoot Chinese search engine.
It has a beautiful name
It has a beautiful ("meili") name, “MeiliSearch,” and it is also an open source search engine written in Rust. It supports:
Feature overview: fast search, full-text search, support for Chinese characters, and easy installation and maintenance. Isn't this exactly the point-and-shoot Chinese search engine I was looking for?
I was already rubbing my hands and itching to try it. Enough talk, let's go!
Start simple
What you learn on paper always feels shallow; you have to try it out to see how well it works.
1. Installation and startup
One-step installation and startup commands for Linux and macOS:
curl -L https://install.meilisearch.com | sh
./meilisearch
Isn't this installation point-and-shoot enough? 🤪 A successful start looks like this:
MeiliSearch's web search page is available in a browser at http://127.0.0.1:7700/. I loaded some data in advance to demonstrate searching:
2. Basic operations
MeiliSearch is a search service that exposes a RESTful API, which makes it broadly usable. Official clients are available for many programming languages:
- JavaScript
- Python
- PHP
- Go
- ...
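Because the service speaks plain HTTP, a search can also be expressed without any SDK. The sketch below only builds the request with the standard library and does not send it. It assumes the `POST /indexes/{uid}/search` route with a JSON body containing `q`, and it assumes the key travels in an `X-Meili-API-Key` header, which may differ between MeiliSearch versions, so check the API reference for the release you run. The helper name `build_search_request` is made up for this illustration.

```python
import json
import urllib.request
from urllib.parse import quote


def build_search_request(base_url, index_uid, query, api_key=None):
    """Build (but do not send) a search request for one index."""
    url = f"{base_url.rstrip('/')}/indexes/{quote(index_uid, safe='')}/search"
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Assumption: header name used by older releases; newer ones may differ.
        headers["X-Meili-API-Key"] = api_key
    body = json.dumps({"q": query}).encode("utf-8")
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_search_request("http://127.0.0.1:7700", "books", "harry pottre", "masterKey")

# To actually run the search against a live server:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

An official client wraps this kind of request for you, which is why the rest of the article uses the SDK.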
The demos below use Python. First, install the Python SDK:
# Python 3.6+
pip3 install meilisearch  # or: pip install meilisearch
Basic operations in Python (connect, write, query, delete):
import meilisearch

client = meilisearch.Client('http://127.0.0.1:7700', 'masterKey')  # masterKey is the password

# An index is equivalent to a database table
index = client.index('books')

# Prepare the documents to be searched
documents = [
    {'book_id': 123, 'title': 'Pride and Prejudice'},
    {'book_id': 456, 'title': 'Le Petit Prince'},
    {'book_id': 1, 'title': 'Alice In Wonderland'},
    {'book_id': 1344, 'title': 'The Hobbit'},
    {'book_id': 4, 'title': 'Harry Potter and the Half-Blood Prince'},
    {'book_id': 42, 'title': 'The Hitchhiker\'s Guide to the Galaxy'},
]

# Delete: remove all documents from the index
index.delete_all_documents()

# Write:
result = index.add_documents(documents)
# The engine adds or replaces documents according to their ID.
# A returned write call does not mean the engine has finished processing;
# use the returned updateId to check the status.
index.get_update_status(result.get('updateId'))
# enqueued, processed or failed

# Search:
index.search('harry pottre')
# Result (contains rich fields):
"""
{
  "hits": [{"book_id": 4, "title": "Harry Potter and the Half-Blood Prince"}],
  "processingTimeMs": 1,
  "query": "harry pottre"
}
"""
The most basic functions of search have now been implemented, but the exploration does not stop there.
3. Optimize search results
MeiliSearch can configure rules to improve search results:
- Synonyms: synonyms
- Stop words: stopWords (certain characters or words are filtered out automatically to save storage space and improve search efficiency)
- Ranking rules: rankingRules
- ...
You can update the MeiliSearch configuration with the Python client. Example code:
# Stop words
client.index('movies').update_settings({
    'stopWords': ['the', 'a', 'an'],
})

# Ranking rules
client.index('movies').update_ranking_rules([
    "typo", "words", "proximity", "attribute", "wordsPosition", "exactness",
    "asc(publish_time)", "desc(watch)",
])

# View the stop words
client.index('movies').get_stop_words()

# Reset the settings
# index.reset_settings()

# All operations except search are asynchronous and return an updateId,
# which is needed to query the processing status.
# wait_for_pending_update can block until the update has been processed.
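Since writes and settings changes are asynchronous, it can help to see what blocking on an update amounts to. The following is a rough sketch of a polling loop, not the SDK's own implementation: `wait_until_done` is a hypothetical helper, and `get_status` stands for any callable that returns a dict whose `status` field ends up as `processed` or `failed` (for instance `index.get_update_status`). A fake status function stands in for the server so the snippet runs on its own.

```python
import time


def wait_until_done(get_status, update_id, timeout=5.0, interval=0.05):
    """Poll get_status(update_id) until the update is processed or failed."""
    deadline = time.monotonic() + timeout
    while True:
        update = get_status(update_id)
        if update.get("status") in ("processed", "failed"):
            return update
        if time.monotonic() >= deadline:
            raise TimeoutError(f"update {update_id} not finished after {timeout}s")
        time.sleep(interval)


# Stand-in for a real server: reports "enqueued" twice, then "processed".
_answers = iter(["enqueued", "enqueued", "processed"])


def fake_get_status(update_id):
    return {"updateId": update_id, "status": next(_answers)}


final = wait_until_done(fake_get_status, 1, interval=0.01)
```

With a real index, the call would look like `wait_until_done(index.get_update_status, result.get('updateId'))`.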
These settings can noticeably improve search results. For example, before the stop word was configured, a query that contained an extra function word failed to hit “open source books”; once that word was added as a stop word, the query hit, because stop words in the input are ignored during matching.
While testing the results, I found that searching for go returned nothing, while Golang worked. After digging for a long time, I finally discovered that go was in my stop word dictionary 😅
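The `go` surprise is easier to see with a toy model. The snippet below is a deliberately simplified illustration of stop-word filtering, not MeiliSearch's real query pipeline: `strip_stop_words` is a made-up helper that splits on whitespace and drops every term found in the stop-word set.

```python
def strip_stop_words(query, stop_words):
    """Return the query terms that survive stop-word filtering."""
    stop = {word.lower() for word in stop_words}
    return [term for term in query.lower().split() if term not in stop]


# Hypothetical dictionary that, like mine, accidentally contains "go".
stop_words = {"the", "a", "an", "go"}

print(strip_stop_words("the go web framework", stop_words))  # ['web', 'framework']
print(strip_stop_words("go", stop_words))                    # [] -> nothing left to match
print(strip_stop_words("golang", stop_words))                # ['golang']
```

When every term of a query is a stop word, nothing remains to match against, which is consistent with what I saw for go.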
In addition, MeiliSearch has no equivalent of Sonic's word suggestion (suggest) feature, but it can be approximated by creating a separate index plus searchableAttributes.
I have not found a synonym dictionary yet. If you know of a Chinese/English synonym dictionary, please let me know.
4. Deployment
Deploying MeiliSearch is simple: just register it as a systemd service.
cat << EOF > /etc/systemd/system/meilisearch.service
[Unit]
Description=MeiliSearch
After=systemd-user-sessions.service

[Service]
Type=simple
ExecStart=/usr/bin/meilisearch --http-addr 127.0.0.1:7700 --env production --master-key XXXXXX

[Install]
WantedBy=default.target
EOF

# Enable the meilisearch service
systemctl enable meilisearch

# Start the meilisearch service
systemctl start meilisearch

# Verify that the service is actually running
systemctl status meilisearch
However, when deploying to production, note the following:
- A password (master key) must be set in production; it is optional in development
- The web search page is disabled in production
- There is no built-in control over remote access or permissions. You can use Nginx for an IP whitelist and Certbot for HTTPS to improve security
- Check the service status by requesting the service address with curl
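The same status check can be scripted. This is a sketch under assumptions: it treats any 2xx answer from a `/health` route as healthy, and both that route and the helper name `is_healthy` should be verified against the MeiliSearch version you deploy. The `opener` parameter exists only so the function can be exercised without a running server.

```python
import urllib.error
import urllib.request


def is_healthy(base_url, opener=urllib.request.urlopen, timeout=2.0):
    """Return True if GET <base_url>/health answers with a 2xx status."""
    url = f"{base_url.rstrip('/')}/health"
    try:
        with opener(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


# Against a live deployment (address taken from the systemd unit above):
# print(is_healthy("http://127.0.0.1:7700"))
```

A check like this can run from cron or a monitoring job on the same host, since the service only listens on 127.0.0.1 in the unit file above.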
Those are the things I learned while using MeiliSearch. My overall impressions:
- Simple installation, no complex configuration: worry-free
- Easy to write data, rich features: point-and-shoot
- Fast queries
One command starts the search service and one line of code performs a search. With it, even a search novice like me can stand up a search service in minutes. Comfortable~
The Fruit of Love (In Practice)
I rewrote HelloGitHub's search feature with MeiliSearch, using the FastAPI framework on the back end. I also added some new features:
- Top search terms
- Project details page
- Mirror addresses for projects, to improve access speed
- A new interface
The second version of HelloGitHub is shown below:
It is already live; there is an entry point in the official account of the same name.
In the future, I plan to add a news feed, comments, ratings, a user system, and a points system. I am the only developer, so progress will be slow… But I'm not going to give up 💪
Finally
To improve search accuracy, on the one hand I need to learn more about word segmentation and NLP; on the other hand, I need to get familiar with MeiliSearch's API and internals, and then bring in some dictionaries. That should improve accuracy, but it has to be done one step at a time. No rush.
Finally, I hope this article was helpful to you. That's all for today. If you want to share your own experience of playing with open source projects, feel free to submit your original article to me.
A good open source project is like a shell on the seashore, and it needs someone to find it.
HelloGitHub is a shell collector. Find open source projects on HelloGitHub!
Follow the HelloGitHub official account to get updates first.
There are many more open source projects and treasure trove projects to be discovered.