Introduction to Solr
What is Solr
Solr is a top-level open source project of the Apache Software Foundation. It is a full-text search server written in Java and built on Lucene.
Solr offers a richer query language than Lucene, is extensible and configurable, and optimizes Lucene's performance.
How does Solr implement full text retrieval?
Indexing process: a Solr client (a browser or a Java program) sends a POST request to the Solr server. The request body is an XML document containing Field information, which Solr uses to maintain the index (adding, deleting, and updating documents).
Search flow: a Solr client (a browser or a Java program) sends a GET request to the Solr server, and the server returns an XML document.
Solr itself does not render views; presenting the results is left to the client.
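The indexing request described above can be illustrated with a minimal sketch that builds the XML body of an "add" request. The class and method names (UpdateXmlSketch, buildAddXml) are hypothetical helpers made up for this illustration, and the sketch only produces the string; it does not send it to a server.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Solr): builds the XML body of an "add" request.
class UpdateXmlSketch {

    // Escape the characters that are significant in XML text and attribute values.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // Turn field name/value pairs into <add><doc>...</doc></add>.
    static String buildAddXml(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("<add><doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append("<field name=\"").append(escape(e.getKey())).append("\">")
              .append(escape(e.getValue()))
              .append("</field>");
        }
        return sb.append("</doc></add>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", "c001");
        fields.put("name", "solr test111");
        System.out.println(buildAddXml(fields));
    }
}
```

A client would POST a body like this to the /update handler; in practice the SolrJ client covered later builds and sends the request for you.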
Difference between Solr and Lucene
Lucene is a full-text search engine toolkit. It is distributed as JAR files and cannot run on its own or provide services externally.
Solr is a full-text search server that runs in a servlet container and provides search and indexing services on its own. Developing full-text search features with Solr is faster and more convenient than with Lucene alone.
Installing and configuring Solr
Download
Solr and Lucene are released together. The latest version is 5.2.1.
Download address: http://archive.apache.org/dist/lucene/solr/
Environment
- JDK: 1.7 or later
- Solr: 4.10.3
- MySQL: 5.x
- Web server: Tomcat 7
Initialize the database script
Installation steps
- Install Tomcat.
- Copy solr.war from solr-4.10.3\example\webapps to Tomcat's webapps directory.
- Unpack the WAR file, then delete it.
- Add Solr's extension JARs: copy the JAR files from solr-4.10.3\example\lib\ext to WEB-INF\lib of the Solr web application in Tomcat.
- Add the log4j configuration: copy log4j.properties from solr-4.10.3\example\resources to apache-tomcat-7.0.57\webapps\solr\WEB-INF\classes, creating the classes directory if it does not exist.
- Specify the SolrHome directory in web.xml.
Installing a SolrCore
SolrCore and SolrHome
SolrHome is the home directory of the Solr service. A SolrHome directory can contain multiple SolrCore directories, and each SolrCore directory holds the configuration files and data files required by one Solr instance.
Each SolrCore can provide search and indexing services independently. Multiple SolrCores are unrelated to one another.
Directory structure of SolrHome and SolrCore
- SolrHome: solr-4.10.3\example\solr
- SolrCore: solr-4.10.3\example\solr\collection1 (contains configuration files, index files, and log information)
Installing the SolrCore
To install a SolrCore, you need to set up SolrHome first. Copy the contents of the example SolrHome to the SolrHome directory specified in web.xml.
Solrcore configuration
The solrconfig.xml file in the SolrCore's conf directory configures how the SolrCore runs.
Three tags in this file matter most: lib, dataDir, and requestHandler.
lib tag
When a SolrCore needs additional dependency JARs, their location is specified with the lib tag.
solr.install.dir indicates the SolrCore installation directory.
Copy the contrib and dist directories in example to
Modify the lib tags
dataDir tag
Each SolrCore has its own index directory, which by default is the data directory under the SolrCore.
The data directory contains the index directory and the tlog (transaction log) directory. If you do not want to use the default location, you can change it in solrconfig.xml as follows:
requestHandler tag
A requestHandler is a request processor; it defines the paths through which indexing and searching are accessed. Use /update to maintain the index (add, modify, and delete documents).
Use /select to search the index:
<requestHandler name="/select" class="solr.SearchHandler">
  <!-- Set the default parameter values; you can change these parameters in the request URL -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">10</int><!-- Number of results to display -->
    <str name="wt">json</str><!-- Response format -->
    <str name="df">text</str><!-- Default search field -->
  </lst>
</requestHandler>
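Because the values under defaults can be overridden per request, a client only needs to put different values in the query string. The following is a minimal sketch of building such a URL; the class name SelectUrlSketch and the base URL http://localhost:8080/solr/collection1 are assumptions for illustration and depend on your own deployment.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: builds a /select URL whose parameters override the handler defaults.
class SelectUrlSketch {

    // URL-encode one query-string component as UTF-8.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a standard JVM.
            throw new IllegalStateException(e);
        }
    }

    // Append the parameters to <coreUrl>/select in the order given.
    static String buildSelectUrl(String coreUrl, Map<String, String> params) {
        StringBuilder sb = new StringBuilder(coreUrl).append("/select");
        char sep = '?';
        for (Map.Entry<String, String> e : params.entrySet()) {
            sb.append(sep).append(enc(e.getKey())).append('=').append(enc(e.getValue()));
            sep = '&';
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", "*:*");   // match all documents
        params.put("rows", "5");  // overrides the default of 10
        params.put("wt", "xml");  // overrides the default of json
        System.out.println(buildSelectUrl("http://localhost:8080/solr/collection1", params));
    }
}
```

Any parameter left out of the request, such as df here, falls back to the value configured in the handler.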
Solr interface
Dashboard
Shows the Solr instance's running time, version, system resources, JVM, and other information.
Logging
Solr's runtime log information.
Cloud
Cloud refers to SolrCloud (cluster mode). This menu is displayed only when Solr runs in SolrCloud mode.
Core Admin
The SolrCore management interface, where you can add SolrCore instances.
Java Properties
Properties of the JVM runtime environment in which Solr runs.
Thread Dump
Displays the currently active threads in the Solr server and lets you trace their stack information.
Core selector
Select a SolrCore to perform detailed operations on it, as follows:
Analysis
This interface lets you test how the index-time and query-time analyzers process text. Note: in Solr, analyzers are bound to field types.
Dataimport
You can define a data import handler to import data from a relational database into the Solr index. It is not configured by default and must be set up manually.
Document
By default, Solr updates a document based on its id (unique key) field. If no document with the given id exists in the index, Solr adds the document; if one exists, Solr updates it.
This menu allows you to create, update, and delete documents in the index. The interface is as follows:
- overwrite="true": when indexing, Solr replaces a document if one with the same unique key already exists.
- commitWithin="1000": Solr commits the document within 1000 milliseconds (1 second) of indexing it. To make testing easier, you can also commit the document immediately by adding “” after it.
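The add-or-update rule described above behaves much like a map keyed by the unique key. The sketch below is only a toy model of that rule, not Solr's implementation; the class UniqueKeySketch is made up for illustration.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of an index keyed by the unique key field "id" (not Solr code).
class UniqueKeySketch {
    private final Map<String, Map<String, String>> index = new LinkedHashMap<>();

    // Returns true if an existing document was replaced, false if a new one was added.
    boolean addOrUpdate(Map<String, String> doc) {
        String id = doc.get("id");
        if (id == null) {
            throw new IllegalArgumentException("document has no id");
        }
        return index.put(id, doc) != null;
    }

    // Number of documents currently in the toy index.
    int size() {
        return index.size();
    }

    public static void main(String[] args) {
        UniqueKeySketch idx = new UniqueKeySketch();
        Map<String, String> first = new HashMap<>();
        first.put("id", "c001");
        first.put("name", "first version");
        Map<String, String> second = new HashMap<>();
        second.put("id", "c001");
        second.put("name", "second version");
        System.out.println(idx.addOrUpdate(first));   // new id, so the document is added
        System.out.println(idx.addOrUpdate(second));  // same id, so the document is replaced
        System.out.println(idx.size());
    }
}
```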
Query
Use /select to search the index. The q query condition is required.
Configuring multiple SolrCores
Configuring multiple SolrCores has the following benefits:
1. When setting up SolrCloud, multiple SolrCores must be configured.
2. Different business modules can use different SolrCores for search and indexing.
Adding a SolrCore
- Step 1: Copy collection1 in SolrHome to collection2.
- Step 2: Modify core.properties in the new SolrCore directory.
This completes the configuration of the new SolrCore.
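As a rough sketch of the core.properties change, the snippet below parses the file's text and sets the name property. It assumes the copied file contains a single line name=collection1, as in the 4.10.3 example core; the class CorePropertiesSketch and its method are hypothetical, and the sketch works on a string rather than touching any file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: shows the effect of renaming a core in core.properties.
class CorePropertiesSketch {

    // Parses the text of a core.properties file and returns it with "name" replaced.
    static Properties withCoreName(String corePropertiesText, String newName) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(corePropertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        props.setProperty("name", newName);
        return props;
    }

    public static void main(String[] args) {
        // Assumed content of the copied file.
        Properties props = withCoreName("name=collection1\n", "collection2");
        System.out.println(props.getProperty("name"));
    }
}
```

In practice the edit is usually made by hand: open collection2\core.properties and change the value of name to collection2.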
Basic use of Solr
schema.xml
The schema.xml file mainly configures the data definitions of a SolrCore, including fields and fieldTypes. In Solr, fields and field types must be defined before they are used.
Field
Defines a field.
- name: the field name
- type: the field type
- indexed: whether the field is indexed
- stored: whether the field is stored
- required: whether the field is required
- multiValued: whether the field holds multiple values. For example, if a product has several pictures and one field stores all of them, multiValued must be set to true.
dynamicField
Dynamic field
name: specifies the naming pattern of the dynamic field
uniqueKey
Specifies the unique key
id
Here id is the name of a field already defined with a field tag, and that field has required set to true.
A schema.xml file can contain only one unique key.
copyField
Copy field
source: the name of the source field to copy from. dest: the name of the destination field.
The destination field specified by dest must have multiValued set to true.
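Why the destination must be multi-valued can be seen from a toy model of copying: each source field contributes its own values, so the destination ends up holding several. The sketch below is an illustration only and not how Solr implements copyField; the class name and the field names title, description, and keywords are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model of copyField (not Solr code): values of the source fields are appended to dest.
class CopyFieldSketch {

    // Returns a copy of doc in which dest also contains every value of the given sources.
    static Map<String, List<String>> applyCopyFields(
            Map<String, List<String>> doc, String dest, String... sources) {
        Map<String, List<String>> result = new LinkedHashMap<>(doc);
        List<String> target = new ArrayList<>();
        if (doc.get(dest) != null) {
            target.addAll(doc.get(dest));
        }
        for (String source : sources) {
            List<String> values = doc.get(source);
            if (values != null) {
                target.addAll(values);
            }
        }
        result.put(dest, target);
        return result;
    }

    public static void main(String[] args) {
        Map<String, List<String>> doc = new LinkedHashMap<>();
        doc.put("title", Arrays.asList("Minion toy"));
        doc.put("description", Arrays.asList("A yellow plush toy"));
        // Two sources feed one destination, so the destination holds two values.
        System.out.println(applyCopyFields(doc, "keywords", "title", "description").get("keywords"));
    }
}
```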
fieldType
Defines a field type.
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
name: the name of the field type. class: the Solr class that implements the field type. analyzer: the analyzer to use; its type attribute (index or query) says when it applies.
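To give a feel for what an analyzer chain does to text, here is a toy sketch of the index-time chain: tokenize, drop stop words, lower-case. It is not the real StandardTokenizer or the Solr filters; the splitting rule and the stop-word list are simplifying assumptions, and the class name AnalyzerChainSketch is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Toy analyzer chain (not Solr code): tokenizer -> stop filter -> lower-case filter.
class AnalyzerChainSketch {

    // Hypothetical stop-word list; in Solr it would come from stopwords.txt.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("a", "the", "of"));

    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        // Toy tokenizer: split on anything that is not a letter or a digit.
        for (String raw : text.split("[^\\p{L}\\p{N}]+")) {
            if (raw.isEmpty()) {
                continue;
            }
            String lower = raw.toLowerCase(Locale.ROOT);
            // Stop words are matched case-insensitively (ignoreCase="true"),
            // and surviving tokens are emitted in lower case.
            if (!STOP_WORDS.contains(lower)) {
                tokens.add(lower);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("The Art of Solr"));
    }
}
```

The Analysis page of the admin interface shows the same kind of step-by-step output for the real analyzers.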
Chinese word segmentation
Use IKAnalyzer for Chinese word segmentation.
- Step 1: Copy the IKAnalyzer JAR to the following directory
- Step 2: Copy the configuration files of IKAnalyzer's extended dictionary to a directory
- Step 3: Configure a fieldType
- Step 4: Configure a field that uses the Chinese analyzer
- Step 5: Restart Tomcat
Configuring business fields
Requirement
Index the data of the products table from the Jingdong example; the corresponding fields need to be defined.
Configuration analysis
The following columns need to be added to the index: pid, name, catalog, catalog_name, price, description, picture.
FieldType: a fieldType for Chinese word segmentation has already been configured, so no new one is needed.
Field:
pid: pid is the unique key of the products table, and Solr's schema.xml already defines a unique key named id, so no separate field is needed for pid.
name:
<!-- product name -->
<field name="product_name" type="text_ik" indexed="true" stored="true"/>
catalog, catalog_name:
<!-- category ID -->
<field name="product_catalog" type="string" indexed="true" stored="true"/>
<!-- category name -->
<field name="product_catalog_name" type="string" indexed="true" stored="false"/>
price:
<!-- product price -->
<field name="product_price" type="float" indexed="true" stored="true"/>
description:
<!-- product description -->
<field name="product_description" type="text_ik" indexed="true" stored="false"/>
picture:
<!-- product picture -->
<field name="product_picture" type="string" indexed="false" stored="true"/>
keywords (copy field):
<!-- target field -->
<field name="product_keywords" type="text_ik" indexed="true" stored="true" multiValued="true"/>
<!-- copy the product name into the target field -->
<copyField source="product_name" dest="product_keywords"/>
<!-- copy the product description into the target field -->
<copyField source="product_description" dest="product_keywords"/>
Dataimport
The Dataimport plug-in imports the results of a specified SQL statement from the database into the Solr index.
- Step 1: Add the JAR files
Dataimport JAR: copy the Dataimport JAR (solr-4.10.3\dist\solr-dataimporthandler-extras-4.10.3.jar) to:
Modify solrconfig.xml and add a lib tag:
<lib dir="${solr.install.dir:../..}/contrib/dataimporthandler/lib" regex=".*\.jar" />
MySQL driver: copy the MySQL driver JAR to:
Modify solrconfig.xml and add a lib tag:
<lib dir="${solr.install.dir:../..}/contrib/db/lib" regex=".*\.jar" />
- Step 2: Configure the requestHandler
In solrconfig.xml, add a requestHandler for dataimport:
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>
- Step 3: Create data-config.xml
Create data-config.xml in the same directory as solrconfig.xml.
![](http://pbzzkhjh1.bkt.clouddn.com/1c648865-30cb-4dbe-9133-525692211bf4.jpg)
```xml
<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/taotao"
              user="root"
              password="root"/>
  <document>
    <entity name="products" query="select pid,name,catalog,catalog_name,price,description,picture from products">
      <field column="pid" name="id" />
      <field column="name" name="product_name" />
      <field column="catalog" name="product_catalog" />
      <field column="catalog_name" name="product_catalog_name" />
      <field column="price" name="product_price" />
      <field column="description" name="product_description" />
      <field column="picture" name="product_picture" />
    </entity>
  </document>
</dataConfig>
```
- Step 4: Restart Tomcat
Using SolrJ
What is SolrJ
SolrJ is the Java client for the Solr server.
Environment preparation
JDK, IDE, Tomcat, SolrJ
Setting up the project
- SolrJ's core JAR and its dependencies
- Solr's extension JARs
Maintaining the index with SolrJ
Adding and modifying documents
In Solr, every document in the index has a unique key. If a document with the same id already exists, the operation updates it; otherwise, the document is added.
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;
import org.junit.Test;

@Test
public void insertAndUpdateIndex() throws Exception {
    // Create the HttpSolrServer
    HttpSolrServer server = new HttpSolrServer("http://localhost:8080/solr");
    // Create the Document object
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "c001");
    doc.addField("name", "solr test111");
    // Add the Document object to the index
    server.add(doc);
    // Commit
    server.commit();
}
Deleting documents
- Delete by the specified ID
- Delete by query condition
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.junit.Test;

@Test
public void deleteIndex() throws Exception {
    // Create the HttpSolrServer
    HttpSolrServer server = new HttpSolrServer("http://localhost:8080/solr");
    // Delete the document with the specified ID
    // server.deleteById("c001");
    // Delete by query condition
    server.deleteByQuery("id:c001");
    // Delete all documents
    server.deleteByQuery("*:*");
    // Commit
    server.commit();
}
Querying the index
A simple query
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;
import org.junit.Test;

@Test
public void search01() throws Exception {
    // Create the HttpSolrServer
    HttpSolrServer server = new HttpSolrServer("http://localhost:8080/solr");
    // Create the SolrQuery
    SolrQuery query = new SolrQuery();
    // Set the query condition
    query.setQuery("product_name:minion");
    // Execute the query and get the response
    QueryResponse response = server.query(query);
    // Get all matching results
    SolrDocumentList list = response.getResults();
    // Total number of matching results
    long count = list.getNumFound();
    System.out.println("Total number of matching results: " + count);
    for (SolrDocument doc : list) {
        System.out.println(doc.get("id"));
        System.out.println(doc.get("product_name"));
        System.out.println(doc.get("product_catalog"));
        System.out.println(doc.get("product_price"));
        System.out.println(doc.get("product_picture"));
        System.out.println("=====================");
    }
}
Complex queries
Solr's query syntax
1. q - the query string; required. To match all documents, use *:*.
2. fq - (filter query) a filter applied on top of the q query, for example product_price:[1 TO 10] as used in the code below.
3. sort - sorting. Format: sort=<field name>+<desc|asc>[,<field name>+<desc|asc>]... For example, the code below sorts by product_price in ascending order.
4. start - the offset of the first record to return, used for paging; it starts from 0.
5. rows - the maximum number of records to return; used together with start for paging. In practice, you know the current page number and the page size, and compute the starting offset from them.
6. fl - the fields to return, separated by commas or spaces.
7. df - the default search field.
8. wt - (writer type) the output format, such as xml, json, php, or phps.
9. hl - whether to highlight; you can set the fields to highlight and the prefix and suffix markup.
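The paging arithmetic mentioned for start and rows, and the range syntax used by fq, can be sketched as follows. The class QueryParamsSketch and its method names are hypothetical helpers for illustration; they only compute values and do not call Solr.

```java
// Hypothetical helpers for computing query parameter values (not part of SolrJ).
class QueryParamsSketch {

    // Converts a 1-based page number and a page size into the 0-based start offset.
    static int startOffset(int page, int rows) {
        if (page < 1 || rows < 1) {
            throw new IllegalArgumentException("page and rows must be at least 1");
        }
        return (page - 1) * rows;
    }

    // Builds an inclusive range filter such as product_price:[1 TO 10].
    static String rangeFilter(String field, int min, int max) {
        return field + ":[" + min + " TO " + max + "]";
    }

    public static void main(String[] args) {
        // Page 3 with 10 rows per page starts at offset 20.
        System.out.println(startOffset(3, 10));
        System.out.println(rangeFilter("product_price", 1, 10));
    }
}
```

With SolrJ, the results would be passed to query.setStart(...), query.setRows(...), and query.addFilterQuery(...).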
Code
import java.util.List;
import java.util.Map;
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrQuery.ORDER;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrDocumentList;
import org.junit.Test;

@Test
public void search02() throws Exception {
    // Create the HttpSolrServer
    HttpSolrServer server = new HttpSolrServer("http://localhost:8080/solr");
    // Create the SolrQuery
    SolrQuery query = new SolrQuery();
    // Set the query condition
    query.setQuery("product_name:minion");
    // Equivalent: query.set("q", "product_name:minion");
    // Set the filter condition
    // Alternatively: query.addFilterQuery(fq)
    query.setFilterQueries("product_price:[1 TO 10]");
    // Set the sort order
    query.setSort("product_price", ORDER.asc);
    // Set the paging information (these are the default values)
    query.setStart(0);
    query.setRows(10);
    // Set the fields to return
    query.setFields("id,product_name,product_catalog,product_price,product_picture");
    // Set the default search field
    query.set("df", "product_keywords");
    // Set the highlighting information
    query.setHighlight(true);
    query.addHighlightField("product_name");
    query.setHighlightSimplePre("<em>");
    query.setHighlightSimplePost("</em>");
    // Execute the query and get the response
    QueryResponse response = server.query(query);
    // Get all matching results
    SolrDocumentList list = response.getResults();
    // Total number of matching results
    long count = list.getNumFound();
    System.out.println("Total number of matching results: " + count);
    // Get the highlighting information, keyed by document id and then by field name
    Map<String, Map<String, List<String>>> highlighting = response.getHighlighting();
    for (SolrDocument doc : list) {
        System.out.println(doc.get("id"));
        List<String> list2 = highlighting.get(doc.get("id")).get("product_name");
        if (list2 != null) {
            System.out.println("Highlighted product name: " + list2.get(0));
        } else {
            System.out.println(doc.get("product_name"));
        }
        System.out.println(doc.get("product_catalog"));
        System.out.println(doc.get("product_price"));
        System.out.println(doc.get("product_picture"));
        System.out.println("=====================");
    }
}
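The highlighting result is a nested map: document id, then field name, then a list of snippets. A defensive way to read it, falling back to the stored value when no snippet exists, might look like the sketch below. It uses plain java.util maps so it runs without a Solr server; the class HighlightSketch and the method displayValue are hypothetical.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: null-safe lookup in a highlighting map shaped like
// id -> field name -> list of highlighted snippets.
class HighlightSketch {

    // Returns the first highlighted snippet for the field, or the fallback if there is none.
    static String displayValue(Map<String, Map<String, List<String>>> highlighting,
                               String id, String field, String fallback) {
        Map<String, List<String>> perDoc = highlighting.get(id);
        if (perDoc != null) {
            List<String> snippets = perDoc.get(field);
            if (snippets != null && !snippets.isEmpty()) {
                return snippets.get(0);
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        Map<String, List<String>> fields = new HashMap<>();
        fields.put("product_name", Arrays.asList("<em>minion</em> toy"));
        Map<String, Map<String, List<String>>> highlighting = new HashMap<>();
        highlighting.put("c001", fields);

        // A highlighted snippet exists for c001 but not for c002.
        System.out.println(displayValue(highlighting, "c001", "product_name", "minion toy"));
        System.out.println(displayValue(highlighting, "c002", "product_name", "plain name"));
    }
}
```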