Elasticsearch is built on Apache Lucene and was first released in 2010 by Elasticsearch N.V. (now Elastic). According to Elastic, it is a distributed, open-source search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured. Elasticsearch operations are made available through a REST API. Its main functions are:
- Store documents in indexes,
- Search the index with powerful queries to retrieve those documents, and
- Run analysis functions on the data.
Spring Data Elasticsearch provides a simple interface to perform these operations on Elasticsearch as an alternative to using the REST API directly. Here, we’ll use Spring Data Elasticsearch to demonstrate Elasticsearch’s indexing and search capabilities, and finally build a simple search application for searching products in the product inventory.
Code sample
A working code example accompanying this article is available on GitHub.
Elasticsearch concept
The easiest way to understand the Elasticsearch concept is to use a database analogy, as shown in the following table:
| Elasticsearch | -> | Database |
|---|---|---|
| Index | -> | Table |
| Document | -> | Row |
| Field | -> | Column |
Any data we want to search for or analyze is stored in the index as a document. In Spring Data, we represent a document as a POJO and decorate it with annotations to define the mapping to Elasticsearch documents.
Unlike a database, text stored in Elasticsearch is first processed by various analyzers. The default analyzer splits the text on common word separators, such as spaces and punctuation, and removes common English words.

If we store the text "The sky is blue", the analyzer stores it as a document containing the "terms" "sky" and "blue". We can then search this document with text such as "blue sky", "sky", or "blue", with the degree of match returned as a score.
Besides text, Elasticsearch can store other types of data, known as field types, as described in the mapping types section of the documentation.
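To make the analysis step concrete, here is a rough plain-Java sketch of the tokenizing behavior described above. This is illustrative only, not Elasticsearch's actual analyzer, and the stop word list is a tiny made-up subset:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

public class AnalyzerSketch {

  // A tiny subset of English stop words; the real list is much longer.
  private static final Set<String> STOP_WORDS =
      Set.of("the", "is", "a", "an", "and", "of");

  // Roughly mimics the described behavior: split on non-letter characters,
  // lowercase each token, and drop common English stop words.
  static List<String> analyze(String text) {
    return Arrays.stream(text.split("[^\\p{L}]+"))
        .filter(token -> !token.isEmpty())
        .map(String::toLowerCase)
        .filter(token -> !STOP_WORDS.contains(token))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // "The sky is blue" is reduced to the terms "sky" and "blue".
    System.out.println(analyze("The Sky is blue")); // [sky, blue]
  }
}
```

Searching for "blue sky" then matches on the stored terms, which is why word order and capitalization do not matter for the match.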
Start the Elasticsearch instance
Before we go any further, let’s launch an instance of Elasticsearch that we’ll use to run our example. There are several ways to run an instance of Elasticsearch:
- Use a hosted Elasticsearch service
- Use managed services from cloud providers such as AWS or Azure
- Install Elasticsearch on a cluster of VMs
- Run the Docker image
We’ll use a Docker image from Docker Hub, which is sufficient for our demo application. Let’s start the Elasticsearch instance by running the docker run command:
```shell
docker run -p 9200:9200 \
  -e "discovery.type=single-node" \
  docker.elastic.co/elasticsearch/elasticsearch:7.10.0
```
Executing this command starts an Elasticsearch instance listening on port 9200. We can verify the instance status by opening the URL http://localhost:9200 in a browser and checking the output:
```json
{
  "name" : "8c06d897d156",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "Jkx..VyQ",
  "version" : {
    "number" : "7.10.0",
    ...
  },
  "tagline" : "You Know, for Search"
}
```
We should see the output above if our Elasticsearch instance started successfully.
Use REST APIs for indexing and searching
The Elasticsearch operation is accessed through the REST API. There are two ways to add documents to the index:
- Add one document at a time, or
- Add documents in batches.
The API for adding a single document takes a single document as a parameter.
A simple PUT request to an Elasticsearch instance is used to store the document as follows:
```json
PUT /messages/_doc/1
{
  "message": "The Sky is blue today"
}
```
This stores the message "The Sky is blue today" as a document in the index "messages".
We can retrieve this document using a search query sent to the search REST API:
```json
GET /messages/_search
{
  "query": {
    "match": {
      "message": "blue sky"
    }
  }
}
```
Here we send a query of type match to fetch documents matching the string "blue sky". Queries for searching documents can be specified in many ways; Elasticsearch provides a JSON-based query DSL (Domain Specific Language) to define them.
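As a toy illustration of the query DSL's shape, a match-query body like the one above can be assembled with plain string formatting. In real code we would use a JSON library or the query builders discussed later; the field and text values here are just examples, and no JSON escaping is done:

```java
public class MatchQueryDemo {

  // Builds a minimal match-query body like the one sent to /messages/_search.
  // Illustration only: no escaping of special JSON characters is performed.
  static String matchQuery(String field, String text) {
    return String.format("{\"query\":{\"match\":{\"%s\":\"%s\"}}}", field, text);
  }

  public static void main(String[] args) {
    System.out.println(matchQuery("message", "blue sky"));
    // {"query":{"match":{"message":"blue sky"}}}
  }
}
```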
For bulk additions, we need to supply a JSON document containing entries similar to the following snippet:

```json
POST /_bulk
{"index":{"_index":"productindex"}}
{"_class":"..Product","name":"Corgi Toys ..Car","manufacturer":"Hornby"}
{"index":{"_index":"productindex"}}
{"_class":"..Product","name":"CLASSIC TOY ..BATTERY",..,"manufacturer":"ccf"}
```
Use Spring Data for Elasticsearch operations
There are two ways to access Elasticsearch using Spring Data, as shown below:
- Repositories: we define methods in an interface, and Elasticsearch queries are generated from the method names at runtime.
- ElasticsearchRestTemplate: we create queries with method chaining and native queries, giving us finer control over the generated Elasticsearch queries in relatively complex scenarios.
We will examine both approaches in more detail in the following sections.
Create the application and add dependencies
Let’s start by creating our application with Spring Initializr, including the Web, Thymeleaf, and Lombok dependencies. The Thymeleaf dependency is added for building the user interface.

Then add the spring-data-elasticsearch dependency to the Maven pom.xml:
```xml
<dependency>
  <groupId>org.springframework.data</groupId>
  <artifactId>spring-data-elasticsearch</artifactId>
</dependency>
```
Connect to Elasticsearch instance
Spring Data Elasticsearch uses the Java High Level REST Client (JHLC) to connect to the Elasticsearch server. JHLC is the default client for Elasticsearch. We’ll create a Spring Bean configuration to set it up:
```java
@Configuration
@EnableElasticsearchRepositories(basePackages = "io.pratik.elasticsearch.repositories")
@ComponentScan(basePackages = {"io.pratik.elasticsearch"})
public class ElasticsearchClientConfig extends AbstractElasticsearchConfiguration {

  @Override
  @Bean
  public RestHighLevelClient elasticsearchClient() {

    final ClientConfiguration clientConfiguration =
        ClientConfiguration
            .builder()
            .connectedTo("localhost:9200")
            .build();

    return RestClients.create(clientConfiguration).rest();
  }
}
```
Here, we connect to the Elasticsearch instance we started earlier. We can further customize the connection by adding more properties, such as enabling SSL, setting timeout, and so on.
For debugging and diagnostics, we turn on transport-level request/response logging in the logging configuration in logback-spring.xml.
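Assuming the standard Spring Data Elasticsearch client logging support, a logger entry along these lines in logback-spring.xml enables the wire-level trace output shown in the following sections:

```xml
<logger name="org.springframework.data.elasticsearch.client.WIRE" level="trace"/>
```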
Representing the document
In our example, we will search for products by name, brand, price, or description. So to store a product as a document in Elasticsearch, we represent the product as a POJO and decorate it with @Field annotations to configure the Elasticsearch mapping, as shown below:
```java
@Document(indexName = "productindex")
public class Product {

  @Id
  private String id;

  @Field(type = FieldType.Text, name = "name")
  private String name;

  @Field(type = FieldType.Double, name = "price")
  private Double price;

  @Field(type = FieldType.Integer, name = "quantity")
  private Integer quantity;

  @Field(type = FieldType.Keyword, name = "category")
  private String category;

  @Field(type = FieldType.Text, name = "desc")
  private String description;

  @Field(type = FieldType.Keyword, name = "manufacturer")
  private String manufacturer;
  ...
}
```
The @Document annotation specifies the index name.

The @Id annotation makes the annotated field the _id of the document, its unique identifier within the index. The id field has a limit of 512 characters.

The @Field annotation configures the type of a field. We can also set the name to a different field name.

Based on these annotations, an index named productindex is created in Elasticsearch.
Use Spring Data Repository for indexing and searching
Repositories provide the most convenient way to access data in Spring Data using finder methods. The Elasticsearch queries are created from the method names. However, we have to be careful not to generate inefficient queries that put a high load on the cluster.
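As a toy model of how queries are derived from method names, the sketch below extracts the queried property from a finder method name. This only illustrates the naming convention, not Spring Data's actual parser, which also understands keywords such as Containing, And, Or, and Between:

```java
public class FinderNameSketch {

  // Derives the queried property from a derived-query method name,
  // e.g. "findByName" -> "name". A simplified model of the convention
  // Spring Data applies when parsing repository method names.
  static String propertyFor(String methodName) {
    String prefix = "findBy";
    if (!methodName.startsWith(prefix)) {
      throw new IllegalArgumentException("not a finder method: " + methodName);
    }
    String property = methodName.substring(prefix.length());
    return Character.toLowerCase(property.charAt(0)) + property.substring(1);
  }

  public static void main(String[] args) {
    System.out.println(propertyFor("findByName")); // name
  }
}
```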
Let’s create a Spring Data repository interface by extending the ElasticsearchRepository interface:
```java
public interface ProductRepository
    extends ElasticsearchRepository<Product, String> {
}
```
Here the ProductRepository class inherits methods such as save(), saveAll(), findById(), and findAll() from the ElasticsearchRepository interface.
Indexing
We will now store one product in the index by calling the save() method, and multiple products by calling the saveAll() method. Before that, we wrap the repository interface in a service class:
```java
@Service
public class ProductSearchServiceWithRepo {

  private ProductRepository productRepository;

  public void createProductIndexBulk(final List<Product> products) {
    productRepository.saveAll(products);
  }

  public void createProductIndex(final Product product) {
    productRepository.save(product);
  }
}
```
When we call these methods from JUnit, we can see the index and bulk-index REST API calls in the trace log.
Searching
To satisfy our search requirements, we will add finder methods to the repository interface:
```java
public interface ProductRepository
    extends ElasticsearchRepository<Product, String> {

  List<Product> findByName(String name);

  List<Product> findByNameContaining(String name);

  List<Product> findByManufacturerAndCategory(String manufacturer, String category);
}
```
When we run the findByName() method with JUnit, we can see the Elasticsearch query generated in the trace log before it is sent to the server:
```
TRACE Sending request POST /productindex/_search?..:
Request body: {.."query":{"bool":{"must":[{"query_string":{"query":"apple","fields":["name^1.0"],..}
```
Similarly, by running the findByManufacturerAndCategory() method, we can see the generated query with two query_string parameters corresponding to the two fields "manufacturer" and "category":
```
TRACE .. Sending request POST /productindex/_search..:
Request body: {.."query":{"bool":{"must":[{"query_string":{"query":"samsung","fields":["manufacturer^1.0"],..}},{"query_string":{"query":"laptop","fields":["category^1.0"],..}}],..}},"version":true}
```
There are several method naming patterns that can generate various Elasticsearch queries.
Indexing and searching with ElasticsearchRestTemplate
The Spring Data repository approach may not be a good fit when we need more control over how we design our queries, or when the team already has expertise with Elasticsearch syntax.

In that case, we use ElasticsearchRestTemplate. It is the new HTTP-based client for Elasticsearch, replacing the TransportClient, which used a node-to-node binary protocol.

ElasticsearchRestTemplate implements the ElasticsearchOperations interface, which does the heavy lifting of the low-level search and cluster operations.
Indexing
The interface has an index() method for adding a single document and a bulkIndex() method for adding multiple documents to an index. The code snippet here shows the use of bulkIndex() to add multiple products to the index "productindex":
```java
@Service
@Slf4j
public class ProductSearchService {

  private static final String PRODUCT_INDEX = "productindex";

  private ElasticsearchOperations elasticsearchOperations;

  public List<String> createProductIndexBulk(final List<Product> products) {

    List<IndexQuery> queries = products.stream()
        .map(product ->
            new IndexQueryBuilder()
                .withId(product.getId().toString())
                .withObject(product)
                .build())
        .collect(Collectors.toList());

    return elasticsearchOperations
        .bulkIndex(queries, IndexCoordinates.of(PRODUCT_INDEX));
  }
  ...
}
```
The documents to be stored are contained in IndexQuery objects. The bulkIndex() method takes as input a list of IndexQuery objects and the Index name contained in IndexCoordinates. When we execute this method, we get a REST API trace for bulk requests:
```
Sending request POST /_bulk?timeout=1m with parameters:
Request body: {"index":{"_index":"productindex","_id":"383..35"}}{"_class":"..Product","id":"383..35","name":"New Apple..phone",..,"manufacturer":"apple"}
..{"_class":"..Product","id":"d7a..34",..,"manufacturer":"samsung"}
```
Next, we add a single document using the index() method:
```java
@Service
@Slf4j
public class ProductSearchService {

  private static final String PRODUCT_INDEX = "productindex";

  private ElasticsearchOperations elasticsearchOperations;

  public String createProductIndex(Product product) {

    IndexQuery indexQuery = new IndexQueryBuilder()
        .withId(product.getId().toString())
        .withObject(product)
        .build();

    String documentId = elasticsearchOperations
        .index(indexQuery, IndexCoordinates.of(PRODUCT_INDEX));

    return documentId;
  }
}
```
The trace accordingly shows the REST API PUT requests for adding individual documents.
```
Sending request PUT /productindex/_doc/59d..987..:
Request body: {"_class":"..Product","id":"59d..87",..,"manufacturer":"dell"}
```
Searching
ElasticsearchRestTemplate also has a search() method for searching documents in an index. This search operation resembles Elasticsearch queries, and is built by constructing a Query object and passing it to the search() method.

The Query object comes in three variants, NativeQuery, StringQuery, and CriteriaQuery, depending on how we construct the query. Let's build a few queries for searching products.
NativeQuery
NativeQuery provides maximum flexibility for building queries using objects that represent Elasticsearch constructs such as aggregation, filtering, and sorting. This is the NativeQuery used to search for products that match a particular manufacturer:
```java
@Service
@Slf4j
public class ProductSearchService {

  private static final String PRODUCT_INDEX = "productindex";

  private ElasticsearchOperations elasticsearchOperations;

  public void findProductsByBrand(final String brandName) {

    QueryBuilder queryBuilder =
        QueryBuilders
            .matchQuery("manufacturer", brandName);

    Query searchQuery = new NativeSearchQueryBuilder()
        .withQuery(queryBuilder)
        .build();

    SearchHits<Product> productHits =
        elasticsearchOperations
            .search(searchQuery,
                Product.class,
                IndexCoordinates.of(PRODUCT_INDEX));
  }
}
```
Here we build the query with NativeSearchQueryBuilder, which uses a MatchQueryBuilder to specify a match query on the field "manufacturer".
StringQuery
StringQuery provides full control by allowing native Elasticsearch queries to be used as JSON strings, as shown below:
```java
@Service
@Slf4j
public class ProductSearchService {

  private static final String PRODUCT_INDEX = "productindex";

  private ElasticsearchOperations elasticsearchOperations;

  public void findByProductName(final String productName) {

    Query searchQuery = new StringQuery(
        "{\"match\":{\"name\":{\"query\":\"" + productName + "\"}}}");

    SearchHits<Product> products = elasticsearchOperations.search(
        searchQuery,
        Product.class,
        IndexCoordinates.of(PRODUCT_INDEX));
    ...
  }
}
```
In this snippet, we specify a simple match query to get a product with a specific name sent as a method parameter.
CriteriaQuery
With CriteriaQuery we can build queries without knowing any Elasticsearch terminology. The queries are built with method chaining on Criteria objects, each of which specifies some criteria used for searching documents:
```java
@Service
@Slf4j
public class ProductSearchService {

  private static final String PRODUCT_INDEX = "productindex";

  private ElasticsearchOperations elasticsearchOperations;

  public void findByProductPrice(final String productPrice) {

    Criteria criteria = new Criteria("price")
        .greaterThan(10.0)
        .lessThan(100.0);

    Query searchQuery = new CriteriaQuery(criteria);

    SearchHits<Product> products = elasticsearchOperations
        .search(searchQuery,
            Product.class,
            IndexCoordinates.of(PRODUCT_INDEX));
  }
}
```
In this code snippet, we use CriteriaQuery to form queries to get products with prices greater than 10.0 and less than 100.0.
Build the search application
We will now add a user interface to our application to see the product search in action. The user interface has a search input box for searching products by name or description. The input box has an autocomplete feature that shows a list of suggestions based on the available products.

We create autocomplete suggestions for the user's search input, then search for products whose name or description closely matches the text the user entered. We build two search services to implement this use case:
- Get autocomplete search suggestions
- Process the search query to search for products
The service class ProductSearchService will contain methods to search and get suggestions.
The complete application with the user interface is available in the GitHub repository.
Build product search indexes
"productindex" is the same index we used for running the JUnit tests. We first delete productindex with the Elasticsearch REST API, so that a fresh productindex can be created during application startup with products loaded from our sample dataset of 50 fashion products:
```shell
curl -X DELETE http://localhost:9200/productindex
```
If the deletion is successful, we receive the message {“acknowledged”: true}.
Now, let’s create an index for the products in inventory. We will use a sample dataset of 50 products to build our index. These products are arranged as separate lines in the CSV file.
Each row has three attributes: id, name, and description. We want the index to be created during application startup. Note that in real production environments, index creation would be a separate process. We read each line of the CSV and add it to the product index:
```java
@SpringBootApplication
@Slf4j
public class ProductsearchappApplication {
  ...
  @PostConstruct
  public void buildIndex() {
    esOps.indexOps(Product.class).refresh();
    productRepo.saveAll(prepareDataset());
  }

  private Collection<Product> prepareDataset() {
    Resource resource = new ClassPathResource("fashion-products.csv");
    ...
    return productList;
  }
}
```
In this snippet, we do some preprocessing by reading the rows from the dataset and passing them to the saveAll() method of the repository to add products to the index. On running the application, we can see the following trace logs at application startup:
```
.. Sending request POST /_bulk?timeout=1m with parameters:
Request body: {"index":{"_index":"productindex"}}{"_class":"io.pratik.elasticsearch.productsearchapp.Product","name":"Hornby 2014 Catalogue","description":"Product Desc..talogue","manufacturer":"Hornby"}
{"index":{"_index":"productindex"}}{"_class":"io.pratik.elasticsearch.productsearchapp.Product","name":"FunkyBuys..","description":"Size Name:Lar..& Smoke","manufacturer":"FunkyBuys"}
{"index":{"_index":"productindex"}}...
```
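The row-to-object mapping inside prepareDataset() might look something like the sketch below. The parseLine() helper and the Row holder are hypothetical names for illustration; real CSV data with embedded commas or quotes would need a proper CSV library:

```java
public class CsvRowSketch {

  // A simple holder mirroring the three CSV attributes (id, name, description);
  // the real application maps these onto the Product document class.
  record Row(String id, String name, String description) {}

  // Naive split-based parsing; assumes fields contain no embedded commas.
  static Row parseLine(String line) {
    String[] parts = line.split(",", 3);
    return new Row(parts[0].trim(), parts[1].trim(), parts[2].trim());
  }

  public static void main(String[] args) {
    Row row = parseLine("101,Hornby 2014 Catalogue,64-page catalogue");
    System.out.println(row.name()); // Hornby 2014 Catalogue
  }
}
```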
Use multi-field and fuzzy search to search for products
Here is how we handle a submitted search request in the processSearch() method:
```java
@Service
@Slf4j
public class ProductSearchService {

  private static final String PRODUCT_INDEX = "productindex";

  private ElasticsearchOperations elasticsearchOperations;

  public List<Product> processSearch(final String query) {
    log.info("Search with query {}", query);

    // 1. Create query on multiple fields enabling fuzzy search
    QueryBuilder queryBuilder =
        QueryBuilders
            .multiMatchQuery(query, "name", "description")
            .fuzziness(Fuzziness.AUTO);

    Query searchQuery = new NativeSearchQueryBuilder()
        .withFilter(queryBuilder)
        .build();

    // 2. Execute search
    SearchHits<Product> productHits =
        elasticsearchOperations
            .search(searchQuery, Product.class,
                IndexCoordinates.of(PRODUCT_INDEX));

    // 3. Map searchHits to product list
    List<Product> productMatches = new ArrayList<Product>();
    productHits.forEach(searchHit -> {
      productMatches.add(searchHit.getContent());
    });
    return productMatches;
  }
  ...
}
```
Here we perform the search on two fields, name and description. We also apply fuzziness() to match closely spelled text, to account for spelling errors.
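Fuzziness is based on edit distance. With Fuzziness.AUTO, Elasticsearch picks the maximum allowed edit distance from the term length: 0 for terms of 1 to 2 characters, 1 for 3 to 5 characters, and 2 for longer terms. The sketch below illustrates the idea with a plain Levenshtein distance; Elasticsearch actually uses a Damerau-Levenshtein variant that also counts transpositions as single edits:

```java
public class FuzzinessSketch {

  // Fuzziness.AUTO's length-based maximum edit distance.
  static int autoMaxEdits(String term) {
    int len = term.length();
    if (len <= 2) return 0;
    if (len <= 5) return 1;
    return 2;
  }

  // Classic Levenshtein distance (insertions, deletions, substitutions).
  static int editDistance(String a, String b) {
    int[][] d = new int[a.length() + 1][b.length() + 1];
    for (int i = 0; i <= a.length(); i++) d[i][0] = i;
    for (int j = 0; j <= b.length(); j++) d[0][j] = j;
    for (int i = 1; i <= a.length(); i++) {
      for (int j = 1; j <= b.length(); j++) {
        int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
        d[i][j] = Math.min(
            Math.min(d[i - 1][j] + 1, d[i][j - 1] + 1),
            d[i - 1][j - 1] + cost);
      }
    }
    return d[a.length()][b.length()];
  }

  public static void main(String[] args) {
    // "aple" is one edit away from "apple", within AUTO's allowed distance.
    System.out.println(editDistance("apple", "aple") <= autoMaxEdits("aple"));
  }
}
```

This is why a misspelled query such as "aple" can still match products whose name contains "apple".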
Use wildcard search to get suggestions
Next, we build autocomplete for the search text box. When we enter content in the search text field, we will get suggestions by performing a wildcard search using the characters entered in the search box.
We build this function in the fetchSuggestions() method, as follows:
```java
@Service
@Slf4j
public class ProductSearchService {

  private static final String PRODUCT_INDEX = "productindex";

  private ElasticsearchOperations elasticsearchOperations;

  public List<String> fetchSuggestions(String query) {
    QueryBuilder queryBuilder = QueryBuilders
        .wildcardQuery("name", query + "*");

    Query searchQuery = new NativeSearchQueryBuilder()
        .withFilter(queryBuilder)
        .withPageable(PageRequest.of(0, 5))
        .build();

    SearchHits<Product> searchSuggestions =
        elasticsearchOperations.search(searchQuery,
            Product.class,
            IndexCoordinates.of(PRODUCT_INDEX));

    List<String> suggestions = new ArrayList<String>();
    searchSuggestions.getSearchHits().forEach(searchHit -> {
      suggestions.add(searchHit.getContent().getName());
    });
    return suggestions;
  }
}
```
We form the wildcard query from the search input text with * appended, so that if we type "red", we get suggestions starting with "red". We limit the number of suggestions to 5 with the withPageable() method.
Conclusion
In this article, we introduced the main operations of Elasticsearch — indexing documents, bulk indexing, and search — which are provided as REST APIs. The query DSL, combined with different analyzers, makes searching very powerful.
Spring Data Elasticsearch provides convenient interfaces for accessing those operations in an application, either in the form of Spring Data repositories or ElasticsearchRestTemplate.
We ended up building an application where we saw how to use Elasticsearch’s bulk indexing and search capabilities in a near-real life application.
- Reference: Using Elasticsearch with Spring Boot (Reflectoring)
- For the ELK stack, please refer to: ELK Tutorial – Discover, Analyze and Visualize Your Data