In the previous article, “Elasticsearch: Creating your own Ingest Processor,” I showed you how to create an Ingest plug-in using a template. In today’s article, we take another approach to the same task: we use Eclipse to generate a Maven project.

Preface

Elasticsearch is written in Java, and its modules are wired together with Google Guice during Elasticsearch server startup. In addition, a plugin API is provided to further extend Elasticsearch’s functionality. In this article, we’ll show you how to write an Ingest plug-in for Elasticsearch that filters a keyword out of a document field before the document is indexed. The example plug-in is fairly simple, but it gives you enough knowledge to write a fully functional Elasticsearch Ingest plug-in that can process documents before they are actually indexed.

Elasticsearch plug-in API

Elasticsearch loads plugins from its plugins directory. Each plug-in needs to provide a class that extends org.elasticsearch.plugins.Plugin. In addition, that class can implement one of the interfaces from the same org.elasticsearch.plugins package, which determines the plug-in type. In this article we use IngestPlugin.

In addition, each plugin needs to provide plugin metadata to Elasticsearch in the plugin-descriptor.properties file and, optionally, a plugin-security.policy file. The policy file lists the access control permissions (as defined by the JDK) that the plug-in requires under the security sandbox model.
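To make the metadata file more concrete, here is a minimal sketch that loads a hypothetical plugin-descriptor.properties for this article’s plug-in through java.util.Properties and checks that the keys Elasticsearch 7.x expects are present. The class name and version numbers are taken from the code and commands in this article. The plug-in name, the description text and the java.version value are assumptions, and DescriptorSketch itself is only an illustration, not part of the plug-in.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

final class DescriptorSketch {

    // Hypothetical descriptor content; name, description and java.version are assumptions.
    static final String DESCRIPTOR = String.join("\n",
        "description=Ingest processor that filters a word out of a field",
        "version=1.0.0-SNAPSHOT",
        "name=filter-ingest-plugin",
        "classname=com.liuxg.elasticsearch.plugin.filter.FilterIngestPlugin",
        "java.version=1.8",
        "elasticsearch.version=7.6.2");

    // Keys that a 7.x plugin descriptor has to define.
    static final String[] REQUIRED_KEYS = {
        "description", "version", "name", "classname", "java.version", "elasticsearch.version"
    };

    // Parses descriptor text in the standard .properties format.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Returns the required keys that are absent or blank.
    static List<String> missingKeys(Properties props) {
        List<String> missing = new ArrayList<>();
        for (String key : REQUIRED_KEYS) {
            String value = props.getProperty(key);
            if (value == null || value.trim().isEmpty()) {
                missing.add(key);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        Properties props = load(DESCRIPTOR);
        System.out.println("classname = " + props.getProperty("classname"));
        System.out.println("missing keys = " + missingKeys(props));
    }
}
```

The elasticsearch.version value has to match the version of the node the plug-in is installed into, which is 7.6.2 in this article.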

 

Hands-on practice

Create the plug-in

In today’s exercise, we will create a keyword-filtering ingest plug-in. We will use Eclipse to demonstrate this. First, start Eclipse and create a new project:

Above, we select Maven Project:

Next, we select our workspace location and click Next:

If you haven’t already configured your catalog, you can click Configure:

Let’s click Add Remote Catalog and add the following catalog: repo1.maven.org/maven2/

 

After configuring the catalog, we go back to the previous page:

Let’s click the Next button:

We fill in the above information and click the Finish button. This generates a sample plugin, but it is not yet the Ingest plugin skeleton we were hoping for, so we need to change it accordingly:

FilterIngestPlugin.java

package com.liuxg.elasticsearch.plugin.filter;

import java.util.HashMap;
import java.util.Map;

import org.elasticsearch.ingest.Processor;
import org.elasticsearch.plugins.IngestPlugin;
import org.elasticsearch.plugins.Plugin;

public class FilterIngestPlugin extends Plugin implements IngestPlugin {

    @Override
    public Map<String, Processor.Factory> getProcessors(Processor.Parameters parameters) {
        Map<String, Processor.Factory> processors = new HashMap<>();
        processors.put(FilterWordProcessor.TYPE, new FilterWordProcessor.Factory());
        return processors;
    }
}

Once the plug-in is installed, the getProcessors method returns one or more ingest processors that Elasticsearch can use. In this case, we register a single processor, provided by the FilterWordProcessor class, whose implementation follows. That class does not exist yet, so Eclipse marks the reference in red. We click on it and create another file called FilterWordProcessor.java:

FilterWordProcessor.java

package com.liuxg.elasticsearch.plugin.filter;

import java.util.Map;

import org.elasticsearch.ingest.AbstractProcessor;
import org.elasticsearch.ingest.ConfigurationUtils;
import org.elasticsearch.ingest.IngestDocument;
import org.elasticsearch.ingest.Processor;

public class FilterWordProcessor extends AbstractProcessor {

    // Processor type name used in pipeline definitions
    public static final String TYPE = "filter_word";

    private String filterWord;
    private String field;

    public FilterWordProcessor(String tag, String filterWord, String field) {
        super(tag);
        this.filterWord = filterWord;
        this.field = field;
    }

    @Override
    public IngestDocument execute(IngestDocument ingestDocument) throws Exception {
        IngestDocument document = ingestDocument;
        // Read the configured field as a String
        String value = document.getFieldValue(field, String.class);
        // Remove every occurrence of the filter word
        String clearedValue = value.replace(filterWord, "");
        document.setFieldValue(field, clearedValue);
        return document;
    }

    @Override
    public String getType() {
        return TYPE;
    }

    public static final class Factory implements Processor.Factory {

        @Override
        public Processor create(Map<String, Processor.Factory> registry, String processorTag,
                Map<String, Object> config) throws Exception {
            // Both properties are required in the pipeline definition
            String field = ConfigurationUtils.readStringProperty(TYPE, processorTag, config, "field");
            String filterWord = ConfigurationUtils.readStringProperty(TYPE, processorTag, config, "filterWord");
            return new FilterWordProcessor(processorTag, filterWord, field);
        }
    }
}

The ingest processor has a type named filter_word, which Elasticsearch uses to identify it in pipeline definitions. The processor takes two properties that are specified when it is created: field, the document field to be filtered, and filterWord, the word to be removed from that field before the document is indexed. The execute method contains the logic of the processor: it reads the field as a string, removes every occurrence of filterWord, and writes the result back.
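The filtering step can be tried without an Elasticsearch dependency. The sketch below mirrors what execute does, using a plain Map in place of IngestDocument. FilterWordSketch and its filter method are hypothetical names introduced only for this illustration, and the explicit type check is an addition that the processor above does not have.

```java
import java.util.HashMap;
import java.util.Map;

final class FilterWordSketch {

    // Removes every occurrence of filterWord from the String stored under field.
    // The map is modified in place and returned, like the document in execute().
    static Map<String, Object> filter(Map<String, Object> source, String field, String filterWord) {
        Object value = source.get(field);
        if (!(value instanceof String)) {
            throw new IllegalArgumentException("field [" + field + "] is missing or not a string");
        }
        source.put(field, ((String) value).replace(filterWord, ""));
        return source;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("description", "crap ! Don't buy this.");
        // Prints the filtered document
        System.out.println(filter(doc, "description", "crap"));
    }
}
```

Note that String.replace only removes the word itself. Whitespace and punctuation around it are left untouched, so a value that starts with the filtered word keeps its leading space.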

All right. We are almost done with the code we need to change.

 

Compile the plug-in

Let’s go ahead and compile the code we just generated. We use the following command:

mvn clean install

If nothing unexpected happens, you’ll see the same error message I did:

We go back to the RestFilterIngestAction.java file and make the corresponding modification:

We make the corresponding changes, and then recompile:

We copy the generated plugin into the Elasticsearch installation root directory:

cp ./target/releases/filter-ingest-plugin-1.0.0-SNAPSHOT.zip ~/elastic0/elasticsearch-7.6.2

The destination path in the command above depends on where you installed Elasticsearch, so adjust it to match your own installation.

Go to the root directory of the Elasticsearch installation and type the following command:

./bin/elasticsearch-plugin install file:///Users/liuxg/elastic0/elasticsearch-7.6.2/filter-ingest-plugin-1.0.0-SNAPSHOT.zip

Note the file:/// prefix above, which is followed by the absolute path to your zip file. Without it, the installation will fail. After the installation is complete, we can use the following command to check whether the plug-in was installed:

./bin/elasticsearch-plugin list

The last step is a very important one: we need to restart Elasticsearch so that the new plugin is loaded.

Test it out in Kibana

Open Kibana and type the following command:

PUT /_ingest/pipeline/filter_crap
{
  "processors": [
    {
      "filter_word": {
        "field": "description",
        "filterWord": "crap"
      }
    }
  ]
}

Above, we want to filter the description field, and our filter keyword is crap. Running the above command returns:

{
  "acknowledged" : true
}

This indicates that our filter_word processor has been installed and can be found. This command will fail if it is not installed or not loaded by Elasticsearch.
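The same pipeline definition can also be sent from Java rather than from the Kibana console. The following is a minimal sketch that only builds the request with the JDK’s java.net.http API (Java 11 or later). It assumes an unsecured node listening on http://localhost:9200, which is an assumption of this example and not something stated in the article, and PipelineRequestSketch is a hypothetical helper name.

```java
import java.net.URI;
import java.net.http.HttpRequest;

final class PipelineRequestSketch {

    // Same pipeline body as in the Kibana command above.
    static final String PIPELINE_BODY =
        "{ \"processors\": [ { \"filter_word\": "
            + "{ \"field\": \"description\", \"filterWord\": \"crap\" } } ] }";

    // Builds (but does not send) the PUT request that registers the pipeline.
    static HttpRequest putPipeline(String baseUrl, String pipelineId) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_ingest/pipeline/" + pipelineId))
            .header("Content-Type", "application/json")
            .PUT(HttpRequest.BodyPublishers.ofString(PIPELINE_BODY))
            .build();
    }

    public static void main(String[] args) {
        // Assumed address of a local, unsecured node.
        HttpRequest request = putPipeline("http://localhost:9200", "filter_crap");
        System.out.println(request.method() + " " + request.uri());
        // To send it against a running node:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

Sending the request is left commented out so that the sketch runs without a live cluster.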

We then execute the following command:

PUT /order_data/_doc/1?pipeline=filter_crap
{
  "description": "crap ! Don't buy this."
}

The response is:

{
  "_index" : "order_data",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 1,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}

In the request above, the description field contains the word crap. We expect that, after our filter runs, the word will no longer appear in the document stored in Elasticsearch. It’s a bit like making sure that everything we see is always nice, with no bad language 🙂

We then use the following command to check:

GET order_data/_search

{
  "took" : 1,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "order_data",
        "_type" : "_doc",
        "_id" : "1",
        "_score" : 1.0,
        "_source" : {
          "description" : " ! Don't buy this."
        }
      }
    ]
  }
}

In the result above, we can see that the word crap is gone from description. This shows that our filter processor is working properly.

If you are interested in the complete code, please see my source: github.com/liu-xiao-gu…