In this article, we walk through a simple Logstash exercise and try out geo search. Our data set is the list of all zip codes in China.

The installation

If you haven’t already installed your own Elastic Stack:

  • See the article “Elastic: Beginner’s Guide” to install Elasticsearch and Kibana.
  • See the article “How to Install Logstash in the Elastic Stack” to install Logstash.

The installation above uses the default settings: Elasticsearch runs on localhost:9200 and Kibana runs on localhost:5601.

Download the data

We download our data through the following command:

git clone https://github.com/liu-xiao-guo/elasticzipcodes

The downloaded repository contains a file called zipcodes.csv.

The zipcodes.csv file contains the fields Id, Code, AreaCode, Name, ShortName, Longitude, Latitude, Sort, Memo, and Disabled. We can place the downloaded file in any directory we like.

Import data

To import this data into Elasticsearch, we use Logstash. To do this, we write our own Logstash configuration file:

logstash_zipcodes.conf

input {
	file {
		path => "/Users/liuxg/data/zipcodes/zipcodes.csv"
		start_position => "beginning"
		sincedb_path => "/dev/null"
	}
}


filter {
	csv {
		separator => ","
		columns => ["Id", "Code", "AreaCode", "Name", "ShortName", "Longitude", "Latitude", "Sort", "Memo", "Disabled"]
	}

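	mutate {
		# Convert the coordinates from strings to numbers.
		convert => {
			"Longitude" => "float"
			"Latitude" => "float"
		}
		# Rename the Code column to zipcode.
		rename => { "Code" => "zipcode" }
		# Build a "lat,lon" string that Elasticsearch can map as a geo_point.
		add_field => { "location" => "%{Latitude},%{Longitude}" }
	}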
}


output {
	stdout {codec => rubydebug}

	elasticsearch {
		index => "zipcodes"
		hosts => ["http://localhost:9200"]
	}
}

A Logstash pipeline consists of three parts: inputs, filters, and outputs.

In the input section above, we use the file input to read zipcodes.csv from the path we specify. Adjust the path to match the actual location of the file on your machine.

In the filter section, we use the csv filter to parse each line into the corresponding fields. We also use the mutate filter to convert Longitude and Latitude to floats and to rename Code to zipcode. With add_field we create a new field called location, which combines the latitude and longitude in “lat,lon” order.

In the output section, we write each event to standard output via stdout, which is often useful for debugging a Logstash pipeline. At the same time, we send the events to our local Elasticsearch, where they are stored in an index called zipcodes.
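
To make the filter stage concrete, here is a minimal Python sketch of what the csv and mutate filters do to a single line. It only illustrates the transformation and is not how Logstash is implemented. The helper name transform_line and the sample line are assumptions made for this example, and empty columns are mapped to None to mirror the null Memo value seen in the indexed documents.

```python
import csv
import io

# Column names taken from the csv filter in logstash_zipcodes.conf.
COLUMNS = ["Id", "Code", "AreaCode", "Name", "ShortName",
           "Longitude", "Latitude", "Sort", "Memo", "Disabled"]


def transform_line(line):
    """Mimic the csv + mutate filters for one CSV line (hypothetical helper)."""
    values = next(csv.reader(io.StringIO(line)))
    # csv filter: map each value to its column name; empty columns become None.
    event = {name: (value if value != "" else None)
             for name, value in zip(COLUMNS, values)}
    # mutate rename: Code -> zipcode
    event["zipcode"] = event.pop("Code")
    # mutate convert: coordinates from strings to floats
    event["Longitude"] = float(event["Longitude"])
    event["Latitude"] = float(event["Latitude"])
    # add_field: "lat,lon" string that a geo_point mapping accepts
    event["location"] = f"{event['Latitude']},{event['Longitude']}"
    return event


# Sample line, assumed for illustration only.
sample = ("35,110105003,110105,HuJiaLou street,HuJiaLou street,"
          "116.464325,39.920929,16,,false")
event = transform_line(sample)
print(event["zipcode"], event["location"])
```

Note the order inside location: latitude comes first, then longitude, which is the order Elasticsearch expects for a geo_point given as a string.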

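To make the zipcodes index fit our needs, we run the following command in the Kibana Dev Tools console. Index templates are applied only when an index is created, so we do this before starting Logstash: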

PUT _template/zipcodes
{
  "order": 10,
  "index_patterns": [
    "zipcodes*"
  ],
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 1
  },
  "mappings": {
    "properties": {
      "zipcode": {
        "type": "text"
      },
      "location": {
        "type": "geo_point"
      }
    }
  },
  "aliases": {}
}

The index template above specifies that any index whose name matches the pattern zipcodes* gets the settings, mappings, and aliases defined in it. In particular, we define location as a geo_point data type.
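
As a rough illustration of how such a pattern selects indices, the sketch below uses Python's fnmatch as a stand-in for Elasticsearch's own wildcard matching. This is an approximation: fnmatch also gives special meaning to ? and [ ], which index patterns do not. The index names other than zipcodes are hypothetical.

```python
from fnmatch import fnmatchcase

# Pattern taken from the template's index_patterns.
INDEX_PATTERN = "zipcodes*"


def template_applies(index_name, pattern=INDEX_PATTERN):
    """Return True if the index name matches the wildcard pattern."""
    return fnmatchcase(index_name, pattern)


# "zipcodes-2020" and "postcodes" are made-up names for illustration.
for name in ["zipcodes", "zipcodes-2020", "postcodes"]:
    print(name, template_applies(name))
```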

We can start Logstash with the following command:

sudo ./bin/logstash -f ~/data/zipcodes/logstash_zipcodes.conf

After Logstash starts, it prints each event to the terminal. The output shows the fields that Logstash produces for every line of the CSV file.

We can then open Kibana and run the following command:

GET zipcodes/_count

We can see the following results:

{
  "count" : 42358,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  }
}

The response shows that the index contains 42358 documents.

Search data – Geo Search

Now that the geo data is imported, we can query it with the following command:

GET zipcodes/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match_all": {}
        }
      ],
      "filter": {
        "geo_distance": {
          "distance": "1km",
          "location": {
            "lat": 39.920086,
            "lon": 116.454182
          }
        }
      }
    }
  }
}

Above, we query all documents within a one-kilometer radius of the center at latitude 39.920086 and longitude 116.454182. The results are as follows:

  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "zipcodes",
        "_type" : "_doc",
        "_id" : "rvG12HABqA-NWEvj7Dj0",
        "_score" : 1.0,
        "_source" : {
          "Memo" : null,
          "Latitude" : 39.920929,
          "Name" : "HuJiaLou street",
          "message" : "35,110105003,110105,HuJiaLou street,HuJiaLou street,116.464325,39.920929,16,,false",
          "Sort" : "16",
          "host" : "liuxg",
          "AreaCode" : "110105",
          "location" : "39.920929,116.464325",
          "path" : "/Users/liuxg/data/zipcodes/zipcodes.csv",
          "ShortName" : "HuJiaLou street",
          "Id" : "35",
          "Longitude" : 116.464325,
          "@version" : "1",
          "@timestamp" : "2020-03-14T11:02:43.681Z",
          "zipcode" : "110105003",
          "Disabled" : "false"
        }
      }
    ]
  }

Of course, we can increase the distance above to get more results.
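
To see why the HuJiaLou document falls inside the radius, we can estimate the distance ourselves. The sketch below uses the haversine formula with a mean Earth radius of 6371 km. Both are assumptions made for illustration, and Elasticsearch's own distance calculation may differ slightly. The function name haversine_km is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


center = (39.920086, 116.454182)  # center used in the geo_distance query
hit = (39.920929, 116.464325)     # location of the returned document
distance = haversine_km(*center, *hit)
print(f"{distance:.2f} km", distance <= 1.0)
```

The computed distance comes out below 1 km, which is consistent with the document being returned by the query.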

Similarly, we can search within a rectangle that we define:

GET zipcodes/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match_all": {}
        }
      ],
      "filter": {
        "geo_bounding_box": {
          "location": {
            "top_left": {
              "lat": 39.94086,
              "lon": 116.454182
            },
            "bottom_right": {
              "lat": 39.930086,
              "lon": 116.464182
            }
          }
        }
      }
    }
  }
}

Above, we define two coordinates, top_left and bottom_right, which are the opposite corners of a rectangle. The search returns all documents located within this area:

  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "zipcodes",
        "_type" : "_doc",
        "_id" : "MfG12HABqA-NWEvj7Diy",
        "_score" : 1.0,
        "_source" : {
          "Memo" : null,
          "Latitude" : 39.931461,
          "Name" : "prices street",
          "message" : "45,110105013,110105,prices street,prices street,116.462578,39.931461,35,,false",
          "Sort" : "35",
          "host" : "liuxg",
          "AreaCode" : "110105",
          "location" : "39.931461,116.462578",
          "path" : "/Users/liuxg/data/zipcodes/zipcodes.csv",
          "ShortName" : "prices street",
          "Id" : "45",
          "Longitude" : 116.462578,
          "@version" : "1",
          "@timestamp" : "2020-03-14T11:02:43.684Z",
          "zipcode" : "110105013",
          "Disabled" : "false"
        }
      }
    ]
  }

The result above contains all the documents located in the area we defined.
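
The rectangle test itself is simple to reproduce. The following Python sketch checks whether a point lies inside the box from the query. It is a simplified illustration that assumes the box does not cross the 180° meridian, and the function name in_bounding_box is hypothetical.

```python
def in_bounding_box(lat, lon, top_left, bottom_right):
    """Check whether (lat, lon) lies inside the rectangle.

    Simplified: assumes the box does not cross the 180-degree meridian.
    """
    return (bottom_right["lat"] <= lat <= top_left["lat"]
            and top_left["lon"] <= lon <= bottom_right["lon"])


# Corners taken from the geo_bounding_box query.
top_left = {"lat": 39.94086, "lon": 116.454182}
bottom_right = {"lat": 39.930086, "lon": 116.464182}

# The document returned by the bounding box query.
print(in_bounding_box(39.931461, 116.462578, top_left, bottom_right))
# The document from the earlier geo_distance query lies south of the box.
print(in_bounding_box(39.920929, 116.464325, top_left, bottom_right))
```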

Display data

So far we have searched our data with queries. We can also use the powerful graphical tools provided by Kibana to display all our documents. To do this, we first create an index pattern called zipcodes*:

Click “Create index pattern”:

Click the “Next step” button:

Since this is not time-series data, we select “I don’t want to use the Time Filter”. Click “Create index pattern”:

This creates our zipcodes* index pattern.

Next, we create a visualization for our zipcodes data:

Click the “Create Visualization” button:

Let’s select “Maps”:

Click the “Add Layer” button above:

Click on “Documents” above:

Click on the “Add Layer” above:

We add zipcode and ShortName as tooltip fields:

Click the “Save & Close” button above:

We can see that all the documents are shown at their locations on the map of China. Let’s zoom in on the map to see something like this:

When we hover the mouse over a document, it displays the name and zipcode of that location.

We can also use the tools on the map to define a desired area for search:

We define an area by using the mouse:

Select Draw Shape:

We use the mouse to outline the area we want to search, which creates a filter that shows only the documents in that area:

Select filter: Location in Shape above:

Select Edit Filter above and we can see:

Above we can see the points used by the geo_polygon filter. We can run the same search with the DSL:

GET zipcodes/_search
{
  "query": {
    "bool": {
      "must": [
        {
          "match_all": {}
        }
      ],
      "filter": {
        "geo_polygon": {
          "location": {
            "points": [
              {
                "lat": 40.05915,
                "lon": 115.95216
              },
              {
                "lat": 39.86638,
                "lon": 116.22583
              },
              {
                "lat": 40.08309,
                "lon": 116.31184
              },
              {
                "lat": 40.05915,
                "lon": 115.95216
              }
            ]
          }
        }
      }
    }
  }
}

The query above returns 13 documents:

  "hits" : {
    "total" : {
      "value" : 13,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "zipcodes",
        "_type" : "_doc",
        "_id" : "svG12HABqA-NWEvj7Dj0",
        "_score" : 1.0,
        "_source" : {
          "Memo" : null,
          "Latitude" : 39.916122,
          "Name" : "bajiao street",
          "message" : "99,110107003,110107,bajiao street,bajiao street,116.195648,39.916122,2,,false",
          "Sort" : "2",
          "host" : "liuxg",
          "AreaCode" : "110107",
          "location" : "39.916122,116.195648",
          "path" : "/Users/liuxg/data/zipcodes/zipcodes.csv",
          "ShortName" : "bajiao street",
          "Id" : "99",
          "Longitude" : 116.195648,
          "@version" : "1",
          "@timestamp" : "2020-03-14T11:02:43.694Z",
          "zipcode" : "110107003",
          "Disabled" : "false"
        }
      },
      {
        "_index" : "zipcodes",
        "_type" : "_doc",
        "_id" : "MfG12HABqA-NWEvj7Tk5",
        "_score" : 1.0,
        "_source" : {
          "Memo" : null,
          "Latitude" : 40.003696,
          "Name" : "JunZhuang Town",
          "message" : "143,110109104,110109,JunZhuang Town,JunZhuang Town,116.103455,40.003696,5,,false",
          "Sort" : "5",
          "host" : "liuxg",
          "AreaCode" : "110109",
          "location" : "40.003696,116.103455",
          "path" : "/Users/liuxg/data/zipcodes/zipcodes.csv",
          "ShortName" : "JunZhuang Town",
          "Id" : "143",
          "Longitude" : 116.103455,
          "@version" : "1",
          "@timestamp" : "2020-03-14T11:02:43.704Z",
          "zipcode" : "110109104",
          "Disabled" : "false"
        }
      },
      {
        "_index" : "zipcodes",
        "_type" : "_doc",
        "_id" : "KfG12HABqA-NWEvj7Tqx",
        "_score" : 1.0,
        "_source" : {
          "Memo" : null,
          "Latitude" : 39.898903,
          "Name" : "babaoshan street",
          "message" : "97,110107001,110107,babaoshan street,babaoshan street,116.231972,39.898903,1,,false",
          "Sort" : "1",
          "host" : "liuxg",
          "AreaCode" : "110107",
          "location" : "39.898903,116.231972",
          "path" : "/Users/liuxg/data/zipcodes/zipcodes.csv",
          "ShortName" : "babaoshan street",
          "Id" : "97",
          "Longitude" : 116.231972,
          "@version" : "1",
          "@timestamp" : "2020-03-14T11:02:43.694Z",
          "zipcode" : "110107001",
          "Disabled" : "false"
        }
      },
      ...

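To get a feel for what the polygon filter checks, here is a minimal Python sketch of a ray-casting point-in-polygon test. It treats latitude and longitude as flat coordinates, which is an assumption that is reasonable for a small area like this one, and it is not necessarily how Elasticsearch implements the filter. The function name point_in_polygon is hypothetical.

```python
def point_in_polygon(lat, lon, polygon):
    """Ray-casting test on a list of (lat, lon) vertices.

    Treats coordinates as planar, which is only an approximation.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[(i + 1) % n]
        # Does this edge cross the horizontal line through the point?
        if (lat1 > lat) != (lat2 > lat):
            crossing_lon = lon1 + (lat - lat1) * (lon2 - lon1) / (lat2 - lat1)
            # Count crossings that lie to the east of the point.
            if lon < crossing_lon:
                inside = not inside
    return inside


# Points taken from the geo_polygon filter; the last repeats the first.
polygon = [
    (40.05915, 115.95216),
    (39.86638, 116.22583),
    (40.08309, 116.31184),
    (40.05915, 115.95216),
]

# Locations of three of the returned documents.
hits = [
    (39.916122, 116.195648),
    (40.003696, 116.103455),
    (39.898903, 116.231972),
]
print([point_in_polygon(lat, lon, polygon) for lat, lon in hits])
```

All three locations fall inside the triangle, which matches their presence in the search results.
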
We can save the current visualization for use in a dashboard:

Select Save. Finally, we build our own dashboard: