How to design and realize the “nearby people” of wechat, “nearby restaurants” of Meituan and “nearby cars” of Alipay shared bikes?

Use a database implementation to find people nearby

We all know that any location on the earth can be represented in terms of latitude and longitude in two dimensions, longitude ranges

[- 180, 180]

Latitude range

[- 90, 90]

Latitude plus and minus is bounded by the equator north plus and south minus longitude plus and minus by prime meridian

(Greenwich Observatory, UK)

Is the boundary, plus east and minus west. For example, the latitude and longitude coordinates of the Monument to the People’s Heroes in Beijing are

(39.904610, 116.397724)

Both are positive because China is in the northeastern hemisphere.

Therefore, when we use the database to store the longitude and latitude information of all people, we can divide the range of a rectangle based on the current coordinate node to know the people nearby, as shown in the following figure:

Therefore, it is easy to write the following pseudo-SQL statements:

SELECT id FROM positions WHERE x0 - r < x < x0 + r AND y0 - r < y < y0 + rCopy the code

If we want to further know the distance from each coordinate element and order it, we need to do some calculation.

When two coordinate elements are not very far apart, we can simply use the Pythagorean theorem to find the distance between them. However, it should be noted that the earth is not a standard sphere, the density of latitude and longitude is not the same, so when we use the Pythagorean theorem to calculate the square and then sum, we need to be weighted according to a certain coefficient. Of course, if you can’t be exact, you don’t need to be weighted.

Refer to the below

Resources 2

We can almost write an optimized SQL statement like this:

(For information only)

SELECT	* FROM	users_location WHERE	latitude > '.$lat.' - 1 	AND latitude < '.$lat.' + 1 AND longitude > '.$lon.' - 1 	AND longitude < '.$lon.' + 1 ORDER BY	ACOS(		SIN( ( '.$lat.'* latitude * 3.1415/180) * SIN(latitude * 3.1415/180) + COS((latitude * 3.1415/180))'.$lat.'* 3.1415) / 180) * COS((latitude * 3.1415) / 180) * COS((latitude * 3.1415) / 180)'.$lon.'* 3.1415) / 180 - (longitude * 3.1415) / 180)) * 6380 ASC LIMIT 10';Copy the code

In order to meet the high performance rectangular region algorithm, the data table also needs to add the latitude and longitude coordinates plus the bi-directional composite index (x, y), which can meet the maximum optimal query performance.

2. Brief description of the GeoHash algorithm

This is the industry’s more common, used for geographical location distance sorting algorithm, Redis also used such an algorithm. The GeoHash algorithm maps the latitude and longitude data from two dimensions to integers from one dimension, so that all elements are mounted on a line, and points that are close to each other in two dimensions are mapped to one dimension. When we want to calculate “nearby people”, we first map the target position to this line, and then obtain nearby points on this one-dimensional line.

Its core idea is to treat the whole earth as a two-dimensional plane, and divide the plane into smaller and smaller squares continuously. Each coordinate element is located in a unique square. The smaller the squares are, the more accurate the coordinates will be, similar to the following figure:

Divided earth, we need to code it:

After coding in this sequence, if you watch carefully for a while, you’ll notice some patterns:

  • Of all the codes that are horizontal,
    The second place is the same as the fourth placeFor example, first row first
    0101And the second
    0111Their second and fourth positions are both
    1;
  • Of all the codes up there,
    Bits 1 and 3 are increasingFor example, first row first
    0101If you take the first and third places out separately, that is
    00Same thing with the first row and the second row
    0111In the same way, the first and third positions are picked out
    01, just
    00Incrementing by one;

By this rule, we encode each small square in a certain order, and the benefits of doing so are obvious: The coordinates of each element can be uniquely identified on this coded map without exposing specific locations, because the area is shared. I can tell you that I’m near the park, but you don’t know exactly where.

Anyway, with the idea above, we can turn any coordinate into a string of binary codes, like 11010010110001000100

(Note that longitude and dimension are interchangeable.)

From this integer we can restore the element’s coordinates. The longer the integer, the smaller the loss of the restored coordinates. for
“People in the neighborhood.”For this function, the loss of a little longitude is negligible.

And finally, a Base32

(0~9, a~z, drop a/ I/L/O)

Encoding operations to make it a string, such as the one above
wx4g0ec1.

In Redis, latitude and longitude are encoded as 52-bit integers in zset, where value is the element’s key and score is the 52-bit integer value of the GeoHash. Zset score is a floating-point number, but it can be stored losslessly for 52-bit integer values.

Use Geo in Redis

Quote below

Reference 1 – Redis Deep Adventure

When using Redis for Geo queries, keep in mind that its internal structure is really just a Zset (skiplist). Other elements near the coordinates can be obtained by sorting zset score

(It’s a little more complicated, but that’s enough.)

By moving the
scoreRestore the coordinates to get the original coordinates of the element.

Redis provides only six Geo commands, which are easy to master.

increase

The geoAdd directive carries the collection name as well as multiple triples of latitude and longitude names. Note that multiple triples can be added here.

127.0.0.1:6379> Geoadd Company 116.48105 39.996794integer) 1127.0.1:6379 > Geoadd Company 116.514203 39.905409 iReaderinteger) 1127.0.0.1:6379> Geoadd Company 116.489033 40.007669integer) 1127.0.1:6379 > Geoadd Company 116.562108 39.787602 JD 116.334255 40.027400 xiaomi(integer2)Copy the code

But it’s weird… Redis does not directly provide instructions for deleting Geo, but we can operate Geo data through zset-related instructions, so element deletion can be done by using ZREM instructions.

distance

The Geodist directive can be used to calculate the distance between two elements, carrying the collection name, two names, and units of distance.

127.0.0.1:6379> geodist company juejin ireader km"10.5501"127.0.0.1:6379> Geodist Company Juejin Meituan KM"1.3878"127.0.0.1:6379> geodist company juejin jd km"24.2739"127.0.0.1:6379> Geodist Company Juejin Xiaomi KM"12.9606"127.0.0.1:6379> geodist company juejin juejin km"0.0000"Copy the code

We can see the Nuggets are closest to the US team because they are both in Wangjing. The units of distance can be m, km, ML, and ft, representing meters, kilometers, miles, and feet, respectively.

Get element position

The GeopOS directive can obtain the latitude and longitude coordinates of any element in a collection, more than one at a time.

Geopos company juejin1) 1)"116.48104995489120483" 2) "39.99679348858259686"127.0.0.1:6379> geopos company ireader1) 1) "116.5142020583152771" 2) "39.90540918662494363"Geopos company juejin iReader1) 1)"116.48104995489120483" 2) "39.99679348858259686"2) 1) "116.5142020583152771" 2) "39.90540918662494363"Copy the code

We observed a slight error between the obtained latitude and longitude coordinates and the coordinates geoadded. The reason is that the one-dimensional mapping of the two-dimensional coordinates by Geohash is lossy, and the values restored by the mapping will show a small difference. For a feature like “People nearby,” this error is not a big deal.

Gets the hash value of the element

The GeoHash gets the element’s latitude and longitude encoded string, which, as mentioned above, is base32 encoded. You can use this code value to locate the Geohash directly at http://geohash.org/${hash}, which is the standard code value for Geohash.

127.0.0.1:6379 > geohash company ireader1)"wx4g52e1ce0"127.0.0.1:6379 > geohash company juejin1)"wx4gd94yjn0"Copy the code

Let’s open the address http://geohash.org/wx4g52e1ce0, observe whether the map the location of the point to is correct:

Good. That’s the spot. It’s spot on.

Nearby companies

The georadiusbyMember directive is the most critical directive. It can be used to query other elements near the specified element, and its parameters are very complex.

# Up to 3 elements within 20km are aligned according to distance. It does not exclude itself 127.0.0.1:6379> Georadiusbymember Company iReader 20 km count 3 asc1) "ireader"2) "juejin"3) "meituan"# range 20 127.0.0.1:6379> Georadiusbymember company ireader 20 km count 3 desc1) "JD "2) "meituan"3) "juejin"# Three optional arguments withcoord withdist withhash are used to carry the additional argument # withdist which is useful, It can be used to display distance 127.0.0.1:6379> georadiusbymember company iReader 20 km withcoord withdist withHash count 3 asc1) 1) "iReader" 2) "0.0000" 3) (INTEGER) 4069886008361398 4) 1) "116.5142020583152771" 2) "39.90540918662494363"2) 1) "juejin" 2) "10.5501" 3) (INTEGER) 4069887154388167 4) 1) "116.48104995489120483" 2) "39.99679348858259686"3) 1) "meituan" 2) "11.5748" 3) (INTEGER) 4069887179083478 4) 1) "116.48903220891952515" 2) "40.00766997707732031"Copy the code

In addition to georadiusBymember instruction to query nearby elements according to elements, Redis also provides to query nearby elements according to coordinate values. This instruction is more useful, it can calculate “nearby cars”, “nearby restaurants” and so on according to the user’s positioning. Its parameters are basically the same as GeoradiusByMember, except that the target element is changed to latitude and longitude coordinates:

127.0.0.1:6379> georadius company 116.514202 39.905409 20 km withdist count 3 asc1) 1) "ireader" 2) "0.0000"2) 1) "juejin" 2) "10.5501"3) 1) "meituan" 2) "11.5748"Copy the code

Matters needing attention

In a map application, the data of cars, restaurants, and people may be in millions and millions, and if Redis’s Geo data structure is used, they will all be in one Zset set. In the cluster environment of Redis, the set may be migrated from one node to another. If the data of a single key is too large, the migration of the cluster will be greatly affected. In the cluster environment, the data amount corresponding to a single key should not exceed 1M, otherwise the cluster migration will lag. The online services are affected.

Therefore, it is recommended that Geo data be deployed in a separate Redis instance instead of a clustered environment.

If the amount of data is over 100 million or even larger, it is necessary to split Geo data by country, by province, by city, and even by district in populous megacities. This can significantly reduce the size of a single Zset collection.

reading

  1. Redis (1) – 5 kinds of basic data structures – www.wmyskxz.com/2020/02/28/…
  2. Redis (2) – jump table – www.wmyskxz.com/2020/02/29/…
  3. Redis (3) – distributed lock delve into – www.wmyskxz.com/2020/03/01/…
  4. Reids (4) – magical HyperLoglog solving statistical problems – www.wmyskxz.com/2020/03/02/…
  5. Redis (5) – million level data filtering and bloom filter – www.wmyskxz.com/2020/03/11/…

The resources

  1. The depth of the Redis adventures – Qian Wenpin/a – book.douban.com/subject/303…
  2. Mysql query latitude and longitude and calculate 2 km range near the user’s SQL query performance tuning instance tutorial – www.cnblogs.com/mgbert/p/41…
  3. Geohash algorithm principle and implementation – www.jianshu.com/p/2fd0cf12e…
  4. GeoHash algorithm study interpretation, parsing and analysis – zhuanlan.zhihu.com/p/35940647 principle
  • This article has been included in my Github Programmer growth series
    【More Than Java】, learn More Than Code, welcome star:Github.com/wmyskxz/Mor…
  • Personal public account: wmyskxz.
    Personal independent domain name blog: wmyskxz.com, adhere to the original output, scan code below attention, 2020, and you grow together!

Thank you very much for reading this, if you think this article is well written, if you think there is something about “I don’t have three hearts”, please like, follow, share and leave a comment!

Creation is not easy, your support and recognition, is the biggest motivation for my creation, we will see you in the next article!

This article is formatted using MDNICE