Preface

This is, unsurprisingly, the final installment in the Google S2 series. This article explains the RegionCoverer algorithm. Google S2 also contains many other small algorithms whose code is worth reading and learning from; I won't expand on them here, but interested readers can read through the entire library.

While writing this article, I noticed some new commits by the library's author, for example commit f9610db2b871b54b17d36d4da6a4d6a2aab6018d on December 4, 2017, which changes the README. Although it only touches documentation, it actually says quite a lot.


-For an analogous library in C++, see
-https://code.google.com/archive/p/s2-geometry-library/, and in Java, see
-https://github.com/google/s2-geometry-library-java
+For an analogous library in C++, see https://github.com/google/s2geometry, in
+Java, see https://github.com/google/s2-geometry-library-java, and Python, see
+https://github.com/google/s2geometry/tree/master/src/python


As you can see, they have taken the code out of Google's internal repository and put it on GitHub. The C++ code, which was previously only available in the code archive, can now be viewed directly on GitHub, which makes things a lot easier.


+More details about S2 in general are available on the S2 Geometry Website
+[s2geometry.io](https://s2geometry.io/).


This commit also mentions a new website, which as far as I can tell was launched only recently. Because I follow the author, I have watched every commit to S2 for nearly half a year and keep an eye on every S2 resource on the web, and this site is very new. It will come up again at the end of the article, so I won't go into details here.

Judging from this commit, I think Google may be starting to take S2 seriously and getting ready to promote it.

All right, let’s get down to business.

1. Spatial types

There are several spatial types in Google S2 on which RegionCovering can be performed; basically, any type that implements the Region interface qualifies.

1. Cap (spherical cap)

Cap represents a disc-shaped region defined by a center and a radius. Technically this shape is called a "spherical cap" (rather than a disk) because it is not flat: a cap is the portion of a sphere cut off by a plane. The boundary of a cap is the circle defined by the intersection of the sphere and the plane. A cap is a closed set, that is, it contains its boundary. In most situations where a disk would be used in planar geometry, a spherical cap can be used instead. The radius of a cap is measured along the surface of the sphere (not as the straight-line chord distance through the interior). Thus a cap of radius π/2 is a hemisphere, and a cap of radius π covers the entire sphere. The center is a point on the unit sphere (hence the requirement that it be unit length). A cap can also be defined by its center point and its height, the distance from the center point to the cutting plane. There are also "empty" and "full" caps, which contain no points and all points, respectively. Below are some useful relationships between the cap height (h), the cap radius (r), the maximum chord length from the cap's center (d), and the radius of the cap's base (a).

h = 1 - cos(r)
	= 2 * sin^2(r/2)
d^2 = 2 * h
	= a^2 + h^2

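To make the definitions above concrete, here is a minimal Go sketch (it assumes the github.com/golang/geo/s2 and s1 packages plus fmt and math are imported, and the center coordinates are made up for illustration) that builds a cap and computes its height from the relation h = 1 - cos(r):

center := s2.PointFromLatLng(s2.LatLngFromDegrees(31.23, 121.47)) // a point on the unit sphere
r := s1.Angle(math.Pi / 2)                                        // a radius of π/2 is a hemisphere
c := s2.CapFromCenterAngle(center, r)

h := 1 - math.Cos(r.Radians())          // cap height h = 1 - cos(r)
fmt.Println(c.ContainsPoint(center), h) // the cap contains its own center; h is 1 here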

2. Loop

Loop represents a simple spherical polygon. It consists of a sequence of vertices where the first vertex is implicitly connected to the last. All loops are defined to have CCW orientation, that is, the interior of the loop is to the left of its edges. This means that a loop that bounds a small region clockwise is interpreted as a CCW loop bounding the very large complementary region. Loops may not have any duplicate vertices (whether adjacent or not). Non-adjacent edges are not allowed to intersect, and edges of length 180 degrees are not allowed (i.e., adjacent vertices cannot be antipodal). Loops must have at least 3 vertices (except for the "empty" and "full" loops discussed below). There are two special loops: the EmptyLoop contains no points, and the FullLoop contains all points. These loops do not have any edges, but to preserve the invariant that every loop can be represented as a chain of vertices, they are defined to have exactly one vertex each.
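A minimal Go sketch of constructing a Loop (vertex coordinates made up; note the CCW order, so the interior is the small square they enclose):

pts := []s2.Point{
	s2.PointFromLatLng(s2.LatLngFromDegrees(0, 0)),
	s2.PointFromLatLng(s2.LatLngFromDegrees(0, 1)),
	s2.PointFromLatLng(s2.LatLngFromDegrees(1, 1)),
	s2.PointFromLatLng(s2.LatLngFromDegrees(1, 0)),
}
loop := s2.LoopFromPoints(pts) // the last vertex is implicitly connected back to the first

inside := s2.PointFromLatLng(s2.LatLngFromDegrees(0.5, 0.5))
fmt.Println(loop.ContainsPoint(inside)) // true: the interior lies to the left of the edges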

3. Polygon

A Polygon represents a sequence of zero or more loops; again, the region to the left of a loop is defined as its interior. When a polygon is initialized, the given loops are automatically converted into a canonical form consisting of "shells" and "holes", and the loops are reordered to correspond to a traversal of the nesting hierarchy. A polygon can represent any region of the sphere with a polygonal boundary, including the entire sphere (known as the "full" polygon). The full polygon consists of a single full loop, whereas an empty polygon has no loops at all. Use FullPolygon() to construct a full polygon; the zero value of Polygon is treated as the empty polygon.

To form a Polygon with multiple loops, the following four conditions must be met:

  1. Loops may not cross, i.e. the boundary of a loop may not intersect both the interior and the exterior of any other loop.
  2. Loops may not share edges, i.e. if a loop contains an edge AB, then no other loop may contain AB or BA.
  3. Loops may share vertices, but no vertex may appear twice in a single loop (see S2Loop).
  4. No loop may be empty. The full loop may only appear in the full polygon.

4. Rect (rectangle)

Rect represents a closed latitude and longitude rectangle. It is also of Region type. It can represent empty and complete rectangles as well as individual points. It has an AddPoint method that makes it easy to construct a bounding rectangle for a set of points, including a set of points spanning a 180 degree meridian.
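For example (a small sketch with made-up points), a bounding rectangle for points straddling the 180 degree meridian can be accumulated with AddPoint:

rect := s2.EmptyRect()
for _, ll := range []s2.LatLng{
	s2.LatLngFromDegrees(10, 175),
	s2.LatLngFromDegrees(12, -178), // on the other side of the 180 degree meridian
	s2.LatLngFromDegrees(11, 179),
} {
	rect = rect.AddPoint(ll) // AddPoint returns the expanded rectangle
}
fmt.Println(rect)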

5. Region

A region represents a two-dimensional region on a unit sphere. The purpose of this interface is to approximate a complex region into a simpler region. This interface is limited to methods that compute approximations.

The S2 Region type represents a two-dimensional region on the unit sphere. It is an abstract interface with various concrete subtypes, such as caps, rectangles, polygons, geometry collections, buffered shapes, and so on. The main purpose of the interface is to allow complex regions to be approximated by simpler ones, so it is restricted to methods that are useful for computing approximations, rather than being a full set of virtual methods that every subtype would have to implement.
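For reference, the Region interface in the Go port looks roughly like the sketch below (paraphrased; check the library source of the version you use for the authoritative definition):

type Region interface {
	CapBound() Cap              // a bounding spherical cap
	RectBound() Rect            // a bounding latitude-longitude rectangle
	ContainsCell(c Cell) bool   // whether the region completely contains the cell
	IntersectsCell(c Cell) bool // whether the region intersects the cell
	ContainsPoint(p Point) bool // whether the region contains the point
	CellUnionBound() []CellID   // a small, coarse covering of the region
}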

6. Shape

Shape is the "base type" for any kind of shape. It is the most flexible way to represent polygonal geometry. A Shape is a collection of edges that optionally defines an interior. All geometry represented by a given Shape must have the same dimension, which means a Shape can represent a set of points, a set of polylines, or a set of polygons. Shape is defined as an interface so that clients can control the underlying data representation; sometimes a Shape has no data of its own and simply wraps another type. Shape operations are usually defined on a ShapeIndex rather than on individual shapes. A ShapeIndex is simply a collection of Shapes, possibly of different dimensions (for example, 10 points and 3 polygons), organized into a data structure for efficient access. The edges of a Shape are indexed by a contiguous range of edge ids starting at 0. Edges are further subdivided into chains, where each chain consists of a sequence of edges connected end-to-end (a polyline). For example, a shape representing the two polylines AB and CDE has three edges (AB, CD, DE) grouped into two chains, (AB) and (CD, DE). Similarly, a shape representing five points has five chains, each consisting of a single edge. Shape provides methods that let you access edges either by their global edge id or within a particular chain. The global numbering is sufficient for most purposes, but the chain representation is useful for certain algorithms, such as intersection (see BooleanOperation).

A total of two extensible interfaces for representing geometry are defined in S2: S2Shape and S2Region.

The difference between them is that S2Shape’s purpose is to flexibly represent polygon geometry. (This includes not only polygons, but also points and polylines). Most of the core S2 operations will work with any class that implements the S2Shape interface.

The purpose of S2Region is to compute geometric approximations. For example, there are methods for computing bounding rectangles and bounding discs, and S2RegionCoverer can be used to approximate a region as a collection of cells to any desired precision. Unlike S2Shape, S2Region can represent non-polygonal geometric shapes, such as spherical caps (S2Cap).

In addition to the common types mentioned above, there are several intermediate or lower-level types available for developers to use.

  • S2LatLngRect – a rectangle in latitude and longitude coordinates.
  • S2Polyline – polyline.
  • S2CellUnion – a region approximated as a set of S2CellIds; the output of RegionCoverer is of this type.
  • S2ShapeIndexRegion – an arbitrary collection of points, polylines, and polygons.
  • S2ShapeIndexBufferedRegion – an S2ShapeIndexRegion expanded by a given radius.
  • S2RegionUnion – the union of arbitrary other regions.
  • S2RegionIntersection – the intersection of arbitrary other regions.

Finally, S2RegionTermIndexer is a type that supports indexing and querying any kind of S2Region, that is, all of the types mentioned above. For example, you can use an S2RegionTermIndexer to index a set of polylines and then query which polylines intersect a given polygon.

2. RegionCoverer examples

RegionCoverer is about finding an approximately optimal set of cells (why only approximately will become clear below) that covers a given region.

There are three constraints on the conversion: MaxLevel, MaxCells, and MinLevel. MaxCells sets the maximum number of cells used; if it is set too low relative to the region, the covering ends up too coarse and inaccurate. In other words, MaxCells is only an upper bound on the work done and on the accuracy achievable, which is why the result that satisfies MaxCells is not necessarily the optimal covering.

A few examples:

The following example is a cap with a radius of 10 km, located at the corner where three faces meet. We allow at most 10 cells to cover it. The result is as follows:

With the same Settings, we change the number to 20 as follows:

Use the same configuration and change the number to 50:

So far, the accuracy is only so-so, and the edge section covers more than the original cap. We continue to improve the accuracy and adjust the number to 200.

200 seems like a good number, but let’s raise it a little bit, let’s raise it to 1,000, and see what happens.

Although the configuration asks for 1000 cells, only 706 are produced. The reason is that even though the algorithm works toward the 1000-cell budget, it also merges cells after pruning, so we end up with fewer than 1000.

For another example, the shape below is a latitude/longitude rectangle extending from 60 to 80 degrees north latitude and from -170 to +170 degrees longitude. The covering is limited to 8 cells. Note that the hole in the middle is completely covered over, which is clearly not what we intended.

Let’s increase the number of cells to 20. The hole in the middle was still filled in.

We set the parameter to 100, with everything else exactly the same. Now the hole in the middle has taken on a definite shape, but the gap near the date line still has not appeared.

Finally adjust the parameters to 500. Now the hole in the middle is more complete.

Let me give a couple of examples from our actual projects. Below is the boundary of a grid cell in Shanghai. First we use


defaultCoverer := &s2.RegionCoverer{MaxLevel: 16, MaxCells: 100, MinLevel: 13}


to do the conversion; the result is as follows:


defaultCoverer := &s2.RegionCoverer{MaxLevel: 30, MaxCells: 1000, MinLevel: 1}


Increase the accuracy to 1000 and the result will look like this:

There are also some larger regions, for example an entire province, Hubei:

Or a lake, Taihu:

Finally, take a polygon example. We know that polygon is made up of multiple loops:


	loops := []*s2.Loop{}
	loops = append(loops, loop1)
	loops = append(loops, loop2)

	polygon := s2.PolygonFromLoops(loops)

	defaultCoverer := &s2.RegionCoverer{MaxLevel: 16, MaxCells: 100, MinLevel: 8}
	fmt.Println("---- polygon-----")

	cvr := defaultCoverer.Covering(polygon) // loop1 and loop2 are assumed to have been built earlier

	for i := 0; i < len(cvr); i++ {
		fmt.Printf("%d,\n", cvr[i])
	}
	fmt.Printf("------------\n")



Here’s what the two loops look like on the map.

Finally, there is polygon, which contains both loops.

3. RegionCoverer core algorithm: the Covering implementation

This chapter gives a detailed analysis of how Covering is implemented.

The most common uses are the following lines:


	rc := &s2.RegionCoverer{MaxLevel: 30, MaxCells: 5}
	r := s2.Region(s2.CapFromCenterArea(center, area))
	covering := rc.Covering(r)



The example above produces a CellUnion containing at most 5 CellIDs. The region being covered in these three lines is a Cap.


type RegionCoverer struct {
	MinLevel int // the minimum cell level to be used.
	MaxLevel int // the maximum cell level to be used.
	LevelMod int // the LevelMod to be used.
	MaxCells int // the maximum desired number of cells in the approximation.
}



RegionCoverer is a struct with four fields: MinLevel, MaxLevel, LevelMod, and MaxCells. The one that needs explaining is LevelMod.

If LevelMod is set, only cells whose (level - MinLevel) is a multiple of LevelMod can be selected. This effectively increases the branching factor of the S2 CellID hierarchy: the useful values are 1, 2, and 3, corresponding to branching factors of 4, 16, and 64 (a value of 0 is simply clamped to 1, as the newCoverer code below shows).
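For example (a sketch with values chosen purely for illustration), with the configuration below only cells at levels 10, 12, 14, and 16 can appear in the result, since (level - MinLevel) must be a multiple of LevelMod:

rc := &s2.RegionCoverer{MinLevel: 10, MaxLevel: 16, LevelMod: 2, MaxCells: 20}
covering := rc.Covering(someRegion) // someRegion stands for any s2.Region you already have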

Let’s talk about the core of the algorithm.

RegionCover can be abstractly described as: given a region, cover it with as many cells as possible, but not more than MaxCells. How to find these cells?

This is a problem of finding an approximately optimal solution. If you want more precision, the answer is of course to cover all the edges with cells at MaxLevel (the larger the level, the smaller the cell), which is the most accurate. However, that would cause the number of cells to soar far beyond MaxCells, which again fails the requirements. So how do you get the most accurate coverage of a given region with at most MaxCells cells?

A few things to note in advance are:

    1. MinLevel has a higher priority than MaxCells: cells below the given minimum level are never used, even if a single coarser cell could replace many finer ones.
    2. For essentially any value of MaxCells, up to 6 cells may be returned if that is the minimum number required (for example, if the region intersects all six face cells). Even very small convex regions may return up to 3 cells if they happen to sit at the corner where three cube faces meet (see the sketch after this list).
    3. If MinLevel is too high for the region being approximated, MaxCells loses its constraining effect and an arbitrary number of cells may be returned.
    4. If MaxCells is less than 4, then even a convex region such as a Cap or Rect will end up being covered by a larger area than the original region. Developers should be aware of this.
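As a quick sketch of the six-cell minimum mentioned in point 2: covering the full sphere necessarily uses all six face cells, even though MaxCells is smaller than 6:

rc := &s2.RegionCoverer{MaxLevel: 30, MaxCells: 4}
covering := rc.Covering(s2.FullCap()) // the full sphere intersects all six face cells
fmt.Println(len(covering))            // expected: 6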

Okay, so let’s start with the source code. This is the core function of the RegionCoverer transformation.


func (rc *RegionCoverer) Covering(region Region) CellUnion {
	covering := rc.CellUnion(region)
	covering.Denormalize(maxInt(0, minInt(maxLevel, rc.MinLevel)), maxInt(1, minInt(3, rc.LevelMod)))
	return covering
}


From this function we can see that the conversion actually consists of two steps: first computing a normalized CellUnion, and then denormalizing it.

(1) CellUnion

A concrete implementation of the CellUnion method:


func (rc *RegionCoverer) CellUnion(region Region) CellUnion {
	c := rc.newCoverer()
	c.coveringInternal(region)
	cu := c.result
	cu.Normalize()
	return cu
}



This method can also be broken down into three main parts: newCoverer, coveringInternal, and Normalize.

1. newCoverer()


func (rc *RegionCoverer) newCoverer() *coverer {
	return &coverer{
		minLevel: maxInt(0, minInt(maxLevel, rc.MinLevel)),
		maxLevel: maxInt(0, minInt(maxLevel, rc.MaxLevel)),
		levelMod: maxInt(1, minInt(3, rc.LevelMod)),
		maxCells: rc.MaxCells,
	}
}



The newCoverer() method initializes a coverer struct. maxLevel is a previously defined constant, maxLevel = 30. The coverer's initialization parameters all come from the RegionCoverer: the four main parameters we set externally, MinLevel, MaxLevel, LevelMod, and MaxCells, are passed in here. The maxInt and minInt helpers used in the initialization mainly clamp invalid values.

The Coverer structure actually contains eight elements.


type coverer struct {
	minLevel         int // the minimum cell level to be used.
	maxLevel         int // the maximum cell level to be used.
	levelMod         int // the LevelMod to be used.
	maxCells         int // the maximum desired number of cells in the approximation.
	region           Region
	result           CellUnion
	pq               priorityQueue
	interiorCovering bool
}



In addition to the four fields initialized above, it contains four more important fields that are used below: region is the Region to be covered, result is the output CellUnion, pq is a priority queue, and interiorCovering is a bool flag indicating whether the current conversion is an interior covering.

2. coveringInternal()

Now look at the coveringInternal method.


func (c *coverer) coveringInternal(region Region) {
	c.region = region

	c.initialCandidates()
	for c.pq.Len() > 0 && (!c.interiorCovering || len(c.result) < c.maxCells) {
		cand := heap.Pop(&c.pq).(*candidate)

		if c.interiorCovering || int(cand.cell.level) < c.minLevel || cand.numChildren == 1 || len(c.result)+c.pq.Len()+cand.numChildren <= c.maxCells {
			for _, child := range cand.children {
				if !c.interiorCovering || len(c.result) < c.maxCells {
					c.addCandidate(child)
				}
			}
		} else {
			cand.terminal = true
			c.addCandidate(cand)
		}
	}
	c.pq.Reset()
	c.region = nil
}



The coveringInternal method generates the covering and stores the result in result. The general strategy for computing a covering is:

Start with the six face cells of the cube. Discard any that do not intersect the region. Then repeatedly select the largest remaining cell that intersects the region and subdivide it.

Of the eight elements in the Coverer structure, the first four are passed in externally to initialize and the last four are used here. First of all,


c.region = region


This initializes the coverer's region. The other three fields, result, pq, and interiorCovering, are all used below.

result contains only the accepted cells that will form part of the final output, whereas the pq priority queue contains cells that may still need to be subdivided.

If a cell is 100% contained within the region, it is immediately added to the output, and a cell that does not intersect the region at all is immediately discarded. The pq priority queue therefore only contains cells that partially intersect the region.

The pq priority queue's ordering strategy is:

1. Candidates are ordered first by cell size (larger cells come out first);

2. then by the number of intersecting children (fewer intersecting children means higher priority, popped earlier);

3. and finally by the number of children fully contained in the region (again, fewer means higher priority, popped earlier).

The cells that stay in the pq priority queue until last are therefore the lowest-priority ones: relatively small cells whose intersection with the region is large and most of whose children are fully contained. In other words, the cells closest to the edge of the region (the ones whose coverage overshoots the region the least) are the ones left behind.
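The library packs these three keys into a single integer priority. Purely as an illustrative restatement of the ordering described above (this is not the library's actual code), the comparison is equivalent to:

type candidateKey struct {
	level        int // cell level: a lower level means a larger cell
	numChildren  int // number of children that intersect the region
	numTerminals int // number of children entirely contained in the region
}

// popsBefore reports whether a should leave the priority queue before b.
func popsBefore(a, b candidateKey) bool {
	if a.level != b.level {
		return a.level < b.level // larger cells first
	}
	if a.numChildren != b.numChildren {
		return a.numChildren < b.numChildren // fewer intersecting children first
	}
	return a.numTerminals < b.numTerminals // fewer fully contained children first
}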


if c.interiorCovering || int(cand.cell.level) < c.minLevel || cand.numChildren == 1 || len(c.result)+c.pq.Len()+cand.numChildren <= c.maxCells {

}



The intent of this condition in the coveringInternal implementation is:

For interior coverings, we keep subdividing candidates no matter how many children they have; if we reach MaxCells before all the children have been expanded, we simply use only some of them. We cannot do this for ordinary coverings, because the result has to cover the whole region, so all of the children must be used.


candidate.numChildren == 1


This case takes care of the situation where we already have more than MaxCells results (MinLevel is too high). Continuing to subdivide a candidate that has only one child does no harm to the result.

3. initialCandidates()

Let's see how the initial candidates are produced:


func (c *coverer) initialCandidates() {
	// Optimization: start with a small (usually 4 cell) covering of the region's bounding cap.
	temp := &RegionCoverer{MaxLevel: c.maxLevel, LevelMod: 1, MaxCells: min(4, c.maxCells)}

	cells := temp.FastCovering(c.region)
	c.adjustCellLevels(&cells)
	for _, ci := range cells {
		c.addCandidate(c.newCandidate(CellFromCellID(ci)))
	}
}



There is a small optimization in this function: the initial covering starts from a small covering (usually 4 cells) of the region's bounding cap. The two most important methods called in initialCandidates are FastCovering and adjustCellLevels.

4. FastCovering()


func (rc *RegionCoverer) FastCovering(region Region) CellUnion {
	c := rc.newCoverer()
	cu := CellUnion(region.CellUnionBound())
	c.normalizeCovering(&cu)
	return cu
}



The FastCovering function returns a CellUnion covering the given region; it is fast but the result is crude. The resulting CellUnion still satisfies the MaxCells, MinLevel, MaxLevel, and LevelMod requirements, but it makes no attempt to take advantage of large values of MaxCells; it usually returns only a small number of cells, which is why the result is rough.

So using the FastCovering function as a starting point for recursive Cell segmentation is very useful.
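Since FastCovering is an exported method, it can also be called directly when a quick, rough covering is enough. A sketch (capRegion stands for any s2.Region you already have):

rc := &s2.RegionCoverer{MaxLevel: 16, MaxCells: 8}

rough := rc.FastCovering(capRegion) // few, large cells: fast but coarse
fine := rc.Covering(capRegion)      // refined toward the MaxCells budget

fmt.Println(len(rough), len(fine))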

In this method, region.CellUnionBound() is called, so what happens next depends on how each Region type implements that interface method.

For example, loop implements CellUnionBound() as follows:


func (l *Loop) CellUnionBound() []CellID {
	return l.CapBound().CellUnionBound()
}




The method above is Loop's concrete implementation of the fast bounding covering. It is also at the core of spatial covering.

CellUnionBound returns an array of CellIDs covering the region. The cells are not sorted, may be redundant (some cells may contain other cells), and may cover much more area than necessary.

For this reason, the method is not suitable for direct use in client code. Clients should normally use RegionCoverer's Covering method, which lets them control the size and accuracy of the covering. Alternatively, if you want a quick covering and do not care much about accuracy, consider calling FastCovering (which returns a cleaned-up version of the covering computed by this method).

An implementation of CellUnionBound should try to return a small covering of the region (ideally 4 cells or fewer) and should be fast to compute. That is exactly why RegionCoverer uses CellUnionBound as the starting point for further refinement.


func (l *Loop) CapBound() Cap {
	return l.bound.CapBound()
}


CapBound returns a bounding cap, which may have more padding than the corresponding RectBound. The bound is conservative: if the loop contains a point P, the bound must contain P as well.



func (c Cap) CellUnionBound() []CellID {

	level := MinWidthMetric.MaxLevel(c.Radius().Radians()) - 1

	// If level < 0, more than three face cells are required.
	if level < 0 {
		cellIDs := make([]CellID, 6)
		for face := 0; face < 6; face++ {
			cellIDs[face] = CellIDFromFace(face)
		}
		return cellIDs
	}

	return cellIDFromPoint(c.center).VertexNeighbors(level)
}



The code above is the core of the coarsest version of the covering. Its key step is computing the level to use.


level := MinWidthMetric.MaxLevel(c.Radius().Radians()) - 1


The level found here is the level of the largest cells that the Cap can contain.


return cellIDFromPoint(c.center).VertexNeighbors(level)


The line above returns the four cells at that level that are closest to the center of the Cap (its vertex neighbors). If the Cap is very large, the six face cells are returned instead (the level < 0 branch). Either way, the returned cells are not sorted.
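The same two steps can be sketched directly (the 10 km radius, the Earth radius of roughly 6371 km used to convert it to radians, and the coordinates are all assumptions for illustration):

radius := 10000.0 / 6371000.0 // ~10 km expressed in radians on the unit sphere

level := s2.MinWidthMetric.MaxLevel(radius) - 1 // the level used above

ll := s2.LatLngFromDegrees(31.23, 121.47)
neighbors := s2.CellIDFromLatLng(ll).VertexNeighbors(level)

fmt.Println(level, len(neighbors)) // usually 4 neighbors (3 at a cube corner)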

5. normalizeCovering()

The final step in FastCovering is normalizeCovering.


func (c *coverer) normalizeCovering(covering *CellUnion) {
	// 1. If any cells are too small, or don't satisfy levelMod, replace them with ancestors.
	if c.maxLevel < maxLevel || c.levelMod > 1 {
		for i, ci := range *covering {
			level := ci.Level()
			newLevel := c.adjustLevel(minInt(level, c.maxLevel))
			if newLevel != level {
				(*covering)[i] = ci.Parent(newLevel)
			}
		}
	}

	// 2. Sort the cells and simplify them.
	covering.Normalize()

	// 3. If there are still too many cells, repeatedly replace two adjacent cells
	// (in CellID order) by their lowest common ancestor.
	for len(*covering) > c.maxCells {
		bestIndex := -1
		bestLevel := -1
		for i := 0; i+1 < len(*covering); i++ {
			level, ok := (*covering)[i].CommonAncestorLevel((*covering)[i+1])
			if !ok {
				continue
			}
			level = c.adjustLevel(level)
			if level > bestLevel {
				bestLevel = level
				bestIndex = i
			}
		}

		if bestLevel < c.minLevel {
			break
		}
		(*covering)[bestIndex] = (*covering)[bestIndex].Parent(bestLevel)
		covering.Normalize()
	}

	// 4. Make sure the covering satisfies minLevel and levelMod,
	// possibly at the expense of satisfying maxCells.
	if c.minLevel > 0 || c.levelMod > 1 {
		covering.Denormalize(c.minLevel, c.levelMod)
	}
}




normalizeCovering further normalizes the covering produced by the previous step so that it conforms to the current covering parameters (maxCells, minLevel, maxLevel, and levelMod). This method does not attempt to be optimal; in particular, if minLevel > 0 or levelMod > 1, it may return more cells than desired even when that is not strictly required.

There are four points of note in the code implementation above.

First, the judgment is that if cells are too small or don’t meet levelMod’s criteria, replace them with their ancestors.

The second is to sort and simplify the results of the previous step.

Third, if there are still too many cells, the for loop looks at every pair of adjacent cells (adjacent in CellID order), finds the pair with the deepest lowest common ancestor (LCA), and replaces that pair with the ancestor; this is repeated until the count fits or the ancestor's level would drop below minLevel.

The LCA computation used here (CommonAncestorLevel) was explained in detail in an earlier article in this series, so the link will not be repeated here.

Fourth, make sure the result satisfies minLevel and levelMod, possibly at the expense of MaxCells.

Next, let's look at how Normalize() and Denormalize() are implemented.

6. Normalize()

Let's look at Normalize() first.



func (cu *CellUnion) Normalize() {
	sortCellIDs(*cu)

	output := make([]CellID, 0, len(*cu)) // the list of accepted cells

	for _, ci := range *cu {

		// 1. Ignore this cell if it is contained by the last accepted cell.
		if len(output) > 0 && output[len(output)-1].Contains(ci) {
			continue
		}

		// 2. Discard any previously accepted cells contained by this one.
		j := len(output) - 1 // last index to keep
		for j >= 0 {
			if !ci.Contains(output[j]) {
				break
			}
			j--
		}
		output = output[:j+1]

		// 3. See if the last three cells plus this one can be collapsed into their parent.
		for len(output) >= 3 && areSiblings(output[len(output)-3], output[len(output)-2], output[len(output)-1], ci) {
			output = output[:len(output)-3]
			ci = ci.immediateParent() // checked !ci.isFace above
		}
		output = append(output, ci)
	}
	*cu = output
}



The main purpose of Normalize is to sort the CellIDs in the CellUnion and produce a CellUnion with no redundancy.

Sorting is the first step.

Removing the redundant cells is the second and key step of this function. There are two kinds of redundancy: a cell that is fully contained by another, and four sibling cells that can be merged into one larger parent cell.

Check whether one Cell completely contains another Cell. In this case, the contained Cell is redundant and should be discarded. This corresponds to the 1 mark in the code above.

In the implementation we only need to check the last accepted cell: because the cells have been sorted first, if the current candidate cell is not contained by the last accepted cell, it cannot be contained by any previously accepted cell.

Similarly, if the current candidate cell contains cells that were accepted earlier, those cells in output must be discarded. Because the cells are sorted, the contained cells form a contiguous tail of output, so removing them does not break its ordering. This corresponds to mark 2 in the code above.

The last part of the code, mark 3, checks whether the last three accepted cells plus the current one can be merged. If three consecutive cells plus the current cell are siblings, they are replaced by their immediate parent, and the merging can cascade further up.

After Normalize, the output cells are orderly and non-redundant.
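A small sketch of both kinds of redundancy being removed (the cell choice is arbitrary): four siblings plus a cell contained in them normalize down to the single parent cell.

parent := s2.CellIDFromLatLng(s2.LatLngFromDegrees(31.23, 121.47)).Parent(5)

cu := s2.CellUnion{}
for _, child := range parent.Children() {
	cu = append(cu, child) // four siblings: mergeable into their parent
}
cu = append(cu, parent.ChildBeginAtLevel(10)) // contained in the cells above: redundant

cu.Normalize()
fmt.Println(len(cu), cu[0] == parent) // expected: 1 true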

7. Denormalize()


func (cu *CellUnion) Denormalize(minLevel, levelMod int) {
	var denorm CellUnion
	for _, id := range *cu {
		level := id.Level()
		newLevel := level
		if newLevel < minLevel {
			newLevel = minLevel
		}
		if levelMod > 1 {
			newLevel += (maxLevel - (newLevel - minLevel)) % levelMod
			if newLevel > maxLevel {
				newLevel = maxLevel
			}
		}
		if newLevel == level {
			denorm = append(denorm, id)
		} else {
			end := id.ChildEndAtLevel(newLevel)
			for ci := id.ChildBeginAtLevel(newLevel); ci != end; ci = ci.Next() {
				denorm = append(denorm, ci)
			}
		}
	}
	*cu = denorm
}



Denormalize is a simple function that is the counterpart of Normalize (even though the name makes it sound like its opposite). Its purpose is to make the cells conform to the covering parameters defined beforehand: MinLevel, MaxLevel, LevelMod, and MaxCells.

Any cell whose level is less than minLevel, or whose (level - minLevel) is not a multiple of levelMod, is replaced by its children until both conditions are met or maxLevel is reached.

The intent of the Denormalize function is to ensure that the result satisfies minLevel and levelMod, possibly at the expense of MaxCells.

Denormalize replaces a large cell with its smaller children in order to satisfy these conditions, whereas Normalize replaces four sibling cells with their immediate parent.
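A small sketch (arbitrary cell choice): a single level-6 cell denormalized with minLevel 8 and levelMod 1 is replaced by its 16 level-8 descendants.

id := s2.CellIDFromLatLng(s2.LatLngFromDegrees(31.23, 121.47)).Parent(6)

cu := s2.CellUnion{id}
cu.Denormalize(8, 1) // minLevel 8, levelMod 1

fmt.Println(len(cu)) // expected: 16, i.e. 4^(8-6) cells at level 8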

This completes the analysis of the FastCovering() function.

8. adjustCellLevels()

Back in the initialCandidates() function, there is one more step after FastCovering(): adjustCellLevels.


c.adjustCellLevels(&cells)


Let’s see how adjustCellLevels is implemented.


func (c *coverer) adjustCellLevels(cells *CellUnion) {
	if c.levelMod == 1 {
		return
	}

	var out int
	for _, ci := range *cells {
		level := ci.Level()
		newLevel := c.adjustLevel(level)
		if newLevel != level {
			ci = ci.Parent(newLevel)
		}
		if out > 0 && (*cells)[out-1].Contains(ci) {
			continue
		}
		for out > 0 && ci.Contains((*cells)[out-1]) {
			out--
		}
		(*cells)[out] = ci
		out++
	}
	*cells = (*cells)[:out]
}



adjustCellLevels ensures that all cells with level > minLevel also satisfy levelMod, replacing them with an ancestor if necessary. Cells whose level is not above minLevel are left unmodified (see adjustLevel below). The end result is a normalized CellUnion with no redundant cells.

adjustCellLevels is somewhat like Denormalize: both adjust a CellUnion so that it satisfies the covering parameters. The difference is that adjustCellLevels moves cells up to an ancestor level, whereas Denormalize subdivides cells into their children.


func (c *coverer) adjustLevel(level int) int {
	if c.levelMod > 1 && level > c.minLevel {
		level -= (level - c.minLevel) % c.levelMod
	}
	return level
}



The intent of adjustLevel is to return a lower (coarser) level that satisfies the levelMod condition. Levels at or below minLevel are unaffected (cells at those levels are eventually handled by the Denormalize function).
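A quick worked example: with minLevel = 10 and levelMod = 3, a level of 15 is adjusted down to 13, because (15 - 10) % 3 = 2 and 15 - 2 = 13, which leaves (13 - 10) a multiple of 3; a level of 9, being below minLevel, is returned unchanged.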

(2) Denormalize

The implementation of Denormalize has already been analyzed above, so it will not be repeated here.

(3) Summary

Here is a figure summarizing the whole process by which Google S2 covers a spatial region:

Each of the key implementations in the figure above has been analyzed, so if you still don’t understand any nodes, you can go back and look at them again.

This approximation algorithm is not optimal, but it works well in practice. The output does not always use up the maximum allowed number of cells, because doing so does not always produce a better approximation (for example, when the region to be covered sits at the corner of three faces, the result would be much larger than the original region), and because MaxCells limits both the amount of searching and the number of cells in the output.

Since this is an approximation algorithm, the stability of its output should not be relied upon. In particular, the output of the covering algorithm may change between library versions.

The algorithm can also generate interior coverings, which consist of cells completely contained within the region. If no cell satisfies the conditions, the interior covering may be empty even for a non-empty region.

Note that for performance reasons it is advisable to specify MaxLevel when computing interior coverings. Otherwise, for small or zero-area regions, the algorithm may spend a lot of time subdividing cells all the way down to leaf level trying to find qualifying interior cells.
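A sketch of computing an interior covering (this assumes your version of the Go port exposes InteriorCovering alongside Covering, and the cap parameters are made up):

rc := &s2.RegionCoverer{MaxLevel: 12, MaxCells: 8} // MaxLevel bounds the subdivision depth
center := s2.PointFromLatLng(s2.LatLngFromDegrees(31.23, 121.47))
interior := rc.InteriorCovering(s2.CapFromCenterArea(center, 1e-4))
fmt.Println(len(interior)) // may be 0 if no cell fits entirely inside the region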

4. Final words

The spatial search series of articles comes to an end here. To wrap up, a few closing thoughts.

While learning about and practicing spatial search, I consulted material on physics, mathematics, and algorithms, and improved my understanding of space and time from both the physical and the mathematical side. My understanding in this area may still be shallow, but compared with before it has genuinely progressed, so the goal was achieved.

Finally, let me recommend two websites, which also answers the two questions I am asked most often on Weibo.

The first question is: how are S2 cells drawn in the series of articles?

They are drawn with an open-source personal project, s2map.com/, into which I paste the CellIDs computed by the program so that they are displayed. It is a visual presentation tool for studying S2.

The second question is: why are some of the code not found in the Go version?

The answer is that the Go implementation is not 100% complete, and for some things you need to refer to the complete C++ and Java code. As for the C++ and Java source, Google moved the code from its private repository to GitHub a few days ago, which makes it much more convenient to study and browse. The official documentation has also been organized at s2geometry.io. Beginners are advised to read the official API documentation first. If, after reading the documentation, you still have doubts about the underlying reasons for some problems, you can go through this spatial search series; I hope it is helpful to readers.


Spatial Search series:

  • How to understand n-dimensional space and n-dimensional space-time
  • Efficient multi-dimensional spatial point indexing algorithms – Geohash and Google S2
  • How is CellID generated in Google S2?
  • The magical de Bruijn sequence
  • How to find the neighbors of the Hilbert curve on the quadtree?
  • How does Google S2 solve the problem of optimal spatial coverage?

GitHub Repo: Halfrost-Field

Follow: halfrost dead simple

Source: halfrost.com/go_s2_regio…