Go: Why generics
- The Chinese version of
- English version
introduce
[This is the version of the talk delivered at Gophercon 2019. Video links are available.]
This article is about what it means to add generics to Go, and why I think we should. I’ll also cover possible design changes to add generics to Go.
Go was released on November 10, 2009. Less than 24 hours later, we saw our first comment on generics. (The review also mentions that we added panic and Recover to the language in early 2010.)
In the three years of the Go survey, the lack of generics was consistently listed as one of the top three problems the language needed fixing.
This project will bring more documentation, code, and discussion about Go generics.
Why generics?
But what does it mean to add generics, and why do we want them?
In the words of Jazayeri et al., generic programming enables functions and data structures to be represented as generic types.
What does that mean?
As a simple example, let’s say we want to reverse the elements in a slice. That’s not something many programs need to do, but it’s not that unusual.
Let’s say it’s an int slice.
func ReverseInts(s []int) {
first := 0
last := len(s)
for first < last {
s[first], s[last] = s[last], s[first]
first++
last--
}
}Copy the code
Very simple, but even for a simple function, you want to write some test cases. In fact, when I did that, I found a mistake. I’m sure many readers have already spotted it.
func ReverseInts(s []int) {
first := 0
last := len(s) - 1
for first < last {
s[first], s[last] = s[last], s[first]
first++
last--
}
}Copy the code
We need to subtract 1 when we set the variable at the end.
Now let’s reverse a string.
func ReverseStrings(s []string) {
first := 0
last := len(s) - 1
for first < last {
s[first], s[last] = s[last], s[first]
first++
last--
}
}Copy the code
If you compare ReverseInts with ReverseStrings, you’ll see that the two functions are exactly the same, except for the types of arguments. I don’t think any readers are surprised.
Some people are surprised by Go that there is no way to write a simple function in Reverse that works with any type of slice.
Most other languages will let you write this functionality.
In dynamically typed languages like Python or JavaScript, you can simply write functions without specifying element types. This doesn’t work in Go, because Go is statically typed and requires you to note down the exact type of slice and the type of slice element.
Most other statically typed languages, such as C++ or Java or Rust or Swift, support generics to solve this kind of problem.
Today’s Go universal programming
So how do people write this code in Go?
In Go, you can use interface types to write a single function for different slice types and define methods on the slice types to be passed. This is how the sort.sort function of the standard library works.
In other words, the interface type in Go is a form of general-purpose programming. They let us capture different types of common aspects and express them as methods. We can then write functions that use these interface types, and these functions will be applicable to any type that implements these methods.
But this approach does not meet our requirements. With interfaces, you have to write your own methods. It is awkward to reverse slices using several ways to define named types. The method you write is exactly the same for each slice type, so in a sense we’re just moving and compressing duplicate code, we haven’t eliminated it yet. While interfaces are a form of generics, they don’t give us all the generics we want.
Another way to use generic interfaces is to get around the need to write your own methods by having the language define methods for certain types. This is not something the language supports, but, for example, the language could define each slice type to have an Index method that returns an element. But in order to use this method in practice, it must return an empty interface type, and then we lose all the benefits of static typing. More subtly, there is no way to define a generic function that takes two different slices of the same element type, or that takes a mapping of an element type and returns slices of the same element type. Go is a statically typed language because it makes it easier to write large programs; We don’t want to lose the benefits of static typing for the benefits of generics.
The alternative is Reverse to write a generic function using reflection packages, but this is clumsy and slow to write, and few people do it. This method also requires explicit type assertion and has no static type checking.
Alternatively, you can write a code generator that takes a type and Reverse generates a function for a slice of that type. Several code generators do just that. But this adds another step Reverse for each package needed, which complicates the build because all the different copies have to be compiled, and fixing errors in the primary source requires regenerating all instances, some of which may be in completely different projects.
All of these methods are awkward, and I think most people who have to reverse slices in Go just write functions for the particular type of slice they need. Then they need to write test cases for the functions to make sure they don’t make a simple error like the one I originally made. And they need to run these tests regularly.
But we do this, which means that there is a lot of extra work for functions that look exactly the same, beyond the element types. It’s not that it can’t be done. Obviously it can be done, and the Go programmers are doing it. There’s just got to be a better way.
For statically typed languages like Go, generics are a better approach. I wrote earlier about generic programming being able to express functions and data structures in a general form, taking types into account. This is exactly what we want.
What generics can be brought in for Go
The first and most important thing we want from generics in Go is to be able to write functions that Reverse without worrying about the element type of the slice. We want to factor out that element type. We can then write functions once, write tests once, put them in a go-gettable package, and call them at any time.
Even better, since it’s an open source world, someone else can write Reverse once and we can use their implementation.
At this point, I should say that “generic” can mean a lot of different things. In this article, what I mean by “generics” is what I just described. In particular, I don’t mean templates in the C ++ language, which support a lot more than WHAT I’ve written here.
I describe this in detail in Reverse, but we can write many other features, such as:
-
Find the smallest/largest element in the slice
-
Find the mean/standard deviation of slices
-
Evaluate federated/crossed maps
-
Find the shortest path in node/edge
-
Apply the conversion function to slice/map and return a new slice/map
These examples are provided in most other languages. I actually wrote this list by browsing the C++ standard template library.
There are also examples specific to Go that strongly support concurrency.
-
Read from a channel with a timeout
-
Combine the two channels into one channel
-
Call the list of functions in parallel, returning a piece of results
-
Call the list of functions, use Context, and return the result of the first function to complete, cancel, and clean up additional goroutines
I’ve seen all of these functions written many times with different types. It’s not hard to write them in Go. But it would be nice to be able to reuse an efficient and debug-friendly implementation that applies to any value type.
To be clear, these are just a few examples. There are also many generic features that can be written more easily and safely using generics.
Plus, as I’ve written before, it’s not just functionality. It’s also a data structure.
Go has two general-purpose data structures built into the language: slices and Maps. Slices and maps can hold values of any data type, using static typing to check stored and retrieved values. Values are stored for themselves, not for the interface type. That is, when I have a []int, the slice saves the int directly instead of converting the int to the interface type.
Slices and maps are the most useful general-purpose data structures, but they’re not the only ones. Here are some other examples.
-
Sets
-
Self-balancing tree, ordered insertion and row sort traversal
-
Multimaps, with multiple instances of keys
-
Concurrent hash mapping, supporting parallel inserts and look-up, without a single lock
If we could write generic types, we could define new data structures like these that have the same type-checking advantages as slicing and mapping: compilers can statically type check the types of the values they hold, and values can be stored as themselves rather than as interface types.
It should also be possible to take the algorithms mentioned earlier and apply them to generic data structures.
These examples should be like Reverse: generic functions and data structures written once, in packages, and reused as needed. They should work like slices and maps, because they should not store values for empty interface types, but specific types should be stored, and these types should be checked at compile time.
That’s what Go can get from generics. Generics give us powerful building blocks that make it easier to share code and builders.
I hope I’ve explained why it’s worth studying.
Benefits and Costs
But generics don’t come from Big Rock Candy Mountain, where the sun shines on lemonade springs every day. Every language change has a cost. There is no doubt that adding generics to Go will make the language more complex. As with any change in language, we need to talk about maximising benefits and minimising costs.
In Go, our goal is to reduce complexity through independent, orthogonal language features that can be freely combined. We reduce complexity by simplifying individual features and maximize the benefits of features by allowing them to be freely combined. We want to do the same thing with generics.
To make this more concrete, I will outline some guidelines that we should follow.
* Minimize new concepts
We should add as few new concepts to the language as possible. This means adding the fewest new syntax and the fewest new keywords and other names.
* Complexity falls on the writers of common code, not the users
Programmers programming generic packages should minimize complexity as much as possible. We don’t want package users to have to worry about generics. This means it should be possible to call generic functions in a natural way, and any errors with generic packages should be reported in a way that is easy to understand and fix. It should also be easy to invoke the call as generic code.
* Writers and users can work independently
Similarly, it should be easy to separate the concerns of common code writers and their users so that they can develop the code independently. They shouldn’t worry about what each other is doing, not just writers and callers of normal functionality in different packages. This sounds obvious, but it’s not true of generics in every other programming language.
* Short build time, short execution time
Of course, as much as possible, we want to keep Go giving us today’s short build times and fast execution times. Generics tend to be a trade-off between fast build and fast execution. We want both as much as possible.
* Keep Go clear and concise
Most importantly, Go today is a simple language. Go programs are usually clear and easy to understand. A major part of our long process of exploring this space has been trying to understand how to add generics while maintaining clarity and simplicity. We need to find mechanisms that fit into the existing language, rather than turning it into something completely different.
These guidelines should apply to any generic implementation in Go. That’s the most important message I want to leave you with today: Generics can bring significant benefits to the language, but they’re worth doing if Go still feels like Go.
The draft design
Fortunately, I think it can be done. To complete this article, I’ll discuss why we want generics, what the requirements are for them, and briefly discuss how we think we can add them to the design of the language.
At this year’s Gophercon Robert Griesemer and I released a draft design to add generics to Go. See the draft for more information. I will discuss some of the main points here.
This is the general inverse function in this design.
func Reverse (type Element) (s []Element) {
first := 0
last := len(s) - 1
for first < last {
s[first], s[last] = s[last], s[first]
first++
last--
}
}Copy the code
You’ll notice that the body of the function is exactly the same. Only the signature has changed.
The element type of the slice has been considered. It is now named Element and becomes what we call a type parameter. Instead of being part of the Slice parameter type, it is a separate additional type parameter.
To call a function with a type parameter, in general, you pass a type parameter, which is like any other parameter, except that it is a type.
func ReverseAndPrint(s []int) {
Reverse(int)(s)
fmt.Println(s)
}Copy the code
This is the int that you see after Reverse in this example.
Fortunately, in most cases, including this one, the compiler can infer type parameters from the types of regular parameters and doesn’t need to mention type parameters at all.
Calling a generic function is like calling any other function.
func ReverseAndPrint(s []int) {
Reverse(s)
fmt.Println(s)
}Copy the code
In other words, while generic Reverse functionality is slightly superior to more complex ReverseInts and ReverseStrings, the complexity falls on functions rather than writing and calling them.
contract
Since Go is a statically typed language, we must discuss the types of type parameters. This meta-type tells the compiler what types of arguments to allow when calling generic functions and what operations generic functions can perform on the values of type parameters.
The Reverse function can use slicing of any type. Its only effect on type values is Element assignment, which applies to any type in Go. This is a very common case for this kind of general-purpose function, and we don’t need to say anything special about type parameters.
Let’s take a quick look at the different features.
func IndexByte (type T Sequence) (s T, b byte) int {
for i := 0; i < len(s); i++ {
if s[i] == b {
return i
}
}
return -1
}Copy the code
Currently, both the Bytes package and the Strings package in the standard library have an IndexByte function. This function returns the index S in the b sequence, where s a string or a []byte. We can use this single generic function to replace the two functions in the byte and string packages. In practice, we might not do this, but this is a useful simple example.
Here, we need to know that T of the type parameter acts like a string or a []byte. We can call len it, we can index it, we can compare the result of the index operation to the byte value.
To compile, the type parameter T itself needs a type. It is a meta-type, but because we sometimes need to describe multiple related types, and because it describes the relationship between the implementation of a generic function and its caller, we actually call T the type of the contract. The contract is named Sequence here. It appears after the type parameter list.
This is how the Sequence contract is defined for this example.
contract Sequence(T) {
T string, []byte
}Copy the code
This is easy because this is a simple example: the type parameter T can be either string or []byte. The contract may be a new keyword, or a special identifier recognized within the scope of the package; Please refer to the draft design for details.
Anyone who remembers the designs we showed at Gophercon 2018 will find it much easier to sign contracts this way. We’ve had a lot of feedback about the early design of contracts being too complex, and we’ve tried to take that into account. New contracts are much easier to write, read and understand.
They allow you to specify the base types of type parameters, and/or list methods for type parameters. They also let you describe relationships between different types of parameters.
Contract with the method
This is another simple example that uses the String method to return the String representation s of all the elements of [] String.
func ToStrings (type E Stringer) (s []E) []string {
r := make([]string, len(s))
for i, v := range s {
r[i] = v.String()
}
return r
}Copy the code
It’s pretty simple: Iterate over the slice, String calls a method on each element, and returns a slice of the resulting String.
This function requires the element type to implement the String method. The string contract ensures this.
contract Stringer(T) {
T String() string
}Copy the code
The contract just says that T must implement the String method.
You might notice that this contract looks like the FMt.stringer interface, so it’s worth pointing out that the argument to the ToStrings function is not a branch FMt.stringer. It is a fragment of some element types FMT.Stringer that the element type implements. The memory representation of element type slices and FMt.stringer slices are generally different, and Go does not support direct conversions between them. So even if FMT.Stringer exists, it’s worth writing about.
There are many types of contracts
The following is an example of a contract with multiple type parameters.
type Graph (type Node, Edge G) struct { ... }
contract G(Node, Edge) {
Node Edges() []Edge
Edge Nodes() (from Node, to Node)
}
func New (type Node, Edge G) (nodes []Node) *Graph(Node, Edge) {
...
}
func (g *Graph(Node, Edge)) ShortestPath(from, to Node) []Edge {
...
}Copy the code
Here we describe a Graph built with Node and Edge. We don’t need Graph’s specific data structure. Instead, we say that the Node type must have a Edges method that returns a list Node of the Edges to which it is connected. And the Edge type must have a Nodes return two method Nodes to which the Edge is connected.
I’ve skipped over the implementation, but this shows the signature of Graph, a function that New returns a, and Graph, the signature of the ShortestPath method.
The important point here is that contracts are not just one type. It can describe the relationship between two or more types.
Order type
A surprisingly common complaint about Go is that it has no Min feature. Or, for that matter, a Max feature. This is because a useful Min function should work for any ordered type, which means it must be generic.
While Min is simple enough to write on its own, any useful generic implementation should let us add it to the standard library. This is our design.
func Min (type T Ordered) (a, b T) T {
if a < b {
return a
}
return b
}Copy the code
The Ordered contract says that T has a type of Ordered type, which means that it supports operations like less than, greater than, and so on.
contract Ordered(T) {
T int, int8, int16, int32, int64,
uint, uint8, uint16, uint32, uint64, uintptr,
float32, float64,
string
}Copy the code
The Ordered contract is simply a list of all the command types defined by the language. This contract accepts any listed type, or any named type whose base type is one of the types. Basically, you can use any type of the less operator.
It turns out that simply enumerating types that support less than operators is much easier than inventing a new notation for all operators. After all, in Go, only built-in types support operators.
This approach can be used for any operator or, more generally, for writing contracts for any generic function intended to be used with built-in types. It allows the writer of a generic function to clearly specify the set of types with which the function should be used. It allows callers of generic functions to see clearly whether the function is applicable to the type being used.
In fact, the contract may end up in the standard library. So the Min function (which might also be in some standard library) looks something like this. Here we refer only to the Ordered contract defined in the package.
func Min (type T contracts.Ordered) (a, b T) T {
if a < b {
return a
}
return b
}Copy the code
Universal data structure
Finally, let’s look at a simple universal data structure, a binary tree. In this example, the tree has comparison capability, so there is no requirement for element type.
type Tree (type E) struct {
root *node(E)
compare func(E, E) int
}
type node (type E) struct {
val E
left, right *node(E)
}Copy the code
Here’s how to create a new binary tree. The comparison function is passed to the New function.
func New (type E) (cmp func(E, E) int) *Tree(E) {
return &Tree(E){compare: cmp}
}Copy the code
The unexported method returns a pointer to the slot that holds v, or to where the tree should go.
func (t *Tree(E)) find(v E) **node(E) { pn := &t.root for *pn ! = nil { switch cmp := t.compare(v, (*pn).val); { case cmp < 0: pn = &(*pn).left case cmp > 0: pn = &(*pn).right default: return pn } } return pn }Copy the code
The details here aren’t important, especially since I haven’t tested this code. I just want to show you what it looks like to write a simple generic data structure.
This is the code that tests whether the tree contains a value.
func (t *Tree(E)) Contains(v E) bool { return *t.find(e) ! = nil }Copy the code
This is the code to insert the new value.
func (t *Tree(E)) Insert(v E) bool { pn := t.find(v) if *pn ! = nil { return false } *pn = &node(E){val: v} return true }Copy the code
Note the type node of the type parameter E. This is what writing a generic data structure looks like. As you can see, it looks like writing normal Go code, except for scattering some types of arguments here and there.
Using trees is simple.
var intTree = tree.New(func(a, b int) int { return a - b }) func InsertAndCheck(v int) { intTree.Insert(v) if ! intTree.Contains(v) { log.Fatalf("%d not found after insertion", v) } }Copy the code
That’s as it should be. Writing generic data structures is a little more difficult because you often need to explicitly write type parameters for supporting types, but using one where possible is no different than using a normal non-generic data structure.
The next step
We are working on practical implementation so that we can try this design. It’s important to be able to experiment with design in practice to make sure we can write the type of program we want to write. It’s not as fast as we’d like, but we’ll send more details as these implementations become available.
Robert Griesemer wrote a preliminary CL that modified the Go /types package. This allows you to test whether code using generics and contracts can be keyed in. It’s not complete yet, but it mostly works with individual packages, and we’ll keep working on that.
What we want people to do with this and future implementations is to try to write and use common code and see what happens. We want to make sure that people can write the code they need, and that they can use it as expected. Of course, not everything will work, and we may have to change everything as we explore this space. Also, be clear that we are more interested in semantic feedback than in the details of syntax.
I’d like to thank everyone who commented on the early designs, and everyone who discussed generics in Go. We’ve read all the comments and we really appreciate the effort that people put into this. Without that work, we wouldn’t be where we are today.
Our goal was to achieve a design that would allow me to write the generic code I discussed today without making the language too complex or moving away from Go. We hope this design is a step toward that goal, and we hope to continue to adjust it as we learn, from our experiences and yours, what works and what doesn’t. If we do achieve this goal, then we can provide some suggestions for future versions of Go.
By Ian Lance Taylor
Original: https://blog.golang.org/why-generics
Translation: https://github.com/llgoer/go-generics