If you take a look at the new Datadog Agent, you may notice that most of the code base is written in Go, although the checks we use to collect metrics are still written in Python. This is possible because the Datadog Agent, a regular Go binary, embeds a CPython interpreter that can be called whenever it needs to execute Python code. An abstraction layer makes this process transparent, so you can still write idiomatic Go code even when Python is running underneath.

There are many reasons to embed Python in Go applications:

  • It’s useful during a transition: you can gradually migrate parts of an existing Python project to the new language without losing any functionality along the way.
  • You can reuse existing Python software or libraries without having to re-implement them in a new language.
  • You can dynamically extend your software by loading and executing regular Python scripts, even at runtime.

The list goes on, but for the Datadog Agent the last point is crucial: we want to be able to run custom checks or change existing ones without having to recompile the Agent, or compile anything at all.

Embedding CPython is simple and well-documented. The interpreter itself is written in C, and provides a C API to perform low-level operations programmatically, such as creating objects, importing modules, and calling functions.

In this article, we’ll show some code examples that keep Go code idiomatic while interacting with Python, but before we proceed we need to address a gap: how can this work when the embedding API is C and our main application is Go?

Introducing cgo

There are many good reasons not to introduce cgo into your stack, but embedding CPython is one case where you must. Cgo is not a language, nor is it a compiler. It is a Foreign Function Interface (FFI), a mechanism we can use in Go to call functions and services written in a different language (specifically C).

When we say “cgo,” we really mean the set of tools, libraries, functions, and types that the Go toolchain uses under the hood, so that we can keep getting our Go binaries by running go build. Here is a sample program using cgo:

package main

// #include <float.h>
import "C"
import "fmt"

func main() {
    fmt.Println("Max float value of float is", C.FLT_MAX)
}

In this case, the comment block right above the import "C" statement is called the “preamble” and can contain actual C code, here an include directive. Once imported, the “C” pseudo-package lets us “jump” to the foreign code and access the FLT_MAX constant. You can build the example by invoking go build, exactly as you would for plain Go.

If you want to see what cgo is doing behind the scenes, run go build -x. You’ll see the cgo tool being invoked to generate some C and Go modules, then the C and Go compilers being called to build the object modules, and finally the linker putting everything together.

You can read more about cgo on the Go blog, which contains more examples and useful links to further details.

Now that we’ve seen what cgo can do for us, let’s look at how to run some Python code using this mechanism.

Embedded CPython: a how-to guide

Technically, a Go program that embeds CPython is not as complicated as you might expect. In fact, we simply have to initialize the interpreter before running any Python code and finalize it when we’re done. Note that we use Python 2.x in all our examples, but everything applies to Python 3.x with very few adjustments. Let’s look at an example:

package main

// #cgo pkg-config: python-2.7
// #include <Python.h>
import "C"
import "fmt"

func main() {
    C.Py_Initialize()
    fmt.Println(C.GoString(C.Py_GetVersion()))
    C.Py_Finalize()
}

The above example does exactly what the following Python code does:

import sys
print(sys.version)

You can see that we added a #cgo directive to the preamble; these directives are passed to the toolchain and let you change the build workflow. In this case, we tell cgo to invoke pkg-config to gather the flags needed to build and link against a library called python-2.7, and to pass those flags to the C compiler. If you have the CPython development libraries and pkg-config installed on your system, you just need to run go build to compile the example above.

Going back to the code, we use Py_Initialize() and Py_Finalize() to set up and shut down the interpreter, and the Py_GetVersion C function to retrieve a string containing version information for the embedded interpreter.

In case you’re wondering, all the cgo code we need to put together to call the Python C API is boilerplate. This is why the Datadog Agent relies on go-python for all the embedding operations; the library provides a Go-friendly thin wrapper around the C API and hides the cgo details. Here is another basic embedding example, this time using go-python:

package main

import (
    python "github.com/sbinet/go-python"
)

func main() {
    python.Initialize()
    python.PyRun_SimpleString("print 'hello, world!'")
    python.Finalize()
}

This looks more like regular Go code: cgo is no longer exposed, and we can use Go strings back and forth while accessing the Python API. Embedding looks powerful and developer-friendly, so it’s time to put the interpreter to good use: let’s try to load a Python module from disk.

We don’t need anything complicated on the Python side; the ubiquitous “hello world” does the trick:

# foo.py
def hello():
    """
    Print hello world for fun and profit.
    """
    print "hello, world!"

The Go code is slightly more complex, but still readable:

// main.go
package main

import "github.com/sbinet/go-python"

func main() {
    python.Initialize()
    defer python.Finalize()

    fooModule := python.PyImport_ImportModule("foo")
    if fooModule == nil {
        panic("Error importing module")
    }

    helloFunc := fooModule.GetAttrString("hello")
    if helloFunc == nil {
        panic("Error importing function")
    }

    // The Python function takes no params but when using the C api
    // we're required to send (empty) *args and **kwargs anyways.
    helloFunc.Call(python.PyTuple_New(0), python.PyDict_New())
}

Once the program is built, we need to set the PYTHONPATH environment variable to the current working directory when running it, so that the import statement can find the foo.py module. In a shell, the command looks like this:

$ go build main.go && PYTHONPATH=. ./main
hello, world!

The dreaded global interpreter lock

Having to bring in cgo in order to embed Python is a trade-off: builds are slower, the garbage collector won’t help us manage memory used by the foreign system, and cross-compilation becomes non-trivial. Whether these are real concerns is debatable for a specific project, but one thing is not negotiable: the Go concurrency model. If we couldn’t run Python from a goroutine, using Go at all would make very little sense.

Before dealing with concurrency, Python, and cgo, there is something else we need to know about: the Global Interpreter Lock, or GIL. The GIL is a mechanism widely adopted in language interpreters (CPython is one of them) that prevents more than one thread from running at once. This means that no Python program executed by CPython can ever run in parallel within the same process. Concurrency is still possible, and in the end the lock is a good trade-off between speed, safety, and implementation simplicity, so why should it be a problem when it comes to embedding?

When a regular, non-embedded Python program starts, there is no GIL involved, which avoids useless overhead for locking operations; the GIL is created the first time some Python code requests to spawn a thread. For each thread, the interpreter creates a data structure to store information about the current state and locks the GIL. When the thread has finished, the state is restored and the GIL is unlocked, ready to be used by other threads.

None of this happens automatically when we run Python from a Go program. Without the GIL, our Go program could create multiple Python threads, which could cause a race condition leading to fatal runtime errors, most likely a segmentation fault that brings down the whole Go application.

The solution is to explicitly invoke the GIL whenever we run multithreaded code from Go; the code is not complex because the C API provides all the tools we need. To better expose the problem, we need some CPU-bound Python code. Let’s add these functions to the foo.py module from the previous example:

# foo.py
import sys

def print_odds(limit=10):
    """
    Print odds numbers < limit
    """
    for i in range(limit):
        if i%2:
            sys.stderr.write("{}\n".format(i))

def print_even(limit=10):
    """
    Print even numbers < limit
    """
    for i in range(limit):
        if i%2 == 0:
            sys.stderr.write("{}\n".format(i))

We will try to print odd and even numbers concurrently from Go, using two different goroutines (and therefore involving threads):

package main

import (
    "sync"

    "github.com/sbinet/go-python"
)

func main() {
    // The following will also create the GIL explicitly
    // by calling PyEval_InitThreads(), without waiting
    // for the interpreter to do that
    python.Initialize()

    var wg sync.WaitGroup
    wg.Add(2)

    fooModule := python.PyImport_ImportModule("foo")
    odds := fooModule.GetAttrString("print_odds")
    even := fooModule.GetAttrString("print_even")

    // Initialize() has locked the GIL but at this point we don't need it
    // anymore. We save the current state and release the lock
    // so that goroutines can acquire it
    state := python.PyEval_SaveThread()

    go func() {
        _gstate := python.PyGILState_Ensure()
        odds.Call(python.PyTuple_New(0), python.PyDict_New())
        python.PyGILState_Release(_gstate)

        wg.Done()
    }()

    go func() {
        _gstate := python.PyGILState_Ensure()
        even.Call(python.PyTuple_New(0), python.PyDict_New())
        python.PyGILState_Release(_gstate)

        wg.Done()
    }()

    wg.Wait()

    // At this point we know we won't need Python anymore in this
    // program, we can restore the state and lock the GIL to perform
    // the final operations before exiting.
    python.PyEval_RestoreThread(state)
    python.Finalize()
}

As you read through the examples, you may notice a pattern that will become our customary way of writing embedded Python code:

  1. Save the state and lock the GIL.
  2. Execute Python.
  3. Restore the state and unlock the GIL.

The code should be straightforward, but there is one subtle detail we want to point out: although both follow the same GIL pattern, in one case we operate the GIL by calling PyEval_SaveThread() and PyEval_RestoreThread(), while in the other (look inside the goroutines) we do the same with PyGILState_Ensure() and PyGILState_Release().

We said that when working with multiple threads from Python, the interpreter is responsible for creating the data structures needed to store the current state, but when the same thing happens with the C API, we are responsible for handling it.

When we initialize the interpreter with go-python, we are operating in a Python context. So when PyEval_InitThreads() is called, it initializes the data structure and locks the GIL. We can use PyEval_SaveThread() and PyEval_RestoreThread() to operate on the already existing state.

Inside the goroutines, we are operating from a Go context, so we need to explicitly create the state and remove it when we are done, which is what PyGILState_Ensure() and PyGILState_Release() do for us.
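The lock, execute, unlock pattern described above is easy to get wrong if a release call is skipped on an early return. One way to guard against that is a small helper built on defer. The sketch below is a pure-Go simulation under a stated assumption: a sync.Mutex stands in for the GIL so that the example runs without Python, and fakeGIL, withGIL, and countUnderLock are hypothetical names. In a real program the Lock and Unlock calls would be replaced by PyGILState_Ensure() and PyGILState_Release().

```go
package main

import (
    "fmt"
    "sync"
)

// fakeGIL is a stand-in for the interpreter lock (an assumption made so
// the example runs without an embedded interpreter).
type fakeGIL struct {
    mu sync.Mutex
}

// withGIL acquires the lock, runs fn, and always releases the lock,
// even if fn panics or returns early.
func (g *fakeGIL) withGIL(fn func()) {
    g.mu.Lock()
    defer g.mu.Unlock()
    fn()
}

// countUnderLock starts several goroutines that each increment a shared
// counter, but only while holding the lock.
func countUnderLock(workers, iterations int) int {
    var (
        gil     fakeGIL
        wg      sync.WaitGroup
        counter int
    )
    wg.Add(workers)
    for i := 0; i < workers; i++ {
        go func() {
            defer wg.Done()
            for j := 0; j < iterations; j++ {
                gil.withGIL(func() { counter++ })
            }
        }()
    }
    wg.Wait()
    return counter
}

func main() {
    fmt.Println(countUnderLock(4, 1000))
}
```

Because every increment happens under the lock, the goroutines run concurrently but never touch the counter in parallel, which mirrors what the GIL guarantees for Python code.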

Release the Gopher

At this point, we know how to handle multithreaded Go code that executes Python in an embedded interpreter, but after the GIL another challenge looms: the Go scheduler.

When a goroutine starts, it is scheduled for execution on one of the available GOMAXPROCS threads. If a goroutine happens to perform a system call or call C code, the current thread hands over the other goroutines waiting in its queue to another thread so that they have a better chance to run; the current goroutine is paused, waiting for the system call or the C function to return. When that happens, the thread tries to resume the paused goroutine, but if this is not possible, it asks the Go runtime to find another thread to complete the goroutine and goes to sleep. The goroutine is finally scheduled on another thread and finishes there.

With this in mind, let’s look at what can happen to a goroutine running some Python code when it is moved to a new thread:

  1. Our goroutine starts, performs a C call, and pauses. The GIL is locked.
  2. When the C call returns, the current thread tries to resume the goroutine, but fails.
  3. The current thread tells the Go runtime to find another thread to resume our goroutine.
  4. The Go scheduler finds an available thread and the goroutine is resumed.
  5. The goroutine is almost done and tries to unlock the GIL before returning.
  6. The thread ID stored in the current state belongs to the original thread and differs from the ID of the current thread.
  7. Crash!

Fortunately, we can force the Go runtime to always keep our goroutine running on the same thread by calling the LockOSThread function from the runtime package inside the goroutine:

go func() {
    runtime.LockOSThread()

    _gstate := python.PyGILState_Ensure()
    odds.Call(python.PyTuple_New(0), python.PyDict_New())
    python.PyGILState_Release(_gstate)
    wg.Done()
}()

This interferes with the scheduler and may introduce some overhead, but it is a price we are willing to pay.
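If many goroutines need thread affinity, one possible arrangement is to pin a single worker goroutine and send it closures over a channel, so only one OS thread is ever tied up. This is a sketch of that idea, not a description of how the Agent does it; pinnedWorker and its methods are hypothetical names, and the tasks here are plain Go functions standing in for calls into the interpreter.

```go
package main

import (
    "fmt"
    "runtime"
)

// pinnedWorker runs every submitted task on one goroutine that is
// locked to a single OS thread for its whole lifetime.
type pinnedWorker struct {
    tasks chan func()
    done  chan struct{}
}

func newPinnedWorker() *pinnedWorker {
    w := &pinnedWorker{tasks: make(chan func()), done: make(chan struct{})}
    go func() {
        runtime.LockOSThread()
        defer runtime.UnlockOSThread()
        defer close(w.done)
        for task := range w.tasks {
            task()
        }
    }()
    return w
}

// run executes fn on the pinned thread and blocks until it has returned.
func (w *pinnedWorker) run(fn func()) {
    finished := make(chan struct{})
    w.tasks <- func() {
        defer close(finished)
        fn()
    }
    <-finished
}

// stop closes the queue and waits for the worker goroutine to exit.
func (w *pinnedWorker) stop() {
    close(w.tasks)
    <-w.done
}

// sumOnPinnedThread adds the values one task at a time on the pinned thread.
func sumOnPinnedThread(values []int) int {
    w := newPinnedWorker()
    defer w.stop()
    total := 0
    for _, v := range values {
        v := v // capture the current element for the closure
        w.run(func() { total += v })
    }
    return total
}

func main() {
    fmt.Println(sumOnPinnedThread([]int{1, 2, 3}))
}
```

The trade-off is that all tasks are serialized through one thread, which is acceptable when the work is serialized by a global lock anyway.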

Conclusion

In order to embed Python, the Datadog Agent must accept some trade-offs:

  • The overhead introduced by cgo.
  • The need to handle the GIL manually.
  • The restriction of binding goroutines to the same thread during execution.

We are happy to accept each of these for the convenience of running Python checks in Go. By being aware of the trade-offs, we are able to minimize their effect. As for the other limitations introduced to support Python, we have a few countermeasures to contain potential issues:

  • The build is automated and configurable, so developers still have something very similar to go build.
  • A lightweight version of the Agent can be built with Go build tags, stripping out Python support entirely.
  • Such a version relies only on the core checks hard-coded in the Agent itself (mainly system and network checks), but it is cgo-free and can be cross-compiled.
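As a rough illustration of how build tags can strip a feature, the file below is compiled only when a tag named python is not set. The tag name, the pythonEnabled constant, and the check names are all hypothetical; the Agent’s actual tags and file layout are not shown in this article.

```go
//go:build !python

package main

import "fmt"

// pythonEnabled is false in this file, which is compiled only when the
// hypothetical "python" build tag is NOT set.
const pythonEnabled = false

// availableChecks lists the checks this build can run. The names are
// placeholders for illustration.
func availableChecks() []string {
    checks := []string{"cpu", "network"} // stand-ins for core checks
    if pythonEnabled {
        checks = append(checks, "custom-python")
    }
    return checks
}

func main() {
    fmt.Println(availableChecks())
}
```

A sibling file guarded by //go:build python would define the same constant as true and be the only place that imports the cgo-based package, so a plain go build never touches cgo while go build -tags python pulls it in.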

We will re-evaluate our options in the future and decide whether keeping cgo around is still worth it; we could even reconsider whether Python as a whole is still worth it, while waiting for the Go plugin package to mature enough to support our use case. But for now, embedded Python works well, and transitioning from the old Agent to the new one could not be easier.
