Introduction: Cross-language programming is an important direction in modern programming languages and is widely used in the design and implementation of complex systems. This article is based on "Alibaba FFI: Exploration of Cross-language Programming", presented at GIAC 2021.

Authors: Dong Denghui and Gu Tianxiao, JVM team, Aliyun Intelligent Basic Software Department

Background

Preface

Java is undoubtedly one of the most popular application programming languages in the industry today. Besides the excellent performance of its mainstream implementation (OpenJDK HotSpot) and its mature R&D ecosystem (Spring, etc.), its success owes much to the low learning threshold of the language itself (compared to C/C++). A beginner can quickly build a working prototype using existing project scaffolding, but as a result many Java programmers are unfamiliar with how a program executes underneath. This article explores a technique that is rarely touched in most Java development work: cross-language programming.

I recall that when I first printed “Hello World” to the console in Java many years ago, I looked through the JDK source code out of curiosity to see how it was implemented (in C we can use the printf function, whose implementation depends on operating system interfaces). After jumping from method to method, I finally stopped at a native method with “no implementation”, and the trail ended there.

I think many junior Java programmers still have only a vague understanding of how native methods are called. After all, most development work rarely requires us to implement a custom native method directly. Put simply, a native method is the interface through which Java makes cross-language calls, and it is part of the Java Native Interface specification.
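To make this concrete, here is a minimal sketch (my own illustration, not code from the JDK or from Alibaba FFI) that uses reflection to check whether a method is declared native. The class name NativeMethodCheck and the helper isNative are hypothetical names chosen for this example. It relies on the fact that OpenJDK declares System.currentTimeMillis as a native method, while String.length is an ordinary Java method.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical helper class for illustration only.
final class NativeMethodCheck {

    /** Returns true if the public no-argument method `name` of `owner` is declared native. */
    static boolean isNative(Class<?> owner, String name) {
        try {
            Method m = owner.getMethod(name);
            return Modifier.isNative(m.getModifiers());
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException("no such method: " + name, e);
        }
    }

    public static void main(String[] args) {
        // A method whose body lives outside Java bytecode.
        System.out.println("System.currentTimeMillis native? "
                + isNative(System.class, "currentTimeMillis"));
        // A method implemented in plain Java.
        System.out.println("String.length native? "
                + isNative(String.class, "length"));
    }
}
```

The native modifier is all the Java side knows about such a method; where the implementation comes from is decided at link time by the virtual machine.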

Application scenarios of Java cross-language programming techniques

Common scenarios

Before introducing Java cross-language programming techniques, let’s briefly analyze the scenarios where cross-language programming techniques are needed in the actual programming process. Here I list the following four scenarios:

1. Rely on capabilities not supported by bytecode

Conversely, what capabilities does standard bytecode offer today? According to the JVM specification, bytecode can create Java objects, access object fields and methods, perform ordinary computation (addition, subtraction, multiplication, division, AND, OR, NOT, and so on), compare, jump, and handle exceptions and locks. However, higher-level functions such as printing a string to the console, as mentioned in the preface, are not directly supported by bytecode. Likewise, obtaining the current time, allocating off-heap memory, and rendering graphics are not supported by bytecode. It is difficult to write a pure Java method (composed of these bytecodes) that implements such capabilities, because they usually require interaction with system resources. In these scenarios, we need cross-language programming techniques to integrate such functionality through implementations in other languages.

2. Use system-level languages (C, C++, Assembly) to achieve the critical path of the system

Not having to free objects explicitly is one of the reasons the Java language has a low learning barrier, but it comes at the cost of a GC mechanism that cleans up objects the program no longer needs. In mainstream JVM implementations, GC can pause the entire application and affect overall system performance, including response time and throughput.

Therefore, compared with C/C++, Java reduces the programmer's development burden and improves development efficiency, but it also introduces runtime overhead. (Software engineering is largely the art of balancing tradeoffs.)

When Java is used to implement core parts of a system's critical path (such as some complex algorithms) and performance turns out to be unstable, you can try implementing that logic in a lower-level programming language to achieve stable performance and low resource consumption.

3. Other language implementations call Java

This scenario may strike most Java programmers as almost unheard of, but we actually experience it almost every day.

For example, running a Java program with the java command goes through the process of calling Java from C, as discussed later in this article.

4. Legacy libraries

There are high-performance libraries written in C/C++, both internal and open source, that would be very expensive to rewrite and maintain in Java. When Java applications need the capabilities provided by these libraries, we have to reuse them through cross-language programming techniques.

Alibaba Grape

One more scenario comes from within Alibaba: the Alibaba Grape project, the first business team to work with our team on cross-language programming technology.

Grape is an open source framework for parallel graph computing (ACM SIGMOD Best Paper Award), written primarily in C++ and making extensive use of templates. Those interested in Grape can refer to its documentation on GitHub; I won't cover it in detail here.

Internally, there are many business parties that use Java as their primary programming language, so developers need to wrap Grape’s library into a Java SDK for upper-level Java applications to call. In the course of practice, two significant problems were encountered:

  • Packaging the SDK can be tedious, especially for template-dependent libraries like Grape, where manual packaging often goes wrong in the early stages
  • Runtime performance is much lower than C++ applications

In order to solve these problems, the two teams cooperated, and the Alibaba FFI project officially began to evolve. The implementation of the project is mainly targeted at Java calling C++ scenarios.

Java cross-language programming techniques

Here are some of the more mature and widely used Java cross-language invocation techniques in the industry.

Java Native Interface

When it comes to Java cross-language programming, the first thing to mention is the Java Native Interface, or JNI for short. The JNA/JNR and JavaCPP technologies mentioned below all rely on JNI. First, a quick review with two examples.

Example: console output

System.out.println("hello ffi");

We can quickly produce console output with System.out.println. Curious readers will wonder how the output is actually implemented. Following the call chain through the source code, we eventually reach a native method like this:

private native void writeBytes(byte b[], int off, int len, boolean append) throws
IOException;

(This method is implemented by the JDK and can be seen here.)

So can we implement such functionality ourselves?

A. First, we define a Java native method, which uses the native keyword and does not provide an implementation (native methods can be overloaded).

static native void myHelloFFI();

B. Use javah or javac -h (javah was removed in JDK 10, so use javac -h on newer JDKs) to generate a header file for use by the C or C++ program.

/* DO NOT EDIT THIS FILE - it is machine generated */
#include <jni.h>
/* Header for class HelloFFI */


#ifndef _Included_HelloFFI
#define _Included_HelloFFI
#ifdef __cplusplus
extern "C" {
#endif
/*
 * Class:     HelloFFI
 * Method:    myHelloFFI
 * Signature: ()V
 */
JNIEXPORT void JNICALL Java_HelloFFI_myHelloFFI
  (JNIEnv *, jclass);


#ifdef __cplusplus
}
#endif
#endif

C. Implement the function in the header file. Here we directly use the printf function to print “hello ffi” on the console.

#include <stdio.h>
#include "HelloFFI.h" /* the header generated in step B */

JNIEXPORT void JNICALL Java_HelloFFI_myHelloFFI
  (JNIEnv * env, jclass c) {
  printf("hello ffi");
}

D. Use a C/C++ compiler (GCC, LLVM, etc.) to compile the implementation into a library file

E. Use the -Djava.library.path=… parameter to specify the library file path and call System.loadLibrary at run time to load the library generated in the previous step, after which the Java program can call our own myHelloFFI method as normal.
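Putting steps A and E together, the Java side might look like the following sketch. The library name "helloffi" is an assumption made for this example (the article does not name the library built in step D), and the loadNative helper is a hypothetical convenience that reports a missing library instead of failing.

```java
// Java side of the JNI example; lives in the default package so that the
// native symbol is Java_HelloFFI_myHelloFFI, as in the generated header.
final class HelloFFI {

    // Step A: declared here, implemented in C.
    static native void myHelloFFI();

    /** Step E: tries to load the shared library; returns false if it cannot be found. */
    static boolean loadNative(String libName) {
        try {
            // Searches the directories listed in -Djava.library.path=...
            System.loadLibrary(libName);
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // "helloffi" is an assumed library name (libhelloffi.so on Linux).
        if (loadNative("helloffi")) {
            myHelloFFI();
        } else {
            System.out.println("native library not found; check -Djava.library.path");
        }
    }
}
```

Without the library on the path, the program only prints the hint. Calling myHelloFFI without a successful load would itself throw UnsatisfiedLinkError.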

C programs call Java methods

The above is an example of a Java method calling a C function. Using JNI, we can also call a Java method from a C program, which involves two concepts: the Invocation API and JNI functions. The following code example omits the initialization steps and shows only the two steps that perform the actual call.

// Init jvm ...

// Get method id
jmethodID mainID = (*env)->GetStaticMethodID(env, mainClass, "main", "([Ljava/lang/String;)V");

/* Invoke */
(*env)->CallStaticVoidMethod(env, mainClass, mainID, mainArgs);

In this example, we first get the ‘ID’ of the method using GetStaticMethodID and then call the method using CallStaticVoidMethod, both of which are JNI functions.
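The last argument of GetStaticMethodID is a JVM method descriptor, the same notation that appears as "Signature: ()V" in the generated header earlier. As a small sketch, the standard java.lang.invoke.MethodType class can produce such descriptors from Java types; the class name JniDescriptors and the helper descriptor are hypothetical names for this example.

```java
import java.lang.invoke.MethodType;

// Hypothetical helper for illustration only.
final class JniDescriptors {

    /** Builds the JVM method descriptor for the given return and parameter types. */
    static String descriptor(Class<?> returnType, Class<?>... parameterTypes) {
        return MethodType.methodType(returnType, parameterTypes).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        // public static void main(String[] args)
        System.out.println(descriptor(void.class, String[].class));
        // static native void myHelloFFI()
        System.out.println(descriptor(void.class));
    }
}
```

Getting this string wrong in C is only detected at run time, when the lookup fails, which is one reason hand-written JNI code is error-prone.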

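As we mentioned earlier, when we use the java command to run a Java program, we go through exactly this process: the launcher, written in C, looks up the main method and invokes it through JNI.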

So what exactly is JNI? Here is my understanding.

  • First, JNI is the Java interface specification for cross-language access, primarily targeting C, C++, and assembly (why no other languages? I personally think these languages were enough to cover most scenarios when the specification was designed)
  • The specification takes the mainstream virtual machine implementation (HotSpot) into account, but it is not bound to any specific implementation. In other words, the cross-language part of a Java program can theoretically run on any virtual machine that implements the specification
  • The specification defines how other languages can access Java objects, classes, methods, and exceptions and start virtual machines, and how Java programs can call other languages (C, C++, assembly)
  • In terms of actual usage and runtime behavior, it can be summed up in one sentence: powerful, but slow, hard to use, and error-prone

Java Native Access & Java Native Runtime

From the introduction to the Java Native Interface above, we can see that calling C from a Java method through JNI involves many tedious steps. To reduce the difficulty of Java cross-language programming (specifically, Java calling C/C++ programs), the open source community created the Java Native Access (JNA) and Java Native Runtime (JNR) projects. Both technologies are still built on JNI underneath and therefore do not outperform JNI at run time.

With JNA/JNR wrapping C/C++ programs, developers can implement cross-language calls quickly without having to generate or write the underlying glue code themselves. They also provide other features, such as crash protection (more on that later). In its implementation, JNR dynamically generates stubs to optimize runtime performance.

The relationship between JNA/JNR and JNI is as follows:

Here is an official JNR example. First create the LibC interface to encapsulate the target C function, then call the LibraryLoader API to create a specific instance of LibC, and finally complete the call through the interface:

import jnr.ffi.LibraryLoader;

public class HelloWorld {
    public interface LibC { // A representation of libC in Java
        int puts(String s); // mapping of the puts function, in C `int puts(const char *s);`
    }

    public static void main(String[] args) {
        LibC libc = LibraryLoader.create(LibC.class).load("c"); // load the "c" library into the libc variable

        libc.puts("Hello World!"); // prints "Hello World!" to console
    }
}

Unfortunately, JNA and JNR are not friendly to C++ support, so their use in scenarios where C++ libraries are called is limited.

JavaCPP

The missing bridge between Java and native C++

If JNA/JNR optimizes the programming experience of Java calling C, JavaCPP aims to optimize the programming experience of Java calling C++, and it is currently the most widely used SDK of this kind in the industry.

JavaCPP already supports most C++ features, such as overloaded operators, class and function templates, callbacks through function pointers, and so on. Like JNA/JNR, JavaCPP is based on JNI underneath, with the glue code and build scripts generated automatically through annotation processing and other mechanisms.

In addition, the project provides presets for some common C++ libraries built with JavaCPP, such as LLVM, Caffe, etc.

Here is an example of encapsulating std::vector using JavaCPP:

@Platform(include="<vector>")
public class VectorTest {

    @Name("std::vector<std::vector<void*> >")
    public static class PointerVectorVector extends Pointer {
        static { Loader.load(); }
        public PointerVectorVector()          { allocate();  }
        public PointerVectorVector(long n)    { allocate(n); }
        public PointerVectorVector(Pointer p) { super(p); } // this = (vector<vector<void*> >*)p
        /** other methods .... */
        public native @Index void resize(long i, long n);  // (*this)[i].resize(n)

        public native @Index Pointer get(long i, long j);  // return (*this)[i][j]
        public native void put(long i, long j, Pointer p); // (*this)[i][j] = p
    }

    public static void main(String[] args) {
        PointerVectorVector v = new PointerVectorVector(13);
        v.resize(0, 42); // v[0].resize(42)
        Pointer p = new Pointer() { { address = 0xDEADBEEFL; } };
        v.put(0, 0, p);  // v[0][0] = p

        PointerVectorVector v2 = new PointerVectorVector().put(v);
        Pointer p2 = v2.get(0).get(); // p2 = *(&v[0][0])
        System.out.println(v2.size() + " " + v2.size(0) + " " + p2);

        v2.at(42);
    }
}

Graal & Panama

Graal and Panama are two relatively active community projects that are directly related to cross-language programming. However, these two technologies have not yet been proven on a large scale in a production environment. I will not describe them here, but I will introduce them separately if I have the opportunity.

FBJNI

FBJNI (Github.com/facebookinc…) is a framework open-sourced by Facebook that helps C++ developers use JNI. Most of what has been covered so far is about letting Java users access native methods quickly. In cross-language scenarios, however, C++ users also need to access Java code safely and conveniently, which is what FBJNI addresses. Alibaba FFI currently focuses on how to make Java access C++ quickly. For example, suppose the requirement is for C++ users to work with Java's List interface. Instead of using JNI functions to manipulate Java List objects, the Alibaba FFI approach is to turn C++ std::vector into a Java interface through FFI wrappers.

JNI overhead

Inlining

At the heart of the JVM’s high performance is the powerful just-in-time (JIT) compiler built into it. The JIT compiles hot methods at run time into executable code, making them run directly (without interpreting bytecode execution). Many optimization techniques are applied during compilation, and inlining is one of the most important. Simply put, inlining is embedding the execution logic of the called method into the caller’s logic, which not only eliminates the overhead of method calls, but also allows for more program optimization.
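As a rough illustration of what inlining does, the sketch below shows a hot loop as a programmer would write it and a hand-inlined version of the same loop. This is only a source-level analogy written for this article: the real JIT works on its internal representation rather than on Java source, and the class and method names here are hypothetical.

```java
// Source-level analogy of JIT inlining; names are illustrative only.
final class InliningSketch {

    static int square(int x) {
        return x * x;
    }

    /** As written: one method call per loop iteration. */
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += square(i);
        }
        return sum;
    }

    /** After inlining square(): the call is gone and the body sits in the loop. */
    static long sumOfSquaresInlined(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i * i;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumOfSquares(1000) == sumOfSquaresInlined(1000));
    }
}
```

Once the callee's body is visible inside the caller, further optimizations such as constant folding or loop transformations can be applied across what used to be a call boundary. If square were a native method, the JIT would have no body to embed.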

However, in the current HotSpot implementation, the JIT only supports inlining Java methods, so if a Java method calls a native method, inlining cannot be applied to that native method.

At this point, some readers may wonder whether frequently used native methods, such as System.currentTimeMillis, really cannot be inlined. In fact, HotSpot uses intrinsics to improve call performance for native methods that are used often in applications (non-native methods can also be intrinsics). Intrinsics are similar to built-ins: when the JIT encounters a call to such a method, it can embed the method's implementation directly in the generated code. However, adding intrinsic support for a method usually requires modifying the JVM itself.

Parameter passing

Another overhead of JNI is parameter passing (including return values). Because methods and functions in different languages use different calling conventions, parameters have to be rearranged when a Java method calls a native method, as shown in the following figure (for the x64 platform):

According to the JNI specification, the JVM first puts JNIEnv* into the first parameter register (RDI) and then places the remaining parameters, including this (the receiver), into their respective registers. To make this process as fast as possible, HotSpot dynamically generates an efficient conversion stub for each method signature.

State switch

State switching is involved both when entering a native method from a Java method and when returning to the Java method after the native method finishes, as shown in the diagram below:

In implementation, state switching requires the introduction of memory barriers and Safepoint checks.

Object access

Another overhead of JNI exists in accessing Java objects in native methods.

Imagine that we need to access a Java object in a C function. The most brute-force approach is to obtain a pointer to the object and access it directly. However, Java objects can be moved by the GC, so a mechanism is needed to make object access in native methods independent of the object's address. As the saying goes, all problems in CS can be solved by another level of indirection: JNI functions are used to access objects, which solves this problem but, of course, also introduces overhead.
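The idea of indirection can be sketched with a toy handle table. This is a simplified model written for illustration and is not how HotSpot actually implements JNI references; the class HandleTable and its methods are hypothetical. Native code would hold only the handle, and every access goes through the table, so relocating the object only requires updating one slot.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of handle-based object access; not HotSpot's real data structure.
final class HandleTable {
    private final List<Object> slots = new ArrayList<>();

    /** Registers an object and returns a stable handle (an index, not an address). */
    int newHandle(Object obj) {
        slots.add(obj);
        return slots.size() - 1;
    }

    /** Resolves a handle to wherever the object currently lives. */
    Object resolve(int handle) {
        return slots.get(handle);
    }

    /** Simulates the GC moving an object: only the table slot changes. */
    void relocate(int handle, Object movedCopy) {
        slots.set(handle, movedCopy);
    }

    public static void main(String[] args) {
        HandleTable table = new HandleTable();
        int h = table.newHandle(new StringBuilder("payload"));
        // The "GC" copies the object elsewhere and patches the table.
        table.relocate(h, new StringBuilder("payload"));
        // The holder of h never noticed the move.
        System.out.println(table.resolve(h));
    }
}
```

The price is visible in resolve: every access pays for an extra lookup, which is the kind of overhead described above.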

From the previous introduction, we know that there are two main problems with the current mainstream Java cross-language programming technology:

1. Programming difficulty

2. The cost of cross-language communication

For problem 1, we can use technologies such as JNA/JNR and JavaCPP to solve it. Do we have a corresponding optimization plan for problem 2?

The following is a formal introduction to Alibaba FFI project

Alibaba FFI

Overview

Alibaba FFI project is dedicated to solving the problems encountered in Java cross-language programming. Overall, the project is divided into the following two modules:

A. FFI (Solve programming difficulty problems)

  • A set of Java annotations and types
  • Contains an annotation processor for generating glue code
  • Runtime support

B. LLVM4JNI (Solve runtime overhead problem)

  • Translates bitcode to bytecode, breaking the boundary between Java methods and native functions
  • Defined as pure Java interfaces based on FFI; the bottom layer depends on LLVM, whose C++ API is accessed through FFI

At present, Alibaba FFI mainly targets C++, and C++ is also used as the target language in the rest of this article.

Workflow for cross-language programming via Alibaba FFI:

1. A header file containing the C++ API declarations that the user needs to use

2. Encapsulate C++ API with Java language. At present, this step still needs to be carried out manually

3. Glue code generated by the annotation processor in FFI: both Java layer and native layer code

4. Concrete implementation of the library

5. Client applications load the products of the preceding steps at run time

Note: The solid line represents the relationship between the source code and the product in the pre-run phase, and the dotted line represents the relationship between the application and the library and product in the run phase

FFI

The FFI module provides a set of annotations and types used to encapsulate interfaces in other languages. At the top of the type hierarchy is the FFIType interface (FFI stands for Foreign Function Interface), as you can see in the figure below. In the concrete implementation for C++, an underlying C++ object maps to a Java object, so the Java object needs to hold the address of the C++ object. Since C++ objects are not moved, we can store raw pointers directly in Java objects.

In essence, FFI modules generate relevant code needed in cross-language calls through annotation processors. Users only need to rely on FFI related libraries (plug-ins) and encapsulate the target functions needed to be called with FFI API.

Example

Here is the procedure for encapsulating std::vector.

A. Encapsulate the underlying functions that need to be called with annotations and types

@FFIGen(library = "stdcxx-demo")
@CXXHead(system = {"vector", "string"})
@FFITypeAlias("std::vector")
@CXXTemplate(cxx="jint", java="Integer")
@CXXTemplate(cxx="jbyte", java="Byte")
public interface StdVector<E> extends CXXPointer {


    @FFIFactory
    interface Factory<E> {
        StdVector<E> create();
    }


    long size();


    @CXXOperator("[]") @CXXReference E get(long index);
    @CXXOperator("[]") void set(long index, @CXXReference E value);
    void push_back(@CXXValue E e);


    long capacity();
    void reserve(long size);
    void resize(long size);
}

FFIGen: Specifies the name of the final build library

CXXHead: The header file that glue code relies on

FFITypeAlias: C++ class name

CXXTemplate: implements the mapping of C++ template parameter concrete types to Java types. Alibaba FFI provides a more flexible configuration than JavaCPP

B. During compilation, the annotation processor generates the actual implementation of the interface, which is what the final call relies on:

public class StdVector_cxx_0x6b0caae2 extends FFIPointerImpl implements StdVector<Byte> {
  static {
    FFITypeFactory.loadNativeLibrary(StdVector_cxx_0x6b0caae2.class, "stdcxx-demo");
  }

  public StdVector_cxx_0x6b0caae2(final long address) {
    super(address);
  }

  public long capacity() {
    return nativeCapacity(address);
  }

  public static native long nativeCapacity(long ptr);

  ...

  public long size() {
    return nativeSize(address);
  }

  public static native long nativeSize(long ptr);
}

JNI glue code:

#include <jni.h>
#include <new>
#include <vector>
#include <string>
#include "stdcxx_demo.h"

#ifdef __cplusplus
extern "C" {
#endif

JNIEXPORT
jbyte JNICALL Java_com_alibaba_ffi_samples_StdVector_1cxx_10x6b0caae2_nativeGet(JNIEnv* env, jclass cls, jlong ptr, jlong arg0 /* index0 */) {
  return (jbyte)((*reinterpret_cast<std::vector<jbyte>*>(ptr))[arg0]);
}

JNIEXPORT
jlong JNICALL Java_com_alibaba_ffi_samples_StdVector_1cxx_10x6b0caae2_nativeSize(JNIEnv* env, jclass cls, jlong ptr) {
  return (jlong)(reinterpret_cast<std::vector<jbyte>*>(ptr)->size());
}

...

#ifdef __cplusplus
}
#endif
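The _1 sequences in the generated function names above come from the JNI rules for mapping a Java method to a native symbol: the name starts with Java_, package separators become underscores, and a literal underscore is escaped as _1. The following sketch reproduces those rules for short (non-overloaded) names. It is my own illustration and not part of Alibaba FFI; JniNames, mangle, and symbol are hypothetical names.

```java
// Illustrative implementation of JNI short-name mangling; not Alibaba FFI code.
final class JniNames {

    /** Escapes a class or method name according to the JNI naming rules. */
    static String mangle(String name) {
        StringBuilder sb = new StringBuilder();
        for (char c : name.toCharArray()) {
            if (c == '.' || c == '/') {
                sb.append('_');                  // package separator
            } else if (c == '_') {
                sb.append("_1");                 // literal underscore
            } else if (c == ';') {
                sb.append("_2");
            } else if (c == '[') {
                sb.append("_3");
            } else if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9')) {
                sb.append(c);
            } else {
                sb.append(String.format("_0%04x", (int) c)); // any other character
            }
        }
        return sb.toString();
    }

    /** Native symbol for a non-overloaded native method. */
    static String symbol(String className, String methodName) {
        return "Java_" + mangle(className) + "_" + mangle(methodName);
    }

    public static void main(String[] args) {
        System.out.println(symbol("HelloFFI", "myHelloFFI"));
        System.out.println(symbol("com.alibaba.ffi.samples.StdVector_cxx_0x6b0caae2", "nativeSize"));
    }
}
```

Having a generator produce these names, as the annotation processor does, removes one of the more tedious and error-prone parts of writing JNI code by hand.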

Crash Protection

During the project's evolution, we introduced some optimization mechanisms, such as handling of temporary objects returned by C++ functions and exception conversion. Here we introduce crash protection, which came out of a problem customers hit in real scenarios and is also handled in JNA and JNR. Sometimes the C++ library that a Java application depends on needs to be upgraded. To prevent bugs in the C++ library from crashing the entire application (bugs in Java usually surface as exceptions and in most cases do not bring down the whole application), we need a protection mechanism. Consider the following:

JNIEXPORT void JNICALL Java_Demo_crash(JNIEnv* env, jclass) {
  void* addr = 0;
  *(int*)addr = 0; // (Crash)
}

In line 3, an invalid memory access occurs (a write through a null pointer), and the application will crash if no special handling is done. To isolate this problem, we introduced a protection mechanism. Its implementation on Linux is as follows:

PROTECTION_START // macro inserted in the generated glue code
void* addr = 0;
*(int*)addr = 0;

// Pseudo code
// register signal handlers
signal(sig, signal_handler);
int sigsetjmp_rv;
sigsetjmp_rv = sigsetjmp(acquire_sigjmp_buf(), 1);
if (!sigsetjmp_rv) {
  void* addr = 0;
  *(int*)addr = 0;
}
release_sigjmp_buf();
// restore handler
...
if (sigsetjmp_rv) {
  handle_crash(env, sigsetjmp_rv);
}

Crash protection is implemented with our own signal handler together with the sigsetjmp/siglongjmp mechanism. Note that HotSpot installs its own signal handlers (for safepoint checks, implicit null checks, and so on). To prevent conflicts, libjsig.so must be preloaded at startup (on Linux). Finally, we can throw a Java exception in handle_crash for subsequent investigation and analysis.

Comparison of related projects

LLVM4JNI

LLVM4JNI translates bitcode to bytecode, so that a native function is converted into a Java method, eliminating the overheads mentioned earlier. The translation is completed before the application runs, and its core is to implement bitcode semantics with bytecode. This article does not go into the implementation details (these will be covered after the project is open-sourced). The translation results for some simple programs are shown below.

1. Four simple arithmetic operations:

  • source

    int v1 = i + j;
    int v2 = i - j;
    int v3 = i * j;
    int v4 = i / j;
    return v1 + v2 + v3 + v4;

  • bitcode

    %5 = sdiv i32 %2, %3
    %6 = add i32 %3, 2
    %7 = mul i32 %6, %2
    %8 = add nsw i32 %5, %7
    ret i32 %8

  • bytecode

    Code:
      stack=2, locals=6, args_size=3
         0: iload_1
         1: iload_2
         2: idiv
         3: istore_3
         4: iload_2
         5: ldc           #193                // int 2
         7: iadd
         8: iload_1
         9: imul
        10: istore        5
        12: iload_3
        13: iload         5
        15: iadd
        16: ireturn

2. JNI function conversion: more than 90 JNI functions are currently supported. In the future, this functionality will be integrated with frameworks such as FBJNI to break down the boundary between Java and native code and eliminate the extra overhead of method calls.

  • source

    jclass objectClass = env->FindClass("java/util/List");

    return env->IsInstanceOf(arg, objectClass);

  • bytecode

    Code:
      stack=1, locals=3, args_size=2
         0: ldc           #205                // class java/util/List
         2: astore_2
         3: aload_1
         4: instanceof    #205                // class java/util/List
         7: i2b
         8: ireturn

3. C++ object access. Another benefit of Alibaba FFI is that Java off-heap applications can be developed in an object-oriented manner (C++ is an object-oriented language). Currently, most Java-based big data platforms need data modules that support off-heap storage to reduce garbage collection pressure. However, manually developed off-heap modules must carefully handle the underlying offsets and alignment on different platforms and architectures, which is error-prone and time-consuming. With Alibaba FFI, we can develop the object model in C++ and expose it to Java users through Alibaba FFI.

  • source

    class Pointer {
     public:
      int _x;
      int _y;

      Pointer(): _x(0), _y(0) {}

      const int x() { return _x; }
      const int y() { return _y; }
    };

    JNIEXPORT jint JNICALL Java_Pointer_1cxx_10x4b57d61d_nativeX(JNIEnv*, jclass, jlong ptr) {
      return (jint)(reinterpret_cast<Pointer*>(ptr)->x());
    }

    JNIEXPORT jint JNICALL Java_Pointer_1cxx_10x4b57d61d_nativeY(JNIEnv*, jclass, jlong ptr) {
      return (jint)(reinterpret_cast<Pointer*>(ptr)->y());
    }

  • bitcode

    define i32 @Java_Pointer_1cxx_10x4b57d61d_nativeX
    %4 = inttoptr i64 %2 to %class.Pointer*
    %5 = getelementptr inbounds %class.Pointer, %class.Pointer* %4, i64 0, i32 0
    %6 = load i32, i32* %5, align 4, !tbaa !3
    ret i32 %6

    define i32 @Java_Pointer_1cxx_10x4b57d61d_nativeY
    %4 = inttoptr i64 %2 to %class.Pointer*
    %5 = getelementptr inbounds %class.Pointer, %class.Pointer* %4, i64 0, i32 1
    %6 = load i32, i32* %5, align 4, !tbaa !8
    ret i32 %6

  • bytecode

    public int y();
      descriptor: ()I
      flags: ACC_PUBLIC
      Code:
        stack=2, locals=1, args_size=1
           0: aload_0
           1: getfield      #36                 // Field address:J
           4: invokestatic  #84                 // Method nativeY:(J)I
           7: ireturn
        LineNumberTable:
          line 70: 0

    public static int nativeY(long);
      descriptor: (J)I
      flags: ACC_PUBLIC, ACC_STATIC
      Code:
        stack=4, locals=2, args_size=1
           0: lload_0
           1: ldc2_w        #85                 // long 4l
           4: ladd
           5: invokestatic  #80                 // Method com/alibaba/llvm4jni/runtime/JavaRuntime.getInt:(J)I
           8: ireturn

  • JavaRuntime

    public class JavaRuntime {
      public static final Unsafe UNSAFE;
      ...
      public static int getInt(long address) {
        return UNSAFE.getInt(address);
      }
      ...
    }

When implementing field access on C++ objects, we use the Unsafe API to access off-heap memory directly, avoiding calls to native methods.
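The three translation results above can be mirrored in plain Java to see what the generated bytecode computes. The sketch below is hand-written for illustration and is not LLVM4JNI output. TranslationSketch and its method names are hypothetical, the mapping of %2 and %3 to i and j is inferred from the listings, and the Unsafe part assumes a JDK on which sun.misc.Unsafe memory access is still permitted.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;
import sun.misc.Unsafe;

// Hand-written Java equivalents of the three translation examples.
final class TranslationSketch {
    static final Unsafe UNSAFE = loadUnsafe();

    private static Unsafe loadUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available", e);
        }
    }

    /** Example 1, as written in the C++ source. */
    static int sourceForm(int i, int j) {
        int v1 = i + j;
        int v2 = i - j;
        int v3 = i * j;
        int v4 = i / j;
        return v1 + v2 + v3 + v4;
    }

    /** Example 1, in the simplified shape of the bitcode and bytecode: i / j + (j + 2) * i. */
    static int translatedForm(int i, int j) {
        return i / j + (j + 2) * i;
    }

    /** Example 2: FindClass plus IsInstanceOf collapses to an instanceof check. */
    static boolean isList(Object arg) {
        return arg instanceof List;
    }

    /** Example 3: nativeY reads the int stored 4 bytes into the object (_y follows _x). */
    static int nativeY(long address) {
        return UNSAFE.getInt(address + 4L);
    }

    /** Lays out two ints off-heap like the C++ Pointer class and reads _y back. */
    static int readY(int x, int y) {
        long p = UNSAFE.allocateMemory(8);
        try {
            UNSAFE.putInt(p, x);
            UNSAFE.putInt(p + 4L, y);
            return nativeY(p);
        } finally {
            UNSAFE.freeMemory(p);
        }
    }

    public static void main(String[] args) {
        System.out.println(sourceForm(7, 3) == translatedForm(7, 3));
        System.out.println(isList(new ArrayList<String>()));
        System.out.println(readY(3, 7));
    }
}
```

The two arithmetic forms agree for any nonzero j because Java int arithmetic wraps around, so (i + j) + (i - j) + i * j equals (j + 2) * i. In all three cases the translated method contains no native call, so the JIT can treat it like any other Java method.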

Performance data

The performance data for Grape's SSSP (single-source shortest path) algorithm implemented with Alibaba FFI are as follows:

Here are three modes to compare:

  • Pure C++ implementation
  • Java implementation based on Alibaba FFI with LLVM4JNI turned off, so none of the additional JNI overhead is eliminated
  • Java implementation based on Alibaba FFI with LLVM4JNI enabled, so some of the extra overhead of native methods is eliminated

Here, we use the completion time of the algorithm as the metric and normalize the results, taking the C++ running time as the unit.

Conclusion

Cross-language programming is an important direction in modern programming languages, and the community offers many solutions for communication between different languages. At present, Alibaba FFI mainly targets C++. In the future, we will try to implement and optimize communication between Java and other languages. The project will also be officially open-sourced.


This article is the original content of Aliyun and shall not be reproduced without permission.