This section focuses on how JNI maps Java types to native C types.

1. Primitive types

The following table lists the Java primitive types and their machine-dependent native equivalents.

Java type   Native type   Description
boolean     jboolean      unsigned 8 bits
byte        jbyte         signed 8 bits
char        jchar         unsigned 16 bits
short       jshort        signed 16 bits
int         jint          signed 32 bits
long        jlong         signed 64 bits
float       jfloat        32 bits
double      jdouble       64 bits
void        void          N/A

For convenience, the following definitions are provided.

#define JNI_FALSE  0 
#define JNI_TRUE   1 
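
Because jboolean is only 8 bits wide, a C truth value such as 256 turns into 0 if it is simply narrowed. The following is a minimal sketch of that pitfall and one way around it. It deliberately avoids <jni.h> so that it compiles on its own: `sketch_jboolean`, `SKETCH_JNI_TRUE`, `SKETCH_JNI_FALSE`, `to_jboolean`, and `truncate_to_jboolean` are hypothetical stand-ins, not part of the JNI.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins so the sketch compiles without <jni.h>; real native code
   would include <jni.h> and use jboolean, JNI_TRUE and JNI_FALSE. */
typedef uint8_t sketch_jboolean;      /* unsigned 8 bits, as in the table */
#define SKETCH_JNI_FALSE 0
#define SKETCH_JNI_TRUE  1

static_assert(sizeof(sketch_jboolean) == 1, "stand-in must be one byte");

/* Hypothetical helper: map any C truth value to exactly 0 or 1. */
static sketch_jboolean to_jboolean(int c_truth)
{
    return c_truth ? SKETCH_JNI_TRUE : SKETCH_JNI_FALSE;
}

/* Hypothetical helper showing the pitfall: a plain narrowing conversion
   keeps only the low 8 bits, so 256 (true in C) becomes 0. */
static sketch_jboolean truncate_to_jboolean(int c_truth)
{
    return (sketch_jboolean)c_truth;
}
```

With these helpers, `to_jboolean(256)` yields 1, while `truncate_to_jboolean(256)` yields 0.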

The jsize integer type is used to describe cardinal indices and sizes:

typedef jint jsize; 

2. Reference types

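The JNI includes a number of reference types that correspond to different kinds of Java objects. The JNI reference types are organized in the following hierarchy:

jobject
    jclass (java.lang.Class objects)
    jstring (java.lang.String objects)
    jarray (arrays)
        jobjectArray (object arrays)
        jbooleanArray (boolean arrays)
        jbyteArray (byte arrays)
        jcharArray (char arrays)
        jshortArray (short arrays)
        jintArray (int arrays)
        jlongArray (long arrays)
        jfloatArray (float arrays)
        jdoubleArray (double arrays)
    jthrowable (java.lang.Throwable objects)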

In C, all other JNI reference types are defined to be the same as jobject. For example:

typedef jobject jclass; 

In C++, the JNI introduces a set of dummy classes to enforce the subtyping relationships. For example:

class _jobject {};
class _jclass : public _jobject {};
...
typedef _jobject *jobject;
typedef _jclass *jclass;

3. Field and method IDs

Method and field IDs are regular C pointer types:

struct _jfieldID;              /* opaque structure */ 
typedef struct _jfieldID *jfieldID;   /* field IDs */ 
 
struct _jmethodID;              /* opaque structure */ 
typedef struct _jmethodID *jmethodID; /* method IDs */ 

4. Value types

The jvalue union type is used as the element type in argument arrays. It is declared as follows:

typedef union jvalue { 
    jboolean z; 
    jbyte    b; 
    jchar    c; 
    jshort   s; 
    jint     i; 
    jlong    j; 
    jfloat   f; 
    jdouble  d; 
    jobject  l; 
} jvalue; 

5. Type signatures

The JNI uses the Java VM's representation of type signatures. The following table shows these type signatures.

Type signature               Java type
Z                            boolean
B                            byte
C                            char
S                            short
I                            int
J                            long
F                            float
D                            double
L fully-qualified-class ;    fully-qualified-class
[ type                       type[]
( arg-types ) ret-type       method type

For example, the Java method:

long f (int n, String s, int[] arr); 

has the following type signature:

(ILjava/lang/String;[I)J
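
To make the grammar in the table concrete, here is a minimal sketch that walks a method signature and counts its arguments. `skip_type` and `count_args` are hypothetical helper names, not JNI functions. The sketch also accepts `V` as a return type, which is how the Java VM writes a void return (for example `()V`) even though the table above does not list it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: skip one field type starting at p and return a
   pointer just past it, or NULL if the signature is malformed. */
static const char *skip_type(const char *p)
{
    assert(p != NULL);
    while (*p == '[')                 /* any number of array dimensions */
        p++;
    switch (*p) {
    case 'Z': case 'B': case 'C': case 'S':
    case 'I': case 'J': case 'F': case 'D':
        return p + 1;                 /* primitive: a single letter */
    case 'L':                         /* class: L fully-qualified-class ; */
        while (*p != '\0' && *p != ';')
            p++;
        return *p == ';' ? p + 1 : NULL;
    default:
        return NULL;
    }
}

/* Hypothetical helper: count the arguments of a method signature such
   as "(ILjava/lang/String;[I)J". Returns -1 on malformed input. */
static int count_args(const char *sig)
{
    int n = 0;
    assert(sig != NULL);
    if (*sig != '(')
        return -1;
    sig++;
    while (*sig != ')') {
        sig = skip_type(sig);         /* also rejects an unterminated list */
        if (sig == NULL)
            return -1;
        n++;
    }
    sig++;                            /* step over ')' to the return type */
    if (*sig == 'V')
        sig++;                        /* void return */
    else
        sig = skip_type(sig);
    return (sig != NULL && *sig == '\0') ? n : -1;
}
```

For the signature above, `count_args("(ILjava/lang/String;[I)J")` returns 3. A signature with a stray space after the semicolon is rejected with -1, because a space is not a valid type character.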

6. Modified UTF-8 strings

The JNI uses modified UTF-8 strings to represent various string types. These are the same strings that the Java VM uses. Modified UTF-8 is encoded so that a character sequence containing only non-null ASCII characters needs just one byte per character, while every Unicode character can still be represented.

All characters in the range \u0001 through \u007F are represented by a single byte, as follows:

    0 (bits 6-0)

The seven bits of data in the byte give the value of the character represented.

The null character ('\u0000') and characters in the range '\u0080' to '\u07FF' are represented by a pair of bytes x and y:

    x: 1 1 0 (bits 10-6)
    y: 1 0 (bits 5-0)

These bytes represent the character with the value ((x & 0x1f) << 6) + (y & 0x3f).

Characters in the range '\u0800' to '\uFFFF' are represented by three bytes x, y, and z:

    x: 1 1 1 0 (bits 15-12)
    y: 1 0 (bits 11-6)
    z: 1 0 (bits 5-0)

These bytes represent the character with the value ((x & 0xf) << 12) + ((y & 0x3f) << 6) + (z & 0x3f).

Characters with code points above U+FFFF (so-called supplementary characters) are represented by separately encoding the two surrogate code units of their UTF-16 representation. Each surrogate code unit is represented by three bytes. This means that supplementary characters are represented by six bytes u, v, w, x, y, and z:

    u: 1 1 1 0 1 1 0 1
    v: 1 0 1 0 ((bits 20-16) - 1)
    w: 1 0 (bits 15-10)
    x: 1 1 1 0 1 1 0 1
    y: 1 0 1 1 (bits 9-6)
    z: 1 0 (bits 5-0)

These six bytes represent the character with the value 0x10000 + ((v & 0x0f) << 16) + ((w & 0x3f) << 10) + ((y & 0x0f) << 6) + (z & 0x3f).
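
The decoding formulas above can be sketched in plain C as follows. `mutf8_decode` is a hypothetical helper, not a JNI function. It is a minimal illustration only: it does not reject overlong encodings, and a lone surrogate is decoded through the three-byte rule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one character from modified UTF-8.
   Stores the value in *cp and returns the number of bytes consumed,
   or 0 if the bytes at s do not start a valid sequence within len. */
static size_t mutf8_decode(const uint8_t *s, size_t len, uint32_t *cp)
{
    assert(s != NULL && cp != NULL);
    if (len == 0)
        return 0;

    /* Six-byte form u v w x y z: checked first, because u and x (0xED)
       would otherwise match the three-byte rule. */
    if (len >= 6 && s[0] == 0xED && (s[1] & 0xF0) == 0xA0 &&
        (s[2] & 0xC0) == 0x80 && s[3] == 0xED &&
        (s[4] & 0xF0) == 0xB0 && (s[5] & 0xC0) == 0x80) {
        *cp = 0x10000u + ((uint32_t)(s[1] & 0x0F) << 16)
                       + ((uint32_t)(s[2] & 0x3F) << 10)
                       + ((uint32_t)(s[4] & 0x0F) << 6)
                       + (uint32_t)(s[5] & 0x3F);
        return 6;
    }
    /* One byte: \u0001 through \u007F. A raw 0x00 byte is not valid. */
    if (s[0] >= 0x01 && s[0] <= 0x7F) {
        *cp = s[0];
        return 1;
    }
    /* Two bytes x y: \u0000 and \u0080 through \u07FF. */
    if ((s[0] & 0xE0) == 0xC0 && len >= 2 && (s[1] & 0xC0) == 0x80) {
        *cp = ((uint32_t)(s[0] & 0x1F) << 6) + (uint32_t)(s[1] & 0x3F);
        return 2;
    }
    /* Three bytes x y z: \u0800 through \uFFFF. */
    if ((s[0] & 0xF0) == 0xE0 && len >= 3 &&
        (s[1] & 0xC0) == 0x80 && (s[2] & 0xC0) == 0x80) {
        *cp = ((uint32_t)(s[0] & 0x0F) << 12)
            + ((uint32_t)(s[1] & 0x3F) << 6)
            + (uint32_t)(s[2] & 0x3F);
        return 3;
    }
    return 0;   /* includes the standard UTF-8 four-byte lead bytes */
}
```

For instance, the six bytes ED A0 BD ED B8 80 decode to U+1F600, and the pair C0 80 decodes to the null character.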

The bytes of multibyte characters are stored in the class file in big-endian order.

This format differs from the standard UTF-8 format in two ways.

First, the null character (char)0 is encoded using the two-byte format rather than the one-byte format. This means that modified UTF-8 strings never contain embedded null bytes.

Second, only the one-byte, two-byte, and three-byte formats of standard UTF-8 are used.

The Java VM does not recognize the four-byte format of standard UTF-8; it uses its own two-times-three-byte format instead.
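
Both differences show up in a small encoder. The following is a minimal sketch, assuming the caller supplies a buffer of at least six bytes; `put_unit` and `mutf8_encode` are hypothetical helper names, not JNI functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write one 16-bit value in the one-, two-, or
   three-byte form and return the number of bytes written. */
static size_t put_unit(uint32_t u, uint8_t *out)
{
    if (u >= 0x0001 && u <= 0x007F) {
        out[0] = (uint8_t)u;
        return 1;
    }
    if (u <= 0x07FF) {                /* includes 0x0000, giving C0 80 */
        out[0] = (uint8_t)(0xC0 | (u >> 6));
        out[1] = (uint8_t)(0x80 | (u & 0x3F));
        return 2;
    }
    out[0] = (uint8_t)(0xE0 | (u >> 12));
    out[1] = (uint8_t)(0x80 | ((u >> 6) & 0x3F));
    out[2] = (uint8_t)(0x80 | (u & 0x3F));
    return 3;
}

/* Hypothetical helper: encode one Unicode code point as modified UTF-8.
   out must have room for 6 bytes. Returns the number of bytes written. */
static size_t mutf8_encode(uint32_t cp, uint8_t *out)
{
    assert(out != NULL && cp <= 0x10FFFF);
    if (cp <= 0xFFFF)
        return put_unit(cp, out);

    /* Supplementary character: split into UTF-16 surrogates and encode
       each one as three bytes, instead of the standard four-byte form. */
    uint32_t v  = cp - 0x10000;
    uint32_t hi = 0xD800 + (v >> 10);
    uint32_t lo = 0xDC00 + (v & 0x3FF);
    size_t n = put_unit(hi, out);
    return n + put_unit(lo, out + n);
}
```

Encoding U+0000 produces the two bytes C0 80, and encoding U+1F600 produces the six bytes ED A0 BD ED B8 80, where standard UTF-8 would produce 00 and F0 9F 98 80 respectively.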