JDK version: OpenJDK 11

What is Unsafe?

The name is deliberately pointed: it bluntly warns developers that this is an “unsafe” class.

We know that Java differs from C in that the JVM sits in the middle: the average developer cannot manipulate memory directly from code, because the JVM does everything behind the scenes.

Unsafe, by contrast, defines low-level, unsafe operations. How low-level and how unsafe?

Unlike ordinary Java code, Unsafe can read and modify data in memory directly. This is faster, but it comes at the expense of the JVM’s ability to check and restrict access to objects and variables, much as reflection bypasses the protection of private fields.

This leads to its first characteristic: because performance is the top priority, its methods do not guarantee that input parameters are checked. When the runtime compiler optimizes a method of this class, some or all of the checks may be omitted.

So:

  • Callers must not rely on checks or corresponding exceptions for this class;
  • Restrict the use of this class to trusted code, typically JDK libraries.

Can we use Unsafe?

By that standard, the bugs I write are definitely not trusted code.

Before working around a restriction, the first thing to know is how the restriction works.

Restrictions on obtaining the Unsafe instance

Since JDK 9, sun.misc.Unsafe has lived in the jdk.unsupported module, and a new jdk.internal.misc.Unsafe takes over the functionality that sun.misc.Unsafe provided in JDK 8. The jdk.internal packages are not open to application code, so they cannot simply be imported.

Internally, sun.misc.Unsafe delegates to jdk.internal.misc.Unsafe. While jdk.internal.misc.Unsafe provides the more complete set of operations, sun.misc.Unsafe exposes only part of them.

jdk.internal.misc.Unsafe:

public final class Unsafe {

    private Unsafe() {}

    private static final Unsafe theUnsafe = new Unsafe();

    public static Unsafe getUnsafe() {
        return theUnsafe;
    }
}

sun.misc.Unsafe:

public final class Unsafe {

    private Unsafe() {}

    private static final Unsafe theUnsafe = new Unsafe();
    private static final jdk.internal.misc.Unsafe theInternalUnsafe = jdk.internal.misc.Unsafe.getUnsafe();

    @CallerSensitive
    public static Unsafe getUnsafe() {
        Class<?> caller = Reflection.getCallerClass();
        if (!VM.isSystemDomainLoader(caller.getClassLoader()))
            throw new SecurityException("Unsafe");
        return theUnsafe;
    }
}

// The check used above, from the JDK-internal VM class
public static boolean isSystemDomainLoader(ClassLoader loader) {
    return loader == null || loader == ClassLoader.getPlatformClassLoader();
}

When application code calls Unsafe.getUnsafe(), the method checks whether the calling class was loaded by the bootstrap loader (loader == null) or by the platform loader (previously called the extension loader). If it was not, a SecurityException is thrown.
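
To make the loader check concrete, here is a minimal sketch that rewrites the isSystemDomainLoader() condition with public API only and applies it to two classes. The class name LoaderCheckSketch is made up for illustration; the real check lives in a JDK-internal class that application code cannot call.

```java
// Illustrative sketch only; LoaderCheckSketch is a made-up name.
class LoaderCheckSketch {

    // Same condition as the JDK-internal isSystemDomainLoader(), using public API (JDK 9+).
    static boolean isSystemDomainLoader(ClassLoader loader) {
        return loader == null || loader == ClassLoader.getPlatformClassLoader();
    }

    public static void main(String[] args) {
        // Core classes such as String are loaded by the bootstrap loader, which is reported as null.
        System.out.println(isSystemDomainLoader(String.class.getClassLoader()));
        // An application class is loaded by the application loader, so the check fails.
        System.out.println(isSystemDomainLoader(LoaderCheckSketch.class.getClassLoader()));
    }
}
```

The second result is the reason an ordinary application class gets a SecurityException from getUnsafe().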

Obtaining the Unsafe instance anyway

One way is the JVM argument -Xbootclasspath:, which makes the bootstrap loader load the calling class.

However, this option is no longer supported as of JDK 9, and passing it makes the JVM fail to start.

So we use the other way: reflection.

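import java.lang.reflect.Field;
import sun.misc.Unsafe;

public static void main(String[] args) throws Exception {
    // theUnsafe is the private static singleton; reading it reflectively sidesteps the loader check in getUnsafe()
    Field f = Unsafe.class.getDeclaredField("theUnsafe");
    f.setAccessible(true);
    Unsafe unsafe = (Unsafe) f.get(null);
}
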
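
A later demo in this article calls UnsafeTest.reflectUnsafe() without showing it. The following is one assumed shape for that helper, wrapping the reflection steps above; the exception handling is my own choice, not something the original specifies.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Assumed implementation of the helper that a later demo calls as UnsafeTest.reflectUnsafe().
class UnsafeTest {

    static Unsafe reflectUnsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);        // bypass the private modifier
            return (Unsafe) f.get(null);  // static field, so the receiver is null
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("theUnsafe is not accessible", e);
        }
    }

    public static void main(String[] args) {
        Unsafe unsafe = reflectUnsafe();
        System.out.println("address size: " + unsafe.addressSize());
    }
}
```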
Accessing jdk.internal.misc.Unsafe

On JDK 11, module restrictions mean that ordinary code cannot reach jdk.internal.misc.Unsafe.

But there is a way to fight magic with magic: relax the module access restrictions.

Add --add-opens java.base/jdk.internal.misc=ALL-UNNAMED to the VM arguments, and the code can then call jdk.internal.misc.Unsafe.getUnsafe() directly.

Using jdk.internal.misc.Unsafe is not recommended, for two reasons.

First, lifting the module restriction lowers security.

Second, compared with sun.misc.Unsafe, jdk.internal.misc.Unsafe exposes even more low-level operations, so it is even less safe.

Recommendations on using Unsafe

Content from Reference 1.

  1. Unsafe is an internal implementation, and its details may change in future JDK versions, so applications that use Unsafe may fail to run on newer JDK versions.

  2. Many Unsafe methods take a raw memory address, or an object plus an offset, and the caller must compute the offset. If it is wrong, the resulting error is at the JVM level: the entire JVM instance crashes and takes the application with it. The JVM is itself native software, and accessing a memory address that does not exist crashes a native program.

  3. Unsafe offers direct memory access. Memory obtained this way is not managed by the JVM (it is not garbage collected) and must be freed manually, which makes it a potential source of memory leaks.

What can Unsafe do, and how is it used?

Note: there is a lot of ground to cover.

1. Reading and writing object fields by offset

There are put and get methods for primitive types, objects, and native pointer addresses.

The operations in this section mainly target the JVM heap.

Primitive types

int, boolean, long, byte, float, double, short, char

Get:

// Reads the value at the given offset within o; if o is null, offset is treated as an absolute memory address.
@HotSpotIntrinsicCandidate
public native int getInt(Object o, long offset);

public int getInt(long address) {
    return getInt(null, address);
}

Unsafe methods are unchecked, so what ensures that a value is read with the right type and that the result is well defined?

It comes down to the parameters:

  • o is not null, and offset was obtained by calling Unsafe.objectFieldOffset() on the reflected Field of o's class (it gives the field's offset within the object). In addition, the class of o must be compatible with the class that declares the field. From here on, a method mentioned without a class name belongs to Unsafe.
  • For static fields, o (which may or may not be null) and offset are obtained from staticFieldBase() (the base that holds the static field) and staticFieldOffset() (the static field's offset within that base), respectively.
  • o is an array, and offset is an integer of the form B + N * S. N is a valid index into the array, that is, the N-th element; B is the offset of the first element from the start of the array object, obtained from arrayBaseOffset(); S is the amount of memory one element occupies, obtained from arrayIndexScale().
  • Most specially, when o == null or getInt(long address) is used, offset is an absolute memory address.
    • If the address is zero, or does not point into a block obtained from allocateMemory(), the result is undefined.

Define a model class for the experiments; it is reused below without further explanation:

public class Demo {
    static int n1 = 1;
    int n2 = 2;
    Integer n3 = 3;
}

private static void _getInt(Unsafe unsafe) throws NoSuchFieldException {
    Demo o = new Demo();
    // Case 1: instance field, addressed by object + field offset
    int i1 = unsafe.getInt(o, unsafe.objectFieldOffset(Demo.class.getDeclaredField("n2")));
    System.out.println("Case one:" + i1);
    // Case 2: static field, addressed by static base + static offset
    Field n1 = Demo.class.getDeclaredField("n1");
    int i2 = unsafe.getInt(unsafe.staticFieldBase(n1), unsafe.staticFieldOffset(n1));
    System.out.println("Second case:" + i2);
    // Case 3: array element, offset = B + N * S
    int[] ns = {9, 8, 7, 6};
    int i3 = unsafe.getInt(ns, unsafe.arrayBaseOffset(ns.getClass()) + 3 * unsafe.arrayIndexScale(ns.getClass()));
    System.out.println("Third case:" + i3);
    // Case 4: absolute memory address
    // VM.current().addressOf(o) comes from JOL; dependency: compile("org.openjdk.jol:jol-core:0.9")
    // Fragile: the GC may move o between addressOf() and the read
    int i4 = unsafe.getInt(VM.current().addressOf(o) + unsafe.objectFieldOffset(Demo.class.getDeclaredField("n2")));
    System.out.println("Fourth case:" + i4);
}

// Result:
// Case one:2
// Second case:1
// Third case:6
// Fourth case:2

Put:

@HotSpotIntrinsicCandidate
public native void putInt(Object o, long offset, int x);

Stores x at the location identified by o and offset. The four cases for o and offset are the same as for getInt(Object o, long offset).

private static void _putInt(Unsafe unsafe) throws NoSuchFieldException {
    Demo o = new Demo();
    // Case 1: instance field
    unsafe.putInt(o, unsafe.objectFieldOffset(Demo.class.getDeclaredField("n2")), o.n2 + 1);
    System.out.println("Case one:" + o.n2);
    // Case 2: static field
    Field n1 = Demo.class.getDeclaredField("n1");
    unsafe.putInt(unsafe.staticFieldBase(n1), unsafe.staticFieldOffset(n1), Demo.n1 + 1);
    System.out.println("Second case:" + Demo.n1);
    // Case 3: array element
    int[] ns = {9, 8, 7, 6};
    unsafe.putInt(ns, unsafe.arrayBaseOffset(ns.getClass()) + 3 * unsafe.arrayIndexScale(ns.getClass()), ns[3] - 1);
    System.out.println("Third case:" + ns[3]);
    // Case 4: absolute memory address (n2 is already 3 after case 1, so it becomes 4)
    unsafe.putInt(null, VM.current().addressOf(o) + unsafe.objectFieldOffset(Demo.class.getDeclaredField("n2")), o.n2 + 1);
    System.out.println("Fourth case:" + o.n2);
}

// Result:
// Case one:3
// Second case:2
// Third case:5
// Fourth case:4

Object

@HotSpotIntrinsicCandidate
public native Object getObject(Object o, long offset);

@HotSpotIntrinsicCandidate
public native void putObject(Object o, long offset, Object x);

Put and get work much like the primitive versions.

For put, note that:

  • If x is null or matches the field's type, the result is well defined; otherwise the result is undefined.
  • If o is not null, card marks or other store barriers for that object are updated (if the VM requires them).

private static void _putAndGetObject(Unsafe unsafe) throws NoSuchFieldException {
    Demo o = new Demo();
    long n3Offset = unsafe.objectFieldOffset(Demo.class.getDeclaredField("n3"));
    Object i1 = unsafe.getObject(o, n3Offset);
    System.out.println("before put: " + i1);
    unsafe.putObject(o, n3Offset, Integer.valueOf(5));
    Object i2 = unsafe.getObject(o, n3Offset);
    System.out.println("after put: " + i2);
}

// Result:
// before put: 3
// after put: 5

Address

Is the value a direct pointer to a memory address or an indirect handle? If it is an indirect handle, it does not represent the exact address of an object on the heap.

@ForceInline // forces inlining
public long getAddress(Object o, long offset) {
    if (ADDRESS_SIZE == 4) {
        // If the native pointer is less than 64 bits wide, it is extended as an unsigned number to a long.
        return Integer.toUnsignedLong(getInt(o, offset));
    } else {
        return getLong(o, offset);
    }
}

@ForceInline
public void putAddress(Object o, long offset, long x) {
    if (ADDRESS_SIZE == 4) {
        putInt(o, offset, (int) x);
    } else {
        putLong(o, offset, x);
    }
}

// Reads an uncompressed reference, ignoring the JVM's compressed-pointer setting
public native Object getUncompressedObject(long address);

Get: fetches a native pointer from the given memory address.

Put: stores a native pointer at the given memory address.

  • If the address is zero, or does not point into a block obtained from allocateMemory(), the result is undefined.
  • If the native pointer is less than 64 bits wide, it is extended as an unsigned number to a long.
    • A pointer can be indexed by any byte offset, simply by adding that offset (as a plain integer) to the long that represents the pointer.
    • The number of bytes actually read from the target address can be determined by calling addressSize().

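private static void _putAndGetAddress(Unsafe unsafe) {
    // A native pointer occupies addressSize() bytes (8 on a 64-bit JVM); a 4-byte block would let putAddress write past its end.
    long start = unsafe.allocateMemory(unsafe.addressSize());
    try {
        int i = unsafe.getInt(start); // the block is uninitialized, so this value is arbitrary
        System.out.println("before put:" + i);
        unsafe.putAddress(start, 1000);
        i = unsafe.getInt(start); // the low 4 bytes on a little-endian machine
        System.out.println("after put1:" + i);
        System.out.println("after put2:" + unsafe.getAddress(start));
    } finally {
        unsafe.freeMemory(start);
    }
}

// Result (the "before put" value happened to be 0 in this run):
// before put:0
// after put1:1000
// after put2:1000
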
2. Memory management

/**
 * Allocates a new block of native memory of the given size in bytes. The contents are uninitialized and are generally garbage.
 * The returned native pointer is never zero and is aligned for all value types.
 * Release the memory with freeMemory, or resize it with reallocateMemory.
 */
public long allocateMemory(long bytes);

/**
 * Resizes a block of native memory to the given size in bytes. The part of the new block beyond the old size is uninitialized and is generally garbage.
 * The returned native pointer is zero if and only if the requested size is zero, and it is aligned for all value types.
 * Release the memory with freeMemory, or resize it with reallocateMemory.
 * The address passed in may be null (zero), in which case a fresh allocation is performed.
 */
public long reallocateMemory(long address, long bytes);

/**
 * Sets all bytes in a given block of memory to a fixed value (usually zero).
 * The block's base address is given by two parameters, which in effect provides a two-register addressing mode, as with getInt(Object, long). When the object reference is null, the offset supplies an absolute base address.
 * The stores are made in coherent (atomic) units whose size depends on the address and length. If the effective address and the length are both even modulo 8, the stores are in long units; if they are even modulo 4 or 2, the stores are in int or short units.
 */
public void setMemory(Object o, long offset, long bytes, byte value);

/** Sets all bytes in a given block of memory to a copy of another block. */
public void copyMemory(Object srcBase, long srcOffset,
                       Object destBase, long destOffset,
                       long bytes);

public void copyMemory(long srcAddress, long destAddress, long bytes);

/** Releases a block of native memory obtained from allocateMemory or reallocateMemory. The address may be null (zero), in which case nothing is done. */
public void freeMemory(long address);

Applications: see the diagram in Reference 2 (tech.meituan.com/2019/02/14/…), which also covers how Cleaner uses these methods.
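
As a rough illustration of the allocate, fill, copy, and free cycle, here is a minimal sketch against sun.misc.Unsafe. The class name OffHeapSketch and its methods are made up for this example, and the try/finally is there because nothing else will ever release these blocks.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Illustrative sketch only; OffHeapSketch is a made-up name.
class OffHeapSketch {

    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Writes 1..n as longs into one off-heap block, copies them to a second block,
    // and returns the sum read back from the copy. n must be positive.
    static long sumViaOffHeap(int n) {
        Unsafe u = unsafe();
        long bytes = n * 8L;
        long src = u.allocateMemory(bytes);
        long dst = 0;
        try {
            dst = u.allocateMemory(bytes);
            u.setMemory(dst, bytes, (byte) 0);   // allocateMemory does not zero the block
            for (int i = 0; i < n; i++) {
                u.putLong(src + i * 8L, i + 1);
            }
            u.copyMemory(src, dst, bytes);
            long sum = 0;
            for (int i = 0; i < n; i++) {
                sum += u.getLong(dst + i * 8L);
            }
            return sum;
        } finally {
            // Not garbage collected: forgetting these calls leaks native memory.
            u.freeMemory(src);
            u.freeMemory(dst);                   // freeing address 0 does nothing
        }
    }

    public static void main(String[] args) {
        System.out.println(sumViaOffHeap(10));
    }
}
```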

3. Class-related operations

Getting field offsets

When reading and writing object fields by offset, as in section 1, you need the field's offset within the object in order to locate it in memory.

Unsafe also provides the operations that compute these offsets.

// Offset of an instance field within its object. For the first field, this is in effect the size of the object header.
public long objectFieldOffset(Field f)
// Offset of an instance field, looked up by class and field name
public long objectFieldOffset(Class<?> c, String name)
// Used together, the next two locate a static field in memory; they are used with the getInt family of methods.
public long staticFieldOffset(Field f)
public Object staticFieldBase(Field f)
// For both of the following, values are predefined for primitive arrays and Object[]
// Offset of the first element of the given array class; in effect the size of the array header
public int arrayBaseOffset(Class<?> arrayClass)
// Size of each element in the array
public int arrayIndexScale(Class<?> arrayClass)

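
The sketch below shows these offset methods feeding the get methods from section 1. It assumes sun.misc.Unsafe obtained by reflection; OffsetSketch and Point are made-up names, and the printed offsets depend on the JVM and its settings, so no output is shown.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Illustrative sketch only; OffsetSketch and Point are made-up names.
class OffsetSketch {

    static class Point {
        int x = 1;
        int y = 2;
    }

    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    // Reads array[n] through its raw offset B + n * S.
    static int readElement(int[] array, int n) {
        if (n < 0 || n >= array.length) {
            throw new IndexOutOfBoundsException("index " + n); // Unsafe itself would not check this
        }
        Unsafe u = unsafe();
        long base = u.arrayBaseOffset(int[].class);
        long scale = u.arrayIndexScale(int[].class);
        return u.getInt(array, base + n * scale);
    }

    // Reads Point.y through the offset reported by objectFieldOffset.
    static int readY(Point p) {
        try {
            Unsafe u = unsafe();
            long offset = u.objectFieldOffset(Point.class.getDeclaredField("y"));
            return u.getInt(p, offset);
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Unsafe u = unsafe();
        System.out.println("int[] base offset: " + u.arrayBaseOffset(int[].class));
        System.out.println("int[] index scale: " + u.arrayIndexScale(int[].class));
        System.out.println("element 3: " + readElement(new int[] {9, 8, 7, 6}, 3));
        System.out.println("y: " + readY(new Point()));
    }
}
```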
Checking initialization

// Detects whether a class may still need to be initialized. Typically used before reading a class's static fields (they are not initialized until the class is).
// Returns false only if a call to ensureClassInitialized would have no effect.
public boolean shouldBeInitialized(Class<?> c)
// Ensures that the given class has been initialized. Typically used before reading a class's static fields.
public void ensureClassInitialized(Class<?> c)

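
The difference between a class that is merely loaded and one that is initialized can be observed without Unsafe. The sketch below deliberately swaps in the public Class.forName API: the initialize flag set to true plays a role comparable to ensureClassInitialized. InitSketch and Lazy are made-up names.

```java
// Illustrative sketch only; uses public API instead of the Unsafe methods above.
class InitSketch {

    static final StringBuilder LOG = new StringBuilder();

    static class Lazy {
        static {
            LOG.append("init;"); // runs once, when Lazy is initialized
        }
    }

    static String run() {
        LOG.setLength(0);
        try {
            ClassLoader loader = InitSketch.class.getClassLoader();
            // A class literal does not trigger initialization, and neither does initialize=false.
            Class<?> c = Class.forName(Lazy.class.getName(), false, loader);
            LOG.append("loaded;");
            // initialize=true forces the static initializer to run if it has not run yet.
            Class.forName(c.getName(), true, loader);
            LOG.append("ensured;");
            return LOG.toString();
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

On the first call the static initializer runs between the two log entries; on later calls it does not run again, because a class is initialized only once.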
Creating a class

// Defines a class, skipping all of the JVM's security checks. By default, the ClassLoader and ProtectionDomain come from the caller.
public native Class<?> defineClass(String name, byte[] b, int off, int len, ClassLoader loader, ProtectionDomain protectionDomain);
// Defines an anonymous class
public native Class<?> defineAnonymousClass(Class<?> hostClass, byte[] data, Object[] cpPatches);

Checking class initialization and defining classes or anonymous classes is actually quite common in JDK 1.8 and later.

That is because these operations are used to implement lambda expressions.

public void test() {
    List<Integer> list = new ArrayList<>(16);
    list.add(1);
    list.add(2);
    list.add(3);
    list.forEach(i -> {
        System.out.println(i);
    });
    list.forEach(i -> {
        // num is presumably a field of the enclosing class; its declaration is not shown in the original
        System.out.println(i + num);
    });
}

Decompiling the class shows that the compiler generates instance methods or static methods with special names for the lambda expressions. At run time, an anonymous class is created through UNSAFE.defineAnonymousClass and then instantiated. Finally, a call site is returned that is bound to the method handle of the functional method in this anonymous class; invoking the lambda goes through this call site.

LambdaMetafactory.metafactory() -> InnerClassLambdaMetafactory.buildCallSite() -> InnerClassLambdaMetafactory.spinInnerClass() -> UNSAFE.defineAnonymousClass()

The article “JVM crashes at libjvm.so” explains this method; a brief excerpt follows.

  • 1. A VM Anonymous Class can be viewed as a template mechanism. If a program wants to dynamically generate many classes with the same structure but different constants, it can first create a normal class containing placeholder constants as a template, and then call sun.misc.Unsafe#defineAnonymousClass(), passing in that class (the host class, or template class) together with an array that serves as the “constant pool patch”, to replace the specified constants with arbitrary values. The result is a VM Anonymous Class with the constants replaced.
  • 2. A VM Anonymous Class is truly “nameless” from the VM’s point of view. Once constructed, it can only be reached through the Class instance returned by Unsafe#defineAnonymousClass(), for use with reflection.

There are a few other points that you can read for yourself. The method name translates as “define anonymous class”, but the classes it defines are rather different from ordinary Java anonymous classes, so we generally do not use this method.
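
The “nameless” property can be observed from ordinary code by inspecting the class behind a lambda. This is a sketch with the made-up name LambdaClassSketch; the exact class name printed varies between JDK versions and runs, so no output is shown.

```java
// Illustrative sketch only; LambdaClassSketch is a made-up name.
class LambdaClassSketch {

    static Class<?> lambdaClass() {
        Runnable r = () -> { };
        return r.getClass(); // the class spun at run time for this lambda
    }

    // Tries to look the class up again by the name it reports.
    static boolean findableByName(Class<?> c) {
        try {
            Class.forName(c.getName(), false, c.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Class<?> c = lambdaClass();
        System.out.println(c.getName());       // contains "$$Lambda"
        System.out.println(c.isSynthetic());   // generated, not written in source
        System.out.println(findableByName(c)); // no loader can resolve the reported name
    }
}
```

The lookup fails even though the class exists and has an instance, which matches the excerpt: the only way to reach such a class is through the Class object handed back when it was defined.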

4. System-related

// Size of a memory page, in bytes. Always a power of two.
public native int pageSize();
// Size of a native pointer: 4 or 8 bytes
public int addressSize() {
    return ADDRESS_SIZE;
}
// Gets the system load average. The results are stored in the double array loadavg.
// nelems is the number of samples to retrieve, from 1 to 3: the load averages over the last 1, 5, and 15 minutes respectively.
// Returns -1 if the load average cannot be obtained; otherwise returns the number of samples retrieved (the number of valid elements in loadavg).
public int getLoadAverage(double[] loadavg, int nelems) {
    if (nelems < 0 || nelems > 3 || nelems > loadavg.length) {
        throw new ArrayIndexOutOfBoundsException();
    }

    return getLoadAverage0(loadavg, nelems);
}

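
A quick way to see these values on a given machine is sketched below. pageSize() and addressSize() are called on sun.misc.Unsafe obtained by reflection; for the load average the sketch swaps in the public OperatingSystemMXBean rather than Unsafe. SystemInfoSketch is a made-up name.

```java
import java.lang.management.ManagementFactory;
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Illustrative sketch only; SystemInfoSketch is a made-up name.
class SystemInfoSketch {

    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static boolean isPowerOfTwo(int v) {
        return v > 0 && (v & (v - 1)) == 0;
    }

    public static void main(String[] args) {
        Unsafe u = unsafe();
        System.out.println("page size: " + u.pageSize());
        System.out.println("address size: " + u.addressSize());
        // Public API for the 1-minute load average; negative when the platform cannot report it.
        double load = ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
        System.out.println("load average (1 min): " + load);
    }
}
```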
5. Object management

Unconventional instantiation

// Bypasses the instance initializer <init> (instance variable initializers, initializer blocks, constructors), JVM checks, and so on; only the Class is needed to instantiate the object.
// It also skips access-modifier checks, so a class can be instantiated this way even if its constructor is private.
public native Object allocateInstance(Class<?> cls)
// Instantiates arrays of primitive types only. Faster than a normal new.
public Object allocateUninitializedArray(Class<?> componentType, int length)

  • Conventional instantiation

    Instantiate with the new keyword. This runs instance variable initializers, initializer blocks, and the constructor. In addition, when a class defines only constructors with parameters (and no no-argument constructor), arguments must be supplied on instantiation. And when the constructor is private, as in a singleton, it cannot be called from outside the class.

  • Unconventional instantiation

    Bypasses the instance initializer <init> (instance variable initializers, initializer blocks, constructors), JVM checks, and so on, relying solely on the Class to instantiate the object.

public class InstanceDemo {
    Integer n1 = 1;
    Integer n2;
    Integer n3;
    {
        n2 = 2;
    }

    public InstanceDemo(int n3) {
        this.n3 = n3;
    }

    private InstanceDemo() {
        this.n3 = 3;
    }

    public static void main(String[] args) throws Exception {
        _unsafeInstance();
        _newInstance();
    }

    private static void _newInstance() {
        InstanceDemo demo = new InstanceDemo(3);
        System.out.println("n1 : " + demo.n1);
        System.out.println("n2 : " + demo.n2);
        System.out.println("n3 : " + demo.n3);
    }

    private static void _unsafeInstance() throws Exception {
        Unsafe unsafe = UnsafeTest.reflectUnsafe();
        InstanceDemo instance = (InstanceDemo) unsafe.allocateInstance(InstanceDemo.class);
        System.out.println("n1 : " + instance.n1);
        System.out.println("n2 : " + instance.n2);
        System.out.println("n3 : " + instance.n3);
    }
}

// Result:
// unsafe
// n1 : null
// n2 : null
// n3 : null
// new
// n1 : 1
// n2 : 2
// n3 : 3

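At the bytecode level the two paths differ. new compiles to the new, dup, and invokespecial instructions, the last of which calls the <init> constructor. The Unsafe path compiles to ldc (load the Class constant), invokevirtual (call allocateInstance), and checkcast, which enforces the type conversion check.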

Put/get with volatile semantics

public native Object getObjectVolatile(Object o, long offset);

/** Acquire version of {@link #getObjectVolatile(Object, long)} */
public final Object getObjectAcquire(Object o, long offset) {
    return getObjectVolatile(o, offset);
}

/** Opaque version of {@link #getObjectVolatile(Object, long)} */
public final Object getObjectOpaque(Object o, long offset) {
    return getObjectVolatile(o, offset);
}

public native void putObjectVolatile(Object o, long offset, Object x);

/** Opaque version of {@link #putObjectVolatile(Object, long, Object)} */
public final void putObjectOpaque(Object o, long offset, Object x) {
    putObjectVolatile(o, offset, x);
}

These are used the same way as getObject/putObject, but with volatile semantics. That is, reads and writes behave like accesses to a volatile field and are not served from a stale cached value.

Likewise, there are equivalent operations for the primitive types.

Ordered (lazy) versions of put/get

// The ordered, lazy version of putObjectVolatile: it does not guarantee that the stored value is immediately visible to other threads.
// Generally only useful if the field is declared volatile.
public final void putObjectRelease(Object o, long offset, Object x) {
    putObjectVolatile(o, offset, x);
}

Likewise, there are equivalent operations for the primitive types.
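
To see the volatile variants at work, the sketch below hands a value from one thread to another through two fields that are deliberately not declared volatile; the ordering comes entirely from putIntVolatile and getIntVolatile on sun.misc.Unsafe. VolatileHandoffSketch and its members are made-up names.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Illustrative sketch only; VolatileHandoffSketch is a made-up name.
class VolatileHandoffSketch {

    // Plain fields on purpose: the volatile semantics come from the Unsafe calls.
    int data;
    int ready;

    static final Unsafe U = unsafe();
    static final long DATA = offset("data");
    static final long READY = offset("ready");

    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static long offset(String field) {
        try {
            return U.objectFieldOffset(VolatileHandoffSketch.class.getDeclaredField(field));
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    // Publishes value from the calling thread and returns what the reader thread saw.
    static int handOff(int value) {
        VolatileHandoffSketch box = new VolatileHandoffSketch();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            // Volatile read: performed again on every iteration of the loop.
            while (U.getIntVolatile(box, READY) == 0) {
                Thread.onSpinWait();
            }
            seen[0] = U.getInt(box, DATA);   // plain read, ordered after the volatile read
        });
        reader.start();
        U.putInt(box, DATA, value);          // plain write
        U.putIntVolatile(box, READY, 1);     // volatile write publishes the plain write above
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(handOff(42));
    }
}
```

Replacing the publishing write with the ordered variant (putOrderedInt in sun.misc.Unsafe) would keep the ordering of the two writes but drop the guarantee that the flag becomes visible immediately.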

6. CAS

/**
 * CAS
 * @param o        the object containing the field to modify
 * @param offset   the offset of the field within the object
 * @param expected the expected value
 * @param update   the new value
 * @return true if the value was updated, false otherwise
 */
public final native boolean compareAndSwapObject(Object o, long offset, Object expected, Object update);

What is CAS? Compare and swap, a technique often used to implement concurrent algorithms. A CAS operation takes three operands: a memory location, the expected old value, and the new value.

When the CAS operation is performed, the value at the memory location is compared with the expected value. If they match, the processor atomically updates the memory location to the new value; otherwise, the processor does nothing.

In Unsafe, the CAS methods such as compareAndSwapXXX are ultimately implemented with an atomic CPU instruction (CMPXCHG), so the comparison and the update cannot be split apart and cause inconsistent data.

These operations have the memory semantics of both a volatile read and a volatile write.

The variants are:

  • CAS on primitive types (for example compareAndSwapInt and compareAndSwapLong);
  • getAndSet, with the same semantics;
  • getAndAdd, with the same semantics.

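
The usual way to use CAS is a retry loop: read the current value, compute the new one, and try again if another thread got there first. The sketch below builds a lock-free counter this way on sun.misc.Unsafe; CasCounterSketch and its methods are made-up names.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

// Illustrative sketch only; CasCounterSketch is a made-up name.
class CasCounterSketch {

    private volatile int value;

    static final Unsafe U = unsafe();
    static final long VALUE = valueOffset();

    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    static long valueOffset() {
        try {
            return U.objectFieldOffset(CasCounterSketch.class.getDeclaredField("value"));
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    // Retries until the CAS succeeds, that is, until no other thread changed value in between.
    int increment() {
        int current;
        do {
            current = U.getIntVolatile(this, VALUE);
        } while (!U.compareAndSwapInt(this, VALUE, current, current + 1));
        return current + 1;
    }

    // Runs several threads against one counter and returns the final count.
    static int countWithThreads(int threads, int perThread) {
        CasCounterSketch counter = new CasCounterSketch();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers[i].start();
        }
        try {
            for (Thread t : workers) {
                t.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.value;
    }

    public static void main(String[] args) {
        System.out.println(countWithThreads(4, 10_000));
    }
}
```

No increment is lost even though no lock is taken, because a failed CAS simply means the loop reads the fresh value and tries again.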
7. Memory barriers

/** Ensures that loads before the fence are not reordered with loads and stores after the fence: a "LoadLoad" plus "LoadStore" barrier. A pure "LoadLoad" fence is not provided, because the "LoadStore" part is almost always wanted as well. */
@HotSpotIntrinsicCandidate
public native void loadFence();

/** Ensures that loads and stores before the fence are not reordered with stores after the fence: a "StoreStore" plus "LoadStore" barrier. A pure "StoreStore" fence is not provided, because the "LoadStore" part is almost always wanted as well. */
@HotSpotIntrinsicCandidate
public native void storeFence();

/** Ensures that loads and stores before the fence are not reordered with loads and stores after the fence: loadFence + storeFence + a "StoreLoad" barrier. */
@HotSpotIntrinsicCandidate
public native void fullFence();

public final void loadLoadFence() {
    // implemented with the stronger loadFence
    loadFence();
}

public final void storeStoreFence() {
    // implemented with the stronger storeFence
    storeFence();
}

8. Thread scheduling

Suspending and resuming threads

// Unblocks the given thread
public native void unpark(Object thread);
// Blocks the current thread
public native void park(boolean isAbsolute, long time);

In practice, LockSupport wraps these methods and splits them into finer-grained methods for general use.

They are used throughout the concurrency package, for example in AQS, FutureTask, and Sanorecovery.
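
Since LockSupport is the supported wrapper, a minimal park/unpark handshake is sketched with it rather than with Unsafe directly. ParkSketch and handshake() are made-up names.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.LockSupport;

// Illustrative sketch only; ParkSketch is a made-up name.
class ParkSketch {

    // Starts a worker that blocks until the caller releases it, then reports what the worker logged.
    static String handshake() {
        AtomicBoolean go = new AtomicBoolean(false);
        StringBuilder log = new StringBuilder();
        Thread worker = new Thread(() -> {
            // park() may return without a matching unpark, so the condition is re-checked in a loop.
            while (!go.get()) {
                LockSupport.park();
            }
            log.append("worker resumed");
        });
        worker.start();
        go.set(true);
        // Safe even if the worker has not parked yet: the permit is kept for its next park().
        LockSupport.unpark(worker);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return log.toString();
    }

    public static void main(String[] args) {
        System.out.println(handshake());
    }
}
```

Pairing park() with an explicit condition, as here, is the pattern to follow: the flag carries the state, and park/unpark only decide when the thread sleeps.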

Low-level synchronization primitives

These have been removed since JDK 9.

// Acquires the object's lock (reentrant)
@Deprecated
public native void monitorEnter(Object o);
// Releases the object's lock
@Deprecated
public native void monitorExit(Object o);
// Tries to acquire the object's lock
@Deprecated
public native boolean tryMonitorEnter(Object o);

Conclusion

Unsafe is rich in functionality and hard to understand fully.

With enough imagination, you can do a great deal with it.

However, it is recommended that you do not use it. Do your work within the rules.

Standing on the shoulders of giants

1. cloud.tencent.com/developer/a…

2. tech.meituan.com/2019/02/14/…