As we know, in the JVM, loading a class is roughly divided into five stages: loading, verification, preparation, resolution, and initialization, where the middle three are collectively called linking. What we usually mean by class loading is the first of these: a class loader (ClassLoader) uses the fully qualified name of a class to obtain the binary bytecode stream that defines it, and then constructs the class definition from that stream.
Flink, as a JVM-based framework, provides the classloader.resolve-order parameter in flink-conf.yaml to control its class loading strategy. The options are child-first (the default) and parent-first. This article briefly analyzes the meaning behind this parameter.
Parent-first class loading strategy
Both ParentFirstClassLoader and ChildFirstClassLoader extend the abstract class FlinkUserCodeClassLoader:
    public abstract class FlinkUserCodeClassLoader extends URLClassLoader {

        public static final Consumer<Throwable> NOOP_EXCEPTION_HANDLER = classLoadingException -> {};

        private final Consumer<Throwable> classLoadingExceptionHandler;

        protected FlinkUserCodeClassLoader(URL[] urls, ClassLoader parent) {
            this(urls, parent, NOOP_EXCEPTION_HANDLER);
        }

        protected FlinkUserCodeClassLoader(
                URL[] urls,
                ClassLoader parent,
                Consumer<Throwable> classLoadingExceptionHandler) {
            super(urls, parent);
            this.classLoadingExceptionHandler = classLoadingExceptionHandler;
        }

        @Override
        protected final Class<?> loadClass(String name, boolean resolve)
                throws ClassNotFoundException {
            try {
                return loadClassWithoutExceptionHandling(name, resolve);
            } catch (Throwable classLoadingException) {
                classLoadingExceptionHandler.accept(classLoadingException);
                throw classLoadingException;
            }
        }

        protected Class<?> loadClassWithoutExceptionHandling(String name, boolean resolve)
                throws ClassNotFoundException {
            return super.loadClass(name, resolve);
        }
    }
FlinkUserCodeClassLoader extends URLClassLoader. Because the user code of a Flink application is only known at runtime, locating the class for a fully qualified name through the URLs of the user JARs is a natural fit. ParentFirstClassLoader, in turn, is just an empty class that extends FlinkUserCodeClassLoader.
    static class ParentFirstClassLoader extends FlinkUserCodeClassLoader {

        ParentFirstClassLoader(
                URL[] urls,
                ClassLoader parent,
                Consumer<Throwable> classLoadingExceptionHandler) {
            super(urls, parent, classLoadingExceptionHandler);
        }
    }
Because it overrides nothing, ParentFirstClassLoader simply uses the loadClass() logic inherited from its parent class. The class loader hierarchy in the JVM and the logic of the default loadClass() method together embody the parent delegation model. To recap what that means:
When a class loader is asked to load a class, it does not try to load the class itself first. Instead, it delegates the request to its parent loader, so every class loading request is eventually passed up to the topmost bootstrap class loader. Only when the parent loader cannot load the class does the child loader attempt to load it itself.
Flink’s parent-first class loading strategy follows the parent delegation model as is. The class loader for user code is a custom class loader, and the class loader for the Flink framework itself is the application class loader. A class requested by user code is therefore first looked up by the Flink framework’s class loader, and only then by the user code’s class loader. However, Flink does not use parent-first by default; it uses the child-first strategy described below.
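To make the delegation concrete, here is a minimal sketch using only the JDK. The class ParentDelegationDemo and its helper definingLoader() are hypothetical names introduced for this illustration and are not part of Flink. The child loader is given no URLs of its own, so any class it returns must have come from a loader further up the chain.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

// Hypothetical demo class, not part of Flink.
class ParentDelegationDemo {

    /** Loads className through a fresh child URLClassLoader and returns the loader that defined it. */
    static ClassLoader definingLoader(String className) {
        ClassLoader parent = ParentDelegationDemo.class.getClassLoader();
        // The child has no URLs of its own, so a successful load proves
        // that the request was delegated upward.
        try (URLClassLoader child = new URLClassLoader(new URL[0], parent)) {
            return child.loadClass(className).getClassLoader();
        } catch (ClassNotFoundException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ClassLoader parent = ParentDelegationDemo.class.getClassLoader();

        // Classes defined by the bootstrap loader report a null class loader.
        System.out.println("String defined by bootstrap loader: "
                + (definingLoader("java.lang.String") == null));

        // This class is defined by the parent loader, not by the child.
        System.out.println("Demo class defined by parent loader: "
                + (definingLoader(ParentDelegationDemo.class.getName()) == parent));
    }
}
```

In both cases the child loader hands back a class that a loader above it defined, which is exactly the behavior ParentFirstClassLoader inherits.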
Child-first class loading strategy
As we have seen, the benefit of the parent delegation model is that loaded classes acquire the same hierarchy as the class loaders, which protects the security of the Java runtime environment. However, in an environment with dependencies as complex as a Flink application, the parent delegation model may not be suitable. For example, the Flink Cassandra connector used in a program always depends on a fixed version of Cassandra, while the user code may pull in a lower or higher version of that dependency to match the Cassandra version actually in use. Different versions of the same component can define a class differently even when its fully qualified name is the same. Under the parent delegation model, the version specified by the Flink framework is loaded first, which can lead to puzzling compatibility problems such as NoSuchMethodError and IllegalAccessError.
For this reason, Flink implements ChildFirstClassLoader and uses it as the default strategy. It breaks the parent delegation model so that classes from user code are loaded first, which the official documentation calls “Inverted Class Loading”. Its code is not long either:
    public final class ChildFirstClassLoader extends FlinkUserCodeClassLoader {

        private final String[] alwaysParentFirstPatterns;

        public ChildFirstClassLoader(
                URL[] urls,
                ClassLoader parent,
                String[] alwaysParentFirstPatterns,
                Consumer<Throwable> classLoadingExceptionHandler) {
            super(urls, parent, classLoadingExceptionHandler);
            this.alwaysParentFirstPatterns = alwaysParentFirstPatterns;
        }

        @Override
        protected synchronized Class<?> loadClassWithoutExceptionHandling(
                String name, boolean resolve) throws ClassNotFoundException {

            // First, check if the class has already been loaded
            Class<?> c = findLoadedClass(name);

            if (c == null) {
                // check whether the class should go parent-first
                for (String alwaysParentFirstPattern : alwaysParentFirstPatterns) {
                    if (name.startsWith(alwaysParentFirstPattern)) {
                        return super.loadClassWithoutExceptionHandling(name, resolve);
                    }
                }

                try {
                    // check the URLs
                    c = findClass(name);
                } catch (ClassNotFoundException e) {
                    // let URLClassLoader do it, which will eventually call the parent
                    c = super.loadClassWithoutExceptionHandling(name, resolve);
                }
            }

            if (resolve) {
                resolveClass(c);
            }

            return c;
        }

        @Override
        public URL getResource(String name) {
            // first, try and find it via the URLClassloader
            URL urlClassLoaderResource = findResource(name);

            if (urlClassLoaderResource != null) {
                return urlClassLoaderResource;
            }

            // delegate to super
            return super.getResource(name);
        }

        @Override
        public Enumeration<URL> getResources(String name) throws IOException {
            // first get resources from URLClassloader
            Enumeration<URL> urlClassLoaderResources = findResources(name);

            final List<URL> result = new ArrayList<>();

            while (urlClassLoaderResources.hasMoreElements()) {
                result.add(urlClassLoaderResources.nextElement());
            }

            // get parent urls
            Enumeration<URL> parentResources = getParent().getResources(name);

            while (parentResources.hasMoreElements()) {
                result.add(parentResources.nextElement());
            }

            return new Enumeration<URL>() {
                Iterator<URL> iter = result.iterator();

                public boolean hasMoreElements() {
                    return iter.hasNext();
                }

                public URL nextElement() {
                    return iter.next();
                }
            };
        }
    }
The core logic is in the loadClassWithoutExceptionHandling() method, which works as follows:
- Call findLoadedClass() to check whether the class with the fully qualified name name has already been loaded. If not, continue.
- Check whether the name of the class to be loaded starts with one of the prefixes in the alwaysParentFirstPatterns collection. If so, call the corresponding method of the parent class to load it parent-first.
- If the class matches none of the alwaysParentFirstPatterns prefixes, call findClass() to look up the class definition in the user code (this method is implemented by URLClassLoader). If the class is not found there, fall back to the parent loader.
- Finally, if the resolve argument is true, call resolveClass() to link the class, then return the corresponding Class object.
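The effect of these steps can be illustrated with a toy model. This is only a sketch of the lookup order, not a real class loader: it tracks origin labels in maps instead of loading bytecode, and ChildFirstLookupSketch, its example() helper, and the class name com.example.driver.Session are all hypothetical names made up for the illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the child-first lookup order. It tracks origin labels
// instead of loading real classes; all names here are hypothetical.
class ChildFirstLookupSketch {

    private final Map<String, String> userCode;  // stands in for findClass() on the user JARs
    private final Map<String, String> parent;    // stands in for the parent class loader
    private final String[] alwaysParentFirstPatterns;
    private final Map<String, String> alreadyLoaded = new HashMap<>(); // stands in for findLoadedClass()

    ChildFirstLookupSketch(
            Map<String, String> userCode,
            Map<String, String> parent,
            String... alwaysParentFirstPatterns) {
        this.userCode = userCode;
        this.parent = parent;
        this.alwaysParentFirstPatterns = alwaysParentFirstPatterns;
    }

    /** Returns the label of the copy that wins, or null if neither side knows the class. */
    String load(String name) {
        // Step 1: reuse the result of an earlier load.
        String origin = alreadyLoaded.get(name);
        if (origin != null) {
            return origin;
        }

        // Step 2: names matching an always-parent-first prefix skip the user code.
        boolean parentFirst = false;
        for (String pattern : alwaysParentFirstPatterns) {
            if (name.startsWith(pattern)) {
                parentFirst = true;
                break;
            }
        }

        // Step 3: otherwise look in the user code first, then fall back to the parent.
        if (!parentFirst) {
            origin = userCode.get(name);
        }
        if (origin == null) {
            origin = parent.get(name);
        }
        if (origin != null) {
            alreadyLoaded.put(name, origin);
        }
        return origin;
    }

    /** Builds a loader where two classes exist on both sides. */
    static ChildFirstLookupSketch example() {
        Map<String, String> userCode = new HashMap<>();
        userCode.put("com.example.driver.Session", "user jar");
        userCode.put("org.apache.flink.api.common.functions.MapFunction", "user jar");

        Map<String, String> parent = new HashMap<>();
        parent.put("com.example.driver.Session", "framework classpath");
        parent.put("org.apache.flink.api.common.functions.MapFunction", "framework classpath");
        parent.put("java.lang.String", "framework classpath");

        return new ChildFirstLookupSketch(userCode, parent, "java.", "org.apache.flink.");
    }

    public static void main(String[] args) {
        ChildFirstLookupSketch loader = example();
        // Present on both sides, no pattern matches: the user copy wins.
        System.out.println(loader.load("com.example.driver.Session")); // prints "user jar"
        // Present on both sides, but the name matches "org.apache.flink.": the parent copy wins.
        System.out.println(loader.load("org.apache.flink.api.common.functions.MapFunction")); // prints "framework classpath"
    }
}
```

The model leaves out the final resolveClass() step, since linking has no counterpart when only labels are tracked.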
As you can see, the child-first strategy does not delegate load requests to the parent loader first; only certain classes must “follow the old rules.” The classes matched by the alwaysParentFirstPatterns collection belong to foundational components such as Java and Flink, and must not be overridden by user code. The collection is specified by two parameters:
- classloader.parent-first-pattern.default: changing it is not recommended. Its default value is:
java.;
scala.;
org.apache.flink.;
com.esotericsoftware.kryo;
org.apache.hadoop.;
javax.annotation.;
org.slf4j;
org.apache.log4j;
org.apache.logging;
org.apache.commons.logging;
ch.qos.logback;
org.xml;
javax.xml;
org.apache.xerces;
org.w3c
- classloader.parent-first-pattern.additional: if other classes conflict in child-first mode and you want them to be loaded through parent delegation, you can list them here as additional patterns (semicolon separated).
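Since ChildFirstClassLoader compares names with startsWith(), each pattern is a plain prefix, not a glob or regular expression. The sketch below shows what that implies, assuming the two parameter values are simply split on semicolons and concatenated (how Flink actually merges them is not shown in this article). ParentFirstPatterns, parse(), isParentFirst(), and the prefix com.example.shared. are hypothetical names used only for illustration, and the default list is shortened.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flink.
class ParentFirstPatterns {

    /** Splits each semicolon-separated value and collects the trimmed, non-empty prefixes. */
    static List<String> parse(String... values) {
        List<String> patterns = new ArrayList<>();
        for (String value : values) {
            for (String part : value.split(";")) {
                String trimmed = part.trim();
                if (!trimmed.isEmpty()) {
                    patterns.add(trimmed);
                }
            }
        }
        return patterns;
    }

    /** Same prefix check as the loop in ChildFirstClassLoader. */
    static boolean isParentFirst(String className, List<String> patterns) {
        for (String pattern : patterns) {
            if (className.startsWith(pattern)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String defaults = "java.;scala.;org.apache.flink.;javax.annotation.;org.slf4j"; // shortened
        String additional = "com.example.shared."; // hypothetical user-supplied value
        List<String> patterns = parse(defaults, additional);

        System.out.println(isParentFirst("org.slf4j.Logger", patterns));         // true
        System.out.println(isParentFirst("com.example.shared.Model", patterns)); // true
        System.out.println(isParentFirst("javax.sql.DataSource", patterns));     // false: "java." is not a prefix of "javax."
        System.out.println(isParentFirst("com.example.job.MyJob", patterns));    // false
    }
}
```

Because matching is by prefix, ending a pattern with a dot (as in java.) restricts it to that package tree, while a pattern without one (as in org.slf4j) also matches any other name that merely starts with the same characters.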
In short, the classloader.resolve-order parameter in flink-conf.yaml controls which of these two class loading strategies Flink uses.
The original link: www.jianshu.com/p/bc7309b03…