The SolarWinds Orion compromise, disclosed in December 2020, was a nation-state supply chain attack. Up to 18,000 organisations installed the trojanised update, and a much smaller number of them, including US government agencies and major tech companies, were then actively exploited. A closer look at the incident reveals some disturbing engineering lessons.
The attackers, later attributed to Russia's SVR, injected malicious code into the Orion software update package through SolarWinds' compromised build system. What's striking is that the malicious update was cryptographically signed with SolarWinds' own certificate and distributed via the legitimate update channel. The malware lay dormant for roughly two weeks before setting up a covert command-and-control channel that masqueraded as legitimate Orion traffic.
I recall a similar incident where a company I worked with had its Jenkins build server compromised because of a weak password. The attackers used it to inject malware into the company's software updates. It turned out that the build server was reachable from the developer workstations, and no one had audited the build scripts for unexpected network access. The attackers were able to move laterally and compromise multiple systems before being detected.
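Why the signature didn't prevent the attack is often misunderstood. Code signing verifies that a binary was signed with the named entity's key and has not been altered since; it says nothing about whether the code was already malicious when it was signed. In the Orion case the malicious code entered during the build, before the signing step, so the signature faithfully vouched for a trojanised binary. An attacker who controls the build or signing infrastructure can get malicious code signed, which means a valid signature proves origin and integrity, not safety.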
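To make that limitation concrete, here is a minimal sketch. It uses an HMAC from Python's standard library as a stand-in for a real signature; actual code signing uses asymmetric keys and certificates, and the key and byte strings below are hypothetical. The point is only that verification succeeds for whatever bytes reached the signing step.

```python
import hashlib
import hmac

# Hypothetical stand-in for the vendor's private signing key.
SIGNING_KEY = b"vendor-signing-key"


def sign(artifact: bytes) -> str:
    """Sign whatever bytes the build system hands over."""
    return hmac.new(SIGNING_KEY, artifact, hashlib.sha256).hexdigest()


def verify(artifact: bytes, signature: str) -> bool:
    """Check that the artifact is unchanged since it was signed."""
    return hmac.compare_digest(sign(artifact), signature)


clean_source = b"def poll_devices(): ..."
# The tampering happens inside the build, before the signing step.
trojanised_build = clean_source + b"\nimport backdoor"

signature = sign(trojanised_build)

# The signer really did sign these bytes, so verification passes.
signed_ok = verify(trojanised_build, signature)
# Only changes made after signing are caught.
modified_later_ok = verify(trojanised_build + b"#", signature)
```

Here `signed_ok` is true and `modified_later_ok` is false: the check catches post-signing changes but has no opinion on what was signed.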
In the aftermath of the SolarWinds incident, build system security received much-needed attention. Isolating build infrastructure from development environments is a key control – the build system shouldn't be reachable from developer workstations, for instance. You should also audit build scripts for unexpected network access or resource modification, implement reproducible builds (so the same source code produces the same binary), and compare build outputs against expected results. The SLSA framework codifies these best practices.
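Comparing build outputs can be sketched as follows, assuming the build is already reproducible and that a second, independent builder has produced its own copy of the output directory. The function names and directory layout are illustrative, not part of SLSA or any specific tool.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in chunks so large artifacts need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def digests(build_dir: Path) -> dict:
    """Map each file's path (relative to build_dir) to its SHA-256 digest."""
    return {
        p.relative_to(build_dir).as_posix(): sha256_of(p)
        for p in sorted(build_dir.rglob("*"))
        if p.is_file()
    }


def diff_builds(primary: Path, rebuild: Path) -> list:
    """Return files that are missing from one build or differ between the two."""
    a, b = digests(primary), digests(rebuild)
    return sorted(name for name in a.keys() | b.keys() if a.get(name) != b.get(name))
```

An empty result from `diff_builds` means the two builders agree byte for byte; any listed file is a reason to block the release and investigate, since a compromise of one builder would have to be replicated on the other to go unnoticed.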
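For example, a company using CircleCI or GitLab CI/CD should ensure that its build runners are isolated and not reachable from developer workstations. It can use in-toto attestations to record provenance for each build, and treat the SLSA levels as a target for hardening the pipeline. Reproducible builds are a separate property that has to be achieved in the build process itself. Provenance makes it much harder for a binary built or altered outside the expected pipeline to pass verification, but it does not help if the trusted runner itself is compromised, which is why independent rebuilds and output comparison still matter.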
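A provenance check on the consumer side might look roughly like this. The statement layout is a simplified assumption loosely modelled on an in-toto statement (a `subject` list with SHA-256 digests and a builder identity in the predicate); the builder URL and field paths are hypothetical, and verifying the signature on the attestation itself is deliberately left out.

```python
import hashlib

# Hypothetical identity of the isolated runner we expect releases to come from.
TRUSTED_BUILDERS = {"https://ci.example.com/isolated-runner"}


def verify_provenance(artifact: bytes, name: str, statement: dict) -> bool:
    """Accept the artifact only if its digest and builder match the statement."""
    actual = hashlib.sha256(artifact).hexdigest()
    digest_matches = any(
        subject.get("name") == name
        and subject.get("digest", {}).get("sha256") == actual
        for subject in statement.get("subject", [])
    )
    builder = statement.get("predicate", {}).get("builder", {}).get("id")
    return digest_matches and builder in TRUSTED_BUILDERS


release = b"orion-update-bytes"
statement = {
    "subject": [
        {"name": "update.msi", "digest": {"sha256": hashlib.sha256(release).hexdigest()}}
    ],
    "predicate": {"builder": {"id": "https://ci.example.com/isolated-runner"}},
}

accepted = verify_provenance(release, "update.msi", statement)
rejected = verify_provenance(release + b"tampered", "update.msi", statement)
```

A check like this rejects artifacts that were swapped after the build or produced by an unknown builder. It cannot detect code injected inside the trusted builder, because the digest in the statement would then describe the malicious binary.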
The SolarWinds compromise also highlights the importance of network segmentation and reducing the blast radius. The compromised monitoring systems had broad network access, which let the attackers move laterally once they were in. Zero-trust networking and microsegmentation would have restricted what a compromised monitoring agent could reach. The takeaway is that an assumed-breach posture, where you assume an attacker is already inside the perimeter and design access controls accordingly, is the right security mindset.
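The default-deny idea behind microsegmentation can be sketched as an explicit allowlist of flows. The host names, ports, and the `Flow` type are hypothetical; in practice the policy would be enforced by firewalls or a service mesh rather than application code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    source: str
    destination: str
    port: int


# Only the flows the monitoring product actually needs are listed;
# everything else is denied by default.
ALLOWED_FLOWS = {
    Flow("monitoring-agent", "monitoring-server", 443),
    Flow("monitoring-server", "metrics-db", 5432),
}


def is_allowed(flow: Flow) -> bool:
    """Default deny: a flow is permitted only if it is explicitly listed."""
    return flow in ALLOWED_FLOWS


normal = is_allowed(Flow("monitoring-agent", "monitoring-server", 443))
lateral = is_allowed(Flow("monitoring-agent", "domain-controller", 445))
```

With this policy `normal` is true and `lateral` is false, so a compromised agent can still talk to its own server but cannot open connections to unrelated systems.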