Lacework’s AI Didn’t Work
Why the Lacework crash is a lesson in the limitations of AI in cybersecurity
Could anyone have predicted the spectacular downfall of cloud security heavyweight Lacework? Leading VCs had poured over a billion dollars of funding into the company, attracting high-profile tech executives, top-tier engineering talent, and over 1,000 employees at its peak. What they missed about the company’s AI strategy is instructive for a cybersecurity industry counting more than ever on artificial intelligence.
In my analysis below, I’m relying only on what’s publicly available on Lacework’s website and documentation. Everything that follows is my personal interpretation of how a flawed AI strategy doomed the startup once called “one of this generation’s most important cybersecurity companies.”
Lost in Translation
Lacework’s crown jewel was its Polygraph technology. Applying artificial intelligence across extensive cloud activity logs, Polygraph was described as “the revolutionary way to use your data to automatically find what matters most across your cloud environment.” Out with threat detection rules that define what attacks look like. In with anomaly-based detections that learn what’s expected in the environment and flag the bad guys when they do something out of the ordinary.
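To make that contrast concrete, here is a minimal sketch of the two philosophies. The event fields, API names, and baseline structure are invented for illustration; they are not Lacework’s (or anyone’s) actual implementation.

```python
# Hypothetical contrast between rule-based and anomaly-based detection.
# Event fields and values are illustrative only.

def rule_based_detect(event: dict) -> bool:
    """A signature-style rule: flag a known-bad pattern explicitly."""
    return (
        event.get("api_call") == "CreateAccessKey"
        and event.get("source_ip_country") not in {"US", "CA"}
    )

def anomaly_based_detect(event: dict, baseline: dict) -> bool:
    """An anomaly-style check: flag anything the learned baseline hasn't seen."""
    seen_calls = baseline.get(event.get("user"), set())
    return event.get("api_call") not in seen_calls

# The rule encodes attacker knowledge up front; the anomaly check only
# encodes what "normal" looked like during the training window.
```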
Shifting threat detection from signatures to algorithms was not without precedent. Before Lacework created Polygraph, endpoint vendors like Cylance and CrowdStrike used machine learning to disrupt the incumbents in their industry. Symantec, McAfee, and other traditional antivirus providers relied on signatures that encoded telltale bits of malware for detection. As researchers identified new viruses, signature rules were pushed out to millions of endpoint protection agents worldwide.
The new anti-malware solutions took a different approach, relying on machine learning instead of signatures. CrowdStrike explained the difference in a 2016 post titled “CrowdStrike Machine Learning and VirusTotal.”
Traditional AV engines look for signatures or heuristics, i.e. sequences of specific bytes in the file. A malware author can easily change those detected sequences or add obfuscation layers. In contrast, using machine learning, we look at the broader picture and extract so-called “features” from the files analyzed.
CrowdStrike trained its ML models on millions of known malware samples before it could detect threats as effectively as signature-based antivirus. Once its algorithms were sufficiently trained, updates could be delivered less frequently, and new threats could be identified more reliably. Could Lacework achieve a similar breakthrough against threats in the cloud?
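That supervised recipe, feature vectors plus labels at massive scale, looks something like the toy sketch below. The features, data, and model choice are invented for illustration and are not CrowdStrike’s actual pipeline.

```python
# Toy sketch of supervised malware classification on extracted features.
# Feature values and labels are hypothetical; real pipelines use millions of
# labeled samples and far richer feature sets.
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Each row: features extracted from a file (e.g. entropy, imports, section count)
X = [
    [7.2, 180, 9], [3.1, 25, 4], [6.8, 160, 8], [2.9, 30, 5],
    [7.5, 200, 10], [3.3, 20, 3], [6.9, 150, 7], [2.7, 28, 4],
]
y = [1, 0, 1, 0, 1, 0, 1, 0]  # 1 = malware, 0 = benign (labels from researchers)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
model = GradientBoostingClassifier().fit(X_train, y_train)
print("holdout accuracy:", model.score(X_test, y_test))
```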
Unfortunately for the Polygraph team, there was no equivalent database with millions of cloud attacks that could be fed into an ML model. Researchers have been collecting malware file samples for years, but in the cloud, attacks consist mainly of API calls recorded in log data, spread out over time and rarely packaged neatly for analysis. Lacework’s “no rules” and “high fidelity” algorithms would need to look elsewhere for their training.
Unusual but Not Malicious
The difficulty of cloud threat detection would never stand in the way of enterprises shifting their data centers to AWS, Azure, and GCP. With the great cloud migration underway, Lacework told anxious security leaders that they could “uncover unknown threats like abnormal logins and escalation of privileges with patented Polygraph anomaly-based approach.”
From this description, we can learn where Lacework’s algorithms would find the needed training data: in the customer’s environment.
The polygraph technology dynamically develops a behavioral model of your services and infrastructure. The model understands natural hierarchies including processes, containers, pods, and machines. It then develops behavioral models that the polygraph monitors in search of activities that fall outside the model’s parameters.
In a post from 2017, one of the Lacework cofounders explained how they apply machine learning to turn the customer’s own activity patterns into highly effective threat detections that attackers would struggle to circumvent.
We use unsupervised machine learning to build a baseline for each cloud deployment. We develop exhaustive insights, with information about all entities and their behaviors. Every baseline is as unique as the deployment it protects, making it easy to accurately spot the changes (using supervised machine learning) that always accompany an attack. A successful hacker would 1) have to have an omniscient understanding of your specific cloud deployment and 2) design an attack that perfectly mimicked normal behavior in that deployment. A tough challenge indeed.
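Lacework never published its model internals, but the description above maps roughly onto unsupervised anomaly detection over per-deployment activity features. A minimal sketch of that idea, with invented features and data, and emphatically not Lacework’s actual model:

```python
# Minimal sketch of an unsupervised per-deployment baseline, in the spirit of
# the approach described above. Features and data are invented for illustration.
import numpy as np
from sklearn.ensemble import IsolationForest

# Each row summarizes one entity's hourly activity:
# [distinct API calls, data transferred (MB), new external IPs contacted]
baseline_activity = np.array([
    [12, 40, 0], [15, 55, 1], [11, 38, 0], [14, 60, 1],
    [13, 45, 0], [16, 52, 1], [12, 48, 0], [15, 50, 1],
])

model = IsolationForest(contamination=0.01, random_state=0).fit(baseline_activity)

# New observations: one ordinary, one wildly different from the baseline.
new_activity = np.array([[13, 47, 0], [90, 900, 25]])
print(model.predict(new_activity))  # 1 = fits the baseline, -1 = anomaly
```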
This approach may have worked well in Lacework’s early days, when Polygraph ran in a test environment and at early customers, many of whom were startups. Unfortunately, production cloud environments running at full scale are notoriously busy places. The DevOps movement encourages multiple code releases a day. AWS offers over 200 services, and cloud users can spin up complex clusters in seconds and tear them down just as easily. Lacework’s Polygraph technology had to monitor environments where “unusual” things happened frequently and rarely involved an attack.
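Some back-of-the-envelope math, with invented but plausible numbers, shows why that matters for analyst workload:

```python
# Back-of-the-envelope math with invented numbers: why "unusual" rarely
# means "malicious" in a busy cloud environment.
events_per_day = 10_000_000   # cloud audit events in a large deployment
anomaly_rate = 0.001          # 0.1% of events deviate from the baseline
true_attacks_per_day = 2      # actual malicious events on a bad day

anomalies = events_per_day * anomaly_rate
precision = true_attacks_per_day / anomalies
print(f"{anomalies:,.0f} anomalies/day, ~{precision:.4%} of them malicious")
# Even a very selective anomaly detector buries the two real attacks
# under thousands of benign "unusual" events.
```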
While Lacework promised “high fidelity alert reduction” for the cloud, it never experienced the liftoff that CrowdStrike saw on the endpoint. Instead, the company laid off 20% of its workforce just months after raising $1.3 billion at a whopping $8.3 billion valuation. This wasn’t just a matter of rising interest rates. AI for anomaly detection on cloud activity wasn’t the same magic sauce as AI trained on millions of samples for malware detection.
The “No Rules” Dilemma
Lacework heavily promoted its technology for threat detection with “no rules.” Its official documentation states that “polygraph is the first and only zero touch cloud workload protection platform, which requires no rules, no policies, and no logs for breach detection.” The company released an ebook titled “Cloud Security Automation for Dummies” that equated automation with having “No rules to write or maintain.” But even if the AI worked as intended, was the “no rules” philosophy doomed from the start?
While security teams at early-stage startups might appreciate a cloud security solution with few knobs to turn, established security organizations have detection engineering functions dedicated to striking a balance between noisy alerts and missed attacks. This balance requires customization and fine-tuning. Detection-as-code has seen widespread adoption as a best practice for spotting threats. And detection-as-code, at its core, is all about rules.
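To see why rules still matter, here is a hypothetical detection-as-code rule, not tied to any vendor’s format: explicit logic that lives in version control, with thresholds a detection engineer can tune.

```python
# Hypothetical detection-as-code rule: explicit logic, version-controlled,
# with thresholds a detection engineer can tune to manage alert volume.
from dataclasses import dataclass

@dataclass
class FailedConsoleLoginRule:
    threshold: int = 10          # tune up/down to trade noise for coverage
    window_minutes: int = 5
    allowlisted_users: frozenset = frozenset({"break-glass-admin"})

    def evaluate(self, failures_by_user: dict[str, int]) -> list[str]:
        """Return users exceeding the failed-login threshold in the window."""
        return [
            user for user, count in failures_by_user.items()
            if count >= self.threshold and user not in self.allowlisted_users
        ]

rule = FailedConsoleLoginRule(threshold=20)           # tuned for a noisy environment
print(rule.evaluate({"alice": 3, "svc-deploy": 45}))  # ['svc-deploy']
```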
Likely in response to pressure from up-market customers demanding greater flexibility, Lacework eventually launched the Lacework Query Language (LQL). LQL enabled custom rule creation, along with detection signatures maintained by the Lacework research team. For example, a blog post on Kubernetes threat detection describes how “Lacework has released LQL policies that detect deleted and deprecated API calls for all Kubernetes API versions in this GitHub repository.”
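The actual policies are written in LQL. Purely as an illustration of the underlying idea, and not LQL syntax, such a detection boils down to matching Kubernetes audit-log entries against a list of retired API versions:

```python
# Illustration of the idea behind such a policy (not LQL syntax): flag
# Kubernetes audit-log entries that use deprecated or removed API versions.
# The deprecated list below is a small, illustrative excerpt.
DEPRECATED_API_VERSIONS = {
    "extensions/v1beta1",
    "apps/v1beta1",
    "apps/v1beta2",
}

def find_deprecated_calls(audit_events: list[dict]) -> list[dict]:
    """Return audit events whose requested apiVersion is on the deprecated list."""
    return [
        e for e in audit_events
        if f'{e.get("apiGroup", "")}/{e.get("apiVersion", "")}' in DEPRECATED_API_VERSIONS
    ]

events = [
    {"verb": "create", "apiGroup": "apps", "apiVersion": "v1", "resource": "deployments"},
    {"verb": "create", "apiGroup": "extensions", "apiVersion": "v1beta1", "resource": "ingresses"},
]
print(find_deprecated_calls(events))  # only the extensions/v1beta1 call
```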
The new slogan for the “No Rules” cloud security platform would be “Rules Optional.”
Adding custom rules with the Lacework Query Language doesn't appear to have achieved the necessary balance between automation and flexibility. The company’s Customers page still contains mainly smaller organizations, and the GitHub project to help customers use LQL for threat hunting hasn’t been updated in over two years. The “no rules” approach to cloud threat detection wasn’t suited for the medium and large enterprises that cybersecurity companies depend on for success in the long term.
AI Lessons Learned
Cybersecurity entrepreneurs, investors, and practitioners can take away several important lessons from Lacework’s story. As artificial intelligence technology hits peak hype, how can you evaluate the limits of an AI strategy for threat detection use cases?
Training data matters: What training datasets were used, and were they large enough to support the inferences the models will be expected to make?
The black-box/flexibility tradeoff: Does the solution achieve the necessary balance between automation and customization?
Independent validation: What tests could we run to expose model issues like overfitting or knowledge gaps? (See the sketch after this list.)
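As a simplified example of that third point, one basic independent check is to compare a model’s performance on data it was trained on against data it has never seen; a large gap is a classic overfitting signal. A sketch with placeholder data and a placeholder model:

```python
# Simplified sketch of independent validation: compare a model's performance
# on data it was trained on versus data it has never seen. A large gap is a
# classic overfitting signal. Model and data are placeholders.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))                               # stand-in telemetry features
y = (X[:, 0] + 0.5 * rng.normal(size=500) > 0).astype(int)   # noisy stand-in label

X_train, X_holdout, y_train, y_holdout = train_test_split(X, y, test_size=0.3, random_state=0)
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)  # deliberately overfit-prone

print("in-sample accuracy :", model.score(X_train, y_train))      # close to 1.0
print("holdout accuracy   :", model.score(X_holdout, y_holdout))  # noticeably lower
```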
You can use these considerations as a framework for emerging AI use cases like security copilots. Ask how the copilot’s models were trained and how much actual analyst activity was used during training. Find out what feedback mechanisms exist to tune the copilot and whether customization “as code” options exist for integration with the organization’s triage workflows. Finally, validate independently by asking it to do some basic and not-so-basic analyst work to map out the edges of the copilot’s abilities.
Shifting hard, tedious work from humans to machines has been the story of humanity from the wheel to sliced bread. Artificial intelligence is an abstract and opaque technology, and cybersecurity presents it with a fog of war where effectiveness is especially hard to judge. Some of Silicon Valley’s sharpest minds missed the flaws in Lacework’s application of machine learning to cloud security. When a lesson costs over a billion dollars, we should pay attention.
Great points @Omer.
We should not think about AI for detection without understanding the needs of detection engineers in terms of explainability and accuracy.
I covered these topics with Max (CPO @Darktrace) last January [1] [2].
Good old signature-based alerts have a virtue: they are easy to understand, and detection engineers can tweak them if they generate too many false positives. AI- or ML-based algorithms, especially unsupervised ones, generate anomalies and often require a long and painful "doubt removal" process within the SecOps team...
AI for cyber detection is not a "solved problem".
Would love to continue the conversation!
[1] https://cyberbuilders.substack.com/p/ai-and-cyber-from-a-detection-engineers
[2] https://cyberbuilders.substack.com/p/decoding-ai-in-cybersecurity-navigating