The Two-Headed SIEM Monster
Industry trends point to multiple SIEMs becoming a wider problem for security operations
The government defines SIEM as “a single system to improve the detection and remediation of security issues,” but what happens when you have more than one? The SIEM’s role as the place where security events get centralized has eroded as security data volumes exploded while analytics requirements for large SOCs became more sophisticated. Now, emerging trends are transforming the problem from too many silos to too many SIEMs. This spells trouble for security operations, where two heads are not at all better than one.
Endpoint Vendors Stock Up on SIEM
Let’s unpack the trends pushing SOCs to rely on multiple SIEMs. We’ll see that these powerful currents will likely shape security operations for years to come.
First, what do all the Gartner EDR MQ leaders have in common? Under the banner of XDR, they’ve all started selling SIEM.
CrowdStrike says we’ve got “Next-Gen SIEM and Log Management”
Microsoft will sell you “An easy and powerful SIEM solution”
Palo Alto didn’t like that others copied “XDR,” so they called their SIEM by a different name and have written that “SIEM solutions will adapt and evolve”
And the same goes for the rest. None of these vendors are content with selling endpoint agents that feed into Splunk or other dedicated SIEM platforms.
Splunk’s challenges have likely played a role in this industry trend. The recently Ciscoed company controls more of the security event management space than all the endpoint vendors combined. But its estimated $2 billion in annual security revenue is seen by many as up for grabs, waiting to become a line item in a bill from the vendor that already runs on thousands of your endpoints.
Technological advances have also played a role, as the groundbreaking search performance that previously drew SOC teams to Splunk and Elastic has aged. CrowdStrike paid $400M for Humio, while SentinelOne bought Scalyr for less than $200M. Palo Alto rents big data technology from Google, while Trellix went with Snowflake. As a result, these crossover challengers are able to claim capabilities comparable to those of pure-play incumbents.
I say “claim” because a healthy dose of skepticism is warranted. Humio had not seen significant enterprise adoption before CrowdStrike stuck its logo on the tin. Splunk is infinitely more proven at large-scale analytics for hunting and threat detection. I believe endpoint vendors hope their search platform is “good enough” to replace SIEM beyond the SME market. Given that the EDR vendors have thousands of agents deployed across every enterprise, they’ll likely get a shot to prove it.
Your Cloud Comes With SIEM
Since every large business and government agency now has a public cloud footprint, another significant force towards multiple SIEMs is that each major CSP has its own SIEM offering. AWS is still in the early days with its Security Lake offering, but Azure and Google are serious about theirs.
Both have poured billions into related acquisitions and engineering. Microsoft reports that thousands of customers use Sentinel and offers significant subsidies for keeping Azure and Office365 event logs on the platform. As a result, many organizations have chosen to use the CSP’s built-in SIEM, at least as a stopgap measure for analyzing that platform’s logs.
When a SIEM is Just a Side-SIEM
Several factors will keep these new SIEMs from replacing King Splunk at large enterprises. If several of these are present, the XDR or cloud SIEM would take an additive position as a side-SIEM.
Data also used by other teams: Splunk is often used by teams outside of cybersecurity, including IT and DevOps. Much of the data collected in Splunk may also be used by these teams, which have their own requirements to support observability use cases.
Other active security use cases: Splunk may be used for security use cases beyond security operations, such as vulnerability management, fraud detection, and regulatory compliance.
Analytics requirements: Enterprise detection teams perform significant analytics, including correlation, anomaly detection, and behavior modeling. Underlying search technology with limited joining abilities and no data science support will not provide the parity needed for a complete replacement.
Reporting requirements: Splunk dashboards can be pretty epic, and many security leaders rely on them for metrics and trend charts.
Cloud egress: A SIEM that’s only offered in one cloud may impose prohibitive egress costs and compliance issues for data shipped out from other clouds used by the enterprise.
Integration requirements: Hybrid environments and enterprises with infrastructure deployed over decades have a “long tail” of integration requirements. Switching from Splunk would require significant log collection and normalization development.
Content requirements: Solutions provided as part of a highly opinionated suite (such as Palo Alto Networks) tend to be optimized for their products. Third-party EDR or CNAPP may receive little or no pre-built content. The effort involved with building and maintaining effective rules and correlations would fall on the individual security team.
Beyond these reasons that Splunk will likely stick around, there’s also the soft side of cyber. Many security analysts are deeply invested in SPL expertise and certifications that would become obsolete if their Splunk were decommissioned.
The Impact on Security Operations
The problems caused by working across multiple SIEM solutions have not received much attention. Network security provider Corelight, which generates oodles of data and knows this problem well, has observed that “defenders have been deploying a secondary SIEM” in a post titled “One SIEM is not enough?” Anton Chuvakin warned in “Living With Multiple SIEMs” about the “Complexity and hence fragility of the multi-system setup (due to both data flow integration needs and detection content organization). Complexity kills security.” But the trends described above warrant a deeper analysis of the dangers.
First, it’s harder to detect threats across multiple SIEMs. Attackers don’t limit themselves to one part of the environment; they strike through email to land on the endpoint and pivot to the cloud control plane, from which they extract data through the network. A sensor limited to just one of these areas will not reliably identify the attack as it unfolds.
The defenders must also contend with hundreds of detections managed in separate repositories. Detection engineering at scale requires a lifecycle with version control, testing, and repeatable processes to ensure that detections work as expected. This is hard enough within a single platform and is unlikely to succeed across multiple sets of detection rules.
Consider what happens when a new zero-day vulnerability hits the headlines. Detection engineers scramble to ensure that exploitation attempts are identified, contained, and mitigated immediately. Comparing coverage from the various SIEM providers, for example, one in Azure and one in AWS, would require the team to manually confirm that both have covered the relevant services and TTPs.
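The manual confirmation described above amounts to a set comparison repeated for every SIEM. Here is a minimal sketch in Python, assuming each team can export the technique IDs its rules cover. The SIEM names, the inventory, and the `coverage_gaps` helper are hypothetical, and the MITRE ATT&CK IDs are used only as illustrative labels.

```python
from typing import Dict, Set


def coverage_gaps(coverage: Dict[str, Set[str]], required: Set[str]) -> Dict[str, Set[str]]:
    """Return, per SIEM, the required technique IDs that have no detection rule."""
    return {siem: required - covered for siem, covered in coverage.items()}


# Hypothetical inventory: technique IDs covered by the rules in each SIEM.
coverage = {
    "siem_azure": {"T1190", "T1059", "T1078"},
    "siem_aws": {"T1190", "T1078"},
}

# Techniques the team associates with the new vulnerability (illustrative).
required = {"T1190", "T1059"}

gaps = coverage_gaps(coverage, required)
for siem, missing in sorted(gaps.items()):
    print(siem, "missing:", sorted(missing) or "nothing")
```

Even this simple check presumes that both rule sets are tagged with the same technique vocabulary, which is itself extra work when the rules live in separate repositories.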
Consider that each detection and response solution may use a different language with its own proprietary rule syntax and format. Teams may be tempted to divide responsibility between the analysts, with some specializing in SIEM A and some in SIEM B. Rule development would progress in separate tracks, with coverage progressing piecemeal. At scale, writing good queries is vital to optimize performance, and this expertise is hard to develop across multiple solutions.
There’s also the challenge of ensuring data quality. Without one central SIEM, the security operation takes on substantial complexity to ensure that the expected datasets arrive consistently. This is also true for maintaining data models, as changes to upstream data must be applied in different locations. A unified pipeline, not exclusive to one of the SIEMs, helps here, but data quality can still be affected by varying retention policies and modifications in the different systems.
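One way to picture the extra monitoring burden is a freshness check that has to be repeated for every SIEM. The sketch below assumes each platform can report when it last received each dataset. The dataset names, the one-hour threshold, and the `stale_datasets` helper are illustrative assumptions, not features of any product.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Set, Tuple


def stale_datasets(
    last_seen: Dict[Tuple[str, str], datetime],
    expected: Dict[str, Set[str]],
    now: datetime,
    max_lag: timedelta,
) -> List[Tuple[str, str]]:
    """Return (siem, dataset) pairs that are missing or older than max_lag."""
    problems = []
    for siem, datasets in expected.items():
        for dataset in sorted(datasets):
            seen = last_seen.get((siem, dataset))
            if seen is None or now - seen > max_lag:
                problems.append((siem, dataset))
    return problems


# Hypothetical state: which datasets each SIEM should hold, and when each last arrived.
now = datetime(2024, 1, 5, 12, 0, tzinfo=timezone.utc)
expected = {"siem_a": {"edr", "dns"}, "siem_b": {"cloud_audit"}}
last_seen = {
    ("siem_a", "edr"): now - timedelta(minutes=5),
    ("siem_a", "dns"): now - timedelta(hours=3),
    # ("siem_b", "cloud_audit") never arrived.
}

problems = stale_datasets(last_seen, expected, now, max_lag=timedelta(hours=1))
print(problems)
```

The check itself is trivial. The cost lies in collecting `last_seen` from each system through its own API and keeping `expected` accurate as sources are moved between platforms.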
Incident response is slower and less effective in a spinning swivel-chair, with investigations performed across separate search consoles. Each set of results may represent a portion of the attack and must be manually reconciled into a unified timeline. This becomes exceedingly difficult when an attack spans significant time and assets.
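The reconciliation step can be sketched as a merge of per-console result sets into one ordered list. This assumes that results from each SIEM have already been exported and mapped to a shared shape with UTC timestamps, which is usually the hard part. The `Event` fields and the sample events are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, List


@dataclass(frozen=True)
class Event:
    timestamp: datetime  # assumed to be timezone-aware UTC
    source: str          # which SIEM produced the result
    asset: str
    summary: str


def unified_timeline(*result_sets: Iterable[Event]) -> List[Event]:
    """Combine results from several consoles into one list ordered by time."""
    merged = [event for results in result_sets for event in results]
    return sorted(merged, key=lambda event: event.timestamp)


def utc(hour: int, minute: int) -> datetime:
    """Build an illustrative timestamp on a fixed day."""
    return datetime(2024, 1, 5, hour, minute, tzinfo=timezone.utc)


# Hypothetical search results from two separate SIEM consoles.
siem_a_hits = [
    Event(utc(9, 30), "siem_a", "mail-gw", "phishing email delivered"),
    Event(utc(11, 2), "siem_a", "cloud-api", "new access key created"),
]
siem_b_hits = [
    Event(utc(9, 41), "siem_b", "laptop-17", "suspicious child process"),
]

timeline = unified_timeline(siem_a_hits, siem_b_hits)
for event in timeline:
    print(event.timestamp.isoformat(), event.source, event.asset, event.summary)
```

Sorting is the easy part. Clock skew, differing time zones, and inconsistent asset naming between systems are what make the manual version of this slow.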
Finally, the all-important goal of automation takes a drastic hit. Each SIEM requires separate integration through a distinct set of APIs running against a unique data model. Playbooks developed for one set of alerts would need to be built from scratch for a parallel environment monitored in a separate SIEM. Attempts to leverage data science and machine learning for automating the routine tasks that grind down the SOC might be supported in one SIEM or another but would not be portable across systems.
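To make the duplication concrete, here is a minimal sketch of the adapter layer a team would need before a single playbook could act on alerts from two SIEMs. Every field name, the priority mapping, and the isolation rule are invented for illustration and do not describe any vendor's actual alert format.

```python
from typing import Any, Callable, Dict, List

Alert = Dict[str, Any]

# Hypothetical mapping from SIEM B's numeric priority to a shared severity label.
SEVERITY_BY_PRIORITY = {1: "critical", 2: "high", 3: "medium", 4: "low"}


def from_siem_a(raw: Alert) -> Alert:
    """Map SIEM A's (invented) flat alert format to the shared schema."""
    return {"host": raw["hostname"], "severity": raw["sev"].lower(), "rule": raw["rule_name"]}


def from_siem_b(raw: Alert) -> Alert:
    """Map SIEM B's (invented) nested alert format to the shared schema."""
    return {
        "host": raw["device"]["name"],
        "severity": SEVERITY_BY_PRIORITY[raw["priority"]],
        "rule": raw["detection"],
    }


ADAPTERS: Dict[str, Callable[[Alert], Alert]] = {
    "siem_a": from_siem_a,
    "siem_b": from_siem_b,
}


def normalize(source: str, raw: Alert) -> Alert:
    """Convert a raw alert from the named SIEM into the shared schema."""
    return ADAPTERS[source](raw)


def should_isolate(alert: Alert) -> bool:
    """A single playbook decision written once against the shared schema."""
    return alert["severity"] in {"high", "critical"}


raw_a = {"hostname": "laptop-17", "sev": "HIGH", "rule_name": "Suspicious PowerShell"}
raw_b = {"device": {"name": "web-03"}, "priority": 3, "detection": "Unusual outbound transfer"}

alerts: List[Alert] = [normalize("siem_a", raw_a), normalize("siem_b", raw_b)]
to_isolate = [alert["host"] for alert in alerts if should_isolate(alert)]
print(to_isolate)
```

Each additional SIEM adds another adapter to write, test, and keep in step with upstream schema changes, on top of the separate API integration needed to fetch the alerts in the first place.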
Conclusion
Security operations leaders should be mindful of the downsides of using multiple SIEMs. While eager vendors may downplay the difficulties introduced with their side-SIEM, buyers should take time during planning to connect with similar organizations that have deployed the same set of multiple SIEMs. Key concerns to discuss include detection fidelity, rule management complexity, data quality, investigation effort, and automation success. While industry trends have made adopting multiple detection and response platforms increasingly tempting, potential cost savings must be weighed against the risk and complexity inherent in the “two-headed” approach.