8 SIEM Cost Factors You Need to Know
Navigate the complexity of SIEM by breaking down what's costing you money
Misalignment between your environment and your SIEM’s cost model can break your security operation. Cost models pressure data collection, analysis, and retention, impacting key SOC metrics like the miss rate, false positive rate, and response time. As a security leader, one of the best ways you can help your security operations team is to understand all the ways in which the deployed SIEM impacts your budget and then use that knowledge to reduce constraints on the SOC.
Overview
The cost model defines how your SIEM draws from the cybersecurity budget across the security operations lifecycle. The move to the cloud added significant complexity for larger SOCs, making it especially important to enumerate the ways in which you incur hard costs. Your SIEM cost model should include all the factors listed below. You should even track factors that don’t apply to you today, recording them as zero cost and using the complete list for comparing alternative approaches.
Data collection
Hot retention
Cold retention
Detection processing
Investigation processing
Archive processing
Cloud egress
SIEM solution
Each cost factor will have a way in which the provider meters it and a cost level. One solution can be more cost-effective for a given factor by metering differently (e.g. charging by feature instead of by data volume) or by having a lower cost level (e.g. $20 per TB instead of $50 per TB). Let’s review what goes into each cost factor and consider illustrative examples.
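One way to keep the complete list in front of you is to record every factor in a single structure, with zero for anything that does not apply today. The sketch below is a minimal illustration, not a prescribed tool: the class name, field names, and all dollar figures are hypothetical, and every value is assumed to be an annual cost in USD.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SiemCostModel:
    """Annual cost in USD per factor; 0.0 means the factor does not apply today."""
    data_collection: float = 0.0
    hot_retention: float = 0.0
    cold_retention: float = 0.0
    detection_processing: float = 0.0
    investigation_processing: float = 0.0
    archive_processing: float = 0.0
    cloud_egress: float = 0.0
    siem_solution: float = 0.0

    def total(self) -> float:
        # Sum all eight factors, including the ones recorded as zero.
        return sum(getattr(self, f.name) for f in fields(self))


def compare(a: SiemCostModel, b: SiemCostModel) -> dict:
    """Per-factor difference (b minus a); negative means b is cheaper."""
    return {f.name: getattr(b, f.name) - getattr(a, f.name) for f in fields(a)}


# Hypothetical figures for illustration only.
current = SiemCostModel(
    data_collection=900_000, hot_retention=200_000, siem_solution=150_000
)
alternative = SiemCostModel(
    data_collection=300_000,
    hot_retention=120_000,
    detection_processing=80_000,
    investigation_processing=60_000,
    cloud_egress=50_000,
    siem_solution=150_000,
)

for factor, delta in compare(current, alternative).items():
    print(f"{factor:26s} {delta:>12,.0f}")
print(f"{'total':26s} {alternative.total() - current.total():>12,.0f}")
```

Because both options carry all eight fields, a factor that is free in one solution but metered in the other shows up as a line in the comparison instead of being silently left out.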
Cost Factor: Data Collection
What you spend on extracting source data and loading it to your data platform.
Event logs and other security records must be copied from their origin across your network, cloud accounts, or point solution APIs. The initial leg of the data journey is sometimes performed on SIEM infrastructure, such as Splunk forwarders, and in other cases on separate tooling with its own metering, such as Cribl Stream.
Data collection costs are typically measured in terms of annual spend by daily volume (e.g. $900k annual cost for 1.5 TB/d) but some solutions price by ingest amount (e.g. $0.10 per TB). Some simple multiplication can get you to an annualized “TB/d” figure either way. Don’t forget to include costs associated with related services such as Logstash virtual machines, AWS-managed Kafka, or Snowflake’s Snowpipe service.
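The multiplication mentioned above can be sketched as follows. This is a minimal example under stated assumptions: the function names are hypothetical, a year is taken as 365 days, and the $100 per TB ingest price is an illustrative figure rather than any vendor's rate. The $900k for 1.5 TB/d figure comes from the example in the text.

```python
DAYS_PER_YEAR = 365  # assumption: no leap-year adjustment


def annual_cost_from_ingest_price(price_per_tb: float, daily_volume_tb: float) -> float:
    """Annualize a per-TB ingest price for a given daily volume."""
    return price_per_tb * daily_volume_tb * DAYS_PER_YEAR


def annual_cost_per_tb_per_day(annual_cost: float, daily_volume_tb: float) -> float:
    """Normalize an annual spend to 'USD per year for each TB/d collected'."""
    return annual_cost / daily_volume_tb


# Solution A is quoted as annual spend by daily volume (figures from the text).
a_per_tbd = annual_cost_per_tb_per_day(900_000, 1.5)

# Solution B is quoted per TB ingested (hypothetical price).
b_annual = annual_cost_from_ingest_price(price_per_tb=100, daily_volume_tb=1.5)
b_per_tbd = annual_cost_per_tb_per_day(b_annual, 1.5)

print(f"A: ${a_per_tbd:,.0f} per TB/d per year")
print(f"B: ${b_per_tbd:,.0f} per TB/d per year")
```

Remember to add the annual cost of any related services, such as collection virtual machines or managed queues, before normalizing.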
Cost Factor: Hot Retention
What you spend on storing your security data for active use.
Hot retention refers to the storage of data that is frequently accessed and used for immediate analysis in a SIEM environment. This type of data storage is essential for real-time threat detection and rapid response in security operations. The expectation is that hot data can be used without rehydration processes, which introduce delays, management overhead, and analytics constraints.
Hot retention costs should be captured in a way that lines up with data collection and accounts for the retention period, for example, $200k annual cost for 7 TB/d retained for 90 days. So while individual records may be retained for only 90 days in hot availability, the cost is represented in annual terms. As the retention period can have a dramatic impact on the effectiveness of threat detection and response, you don’t want to lose this aspect when comparing solutions.
Some organizations have a security policy that mandates high availability (HA) for the SIEM. Consider the level of HA that you require, and if multiple copies of the data will be needed as a result.
Bear in mind that storage costs may depend on how the data is compressed. While data collection figures track how much uncompressed data is generated by the environment, storage costs for some solutions are calculated by compressed data volume. When that’s the case, your annual storage cost estimate should factor in the daily volume of data collected, the compression ratio, and the retention period.
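The storage estimate described above can be sketched as a single function. The assumptions are worth stating loudly: storage is billed per compressed TB per month, the environment is in steady state (a full retention window is always on disk), and the compression ratio, monthly price, and function name are all hypothetical. The `copies` parameter is one simple way to account for an HA requirement.

```python
def annual_storage_cost(
    daily_volume_tb: float,
    retention_days: int,
    compression_ratio: float,
    price_per_tb_month: float,
    copies: int = 1,
) -> float:
    """Steady-state annual storage cost in USD.

    daily_volume_tb is the uncompressed volume collected per day;
    compression_ratio is uncompressed size divided by compressed size.
    """
    if compression_ratio <= 0:
        raise ValueError("compression_ratio must be positive")
    # Compressed data held at any moment once the retention window is full.
    stored_tb = daily_volume_tb * retention_days / compression_ratio * copies
    return stored_tb * price_per_tb_month * 12


# 7 TB/d for 90 days is the example from the text; the 5:1 compression
# ratio and the $23 per TB-month price are hypothetical.
single_copy = annual_storage_cost(7, 90, compression_ratio=5, price_per_tb_month=23)
two_copies = annual_storage_cost(7, 90, compression_ratio=5, price_per_tb_month=23, copies=2)
print(f"single copy: ${single_copy:,.0f}  two copies: ${two_copies:,.0f}")
```

The same function works for cold retention by substituting the archive tier's price and retention period.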
Cost Factor: Cold Retention
What you spend on storing security data that is not fully available for active use.
Many security operations teams rely on cold storage archives to meet compliance requirements within budget constraints. This introduces challenges when archived data is needed for investigations, IOC sweeps, threat hunting, data science or metrics. But archiving is often necessary for technologies with expensive hot storage or for organizations with multi-year regulatory retention requirements.
Storage costs for cold retention are calculated the same way as for hot retention. The additional costs involved with moving data to and from the archive are factored separately in “Archive Processing” below.
Cost Factor: Detection Processing
What you spend on automated analytics for threat detection.
The security operations team relies on detection rules to automatically identify events of interest in the environment. Running these detection rules requires computational power (“compute” for short) to analyze collected data against logic or algorithms designed to spot attacks.
In the cloud, compute can take many forms. Some services charge by “bytes scanned”, some use abstract CPU cycles, others use query time. The cost model for compute affects both the level and the predictability of detection processing costs. While compute costs may have been a secondary concern in the on-prem days of fixed hardware and tightly coupled storage/compute, the game is very different in the cloud.
Costs should be estimated based on the expected quantity and frequency of detection rules in production. Developing heuristics that account for ingest volume, data complexity, rule quantity and analytics complexity is a significant challenge. Machine learning models that continually train on new data also play a role in this cost factor. Security operations teams should work with their SIEM vendor to review the heuristics provided for estimating compute requirements. These should reflect extensive experience with similar organizations and help to avoid resource exhaustion or surprise overages.
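To make the relationship between rule quantity, rule frequency, and cost concrete, here is a rough sketch for the "bytes scanned" style of metering only. It is not a vendor heuristic: the class, the function, the per-rule scan volumes, and the $5 per TB scanned price are all hypothetical, and other metering schemes (CPU cycles, query time) would need a different formula.

```python
from dataclasses import dataclass


@dataclass
class DetectionRule:
    name: str
    runs_per_day: int          # how often the rule is scheduled
    tb_scanned_per_run: float  # data the rule reads on each run


def annual_detection_cost(rules, price_per_tb_scanned: float) -> float:
    """Annual cost in USD when compute is metered by data scanned."""
    daily_tb = sum(r.runs_per_day * r.tb_scanned_per_run for r in rules)
    return daily_tb * 365 * price_per_tb_scanned


# Hypothetical rule set: one narrow rule every 15 minutes, one broad hourly rule.
rules = [
    DetectionRule("narrow_lookback", runs_per_day=96, tb_scanned_per_run=0.05),
    DetectionRule("broad_lookback", runs_per_day=24, tb_scanned_per_run=0.5),
]
detection_cost = annual_detection_cost(rules, price_per_tb_scanned=5.0)
print(f"estimated annual detection cost: ${detection_cost:,.0f}")
```

Even in this toy case the hourly rule with the wide scan dominates the bill, which is the kind of pattern a vendor's sizing heuristics should help you spot before production.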
Cost Factor: Investigation Processing
What you spend to search during triage, incident response and threat hunting.
Like detections, investigations require compute power to crunch the collected data. Some solutions use different engines (e.g. stream vs batch processing) for detections and investigations, while others use the same engine for both. Getting to at least a basic understanding of how the query engine works will help the security organization to estimate how much its investigations will cost.
Consumption-based pricing is common in cloud environments, and security operations teams increasingly need to predict costs based on future consumption. But cybersecurity is intrinsically unpredictable, and any SOC can go through quiet periods, fire drills, and full-blown breach response. As with detection processing, planners should work with their vendors to estimate costs based on heuristics developed over time at organizations with similar profiles. Ample buffers and the natural tendency of things to “even out” will help predictions to stay within the target range throughout the year.
An additional benefit of planning with cost factors is that you can set up monitoring and guardrails to avoid surprises and correct problematic trends early.
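A guardrail of this kind can be as simple as projecting year-to-date spend for a cost factor and comparing it with the annual budget. The sketch below assumes a naive linear projection, which will overreact after a busy incident and underreact after a quiet stretch; the function names, the 90% warning threshold, and the figures are hypothetical.

```python
def projected_annual_spend(spend_to_date: float, day_of_year: int, days_in_year: int = 365) -> float:
    """Linear run-rate projection of spend for the full year."""
    if day_of_year < 1:
        raise ValueError("day_of_year must be at least 1")
    return spend_to_date / day_of_year * days_in_year


def guardrail_status(
    spend_to_date: float,
    day_of_year: int,
    annual_budget: float,
    warn_ratio: float = 0.9,
) -> str:
    """Return 'over', 'warn', or 'ok' for one cost factor."""
    projected = projected_annual_spend(spend_to_date, day_of_year)
    if projected > annual_budget:
        return "over"
    if projected > annual_budget * warn_ratio:
        return "warn"
    return "ok"


# Hypothetical: $100k annual budget for investigation processing, checked on day 90.
for spend in (20_000, 23_000, 30_000):
    print(spend, guardrail_status(spend, day_of_year=90, annual_budget=100_000))
```

Running the check per cost factor, rather than on the total bill, points you at the factor that is drifting while there is still time to correct it.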
Finally, a word of caution on solutions that don’t have a variable cost for investigation processing. There’s no free lunch, so watch for hidden limits that the vendor has put in place to cap its downside costs. Often such solutions will limit search time windows, performance, or the kinds of supported analytics.
Cost Factor: Archive Processing
What you spend to make data from cold storage searchable again.
Archive processing involves the retrieval and rehydration of old logs from cold storage back into the SIEM system when needed. This process is crucial for retrospective investigations and threat hunting across time periods that extend beyond the hot retention window.
Retrieving data from cold storage typically incurs costs, as this data is often stored in a less accessible format to save on storage expenses. The cost depends on the volume of data being retrieved and the frequency of such retrievals.
Rehydration refers to the process of making archived logs usable again by the SIEM system, often involving re-indexing or transforming the data into a format suitable for analysis. This process can be resource-intensive, especially for large volumes of data, thus impacting the associated costs.
Some solutions impose constraints on rehydration including minimum data volume and time limits on restored data. Read the fine print to ensure that cost estimates take these restrictions into account.
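The effect of these constraints on an estimate can be illustrated with a short sketch. It assumes a flat per-TB rehydration price, a minimum billable volume, and that restored data expires after a fixed window and must be restored again at full price if the investigation runs longer. Those terms, the function name, and every figure are hypothetical; your vendor's fine print will differ.

```python
import math


def rehydration_cost(
    requested_tb: float,
    price_per_tb: float,
    minimum_billable_tb: float = 0.0,
    days_needed: int = 0,
    restore_window_days=None,
) -> float:
    """Estimated cost in USD to keep archived data searchable for days_needed days."""
    # A minimum-volume clause means small restores are billed as larger ones.
    billable_tb = max(requested_tb, minimum_billable_tb)
    restores = 1
    if restore_window_days:
        # Assumption: each expiry forces a full re-restore of the same data.
        restores = max(1, math.ceil(days_needed / restore_window_days))
    return billable_tb * price_per_tb * restores


naive = rehydration_cost(requested_tb=0.5, price_per_tb=100)
with_fine_print = rehydration_cost(
    requested_tb=0.5,
    price_per_tb=100,
    minimum_billable_tb=2,
    days_needed=20,
    restore_window_days=7,
)
print(f"naive: ${naive:,.0f}  with constraints: ${with_fine_print:,.0f}")
```

In this made-up case the constraints turn a $50 estimate into $600, which is why they belong in the cost model and not in a footnote.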
Cost Factor: Cloud Egress
What you spend on moving data between clouds and regions.
Cloud egress refers to the charges incurred when data leaves one cloud provider or region on its way to the SIEM solution. This factor is significant for organizations that use cloud services across different providers and geographical locations.
This cost factor is especially significant for organizations that depend on a SIEM solution that is only available in one cloud, for example Chronicle in GCP, Sentinel in Azure, or Securonix in AWS. If the company has a significant footprint in a different cloud, shipping activity logs to the SIEM will drive up egress charges. To get a ballpark sense of these costs, shipping 2 TB/d of CloudTrail from AWS to GCP would run upwards of $50k a year.
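The arithmetic behind a ballpark figure like the one above can be sketched in a few lines. The per-GB rate here is an assumed blended rate chosen for illustration, not a quoted price; real egress pricing is typically tiered by monthly volume and varies by provider and region, so substitute your own rate. The function name is hypothetical and 1 TB is taken as 1,000 GB.

```python
GB_PER_TB = 1000     # assumption: decimal terabytes
DAYS_PER_YEAR = 365


def annual_egress_cost(daily_volume_tb: float, price_per_gb: float) -> float:
    """Annual egress cost in USD at a flat per-GB rate."""
    return daily_volume_tb * GB_PER_TB * DAYS_PER_YEAR * price_per_gb


# 2 TB/d is the volume from the text; $0.07 per GB is an assumed blended rate.
egress = annual_egress_cost(daily_volume_tb=2, price_per_gb=0.07)
print(f"estimated annual egress: ${egress:,.0f}")
```

If logs can be compressed or filtered before they cross the cloud boundary, apply that reduction to `daily_volume_tb` first, since egress is billed on the bytes that actually leave.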
Cost Factor: SIEM Solution
In many cases, there is an additional cost for the SIEM solution that operationalizes the data platform for the SOC. Splunk offers purpose-built capabilities and content in its Enterprise Security product. Anvilogic provides a SIEM layer for multiple data platforms, including Splunk and Snowflake.
Most SIEM solutions operate on a licensing model, which can be based on factors like data volume, number of users, or features used.
Conclusion
There is a direct line between how a SIEM charges and how well it performs. A security operations team cannot succeed if it can’t collect the data it needs, keep it for as long as necessary, or apply the required analytics power. When evaluating options to replace or augment an existing SIEM solution, examine each cost factor to ensure that it lines up with your environmental and operational requirements. For more information, see my post on Snowflake cost factors for SIEM use cases.