To Address Security Data Challenges, Decouple Your Data
With so much data and so many security tools, it’s no wonder that organizations are decentralizing the data in their SIEM solutions.
- By Paul Kivikink
- January 24, 2024
A security information and event management (SIEM) solution is the centerpiece of many organizations’ security operations centers (SOCs) -- at least, for those that haven’t recently updated their data transformation strategies. It’s the central tool that SOCs use to answer questions about threats and their security posture.
However, although SIEM solutions may have sufficed in the past, today’s organizations are grappling with so much data and so many different security tools that this established approach to collecting and correlating security data to deliver meaningful insights can’t keep up.
Increasingly, organizations are interested in decoupling or decentralizing the data in their SIEM tool. In fact, Gartner analysts predicted last year that data decentralization would be a key trend for SIEM, enabling “more cost-effective deployments, with more up-to-date data.”
What does this really mean and why does it matter?
Exploring SIEM Data Challenges
Data completeness. One reason that SIEM solutions are no longer enough for today’s threat landscape is that they make it difficult to determine whether you have the right data. You may have blind spots because older data has aged out (more on that next), or your SIEM solution may not be able to receive or understand log data from all your organization’s log-generating sources. You need complete data to determine whether you’ve experienced a security incident and whether it affected your organization. In addition, some common high-volume log sources are valuable for detection and visibility but are typically too large (and costly) to store adequately in a SIEM solution.
Data retention. There’s a finite retention period for data in a legacy SIEM solution. Once that retention period is over, access to the data is lost. Typically, organizations store months of data in the SIEM solution; if you’re lucky, you have a year of data. If data has been moved to cold long-term storage, significant time and effort are required to reload that data into a searchable platform.
Data costs. The cost of SIEM solutions continues to rise. Vendors charge by ingestion or computing power -- either way, the cost comes down to data volume. If you’re charged based on computing resources, you want to ensure the ingestion pipeline limits the processing needed and the amount of data that queries and reports run against. If you’re charged based on consumption, this can prompt you to avoid storing high volumes of security event data in the SIEM solution. The SIEM model forces customers to optimize data storage and throughput costs instead of focusing on security analytics and valuable outcomes.
Data silos. Data silos have been a known challenge in cybersecurity for a long time. Within the SIEM solution and outside it, many security tools are generating valuable data, but that data is produced in isolation. It’s difficult in modern enterprises to combine all security data into a meaningful and unified view.
Another silo exists between security data and other enterprise data. Are you looking at security data in a vacuum, or are you looking at it in the context of the larger business, alongside business data? By enriching security data with other enterprise data, organizations can gain even deeper insights into their security, risk, and compliance posture.
Slow and limited. Threat hunters struggle with their own data challenges. Complex or compute-intensive threat hunting queries can result in long response times. This isn’t an exaggeration: security teams have started queries, gone out to lunch, and hoped the results would be complete by the time they returned. Another common challenge is that threat hunters typically lack control over what security event data goes into a SIEM solution and what data ages out. They also often spend significant time making a data set usable, because many security tools generate data in their own formats.
Time to Decouple
One approach to navigating these challenges while still reaping the benefits a SIEM solution can offer is to decouple the data from the SIEM system. The SIEM tool aggregates data from many sources, so decoupling the data is essentially a matter of separating the data platform from SIEM capabilities.
Why is this a good thing? It can ultimately help you gain a holistic perspective of all the security tools you have in your organization to ensure you’re leveraging the intrinsic value of each one.
Most organizations have dozens of security tools, if not more, but most lack a solid understanding or mapping of what data should go into the SIEM solution, what should come out, and what data is used for security analytics, compliance, or reporting. As data becomes more complex, extracting value and aggregating insights become more difficult.
When you decide to decouple the data from the SIEM system, you have an opportunity to evaluate your data. As you move towards an integrated data layer where disparate data is consolidated, you can clean, deduplicate, and enrich it. Then you have the chance to merge that data not only with other security data but with enterprise IT and business data, too.
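As a rough illustration of the deduplicate-and-enrich step described above, here is a minimal Python sketch. The `Event` fields, the sample records, and the `hr_directory` lookup are all hypothetical and stand in for whatever your tools and HR system actually provide; a real pipeline would need a more careful definition of what counts as a duplicate.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Event:
    """A simplified security event (hypothetical fields)."""
    source: str
    user: str
    action: str
    timestamp: str  # ISO 8601 string
    department: Optional[str] = None  # filled in during enrichment


def deduplicate(events: List[Event]) -> List[Event]:
    """Keep the first occurrence of each event, preserving order."""
    seen = set()
    unique = []
    for event in events:
        key = (event.source, event.user, event.action, event.timestamp)
        if key not in seen:
            seen.add(key)
            unique.append(event)
    return unique


def enrich(events: List[Event], hr_directory: Dict[str, str]) -> List[Event]:
    """Attach the user's department from HR data; None if the user is unknown."""
    return [replace(e, department=hr_directory.get(e.user)) for e in events]


# Hypothetical sample data: the first two records are exact duplicates.
raw_events = [
    Event("vpn", "alice", "login", "2024-01-24T08:00:00+00:00"),
    Event("vpn", "alice", "login", "2024-01-24T08:00:00+00:00"),
    Event("edr", "bob", "process_start", "2024-01-24T08:01:00+00:00"),
]
hr_directory = {"alice": "Finance"}

cleaned = enrich(deduplicate(raw_events), hr_directory)
```

Running the cleanup once in the data layer, rather than inside each tool, means every downstream team works from the same deduplicated, enriched records.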
Decoupling the data into a layer where disparate data is woven together and normalized for multidomain use cases allows your organization to take HR data, organizational data, and business logic and transform it all into ready-to-use business data where security is one use case. The result is data that delivers immediate business context, which makes it easier to detect and respond to threats faster, aligns security tools and teams with business operations, and helps you protect the business more accurately.
Benefits of this approach include:
- Getting the right data you need into one place, which ensures you can get the answers you need from the data. For many organizations, this is where a data lake can play a key role -- not to replace but to offload or complement the SIEM solution.
- Driving better outcomes from unified, clean data.
- Making data more usable across different teams, including security and governance, risk, and compliance (GRC).
- Gaining more trustworthy data.
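One way to picture the data lake offloading the SIEM solution is a simple routing rule. The sketch below is an assumption about how such a rule might look, not a prescribed design: every event is retained in the lake, and only sources on a hypothetical detection allowlist (`SIEM_SOURCES`) are also forwarded to the SIEM solution. The source names are invented for illustration.

```python
from typing import Dict, List

# Hypothetical allowlist: sources worth paying SIEM ingestion costs for.
SIEM_SOURCES = {"edr", "idp"}


def route(event: Dict[str, str]) -> List[str]:
    """Return the destinations for an event.

    Every event is kept in the data lake for long-term retention;
    only allowlisted sources are also sent to the SIEM solution.
    """
    destinations = ["lake"]
    if event["source_tool"] in SIEM_SOURCES:
        destinations.append("siem")
    return destinations


# Hypothetical events: a high-volume DNS log and an endpoint alert.
dns_route = route({"source_tool": "dns", "action": "query"})
edr_route = route({"source_tool": "edr", "action": "malware_detected"})
```

With a rule like this, high-volume sources stay searchable in the lake without driving up SIEM ingestion costs.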
How Do You Decouple the Data Layer?
Decoupling the data layer from the SIEM solution begins with analyzing where data silos exist and breaking them down. We frequently see silos within security itself, between different tools that each produce data in formats that are not easily integrated. Another key step is dissolving the silos between security data and enterprise data, which gives security teams the additional business context they need to improve detection and response in support of business operations. Once you’ve identified silos, gaps, and data needs for each team, you can start understanding how employees are using the data and for what purpose.
With that information as a guide, start optimizing data not only for security analysts and threat hunters but also for other business or GRC teams. This helps you prioritize data projects and use cases to protect the business. Ideally, you want to govern the data well -- managing data retention, who has access, data cost optimization, and so on.
Bringing together all this data in a cost-effective way requires a new approach, such as building on a security data fabric (or data lake). This lets you consolidate, analyze, and manage security and enterprise data to enable deeper security insights, using data that is clean, enriched, actionable, and easy to use.
This approach gives you long-term retention that’s available on demand, avoiding stale data or data that’s aged out. You’ll also be able to optimize the cost of data storage without duplicating data or compromising quality. Data will reside in a single common schema available for multiple uses and can be coupled with analytics, reporting, and orchestration tools.
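To make the idea of a single common schema concrete, here is a minimal Python sketch that maps records from two tools into one shape. The two input formats (a firewall log with epoch timestamps and an identity provider log with ISO timestamps) and the four-field target schema are hypothetical; real deployments would more likely adopt an established schema and cover many more fields.

```python
from datetime import datetime, timezone
from typing import Any, Dict

# Hypothetical common schema: timestamp (UTC, ISO 8601), source_tool, actor, action.


def from_firewall(record: Dict[str, Any]) -> Dict[str, str]:
    """Normalize a hypothetical firewall record that uses epoch seconds."""
    ts = datetime.fromtimestamp(record["ts"], tz=timezone.utc)
    return {
        "timestamp": ts.isoformat(),
        "source_tool": "firewall",
        "actor": record["src"],
        "action": record["verdict"].lower(),
    }


def from_idp(record: Dict[str, Any]) -> Dict[str, str]:
    """Normalize a hypothetical identity provider record with a 'Z' suffix."""
    # Replace the trailing 'Z' so fromisoformat works on older Python versions.
    ts = datetime.fromisoformat(record["time"].replace("Z", "+00:00"))
    return {
        "timestamp": ts.astimezone(timezone.utc).isoformat(),
        "source_tool": "idp",
        "actor": record["username"],
        "action": record["event"].lower(),
    }


normalized = [
    from_firewall({"ts": 1706054400, "src": "10.0.0.5", "verdict": "DENY"}),
    from_idp({"time": "2024-01-24T00:05:00Z", "username": "alice", "event": "LOGIN"}),
]
```

Once every source lands in the same shape, a single query or report can run across all of them without per-tool parsing.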
Toward a New Data Layer
In the ideal scenario, decoupling the data from the SIEM solution allows organizations to bring together the business context that security, risk, and compliance teams need to safeguard an organization’s digital assets and people.
These teams include SOC analysts, data analysts, compliance and audit experts, threat hunters, researchers and incident responders, and data scientists. A unified view of crucial security data with business context allows them to quickly identify actual threats and manage compliance.
Tools such as SIEM solutions can still play an important role, but you may not want all your non-security data to live there; a cost-optimized platform for retaining long-term, high-volume data sources is the goal. By decoupling the data layer, you can develop a multi-use data layer to enable a variety of use cases.