SIEM: More Data ≠ Better Detections | Better Data = More Detections

One of the first things we do while setting up a Security Information and Event Management (SIEM) platform is onboard security logs. After all, logs are the fuel for threat detection, correlation, and response.

A common mindset in security operations is:

“The more data we ingest, the better detections we can make.”

This belief is understandable. More data should mean more visibility, and more visibility should lead to better threat detection. But in reality, this approach has major drawbacks and can actually make security operations less effective.

The Problem with Collecting Too Much Data

Many organizations take an “ingest everything” approach while onboarding logs. The reasoning is simple: if all logs are available, nothing will be missed. But this strategy often backfires.

Skyrocketing SIEM Costs

Most modern SIEM solutions, including Splunk, Microsoft Sentinel, IBM QRadar, ELK, and Google Chronicle, tie cost to data volume in some form, whether through ingestion-based licensing, events per second, or the storage and compute needed to retain and search the data. The more logs you ingest and store, the higher your costs.

Many teams onboard excessive, low-value logs without filtering, which drives up ingestion and storage costs without adding much security value.
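
To make the cost effect concrete, here is a rough back-of-the-envelope sketch in Python. The flat per-GB price, the source names, and the daily volumes are all made-up illustration values, not any vendor's actual pricing; plug in your own contract terms and measured volumes.

```python
# Hypothetical figures for illustration only.
PRICE_PER_GB = 2.50  # assumed flat ingestion price in USD per GB

# Assumed average daily volume per log source, in GB.
daily_gb = {
    "firewall": 40.0,
    "edr": 25.0,
    "authentication": 5.0,
    "debug_app_logs": 120.0,  # verbose, rarely used in detections
}


def monthly_cost(gb_per_day: float, price_per_gb: float = PRICE_PER_GB, days: int = 30) -> float:
    """Estimated ingestion cost for one source over `days` days."""
    return gb_per_day * price_per_gb * days


def cost_report(volumes: dict[str, float]) -> list[tuple[str, float, float]]:
    """Return (source, monthly cost, share of total), highest cost first."""
    costs = {name: monthly_cost(gb) for name, gb in volumes.items()}
    total = sum(costs.values())
    return sorted(
        ((name, cost, cost / total) for name, cost in costs.items()),
        key=lambda row: row[1],
        reverse=True,
    )


if __name__ == "__main__":
    for name, cost, share in cost_report(daily_gb):
        print(f"{name:<16} ${cost:>10,.2f}  {share:6.1%}")
```

With these invented numbers, the verbose application logs account for well over half of the monthly bill, which is exactly the kind of source worth questioning first.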

Alert Fatigue & False Positives

Security teams rely on SIEMs to generate alerts for potential threats. However, if the system is flooded with irrelevant logs, it creates unnecessary noise, leading to:

Too many alerts, making it hard to focus on real threats.
False positives, overwhelming analysts with unnecessary investigations.
Missed critical alerts, buried under volumes of irrelevant data.

Slower Query Performance & Investigations

A bloated SIEM slows down everything. Searching through millions of unnecessary log entries leads to:

Longer investigation times when responding to incidents.
Delayed threat-hunting efforts due to slow query performance.
Overloaded SOC analysts, struggling to extract meaningful insights.

Clearly, more data doesn’t always mean better security.

A Smarter Approach: The Better the Data, The More the Detections

Instead of focusing on how much data is ingested, we should focus on how useful that data is for security operations. The goal should be to collect high-quality, relevant logs that improve detections without unnecessary overhead.

How to Improve Log Ingestion Strategy

Prioritize High-Value Log Sources

Not all logs are equally important. Security teams should focus on logs that provide real detection value, such as:

Firewall & IDS/IPS logs – Detect external threats and unauthorized access attempts.
Endpoint Detection & Response (EDR) logs – Identify malware and lateral movement.
Authentication logs (Active Directory, Okta, etc.) – Spot account compromise and privilege abuse.
Cloud security logs (AWS CloudTrail, Azure Activity Log, etc.) – Monitor cloud-based attacks.
DNS & Proxy logs – Uncover Command & Control (C2) communication.

If a log source doesn’t contribute to detections or investigations, reconsider its necessity.
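
One possible way to put numbers behind that decision is to compare how often each source is actually used against how much it costs to ingest. The sketch below is only a rough heuristic: the `LogSource` fields, the scoring formula, and the inventory values are assumptions made for illustration, and you would pull the real counts from your own SIEM's rule and case data.

```python
from dataclasses import dataclass


@dataclass
class LogSource:
    name: str
    gb_per_day: float        # measured ingestion volume
    detection_rules: int     # rules that query this source
    investigations_90d: int  # investigations that used it in the last 90 days


def value_per_gb(source: LogSource) -> float:
    """Crude usefulness score: uses of the source per GB ingested daily."""
    usage = source.detection_rules + source.investigations_90d
    return usage / source.gb_per_day if source.gb_per_day > 0 else 0.0


def review_candidates(sources: list[LogSource]) -> list[str]:
    """Names of sources that no rule or investigation has used at all."""
    return [
        s.name
        for s in sources
        if s.detection_rules == 0 and s.investigations_90d == 0
    ]


# Made-up inventory for illustration.
inventory = [
    LogSource("edr", 25.0, 42, 30),
    LogSource("authentication", 5.0, 18, 22),
    LogSource("dns_proxy", 60.0, 9, 6),
    LogSource("printer_syslog", 8.0, 0, 0),
]

ranked = sorted(inventory, key=value_per_gb, reverse=True)

if __name__ == "__main__":
    for s in ranked:
        print(f"{s.name:<16} {value_per_gb(s):.2f}")
    print("Reconsider:", review_candidates(inventory))
```

A low score is a prompt for a conversation, not an automatic verdict: some sources are rarely queried but required for compliance or for the one investigation where they matter.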

Align Log Ingestion with Use Cases

Every log source should have a clear detection purpose. Before onboarding a log, ask:

What threat scenarios does this log help detect?
Is this log mapped to a MITRE ATT&CK technique?
How often will security analysts use this log for investigations?

If the answers are unclear, the log might not be worth ingesting.
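
The questions above can be turned into a small, repeatable check. The following is a minimal sketch, assuming you keep a mapping from each log source to the MITRE ATT&CK techniques its detections cover. The technique IDs are real ATT&CK identifiers, but the mapping itself and the list of required techniques are invented for the example.

```python
# Hypothetical mapping from log source to the ATT&CK techniques covered
# by detections built on it.
coverage = {
    "authentication": {"T1110", "T1078"},  # Brute Force, Valid Accounts
    "edr": {"T1059", "T1021"},  # Command and Scripting Interpreter, Remote Services
    "dns_proxy": {"T1071"},  # Application Layer Protocol
    "printer_syslog": set(),  # no detection purpose recorded yet
}

# Techniques the team has decided it must be able to detect (assumed list).
required = {"T1110", "T1078", "T1059", "T1071", "T1566"}  # T1566 = Phishing


def unmapped_sources(mapping: dict[str, set[str]]) -> list[str]:
    """Sources that are ingested but tied to no technique."""
    return sorted(name for name, techniques in mapping.items() if not techniques)


def coverage_gaps(mapping: dict[str, set[str]], wanted: set[str]) -> set[str]:
    """Required techniques that no onboarded source currently covers."""
    covered = set().union(*mapping.values())
    return wanted - covered


if __name__ == "__main__":
    print("No detection purpose:", unmapped_sources(coverage))
    print("Uncovered techniques:", sorted(coverage_gaps(coverage, required)))
```

The check works in both directions: it flags sources that are ingested without a purpose, and it shows where a new log source would actually close a detection gap.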

Ensure Log Completeness & Context

Poorly structured logs can be useless for security operations. Make sure that each log contains the fields relevant to its source type, such as:

Timestamps – For accurate correlation.
Source & Destination IPs – Essential for network-based detections.
User Context – Helps identify compromised accounts.
Process & Command Execution Data – Critical for detecting malware and adversary behavior.
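
A completeness check like this can be automated while onboarding a source. Below is a minimal sketch that assumes events arrive as Python dictionaries; the field names (`src_ip`, `command_line`, and so on) are placeholders, and the required set should be adjusted per source type, since a firewall log, for example, will not carry process data.

```python
from datetime import datetime

# Assumed field names for this sketch; use whatever your schema defines.
REQUIRED_FIELDS = ("timestamp", "src_ip", "dest_ip", "user", "process", "command_line")


def missing_fields(event: dict) -> list[str]:
    """Return required fields that are absent or empty in the event."""
    return [f for f in REQUIRED_FIELDS if event.get(f) in (None, "")]


def has_valid_timestamp(event: dict) -> bool:
    """True if the timestamp parses as ISO 8601 and carries a UTC offset."""
    try:
        parsed = datetime.fromisoformat(str(event.get("timestamp")))
    except ValueError:
        return False
    return parsed.tzinfo is not None


# Invented sample event: empty user, no command line.
sample = {
    "timestamp": "2024-05-01T12:00:00+00:00",
    "src_ip": "10.0.0.5",
    "dest_ip": "10.0.0.9",
    "user": "",
    "process": "powershell.exe",
}

if __name__ == "__main__":
    print("Missing:", missing_fields(sample))
    print("Timestamp OK:", has_valid_timestamp(sample))
```

Requiring a UTC offset on the timestamp is a deliberate choice here: events recorded in unlabeled local time are a common reason correlation across sources fails.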

Filter and Normalize Logs at the Source

Instead of blindly ingesting raw logs, security teams should:

Filter out unnecessary events – Reduce noise by excluding non-security events.
Normalize logs – Standardize fields for easier correlation across different log sources.
Use log parsers – Tools like Logstash, Fluentd, or a Splunk Heavy Forwarder can help clean up log data before ingestion. (Splunk's Universal Forwarder does only limited filtering and leaves most parsing to later stages.)
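
In practice this logic lives in the configuration of whichever pipeline tool you use, but the idea is easy to show in plain Python. The sketch below is tool-agnostic and makes several assumptions: events are dictionaries, the allow-list of Windows Security event IDs is deliberately tiny, and the field aliases are invented examples of the naming differences you might find between sources.

```python
# Windows Security event IDs kept in this sketch (a real list would be
# longer): 4624 logon, 4625 failed logon, 4688 process creation.
KEEP_EVENT_IDS = {4624, 4625, 4688}

# Assumed per-source field names mapped onto one common schema.
FIELD_ALIASES = {
    "src": "src_ip",
    "source_address": "src_ip",
    "dst": "dest_ip",
    "destination_address": "dest_ip",
    "TargetUserName": "user",
    "username": "user",
}


def keep(event: dict) -> bool:
    """Filter step: keep only events on the allow-list."""
    return event.get("EventID") in KEEP_EVENT_IDS


def normalize(event: dict) -> dict:
    """Normalize step: rename known aliases, leave other fields untouched."""
    return {FIELD_ALIASES.get(key, key): value for key, value in event.items()}


def pipeline(events: list[dict]) -> list[dict]:
    """Filter first, then normalize, so dropped events cost nothing further."""
    return [normalize(e) for e in events if keep(e)]


# Invented sample input.
raw_events = [
    {"EventID": 4625, "TargetUserName": "alice", "src": "203.0.113.7"},
    {"EventID": 4634, "TargetUserName": "alice"},  # logoff, not on the allow-list
]

if __name__ == "__main__":
    for event in pipeline(raw_events):
        print(event)
```

Using an allow-list rather than a block-list means a new, unreviewed event type is dropped by default; whether that trade-off suits you depends on how quickly your team reviews new event types.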

The Outcome: A Leaner, More Effective SIEM

A well-optimized SIEM doesn’t need to collect everything. It needs to collect what truly matters. By focusing on better data, not just more data, security teams can:

Reduce SIEM costs by eliminating unnecessary logs.
Improve detection accuracy with high-quality security logs.
Minimize alert fatigue by reducing false positives.
Speed up investigations with lean, structured, and actionable data.

Final Thoughts

The next time you onboard logs into your SIEM, ask:

🔹 Is this data helping my SOC detect real threats?
🔹 Will this log source provide valuable context in an investigation?
🔹 Am I collecting this log just because it’s available, or because it’s needed?

By shifting from “ingest everything” to “ingest what truly matters,” security teams can enhance their detection capabilities, reduce operational overhead, and maximize the ROI of their SIEM investments.
