SOC teams today are under pressure. Alert volumes are overwhelming, investigations are piling up, and teams are short on resources. Many SOCs are forced to suppress detection rules or delay investigations just to keep pace. As the burden grows, agentic AI solutions are gaining traction as a way to reduce manual work, scale expertise, and speed up decision-making.

At the same time, not all solutions deliver the same value. With new tools emerging constantly, security leaders need a way to assess which systems actually improve outcomes. Demos may look impressive, but what matters is how well the system works in real environments, with your team, your tools, and your workflows.

This guide outlines a practical approach to benchmarking agentic AI in the SOC. The goal is to evaluate performance where it counts: in your daily operations, not in a sales environment.

What Agentic AI Means in the SOC

Agentic AI refers to systems that reason and act with autonomy across the investigation process. Rather than following fixed rules or scripts, they are designed to take a goal, such as understanding an alert or verifying a threat, and figure out the steps to achieve it. That includes retrieving evidence, correlating data, assessing risk, and initiating a response.

These systems are built for flexibility. They interpret data, ask questions, and adjust based on what they find. In the SOC, that means helping analysts triage alerts, investigate incidents, and reduce manual effort. But because they adapt to their environment, evaluating them requires more than a checklist. You have to see how they work in context.

1. Start by Understanding Your Own SOC

Before diving into use cases, you need a clear understanding of your current environment. What tools do you rely on? What types of alerts are flooding your queue? Where are your analysts spending most of their time? And just as importantly, where are they truly needed?

Ask:

  • What types of alerts do you want to automate?
  • How long does it currently take to acknowledge and investigate those alerts?
  • Where are your analysts delivering critical value through judgment and expertise?
  • Where is their time being drained by manual or repetitive tasks?
  • Which tools and systems hold key context or history that investigations rely on?

This understanding helps scope the problem and identify where agentic AI can make the most impact. For example, a user-reported phishing email that follows a predictable structure is a strong candidate for automation.

On the other hand, a suspicious identity-based alert involving cross-cloud access, irregular privileges, and unfamiliar assets may be better suited for manual investigation. These cases require analysts to think creatively, assess multiple possibilities, and make decisions based on a broader organizational context.

Benchmarking is only meaningful when it reflects your reality. Generic tests or template use cases won’t surface the same challenges your team faces daily. Evaluations must mirror your data, your processes, and your decision logic.

Otherwise, you’ll face a painful gap between what the system shows in a demo and what it delivers in production. Your SOC is not a demo environment, and your organization isn’t interchangeable with anyone else’s. You need a system that can operate effectively in your real world, not just in theory but in practice.

2. Build the Benchmark Around Real Use Cases

Once you understand where you need automation and where you don’t, the next step is selecting the right use cases to evaluate. Focus on alert types that occur frequently and drain analyst time. Avoid artificial scenarios that make the system look good but don’t test it meaningfully.

Shape the evaluation around:

  • The alerts you want to offload
  • The tools already integrated into your environment
  • The logic your analysts use to escalate or resolve investigations

If the system can’t navigate your real workflows or access the data that matters, it won’t deliver value even if it performs well in a controlled setting.

3. Understand Where Your Context Lives

Accurate investigations depend on more than just alerts. Critical context often lives in ticketing systems, identity providers, asset inventories, previous incident records, or email gateways.

Your evaluation should examine:

  • Which systems store the data your analysts need during an investigation
  • Whether the agentic system integrates directly with those systems
  • How well it surfaces and applies relevant context at decision points

It’s not enough for a system to be technically integrated. It needs to pull the right context at the right time. Otherwise, workflows may complete, but analysts still need to jump in to validate or fill gaps manually.
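
One way to make "the right context at the right time" measurable during a benchmark is to record, for each test case, which context sources your analysts would normally consult and which the system actually surfaced. Below is a minimal sketch in Python; the field names, alert types, and source names are illustrative, not tied to any specific product.

```python
# Score how well the system surfaced required context during benchmark cases.
# The "required" and "surfaced" sets are illustrative values you would fill in per case.

cases = [
    {"alert": "user-reported phishing", "required": {"email_gateway", "identity_provider"},
     "surfaced": {"email_gateway", "identity_provider"}},
    {"alert": "impossible travel", "required": {"identity_provider", "asset_inventory", "past_incidents"},
     "surfaced": {"identity_provider"}},
]

for case in cases:
    coverage = len(case["required"] & case["surfaced"]) / len(case["required"])
    missing = case["required"] - case["surfaced"]
    print(f"{case['alert']}: context coverage {coverage:.0%}, missing {missing or 'none'}")
```

A low coverage score on a case type is a signal that analysts will still have to jump in to fill gaps, even if the workflow technically completes.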

4. Keep Analysts in the Loop

Agentic systems are not meant to replace analysts. Their value comes from working alongside humans: surfacing reasoning, offering speed, and allowing feedback that improves performance over time.

Your evaluation should test:

  • Whether the system explains what it’s doing and why
  • If analysts can give feedback or course-correct
  • How easily logic and outcomes can be reviewed or tuned

When it comes to accuracy, two areas matter most:

  • False negatives: when real threats are missed or misclassified
  • False positives: when harmless activity is escalated unnecessarily

False negatives are a direct risk to the organization. False positives create long-term fatigue.
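
Both can be quantified during the benchmark by comparing the system's verdicts against analyst ground truth on a labeled set of alerts. A minimal sketch, assuming a simple malicious/benign labeling (the labels and sample results are illustrative):

```python
# Measure false negative and false positive rates against analyst ground truth.
# Each pair is (analyst ground truth, system verdict); sample data is illustrative.
results = [
    ("malicious", "malicious"),
    ("malicious", "benign"),     # false negative: a real threat missed
    ("benign", "malicious"),     # false positive: harmless activity escalated
    ("benign", "benign"),
]

false_negatives = sum(1 for truth, verdict in results if truth == "malicious" and verdict == "benign")
false_positives = sum(1 for truth, verdict in results if truth == "benign" and verdict == "malicious")
total_malicious = sum(1 for truth, _ in results if truth == "malicious")
total_benign = len(results) - total_malicious

print(f"False negative rate: {false_negatives / total_malicious:.0%}")
print(f"False positive rate: {false_positives / total_benign:.0%}")
```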

Critically, you should also evaluate how the system evolves over time. Is it learning from analyst feedback? Is it getting better with repeated exposure to similar cases? A system that doesn’t improve will struggle to generalize and scale across different use cases. Without measurable learning and adaptation, you can’t count on consistent value beyond the initial deployment.

5. Measure Time Saved in the Right Context

Time saved is often used to justify automation, but it only matters when tied to actual analyst workload. Don’t just look at how fast a case is resolved: consider how often that case type occurs and how much effort it typically requires.

To evaluate this, measure:

  • How long it takes today to investigate each alert type
  • How frequently those alerts happen
  • Whether the system fully resolves them or only assists

Use a simple formula to estimate potential impact:

  • Time Saved = Alert Volume × MTTR
    (where MTTR = MTTA + MTTI)

This provides a grounded view of where automation will drive real efficiency. MTTA (mean time to acknowledge) and MTTI (mean time to investigate) help capture the full response timeline and show how much manual work can be offloaded.
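
As a rough illustration, the same calculation applied per alert type might look like the sketch below; the volumes and times are placeholder values, not benchmarks.

```python
# Estimate monthly analyst time that automation could offload per alert type.
# Time Saved = Alert Volume x MTTR, where MTTR = MTTA + MTTI.
# Volumes and times below are placeholders.
alert_types = {
    "user-reported phishing": {"monthly_volume": 400, "mtta_min": 15, "mtti_min": 20},
    "impossible travel":      {"monthly_volume": 120, "mtta_min": 30, "mtti_min": 45},
}

for name, stats in alert_types.items():
    mttr_min = stats["mtta_min"] + stats["mtti_min"]
    saved_hours = stats["monthly_volume"] * mttr_min / 60
    print(f"{name}: ~{saved_hours:.0f} analyst-hours/month if fully offloaded")
```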

Some alerts are rare but time-consuming. Others are frequent and simple. Prioritize high-volume, moderately complex workflows. These are often the best candidates for automation with meaningful long-term value. Avoid chasing flashy edge cases that won’t significantly impact operational burden.

6. Prioritize Reliability

It doesn’t matter how powerful a system is if it fails regularly or requires constant oversight. Reliability is the foundation of trust, and trust is what drives adoption.

Track:

  • How often workflows complete without breaking
  • Whether results are consistent across similar inputs
  • How often manual recovery is needed

If analysts don’t trust the output, they won’t use it. And if they constantly have to step in, the system becomes another point of friction, not relief.
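
One lightweight way to track this during an evaluation is to log the outcome of every automated run and compute completion and manual-recovery rates over the benchmark period. A minimal sketch (the status values are illustrative):

```python
# Track workflow reliability across benchmark runs.
# Statuses are illustrative: "completed", "manual_recovery", "failed".
runs = ["completed", "completed", "manual_recovery", "completed", "failed", "completed"]

completion_rate = runs.count("completed") / len(runs)
manual_recovery_rate = runs.count("manual_recovery") / len(runs)

print(f"Completed without intervention: {completion_rate:.0%}")
print(f"Required manual recovery: {manual_recovery_rate:.0%}")
```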

Final Thoughts

Agentic AI has the potential to reshape SOC operations. But realizing that potential depends on how well the system performs in your real-world conditions. The strongest solutions adapt to your environment, support your team, and deliver consistent value over time.

When evaluating, focus on:

  • Your actual alert types, workflows, and operational goals
  • The tools and systems that store the context your team depends on
  • Analyst involvement, feedback loops, and decision transparency
  • Real time savings tied to the volume and complexity of your alerts
  • Reliability and trust in day-to-day performance

The best system is the one that fits — not just in theory, but in the reality of your SOC.

How to Benchmark Agentic AI in the SOC: A Practical Guide
September 18, 2025 · Ron Marsiano

Learn how to benchmark agentic AI solutions in your SOC effectively. This guide provides a practical approach to evaluating performance in real-world environments.

The vision of a fully automated SOC is finally within reach, piece by piece. But as we see it, building automation around a theoretical SOC, not a real one, is the wrong approach for enterprise companies.

Some focus on Tier 1 alerts: the repeatable, mundane tasks like standard phishing playbooks. Others tackle Tier 2+ investigations, where human judgment is still essential. Both are valid, but they miss the core reality of how a SOC operates today.

The goal of any AI SOC analyst isn't to replace your team, but to automate and improve the way they actually work.

Right now, your analysts are stuck in browser tabs, pivoting between consoles, copying data, and piecing together the truth manually. This isn’t scalable or efficient. It's why we founded Legion. Our vision is simple: your SOC lives in the browser. Your AI SOC analyst should build automation that reflects exactly that.

How SOC Workflows Actually Work Today

The modern SOC runs on people, browsers, and disconnected tools. Here's what that looks like in practice:

  • Data Ingestion: Data (IPs, threat intel, logs, etc.) is pulled from multiple sources and correlated.
  • Detection Engineering: Rules are written, tested, and updated based on what was missed or what created noise.
  • Alert Triage: Analysts spend their day pulling data from different systems to figure out if an alert is real or just noise.
  • Threat Hunting: Proactive hunts are a mix of experience and manual queries. Results are often shared ad hoc in Slack or documents, rarely in a repeatable format.
  • Deeper Investigations: When an alert is valid, the manual pivot begins. Analysts jump between logs, threat intel feeds, and internal assets to gain context. Every jump between tools loses context.
  • Remediation Actions: Depending on the validity of the alert, remediation actions are completed, and/or the ticket is closed out.
  • Reporting & Incident Summarization: Building an incident timeline and report is a manual process of collecting screenshots, logs, and notes stitched together by hand.
  • Process Hand-Offs: Shift changes and escalations often drop critical context because investigations aren’t documented in a structured way.

Workflow diagram credit: Filip Stojkovski, Cybersec Automation

The main point is that most SOC workflows today are repetitive but lack standardization. Even when organizations have built playbooks in their SOAR or workflow automation tools, those playbooks are often outdated or incorrect, because the automation is maintained by engineers rather than by the analysts who actually run the investigations.

An Honest Introduction to Automating the SOC

I’ve spent the last 90+ days digging into the AI SOC Analyst and SOAR market, talking to customers, analysts, and more.

Automating the SOC is not an easy problem to solve. Anyone who tells you their tool will work magically out of the box on day one is selling you a fantasy, and Legion is not here to tell you it will either.

  • Some alerts are predictable, but many are context-dependent and demand human judgment.
  • Integrations break. APIs can make things easier, but still need to be managed.
  • And through it all, your good analysts remain your most valuable asset. Automation should make them faster and more effective, not try to replace them.

Legion's approach is built on this reality.

How Legion Security Automates SOC Workflows

Legion’s approach is built on one simple principle: the SOC lives in the browser. Analysts do their real work inside SaaS consoles, cloud admin panels, EDR dashboards, and threat intel portals, all in the browser. That’s where detections are reviewed, logs are queried and analyzed, and decisions are made.

Instead of forcing your team into an abstract "playbook tool" built on API connections, Legion instruments the browser itself. That gives you a clear view of what an analyst clicks, searches, copies, and correlates: the actual audit trail of how investigations and responses are conducted. This visibility is, we believe, the best way to automate workflows that actually match how your team operates.

Legion breaks this down into three practical, trust-based modes:

  1. Recording Mode: Legion captures every step your best analysts take. It watches how they handle triage, pull context, enrich data, and close cases. This builds a bank of proven workflows, not theoretical runbooks. These recordings become reusable playbooks grounded in real analyst behavior.

  2. Guided Mode: Next, Guided Mode turns those recordings into automations. When a new alert comes in, the analyst runs the investigation with AI in the loop: Legion completes the investigation and recommends next steps at each decision node. Junior analysts don’t have to start from scratch. The guidance is readily available, right inside their workflow. This closes skill gaps and standardizes how your team works.

  3. Autonomous Mode: Finally, Legion can run trusted workflows end-to-end in Autonomous Mode. But only for well-understood, repeatable scenarios you've already vetted. When a ticket is opened, Legion executes the steps your team already does manually. There's no black-box decision-making or surprise actions outside what you’ve already proven works.

By focusing on how your real analysts work and only automating what they’ve shown to be effective, Legion enables you to build true automation that adapts and improves over time.
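
To make that progression concrete, here is a hypothetical sketch of how a recorded workflow and its trust level could be represented. This is illustrative only and not Legion's actual data model or API.

```python
# Hypothetical representation of a recorded analyst workflow and the trust
# levels it moves through. Illustrative only; not Legion's actual data model.
from dataclasses import dataclass, field
from enum import Enum

class Mode(Enum):
    RECORDING = "recording"    # capture analyst steps in the browser
    GUIDED = "guided"          # AI runs the steps, analyst approves at decision points
    AUTONOMOUS = "autonomous"  # vetted workflow runs end-to-end

@dataclass
class WorkflowStep:
    action: str                     # e.g. "pull headers from email gateway"
    requires_approval: bool = True  # decision points keep the analyst in the loop

@dataclass
class RecordedWorkflow:
    name: str
    steps: list[WorkflowStep] = field(default_factory=list)
    mode: Mode = Mode.RECORDING

    def promote(self) -> None:
        """Move the workflow to the next trust level once it has been vetted."""
        order = [Mode.RECORDING, Mode.GUIDED, Mode.AUTONOMOUS]
        position = order.index(self.mode)
        if position < len(order) - 1:
            self.mode = order[position + 1]

phishing = RecordedWorkflow(
    name="user-reported phishing triage",
    steps=[
        WorkflowStep("pull headers from email gateway"),
        WorkflowStep("check sender reputation in threat intel portal"),
        WorkflowStep("close ticket if benign"),
    ],
)
phishing.promote()  # recording -> guided, only after the recording has been vetted
print(phishing.mode)  # Mode.GUIDED
```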

Use Cases for the Legion AI SOC Analyst

  • Workflow Documentation: Create comprehensive workflow maps of how your SOC analysts handle alert triage and investigations.
  • Alert Triage & Investigations: Automate noisy Tier 1 triage, enrich alerts with context, and auto-close junk. These can include cloud, identity, phishing, vulnerability management, and more. Because we are not limited by integrations, Legion can automate any SOC workflow.
  • Reporting & Incident Summarization: Generate incident timelines and report on key metrics such as MTTA/MTTR.
  • Process Improvement: Spot process gaps and bottlenecks, and optimize workflows across analysts.
  • SOC Training: Don’t let your tribal knowledge leave with your best analysts. Once your processes are mapped out, junior analysts can train by “looking over the shoulder” of Legion in Guided Mode.

Final Thoughts

SOC automation shouldn’t be magic (even if it feels like it sometimes). It should be grounded in observing, guiding, and learning from your real workflows.

Legion’s AI SOC analyst doesn’t pretend to replace humans. It records what your best people do, guides new analysts, and automates the repeatable. Over time, your analysts can focus on improving workflows, upleveling their security skills, improving detections, and more. Automate your SOC the way your team actually works with Legion.

Your AI SOC Should Automate How Your Team Actually Works
September 18, 2025 · Liam Barnes

Your AI SOC should automate your team's real workflows, not theoretical playbooks. Learn how Legion's approach helps security teams improve processes for faster, more effective security operations.