
Richard Batt

Agent Teams Are Here: How to Use Claude Opus 4.6's Multi-Agent Feature for Real Business Work

Tags: AI Tools, Productivity


What Agent Teams Actually Are

When Anthropic released Claude Opus 4.6 on February 5, 2026, the feature everyone focused on was benchmark performance. Higher reasoning scores. Better coding ability. Longer context windows. But buried in the release notes was something more transformative: multi-agent coordination, the ability for teams of agents to decompose complex tasks, specialize in sub-domains, and hand off work to each other in coordinated workflows.

This isn't just running multiple requests in parallel. That's trivial and doesn't improve output quality. This is actual task decomposition with structured inter-agent communication. One agent gathers information from multiple sources. A second agent receives that gathered information and analyzes it for patterns. A third agent receives the analysis and synthesizes findings into a client-ready report. Agents explicitly understand who consumes their output and what format that consumer needs. They coordinate handoffs. They refine each other's work by building on it rather than re-doing it. It's genuinely different from single-agent processing, which runs all stages sequentially within one model.

Why Agent Teams Matter for Business

Most business workflows are inherently multi-stage. Research projects need gathering information from diverse sources, analyzing for patterns and meaning, and synthesizing findings into narrative. Code reviews need evaluation for quality and problems, implementation of improvements, and testing of changes. Client deliverables need content creation, design and formatting, and quality assurance checks. These stages are sequential and interdependent, but they don't all require the same type of thinking. Gathering is different from analyzing. Analyzing is different from synthesizing.

A single agent can run through all these stages sequentially: gather data, analyze it, write report. It works. The output is coherent because one model understands the full context. But agent teams parallelize the work while maintaining context and coordination. A Researcher can gather sources while an Analyzer independently analyzes data (potentially from previous research) while a Report Writer synthesizes findings, all in overlapping stages rather than a purely linear sequence. The Researcher doesn't need to wait for analysis to start gathering. The Analyzer doesn't need to wait for the full research to complete before starting analysis.

The result: faster completion (elapsed time shrinks even if token cost slightly increases), better output quality (because agents can specialize in their domain instead of being generalists), and often lower per-output token cost (because agents focus only on their stage, not re-processing work already done). The business outcome is faster time-to-value on complex tasks.
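The gather → analyze → synthesize handoff described above can be sketched in a few lines. The agent functions below are stubs standing in for real model calls, and their outputs are hypothetical; the point is the handoff shape, where each agent's output becomes the next agent's input.

```python
# Each "agent" is a plain function; real versions would call a model API.

def researcher(topic):
    """Gathers raw findings; does no analysis."""
    return {"topic": topic, "findings": ["enterprise AI spend up", "new vendor releases"]}

def analyst(research):
    """Consumes the Researcher's output and extracts patterns."""
    return {"trends": [f"trend derived from: {f}" for f in research["findings"]]}

def report_writer(analysis):
    """Consumes the analysis and produces the client-facing narrative."""
    bullets = "\n".join(f"- {t}" for t in analysis["trends"])
    return f"Executive summary:\n{bullets}"

# Explicit handoffs: output of one stage is the input of the next.
report = report_writer(analyst(researcher("AI infrastructure spending")))
print(report)
```

Swapping any stub for a real model call preserves the structure: the contract between stages is the shape of the dictionary being passed, not the implementation behind it.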

When Single Agents Are Actually Better

Before building agent teams, understand when they're overkill and when the overhead outweighs the benefits. Single agents win when:

  • The task is inherently sequential and tightly coupled: If stage 2 depends entirely on stage 1's specific output format and stage 1 must be completely finished before stage 2 can begin meaningfully, inter-agent communication overhead exceeds time savings. One agent moving through stages is faster because there's no handoff latency and the model maintains perfect context continuity.
  • Context is extremely tight: If the entire task fits comfortably in one agent's context window (under 20,000 tokens), splitting it across agents costs tokens on inter-agent communication (passing context between agents, redundant framing) that you'd save by staying unified. The token cost of splitting exceeds the token cost of sequential processing.
  • The task is small: If research takes 5 minutes end-to-end, agent team setup overhead (creating separate agent systems, managing handoffs, coordination latency) isn't worth it. Single agents scale down elegantly. Teams don't. Task decomposition only makes sense for tasks large enough that parallelization saves meaningful time.
  • Output format is unforgiving: If the output must be perfectly deterministic (identical every time it runs) and inter-agent variation would break downstream systems, keep it unified. One agent, one context, one reasoning path, zero variation. Multiple agents introduce variation through different reasoning paths.

Agent teams are most valuable for complex, multi-stage workflows where parallelization and specialization matter more than simplicity, where tasks are large enough that elapsed time matters, and where agents can operate independently on their sub-tasks.

Real Use Case 1: Research + Report Generation

One of our partners runs market research for enterprise software. They need to gather news, analyze trends, synthesize findings, and generate a formatted report, all on a weekly cycle. Previously, one analyst (or one agent) did all four steps sequentially.

We restructured it as a three-agent team:

  • Researcher Agent: Gathers data from news sources, industry reports, earnings calls, and analyst notes. Its only job is to collect raw information and structure it into a findings document.
  • Analyst Agent: Receives the researcher's output. Analyzes trends, identifies patterns, assesses competitive implications. Produces an analysis document with labeled insights.
  • Report Agent: Receives the analyst's insights. Formats findings into an executive summary, key takeaways, and recommendation sections. Produces the final report.

The researcher and analyst work in parallel (the analyst starts on early findings, or on the previous cycle's research) while the report agent waits. Once both are done, the report agent synthesizes the output. The workflow is: Researcher and Analyst run simultaneously (elapsed time: ~30 seconds), then the Report Agent runs (elapsed time: ~15 seconds). Total: ~45 seconds.

A single agent doing all three stages sequentially: ~60 seconds. The time savings seem modest, but they compound across 50 reports weekly, plus the analyst agent produces better analysis because it's focused solely on analysis, not also managing research data.

Real Use Case 2: Code Review + Refactoring

An engineering team was using Claude Opus for code review. They'd paste a PR, ask for analysis, then ask for refactoring suggestions, then ask for test cases. Three sequential steps, one agent, repetitive context-loading.

We split it into three agents:

  • Reviewer Agent: Takes the code. Reviews it for quality, security, performance, and design issues. Produces a review document with specific findings.
  • Refactorer Agent: Receives the reviewer's findings. Implements recommended changes, maintains test coverage. Produces refactored code with explanations.
  • Test Agent: Receives the refactored code. Expands test coverage based on refactoring changes. Produces updated tests and coverage reports.

Result: the Reviewer and Refactorer overlap (the Refactorer starts implementing findings as the Reviewer emits them), and the Test Agent waits for the Refactorer's output. The workflow is faster, plus each agent optimizes for its specific task. The reviewer focuses on analysis, not implementation feasibility. The refactorer focuses on code quality, not test design.

Real Use Case 3: Client Deliverable Assembly

A consulting firm was manually assembling client reports: analysts write sections, designers format them, QA checks everything. Handoff between each stage. Rework when formatting breaks content. Days of elapsed time.

We built a three-agent team:

  • Content Agent: Writes all deliverable sections: executive summary, findings, recommendations, appendices. Produces raw content without formatting.
  • Format Agent: Receives content sections. Applies branding, formats for client brand guidelines, optimizes layout for readability. Produces formatted output.
  • QA Agent: Receives formatted deliverable. Checks for consistency, verifies claims against source data, ensures all client requirements are met. Produces final deliverable with QA sign-off.

Content Agent and Format Agent can partially overlap (Format Agent can start formatting as sections arrive). QA Agent waits for Format Agent. The workflow compresses from 3 days to 3-4 hours elapsed time because stages overlap and each agent specializes.

Structuring Agent Team Prompts

The critical skill is writing prompts that make agent specialization work. Each agent needs to understand three things: its specific role, what output it produces, and how other agents will use that output.

Here's the pattern. For a Researcher Agent in a market research team:

"You are the Researcher Agent on a market analysis team. Your role is to gather market intelligence from news, reports, and earnings calls. You output a structured research document with categories: Market Drivers, Competitive Activity, Regulatory Changes, Technology Trends. Your output will be analyzed by the Analyst Agent, so structure it clearly with specific, cited findings. Do not attempt analysis or recommendations; that's the Analyst's job. Focus exclusively on accurate data gathering and clear structuring."

Compare to a generic "analyze this market" prompt. The specialist prompt is clear about scope, output format, and downstream usage. Agents that understand their niche produce better handoffs.

Here's the Analyst Agent prompt:

"You are the Analyst Agent on a market analysis team. You receive a Research Document from the Researcher Agent. Your job is to identify patterns, assess competitive implications, and recommend strategic focuses. Output a structured analysis with sections: Trend Analysis, Competitive Assessment, Strategic Implications. Your output feeds the Report Agent, which will synthesize your insights into an executive summary, so prioritize clarity and actionability. Do not re-research; trust the Researcher's findings and focus on interpreting them."

This agent understands it's consuming research (not producing it), producing analysis (not reports), and handing off to a report writer. Clear scope, clear handoff points, clear output format.

How Agents Communicate and Coordinate

Inter-agent communication in Claude Opus happens through structured handoff. One agent completes its work and produces output. The next agent receives that output as context (or as part of its system prompt), then processes it.

The handoff looks like this: Researcher Agent produces a document like "Market Drivers: Rising enterprise spending on AI infrastructure (+18% YoY in Q4). Key vendors: OpenAI partnership announcements, Google Gemini enterprise releases, Anthropic focus on enterprise safety..." The Analyst Agent receives this exact document and analyzes it.

This is simpler than you might expect. It's not message passing or complex inter-process communication. It's one agent's output becoming the next agent's input. The coordination happens through prompt design: each agent understands what to expect from previous agents and what format subsequent agents need.

For teams with sequential dependencies, use explicit output format specifications. Tell the Researcher Agent: "Output a JSON array with objects containing 'finding', 'source', 'date', 'confidence_level'. The Analyst will parse this format." Then tell the Analyst: "Input is a JSON array of research findings. Parse it and analyze each entry."
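That format contract can be exercised directly. The sketch below uses the field names from the paragraph above ('finding', 'source', 'date', 'confidence_level'); the findings themselves are illustrative, and in production the Researcher side would be a model emitting this JSON rather than a hardcoded list.

```python
import json

# Researcher side: emit the agreed JSON array of findings.
researcher_output = json.dumps([
    {"finding": "Enterprise AI spend +18% YoY", "source": "Q4 earnings calls",
     "date": "2026-01-15", "confidence_level": "high"},
    {"finding": "New enterprise LLM releases", "source": "vendor press releases",
     "date": "2026-01-20", "confidence_level": "medium"},
])

# Analyst side: parse the exact structure the Researcher promised.
findings = json.loads(researcher_output)
high_confidence = [f["finding"] for f in findings if f["confidence_level"] == "high"]
print(high_confidence)
```

Because both prompts name the same schema, the Analyst spends zero tokens guessing at structure; a `json.loads` plus a comprehension replaces a paragraph of "please extract the findings from the text above."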

For teams with parallel stages that merge, use a consolidation agent. Researcher and Analyst run simultaneously. Report Agent receives both outputs and synthesizes them. This requires the Report Agent to handle multiple input formats, so be explicit: "You will receive two documents: Research Findings (structured as...) and Analysis (structured as...). Integrate both into a cohesive report."

Common Mistakes and How to Avoid Them

Mistake 1: Expecting Perfect Inter-Agent Consensus

When three agents contribute to a deliverable, they might have different interpretations or recommendations. The Reviewer might flag a security concern. The Refactorer might prioritize performance. The Test Agent might need more coverage. These agents don't automatically agree.

Solution: Build a Consensus or Arbitration agent that reviews disagreements and decides. Or structure prompts so agents understand they should defer to specific other agents on specific topics: "If the Reviewer flags a security issue, treat it as highest priority regardless of other concerns."

Mistake 2: Assuming Parallel Agents Are Faster

Running agents in parallel increases aggregate token cost because each agent loads the full context independently. Two agents in parallel costs roughly 2x tokens. The speedup is real (elapsed time cuts in half), but token efficiency gets worse.

Solution: Parallelize only when the speed gain justifies the token cost. For a 10-minute task split into parallel 5-minute tasks, the parallelization saves 5 minutes of elapsed time at 2x token cost. Worth it if speed matters. For background batch processing where speed doesn't matter, keep it sequential.
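The tradeoff is easy to make explicit with a back-of-the-envelope helper. All numbers below are illustrative, not measurements: the model is simply that a sequential single agent loads shared context once, while parallel agents each reload it.

```python
def compare(context_tokens, stage_tokens, stage_minutes):
    """Sequential single agent vs. fully parallel agents over the same stages.

    context_tokens: shared context every agent needs
    stage_tokens:   list of per-stage working tokens
    stage_minutes:  list of per-stage elapsed times
    """
    seq_time = sum(stage_minutes)                 # stages run one after another
    par_time = max(stage_minutes)                 # stages fully overlap
    seq_tokens = context_tokens + sum(stage_tokens)          # one context load
    par_tokens = sum(context_tokens + t for t in stage_tokens)  # one load per agent
    return seq_time, par_time, seq_tokens, par_tokens

# Two 5-minute stages, 2,000 tokens of shared context, 1,000 tokens each of work:
seq_t, par_t, seq_tok, par_tok = compare(2000, [1000, 1000], [5, 5])
print(f"time: {seq_t} -> {par_t} min, tokens: {seq_tok} -> {par_tok}")
```

With these (assumed) numbers, elapsed time halves while token cost rises by 50%, which is the shape of the tradeoff the mistake above warns about: pay it when speed matters, skip it for batch work.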

Mistake 3: Poor Handoff Specification

If the Researcher outputs free-form text and the Analyst expects structured JSON, the Analyst wastes tokens parsing and extracting. If the Reviewer Agent writes detailed analysis but the Refactorer needs only specific recommendations, the Refactorer wastes tokens filtering signal from noise.

Solution: Explicitly specify output format in every agent's prompt. "Output a JSON object with keys: 'issues', 'recommendations', 'priority_level'." Then the next agent consumes this exact structure without parsing overhead.

Mistake 4: Over-Specialization

If you split work into eight agents (Researcher, Analysis-Financial, Analysis-Competitive, Analysis-Technical, Report-Executive, Report-Detailed, QA-Content, QA-Format), you introduce coordination complexity that outweighs specialization benefits. Too many agents become slow and token-expensive.

Solution: Aim for 2-4 agent teams. Each agent is specialized enough to excel at its task, but not so specialized that you need a management layer just to coordinate them. Three agents is often ideal. Five is usually too many.

Mistake 5: Not Monitoring Agent Failure

If the Researcher Agent hallucinates sources or the Analyst misunderstands data, the downstream agents inherit that error. Unlike human teams where QA catches mistakes, agent teams can fail silently in distributed ways.

Solution: Every agent team needs a QA or Validation stage. Even if it's just one agent reviewing the final output against original requirements, validation matters. Or build explicit error-checking into handoff prompts: "Verify that all facts in the Researcher's output are properly sourced before analyzing them."
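One cheap validation gate is a deterministic check that runs between agents, before any model sees the handoff. The sketch below reuses the 'finding'/'source' field names from the handoff examples earlier; the rule set is a minimal assumption, not an exhaustive QA stage.

```python
def validate_findings(findings):
    """Return a list of problems; an empty list means the handoff is safe to consume."""
    problems = []
    for i, f in enumerate(findings):
        if not f.get("finding"):
            problems.append(f"finding {i} has no claim text")
        if not f.get("source"):
            problems.append(f"finding {i} has no source")
    return problems

good = [{"finding": "AI spend up 18% YoY", "source": "Q4 earnings calls"}]
bad = [{"finding": "AI spend up 18% YoY", "source": ""}]
print(validate_findings(good), validate_findings(bad))
```

Deterministic checks like this catch the silent failures (missing sources, empty fields) for free; the model-based QA agent then only has to judge the things code can't, like whether the claims match the sources.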

The Prompt Template for Building Agent Teams

Here's a reusable template for any agent team:

"You are the [Agent Name] on a [Team Description] team. Your role is [Specific Responsibility]. You receive [Input Description] from [Previous Agent Name]. You produce [Output Description] for [Next Agent Name]. Your output format is [Specific Format]. Constraints: [What you should NOT do, which other agents will handle]. Focus exclusively on [Your Specific Domain]. Do not [Tasks delegated to other agents]."

Examples:

"You are the Researcher on a market analysis team. Your role is gathering intelligence. You receive analyst requirements from the Analyst Agent (as a JSON list of research topics). You produce a Research Document with findings organized by topic. Your output format is JSON with keys 'topic', 'findings', 'sources', 'confidence'. Constraints: Do not interpret or analyze; stick to facts. Focus exclusively on accurate research and clear citation. Do not attempt recommendations or strategy analysis."

"You are the Report Generator on a market analysis team. Your role is synthesizing findings into client-ready output. You receive the Research Document from the Researcher and the Analysis from the Analyst. You produce a formatted report with sections: Executive Summary (3 paragraphs), Key Findings (5 bullets), Strategic Implications (3 paragraphs), Appendix (supporting data). Focus exclusively on clarity and client communication. Do not re-analyze or re-research; synthesize existing work."

Measuring Agent Team Performance

How do you know if your agent team is working? Track four metrics:

  • Elapsed time: How long from initiation to completion? Agent teams should be 30-60% faster than sequential single-agent approaches for the same task.
  • Token efficiency: How many tokens per unit of output? Agent teams might use more total tokens (due to parallel processing and redundant context), but should use fewer tokens-per-value-unit because agents specialize.
  • Output quality: Do deliverables meet requirements? Agent teams should improve quality because specialization allows deeper focus on each stage. Measure against rubrics, not just completion.
  • Error rate: What percentage of outputs need rework? If agent teams introduce more errors due to coordination issues, the specialization benefit is lost. Track this carefully.

If your agent team is slower than single-agent, costing more tokens, or producing lower quality, it's misconfigured. Go back to single-agent or restructure the team.

When to Add a Fourth Agent: The Coordinator

If your team grows beyond three agents or dependencies become complex, add a Coordinator agent. This agent doesn't produce the core deliverable: it manages other agents.

The Coordinator: (1) Receives the initial task. (2) Decomposes it into subtasks for other agents. (3) Routes subtasks to appropriate agents. (4) Receives outputs from agents. (5) Decides next steps (more analysis? more research? synthesis ready?). (6) Orchestrates the final output.

The Coordinator pattern works for complex knowledge work but adds a layer of latency and cost. Only use it if managing the team without coordination becomes error-prone.
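The six Coordinator steps can be sketched as a simple orchestration loop over stubbed worker agents. Everything here is illustrative: the agent names, the fixed plan, and the dictionary outputs are assumptions standing in for model calls and real routing logic.

```python
# Stub worker agents keyed by role; real versions would be model calls.
AGENTS = {
    "research": lambda task: {"stage": "research", "data": f"findings for {task}"},
    "analyze":  lambda task: {"stage": "analyze",  "data": f"patterns in {task}"},
    "report":   lambda task: {"stage": "report",   "data": f"report on {task}"},
}

def coordinate(task):
    """(1) receive task, (2) decompose, (3) route, (4) collect, (5) decide, (6) orchestrate."""
    plan = ["research", "analyze", "report"]   # (2) decompose into subtasks
    outputs = {}
    for role in plan:                          # (3) route each subtask
        outputs[role] = AGENTS[role](task)     # (4) receive the agent's output
        # (5) decide next steps; this sketch uses a fixed plan, real coordinators branch
    return outputs["report"]                   # (6) orchestrate the final output

final = coordinate("AI spending")
print(final["data"])
```

The branching logic at step (5) is where the latency and cost the article warns about comes from: a real Coordinator is itself a model call deciding whether to loop back for more research or move on.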

Token Economics: When Agent Teams Save Money

This might seem counterintuitive: agent teams sometimes reduce total token cost despite running multiple agents. How?

Scenario 1: Parallel agents with task specialization. A single agent processes research, analysis, and reporting sequentially. It loads all context for each stage. In an agent team, the Researcher loads research context, the Analyst loads analysis context, and the Report Generator loads report context. Each agent loads only its domain-specific context, reducing redundant token loading. Total tokens: the agent team can come in roughly 20% below the single agent, despite running in parallel.

Scenario 2: Handoff efficiency. A single agent generates research, then analyzes it, then reports on it. Each stage reprocesses previous work. Agent teams hand off structured output. The Analyst receives JSON research findings, not the full reasoning process. The Report Generator receives high-level insights, not raw research. Handoffs compress context, reducing downstream token cost.

Scenario 3: Termination gates. A single agent sometimes over-thinks a task. "Let me gather more data, analyze it more deeply, consider more implications." Agent teams have clear termination points. Researcher finishes gathering (with clear scope), hands off. Analyst finishes analyzing (with clear scope), hands off. Termination gates prevent analysis paralysis and reduce token cost.

In practice: a single agent processing a research task might use 4,000 tokens (full process with redundancies). An agent team might use 3,200 tokens (parallel processing with specialization and compression). Token savings: 20%. Speed improvement: 40-60% faster. Quality: often better because of specialization. Agent teams can dominate on cost, speed, AND quality when designed well.
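Checking the article's own arithmetic with its illustrative figures (these are example numbers, not benchmarks):

```python
single_agent_tokens = 4000   # full sequential process with redundancies
team_tokens = 3200           # parallel, specialized, compressed handoffs

savings = 1 - team_tokens / single_agent_tokens
print(f"Token savings: {savings:.0%}")
```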

Practical Example: Building a Research Agent Team

Let's walk through building an actual agent team for a market research task. You need: (1) Gather news about AI spending. (2) Analyze competitive positioning. (3) Produce a client report.

Researcher Agent Setup:

System Prompt: "You are the Researcher on a market analysis team. Your role is gathering current intelligence. You are given a topic: [Topic]. Search, curate, and cite findings about [Topic] from the past 30 days. Output a JSON object with keys: 'topic', 'findings' (array of objects with 'claim', 'source', 'date', 'link'), 'summary' (1 paragraph). Focus on accuracy and citation. Do not interpret or analyze."

Analyst Agent Setup:

System Prompt: "You are the Analyst on a market analysis team. You receive a Research Document. Analyze the findings for: (1) Market trends (what's changing?), (2) Competitive implications (how does this affect the competitive market?), (3) Strategic importance (what should leaders care about?). Output a JSON object with keys: 'trends' (array of strings), 'competitive_implications' (paragraph), 'strategic_importance' (paragraph). Do not re-research. Interpret the Researcher's findings only."

Report Agent Setup:

System Prompt: "You are the Report Generator on a market analysis team. You receive Research Document and Analysis. Produce a client-ready report with sections: (1) Executive Summary (3 paragraphs summarizing key findings), (2) Market Trends (2 paragraphs), (3) Competitive Impact (2 paragraphs), (4) Recommendations (3 bullets). Output as HTML with proper formatting. Focus on client communication, not additional analysis."

Execution: Researcher and Analyst run in parallel. Once both complete, Report Agent synthesizes and produces final output.
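This execution order can be sketched with `concurrent.futures` and stub agents. The stubs and their outputs are hypothetical stand-ins for model API calls; the point is the control flow, where the Researcher and Analyst overlap and the Report step waits on both.

```python
from concurrent.futures import ThreadPoolExecutor

def researcher(topic):
    return {"topic": topic, "findings": ["spend up 18% YoY"]}

def analyst(topic):
    # In this sketch the Analyst works from the topic brief in parallel;
    # in a strict pipeline it would instead consume the Researcher's output.
    return {"trends": ["infrastructure spend accelerating"]}

def report(research, analysis):
    return (f"Report on {research['topic']}: "
            + "; ".join(research["findings"] + analysis["trends"]))

topic = "AI spending"
with ThreadPoolExecutor() as pool:
    r_future = pool.submit(researcher, topic)   # Researcher and Analyst overlap
    a_future = pool.submit(analyst, topic)
    research, analysis = r_future.result(), a_future.result()

final = report(research, analysis)              # Report step runs once both finish
print(final)
```

With real model calls the `ThreadPoolExecutor` pattern is the same: submit the independent stages, block on both futures, then run the synthesis stage with both results in hand.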

Scaling Agent Teams Across Your Business

Start with one working team. Document the setup, measure performance, refine. Once you have one successful pattern, replicate it to similar workflows. Market research teams. Code review teams. Customer support escalation teams. Client deliverable teams. Internal knowledge synthesis workflows.

Each will have slightly different agent roles and handoff formats, but the underlying structure is the same: specialized agents with clear scope, explicit handoff formats, coordinated workflow orchestration, and measurable impact metrics.

Don't try to build five agent teams simultaneously. That's an organizational coordination nightmare and you'll optimize for none of them well. Master one team. Measure it. Document what works. Optimize based on data. Then expand to similar workflows. Agent teams are powerful, but their power comes from specialization and clear inter-agent communication, which requires careful design, iteration, and real-world testing. Don't architect teams in a conference room. Build them iteratively in production.

Measuring Real Impact: A 90-Day Roadmap

Week 1-2: Pick your first agent team workflow (preferably high-volume and multi-stage). Design the agents and prompts based on the framework above. Test with 10 real examples. Measure elapsed time and token cost versus your current single-agent baseline.

Week 3-4: Run the agent team on 100 real examples from your actual workflow. Measure output quality against your own reference standards (human judgments or existing rubrics). Track error rates and rework percentage. Compare token-per-output cost against baseline.

Week 5-8: Optimize based on actual data. Adjust handoff formats if downstream agents are wasting tokens parsing upstream output. Refine agent prompts if you see specific failure patterns. Run another 100 examples and re-measure against your baseline.

Week 9-12: Document the successful pattern and configuration. Build playbooks for replicating this agent team structure to similar workflows in your organization. Plan rollout to 2-3 additional workflows.

By week 13, you have one proven agent team running in production, real data on its impact, and a clear pattern for scaling. This beats trying to architect five theoretical teams in parallel and delivering none of them well.

The Future: Agent Teams Become Your Competitive Moat

Agent teams are where AI goes from "powerful single tool" to "organizational infrastructure." Right now they feel novel. In 12 months, they'll feel like standard architecture for any complex knowledge workflow. By 2027, teams that haven't built agent capability will be at a severe disadvantage.

The organizations that build agent capability now, learning how to decompose complex tasks into agent sub-roles, specify clear handoff formats, coordinate agents, and measure impact, will have massive competitive advantages when agent systems become the standard deployment model for knowledge work. Start now while the patterns are still being established. The teams that figure out their specific agent configurations first will own those workflows, reduce time-to-value, lower operational costs, and improve output quality simultaneously.

Richard Batt has delivered 120+ AI and automation projects across 15+ industries. He helps businesses deploy AI that actually works, with battle-tested tools, templates, and implementation roadmaps. Featured in InfoWorld and WSJ.

Frequently Asked Questions

How long does it take to implement AI automation in a small business?

Most single-process automations take 1-5 days to implement and start delivering ROI within 30-90 days. Complex multi-system integrations take 2-8 weeks. The key is starting with one well-defined process, proving the value, then expanding.

Do I need technical skills to automate business processes?

Not for most automations. Tools like Zapier, Make.com, and N8N use visual builders that require no coding. About 80% of small business automation can be done without a developer. For the remaining 20%, you need someone comfortable with APIs and basic scripting.

Where should a business start with AI implementation?

Start with a process audit. Identify tasks that are high-volume, rule-based, and time-consuming. The best first automation is one that saves measurable time within 30 days. Across 120+ projects, the highest-ROI starting points are usually customer onboarding, invoice processing, and report generation.

How do I calculate ROI on an AI investment?

Measure the hours spent on the process before automation, multiply by fully loaded hourly cost, then subtract the tool cost. Most small business automations cost £50-500/month and save 5-20 hours per week. That typically means 300-1000% ROI in year one.

Which AI tools are best for business use in 2026?

For content and communication, Claude and ChatGPT lead. For data analysis, Gemini and GPT work well with spreadsheets. For automation, Zapier, Make.com, and N8N connect AI to your existing tools. The best tool is the one your team will actually use and maintain.

Put This Into Practice

I use versions of these approaches with my clients every week. The full templates, prompts, and implementation guides, covering the edge cases and variations you will hit in practice, are available inside the AI Ops Vault. It is your AI department for $97/month.

Want a personalised implementation plan first? Book your AI Roadmap session and I will map the fastest path from where you are now to working AI automation.
