Debate, Critique, and Adversarial Agents

AGAI 301 · Coordination, Consensus, and Emergent Behavior

Learn how debate and critique patterns use disagreement productively to improve reasoning, catch errors, and expose weak assumptions.

Key terms

critique = draft + rubric
debate → exposed assumptions
adversarial agent = controlled red team
judge quality determines final quality

Learning objectives

Differentiate critique, debate, and adversarial agent patterns.
Design a critique rubric for agent outputs.
Explain how debate can improve reasoning and tradeoff analysis.
Identify risks introduced by adversarial or debate-based systems.

Multi-agent systems can use disagreement as a tool. Instead of asking one agent to produce an answer, you can ask multiple agents to propose, challenge, critique, and revise. This is especially useful for reasoning-heavy tasks, policy analysis, code review, research synthesis, and safety evaluation.

A basic critique architecture looks like this:

[Generator] → draft
[Critic] → identifies problems
[Reviser] → improves draft
[Judge] → accepts or requests another revision
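
The critique architecture above can be sketched as a bounded loop. This is a minimal illustration, not a reference implementation: the function name `run_critique_loop`, the `max_rounds` stopping rule, and the four stub functions are assumptions made for this example, and the stubs stand in for real model calls so the loop can run offline.

```python
from typing import Callable, List, Tuple


def run_critique_loop(
    task: str,
    generate: Callable[[str], str],
    critique: Callable[[str], List[str]],
    revise: Callable[[str, List[str]], str],
    judge: Callable[[str], bool],
    max_rounds: int = 3,
) -> Tuple[str, bool]:
    """Return (final_draft, accepted). Stops after max_rounds even if not accepted."""
    draft = generate(task)
    for _ in range(max_rounds):
        issues = critique(draft)          # Critic: identify problems only
        if issues:
            draft = revise(draft, issues)  # Reviser: apply the fixes
        if judge(draft):                   # Judge: accept or loop again
            return draft, True
    return draft, False


# Hypothetical stand-ins for model calls.
def generate(task: str) -> str:
    return "Cache user sessions in memory"


def critique(draft: str) -> List[str]:
    return [] if "expiry" in draft else ["No session expiry is specified."]


def revise(draft: str, issues: List[str]) -> str:
    return draft + " with a 30-minute expiry"


def judge(draft: str) -> bool:
    return "expiry" in draft


final_draft, accepted = run_critique_loop(
    "Design session caching", generate, critique, revise, judge
)
print(final_draft, accepted)
```

Returning the `accepted` flag alongside the draft lets the caller distinguish a judged-good answer from one that simply ran out of rounds.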

A debate architecture uses multiple agents to argue for different conclusions or approaches:

[Agent A: Position 1] ↔ [Agent B: Position 2]
              ↓
           [Judge]
              ↓
          Final answer

The purpose is not theatrical argument. The purpose is error discovery.

Critique agents

A critique agent reviews another agent’s output against a rubric. It may check for factual support, missing requirements, invalid format, safety issues, or weak reasoning.

Example critic prompt:

You are a strict technical reviewer. Review the draft for:
1. Unsupported factual claims
2. Missing edge cases
3. Incorrect assumptions
4. Ambiguous recommendations
5. Security or privacy concerns

Return only a list of issues and suggested fixes. Do not rewrite the draft.

The instruction “Do not rewrite the draft” matters. It keeps the critic focused on finding problems and leaves the rewriting to the reviser.
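
One way to keep a rubric easy to review and change is to store it as data and render the critic prompt from it. The sketch below reuses the rubric items from the example prompt; the names `RUBRIC` and `build_critic_prompt` and the trailing "Draft:" section are assumptions for illustration.

```python
from typing import List

RUBRIC: List[str] = [
    "Unsupported factual claims",
    "Missing edge cases",
    "Incorrect assumptions",
    "Ambiguous recommendations",
    "Security or privacy concerns",
]


def build_critic_prompt(rubric: List[str], draft: str) -> str:
    """Render a numbered rubric into a critic prompt for the given draft."""
    numbered = "\n".join(
        f"{number}. {item}" for number, item in enumerate(rubric, start=1)
    )
    return (
        "You are a strict technical reviewer. Review the draft for:\n"
        f"{numbered}\n\n"
        "Return only a list of issues and suggested fixes. "
        "Do not rewrite the draft.\n\n"
        f"Draft:\n{draft}"
    )


prompt = build_critic_prompt(RUBRIC, "Store passwords in a text file.")
print(prompt)
```

Because the rubric is a plain list, adding or removing a check is a one-line change rather than a prompt rewrite.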

Debate agents

Debate agents are useful when there are multiple plausible answers. For example, selecting an architecture for an AI product might involve tradeoffs among speed, reliability, cost, and flexibility.

A debate setup:

Agent A: Argue for a ReAct architecture.
Agent B: Argue for a structured workflow architecture.
Agent C: Judge both arguments against reliability, cost, and safety.

The final answer may combine insights:

Use a structured workflow for high-impact actions, with a ReAct research node for open-ended information gathering.

Debate is most useful when agents are instructed to represent distinct criteria, not when they randomly disagree.
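
A debate with fixed positions and a judge might be orchestrated roughly as follows. This is a sketch under stated assumptions: `Debater`, `run_debate`, the `rounds` parameter, and the stub `argue` and `judge` functions are hypothetical names, and the stubs return canned strings where a real system would call a model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Debater:
    name: str
    position: str
    # Takes (position, transcript so far) and returns the next argument.
    argue: Callable[[str, List[str]], str]


def run_debate(
    debaters: List[Debater],
    judge: Callable[[List[str]], str],
    rounds: int = 2,
) -> Tuple[str, List[str]]:
    """Alternate turns for a fixed number of rounds, then ask the judge."""
    transcript: List[str] = []
    for _ in range(rounds):
        for debater in debaters:
            argument = debater.argue(debater.position, transcript)
            transcript.append(f"{debater.name}: {argument}")
    return judge(transcript), transcript


# Hypothetical stand-ins for model calls.
def stub_argue(position: str, transcript: List[str]) -> str:
    return f"{position} (responding to {len(transcript)} earlier turns)"


def stub_judge(transcript: List[str]) -> str:
    return f"Judged {len(transcript)} turns against reliability, cost, and safety."


debaters = [
    Debater("Agent A", "Argue for a ReAct architecture.", stub_argue),
    Debater("Agent B", "Argue for a structured workflow architecture.", stub_argue),
]
verdict, transcript = run_debate(debaters, stub_judge)
print(verdict)
```

Each debater sees the transcript so far, which is what lets a real agent rebut the other side instead of repeating its opening statement.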

Adversarial agents

An adversarial agent intentionally searches for weaknesses. It may attempt to break assumptions, find security flaws, identify prompt-injection risks, or construct edge cases.

Examples:

A red-team agent tests whether a customer support agent can be tricked into revealing private data.
A security agent reviews model-generated code for injection flaws.
A policy agent checks whether a draft violates compliance rules.
A test-generation agent creates inputs likely to break an extraction pipeline.

Adversarial agents should be constrained. They are meant to improve safety, not to execute harmful actions.
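
As a small illustration of a constrained adversarial agent, the sketch below plays the test-generation role against a toy extraction function. Everything here is hypothetical: `extract_amount` is a deliberately naive pipeline, `ADVERSARIAL_CASES` is a hand-written list standing in for model-generated inputs, and `max_cases` is an assumed budget. The constraint is that the red-team step only supplies inputs and records mismatches; it performs no other actions.

```python
import re
from typing import Callable, List, Optional, Tuple


def extract_amount(text: str) -> Optional[float]:
    """Toy pipeline under test: return the first dollar amount, or None."""
    match = re.search(r"\$(\d+(?:\.\d+)?)", text)
    return float(match.group(1)) if match else None


# Hand-written edge cases standing in for an agent's generated inputs.
ADVERSARIAL_CASES: List[Tuple[str, float]] = [
    ("Total: $19.99", 19.99),       # baseline, expected to pass
    ("Total: $1,250.00", 1250.0),   # thousands separator
    ("Total: $ 40", 40.0),          # space after the currency symbol
]


def red_team(
    pipeline: Callable[[str], Optional[float]],
    cases: List[Tuple[str, float]],
    max_cases: int = 10,
) -> List[dict]:
    """Run at most max_cases inputs and report every mismatch."""
    failures = []
    for text, expected in cases[:max_cases]:
        actual = pipeline(text)
        if actual != expected:
            failures.append({"input": text, "expected": expected, "actual": actual})
    return failures


failures = red_team(extract_amount, ADVERSARIAL_CASES)
for failure in failures:
    print(failure)
```

The failures become concrete regression tests for the pipeline, which is the safety-improving outcome the adversarial role is meant to produce.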

Example: multi-agent code review

A code review workflow might include:

[Implementer]
  ↓ creates patch
[Correctness Reviewer]
  ↓ checks whether patch solves the bug
[Security Reviewer]
  ↓ checks unsafe inputs, auth, secrets
[Performance Reviewer]
  ↓ checks complexity and resource use
[Test Agent]
  ↓ runs tests or proposes missing tests
[Final Judge]
  ↓ summarizes required changes

Each reviewer has a different lens. This is more useful than asking five agents to perform a generic review.

Structured reviewer output:

{
  "reviewer": "SecurityReviewer",
  "status": "changes_requested",
  "findings": [
    {
      "severity": "high",
      "issue": "User-controlled input is interpolated into SQL query.",
      "recommendation": "Use parameterized queries."
    }
  ]
}

Because every reviewer returns the same structure, the final judge can merge these findings into a single list of required changes.
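
A merge step over reviewer outputs of this shape could look like the sketch below. The field names come from the example above; the function name `merge_reviews`, the "medium" and "low" severities, the "approved" status, and the second sample review are assumptions added for illustration.

```python
from typing import Dict, List

# Assumed severity scale; only "high" appears in the example output.
SEVERITY_ORDER: Dict[str, int] = {"high": 0, "medium": 1, "low": 2}


def merge_reviews(reviews: List[dict]) -> dict:
    """Combine reviewer outputs into one report, most severe findings first."""
    findings = []
    for review in reviews:
        for finding in review["findings"]:
            # Tag each finding with its source reviewer before merging.
            findings.append({**finding, "reviewer": review["reviewer"]})
    # Unknown severities sort last; the sort is stable within a severity.
    findings.sort(key=lambda f: SEVERITY_ORDER.get(f["severity"], len(SEVERITY_ORDER)))
    blocked = any(r["status"] == "changes_requested" for r in reviews)
    return {
        "status": "changes_requested" if blocked else "approved",
        "findings": findings,
    }


reviews = [
    {
        "reviewer": "PerformanceReviewer",
        "status": "approved",
        "findings": [
            {
                "severity": "low",
                "issue": "Loop recomputes a constant on each iteration.",
                "recommendation": "Hoist the computation out of the loop.",
            }
        ],
    },
    {
        "reviewer": "SecurityReviewer",
        "status": "changes_requested",
        "findings": [
            {
                "severity": "high",
                "issue": "User-controlled input is interpolated into SQL query.",
                "recommendation": "Use parameterized queries.",
            }
        ],
    },
]

merged = merge_reviews(reviews)
print(merged["status"], [f["severity"] for f in merged["findings"]])
```

Keeping the merge deterministic means the final judge, whether a model or a person, starts from an ordered list rather than five separate reports.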

Risks of debate and critique

Debate can improve quality, but it can also create problems.

Risks include:

Agents may produce persuasive but unsupported arguments.
The judge may prefer confident language over correct evidence.
Agents may converge on a shared false assumption.
Debate may increase cost and latency.
Critique may become superficial if the rubric is vague.
Adversarial agents may generate unsafe content if not constrained.

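The main mitigation is grounding. Debate should use evidence, tool results, citations, tests, schemas, or deterministic validators whenever possible, so that the judge evaluates what can be checked rather than what sounds convincing. Grounding does not reduce cost or latency; control those with stopping rules.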
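
One simple form of grounding is a deterministic check that every claim cites evidence that actually exists. The sketch below assumes a hypothetical claim format with `text` and `evidence_ids` fields and an evidence store keyed by ID; none of these names come from a specific framework.

```python
from typing import Dict, List


def find_unsupported_claims(claims: List[dict], evidence: Dict[str, str]) -> List[str]:
    """Return the text of each claim with no citation or an unknown citation."""
    unsupported = []
    for claim in claims:
        cited = claim.get("evidence_ids", [])
        if not cited or any(evidence_id not in evidence for evidence_id in cited):
            unsupported.append(claim["text"])
    return unsupported


# Hypothetical evidence store and debate claims.
evidence = {"test-run-1": "Unit test suite output for the patch."}
claims = [
    {"text": "The patch passes the test suite.", "evidence_ids": ["test-run-1"]},
    {"text": "The patch is faster.", "evidence_ids": []},
    {"text": "The patch is secure.", "evidence_ids": ["audit-9"]},
]

unsupported = find_unsupported_claims(claims, evidence)
print(unsupported)
```

A check like this does not verify that the evidence supports the claim, only that a citation exists, so it filters out the weakest arguments before a judge spends effort on them.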

Judge design

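The judge is critical because the final output can be no better than the judge's ability to tell a strong answer from a weak one. A weak judge can choose the wrong output even when a correct one is available. A good judge follows explicit criteria.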

Example judge rubric:

Choose the answer that:
1. Best satisfies the user request.
2. Is best supported by provided evidence.
3. States uncertainty accurately.
4. Avoids unsupported claims.
5. Respects safety and permission constraints.

For high-risk decisions, a model should not be the only judge. Add deterministic checks or human review.
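
Layering deterministic checks and human review around a model judge might look like the following sketch. The function name `judge_with_gates`, the decision labels, and the two sample validators are assumptions for illustration; the model judge is a stub.

```python
from typing import Callable, Dict


def judge_with_gates(
    candidate: str,
    validators: Dict[str, Callable[[str], bool]],
    model_judge: Callable[[str], bool],
    high_risk: bool,
) -> dict:
    """Run deterministic checks first, then the model judge, then escalate if needed."""
    failed = [name for name, check in validators.items() if not check(candidate)]
    if failed:
        # Deterministic failures reject outright; the model is never consulted.
        return {"decision": "reject", "failed_checks": failed}
    if not model_judge(candidate):
        return {"decision": "reject", "failed_checks": ["model_judge"]}
    if high_risk:
        # The model's approval is not sufficient for high-risk decisions.
        return {"decision": "needs_human_review", "failed_checks": []}
    return {"decision": "accept", "failed_checks": []}


# Hypothetical validators and a stub model judge.
validators = {
    "non_empty": lambda text: bool(text.strip()),
    "no_secret": lambda text: "API_KEY" not in text,
}


def stub_model_judge(text: str) -> bool:
    return True


print(judge_with_gates("Refund the order.", validators, stub_model_judge, high_risk=True))
print(judge_with_gates("Use API_KEY=abc", validators, stub_model_judge, high_risk=False))
```

Running the cheap deterministic checks first also saves a model call whenever a candidate fails a hard constraint.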

Practical takeaway

Debate and critique patterns turn disagreement into quality control. They are valuable when tasks involve tradeoffs, uncertainty, safety, or complex reasoning.

Use them deliberately. Define roles, rubrics, evidence requirements, and stopping rules. Do not add debate just to make the system seem intelligent. Add debate when it catches errors that a single agent is likely to miss.
