Agent Design Principles

Before building an agent, ask three questions:

- What goal is the agent trying to accomplish?
- What tools and information does it need?
- What should it never be allowed to do?

These questions often matter more than the choice of model. Many agent failures come not from weak models but from vague goals, excessive permissions, missing feedback, and unclear stopping conditions.

A strong design principle is minimal authority. Give the agent the least power required to complete the task. If it only needs to read files, do not give it write access. If it only needs to draft a message, do not let it send the message automatically. If it only needs to recommend a database change, do not let it execute the change without review.

Another principle is reversibility. Prefer actions that can be undone. Drafting is safer than sending. Creating a pull request is safer than pushing directly to production. Archiving is safer than deleting. Simulating a workflow is safer than executing it.
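
As a minimal sketch of "archiving is safer than deleting", the following moves a file into an archive folder and shows that the action can be undone. The function names `archive_file` and `restore_file` are hypothetical, and a real version would also need to handle name collisions in the archive directory.

```python
import shutil
import tempfile
from pathlib import Path


def archive_file(path: Path, archive_dir: Path) -> Path:
    """Move a file into an archive directory instead of deleting it."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / path.name
    shutil.move(str(path), str(target))
    return target


def restore_file(archived: Path, original_dir: Path) -> Path:
    """Undo an archive by moving the file back to its original directory."""
    target = original_dir / archived.name
    shutil.move(str(archived), str(target))
    return target


# Demonstration in a throwaway directory.
workspace = Path(tempfile.mkdtemp())
note = workspace / "note.txt"
note.write_text("draft findings")

archived = archive_file(note, workspace / "archive")
restored = restore_file(archived, workspace)
```

Because nothing is destroyed, a mistaken archive costs one extra move rather than lost data.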

Agents should also be transparent. Users and developers need to understand what the agent attempted, what tools it used, what changed, and where uncertainty remains.

Finally, agents should degrade gracefully. When they cannot complete a task, they should explain the blocker, preserve useful partial progress, and avoid pretending that the task succeeded.

The system prompt is one of the main tools for shaping agent behavior. It defines the agent’s role, goals, constraints, tool-use policy, safety boundaries, and response style.

Worked example: minimal authority

Imagine you are building a simple research agent for technical topics. The user wants it to gather information, compare sources, and produce a concise briefing.

A tempting tool set might include:

- web_search
- fetch_url
- read_pdf
- summarize_document
- send_email
- write_to_database
- post_to_company_wiki
- browse_any_website

This is too much authority for a first version. The agent’s purpose is research, not publishing or communication. A safer initial tool set would be:

- web_search(query)
- fetch_url(url)
- extract_text_from_pdf(url)
- save_draft(title, body)

The excluded tools are just as important as the included tools. The agent should not send emails because research results may need review. It should not post to the company wiki because publishing has organizational consequences. It should not write to arbitrary databases because that creates data integrity risk.

The save_draft tool is a good compromise. It lets the agent produce useful work while keeping a human in the loop. If the draft is wrong, it can be edited or discarded.

Minimal authority does not make an agent weak. It makes the agent safer, easier to evaluate, and easier to trust.
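
One way to enforce a restricted tool set in code is a deny-by-default allowlist. The sketch below is illustrative only: `ToolRegistry`, `ToolNotAllowedError`, and the stub `save_draft` implementation are hypothetical names, not part of any particular agent framework.

```python
from typing import Callable, Dict, List


class ToolNotAllowedError(Exception):
    """Raised when the agent requests a tool outside its allowlist."""


class ToolRegistry:
    """Hypothetical registry that exposes only explicitly granted tools."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., object]] = {}

    def grant(self, name: str, func: Callable[..., object]) -> None:
        self._tools[name] = func

    def call(self, name: str, **kwargs: object) -> object:
        if name not in self._tools:
            # Deny by default: anything not granted is unavailable.
            raise ToolNotAllowedError(f"Tool '{name}' is not granted to this agent")
        return self._tools[name](**kwargs)


# Stub standing in for a real draft store.
drafts: List[dict] = []


def save_draft(title: str, body: str) -> str:
    drafts.append({"title": title, "body": body})
    return f"draft-{len(drafts)}"


registry = ToolRegistry()
registry.grant("save_draft", save_draft)

draft_id = registry.call("save_draft", title="Briefing", body="Findings so far.")

try:
    registry.call("send_email", to="team@example.com", body="Findings")
    email_blocked = False
except ToolNotAllowedError:
    email_blocked = True
```

The design choice here is that new capabilities require an explicit `grant` call, so expanding the agent's authority is always a visible, reviewable change.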

Sample system prompt for a research agent

A system prompt should be specific enough to guide behavior but not so long that the model loses the central task. Here is a sample:

You are a technical research agent. Your job is to help users produce accurate, concise research briefings on software and AI topics.

Goals:
- Clarify the user's research question when needed.
- Use available tools to gather current and reliable information.
- Prefer primary sources such as official documentation, standards, release notes, and research papers.
- Compare sources when claims are uncertain or contested.
- Produce a final briefing that separates facts, interpretations, and uncertainties.

Tool policy:
- Use web_search when you need to locate sources.
- Use fetch_url only for pages that are likely to be relevant.
- Use extract_text_from_pdf for papers, reports, or official PDFs.
- Do not invent citations or claim to have read a source you did not inspect.

Safety and authority:
- You may create drafts, summaries, and recommendations.
- You may not send messages, publish content, purchase products, or modify external systems.
- If a user asks for an action outside your authority, explain the limitation and offer a draft or checklist instead.

Failure behavior:
- If sources conflict, describe the disagreement.
- If information is unavailable or stale, say so.
- If the task cannot be completed, provide partial findings and the reason.

Response style:
- Be clear, professional, and concise.
- Include source notes when relevant.
- End with practical next steps when useful.

The opening paragraph defines the role. The goals define success. The tool policy tells the agent when to use each tool. The safety section limits authority. The failure behavior section guards against false confidence. The response style keeps outputs consistent.

A good system prompt does not guarantee correct behavior, but it creates a stable behavioral contract between the application and the model.
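
If the prompt is assembled in code, keeping each section as a separate piece makes it easier to review and to notice when one is missing. The following is a rough sketch under that assumption; `build_system_prompt` and `REQUIRED_SECTIONS` are hypothetical names, and the section titles are taken from the sample prompt.

```python
from typing import Dict, List

# Section names taken from the sample prompt.
REQUIRED_SECTIONS = [
    "Goals",
    "Tool policy",
    "Safety and authority",
    "Failure behavior",
    "Response style",
]


def build_system_prompt(role: str, sections: Dict[str, List[str]]) -> str:
    """Join the role line and bulleted sections, refusing incomplete prompts."""
    missing = [name for name in REQUIRED_SECTIONS if name not in sections]
    if missing:
        raise ValueError("Missing prompt sections: " + ", ".join(missing))
    parts = [role]
    for name in REQUIRED_SECTIONS:
        bullets = "\n".join(f"- {item}" for item in sections[name])
        parts.append(f"{name}:\n{bullets}")
    return "\n\n".join(parts)


sections = {
    "Goals": ["Clarify the user's research question when needed."],
    "Tool policy": ["Use web_search when you need to locate sources."],
    "Safety and authority": ["You may not send messages or publish content."],
    "Failure behavior": ["If sources conflict, describe the disagreement."],
    "Response style": ["Be clear, professional, and concise."],
}
prompt = build_system_prompt("You are a technical research agent.", sections)
```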

Designing for error states

Agents fail in predictable ways. They may call the wrong tool, pass invalid arguments, retrieve irrelevant information, misunderstand the user’s goal, encounter unavailable services, or run into contradictory information.

You should design error behavior before deployment.

For example, if a search tool returns no results, the agent should not fabricate an answer. It might try a broader query, ask for clarification, or report that it could not find reliable information.
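
The "try a broader query, then report honestly" behavior can be sketched as follows. This is only an illustration: `search_with_fallback`, the stub `fake_search` backend, and the example URL are assumptions, not a real search API.

```python
from typing import Callable, Dict, List


def search_with_fallback(
    search: Callable[[str], List[str]],
    query: str,
    broader_queries: List[str],
) -> Dict[str, object]:
    """Try the original query, then broader ones; never invent results."""
    for candidate in [query, *broader_queries]:
        results = search(candidate)
        if results:
            return {"status": "found", "query": candidate, "results": results}
    # Nothing found: return an explicit not_found status instead of guessing.
    return {"status": "not_found", "query": query, "results": []}


# Stub search backend used only for this demonstration.
FAKE_INDEX = {"framework y release notes": ["https://example.com/framework-y/releases"]}


def fake_search(query: str) -> List[str]:
    return FAKE_INDEX.get(query.lower(), [])


found = search_with_fallback(
    fake_search, "Framework Y latest stable changelog", ["Framework Y release notes"]
)
missing = search_with_fallback(fake_search, "Framework Z internals", [])
```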

If a tool returns an error, the agent should distinguish between temporary and permanent failures:

Temporary failure:
- Network timeout
- Rate limit
- Service unavailable

Permanent or task-level failure:
- User lacks permission
- Record does not exist
- Required field is missing
- Requested action is outside policy

For temporary failures, retrying may be reasonable. For permanent failures, retrying wastes time and may cause harm. The agent should explain the blocker and stop or ask for the missing requirement.
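
A minimal sketch of this distinction, assuming tool errors have already been mapped onto two hypothetical exception types (`TemporaryToolError` and `PermanentToolError`): temporary failures are retried with exponential backoff, while permanent failures propagate on the first attempt.

```python
import time
from typing import Callable


class TemporaryToolError(Exception):
    """Timeouts, rate limits, unavailable services."""


class PermanentToolError(Exception):
    """Missing permissions, missing records, policy violations."""


def call_with_retry(
    tool: Callable[[], object],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> object:
    for attempt in range(1, max_attempts + 1):
        try:
            return tool()
        except TemporaryToolError:
            if attempt == max_attempts:
                raise  # Retry budget exhausted: surface the failure.
            sleep(base_delay * 2 ** (attempt - 1))  # Exponential backoff.
        # PermanentToolError is deliberately not caught, so it propagates at once.


attempts = {"flaky": 0, "forbidden": 0}


def flaky_tool() -> str:
    attempts["flaky"] += 1
    if attempts["flaky"] < 3:
        raise TemporaryToolError("rate limit")
    return "ok"


def forbidden_tool() -> str:
    attempts["forbidden"] += 1
    raise PermanentToolError("user lacks permission")


def no_sleep(seconds: float) -> None:
    """Skip real waiting in this demonstration."""


flaky_result = call_with_retry(flaky_tool, sleep=no_sleep)

blocker = None
try:
    call_with_retry(forbidden_tool, sleep=no_sleep)
except PermanentToolError as exc:
    blocker = str(exc)  # The agent would report this blocker to the user.
```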

When an agent cannot complete a task, a good response includes:

- What it attempted
- What failed
- What partial result it has
- What the user or system can do next

This is much better than a vague apology. In professional systems, partial progress is valuable.
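
One way to make such responses consistent is to give them a fixed structure. The `FailureReport` dataclass below is a hypothetical sketch with one field per item in the list above; the example values are invented for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FailureReport:
    attempted: List[str]
    failed: str
    partial_result: str
    next_steps: List[str]

    def render(self) -> str:
        """Format the report as plain text for the user."""
        lines = ["Attempted:"]
        lines += [f"- {item}" for item in self.attempted]
        lines += [
            f"Failed: {self.failed}",
            f"Partial result: {self.partial_result}",
            "Next steps:",
        ]
        lines += [f"- {item}" for item in self.next_steps]
        return "\n".join(lines)


report = FailureReport(
    attempted=["web_search for release notes", "fetch_url on the top result"],
    failed="fetch_url timed out",
    partial_result="Two candidate sources identified but not yet read",
    next_steps=["Retry later", "Provide a direct link to the release notes"],
)
text = report.render()
```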

Stopping conditions

A common beginner mistake is designing an agent that keeps going without clear stopping rules. Every agent needs conditions for success, failure, and escalation.

Success might mean:

- A final answer has been produced with sufficient evidence.
- A test suite passes.
- A draft has been created.
- A requested record has been found.

Failure might mean:

- Required data is unavailable.
- The user request is ambiguous and cannot be resolved.
- A tool repeatedly fails.
- The task requires authority the agent does not have.

Escalation might mean:

- Ask the user for approval.
- Hand off to a human expert.
- Create a draft instead of executing an action.
- Open a ticket for review.

Stopping conditions prevent wasted tool calls and reduce the chance of compounding errors.
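
In code, stopping conditions often take the form of an explicit outcome type plus a step budget. The loop below is a simplified sketch: `Outcome`, `run_agent`, and the toy step functions are hypothetical, and a real agent step would call a model and tools rather than return a fixed value.

```python
from enum import Enum
from typing import Callable, Optional, Tuple


class Outcome(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    ESCALATE = "escalate"


def run_agent(
    step: Callable[[int], Optional[Outcome]], max_steps: int = 10
) -> Tuple[Outcome, int]:
    """Run `step` until it reports an outcome or the step budget runs out."""
    for i in range(1, max_steps + 1):
        outcome = step(i)
        if outcome is not None:
            return outcome, i
    # Budget exhausted without a decision: fail instead of looping forever.
    return Outcome.FAILURE, max_steps


def finishes_on_third_step(i: int) -> Optional[Outcome]:
    return Outcome.SUCCESS if i == 3 else None


def needs_approval(i: int) -> Optional[Outcome]:
    return Outcome.ESCALATE


def never_finishes(i: int) -> Optional[Outcome]:
    return None


done = run_agent(finishes_on_third_step)
escalated = run_agent(needs_approval)
gave_up = run_agent(never_finishes, max_steps=5)
```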

Introduction to evaluation

Evaluation answers the question: How do you know the agent is working?

For a simple research agent, you might evaluate final answers using criteria such as:

- Accuracy
- Source quality
- Completeness
- Clarity
- Appropriate uncertainty

But agent evaluation should also inspect the process:

- Did the agent use tools when needed?
- Did it avoid unnecessary tools?
- Did it choose reliable sources?
- Did it recover from failed tool calls?
- Did it respect authority limits?
- Did it stop at the right time?

A practical evaluation set might contain 25 to 100 representative tasks. Include easy cases, ambiguous cases, tool-failure cases, and cases where the agent should refuse or escalate.

Example test case:

{
  "task": "Create a short briefing on the latest stable release of Framework Y.",
  "expected_behaviors": [
    "Searches for official documentation or release notes",
    "Does not rely only on model memory",
    "Reports the version with source context",
    "Mentions uncertainty if sources disagree",
    "Does not invent unsupported claims"
  ]
}
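
Some expected behaviors can be checked automatically against a recorded trace of the agent's run. The sketch below assumes a hypothetical trace format (a dict with `tool_calls` and `final_answer`) and uses deliberately crude checks; behaviors such as "does not invent unsupported claims" would still need human or model-based review.

```python
from typing import Callable, Dict


def used_web_search(trace: dict) -> bool:
    return any(call["tool"] == "web_search" for call in trace["tool_calls"])


def reports_source(trace: dict) -> bool:
    # Crude proxy: the answer mentions at least one URL the agent fetched.
    fetched = [
        call["args"]["url"]
        for call in trace["tool_calls"]
        if call["tool"] == "fetch_url"
    ]
    return any(url in trace["final_answer"] for url in fetched)


# Map expected behaviors from the test case to automated checks.
CHECKS: Dict[str, Callable[[dict], bool]] = {
    "Searches for official documentation or release notes": used_web_search,
    "Reports the version with source context": reports_source,
}


def evaluate(trace: dict) -> Dict[str, bool]:
    return {behavior: check(trace) for behavior, check in CHECKS.items()}


good_trace = {
    "tool_calls": [
        {"tool": "web_search", "args": {"query": "Framework Y release notes"}},
        {"tool": "fetch_url", "args": {"url": "https://example.com/framework-y/releases"}},
    ],
    "final_answer": "The latest stable release is listed at "
    "https://example.com/framework-y/releases.",
}
memory_only_trace = {
    "tool_calls": [],
    "final_answer": "The latest version is probably recent.",
}

good_scores = evaluate(good_trace)
memory_scores = evaluate(memory_only_trace)
```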

Evaluation does not need to be perfect at the beginning. The key is to make behavior observable. Log tool calls, inputs, outputs, errors, and final responses. Review failures and turn them into new test cases.
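
A simple way to make tool use observable is to wrap every tool so that inputs, outputs, and errors are recorded. The `logged` wrapper and the in-memory `tool_log` list below are hypothetical; a production system would more likely write to a structured log store.

```python
import time
from typing import Callable, List

tool_log: List[dict] = []


def logged(name: str, func: Callable[..., object]) -> Callable[..., object]:
    """Wrap a tool so that every call is recorded, whether it succeeds or fails."""

    def wrapper(**kwargs: object) -> object:
        record = {"tool": name, "input": kwargs, "started_at": time.time()}
        try:
            record["output"] = func(**kwargs)
            return record["output"]
        except Exception as exc:
            record["error"] = f"{type(exc).__name__}: {exc}"
            raise
        finally:
            # Runs on both success and failure.
            tool_log.append(record)

    return wrapper


def fetch_stub(url: str) -> str:
    return f"<html for {url}>"


def broken_stub(url: str) -> str:
    raise TimeoutError("network timeout")


fetch_url = logged("fetch_url", fetch_stub)
broken_fetch = logged("fetch_url", broken_stub)

page = fetch_url(url="https://example.com")
try:
    broken_fetch(url="https://example.com/slow")
except TimeoutError:
    pass
```

Failed calls captured this way are natural candidates for new test cases.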

Practical build sequence

For your first agent, keep the scope narrow:

- Pick one task type.
- Give the agent read-only tools first.
- Add one safe output action, such as creating a draft.
- Log every tool call.
- Test with realistic examples.
- Add permissions only when necessary.
- Require confirmation for irreversible actions.

This sequence creates a path from prototype to production without giving the agent excessive autonomy too early.
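
The last step, requiring confirmation for irreversible actions, can be sketched as a small gate in front of action execution. The set of irreversible action names and the `execute_action` function are assumptions for illustration; in practice the `confirm` callback would ask a human.

```python
from typing import Callable, List

# Hypothetical classification; a real system would define its own list.
IRREVERSIBLE_ACTIONS = {"send_email", "delete_record", "push_to_production"}


def execute_action(
    name: str, handler: Callable[[], None], confirm: Callable[[str], bool]
) -> str:
    """Run reversible actions directly; irreversible ones only after approval."""
    if name in IRREVERSIBLE_ACTIONS and not confirm(name):
        return "blocked: confirmation required"
    handler()
    return "executed"


performed: List[str] = []


def deny(action: str) -> bool:
    return False


def approve(action: str) -> bool:
    return True


r1 = execute_action("save_draft", lambda: performed.append("save_draft"), deny)
r2 = execute_action("send_email", lambda: performed.append("send_email"), deny)
r3 = execute_action("send_email", lambda: performed.append("send_email"), approve)
```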

The best beginner agents are not the most autonomous. They are the clearest, safest, and easiest to inspect. Once you can trust the loop, the tools, the memory, and the evaluation process, you can gradually expand what the agent is allowed to do.

Learning objectives