Environments and Memory

An agent does not operate in a vacuum. It acts inside an environment: the external context it can observe, reason about, and change.

For a web-browsing agent, the environment includes pages, links, forms, search results, browser state, and network responses. For a coding agent, the environment includes the repository, files, tests, package manager, terminal output, and version control state. For a customer support agent, the environment includes user messages, account records, support policies, ticket history, and available business actions.

Environments can be described using several properties:

Fully or partially observable: Can the agent see the full state, or only part of it?
Deterministic or stochastic: Does the same action always produce the same result?
Static or dynamic: Does the environment change while the agent is thinking?
Discrete or continuous: Do states and actions come as distinct, countable options, or do they vary over a continuous range?
Single-agent or multi-agent: Is the agent acting alone, or alongside users, systems, or other agents?

Memory matters because agents often need context beyond the current prompt. Memory helps the agent remember goals, previous actions, user preferences, intermediate results, and long-term knowledge.

Common memory types include:

Working memory — temporary state used during the current task.
Episodic memory — records of past events, interactions, or actions.
Semantic memory — durable facts and concepts.
Procedural memory — learned processes, routines, or strategies.
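
As a rough illustration, the four types can be pictured as separate slots in one container. This is a minimal sketch, not a reference design: the class name `AgentMemory`, its field names, and the `end_task` method are all hypothetical, and real systems usually back each slot with different storage.

```python
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    """Hypothetical container with one slot per memory type."""
    working: dict[str, object] = field(default_factory=dict)         # current-task state
    episodic: list[str] = field(default_factory=list)                # what happened, in order
    semantic: dict[str, str] = field(default_factory=dict)           # durable facts
    procedural: dict[str, list[str]] = field(default_factory=dict)   # named routines

    def end_task(self) -> None:
        """Working memory is temporary, so it is cleared when the task ends."""
        self.working.clear()


memory = AgentMemory()
memory.working["goal"] = "fix failing test"
memory.episodic.append("ran test suite: 1 failure")
memory.semantic["preferred_language"] = "Python"
memory.procedural["triage"] = ["reproduce", "isolate", "patch", "re-run tests"]

memory.end_task()
print(memory.working)        # {}
print(len(memory.episodic))  # 1
```

Only the working slot is emptied at the end of the task; the other three persist, which is the distinction the list above draws.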

Environment property examples

A fully observable environment is one where the agent can see everything relevant to the task. A simple chess engine, for example, can observe the entire board. For a coding agent working in a small repository, the environment may be close to fully observable if the agent can read all files, run tests, and inspect configuration.

A partially observable environment is more common. A web-browsing agent may see the current page but not the server-side logic, hidden form validation, user session state, or future page changes. A support agent may see a customer’s ticket history but not the customer’s emotional state, internal business exceptions, or missing records in another system.

Deterministic environments are predictable. If a formatter is run on the same file with the same configuration, it should produce the same output. Stochastic environments involve uncertainty: the same action can lead to different outcomes. A web search may return different results over time. A recommendation API may rank results differently from one call to the next. A human user may respond unpredictably.

Static environments do not change while the agent is working. A local text file is mostly static unless another process edits it. Dynamic environments change during the task. A stock-trading environment, live chat queue, multiplayer game, or production incident dashboard may shift second by second.

These properties affect design. In a partially observable environment, the agent should ask clarifying questions or gather more evidence. In a stochastic environment, it should avoid assuming that one result is guaranteed. In a dynamic environment, it may need to refresh observations before acting.
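
One way to make these design consequences concrete is to record the properties as data and derive precautions from them. The sketch below is illustrative only: `EnvironmentProfile`, `precautions`, and the wording of each precaution are hypothetical names chosen for this example, not part of any framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentProfile:
    """Hypothetical description of an environment using three of the properties above."""
    fully_observable: bool
    deterministic: bool
    static: bool


def precautions(env: EnvironmentProfile) -> list[str]:
    """Map environment properties to the precautions named in the text."""
    steps = []
    if not env.fully_observable:
        steps.append("ask clarifying questions or gather more evidence")
    if not env.deterministic:
        steps.append("do not assume a single result is guaranteed")
    if not env.static:
        steps.append("refresh observations before acting")
    return steps


# A web-browsing agent: partially observable, stochastic, and dynamic.
web_browsing = EnvironmentProfile(
    fully_observable=False, deterministic=False, static=False
)
for step in precautions(web_browsing):
    print("-", step)

# Running a formatter on a local file: no extra precautions are derived.
local_formatter = EnvironmentProfile(
    fully_observable=True, deterministic=True, static=True
)
print(precautions(local_formatter))  # []
```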

Worked example: web browsing agent

Consider an agent asked to compare prices for a product.

The environment is partially observable because the agent can see web pages but not every retailer, hidden fees, inventory changes, or personalized pricing rules. It is dynamic because prices and availability can change. It is stochastic because search results and recommendations may vary.

A robust agent should therefore:

Search multiple sources.
Prefer official or reputable pages.
Record timestamps.
Distinguish listed price from final checkout price.
Avoid making purchases without explicit user confirmation.
Report uncertainty clearly.
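
Several of these practices come down to what the agent records about each price it sees. The following is a minimal sketch under stated assumptions: `PriceObservation`, its fields, `summarize`, and the example store names are hypothetical, and a fixed timestamp is used so the output is reproducible (a real agent would record the current time).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class PriceObservation:
    """One price sighting; all field names are hypothetical."""
    source: str
    listed_price: float
    observed_at: datetime
    checkout_price: Optional[float] = None  # None means the final price was not verified


def summarize(observations: list[PriceObservation]) -> str:
    """Report the cheapest listed price and flag what was not verified."""
    if not observations:
        return "No prices found."
    best = min(observations, key=lambda o: o.listed_price)
    line = (
        f"Lowest listed price: {best.listed_price:.2f} at {best.source} "
        f"(seen {best.observed_at:%Y-%m-%d %H:%M} UTC)."
    )
    if best.checkout_price is None:
        line += " Final checkout price not verified."
    return line


# Fixed timestamp for a reproducible example.
seen = datetime(2024, 1, 15, 9, 30, tzinfo=timezone.utc)
observations = [
    PriceObservation("store-a.example", 49.99, seen, checkout_price=54.98),
    PriceObservation("store-b.example", 47.50, seen),
]
print(summarize(observations))
```

Keeping the listed and checkout prices as separate fields lets the report state which one was actually confirmed instead of implying certainty.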

The same task would be simpler against an internal database where all prices are known and change only on a fixed schedule.

In-context memory versus vector retrieval

Memory can be implemented in several ways. Two common approaches are in-context memory and vector store retrieval.

In-context memory means placing relevant information directly into the model’s prompt. For example:

User preference memory:
- The user prefers Python examples.
- The user works mostly on Windows.
- The user wants concise explanations unless debugging.

This is simple and fast when the memory is small. The model can immediately use the information because it is already in the context window. But in-context memory does not scale indefinitely. Context windows have limits, and too much irrelevant memory can distract the model.
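
A minimal sketch of this approach follows. The function name `build_prompt` and the `max_memory_chars` parameter are hypothetical, and a character budget stands in for a token budget; a real system would count tokens with the tokenizer of the model in use.

```python
def build_prompt(memory_items: list[str], question: str, max_memory_chars: int = 200) -> str:
    """Prepend memory lines to the question, stopping at a character budget."""
    kept = []
    used = 0
    for item in memory_items:
        line = f"- {item}"
        if used + len(line) > max_memory_chars:
            break  # stop before the memory block exceeds its budget
        kept.append(line)
        used += len(line)
    memory_block = "\n".join(kept)
    return f"User preference memory:\n{memory_block}\n\nQuestion: {question}"


preferences = [
    "The user prefers Python examples.",
    "The user works mostly on Windows.",
    "The user wants concise explanations unless debugging.",
]
print(build_prompt(preferences, "How do I list files in a folder?"))
```

With the default budget all three preferences fit. Lowering `max_memory_chars` drops later items, which is why the order of memory items matters once the budget becomes tight.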

Vector store retrieval stores many documents or memory records as embeddings. When the user asks a question, the system retrieves the most relevant records and inserts only those into the prompt.

To compare the two approaches, suppose a developer frequently asks questions that span many projects. In-context memory might include a short summary of the current project. Vector retrieval might search thousands of past notes, code snippets, architecture decisions, and error logs to find the few most relevant records.

A simplified retrieval flow looks like this:

User asks a question
→ Convert question to embedding
→ Search vector database for similar memory records
→ Retrieve top results
→ Insert results into model context
→ Generate answer
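
The flow above can be sketched end to end with a toy stand-in for each component. This is not how a production system embeds text: the `embed` function below is a simple bag-of-words count used as an assumption in place of a learned embedding model, a Python list stands in for the vector database, and the final generation step is left out. All function names and sample records are hypothetical.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, records: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k records most similar to the question."""
    query = embed(question)
    ranked = sorted(records, key=lambda r: cosine(query, embed(r)), reverse=True)
    return ranked[:top_k]


records = [
    "billing service uses postgres for invoices",
    "frontend build fails when node version is too old",
    "decided to cache search results in redis",
]
question = "why does the frontend build fail"

# Retrieve, then insert only the retrieved records into the model context.
context = "\n".join(retrieve(question, records, top_k=1))
prompt = f"Relevant notes:\n{context}\n\nQuestion: {question}"
print(prompt)
```

Here the question shares the words "frontend" and "build" with the second record and no words with the others, so that record ranks first. Word overlap is a crude similarity signal; learned embeddings are used in practice because they can match text that shares meaning but not vocabulary.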

Vector retrieval scales better, but it introduces retrieval risk. The system may retrieve irrelevant records or miss important ones. Developers must tune chunking, metadata, ranking, freshness, and filtering.

Tradeoffs between memory types

Working memory is fast and task-specific. It is ideal for tracking a plan, intermediate results, and recent tool outputs. But it disappears when the task ends unless saved.

Episodic memory is useful for remembering what happened. A support agent might remember that it already offered a refund option. A coding agent might remember that a test failed before the last patch. Episodic memory helps avoid repetition and supports auditability.
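
A minimal sketch of the support-agent case, assuming an append-only log that the agent consults before acting. The class `EpisodicLog`, its methods, and the action label `offer_refund` are hypothetical names for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class EpisodicLog:
    """Append-only record of what the agent did; names are hypothetical."""
    events: list[tuple[str, str]] = field(default_factory=list)

    def record(self, action: str, detail: str) -> None:
        """Store one event as an (action, detail) pair."""
        self.events.append((action, detail))

    def already_did(self, action: str) -> bool:
        """Check whether an action with this label was recorded before."""
        return any(past_action == action for past_action, _ in self.events)


log = EpisodicLog()
log.record("offer_refund", "offered a refund option to the customer")

# Before repeating an action, check the log.
if log.already_did("offer_refund"):
    print("Refund already offered; follow up instead of repeating it.")
```

Because events are only appended and never rewritten, the same log can later be read back as an audit trail.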

Semantic memory stores durable facts. This could include product documentation, user preferences, company policies, or domain knowledge. Semantic memory is useful when facts remain valid across tasks, but it must be updated when reality changes.

Procedural memory stores how to do things. For example, an internal operations agent might learn the standard sequence for triaging a deployment failure. Procedural memory can make agents more efficient, but it must be carefully governed. A bad procedure repeated confidently can cause repeated failures.

The main tradeoffs are:

Speed: In-context memory is usually fastest; retrieval adds latency.
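Cost: Larger prompts raise per-request token costs; retrieval adds infrastructure for embedding, indexing, and storage.
Reliability: In-context memory is easy to inspect because it sits in the prompt; retrieved memory is only as good as the search that selects it.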
Freshness: Stored memory can become stale.
Privacy: Persistent memory must be handled with consent, access controls, and deletion mechanisms.
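
The freshness and privacy points can be partly enforced in code. The sketch below assumes each stored record carries a timestamp; `MemoryRecord`, `fresh_records`, `delete_key`, the 90-day limit, and the sample values are hypothetical choices for illustration, and consent and access control are outside its scope.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class MemoryRecord:
    """A stored memory item with the time it was saved (hypothetical shape)."""
    key: str
    value: str
    stored_at: datetime


def fresh_records(
    records: list[MemoryRecord], now: datetime, max_age: timedelta
) -> list[MemoryRecord]:
    """Keep only records no older than max_age."""
    return [r for r in records if now - r.stored_at <= max_age]


def delete_key(records: list[MemoryRecord], key: str) -> list[MemoryRecord]:
    """Remove every record with the given key, e.g. on a user deletion request."""
    return [r for r in records if r.key != key]


# Fixed reference time for a reproducible example.
now = datetime(2024, 6, 1, tzinfo=timezone.utc)
records = [
    MemoryRecord("timezone", "UTC+1", now - timedelta(days=400)),
    MemoryRecord("editor", "VS Code", now - timedelta(days=10)),
]

usable = fresh_records(records, now, max_age=timedelta(days=90))
print([r.key for r in usable])      # ['editor']

remaining = delete_key(records, "editor")
print([r.key for r in remaining])   # ['timezone']
```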

Choosing the right memory approach

Start with the task. A short, one-session data analysis agent may only need working memory. It needs to remember the user’s goal, the files inspected, and the transformations already performed.

A personal assistant may need semantic memory for stable preferences, such as preferred meeting hours or writing style. But it should avoid storing sensitive or unnecessary details.

A research assistant may benefit from vector retrieval over a document collection. It needs to find relevant passages from a large corpus, not remember every document in the prompt.

A customer support agent may need episodic memory inside the current ticket and semantic memory for policies. It should remember what has already been tried while grounding decisions in approved policy documents.

A workflow automation agent may need procedural memory in the form of explicit playbooks rather than vague learned habits. For high-stakes workflows, procedures should be reviewed and versioned.
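
An explicit playbook can be as simple as a versioned, reviewable record. In this sketch the `Playbook` class, the `reviewed` flag, the selection rule (highest reviewed version wins), and the example steps are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Playbook:
    """A named procedure with a version and a review flag (hypothetical shape)."""
    name: str
    version: int
    steps: tuple[str, ...]
    reviewed: bool


def select_playbook(playbooks: list[Playbook], name: str) -> Playbook:
    """Return the highest reviewed version of a playbook, or raise if none exists."""
    candidates = [p for p in playbooks if p.name == name and p.reviewed]
    if not candidates:
        raise LookupError(f"no reviewed playbook named {name!r}")
    return max(candidates, key=lambda p: p.version)


playbooks = [
    Playbook("deploy-failure", 1, ("check logs", "roll back"), reviewed=True),
    Playbook(
        "deploy-failure", 2,
        ("check logs", "check recent config changes", "roll back"),
        reviewed=True,
    ),
    Playbook("deploy-failure", 3, ("restart everything",), reviewed=False),
]

chosen = select_playbook(playbooks, "deploy-failure")
print(chosen.version)  # 2
```

Version 3 is newer but unreviewed, so it is skipped. Raising an error when no reviewed playbook exists keeps the agent from silently improvising a procedure.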

Practical design rule

Use the smallest memory system that solves the problem. Do not add persistent memory simply because it seems advanced. Memory increases capability, but it also increases risk: stored information can go stale, and persistent records raise privacy obligations.

Ask these questions:

Does the agent need to remember information after the current task?
Is the memory stable enough to reuse later?
Can the memory be wrong or stale?
Should the user be able to inspect, edit, or delete it?
Would retrieval improve performance, or would a short context summary be enough?
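
Some of these questions can be turned into a rough first-pass check. The function below is one possible reading of the first, second, and last questions, not a rule from any framework; `recommend_memory`, its parameters, and the returned labels are hypothetical. The questions about correctness and user control still need human judgment.

```python
def recommend_memory(needs_persistence: bool, stable: bool, large_corpus: bool) -> str:
    """Suggest a starting point from three yes/no answers (illustrative only)."""
    if not needs_persistence:
        # Nothing must survive the task, so temporary state is enough.
        return "working memory only"
    if not stable:
        # Information that changes quickly is safer to look up again than to store.
        return "working memory only; re-fetch the information when needed"
    if large_corpus:
        # Too much material to place in the prompt, so select it per question.
        return "persistent memory with retrieval"
    return "persistent memory as a short context summary"


print(recommend_memory(needs_persistence=False, stable=True, large_corpus=False))
print(recommend_memory(needs_persistence=True, stable=True, large_corpus=True))
```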

Agents become more useful when they can remember, but they become more trustworthy when memory is intentional, transparent, and bounded.

Learning objectives