MWMS Agent Loop Control Framework

System: MWMS
Document Type: Framework
Status: Draft For MCR
Authority: HeadOffice
Applies To: All MWMS Brains, AI Employees, Deep Search Workflows, Research Workflows, Affiliate Evaluation, Ads Review, Content Workflows, Experimentation Brain, HeadOffice Intelligence, Future Client Facing AI Systems
Primary Location: MCR
Future Operational Destination: mwmsbrain.site, mwmsheadofficebrain.site, AI Employee Dashboards, Future Agent Control Dashboards
Parent Page: HeadOffice
Source Of Truth: MCR
Related Frameworks: MWMS Deep Search Quality And Observability Framework, MWMS AI Observability Metadata Standard, MWMS AI Employee Evaluation Scorecard Standard, MWMS Research Planning And Query Rewriting Standard
Course Source: Matt Pocock AIhero Build DeepSearch In TypeScript
Absorption Status: Approved For Integration


Purpose

This framework defines how MWMS controls AI Employee agent loops.

An agent loop is the repeated decision cycle where an AI Employee decides what to do next, performs an action, updates context, evaluates progress, and either continues, answers, escalates, or stops.

MWMS must not rely blindly on hidden SDK-managed loops, overloaded prompts, or uncontrolled tool-calling behaviour for important workflows.

This framework ensures AI Employees operate through:

  • explicit parameters
  • controlled actions
  • shared workflow context
  • next-action selection
  • clear stop conditions
  • fallback behaviour
  • observability
  • evaluation
  • cost control
  • HeadOffice governance

The goal is to give AI Employees controlled autonomy without allowing them to wander, loop endlessly, overuse tools, ignore context, or produce unsupported decisions.


Scope

This framework applies to all MWMS systems where an AI Employee or AI workflow performs multi-step work.

This includes:

  • Deep Search workflows
  • Research Brain source investigation
  • Affiliate Brain offer evaluation
  • Ads Brain compliance review
  • Content Brain research workflows
  • HeadOffice Intelligence workflows
  • Newsletter Intelligence routing
  • Experimentation Brain test analysis
  • Data Brain validation processes
  • Finance Brain cost or budget review
  • Brain Room AI assistance
  • Dev Console AI assistance
  • future client-facing AIBS agents
  • future autonomous AI Employees
  • future workflow agents using tools, databases, source inspection, or routing logic

This framework does not define exact code implementation, TypeScript structure, SDK usage, Langfuse setup, Evalite setup, or model provider configuration.

It defines the MWMS governance architecture for agent-loop behaviour.


Core Rule

An AI Employee must not be governed by one overloaded prompt that performs every function at once.

The MWMS rule is:

One prompt should not control planning, search, source selection, scraping, reasoning, answering, routing, stopping, and escalation all at once.

Important AI work must be broken into controlled actions that can be traced, evaluated, improved, and governed.


Definition Of An Agent Loop

An agent loop is a structured cycle where an AI Employee repeatedly:

  1. Reviews the workflow goal
  2. Reviews the current context
  3. Chooses the next approved action
  4. Executes that action
  5. Records the result
  6. Updates workflow context
  7. Checks stop conditions
  8. Decides whether to continue, answer, route, escalate, or stop

In MWMS, an agent loop is not just a model calling tools.

It is a governed workflow control system.
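
As an illustration only (this framework does not define implementation), the eight-step cycle above can be sketched in TypeScript. All names here (LoopAction, LoopContext, runLoop) are hypothetical, and the picker and executor are passed in as plain functions so that each can be traced and evaluated on its own.

```typescript
// Hypothetical names throughout; an illustration, not an MWMS implementation.
type LoopAction = "search" | "inspect_source" | "answer" | "escalate" | "stop";

interface LoopContext {
  goal: string;
  step: number;
  maxSteps: number;
  history: { step: number; action: LoopAction; result: string }[];
}

// The picker and executor are injected so each can be evaluated separately.
type Picker = (ctx: LoopContext) => LoopAction;
type Executor = (action: LoopAction, ctx: LoopContext) => string;

function runLoop(goal: string, maxSteps: number, pick: Picker, execute: Executor): LoopContext {
  const ctx: LoopContext = { goal, step: 0, maxSteps, history: [] };
  while (ctx.step < ctx.maxSteps) {
    // Steps 1-3: review goal and context, choose the next action.
    const action = pick(ctx);
    // Step 4: execute the chosen action.
    const result = execute(action, ctx);
    // Steps 5-6: record the result and update the shared context.
    ctx.step += 1;
    ctx.history.push({ step: ctx.step, action, result });
    // Steps 7-8: a terminal action ends the loop.
    if (action === "answer" || action === "escalate" || action === "stop") {
      return ctx;
    }
  }
  // Step limit reached: record a controlled outcome instead of exiting silently.
  ctx.history.push({ step: ctx.step, action: "escalate", result: "stopped_by_limit" });
  return ctx;
}
```

In this sketch the step limit is enforced by the loop itself, not by the model, so a picker that never chooses a terminal action still ends in a recorded escalation.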


Why Agent Loop Control Matters

Without controlled loops, AI Employees may:

  • search too many times
  • stop too early
  • answer with weak evidence
  • ignore the original question
  • use tools unnecessarily
  • choose poor sources
  • fail silently
  • exceed cost limits
  • exceed token limits
  • fail to route correctly
  • repeat the same action
  • hallucinate final answers
  • become difficult to debug
  • become difficult to evaluate

A controlled loop prevents important AI behaviour from being hidden inside a single model action or SDK-managed tool loop.


Single Prompt Fragility Rule

A single system prompt becomes fragile when it is responsible for too many jobs.

Examples of jobs that should not all be handled by one undifferentiated prompt:

  • understanding the user request
  • deciding whether research is needed
  • choosing search queries
  • deciding which URLs to inspect
  • scraping or opening sources
  • evaluating source quality
  • deciding whether enough evidence exists
  • writing the final answer
  • deciding whether to stop
  • deciding whether to escalate
  • deciding whether to route to another Brain
  • formatting final output
  • applying compliance checks
  • assigning confidence

When one prompt handles all of these, improving one behaviour can damage another.

For example:

  • improving source selection may weaken final answer style
  • adding compliance caution may reduce search tenacity
  • changing answer format may damage tool usage
  • adding cost limits may cause premature stopping
  • improving speed may weaken evidence quality

MWMS must avoid overloaded prompt architecture.


Agent Vs Workflow Rule

Not every AI system should be treated as an open-ended agent.

Before MWMS creates or upgrades an AI Employee loop, the system must decide whether the task should be:

  • a deterministic workflow
  • an assisted workflow
  • a controlled agent loop
  • a broader agentic workflow
  • a future autonomous agent

The key rule is:

Use AI judgement where judgement is useful. Use workflow rules where predictability is better.

Some tasks should not require the AI Employee to decide every step.

For example, if a workflow always needs a source inspected after search, MWMS should not ask the AI every time whether to inspect the source. It should define a controlled workflow where search and source inspection happen together.

The more business risk involved, the more MWMS should favour controlled workflows over open-ended agent behaviour.


Agentic Dial Rule

MWMS should treat autonomy as a dial, not a yes/no switch.

Each workflow should be assigned an autonomy mode.

Mode | Meaning | Suitable Use
Deterministic Workflow | Fixed path, little or no AI choice | high-risk, repeatable, compliance-sensitive workflows
Assisted Workflow | AI helps within controlled steps | early AI Employee work, internal support, structured review
Controlled Agent | AI chooses next action from approved options | Deep Search, research, evaluation, source inspection
Agentic Workflow | AI has broader control but is still logged and limited | mature internal workflows with strong evals
Autonomous Agent | AI operates with high autonomy under governance | only for mature, proven, low-risk or explicitly approved systems

Early MWMS AI systems should usually operate as Assisted Workflow or Controlled Agent, not full autonomy.

AI Employees should only move up the autonomy dial when evaluation, observability, cost control, safety, and human review results prove the workflow is stable.
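
A minimal TypeScript sketch of the dial, for illustration only. The mode identifiers, the StabilityEvidence fields, and the one-step-at-a-time promotion are assumptions made for this example; the framework itself only requires that promotion is backed by evidence.

```typescript
// Ordered from least to most autonomy. Identifiers are hypothetical.
const AUTONOMY_MODES = [
  "deterministic_workflow",
  "assisted_workflow",
  "controlled_agent",
  "agentic_workflow",
  "autonomous_agent",
] as const;

type AutonomyMode = (typeof AUTONOMY_MODES)[number];

// One flag per evidence area named in the rule above.
interface StabilityEvidence {
  evaluationPassed: boolean;
  observabilityComplete: boolean;
  costControlled: boolean;
  safetyPassed: boolean;
  humanReviewPassed: boolean;
}

// Assumption: promotion moves one position at a time and needs every flag set.
function nextAutonomyMode(current: AutonomyMode, evidence: StabilityEvidence): AutonomyMode {
  const allProven =
    evidence.evaluationPassed &&
    evidence.observabilityComplete &&
    evidence.costControlled &&
    evidence.safetyPassed &&
    evidence.humanReviewPassed;
  const index = AUTONOMY_MODES.indexOf(current);
  if (!allProven || index === AUTONOMY_MODES.length - 1) {
    return current;
  }
  return AUTONOMY_MODES[index + 1] ?? current;
}
```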


Workflow First Governance Rule

Before creating an agent loop, MWMS must ask:

Can this be handled better as a workflow?

A workflow-first approach is preferred when:

  • the steps are predictable
  • the source requirements are known
  • the compliance risk is high
  • the cost risk is high
  • the output format must be consistent
  • the process needs repeatability
  • the task must be auditable
  • the AI does not need much judgement
  • failure would create business risk

An agent loop is preferred when:

  • the path is not fully known in advance
  • the AI needs to choose between multiple evidence paths
  • the research question is complex
  • sources may conflict
  • the system must adapt based on findings
  • the workflow requires judgement between approved options
  • the task needs iterative investigation

The rule:

MWMS should not make a workflow agentic unless the task genuinely benefits from agentic behaviour.


Action Consolidation Rule

If two actions almost always belong together, MWMS should consider consolidating them into one workflow step.

This reduces:

  • unnecessary AI decisions
  • cost
  • latency
  • failure points
  • action drift
  • repeated tool use
  • avoidable complexity

Examples:

Separate Actions | Consolidated Workflow Pattern
Search, then maybe inspect source | Search plus inspect top eligible sources
Inspect source, then maybe summarise | Inspect plus summarise source
Read newsletter, then maybe route | Extract signal plus propose routing
Inspect offer page, then maybe check claims | Offer page inspect plus claim extraction
Review ad, then maybe check compliance | Ad review plus compliance risk check
Source inspect, then maybe score | Source inspect plus trust/freshness score

Consolidation should be used when the second action is almost always required and the AI decision adds little value.

This keeps MWMS agent loops cleaner, cheaper, and easier to evaluate.


Search Plus Inspect Workflow Pattern

For Deep Search and Research Brain workflows, search and source inspection often belong together.

A weak pattern is:

  1. AI decides to search.
  2. Search results appear.
  3. AI decides whether to inspect a source.
  4. AI may skip inspection and answer from snippets.

A stronger MWMS pattern is:

  1. Research plan defines what evidence is needed.
  2. Search finds candidate sources.
  3. The system automatically inspects top eligible sources according to source rules.
  4. Source summaries are created.
  5. The AI decides whether evidence is sufficient.
  6. The answer is produced only after evidence is inspected or limitations are declared.

This prevents AI Employees from treating search snippets as proof.

The rule:

Search results are leads. Inspected sources are evidence.
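
The stronger pattern can be sketched as one consolidated step, shown here in TypeScript for illustration only. The names (Lead, Evidence, searchAndInspect) and the injected search, inspect, and isEligible functions are hypothetical stand-ins for whatever search provider and source rules a workflow actually uses.

```typescript
interface Lead {
  url: string;
  snippet: string; // a snippet is a lead, never evidence
}

interface Evidence {
  url: string;
  summary: string;
}

// Search and inspection run together; the AI is not asked whether to inspect.
function searchAndInspect(
  query: string,
  search: (q: string) => Lead[],
  inspect: (url: string) => string | null, // null means inspection failed
  isEligible: (lead: Lead) => boolean,     // stand-in for source rules
  maxSourcesToInspect: number,
): { evidence: Evidence[]; limitations: string[] } {
  const leads = search(query).filter(isEligible).slice(0, maxSourcesToInspect);
  const evidence: Evidence[] = [];
  const limitations: string[] = [];
  for (const lead of leads) {
    const content = inspect(lead.url);
    if (content === null) {
      // Failures are declared as limitations instead of being dropped.
      limitations.push(`Could not inspect ${lead.url}`);
    } else {
      evidence.push({ url: lead.url, summary: content });
    }
  }
  if (evidence.length === 0) {
    limitations.push("No sources were inspected; search snippets are leads only");
  }
  return { evidence, limitations };
}
```

Because snippets never enter the evidence list in this sketch, a later answer step can only cite inspected sources or the declared limitations.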


When Not To Use An Agent Loop

MWMS should avoid agent loops when:

  • the workflow has a fixed required sequence
  • the task is simple and deterministic
  • the cost of exploration is not justified
  • the AI has no meaningful decision to make
  • compliance risk requires fixed checks
  • the process must follow a strict approval path
  • the result must be generated from internal records only
  • the workflow is already well-defined and repeatable
  • a deterministic pipeline will be safer and cheaper
  • the AI Employee has not been evaluated enough for autonomy

Examples of workflows that may not need an agent loop:

  • checking required fields
  • formatting a known report
  • routing a known newsletter category
  • validating JSON structure
  • applying a fixed compliance checklist
  • generating a standard title
  • summarising a known source record
  • copying data between approved records

Agent loops should be reserved for workflows that benefit from adaptive reasoning.


Controlled Autonomy Principle

MWMS AI Employees may choose what to do next, but only from actions approved by MWMS.

The AI can decide to:

  • search
  • inspect source
  • query database
  • answer
  • route
  • escalate
  • retry
  • stop

But MWMS defines:

  • which actions exist
  • which actions are allowed
  • which tools each action may use
  • when actions can happen
  • how many times actions can repeat
  • what each action must return
  • what metadata each action must record
  • when human review is required

This is controlled autonomy.


Agent Loop Structure

A standard MWMS agent loop should include:

  1. Workflow goal
  2. Shared context container
  3. Explicit parameters
  4. Approved action list
  5. Next-action picker
  6. Action executor
  7. Result recorder
  8. Context updater
  9. Stop condition checker
  10. Final answer or escalation handler
  11. Observability trace
  12. Evaluation record
  13. Kaizen output

Workflow Goal Requirement

Every loop must begin with a clear workflow goal.

The workflow goal should include:

  • original user request
  • Brain or system responsible
  • task purpose
  • expected output type
  • success criteria
  • known constraints
  • urgency
  • priority
  • risk level
  • required evidence level
  • review requirement

Every loop component must have access to the workflow goal.

If a search action, source action, answer action, or next-action picker does not know the original goal, it may perform locally correct actions that fail the overall task.


Context Container Requirement

Every agent loop should use a shared context container.

The context container is the working memory of the loop.

It should track:

  • original request
  • workflow goal
  • current step
  • maximum steps
  • Brain name
  • AI Employee name
  • task ID
  • thread ID
  • workflow type
  • current status
  • search history
  • source history
  • inspected source history
  • evidence collected
  • tool results
  • database records used
  • previous actions
  • failed actions
  • retry count
  • cost used
  • latency
  • token usage
  • confidence state
  • stop conditions
  • final answer status
  • escalation status

The context container makes the loop visible, testable, and controllable.


Context Formatting Rule

MWMS should distinguish between storage format, model context format, and human output format.

The same information may need different formats depending on use.

Storage Format

Used for databases and system records.

Examples:

  • JSON
  • table rows
  • event records
  • source records
  • task records

LLM Context Format

Used to help the model reason clearly.

Examples:

  • labelled sections
  • concise summaries
  • XML-style tags
  • readable action history
  • source excerpts
  • evidence blocks

Human Output Format

Used for operators and dashboards.

Examples:

  • reports
  • cards
  • tables
  • recommendations
  • decisions
  • next steps

The rule:

Store structured data for systems, but present context to the LLM in the clearest possible form for reasoning.
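
As a small illustration of the difference between storage format and LLM context format, the TypeScript sketch below renders stored source records as labelled, XML-style sections. SourceRecord, escapeForTag, formatContextForModel, and the tag names are all hypothetical choices for this example.

```typescript
// Storage format: a structured record as it might sit in a database.
interface SourceRecord {
  url: string;
  title: string;
  summary: string;
}

// Escape characters that would otherwise break the XML-style tags.
function escapeForTag(text: string): string {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// LLM context format: the same data as labelled sections the model can read.
function formatContextForModel(goal: string, sources: SourceRecord[]): string {
  const sourceBlocks = sources.map(
    (source, index) =>
      `<source index="${index + 1}">\n` +
      `<title>${escapeForTag(source.title)}</title>\n` +
      `<url>${escapeForTag(source.url)}</url>\n` +
      `<summary>${escapeForTag(source.summary)}</summary>\n` +
      `</source>`,
  );
  return [`<workflow_goal>${escapeForTag(goal)}</workflow_goal>`, ...sourceBlocks].join("\n");
}
```

The stored records stay unchanged; only the rendering differs, so a dashboard could format the same records differently for human operators.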


Explicit Parameters Requirement

Important AI Employee behaviour should be controlled by explicit parameters.

Parameters should not be hidden inside prompt wording where possible.

Recommended parameters:

Parameter | Purpose
max_steps | Prevents endless loops
max_searches | Controls search usage
max_sources_to_inspect | Controls crawler/scraper use
max_retries | Prevents repeated failure
max_cost | Protects budget
max_latency | Protects user experience
model_name | Controls model choice
temperature | Controls output variability
allowed_tools | Restricts tool access
allowed_actions | Restricts loop behaviour
required_sources | Defines evidence requirement
confidence_threshold | Defines answer readiness
escalation_threshold | Defines review requirement
stop_conditions | Defines when loop ends
fallback_action | Defines what happens when loop cannot complete

The rule:

AI Employee behaviour should be governed by explicit parameters, not hidden assumptions.
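
One way to keep parameters explicit is to hold them in a typed object and validate it before a loop starts. The TypeScript sketch below is illustrative only and covers a subset of the recommended parameters. The validation rules, including the requirement that fallback_action appears in allowed_actions, are assumptions made for this example.

```typescript
// A subset of the recommended parameters, using the same names.
interface LoopParameters {
  max_steps: number;
  max_searches: number;
  max_sources_to_inspect: number;
  max_retries: number;
  max_cost: number;
  allowed_actions: string[];
  fallback_action: string;
}

// Returns a list of problems; an empty list means the parameters are usable.
function validateParameters(params: LoopParameters): string[] {
  const problems: string[] = [];
  const limits: [string, number][] = [
    ["max_steps", params.max_steps],
    ["max_searches", params.max_searches],
    ["max_sources_to_inspect", params.max_sources_to_inspect],
    ["max_retries", params.max_retries],
    ["max_cost", params.max_cost],
  ];
  for (const [name, value] of limits) {
    if (!Number.isFinite(value) || value < 0) {
      problems.push(`${name} must be a finite, non-negative number`);
    }
  }
  if (params.max_steps < 1) {
    problems.push("max_steps must allow at least one step");
  }
  if (params.allowed_actions.length === 0) {
    problems.push("allowed_actions must not be empty");
  }
  // Assumption: the fallback must itself be an approved action.
  if (!params.allowed_actions.includes(params.fallback_action)) {
    problems.push("fallback_action must be one of allowed_actions");
  }
  return problems;
}
```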


Approved Action List

Each agent loop must define its approved action list.

Example Deep Search actions:

  • search
  • inspect source
  • scrape
  • summarise source
  • evaluate evidence
  • answer
  • stop
  • escalate

Example Research Brain actions:

  • create research plan
  • rewrite queries
  • search web
  • inspect source
  • compare sources
  • extract evidence
  • classify source
  • summarise findings
  • answer
  • request human review

Example Affiliate Brain actions:

  • research offer
  • inspect vendor page
  • inspect affiliate terms
  • check claims
  • check competition
  • score offer
  • reject
  • park
  • proceed to testing
  • request Research Brain support

Example HeadOffice actions:

  • classify signal
  • route to Brain
  • create task
  • escalate
  • monitor
  • approve
  • reject
  • park
  • request more evidence
  • generate report

The action list is a permission boundary.

If an action is not in the approved list, the AI Employee must not perform it.
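
The permission boundary can be enforced in code, not only in prompt wording. The TypeScript sketch below is an illustration that uses the example Deep Search actions listed above; the function names are hypothetical.

```typescript
// The example Deep Search action list from this section.
const DEEP_SEARCH_ACTIONS = [
  "search",
  "inspect source",
  "scrape",
  "summarise source",
  "evaluate evidence",
  "answer",
  "stop",
  "escalate",
] as const;

type DeepSearchAction = (typeof DEEP_SEARCH_ACTIONS)[number];

// Type guard: true only when the requested action is in the approved list.
function isApproved(requested: string): requested is DeepSearchAction {
  return (DEEP_SEARCH_ACTIONS as readonly string[]).includes(requested);
}

// The executor is only ever called with an approved action.
function executeIfApproved(
  requested: string,
  run: (action: DeepSearchAction) => string,
): string {
  if (!isApproved(requested)) {
    return `blocked: "${requested}" is not in the approved action list`;
  }
  return run(requested);
}
```

Whatever the model proposes, an unapproved action is blocked before any tool runs, and the blocked request can be logged.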


Next Action Picker Pattern

A next-action picker is the component that decides what the agent should do next.

The next-action picker reviews:

  • workflow goal
  • current context
  • previous actions
  • evidence collected
  • failed attempts
  • remaining limits
  • stop conditions

It then chooses one approved action.

Examples:

  • search
  • inspect source
  • query database
  • answer
  • route
  • escalate
  • stop

The key rule:

The AI may choose the next action, but only from controlled options defined by MWMS.


Next Action Picker Output Standard

A next-action picker should return structured output.

Recommended fields:

{
  "action_type": "",
  "reason": "",
  "target": "",
  "input_summary": "",
  "confidence": 0,
  "requires_tool": false,
  "requires_human_review": false,
  "expected_result": "",
  "risk_note": ""
}

This allows MWMS to log why an action was chosen and whether it was appropriate.
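
Structured output is only useful if it is checked before it is acted on. The TypeScript sketch below is illustrative: it parses a picker response and validates a few of the recommended fields. The function name, the error messages, and the assumption that confidence is a number between 0 and 1 are choices made for this example.

```typescript
interface NextAction {
  action_type: string;
  reason: string;
  confidence: number;
  requires_human_review: boolean;
}

type ParseResult =
  | { ok: true; action: NextAction }
  | { ok: false; error: string };

function parseNextAction(raw: string, allowedActions: string[]): ParseResult {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return { ok: false, error: "output is not valid JSON" };
  }
  if (typeof data !== "object" || data === null) {
    return { ok: false, error: "output is not a JSON object" };
  }
  const record = data as Record<string, unknown>;
  const actionType = record["action_type"];
  const reason = record["reason"];
  const confidence = record["confidence"];
  if (typeof actionType !== "string" || !allowedActions.includes(actionType)) {
    return { ok: false, error: "action_type is missing or not approved" };
  }
  if (typeof reason !== "string" || reason.trim() === "") {
    return { ok: false, error: "reason is required" };
  }
  // Assumption: confidence is expressed on a 0 to 1 scale.
  if (typeof confidence !== "number" || confidence < 0 || confidence > 1) {
    return { ok: false, error: "confidence must be a number between 0 and 1" };
  }
  return {
    ok: true,
    action: {
      action_type: actionType,
      reason,
      confidence,
      // Treated as false unless the picker explicitly sets it to true.
      requires_human_review: record["requires_human_review"] === true,
    },
  };
}
```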


Action Type Field Rule

Where possible, MWMS should prefer a clear action_type field over complicated branching schemas.

For example:

{
  "action_type": "search",
  "query": "best current affiliate tracking tools for YouTube ads",
  "reason": "Need current external evidence before making a recommendation"
}

This is usually easier for AI systems to follow than complex schemas with many mutually exclusive structures.


Modular Action Rule

Each major action should be separated into its own component where possible.

Examples:

  • research planning action
  • query rewriting action
  • search action
  • inspect source action
  • summarise source action
  • database query action
  • source scoring action
  • answer action
  • routing action
  • escalation action
  • evaluation action

Each action should have:

  • clear purpose
  • allowed inputs
  • expected outputs
  • allowed tools
  • failure handling
  • metadata capture
  • evaluation criteria

The rule:

AI Employee work should be composed from controlled actions, not one giant undifferentiated prompt.


Action Specific Model Rule

Different actions may need different models, settings, prompts, and evaluation criteria.

Examples:

Action | Possible Model Need
Research planning | strong reasoning and task decomposition
Query rewriting | focused search-planning ability
Search query planning | fast, low-cost reasoning
Source selection | relevance-focused reasoning
Source extraction | structured summarisation
Compliance review | cautious, high-accuracy reasoning
Final answer | strong synthesis and clarity
Routing decision | structured classification
Evaluation judge | stable criteria-following model

MWMS should not assume every action needs the same model, temperature, or prompt.


Action Specific Evaluation Rule

Each loop component should be independently evaluable.

Examples:

Next Action Picker Evaluation

  • Did it choose the correct next action?
  • Did it stop too early?
  • Did it search when it should have answered?
  • Did it answer when more evidence was needed?
  • Did it escalate correctly?

Research Planning Evaluation

  • Did the plan identify the real decision?
  • Did it identify missing information?
  • Did it include source preferences?
  • Did it include freshness and jurisdiction needs?
  • Did it create useful queries?

Search Action Evaluation

  • Was the search query relevant?
  • Was it specific enough?
  • Did it account for date or freshness?
  • Did it diversify sources?
  • Did it avoid irrelevant searches?

Source Selection Evaluation

  • Were the selected sources credible?
  • Were they current enough?
  • Were official sources preferred where appropriate?
  • Were weak sources avoided?
  • Were conflicting sources noticed?

Source Inspection Evaluation

  • Was content extracted successfully?
  • Was the content useful?
  • Was source freshness captured?
  • Were failures recorded?

Final Answer Evaluation

  • Was the answer factual?
  • Was it relevant?
  • Was it sourced?
  • Was it decision-ready?
  • Was confidence calibrated?

This links directly to the MWMS AI Employee Evaluation Scorecard Standard.


Stop Condition Requirement

Every agent loop must have stop conditions.

Stop conditions may include:

  • max steps reached
  • max searches reached
  • max sources inspected
  • sufficient evidence collected
  • answer confidence threshold reached
  • cost limit reached
  • time limit reached
  • token budget reached
  • repeated failure detected
  • required source unavailable
  • human review required
  • escalation required
  • task no longer valid
  • user clarification needed

An agent loop without stop conditions is unsafe.
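
A stop condition checker can be a small pure function that runs after every step. The TypeScript sketch below is illustrative and covers only four of the conditions listed above. The type names, the field names, and the order in which conditions are checked are assumptions for this example.

```typescript
interface LoopState {
  step: number;
  costUsed: number;
  consecutiveFailures: number;
  confidence: number;
}

interface LoopLimits {
  maxSteps: number;
  maxCost: number;
  maxRetries: number;
  confidenceThreshold: number;
}

type StopReason =
  | "cost_limit_reached"
  | "max_steps_reached"
  | "repeated_failure_detected"
  | "confidence_threshold_reached";

// Returns the first stop reason that applies, or null if the loop may continue.
// Assumption: budget and step limits are checked before the success condition.
function checkStopConditions(state: LoopState, limits: LoopLimits): StopReason | null {
  if (state.costUsed >= limits.maxCost) return "cost_limit_reached";
  if (state.step >= limits.maxSteps) return "max_steps_reached";
  if (state.consecutiveFailures > limits.maxRetries) return "repeated_failure_detected";
  if (state.confidence >= limits.confidenceThreshold) return "confidence_threshold_reached";
  return null;
}
```

Returning a named reason, not just a boolean, lets the trace record why the loop ended.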


Forced Final Output Or Escalation Rule

A loop must not fail silently.

If the loop reaches a limit, the system must produce one of the following:

  1. Final answer with limitations
  2. Partial answer with confidence warning
  3. Request for more information
  4. Escalation to human review
  5. Route to another Brain
  6. Park status with reason
  7. Failed status with explanation

The rule:

When the loop cannot continue, MWMS must still produce a controlled outcome.
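
One way to guarantee a controlled outcome is to make the return type of the limit handler a closed set of outcomes, so that "nothing" is not a possible result. The TypeScript sketch below is illustrative and models only three of the seven outcomes listed above. The names and the decision rules are assumptions for this example.

```typescript
// A closed set of outcomes: the handler cannot return "nothing".
type ControlledOutcome =
  | { kind: "final_answer_with_limitations"; answer: string; limitations: string[] }
  | { kind: "escalation_to_human_review"; reason: string }
  | { kind: "failed_with_explanation"; explanation: string };

// Assumed rules: answer if a draft and evidence exist, escalate if only
// evidence exists, otherwise fail with an explanation.
function resolveLimitOutcome(
  limitReached: string,
  evidence: string[],
  draftAnswer: string | null,
): ControlledOutcome {
  if (draftAnswer !== null && evidence.length > 0) {
    return {
      kind: "final_answer_with_limitations",
      answer: draftAnswer,
      limitations: [`Loop ended early: ${limitReached}`],
    };
  }
  if (evidence.length > 0) {
    return {
      kind: "escalation_to_human_review",
      reason: `Evidence collected but no answer produced before ${limitReached}`,
    };
  }
  return {
    kind: "failed_with_explanation",
    explanation: `No evidence collected before ${limitReached}`,
  };
}
```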


Loop State Statuses

Suggested loop statuses:

Status | Meaning
pending | Loop created but not started
running | Loop actively working
planning_research | Creating research plan or query set
searching | Performing search action
inspecting_source | Inspecting source content
summarising_source | Compressing source content into evidence notes
querying_database | Reading internal records
evaluating | Running eval or quality check
answering | Producing final output
routing | Sending result to another Brain or workflow
escalating | Human review required
completed | Final output created successfully
parked | Paused for later
failed | Could not complete
stopped_by_limit | Max step, cost, time, or token limit reached

Loop History Requirement

The agent loop should maintain a history of actions.

Each action history item should include:

  • step number
  • action type
  • reason
  • input summary
  • tool used
  • output summary
  • success status
  • failure reason
  • cost estimate
  • latency
  • whether result was used
  • next status

This history supports:

  • observability
  • debugging
  • evaluation
  • human review
  • Kaizen
  • regression protection

Research Planning Action Rules

A research planning action should be used when the workflow requires external evidence, source comparison, freshness checking, or structured investigation.

Research planning should record:

  • research goal
  • decision supported
  • missing information
  • assumptions to verify
  • freshness requirement
  • jurisdiction or market context
  • source preferences
  • risk areas
  • query plan
  • stopping criteria

Research planning should happen before important searches, not after random search results have already shaped the answer.

This connects to the MWMS Research Planning And Query Rewriting Standard.


Query Rewriting Action Rules

A query rewriting action should transform a business task or user question into search queries that execute the research plan.

Query rewriting should record:

  • original request
  • research purpose
  • generated queries
  • query purpose
  • preferred source type
  • freshness need
  • expected evidence
  • priority order

The rule:

Queries are not just search phrases. They are the execution path of the research plan.


Search Action Rules

A search action should be used when the system needs external information.

Search action should record:

  • query
  • query reason
  • query timestamp
  • search provider
  • result count
  • top results
  • whether results were useful
  • whether follow-up search is needed

Search should be avoided when:

  • internal records are enough
  • the task is not current or external
  • the cost is not justified
  • the loop has already searched enough
  • human clarification is needed first

Source Inspection Action Rules

A source inspection action should be used when search snippets are not enough.

Source inspection should record:

  • URL
  • title
  • retrieval time
  • access status
  • extraction status
  • extracted content summary
  • source freshness
  • source trust rating
  • evidence usefulness
  • whether content supports the final answer

The rule:

Search snippets are not enough for important Deep Search decisions.


Source Summarisation Action Rules

A source summarisation action should compress inspected source content into compact evidence notes.

Source summaries should preserve:

  • source title
  • source URL
  • query that found the source
  • retrieved date
  • source date if visible
  • key facts
  • key claims
  • relevant figures
  • limitations
  • what the source does not prove
  • freshness status
  • trust rating
  • relevance to the research goal
  • whether the source supports the final conclusion

The rule:

Summaries must compress evidence without inventing evidence.

Source summarisation allows MWMS to use more source diversity without overloading the model context window.


Database Action Rules

A database action should be used when the AI Employee needs MWMS internal records.

Examples:

  • task record
  • source record
  • offer record
  • campaign record
  • experiment record
  • user record
  • newsletter record
  • finance record
  • previous trace
  • regression failure

Database actions should record:

  • table or system queried
  • record IDs read
  • records written
  • permission status
  • success status
  • failure reason

Internal database state is part of agent context and must be observable.


Answer Action Rules

The answer action should produce the final or partial output.

An answer action should include:

  • direct answer
  • evidence summary
  • source references where required
  • confidence
  • limitations
  • decision or recommendation
  • next action
  • review requirement
  • routing if needed
  • Kaizen note if useful

The answer action should not invent missing evidence.

If evidence is incomplete, the answer must say so.


Escalation Action Rules

Escalation should occur when:

  • evidence is weak
  • sources conflict
  • compliance risk exists
  • financial risk exists
  • confidence is low
  • tools failed
  • database writes failed
  • loop limits were reached
  • human judgement is required
  • the AI Employee is not authorised to proceed

Escalation output should include:

  • reason for escalation
  • current findings
  • missing information
  • risk level
  • recommended human action
  • related trace ID
  • related task ID

Routing Action Rules

Routing should occur when the result belongs to another Brain or workflow.

Examples:

  • source issue to Research Brain
  • campaign issue to Ads Brain
  • offer issue to Affiliate Brain
  • cost issue to Finance Brain
  • test issue to Experimentation Brain
  • system issue to Operations Brain
  • compliance issue to Risk Brain
  • framework update to MCR

Routing should record:

  • destination Brain
  • reason for routing
  • source task
  • priority
  • urgency
  • expected next action

Prompt Architecture Rule

Agent loops should use prompt modularity.

Prompt layers should be separated into:

  1. Static role instruction
  2. Workflow rules
  3. Approved actions
  4. Success criteria
  5. Dynamic context
  6. Action history
  7. Current decision request
  8. Output schema

Static instructions should be stable and versioned.

Dynamic context should be inserted separately.

Prompt changes should be tested using evals before being trusted.


Prompt Caching And Stability Rule

Where supported by the provider, MWMS should keep stable prompt content before dynamic content.

This can improve cost and latency through prompt caching.

Stable content may include:

  • AI Employee role
  • action definitions
  • safety rules
  • output schema
  • evaluation criteria

Dynamic content may include:

  • user request
  • current context
  • source excerpts
  • action history
  • latest task state

This is a performance optimisation, not the core governance principle.
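
A minimal TypeScript sketch of stable-before-dynamic prompt assembly, for illustration only. The interface and function names are hypothetical, and whether a shared prefix is actually cached depends on the provider, so the provider's documentation should be checked before relying on it.

```typescript
// Stable layers: versioned and identical across requests.
interface StablePromptLayers {
  version: string;
  role: string;
  actionDefinitions: string;
  safetyRules: string;
  outputSchema: string;
}

// Dynamic layers: change on every request.
interface DynamicPromptLayers {
  userRequest: string;
  currentContext: string;
  actionHistory: string;
}

function buildPrompt(
  stable: StablePromptLayers,
  dynamic: DynamicPromptLayers,
): { stablePrefix: string; prompt: string } {
  // Stable content is joined first so every request shares the same prefix.
  const stablePrefix = [
    `[prompt_version: ${stable.version}]`,
    stable.role,
    stable.actionDefinitions,
    stable.safetyRules,
    stable.outputSchema,
  ].join("\n\n");
  // Dynamic content is appended afterwards and never interleaved.
  const dynamicSuffix = [
    dynamic.userRequest,
    dynamic.currentContext,
    dynamic.actionHistory,
  ].join("\n\n");
  return { stablePrefix, prompt: `${stablePrefix}\n\n${dynamicSuffix}` };
}
```

Keeping the prompt version inside the stable prefix also makes it visible in traces when comparing behaviour across prompt changes.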


Parameter Versioning Rule

Agent loop parameters should eventually be versioned.

Versioned parameters may include:

  • prompt version
  • action list version
  • model version
  • scoring version
  • tool permissions version
  • stop condition version
  • source rule version
  • research planning version
  • query rewriting version

This allows MWMS to compare performance across changes and avoid uncontrolled drift.


Observability Requirements

Every serious agent loop should capture observability metadata.

Required where possible:

  • trace ID
  • task ID
  • thread ID
  • Brain
  • AI Employee
  • workflow type
  • loop status
  • current step
  • actions taken
  • tools used
  • sources inspected
  • database records read or written
  • cost
  • latency
  • failures
  • confidence
  • final output location
  • review status
  • Kaizen note

This connects directly to the MWMS AI Observability Metadata Standard.


Evaluation Requirements

Agent loops should be evaluated at both loop level and action level.

Loop-level evaluations:

  • Did the loop complete successfully?
  • Did it use enough evidence?
  • Did it stop correctly?
  • Did it stay within budget?
  • Did it produce useful output?
  • Did it avoid unnecessary actions?
  • Did it route or escalate correctly?

Action-level evaluations:

  • Did each action perform correctly?
  • Did the action picker choose well?
  • Was the research plan strong?
  • Were queries useful?
  • Were sources selected correctly?
  • Were source summaries accurate?
  • Was the final answer relevant and factual?

This connects directly to the MWMS AI Employee Evaluation Scorecard Standard.


Cost Control Requirements

Agent loops must include cost protection.

Cost controls may include:

  • max model actions
  • max searches
  • max source inspections
  • max retries
  • max token budget
  • max total cost
  • cheaper model for simple actions
  • expensive model only for high-value synthesis or review
  • stop and escalate when budget is exceeded

The AI Employee must not set its own spending limits.

HeadOffice controls cost rules.
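
As an illustration, cost rules can be held outside the AI Employee and checked before every model call. In the TypeScript sketch below, CostRules and createCostGuard are hypothetical names, and the cost figures are placeholder units, not real prices.

```typescript
// Rules are supplied from outside the loop; the AI Employee cannot change them.
interface CostRules {
  maxTotalCost: number;
  maxModelCalls: number;
}

function createCostGuard(rules: CostRules) {
  let spent = 0;
  let calls = 0;
  return {
    // Records the spend and returns true only if the call fits both limits.
    tryCharge(estimatedCost: number): boolean {
      if (calls + 1 > rules.maxModelCalls) return false;
      if (spent + estimatedCost > rules.maxTotalCost) return false;
      calls += 1;
      spent += estimatedCost;
      return true;
    },
    totalSpent(): number {
      return spent;
    },
    callsMade(): number {
      return calls;
    },
  };
}
```

When tryCharge returns false, the loop should stop and escalate instead of proceeding with the call.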


Safety And Permission Requirements

Each loop must obey MWMS tool permissions.

An AI Employee should only use tools approved for:

  • its Brain
  • its role
  • its workflow
  • its user
  • its risk level
  • its environment

High-risk tools may require human approval.

Examples:

  • sending emails
  • publishing content
  • changing campaigns
  • modifying budgets
  • deleting records
  • making client-facing recommendations
  • writing to production systems

Agent autonomy must never bypass MWMS permission rules.
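
A permission check of this kind can be sketched as a three-way decision: denied, allowed, or held for human approval. The TypeScript below is illustrative only; ToolPolicy, ToolDecision, decideToolUse, and the tool identifiers in the usage note are hypothetical.

```typescript
interface ToolPolicy {
  allowedTools: string[];           // tools approved for this Brain, role, and workflow
  requiresHumanApproval: string[];  // high-risk tools within the allowed set
}

type ToolDecision = "allowed" | "needs_human_approval" | "denied";

function decideToolUse(
  tool: string,
  policy: ToolPolicy,
  humanApproved: boolean,
): ToolDecision {
  // Anything outside the allowed set is denied, regardless of approval.
  if (!policy.allowedTools.includes(tool)) return "denied";
  // High-risk tools are held until a human has approved this use.
  if (policy.requiresHumanApproval.includes(tool) && !humanApproved) {
    return "needs_human_approval";
  }
  return "allowed";
}
```

For example, with a policy that allows "web_search" and "send_email" and marks "send_email" as high risk, a request to send an email is held until approval is recorded, while a request to use "delete_records" is denied outright.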


Human Review Requirements

Human review is required when:

  • the workflow is high risk
  • the output affects budget
  • the output affects campaign launch
  • the output affects public claims
  • the output affects compliance
  • source quality is weak
  • confidence is low
  • loop limits were reached
  • escalation threshold is triggered
  • the AI Employee failed a required evaluation
  • the system is not yet approved for autonomy

Human review should be recorded in the trace.


Agent Loop Maturity Levels

MWMS agent loops can mature through levels.

Level 1: Manual Assisted Loop

The human directs most steps.

AI assists with individual actions.

Use case:

  • early testing
  • new AI Employee design
  • high uncertainty workflows

Level 2: Controlled Internal Loop

The AI chooses next actions from approved options.

Human reviews final output.

Use case:

  • internal research
  • newsletter processing
  • offer research
  • source analysis

Level 3: Semi Autonomous Loop

The AI completes the loop under limits.

Human review is triggered only for risk, low confidence, or failed evals.

Use case:

  • stable workflows
  • repeatable internal tasks
  • mature Brain operations

Level 4: Governed Production Loop

The AI operates in production with full observability, scorecards, regression tests, cost controls, and permission boundaries.

Use case:

  • scaled AI Employees
  • client-facing systems
  • critical workflows

No AI Employee should move to a higher maturity level without evaluation evidence.


AI Employee Promotion Rules

An AI Employee may gain more loop autonomy when:

  • loop traces are complete
  • evaluation scores are consistently strong
  • regression failures are low
  • cost is controlled
  • latency is acceptable
  • human review approval rate is high
  • source quality is strong
  • confidence calibration is reliable
  • tool use is appropriate
  • stop conditions work correctly
  • escalation triggers work correctly
  • research plans are consistently useful
  • query rewriting improves evidence quality


AI Employee Restriction Rules

An AI Employee should be restricted when:

  • it loops unnecessarily
  • it stops too early
  • it answers without enough evidence
  • it ignores source freshness
  • it overuses expensive tools
  • it fails to route correctly
  • it repeats failed actions
  • it overstates confidence
  • it fails required evals
  • it bypasses review triggers
  • it causes cost or compliance risk
  • it creates untraceable outputs
  • it creates weak research plans
  • it uses vague or duplicated queries
  • it treats search snippets as evidence

Restriction may include:

  • lowering autonomy level
  • reducing allowed actions
  • reducing tool permissions
  • requiring human review
  • changing prompts
  • changing models
  • adding regression tests
  • pausing the workflow
  • converting the task from agent loop to deterministic workflow


Suggested Agent Loop Record Structure

The following is a conceptual structure for future implementation.

{
  "loop_id": "",
  "trace_id": "",
  "created_at": "",
  "brain_name": "",
  "ai_employee_name": "",
  "workflow_type": "",
  "task_id": "",
  "thread_id": "",
  "workflow_goal": "",
  "current_step": 0,
  "max_steps": 0,
  "status": "",
  "autonomy_mode": "",
  "allowed_actions": [],
  "parameters": {
    "max_searches": 0,
    "max_sources_to_inspect": 0,
    "max_retries": 0,
    "max_cost": 0,
    "max_latency_ms": 0,
    "confidence_threshold": 0
  },
  "research_plan": {
    "research_goal": "",
    "decision_supported": "",
    "queries": [],
    "source_preferences": [],
    "stopping_criteria": []
  },
  "action_history": [],
  "sources_inspected": [],
  "source_summaries": [],
  "database_records_used": [],
  "cost_estimate": 0,
  "latency_ms": 0,
  "confidence": 0,
  "stop_reason": "",
  "final_output_location": "",
  "human_review_required": false,
  "evaluation_status": "",
  "kaizen_note": ""
}

This is not mandatory code.

It is the conceptual record structure for consistent future build work.
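
To show how the limit fields in the record could drive a stop decision, the following TypeScript sketch types the relevant slice of the record and derives a stop reason from it. The field names are taken from the conceptual structure above; the `LoopRecordSlice` type, the `stopReasonFor` function, the stop reason strings, and the choice to stop once confidence reaches the threshold are assumptions made for illustration.

```typescript
// Field names mirror the "parameters" object in the conceptual record.
interface LoopParameters {
  max_searches: number;
  max_sources_to_inspect: number;
  max_retries: number;
  max_cost: number;
  max_latency_ms: number;
  confidence_threshold: number;
}

// Only the fields needed for the stop check; the full record has many more.
interface LoopRecordSlice {
  current_step: number;
  max_steps: number;
  parameters: LoopParameters;
  cost_estimate: number;
  latency_ms: number;
  confidence: number;
}

// Returns a hypothetical stop reason string, or null when the loop may continue.
function stopReasonFor(r: LoopRecordSlice): string | null {
  // Assumption: reaching the confidence threshold is treated as a success stop.
  if (r.confidence >= r.parameters.confidence_threshold) return "confidence_threshold_met";
  if (r.current_step >= r.max_steps) return "max_steps_reached";
  if (r.cost_estimate >= r.parameters.max_cost) return "max_cost_reached";
  if (r.latency_ms >= r.parameters.max_latency_ms) return "max_latency_reached";
  return null;
}
```

Running this check before each step means the `stop_reason` field is always populated by an explicit rule rather than inferred after the fact.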


Minimum Starting Implementation

MWMS does not need a perfect autonomous loop immediately.

The first practical implementation should capture:

  • workflow goal
  • task ID or thread ID
  • Brain
  • AI Employee
  • autonomy mode
  • allowed actions
  • current step
  • max steps
  • action history
  • research plan where relevant
  • search/source history where relevant
  • source summaries where relevant
  • final answer or escalation
  • stop reason
  • confidence
  • review status
  • Kaizen note

This is enough to begin controlled agent-loop operation without overengineering.
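
A first implementation along these lines could be as small as the following TypeScript sketch. It is a simplified illustration, not a prescribed build: the `MinimalLoopRecord` type uses a subset of the fields listed above, the action names and stop reason strings are hypothetical, the picker is a plain synchronous function standing in for what would normally be a model call, and action execution is omitted.

```typescript
// Hypothetical action names for illustration.
type Action = "search" | "inspect_source" | "answer" | "escalate";

interface MinimalLoopRecord {
  workflow_goal: string;
  task_id: string;
  brain_name: string;
  ai_employee_name: string;
  autonomy_mode: string;
  allowed_actions: Action[];
  current_step: number;
  max_steps: number;
  action_history: Action[];
  stop_reason: string; // empty string while the loop is still running
  human_review_required: boolean;
}

// Stand-in for the next-action picker; a real picker would usually call a model.
type Picker = (record: MinimalLoopRecord) => Action;

function runLoop(record: MinimalLoopRecord, pickNext: Picker): MinimalLoopRecord {
  while (record.stop_reason === "") {
    // Hard step limit: the loop can never run past max_steps.
    if (record.current_step >= record.max_steps) {
      record.stop_reason = "max_steps_reached";
      record.human_review_required = true;
      break;
    }
    const action = pickNext(record);
    // Reject any action outside the approved list instead of executing it.
    if (record.allowed_actions.indexOf(action) === -1) {
      record.stop_reason = "disallowed_action";
      record.human_review_required = true;
      break;
    }
    record.action_history.push(action);
    record.current_step += 1;
    // Action execution is omitted; only the terminal actions end the loop here.
    if (action === "answer") record.stop_reason = "answered";
    if (action === "escalate") {
      record.stop_reason = "escalated";
      record.human_review_required = true;
    }
  }
  return record;
}
```

Even in this reduced form, every run ends with a recorded stop reason, a complete action history, and a review flag, which are the items the list above treats as the minimum for controlled operation.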


Relationship To Other MWMS Standards

This framework supports and should align with:

  • MWMS Deep Search Quality And Observability Framework
  • MWMS AI Observability Metadata Standard
  • MWMS AI Employee Evaluation Scorecard Standard
  • MWMS Research Planning And Query Rewriting Standard
  • MWMS AI Work Session Persistence Standard
  • MWMS Next Action Picker Standard
  • MWMS Agent Loop Context Schema
  • MWMS AI Agent Operations Core
  • MWMS AI Tool Permission And Access Framework
  • MWMS AI Agent Deployment Readiness Checklist
  • MWMS AI Workflow Pipeline Standard
  • MWMS AI Schema And Decision Ready Output Framework
  • MWMS AI Output Validation Standard
  • MWMS Agentic Reporting Standard
  • MWMS Supabase Event Schema
  • MWMS Brain Room Architecture
  • HeadOffice Operational Intelligence Framework
  • HeadOffice Newsletter Intelligence Operating Protocol
  • Research Brain Source Evaluation Framework
  • Data Brain Measurement Integrity Framework
  • Experimentation Brain Canon
  • MWMS Kaizen Continuous Improvement Loop
  • MWMS System Change Log

This framework does not replace those standards.

It defines the control architecture for AI Employee loops.


Future Enhancements

Future pages or modules may include:

  • MWMS Agent Action Registry
  • MWMS Agent Stop Condition Standard
  • MWMS Agent Loop Dashboard Specification
  • MWMS Prompt Versioning And Parameter Control Standard
  • MWMS Deep Search Source Record Standard
  • MWMS Search Scrape Summarise Evidence Pipeline Standard
  • MWMS Agent Loop Regression Test Library
  • MWMS AI Employee Promotion And Restriction Standard
  • MWMS Agentic Dial And Workflow Control Standard

These should be created only when enough implementation experience or course material justifies them.


Drift Protection

This framework is designed to prevent the following forms of drift:

  • one prompt doing too many jobs
  • hidden SDK loops controlling important workflows
  • AI Employees wandering without limits
  • search loops running too long
  • premature final answers
  • unobserved tool actions
  • untracked source inspection
  • cost growth from uncontrolled loops
  • AI actions without workflow goals
  • AI actions without task linkage
  • AI outputs without stop reasons
  • AI Employees gaining autonomy without evaluation evidence
  • prompt changes breaking unrelated behaviour
  • failure patterns being lost instead of converted into Kaizen learning
  • making every workflow too agentic
  • using agent loops where deterministic workflows are safer
  • treating search snippets as evidence
  • failing to consolidate actions that always belong together
  • creating unnecessary AI decisions where workflow rules would be better

If an agent loop cannot be controlled, traced, stopped, and evaluated, it should not be trusted for important MWMS work.


Architectural Intent

The architectural intent of this framework is to turn AI Employee autonomy into a controlled MWMS system capability.

MWMS is not building loose chatbots.

MWMS is building governed AI Employees that can reason, search, inspect, decide, route, and improve under HeadOffice control.

The agent loop is the core operating cycle of those AI Employees.

A strong loop gives MWMS:

  • control
  • visibility
  • repeatability
  • safety
  • cost discipline
  • better evaluation
  • better Kaizen learning
  • scalable autonomy

A weak loop creates hidden risk.

The additional workflow-first rule ensures MWMS does not overuse agents where controlled workflows would be safer, cheaper, and more predictable.

This framework ensures MWMS AI Employees become controlled workers, not uncontrolled model actions.


Change Log

v1.1 Agent Vs Workflow Update

Updated the MWMS Agent Loop Control Framework with insights from the Agents vs Workflows and Search Scrape Summarise course block.

Added:

  • Agent Vs Workflow Rule
  • Agentic Dial Rule
  • Workflow First Governance Rule
  • Action Consolidation Rule
  • Search Plus Inspect Workflow Pattern
  • When Not To Use An Agent Loop
  • Research Planning Action Rules
  • Query Rewriting Action Rules
  • Source Summarisation Action Rules
  • autonomy mode field in the conceptual loop record
  • research plan and source summary fields in the conceptual loop record

This update clarifies that MWMS should not make every workflow agentic. Some workflows should remain deterministic or assisted. AI autonomy should be increased only where judgement is useful and where observability, evaluation, cost control, and HeadOffice governance support it.

v1.0 Initial Draft

Created the MWMS Agent Loop Control Framework based on absorbed insights from Matt Pocock AIhero Build DeepSearch In TypeScript.

Integrated principles from course blocks covering:

  • extracting system parameters
  • weaknesses of overloaded system prompts
  • replacing hidden SDK-managed tool loops with controlled loops
  • shared workflow context
  • modular actions
  • next-action picker pattern
  • action-specific model and prompt control
  • stop conditions
  • forced final answer or escalation
  • loop observability
  • action-level evaluation
  • prompt hygiene
  • prompt caching considerations
  • parameter versioning
  • AI Employee autonomy governance

Established this framework as the MWMS governance page for controlling multi-step AI Employee agent loops across Brains, workflows, and future client-facing AI systems.