System: MWMS
Document Type: Framework
Status: Draft For MCR
Authority: HeadOffice
Applies To: All MWMS Brains, AI Employees, Deep Search Workflows, Research Workflows, Affiliate Evaluation, Ads Review, Content Workflows, Experimentation Brain, HeadOffice Intelligence, Future Client Facing AI Systems
Primary Location: MCR
Future Operational Destination: mwmsbrain.site, mwmsheadofficebrain.site, AI Employee Dashboards, Future Agent Control Dashboards
Parent Page: HeadOffice
Source Of Truth: MCR
Related Frameworks: MWMS Deep Search Quality And Observability Framework, MWMS AI Observability Metadata Standard, MWMS AI Employee Evaluation Scorecard Standard, MWMS Research Planning And Query Rewriting Standard
Course Source: Matt Pocock AIhero Build DeepSearch In TypeScript
Absorption Status: Approved For Integration
Purpose
The purpose of this framework is to define how MWMS controls AI Employee agent loops.
An agent loop is the repeated decision cycle where an AI Employee decides what to do next, performs an action, updates context, evaluates progress, and either continues, answers, escalates, or stops.
MWMS must not rely blindly on hidden SDK-managed loops, overloaded prompts, or uncontrolled tool-calling behaviour for important workflows.
This framework ensures AI Employees operate through:
- explicit parameters
- controlled actions
- shared workflow context
- next-action selection
- clear stop conditions
- fallback behaviour
- observability
- evaluation
- cost control
- HeadOffice governance
The goal is to give AI Employees controlled autonomy without allowing them to wander, loop endlessly, overuse tools, ignore context, or produce unsupported decisions.
Scope
This framework applies to all MWMS systems where an AI Employee or AI workflow performs multi-step work.
This includes:
- Deep Search workflows
- Research Brain source investigation
- Affiliate Brain offer evaluation
- Ads Brain compliance review
- Content Brain research workflows
- HeadOffice Intelligence workflows
- Newsletter Intelligence routing
- Experimentation Brain test analysis
- Data Brain validation processes
- Finance Brain cost or budget review
- Brain Room AI assistance
- Dev Console AI assistance
- future client-facing AIBS agents
- future autonomous AI Employees
- future workflow agents using tools, databases, source inspection, or routing logic
This framework does not define exact code implementation, TypeScript structure, SDK usage, Langfuse setup, Evalite setup, or model provider configuration.
It defines the MWMS governance architecture for agent-loop behaviour.
Core Rule
An AI Employee must not be governed by one overloaded prompt that performs every function at once.
The MWMS rule is:
One prompt should not control planning, search, source selection, scraping, reasoning, answering, routing, stopping, and escalation all at once.
Important AI work must be broken into controlled actions that can be traced, evaluated, improved, and governed.
Definition Of An Agent Loop
An agent loop is a structured cycle where an AI Employee repeatedly:
- Reviews the workflow goal
- Reviews the current context
- Chooses the next approved action
- Executes that action
- Records the result
- Updates workflow context
- Checks stop conditions
- Decides whether to continue, answer, route, escalate, or stop
In MWMS, an agent loop is not just a model calling tools.
It is a governed workflow control system.
Why Agent Loop Control Matters
Without controlled loops, AI Employees may:
- search too many times
- stop too early
- answer with weak evidence
- ignore the original question
- use tools unnecessarily
- choose poor sources
- fail silently
- exceed cost limits
- exceed token limits
- fail to route correctly
- repeat the same action
- hallucinate final answers
- become difficult to debug
- become difficult to evaluate
A controlled loop prevents important AI behaviour from being hidden inside a single model action or SDK-managed tool loop.
Single Prompt Fragility Rule
A single system prompt becomes fragile when it is responsible for too many jobs.
Examples of jobs that should not all be handled by one undifferentiated prompt:
- understanding the user request
- deciding whether research is needed
- choosing search queries
- deciding which URLs to inspect
- scraping or opening sources
- evaluating source quality
- deciding whether enough evidence exists
- writing the final answer
- deciding whether to stop
- deciding whether to escalate
- deciding whether to route to another Brain
- formatting final output
- applying compliance checks
- assigning confidence
When one prompt handles all of these, improving one behaviour can damage another.
For example:
- improving source selection may weaken final answer style
- adding compliance caution may reduce search tenacity
- changing answer format may damage tool usage
- adding cost limits may cause premature stopping
- improving speed may weaken evidence quality
MWMS must avoid overloaded prompt architecture.
Agent Vs Workflow Rule
Not every AI system should be treated as an open-ended agent.
Before MWMS creates or upgrades an AI Employee loop, the system must decide whether the task should be:
- a deterministic workflow
- an assisted workflow
- a controlled agent loop
- a broader agentic workflow
- a future autonomous agent
The key rule is:
Use AI judgement where judgement is useful. Use workflow rules where predictability is better.
Some tasks should not require the AI Employee to decide every step.
For example, if a workflow always needs a source inspected after search, MWMS should not ask the AI every time whether to inspect the source. It should define a controlled workflow where search and source inspection happen together.
The more business risk involved, the more MWMS should favour controlled workflows over open-ended agent behaviour.
Agentic Dial Rule
MWMS should treat autonomy as a dial, not a yes/no switch.
Each workflow should be assigned an autonomy mode.
| Mode | Meaning | Suitable Use |
|---|---|---|
| Deterministic Workflow | Fixed path, little or no AI choice | high-risk, repeatable, compliance-sensitive workflows |
| Assisted Workflow | AI helps within controlled steps | early AI Employee work, internal support, structured review |
| Controlled Agent | AI chooses next action from approved options | Deep Search, research, evaluation, source inspection |
| Agentic Workflow | AI has broader control but is still logged and limited | mature internal workflows with strong evals |
| Autonomous Agent | AI operates with high autonomy under governance | only for mature, proven, low-risk or explicitly approved systems |
Early MWMS AI systems should usually operate in Assisted Workflow or Controlled Agent mode, not at full autonomy.
AI Employees should only move up the autonomy dial when evaluation, observability, cost control, safety, and human review results prove the workflow is stable.
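The dial can be sketched as an ordered enumeration. This is a minimal illustration under stated assumptions, not an MWMS implementation: the names `AutonomyMode` and `may_change_mode` are hypothetical, and the rule that moving down the dial needs no evidence is an assumption added for the example.

```python
from enum import IntEnum


class AutonomyMode(IntEnum):
    """Autonomy modes from the table above, ordered from least to most autonomous."""

    DETERMINISTIC_WORKFLOW = 1
    ASSISTED_WORKFLOW = 2
    CONTROLLED_AGENT = 3
    AGENTIC_WORKFLOW = 4
    AUTONOMOUS_AGENT = 5


def may_change_mode(
    current: AutonomyMode, proposed: AutonomyMode, evidence_approved: bool
) -> bool:
    """Allow a move up the dial only when evaluation evidence has been approved.

    Assumption: moving down the dial (less autonomy) never needs evidence.
    """
    if proposed > current:
        return evidence_approved
    return True
```

Because the modes are ordered integers, "moving up the dial" becomes a simple comparison that a dashboard or governance check could log.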
Workflow First Governance Rule
Before creating an agent loop, MWMS must ask:
Can this be handled better as a workflow?
A workflow-first approach is preferred when:
- the steps are predictable
- the source requirements are known
- the compliance risk is high
- the cost risk is high
- the output format must be consistent
- the process needs repeatability
- the task must be auditable
- the AI does not need much judgement
- failure would create business risk
An agent loop is preferred when:
- the path is not fully known in advance
- the AI needs to choose between multiple evidence paths
- the research question is complex
- sources may conflict
- the system must adapt based on findings
- the workflow requires judgement between approved options
- the task needs iterative investigation
The rule:
MWMS should not make a workflow agentic unless the task genuinely benefits from agentic behaviour.
Action Consolidation Rule
If two actions almost always belong together, MWMS should consider consolidating them into one workflow step.
This reduces:
- unnecessary AI decisions
- cost
- latency
- failure points
- action drift
- repeated tool use
- avoidable complexity
Examples:
| Separate Actions | Consolidated Workflow Pattern |
|---|---|
| Search, then maybe inspect source | Search plus inspect top eligible sources |
| Inspect source, then maybe summarise | Inspect plus summarise source |
| Read newsletter, then maybe route | Extract signal plus propose routing |
| Inspect offer page, then maybe check claims | Offer page inspect plus claim extraction |
| Review ad, then maybe check compliance | Ad review plus compliance risk check |
| Source inspect, then maybe score | Source inspect plus trust/freshness score |
Consolidation should be used when the second action is almost always required and the AI decision adds little value.
This keeps MWMS agent loops cleaner, cheaper, and easier to evaluate.
Search Plus Inspect Workflow Pattern
For Deep Search and Research Brain workflows, search and source inspection often belong together.
A weak pattern is:
- AI decides to search.
- Search results appear.
- AI decides whether to inspect a source.
- AI may skip inspection and answer from snippets.
A stronger MWMS pattern is:
- Research plan defines what evidence is needed.
- Search finds candidate sources.
- The system automatically inspects top eligible sources according to source rules.
- Source summaries are created.
- The AI decides whether evidence is sufficient.
- The answer is produced only after evidence is inspected or limitations are declared.
This prevents AI Employees from treating search snippets as proof.
The rule:
Search results are leads. Inspected sources are evidence.
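The stronger pattern can be sketched as one consolidated step. The function below is illustrative only: `search_plus_inspect`, its callable parameters, and the record fields are hypothetical names, and the choice to count failed inspections against the inspection limit is an assumption.

```python
from typing import Callable


def search_plus_inspect(
    query: str,
    search: Callable[[str], list],
    inspect: Callable[[str], str],
    is_eligible: Callable[[dict], bool],
    max_sources_to_inspect: int = 3,
) -> list:
    """Search, then inspect the top eligible results in one consolidated step.

    Returns one record per inspection attempt. Search snippets are treated as
    leads only and are never returned as evidence.
    """
    evidence = []
    for result in search(query):
        # Failed attempts also count towards the limit (assumption).
        if len(evidence) >= max_sources_to_inspect:
            break
        if not is_eligible(result):
            continue
        try:
            content = inspect(result["url"])
            evidence.append(
                {"url": result["url"], "status": "inspected", "content": content}
            )
        except Exception as exc:
            # Record the failure instead of failing silently.
            evidence.append(
                {"url": result["url"], "status": "failed", "failure_reason": str(exc)}
            )
    return evidence
```

The AI Employee is not asked whether to inspect a source. It only sees the inspected evidence afterwards and decides whether that evidence is sufficient.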
When Not To Use An Agent Loop
MWMS should avoid agent loops when:
- the workflow has a fixed required sequence
- the task is simple and deterministic
- the cost of exploration is not justified
- the AI has no meaningful decision to make
- compliance risk requires fixed checks
- the process must follow a strict approval path
- the result must be generated from internal records only
- the workflow is already well-defined and repeatable
- a deterministic pipeline will be safer and cheaper
- the AI Employee has not been evaluated enough for autonomy
Examples of workflows that may not need an agent loop:
- checking required fields
- formatting a known report
- routing a known newsletter category
- validating JSON structure
- applying a fixed compliance checklist
- generating a standard title
- summarising a known source record
- copying data between approved records
Agent loops should be reserved for workflows that benefit from adaptive reasoning.
Controlled Autonomy Principle
MWMS AI Employees may choose what to do next, but only from actions approved by MWMS.
The AI can choose to:
- search
- inspect source
- query database
- answer
- route
- escalate
- retry
- stop
But MWMS defines:
- which actions exist
- which actions are allowed
- which tools each action may use
- when actions can happen
- how many times actions can repeat
- what each action must return
- what metadata each action must record
- when human review is required
This is controlled autonomy.
Agent Loop Structure
A standard MWMS agent loop should include:
- Workflow goal
- Shared context container
- Explicit parameters
- Approved action list
- Next-action picker
- Action executor
- Result recorder
- Context updater
- Stop condition checker
- Final answer or escalation handler
- Observability trace
- Evaluation record
- Kaizen output
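The components above can be sketched as a small loop skeleton. This is a simplified illustration, not a prescribed implementation: `run_agent_loop`, the dictionary-based context, and the treatment of `answer` and `escalate` as loop-ending actions are assumptions made for the example. Observability, evaluation, and Kaizen output are omitted.

```python
from typing import Callable

# Actions that end the loop, mapped to the status they produce (assumption).
TERMINAL_STATUS = {"answer": "completed", "escalate": "escalating"}


def run_agent_loop(
    goal: str,
    pick_next_action: Callable[[dict], dict],
    executors: dict,
    allowed_actions: frozenset,
    max_steps: int,
) -> dict:
    """Run a governed loop: pick, check permission, execute, record, check limits."""
    context = {
        "workflow_goal": goal,
        "current_step": 0,
        "action_history": [],
        "status": "running",
        "stop_reason": "",
    }
    while context["current_step"] < max_steps:
        action = pick_next_action(context)  # next-action picker
        action_type = action["action_type"]
        not_runnable = action_type not in TERMINAL_STATUS and action_type not in executors
        if action_type not in allowed_actions or not_runnable:
            # Permission boundary: never execute an unapproved action.
            context["status"] = "escalating"
            context["stop_reason"] = f"action not permitted or not implemented: {action_type}"
            return context
        context["current_step"] += 1
        entry = {
            "step": context["current_step"],
            "action_type": action_type,
            "reason": action.get("reason", ""),
        }
        if action_type in TERMINAL_STATUS:
            context["action_history"].append(entry)  # result recorder
            context["status"] = TERMINAL_STATUS[action_type]
            context["stop_reason"] = action_type
            return context
        entry["result"] = executors[action_type](action, context)  # action executor
        context["action_history"].append(entry)  # context updater
    # Stop condition: the step limit was reached, so end with an explicit status.
    context["status"] = "stopped_by_limit"
    context["stop_reason"] = "max_steps_reached"
    return context
```

Every exit path sets a status and a stop reason, so the loop never ends without a recorded outcome.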
Workflow Goal Requirement
Every loop must begin with a clear workflow goal.
The workflow goal should include:
- original user request
- Brain or system responsible
- task purpose
- expected output type
- success criteria
- known constraints
- urgency
- priority
- risk level
- required evidence level
- review requirement
Every loop component must have access to the workflow goal.
If a search action, source action, answer action, or next-action picker does not know the original goal, it may perform locally correct actions that fail the overall task.
Context Container Requirement
Every agent loop should use a shared context container.
The context container is the working memory of the loop.
It should track:
- original request
- workflow goal
- current step
- maximum steps
- Brain name
- AI Employee name
- task ID
- thread ID
- workflow type
- current status
- search history
- source history
- inspected source history
- evidence collected
- tool results
- database records used
- previous actions
- failed actions
- retry count
- cost used
- latency
- token usage
- confidence state
- stop conditions
- final answer status
- escalation status
The context container makes the loop visible, testable, and controllable.
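A context container could be sketched as a small data class. The example below covers only a subset of the fields listed above, and the class name `LoopContext` and its methods are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class LoopContext:
    """Working memory for one agent loop (a subset of the tracked fields)."""

    original_request: str
    workflow_goal: str
    brain_name: str
    ai_employee_name: str
    task_id: str
    max_steps: int
    current_step: int = 0
    current_status: str = "pending"
    previous_actions: list = field(default_factory=list)
    failed_actions: list = field(default_factory=list)
    evidence_collected: list = field(default_factory=list)
    retry_count: int = 0
    cost_used: float = 0.0

    def record_action(self, action_type: str, success: bool, cost: float = 0.0) -> None:
        """Advance the step counter and log the action so the loop stays visible."""
        self.current_step += 1
        self.cost_used += cost
        entry = {"step": self.current_step, "action_type": action_type, "success": success}
        self.previous_actions.append(entry)
        if not success:
            self.failed_actions.append(entry)

    def steps_remaining(self) -> int:
        """Return how many steps are left before max_steps is reached."""
        return max(self.max_steps - self.current_step, 0)
```

Keeping this state in one object means every action, and every later evaluation, reads from the same record.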
Context Formatting Rule
MWMS should distinguish between storage format, model context format, and human output format.
The same information may need different formats depending on use.
Storage Format
Used for databases and system records.
Examples:
- JSON
- table rows
- event records
- source records
- task records
LLM Context Format
Used to help the model reason clearly.
Examples:
- labelled sections
- concise summaries
- XML-style tags
- readable action history
- source excerpts
- evidence blocks
Human Output Format
Used for operators and dashboards.
Examples:
- reports
- cards
- tables
- recommendations
- decisions
- next steps
The rule:
Store structured data for systems, but present context to the LLM in the clearest possible form for reasoning.
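As an illustration of the rule, stored records can be rendered into labelled, XML-style sections before they reach the model. The function name `format_context_for_llm`, the tag names, and the record fields are assumptions chosen for this sketch.

```python
from xml.sax.saxutils import escape, quoteattr


def format_context_for_llm(goal: str, action_history: list, evidence: list) -> str:
    """Render stored records as labelled, XML-style sections for the model."""
    lines = ["<workflow_goal>", escape(goal), "</workflow_goal>", "<action_history>"]
    for item in action_history:
        # One readable line per action instead of a raw JSON record.
        line = f"Step {item['step']}: {item['action_type']} ({item['reason']})"
        lines.append(escape(line))
    lines.append("</action_history>")
    lines.append("<evidence>")
    for source in evidence:
        url = quoteattr(source["url"])  # adds surrounding quotes and escapes
        lines.append(f"<source url={url}>{escape(source['summary'])}</source>")
    lines.append("</evidence>")
    return "\n".join(lines)
```

The stored records stay structured for systems. Only the rendering changes for the model.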
Explicit Parameters Requirement
Important AI Employee behaviour should be controlled by explicit parameters.
Parameters should not be hidden inside prompt wording where possible.
Recommended parameters:
| Parameter | Purpose |
|---|---|
| max_steps | Prevents endless loops |
| max_searches | Controls search usage |
| max_sources_to_inspect | Controls crawler/scraper use |
| max_retries | Prevents repeated failure |
| max_cost | Protects budget |
| max_latency | Protects user experience |
| model_name | Controls model choice |
| temperature | Controls output variability |
| allowed_tools | Restricts tool access |
| allowed_actions | Restricts loop behaviour |
| required_sources | Defines evidence requirement |
| confidence_threshold | Defines answer readiness |
| escalation_threshold | Defines review requirement |
| stop_conditions | Defines when loop ends |
| fallback_action | Defines what happens when loop cannot complete |
The rule:
AI Employee behaviour should be governed by explicit parameters, not hidden assumptions.
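One way to make parameters explicit is an immutable, validated configuration object. The sketch below uses a subset of the recommended parameters. The class name `LoopParameters`, the 0 to 1 confidence scale, and the default fallback of `escalate` are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopParameters:
    """Explicit, immutable loop limits, set outside the prompt."""

    max_steps: int
    max_searches: int
    max_sources_to_inspect: int
    max_retries: int
    max_cost: float
    confidence_threshold: float  # assumed to be on a 0 to 1 scale
    allowed_actions: frozenset
    fallback_action: str = "escalate"

    def __post_init__(self) -> None:
        if self.max_steps < 1:
            raise ValueError("max_steps must be at least 1")
        for name in ("max_searches", "max_sources_to_inspect", "max_retries", "max_cost"):
            if getattr(self, name) < 0:
                raise ValueError(f"{name} must not be negative")
        if not 0.0 <= self.confidence_threshold <= 1.0:
            raise ValueError("confidence_threshold must be between 0 and 1")
        if self.fallback_action not in self.allowed_actions:
            raise ValueError("fallback_action must be an allowed action")
```

Because the object is frozen, a running loop cannot quietly raise its own limits.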
Approved Action List
Each agent loop must define its approved action list.
Example Deep Search actions:
- search
- inspect source
- scrape
- summarise source
- evaluate evidence
- answer
- stop
- escalate
Example Research Brain actions:
- create research plan
- rewrite queries
- search web
- inspect source
- compare sources
- extract evidence
- classify source
- summarise findings
- answer
- request human review
Example Affiliate Brain actions:
- research offer
- inspect vendor page
- inspect affiliate terms
- check claims
- check competition
- score offer
- reject
- park
- proceed to testing
- request Research Brain support
Example HeadOffice actions:
- classify signal
- route to Brain
- create task
- escalate
- monitor
- approve
- reject
- park
- request more evidence
- generate report
The action list is a permission boundary.
If an action is not in the approved list, the AI Employee must not perform it.
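The permission boundary can be sketched as a lookup that runs before any action executes. The workflow keys and snake_case action identifiers below are hypothetical renderings of the Deep Search and HeadOffice lists above, and `require_approved` is an illustrative name.

```python
class ActionNotApproved(Exception):
    """Raised when an AI Employee proposes an action outside its approved list."""


# Hypothetical identifiers for two of the example lists above.
APPROVED_ACTIONS = {
    "deep_search": frozenset({
        "search", "inspect_source", "scrape", "summarise_source",
        "evaluate_evidence", "answer", "stop", "escalate",
    }),
    "headoffice": frozenset({
        "classify_signal", "route_to_brain", "create_task", "escalate", "monitor",
        "approve", "reject", "park", "request_more_evidence", "generate_report",
    }),
}


def require_approved(workflow_type: str, action_type: str) -> str:
    """Return the action if it is approved for the workflow, otherwise raise."""
    approved = APPROVED_ACTIONS.get(workflow_type, frozenset())
    if action_type not in approved:
        raise ActionNotApproved(f"{action_type!r} is not approved for {workflow_type!r}")
    return action_type
```

An unknown workflow type has an empty approved list in this sketch, so every action is refused by default.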
Next Action Picker Pattern
A next-action picker is the component that decides what the agent should do next.
The next-action picker reviews:
- workflow goal
- current context
- previous actions
- evidence collected
- failed attempts
- remaining limits
- stop conditions
It then chooses one approved action.
Examples:
- search
- inspect source
- query database
- answer
- route
- escalate
- stop
The key rule:
The AI may choose the next action, but only from controlled options defined by MWMS.
Next Action Picker Output Standard
A next-action picker should return structured output.
Recommended fields:
{
  "action_type": "",
  "reason": "",
  "target": "",
  "input_summary": "",
  "confidence": 0,
  "requires_tool": false,
  "requires_human_review": false,
  "expected_result": "",
  "risk_note": ""
}
This allows MWMS to log why an action was chosen and whether it was appropriate.
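Structured picker output can be validated before the loop acts on it. The following is a sketch under assumptions: `parse_picker_output` is a hypothetical helper, and the type expected for each recommended field is inferred from the empty example values above.

```python
import json

# Expected type for each recommended field (inferred from the example values).
PICKER_FIELDS = {
    "action_type": str,
    "reason": str,
    "target": str,
    "input_summary": str,
    "confidence": (int, float),
    "requires_tool": bool,
    "requires_human_review": bool,
    "expected_result": str,
    "risk_note": str,
}


def parse_picker_output(raw: str, allowed_actions) -> dict:
    """Parse and validate picker output; raise ValueError on any problem."""
    data = json.loads(raw)  # invalid JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("picker output must be a JSON object")
    for name, expected in PICKER_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        value = data[name]
        # bool is a subclass of int, so reject it explicitly for non-bool fields.
        wrong_bool = isinstance(value, bool) and expected is not bool
        if wrong_bool or not isinstance(value, expected):
            raise ValueError(f"wrong type for field: {name}")
    if data["action_type"] not in allowed_actions:
        raise ValueError(f"action not approved: {data['action_type']}")
    return data
```

A rejected output can itself be logged as a picker failure, which supports the action-level evaluation described later in this framework.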
Action Type Field Rule
Where possible, MWMS should prefer a clear action_type field over complicated branching schemas.
For example:
{
  "action_type": "search",
  "query": "best current affiliate tracking tools for YouTube ads",
  "reason": "Need current external evidence before making a recommendation"
}
This is usually easier for AI systems to follow than complex schemas with many mutually exclusive structures.
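A single `action_type` field also keeps the calling code simple, because the loop can dispatch on one string. The handler names and return values in this sketch are hypothetical.

```python
def run_search(action: dict) -> dict:
    """Placeholder search handler: a real one would call a search provider."""
    return {"status": "searching", "query": action["query"]}


def run_answer(action: dict) -> dict:
    """Placeholder answer handler."""
    return {"status": "answering"}


# One handler per approved action_type.
HANDLERS = {"search": run_search, "answer": run_answer}


def dispatch(action: dict) -> dict:
    """Route an action to its handler using only the action_type field."""
    action_type = action.get("action_type")
    handler = HANDLERS.get(action_type)
    if handler is None:
        raise ValueError(f"unknown action_type: {action_type!r}")
    return handler(action)
```

Adding a new approved action then means adding one handler and one dictionary entry, rather than extending a branching schema.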
Modular Action Rule
Each major action should be separated into its own component where possible.
Examples:
- research planning action
- query rewriting action
- search action
- inspect source action
- summarise source action
- database query action
- source scoring action
- answer action
- routing action
- escalation action
- evaluation action
Each action should have:
- clear purpose
- allowed inputs
- expected outputs
- allowed tools
- failure handling
- metadata capture
- evaluation criteria
The rule:
AI Employee work should be composed from controlled actions, not one giant undifferentiated prompt.
Action Specific Model Rule
Different actions may need different models, settings, prompts, and evaluation criteria.
Examples:
| Action | Possible Model Need |
|---|---|
| Research planning | strong reasoning and task decomposition |
| Query rewriting | focused search-planning ability |
| Search query planning | fast, low-cost reasoning |
| Source selection | relevance-focused reasoning |
| Source extraction | structured summarisation |
| Compliance review | cautious, high-accuracy reasoning |
| Final answer | strong synthesis and clarity |
| Routing decision | structured classification |
| Evaluation judge | stable criteria-following model |
MWMS should not assume every action needs the same model, temperature, or prompt.
Action Specific Evaluation Rule
Each loop component should be independently evaluable.
Examples:
Next Action Picker Evaluation
- Did it choose the correct next action?
- Did it stop too early?
- Did it search when it should have answered?
- Did it answer when more evidence was needed?
- Did it escalate correctly?
Research Planning Evaluation
- Did the plan identify the real decision?
- Did it identify missing information?
- Did it include source preferences?
- Did it include freshness and jurisdiction needs?
- Did it create useful queries?
Search Action Evaluation
- Was the search query relevant?
- Was it specific enough?
- Did it account for date or freshness?
- Did it diversify sources?
- Did it avoid irrelevant searches?
Source Selection Evaluation
- Were the selected sources credible?
- Were they current enough?
- Were official sources preferred where appropriate?
- Were weak sources avoided?
- Were conflicting sources noticed?
Source Inspection Evaluation
- Was content extracted successfully?
- Was the content useful?
- Was source freshness captured?
- Were failures recorded?
Final Answer Evaluation
- Was the answer factual?
- Was it relevant?
- Was it sourced?
- Was it decision-ready?
- Was confidence calibrated?
This links directly to the MWMS AI Employee Evaluation Scorecard Standard.
Stop Condition Requirement
Every agent loop must have stop conditions.
Stop conditions may include:
- max steps reached
- max searches reached
- max sources inspected
- sufficient evidence collected
- answer confidence threshold reached
- cost limit reached
- time limit reached
- token budget reached
- repeated failure detected
- required source unavailable
- human review required
- escalation required
- task no longer valid
- user clarification needed
An agent loop without stop conditions is unsafe.
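A stop condition checker can be sketched as a function that runs after every action. The example covers only some of the conditions listed above. The function name, the state and limit keys, and the order in which conditions are checked are all assumptions.

```python
from typing import Optional


def check_stop_conditions(state: dict, limits: dict) -> Optional[str]:
    """Return the first stop reason that applies, or None when the loop may continue."""
    # Checked in order: review first, then success, then limits (assumed priority).
    checks = [
        ("human_review_required", state["human_review_required"]),
        ("confidence_threshold_reached", state["confidence"] >= limits["confidence_threshold"]),
        ("max_steps_reached", state["current_step"] >= limits["max_steps"]),
        ("cost_limit_reached", state["cost_used"] >= limits["max_cost"]),
        ("repeated_failure_detected", state["consecutive_failures"] > limits["max_retries"]),
    ]
    for reason, triggered in checks:
        if triggered:
            return reason
    return None
```

Returning a named reason, rather than a bare boolean, gives the trace a stop reason to record.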
Forced Final Output Or Escalation Rule
A loop must not fail silently.
If the loop reaches a limit, the system must produce one of the following:
- Final answer with limitations
- Partial answer with confidence warning
- Request for more information
- Escalation to human review
- Route to another Brain
- Park status with reason
- Failed status with explanation
The rule:
When the loop cannot continue, MWMS must still produce a controlled outcome.
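The rule can be illustrated with a function that always returns a named outcome when a limit is hit. The mapping from loop state to outcome below is an assumption made for the example. MWMS may choose different criteria, and the routing, park, and failed outcomes are omitted for brevity.

```python
def controlled_outcome(
    stop_reason: str, findings: list, confidence: float, confidence_threshold: float
) -> dict:
    """Choose a controlled outcome so a stopped loop never ends silently."""
    if stop_reason == "user_clarification_needed":
        outcome = "request_for_more_information"
    elif findings and confidence >= confidence_threshold:
        outcome = "final_answer_with_limitations"
    elif findings:
        outcome = "partial_answer_with_confidence_warning"
    else:
        # Nothing usable was found, so a human must decide what happens next.
        outcome = "escalation_to_human_review"
    return {
        "outcome": outcome,
        "stop_reason": stop_reason,
        "findings": list(findings),
        "confidence": confidence,
    }
```

Every branch produces an outcome and carries the stop reason with it, so reviewers can see why the loop ended.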
Loop State Statuses
Suggested loop statuses:
| Status | Meaning |
|---|---|
| pending | Loop created but not started |
| running | Loop actively working |
| planning_research | Creating research plan or query set |
| searching | Performing search action |
| inspecting_source | Inspecting source content |
| summarising_source | Compressing source content into evidence notes |
| querying_database | Reading internal records |
| evaluating | Running eval or quality check |
| answering | Producing final output |
| routing | Sending result to another Brain or workflow |
| escalating | Human review required |
| completed | Final output created successfully |
| parked | Paused for later |
| failed | Could not complete |
| stopped_by_limit | Max step, cost, time, or token limit reached |
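The suggested statuses could be held in an enumeration so that unknown values are rejected early. `LoopStatus` and `is_known_status` are hypothetical names for this sketch.

```python
from enum import Enum


class LoopStatus(str, Enum):
    """The suggested loop statuses from the table above."""

    PENDING = "pending"
    RUNNING = "running"
    PLANNING_RESEARCH = "planning_research"
    SEARCHING = "searching"
    INSPECTING_SOURCE = "inspecting_source"
    SUMMARISING_SOURCE = "summarising_source"
    QUERYING_DATABASE = "querying_database"
    EVALUATING = "evaluating"
    ANSWERING = "answering"
    ROUTING = "routing"
    ESCALATING = "escalating"
    COMPLETED = "completed"
    PARKED = "parked"
    FAILED = "failed"
    STOPPED_BY_LIMIT = "stopped_by_limit"


def is_known_status(value: str) -> bool:
    """Return True when the value matches one of the suggested statuses."""
    return value in {status.value for status in LoopStatus}
```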
Loop History Requirement
The agent loop should maintain a history of actions.
Each action history item should include:
- step number
- action type
- reason
- input summary
- tool used
- output summary
- success status
- failure reason
- cost estimate
- latency
- whether result was used
- next status
This history supports:
- observability
- debugging
- evaluation
- human review
- Kaizen
- regression protection
Research Planning Action Rules
A research planning action should be used when the workflow requires external evidence, source comparison, freshness checking, or structured investigation.
Research planning should record:
- research goal
- decision supported
- missing information
- assumptions to verify
- freshness requirement
- jurisdiction or market context
- source preferences
- risk areas
- query plan
- stopping criteria
Research planning should happen before important searches, not after random search results have already shaped the answer.
This connects to the MWMS Research Planning And Query Rewriting Standard.
Query Rewriting Action Rules
A query rewriting action should transform a business task or user question into search queries that execute the research plan.
Query rewriting should record:
- original request
- research purpose
- generated queries
- query purpose
- preferred source type
- freshness need
- expected evidence
- priority order
The rule:
Queries are not just search phrases. They are the execution path of the research plan.
Search Action Rules
A search action should be used when the system needs external information.
Search action should record:
- query
- query reason
- query timestamp
- search provider
- result count
- top results
- whether results were useful
- whether follow-up search is needed
Search should be avoided when:
- internal records are enough
- the task does not depend on current or external information
- the cost is not justified
- the loop has already searched enough
- human clarification is needed first
Source Inspection Action Rules
A source inspection action should be used when search snippets are not enough.
Source inspection should record:
- URL
- title
- retrieval time
- access status
- extraction status
- extracted content summary
- source freshness
- source trust rating
- evidence usefulness
- whether content supports the final answer
The rule:
Search snippets are not enough for important Deep Search decisions.
Source Summarisation Action Rules
A source summarisation action should compress inspected source content into compact evidence notes.
Source summaries should preserve:
- source title
- source URL
- query that found the source
- retrieved date
- source date if visible
- key facts
- key claims
- relevant figures
- limitations
- what the source does not prove
- freshness status
- trust rating
- relevance to the research goal
- whether the source supports the final conclusion
The rule:
Summaries must compress evidence without inventing evidence.
Source summarisation allows MWMS to use more source diversity without overloading the model context window.
Database Action Rules
A database action should be used when the AI Employee needs MWMS internal records.
Examples:
- task record
- source record
- offer record
- campaign record
- experiment record
- user record
- newsletter record
- finance record
- previous trace
- regression failure
Database actions should record:
- table or system queried
- record IDs read
- records written
- permission status
- success status
- failure reason
Internal database state is part of agent context and must be observable.
Answer Action Rules
The answer action should produce the final or partial output.
An answer action should include:
- direct answer
- evidence summary
- source references where required
- confidence
- limitations
- decision or recommendation
- next action
- review requirement
- routing if needed
- Kaizen note if useful
The answer action should not invent missing evidence.
If evidence is incomplete, the answer must say so.
Escalation Action Rules
Escalation should occur when:
- evidence is weak
- sources conflict
- compliance risk exists
- financial risk exists
- confidence is low
- tools failed
- database writes failed
- loop limits were reached
- human judgement is required
- the AI Employee is not authorised to proceed
Escalation output should include:
- reason for escalation
- current findings
- missing information
- risk level
- recommended human action
- related trace ID
- related task ID
Routing Action Rules
Routing should occur when the result belongs to another Brain or workflow.
Examples:
- source issue to Research Brain
- campaign issue to Ads Brain
- offer issue to Affiliate Brain
- cost issue to Finance Brain
- test issue to Experimentation Brain
- system issue to Operations Brain
- compliance issue to Risk Brain
- framework update to MCR
Routing should record:
- destination Brain
- reason for routing
- source task
- priority
- urgency
- expected next action
Prompt Architecture Rule
Agent loops should use prompt modularity.
Prompt layers should be separated into:
- Static role instruction
- Workflow rules
- Approved actions
- Success criteria
- Dynamic context
- Action history
- Current decision request
- Output schema
Static instructions should be stable and versioned.
Dynamic context should be inserted separately.
Prompt changes should be tested using evals before being trusted.
Prompt Caching And Stability Rule
Where supported by the provider, MWMS should keep stable prompt content before dynamic content.
This can improve cost and latency through prompt caching.
Stable content may include:
- AI Employee role
- action definitions
- safety rules
- output schema
- evaluation criteria
Dynamic content may include:
- user request
- current context
- source excerpts
- action history
- latest task state
This is a performance optimisation, not the core governance principle.
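The ordering can be sketched as a prompt builder that always emits stable sections first. The section names and the bracketed label format are assumptions, and whether a shared prefix is actually cached depends on the model provider.

```python
# Section names are illustrative; stable sections always come first.
STABLE_SECTIONS = ("role", "action_definitions", "safety_rules", "output_schema")
DYNAMIC_SECTIONS = ("user_request", "current_context", "action_history")


def stable_prefix(stable: dict) -> str:
    """Return the part of the prompt that does not change between calls."""
    return "\n\n".join(f"[{name}]\n{stable[name]}" for name in STABLE_SECTIONS)


def build_prompt(stable: dict, dynamic: dict) -> str:
    """Place stable sections before dynamic ones so calls share an identical prefix."""
    dynamic_part = "\n\n".join(f"[{name}]\n{dynamic[name]}" for name in DYNAMIC_SECTIONS)
    return stable_prefix(stable) + "\n\n" + dynamic_part
```

Two calls with the same stable content and different dynamic content then begin with exactly the same text.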
Parameter Versioning Rule
Agent loop parameters should eventually be versioned.
Versioned parameters may include:
- prompt version
- action list version
- model version
- scoring version
- tool permissions version
- stop condition version
- source rule version
- research planning version
- query rewriting version
This allows MWMS to compare performance across changes and avoid uncontrolled drift.
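To compare performance across changes, each trace could carry a small version record, and two records can be diffed to see what changed between runs. The record keys and the helper `changed_versions` are hypothetical.

```python
def changed_versions(before: dict, after: dict) -> dict:
    """Return {name: (old, new)} for every versioned parameter that differs."""
    names = sorted(set(before) | set(after))
    return {
        name: (before.get(name), after.get(name))
        for name in names
        if before.get(name) != after.get(name)
    }
```

If an evaluation score drops between two runs, the diff shows which versioned parameters are candidates for the cause.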
Observability Requirements
Every serious agent loop should capture observability metadata.
Required where possible:
- trace ID
- task ID
- thread ID
- Brain
- AI Employee
- workflow type
- loop status
- current step
- actions taken
- tools used
- sources inspected
- database records read or written
- cost
- latency
- failures
- confidence
- final output location
- review status
- Kaizen note
This connects directly to the MWMS AI Observability Metadata Standard.
Evaluation Requirements
Agent loops should be evaluated at both loop level and action level.
Loop-level evaluations:
- Did the loop complete successfully?
- Did it use enough evidence?
- Did it stop correctly?
- Did it stay within budget?
- Did it produce useful output?
- Did it avoid unnecessary actions?
- Did it route or escalate correctly?
Action-level evaluations:
- Did each action perform correctly?
- Did the action picker choose well?
- Was the research plan strong?
- Were queries useful?
- Were sources selected correctly?
- Were source summaries accurate?
- Was the final answer relevant and factual?
This connects directly to the MWMS AI Employee Evaluation Scorecard Standard.
Cost Control Requirements
Agent loops must include cost protection.
Cost controls may include:
- max model actions
- max searches
- max source inspections
- max retries
- max token budget
- max total cost
- cheaper model for simple actions
- expensive model only for high-value synthesis or review
- stop and escalate when budget is exceeded
The AI Employee must not set its own spending limits.
HeadOffice controls cost rules.
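Cost protection can be sketched as a guard that is configured outside the loop and consulted before every model call. `CostGuard` and `BudgetExceeded` are hypothetical names, and only two of the listed controls are shown.

```python
class BudgetExceeded(Exception):
    """Raised when a charge would exceed a limit set outside the AI Employee."""


class CostGuard:
    """Tracks spend against limits supplied by governance, not by the loop."""

    def __init__(self, max_total_cost: float, max_model_calls: int) -> None:
        self._max_total_cost = max_total_cost
        self._max_model_calls = max_model_calls
        self.spent = 0.0
        self.model_calls = 0

    def charge(self, cost: float) -> None:
        """Record one model call, refusing it if any limit would be exceeded."""
        if self.model_calls + 1 > self._max_model_calls:
            raise BudgetExceeded("max model calls reached")
        if self.spent + cost > self._max_total_cost:
            raise BudgetExceeded("max total cost reached")
        self.model_calls += 1
        self.spent += cost
```

The loop catches `BudgetExceeded` and then stops and escalates. A refused charge is not recorded, so the totals reflect only calls that were actually allowed.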
Safety And Permission Requirements
Each loop must obey MWMS tool permissions.
An AI Employee should only use tools approved for:
- its Brain
- its role
- its workflow
- its user
- its risk level
- its environment
High-risk tools may require human approval.
Examples:
- sending emails
- publishing content
- changing campaigns
- modifying budgets
- deleting records
- making client-facing recommendations
- writing to production systems
Agent autonomy must never bypass MWMS permission rules.
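A permission check of this kind might look like the sketch below. The tool identifiers are hypothetical renderings of the high-risk examples above, and the three return values are an assumption.

```python
# Hypothetical identifiers for the high-risk examples listed above.
HIGH_RISK_TOOLS = frozenset({
    "send_email", "publish_content", "change_campaign", "modify_budget",
    "delete_record", "client_recommendation", "write_production",
})


def authorise_tool(tool: str, approved_tools: frozenset, human_approved: bool = False) -> str:
    """Return 'allowed', 'needs_human_approval', or 'denied' for a tool request."""
    if tool not in approved_tools:
        return "denied"
    if tool in HIGH_RISK_TOOLS and not human_approved:
        return "needs_human_approval"
    return "allowed"
```

Here `approved_tools` stands for the set already narrowed by Brain, role, workflow, user, risk level, and environment. How that set is derived is outside this sketch.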
Human Review Requirements
Human review is required when:
- the workflow is high risk
- the output affects budget
- the output affects campaign launch
- the output affects public claims
- the output affects compliance
- source quality is weak
- confidence is low
- loop limits were reached
- escalation threshold is triggered
- the AI Employee failed a required evaluation
- the system is not yet approved for autonomy
Human review should be recorded in the trace.
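Some of these triggers can be expressed as a check that returns the reasons review is needed, so the reasons can be written to the trace. The field names, and the reading of "confidence is low" as falling below a threshold, are assumptions made for this sketch.

```python
# Output areas that trigger review, taken from the list above.
REVIEW_AREAS = ("budget", "campaign_launch", "public_claims", "compliance")


def human_review_reasons(output: dict, confidence_threshold: float) -> list:
    """Return every reason human review is required; an empty list means none."""
    reasons = []
    if output.get("risk_level") == "high":
        reasons.append("workflow is high risk")
    affected = output.get("affects", ())
    for area in REVIEW_AREAS:
        if area in affected:
            reasons.append(f"output affects {area}")
    if output.get("confidence", 0.0) < confidence_threshold:
        reasons.append("confidence is low")
    if output.get("status") == "stopped_by_limit":
        reasons.append("loop limits were reached")
    return reasons
```

Returning all triggered reasons, rather than stopping at the first, gives the reviewer the full picture.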
Agent Loop Maturity Levels
MWMS agent loops can mature through levels.
Level 1: Manual Assisted Loop
The human directs most steps.
AI assists with individual actions.
Use case:
- early testing
- new AI Employee design
- high uncertainty workflows
Level 2: Controlled Internal Loop
The AI chooses next actions from approved options.
Human reviews final output.
Use case:
- internal research
- newsletter processing
- offer research
- source analysis
Level 3: Semi Autonomous Loop
The AI completes the loop under limits.
Human review is triggered only for risk, low confidence, or failed evals.
Use case:
- stable workflows
- repeatable internal tasks
- mature Brain operations
Level 4: Governed Production Loop
The AI operates in production with full observability, scorecards, regression tests, cost controls, and permission boundaries.
Use case:
- scaled AI Employees
- client-facing systems
- critical workflows
No AI Employee should move to a higher maturity level without evaluation evidence.
AI Employee Promotion Rules
An AI Employee may gain more loop autonomy when:
- loop traces are complete
- evaluation scores are consistently strong
- regression failures are low
- cost is controlled
- latency is acceptable
- human review approval rate is high
- source quality is strong
- confidence calibration is reliable
- tool use is appropriate
- stop conditions work correctly
- escalation triggers work correctly
- research plans are consistently useful
- query rewriting improves evidence quality
AI Employee Restriction Rules
An AI Employee should be restricted when:
- it loops unnecessarily
- it stops too early
- it answers without enough evidence
- it ignores source freshness
- it overuses expensive tools
- it fails to route correctly
- it repeats failed actions
- it overstates confidence
- it fails required evals
- it bypasses review triggers
- it causes cost or compliance risk
- it creates untraceable outputs
- it creates weak research plans
- it uses vague or duplicated queries
- it treats search snippets as evidence
Restriction may include:
- lowering autonomy level
- reducing allowed actions
- reducing tool permissions
- requiring human review
- changing prompts
- changing models
- adding regression tests
- pausing the workflow
- converting the task from agent loop to deterministic workflow
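Several of the restriction measures above can be expressed as data applied to a loop policy. The following is an illustrative TypeScript sketch under stated assumptions: `LoopPolicy`, `Restriction`, and `applyRestrictions` are hypothetical names, and the sketch covers only the measures that map onto policy fields. Changing prompts or models, adding regression tests, and converting a task to a deterministic workflow are design changes and are not modelled here.

```typescript
// Hypothetical shapes for illustration only.
interface LoopPolicy {
  maturityLevel: number; // 1 to 4, per the maturity levels
  allowedActions: string[];
  allowedTools: string[];
  humanReviewRequired: boolean;
  paused: boolean;
}

type Restriction =
  | { kind: "lower_autonomy" }
  | { kind: "remove_actions"; actions: string[] }
  | { kind: "remove_tools"; tools: string[] }
  | { kind: "require_human_review" }
  | { kind: "pause_workflow" };

// Returns a new policy; the input policy is left unchanged so the
// before and after states can both be logged.
function applyRestrictions(policy: LoopPolicy, restrictions: Restriction[]): LoopPolicy {
  const next: LoopPolicy = {
    ...policy,
    allowedActions: policy.allowedActions.slice(),
    allowedTools: policy.allowedTools.slice(),
  };
  for (const r of restrictions) {
    switch (r.kind) {
      case "lower_autonomy":
        next.maturityLevel = Math.max(1, next.maturityLevel - 1);
        break;
      case "remove_actions": {
        const blocked = r.actions;
        next.allowedActions = next.allowedActions.filter((a) => blocked.indexOf(a) === -1);
        break;
      }
      case "remove_tools": {
        const blocked = r.tools;
        next.allowedTools = next.allowedTools.filter((t) => blocked.indexOf(t) === -1);
        break;
      }
      case "require_human_review":
        next.humanReviewRequired = true;
        break;
      case "pause_workflow":
        next.paused = true;
        break;
    }
  }
  return next;
}
```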
Suggested Agent Loop Record Structure
The following is a conceptual structure for future implementation.
{
  "loop_id": "",
  "trace_id": "",
  "created_at": "",
  "brain_name": "",
  "ai_employee_name": "",
  "workflow_type": "",
  "task_id": "",
  "thread_id": "",
  "workflow_goal": "",
  "current_step": 0,
  "max_steps": 0,
  "status": "",
  "autonomy_mode": "",
  "allowed_actions": [],
  "parameters": {
    "max_searches": 0,
    "max_sources_to_inspect": 0,
    "max_retries": 0,
    "max_cost": 0,
    "max_latency_ms": 0,
    "confidence_threshold": 0
  },
  "research_plan": {
    "research_goal": "",
    "decision_supported": "",
    "queries": [],
    "source_preferences": [],
    "stopping_criteria": []
  },
  "action_history": [],
  "sources_inspected": [],
  "source_summaries": [],
  "database_records_used": [],
  "cost_estimate": 0,
  "latency_ms": 0,
  "confidence": 0,
  "stop_reason": "",
  "final_output_location": "",
  "human_review_required": false,
  "evaluation_status": "",
  "kaizen_note": ""
}
This is not mandatory code.
It is the conceptual record structure for consistent future build work.
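For build work in TypeScript, the conceptual record could be mirrored as a type. The following sketch is one possible typing, not a mandated schema. The field names come from the record structure above. The element types of the history and source arrays, the `BudgetView` alias, the `budgetStopReason` helper, and its stop-reason strings are assumptions added for illustration.

```typescript
// Limits that bound a single loop run.
interface LoopParameters {
  max_searches: number;
  max_sources_to_inspect: number;
  max_retries: number;
  max_cost: number;
  max_latency_ms: number;
  confidence_threshold: number;
}

interface ResearchPlan {
  research_goal: string;
  decision_supported: string;
  queries: string[];
  source_preferences: string[];
  stopping_criteria: string[];
}

interface AgentLoopRecord {
  loop_id: string;
  trace_id: string;
  created_at: string; // assumed to be an ISO 8601 timestamp
  brain_name: string;
  ai_employee_name: string;
  workflow_type: string;
  task_id: string;
  thread_id: string;
  workflow_goal: string;
  current_step: number;
  max_steps: number;
  status: string;
  autonomy_mode: string;
  allowed_actions: string[];
  parameters: LoopParameters;
  research_plan: ResearchPlan;
  action_history: unknown[]; // element shape is not defined by the record
  sources_inspected: unknown[];
  source_summaries: unknown[];
  database_records_used: unknown[];
  cost_estimate: number;
  latency_ms: number;
  confidence: number;
  stop_reason: string;
  final_output_location: string;
  human_review_required: boolean;
  evaluation_status: string;
  kaizen_note: string;
}

// Only the fields needed to check limits (hypothetical helper type).
type BudgetView = Pick<
  AgentLoopRecord,
  "current_step" | "max_steps" | "cost_estimate" | "latency_ms" | "parameters"
>;

// Returns a stop reason when a limit is reached, or null to continue.
// The reason strings are illustrative, not an agreed vocabulary.
function budgetStopReason(r: BudgetView): string | null {
  if (r.current_step >= r.max_steps) return "max_steps_reached";
  if (r.cost_estimate >= r.parameters.max_cost) return "max_cost_reached";
  if (r.latency_ms >= r.parameters.max_latency_ms) return "max_latency_reached";
  return null;
}
```

Checking limits against the record itself, rather than inside a prompt, keeps stop behaviour explicit and inspectable in the loop trace.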
Minimum Starting Implementation
MWMS does not need a perfect autonomous loop immediately.
The first practical implementation should capture:
- workflow goal
- task ID or thread ID
- Brain
- AI Employee
- autonomy mode
- allowed actions
- current step
- max steps
- action history
- research plan where relevant
- search/source history where relevant
- source summaries where relevant
- final answer or escalation
- stop reason
- confidence
- review status
- Kaizen note
This is enough to begin controlled agent-loop operation without overengineering.
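The minimum capture list above could be enforced with a small completeness check when a loop finishes. This is a hedged TypeScript sketch. `MinimalLoopRecord` and `missingMinimumFields` are hypothetical names, the field types are assumptions, and running the check at loop completion is one possible design rather than a requirement of this framework.

```typescript
// Hypothetical minimal record; "where relevant" items are optional.
interface MinimalLoopRecord {
  workflow_goal: string;
  task_id?: string; // at least one of task_id or thread_id is expected
  thread_id?: string;
  brain_name: string;
  ai_employee_name: string;
  autonomy_mode: string;
  allowed_actions: string[];
  current_step: number;
  max_steps: number;
  action_history: string[];
  research_plan?: string[];
  source_history?: string[];
  source_summaries?: string[];
  final_answer?: string; // at least one of final_answer or escalation
  escalation?: string;
  stop_reason: string;
  confidence: number;
  review_status: string;
  kaizen_note: string;
}

// Returns the names of minimum fields that are absent or empty.
function missingMinimumFields(r: Partial<MinimalLoopRecord>): string[] {
  const missing: string[] = [];
  const isBlank = (v: unknown): boolean => typeof v !== "string" || v.trim() === "";
  const requiredText: (keyof MinimalLoopRecord)[] = [
    "workflow_goal", "brain_name", "ai_employee_name", "autonomy_mode",
    "stop_reason", "review_status", "kaizen_note",
  ];
  for (const key of requiredText) {
    if (isBlank(r[key])) missing.push(key);
  }
  if (isBlank(r.task_id) && isBlank(r.thread_id)) missing.push("task_id or thread_id");
  if (!Array.isArray(r.allowed_actions) || r.allowed_actions.length === 0) missing.push("allowed_actions");
  if (!Array.isArray(r.action_history)) missing.push("action_history");
  if (typeof r.current_step !== "number") missing.push("current_step");
  if (typeof r.max_steps !== "number" || r.max_steps <= 0) missing.push("max_steps");
  if (isBlank(r.final_answer) && isBlank(r.escalation)) missing.push("final_answer or escalation");
  if (typeof r.confidence !== "number") missing.push("confidence");
  return missing;
}
```

A record that returns an empty list meets the minimum; anything else can be routed to review or logged as a Kaizen finding.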
Relationship To Other MWMS Standards
This framework supports and should align with:
- MWMS Deep Search Quality And Observability Framework
- MWMS AI Observability Metadata Standard
- MWMS AI Employee Evaluation Scorecard Standard
- MWMS Research Planning And Query Rewriting Standard
- MWMS AI Work Session Persistence Standard
- MWMS Next Action Picker Standard
- MWMS Agent Loop Context Schema
- MWMS AI Agent Operations Core
- MWMS AI Tool Permission And Access Framework
- MWMS AI Agent Deployment Readiness Checklist
- MWMS AI Workflow Pipeline Standard
- MWMS AI Schema And Decision Ready Output Framework
- MWMS AI Output Validation Standard
- MWMS Agentic Reporting Standard
- MWMS Supabase Event Schema
- MWMS Brain Room Architecture
- HeadOffice Operational Intelligence Framework
- HeadOffice Newsletter Intelligence Operating Protocol
- Research Brain Source Evaluation Framework
- Data Brain Measurement Integrity Framework
- Experimentation Brain Canon
- MWMS Kaizen Continuous Improvement Loop
- MWMS System Change Log
This framework does not replace those standards.
It defines the control architecture for AI Employee loops.
Future Enhancements
Future pages or modules may include:
- MWMS Agent Action Registry
- MWMS Agent Stop Condition Standard
- MWMS Agent Loop Dashboard Specification
- MWMS Prompt Versioning And Parameter Control Standard
- MWMS Deep Search Source Record Standard
- MWMS Search Scrape Summarise Evidence Pipeline Standard
- MWMS Agent Loop Regression Test Library
- MWMS AI Employee Promotion And Restriction Standard
- MWMS Agentic Dial And Workflow Control Standard
These should be created only when enough implementation experience or course material justifies them.
Drift Protection
This framework prevents the following drift:
- one prompt doing too many jobs
- hidden SDK loops controlling important workflows
- AI Employees wandering without limits
- search loops running too long
- premature final answers
- unobserved tool actions
- untracked source inspection
- cost growth from uncontrolled loops
- AI actions without workflow goals
- AI actions without task linkage
- AI outputs without stop reasons
- AI Employees gaining autonomy without evaluation evidence
- prompt changes breaking unrelated behaviour
- failure patterns being lost instead of converted into Kaizen learning
- making every workflow too agentic
- using agent loops where deterministic workflows are safer
- treating search snippets as evidence
- failing to consolidate actions that always belong together
- creating unnecessary AI decisions where workflow rules would be better
If an agent loop cannot be controlled, traced, stopped, and evaluated, it should not be trusted for important MWMS work.
Architectural Intent
The architectural intent of this framework is to turn AI Employee autonomy into a controlled MWMS system capability.
MWMS is not building loose chatbots.
MWMS is building governed AI Employees that can reason, search, inspect, decide, route, and improve under HeadOffice control.
The agent loop is the core operating cycle of those AI Employees.
A strong loop gives MWMS:
- control
- visibility
- repeatability
- safety
- cost discipline
- better evaluation
- better Kaizen learning
- scalable autonomy
A weak loop creates hidden risk.
The additional workflow-first rule ensures MWMS does not overuse agents where controlled workflows would be safer, cheaper, and more predictable.
This framework ensures MWMS AI Employees become controlled workers, not uncontrolled model actions.
Change Log
v1.1 Agent Vs Workflow Update
Updated the MWMS Agent Loop Control Framework with insights from the Agents vs Workflows and Search Scrape Summarise course block.
Added:
- Agent Vs Workflow Rule
- Agentic Dial Rule
- Workflow First Governance Rule
- Action Consolidation Rule
- Search Plus Inspect Workflow Pattern
- When Not To Use An Agent Loop
- Research Planning Action Rules
- Query Rewriting Action Rules
- Source Summarisation Action Rules
- autonomy mode field in the conceptual loop record
- research plan and source summary fields in the conceptual loop record
This update clarifies that MWMS should not make every workflow agentic. Some workflows should remain deterministic or assisted. AI autonomy should be increased only where judgement is useful and where observability, evaluation, cost control, and HeadOffice governance support it.
v1.0 Initial Draft
Created the MWMS Agent Loop Control Framework based on absorbed insights from Matt Pocock AIhero Build DeepSearch In TypeScript.
Integrated principles from course blocks covering:
- extracting system parameters
- weaknesses of overloaded system prompts
- replacing hidden SDK-managed tool loops with controlled loops
- shared workflow context
- modular actions
- next-action picker pattern
- action-specific model and prompt control
- stop conditions
- forced final answer or escalation
- loop observability
- action-level evaluation
- prompt hygiene
- prompt caching considerations
- parameter versioning
- AI Employee autonomy governance
Established this framework as the MWMS governance page for controlling multi-step AI Employee agent loops across Brains, workflows, and future client-facing AI systems.