System: MWMS
Document Type: Standard
Status: Draft For MCR
Authority: HeadOffice
Applies To: All MWMS Brains, AI Employees, Agent Loops, Deep Search Workflows, Research Workflows, Affiliate Evaluation, Ads Review, Content Workflows, Experimentation Brain, HeadOffice Intelligence, Future Client Facing AI Systems
Primary Location: MCR
Future Operational Destination: mwmsbrain.site, mwmsheadofficebrain.site, AI Employee Dashboards, Future Agent Control Dashboards
Parent Page: HeadOffice
Source Of Truth: MCR
Related Frameworks: MWMS Agent Loop Control Framework, MWMS Deep Search Quality And Observability Framework, MWMS AI Observability Metadata Standard, MWMS AI Employee Evaluation Scorecard Standard
Course Source: Matt Pocock AIhero Build DeepSearch In TypeScript
Absorption Status: Approved For Integration
Purpose
This standard defines how MWMS AI Employees choose their next action inside a controlled agent loop.
A Next Action Picker is the decision layer that determines what the AI Employee should do next, based on the workflow goal, current context, previous actions, evidence gathered, limits, failures, and success criteria.
This standard ensures AI Employees do not wander freely, call tools randomly, answer too early, search forever, ignore weak evidence, or bypass MWMS governance.
The AI Employee may choose the next action, but only from approved MWMS action types.
Scope
This standard applies to any MWMS workflow where an AI Employee performs multi-step work.
This includes:
- Deep Search workflows
- Research Brain source investigation
- Affiliate Brain offer evaluation
- Ads Brain compliance review
- Content Brain research and production workflows
- HeadOffice Intelligence processing
- Newsletter Intelligence routing
- Experimentation Brain analysis
- Data Brain validation
- Finance Brain cost review
- Brain Room AI assistance
- Dev Console AI assistance
- future client-facing AIBS agents
- future autonomous AI Employees
- future AI workflows using search, scraping, databases, routing, escalation, or task creation
This standard does not define exact code implementation.
It defines the MWMS governance standard for controlled action selection.
Core Rule
The core rule is:
An AI Employee may choose what to do next, but only from actions MWMS has approved for that workflow, Brain, role, and risk level.
The Next Action Picker must not invent new actions.
It must not use tools outside its permission boundary.
It must not continue working when stop, escalation, or review conditions are met.
It must not answer if the workflow requires more evidence.
It must not search or scrape if the cost, time, source, or step limit has been reached.
Definition Of A Next Action Picker
A Next Action Picker is a controlled decision component inside an AI Employee loop.
It reviews the current workflow state and returns a structured decision such as:
- search
- inspect source
- query database
- answer
- route
- escalate
- retry
- park
- stop
- request human review
The Next Action Picker does not perform the action itself.
It chooses the next approved action and explains why that action is appropriate.
Why MWMS Needs A Next Action Picker
Without a controlled Next Action Picker, AI Employees may:
- search when they should answer
- answer when they should gather evidence
- inspect weak sources
- repeat the same search
- miss required database context
- ignore the original request
- continue after the budget is exceeded
- fail to escalate risky decisions
- call tools they should not use
- stop without producing a controlled outcome
- hide uncertainty
- route to the wrong Brain
- create untraceable outputs
A Next Action Picker gives MWMS controlled autonomy.
It allows the AI to make decisions while keeping those decisions inside MWMS boundaries.
Relationship To Agent Loop Control
This standard is a companion to the MWMS Agent Loop Control Framework.
The Agent Loop Control Framework defines the full loop.
The Next Action Picker Standard defines the decision point inside that loop.
The normal loop pattern is:
- Load workflow goal
- Load current context
- Review action history
- Check limits and stop conditions
- Use Next Action Picker
- Execute selected action
- Record result
- Update context
- Evaluate progress
- Continue, answer, route, escalate, or stop
The Next Action Picker is the moment where the system decides what happens next.
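The loop pattern above could be sketched roughly as follows. This is an illustration under stated assumptions, not the MWMS implementation: `run_loop`, `pick_next_action`, `execute`, and `TERMINAL_ACTIONS` are hypothetical names, and the choice of which actions end the loop is an assumption.

```python
from typing import Callable

# Assumption: these action types end or pause the loop; all others continue it.
TERMINAL_ACTIONS = {"answer", "route", "escalate", "park", "stop", "request_human_review"}


def run_loop(
    goal: str,
    pick_next_action: Callable[[dict], dict],
    execute: Callable[[dict], str],
    max_steps: int = 5,
) -> dict:
    """Check limits, ask the picker, record the decision, execute, update context."""
    context = {"goal": goal, "step": 0, "max_steps": max_steps, "history": []}
    while True:
        # Limits are checked before the picker is consulted.
        if context["step"] >= context["max_steps"]:
            decision = {"action_type": "stop", "reason": "max steps reached"}
        else:
            decision = pick_next_action(context)
        # Every decision is recorded so the loop stays traceable.
        context["history"].append(decision)
        if decision["action_type"] in TERMINAL_ACTIONS:
            return context
        # The picker only chooses; a separate executor performs the action.
        decision["result"] = execute(decision)
        context["step"] += 1
```

A picker that keeps choosing search is cut off by the step limit, and the final history entry is a stop decision with a reason.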
Required Inputs
A Next Action Picker should receive enough context to make a safe and useful decision.
Recommended inputs:
- original user request
- workflow goal
- Brain name
- AI Employee name
- workflow type
- task ID
- thread ID
- current step
- maximum steps
- previous actions
- current evidence
- search history
- source inspection history
- database records already used
- tool failures
- retry count
- current confidence
- cost used
- time used
- allowed actions
- allowed tools
- stop conditions
- escalation conditions
- required output type
- success criteria
- risk level
- human review rules
The Next Action Picker must always know the original user request or workflow goal.
A picker that does not know the goal cannot reliably choose the correct next action.
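One way to hold these inputs is a small context object that refuses to exist without a goal. The sketch below covers only a subset of the recommended inputs; the class name `PickerContext` and its field names are illustrative assumptions, not an MWMS schema.

```python
from dataclasses import dataclass, field


@dataclass
class PickerContext:
    """A subset of the recommended picker inputs (hypothetical field names)."""

    original_request: str
    workflow_goal: str
    brain_name: str
    current_step: int
    max_steps: int
    allowed_actions: list[str]
    risk_level: str = "low"
    previous_actions: list[dict] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The picker must always know the original request or the workflow goal.
        if not (self.original_request.strip() or self.workflow_goal.strip()):
            raise ValueError("picker context needs the original request or workflow goal")
```

Rejecting a goal-less context at construction time means the picker is never asked to decide without knowing what the workflow is for.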
Required Output
The Next Action Picker must return structured output.
Recommended output structure:
{
  "action_type": "",
  "reason": "",
  "target": "",
  "input_summary": "",
  "confidence": 0,
  "requires_tool": false,
  "requires_human_review": false,
  "expected_result": "",
  "risk_note": "",
  "stop_condition_triggered": false,
  "escalation_required": false
}
This structure allows MWMS to log, evaluate, and improve action decisions.
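As a sketch of how this structure might be consumed, the recommended output can be parsed into a typed record. The names `PickerDecision` and `parse_decision` are hypothetical, and treating confidence as a number is an assumption, since this standard does not define its scale.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class PickerDecision:
    """Mirrors the recommended output structure field for field."""

    action_type: str
    reason: str
    target: str
    input_summary: str
    confidence: float  # assumption: numeric; the scale is not defined by the standard
    requires_tool: bool
    requires_human_review: bool
    expected_result: str
    risk_note: str
    stop_condition_triggered: bool
    escalation_required: bool


def parse_decision(raw: str) -> PickerDecision:
    """Parse picker output; missing or unexpected keys raise TypeError."""
    return PickerDecision(**json.loads(raw))
```

Because the dataclass has no defaults, an output that drops a field or adds an unknown one fails at parse time instead of reaching the loop.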
Required Output Fields
action_type
Defines the selected next action.
Examples:
- search
- inspect_source
- query_database
- answer
- route
- escalate
- retry
- park
- stop
- request_human_review
reason
Explains why this action was selected.
This should be short but specific.
Bad reason:
Need more info.
Better reason:
The current evidence only comes from search snippets, so one official source should be inspected before answering.
target
Defines what the action should act on.
Examples:
- search query
- URL
- database table
- source record
- task ID
- Brain name
- workflow stage
input_summary
Summarises the input that should be passed to the action.
confidence
The picker’s confidence that this is the correct next action.
This is not final answer confidence.
It is action-selection confidence.
requires_tool
Marks whether the next action requires a tool.
requires_human_review
Marks whether the next action requires human approval or review.
expected_result
Defines what the selected action is expected to produce.
risk_note
Captures any risk attached to the selected action.
stop_condition_triggered
Marks whether a stop condition has been reached.
escalation_required
Marks whether escalation is required.
Approved Action Types
MWMS should maintain approved action types.
Initial approved action types include:
| Action Type | Purpose |
|---|---|
| search | Find external sources or information |
| inspect_source | Open, scrape, crawl, or inspect a selected source |
| query_database | Retrieve MWMS internal records |
| evaluate_evidence | Judge whether evidence is enough |
| answer | Produce final or partial answer |
| route | Send result or task to another Brain |
| escalate | Raise to human or higher authority |
| retry | Repeat a failed action under limits |
| park | Pause or hold for later |
| reject | Reject unsuitable item or path |
| request_human_review | Ask human to review before proceeding |
| request_more_information | Ask for missing user/system information |
| stop | End the loop with reason |
Future actions may be added only when HeadOffice approves them.
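The approved list lends itself to a closed enumeration, so an invented action name cannot pass as valid. The sketch below assumes a hypothetical `is_approved` helper and a per-workflow allow-set; the action names themselves are taken from the table above.

```python
from enum import Enum


class ActionType(str, Enum):
    SEARCH = "search"
    INSPECT_SOURCE = "inspect_source"
    QUERY_DATABASE = "query_database"
    EVALUATE_EVIDENCE = "evaluate_evidence"
    ANSWER = "answer"
    ROUTE = "route"
    ESCALATE = "escalate"
    RETRY = "retry"
    PARK = "park"
    REJECT = "reject"
    REQUEST_HUMAN_REVIEW = "request_human_review"
    REQUEST_MORE_INFORMATION = "request_more_information"
    STOP = "stop"


def is_approved(action_type: str, allowed_for_workflow: set[ActionType]) -> bool:
    """True only if the name is a known action and is allowed for this workflow."""
    try:
        return ActionType(action_type) in allowed_for_workflow
    except ValueError:
        # The picker invented an action that is not in the approved list.
        return False
```

For example, with an allow-set of search and answer, `is_approved("query_database", ...)` is false even though the action exists, and an unlisted name such as `"send_email"` is false for every workflow.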
Action Type: Search
Use search when external information is needed and current context is insufficient.
Search is appropriate when:
- the task depends on current information
- source evidence is required
- internal records are not enough
- market, platform, policy, or product facts may have changed
- source diversity is needed
- the workflow is a Deep Search task
Search is not appropriate when:
- the answer can be produced from existing verified context
- the workflow has already searched enough
- search budget has been reached
- the query is too vague and clarification is required first
- the action would duplicate an earlier search
The picker should output:
- search query
- reason for search
- expected evidence type
- freshness need
- source preference if relevant
Action Type: Inspect Source
Use inspect_source when search snippets are not enough.
Inspect source is appropriate when:
- a source appears important
- final answer needs evidence
- source details must be verified
- source freshness must be checked
- source reliability must be assessed
- a vendor page, policy page, or official page must be inspected
- the workflow depends on page content, not just search results
Inspect source is not appropriate when:
- the source is clearly low quality
- the source is irrelevant
- the source inspection limit has been reached
- the workflow does not require source evidence
- a higher-trust source should be searched first
The picker should output:
- URL or source ID
- reason for inspection
- evidence expected
- source risk note
- whether official source is preferred
Action Type: Query Database
Use query_database when internal MWMS records may answer or support the workflow.
Query database is appropriate when:
- task history matters
- source records already exist
- offer records already exist
- campaign records already exist
- previous evaluations exist
- Brain Room context is needed
- newsletter records are needed
- experiment records are needed
- cost or finance records are needed
- prior regression failures are relevant
Query database is not appropriate when:
- no internal context is required
- the AI Employee lacks permission
- the query would access unrelated data
- external current information is required first
The picker should output:
- database or system target
- record type
- reason for query
- expected record needed
- permission requirement
Action Type: Evaluate Evidence
Use evaluate_evidence when the system must decide whether collected evidence is enough to answer.
Evaluate evidence is appropriate when:
- multiple sources have been inspected
- sources conflict
- evidence may be weak
- confidence needs calibration
- a decision depends on source quality
- the system may need to continue or stop
The picker should output:
- evidence to evaluate
- evaluation purpose
- required criteria
- expected decision
Action Type: Answer
Use answer when enough context exists to produce a final or partial response.
Answer is appropriate when:
- evidence is sufficient
- required searches are complete
- source quality is acceptable
- database context is available
- confidence is high enough
- limits require a final answer with caveats
- no further action is useful
Answer is not appropriate when:
- evidence is weak
- key sources have not been inspected
- the task is unclear
- current information is required but not confirmed
- safety or compliance review is needed first
- the loop has missing required metadata
The picker should output:
- answer type
- confidence basis
- limitation note if needed
- whether human review is required
Action Type: Route
Use route when another Brain, workflow, or AI Employee should take over.
Route is appropriate when:
- source evaluation belongs to Research Brain
- offer decision belongs to Affiliate Brain
- campaign issue belongs to Ads Brain
- cost issue belongs to Finance Brain
- experiment issue belongs to Experimentation Brain
- system issue belongs to Operations Brain
- compliance issue belongs to Risk Brain
- framework update belongs in MCR
The picker should output:
- destination Brain
- reason for routing
- priority
- urgency
- expected next action
Action Type: Escalate
Use escalate when the workflow requires higher authority.
Escalate is appropriate when:
- financial risk exists
- compliance risk exists
- client-facing risk exists
- evidence conflicts
- confidence is low
- tool failures block progress
- database failures block progress
- loop limits were reached
- AI Employee lacks authority
- human judgement is required
The picker should output:
- escalation reason
- risk level
- current findings
- missing information
- recommended human action
Action Type: Retry
Use retry only when a failed action has a reasonable chance of succeeding.
Retry is appropriate when:
- temporary tool failure occurred
- source fetch failed once
- database call failed temporarily
- output format failed and can be corrected
- rate limit may reset within allowed workflow logic
Retry is not appropriate when:
- max retries reached
- failure is not temporary
- permission was denied
- source is blocked
- action is structurally wrong
- repeated retry would waste cost
The picker should output:
- failed action
- retry reason
- retry count
- changed input if any
- fallback if retry fails
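The retry conditions above could be reduced to a simple eligibility check. In this sketch the failure labels in `TEMPORARY_FAILURES`, the `FailedAction` record, and the `may_retry` helper are all hypothetical; which failures count as temporary is a local policy choice.

```python
from dataclasses import dataclass


@dataclass
class FailedAction:
    action_type: str
    failure_kind: str  # hypothetical labels, e.g. "timeout" or "permission_denied"
    retry_count: int


# Assumption: only these failure kinds are treated as temporary.
TEMPORARY_FAILURES = {"timeout", "rate_limited", "transient_error", "bad_output_format"}


def may_retry(failed: FailedAction, max_retries: int) -> bool:
    """Allow a retry only for temporary failures that are still under the retry limit."""
    if failed.retry_count >= max_retries:
        return False
    return failed.failure_kind in TEMPORARY_FAILURES
```

Permission denials and blocked sources never appear in the temporary set, so they fall through to a fallback action instead of consuming retries.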
Action Type: Park
Use park when the workflow should pause without being rejected.
Park is appropriate when:
- information is not available yet
- source freshness cannot be confirmed
- user decision is needed later
- task is useful but not urgent
- budget is not available
- workflow depends on external timing
- more course or business context is needed
The picker should output:
- park reason
- recommended review date if known
- missing condition
- destination record
Action Type: Reject
Use reject when the item, source, path, or opportunity should not proceed.
Reject is appropriate when:
- evidence shows poor fit
- source is unreliable
- offer is unsuitable
- compliance risk is too high
- task is outside scope
- duplicated item exists
- action would waste resources
The picker should output:
- rejection reason
- evidence summary
- risk note
- whether record should be archived
Action Type: Request Human Review
Use request_human_review when the AI should not proceed alone.
Human review is required when:
- output affects budget
- output affects campaign launch
- output affects public claims
- output affects client-facing work
- compliance risk exists
- financial risk exists
- evidence is weak
- sources conflict
- trace quality is low
- AI confidence is low
- AI confidence appears too high for evidence
The picker should output:
- review reason
- what the human must decide
- current evidence
- risk level
- recommended next action
Action Type: Request More Information
Use request_more_information when the workflow cannot proceed safely without missing information.
This is appropriate when:
- user request is ambiguous
- key business context is missing
- required input is unavailable
- the AI cannot identify the correct Brain or workflow
- the output would be guesswork
The picker should output:
- missing information
- why it matters
- suggested question or field needed
This action should not be overused when the AI can make a safe, bounded assumption.
Action Type: Stop
Use stop when the loop must end.
Stop is appropriate when:
- max steps reached
- max cost reached
- max time reached
- repeated failures occurred
- task completed
- required escalation occurred
- no useful next action exists
- workflow is invalid
- permission boundary blocks progress
Stop must include a reason.
The loop should then produce one of:
- final answer
- partial answer
- escalation record
- parked record
- failed record
- routed task
The loop must never stop silently.
Stop Vs Answer Rule
Stop and answer are not the same.
Answer means the system has enough information to produce a response.
Stop means the loop should not continue.
A loop can stop because:
- answer is ready
- escalation is required
- limit was reached
- task failed
- task was parked
- no safe action exists
If the picker chooses stop, it must also define the controlled outcome.
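A stop decision can be checked for a reason and a controlled outcome before the loop is allowed to end. The `outcome` key used below is a hypothetical addition for illustration; it is not part of the recommended output structure in this standard.

```python
# Outcome names follow the list of controlled outcomes for a stopped loop.
STOP_OUTCOMES = {
    "final_answer",
    "partial_answer",
    "escalation_record",
    "parked_record",
    "failed_record",
    "routed_task",
}


def validate_stop(decision: dict) -> list[str]:
    """Return problems with a stop decision; an empty list means it is acceptable."""
    errors: list[str] = []
    if decision.get("action_type") != "stop":
        return errors
    if not str(decision.get("reason") or "").strip():
        errors.append("stop has no reason")
    # "outcome" is a hypothetical field name used only in this sketch.
    if decision.get("outcome") not in STOP_OUTCOMES:
        errors.append("stop has no controlled outcome")
    return errors
```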
Escalation Priority Rule
Escalation overrides normal action selection when risk is high.
The picker must prefer escalation over continued autonomous action when:
- compliance risk is high
- financial risk is high
- public claim risk is high
- client impact is high
- source conflict is unresolved
- required permissions are missing
- evaluation failure is serious
- trace quality is insufficient for the workflow risk
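The override can be sketched as a guard applied after the picker proposes an action. The flag names in `ESCALATION_TRIGGERS` and the `apply_escalation_priority` helper are assumptions chosen to mirror the list above; how each flag gets set is outside this sketch.

```python
# Hypothetical flag names, one per escalation trigger in the list above.
ESCALATION_TRIGGERS = (
    "compliance_risk_high",
    "financial_risk_high",
    "public_claim_risk_high",
    "client_impact_high",
    "source_conflict_unresolved",
    "permissions_missing",
    "evaluation_failure_serious",
    "trace_quality_insufficient",
)


def apply_escalation_priority(proposed_action: str, risk_flags: dict[str, bool]) -> str:
    """Replace the proposed action with escalate when any trigger is set."""
    if any(risk_flags.get(name, False) for name in ESCALATION_TRIGGERS):
        return "escalate"
    return proposed_action
```

Running the guard outside the picker prompt keeps the override deterministic: a set trigger wins even if the model proposed something else.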
Evidence Sufficiency Rule
Before choosing answer, the picker must consider whether evidence is sufficient.
Evidence sufficiency depends on:
- task risk
- source quality
- source freshness
- source count
- source diversity
- inspected source depth
- internal record support
- conflict detection
- confidence threshold
- required decision quality
For low-risk tasks, one internal record may be enough.
For Deep Search or compliance tasks, multiple inspected and current sources may be required.
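To make the rule concrete, here is one possible sufficiency check keyed on risk level. The thresholds in `MIN_INSPECTED_SOURCES` are invented for illustration and are not MWMS policy; `EvidenceState` and `evidence_is_sufficient` are hypothetical names, and only a few of the listed factors are modelled.

```python
from dataclasses import dataclass


@dataclass
class EvidenceState:
    inspected_sources: int
    internal_records: int
    all_sources_current: bool
    unresolved_conflict: bool


# Assumed thresholds for illustration only.
MIN_INSPECTED_SOURCES = {"medium": 1, "high": 2}


def evidence_is_sufficient(evidence: EvidenceState, risk_level: str) -> bool:
    """Decide whether answer may be chosen; unknown risk levels raise KeyError."""
    if evidence.unresolved_conflict:
        return False
    if risk_level == "low":
        # For low-risk tasks, one internal record may be enough.
        return evidence.internal_records >= 1 or evidence.inspected_sources >= 1
    required = MIN_INSPECTED_SOURCES[risk_level]
    return evidence.inspected_sources >= required and evidence.all_sources_current
```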
Cost And Limit Awareness Rule
The Next Action Picker must be aware of limits.
It should consider:
- current step
- maximum steps
- searches already used
- sources already inspected
- retries already used
- cost already used
- latency already used
- token budget
- tool rate limits
The picker should not choose expensive actions when the workflow value does not justify the cost.
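Limit awareness could be implemented as a filter over candidate actions. The sketch below tracks only steps, searches, and cost; `LoopUsage`, `limits_reached`, and `action_allowed` are hypothetical names, and the set of actions still permitted after a hard limit is an assumption.

```python
from dataclasses import dataclass


@dataclass
class LoopUsage:
    current_step: int
    max_steps: int
    searches_used: int
    max_searches: int
    cost_used: float
    max_cost: float


def limits_reached(usage: LoopUsage) -> list[str]:
    """Name every limit that has been reached."""
    reached = []
    if usage.current_step >= usage.max_steps:
        reached.append("steps")
    if usage.searches_used >= usage.max_searches:
        reached.append("searches")
    if usage.cost_used >= usage.max_cost:
        reached.append("cost")
    return reached


def action_allowed(action_type: str, usage: LoopUsage) -> bool:
    """Block actions that would exceed a limit that is already reached."""
    reached = limits_reached(usage)
    if "steps" in reached or "cost" in reached:
        # Assumption: only loop-ending actions remain once a hard limit is hit.
        return action_type in {"answer", "escalate", "park", "stop"}
    if action_type == "search" and "searches" in reached:
        return False
    return True
```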
No Duplicate Action Rule
The picker should avoid repeating actions unless there is a clear reason.
Examples of duplicate action problems:
- searching the same query again
- inspecting the same URL again
- retrying a permanently failed tool
- asking the same clarification twice
- routing to the same Brain repeatedly
- generating final answer multiple times without new evidence
Duplicate actions should be flagged in action history.
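Duplicate detection can be as simple as comparing the proposed action and its normalised target against action history. This sketch assumes history entries are dictionaries with `action_type` and `target` keys; `normalise` and `is_duplicate` are hypothetical helpers, and the normalisation is deliberately crude.

```python
def normalise(target: str) -> str:
    """Lower-case and collapse whitespace so trivial variations still match."""
    return " ".join(target.lower().split())


def is_duplicate(action_type: str, target: str, history: list[dict]) -> bool:
    """True if the same action was already taken on the same target."""
    key = (action_type, normalise(target))
    return any(
        (past["action_type"], normalise(past["target"])) == key for past in history
    )
```

A flagged duplicate does not have to be blocked outright; the loop could instead require the picker to state a clear reason before the repeat is allowed.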
Original Goal Preservation Rule
The picker must preserve the original workflow goal.
It should not drift into related but irrelevant tasks.
Before selecting an action, it should check:
- Does this action move the original task forward?
- Does this action answer the real question?
- Does this action support the expected output?
- Does this action belong to the assigned Brain?
- Does this action help reach a decision?
This prevents agent drift.
Risk Level Rule
Action selection must consider risk level.
Low Risk
The picker may choose answer sooner if context is enough.
Medium Risk
The picker should prefer source or database support before answering.
High Risk
The picker should prefer evidence, review, or escalation before final output.
High-risk workflows include:
- compliance review
- ad claim review
- budget recommendation
- affiliate offer approval
- client-facing advice
- policy-sensitive decisions
- legal, financial, or medical topics
Confidence Rule
Action selection confidence must be separate from final answer confidence.
Example:
- The picker may be highly confident that the next action should be search.
- But the final answer confidence may still be low because evidence has not been gathered.
The picker confidence should reflect:
- clarity of next step
- evidence state
- risk level
- available actions
- previous failures
- remaining limits
Tool Permission Rule
The picker must check tool permission before choosing a tool-requiring action.
If the action requires a tool not allowed for the AI Employee, the picker must choose one of:
- route
- escalate
- request human review
- answer with limitations
- stop with reason
It must not bypass permission rules.
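The permission check can sit between the picker and the executor. In the sketch below, the `ACTION_TOOL` mapping and its tool names are hypothetical, as is the `enforce_tool_permission` helper; the fallback options follow the list above, with answer and stop standing for "answer with limitations" and "stop with reason".

```python
# Hypothetical mapping from action types to the tool each one needs.
ACTION_TOOL = {
    "search": "web_search",
    "inspect_source": "scraper",
    "query_database": "database",
}

FALLBACK_ACTIONS = ("route", "escalate", "request_human_review", "answer", "stop")


def enforce_tool_permission(
    action_type: str, allowed_tools: set[str], fallback: str = "escalate"
) -> str:
    """Return the action if its tool is permitted, otherwise the chosen fallback."""
    if fallback not in FALLBACK_ACTIONS:
        raise ValueError(f"unsupported fallback: {fallback}")
    required_tool = ACTION_TOOL.get(action_type)
    if required_tool is None or required_tool in allowed_tools:
        return action_type
    return fallback
```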
Human Review Flag Rule
The picker should flag human review early when risk appears.
Human review should not only happen after the final answer.
If the picker detects risk mid-loop, it should select request_human_review or escalate.
This avoids wasted tool calls and unsafe autonomous continuation.
Output Format Rule
The Next Action Picker must return predictable output.
Freeform explanations are not enough.
A valid picker output should include:
- action type
- reason
- target
- confidence
- review flag
- expected result
This allows MWMS to evaluate whether the picker made a good decision.
Suggested Prompt Structure
A Next Action Picker prompt should include:
- Role
- Workflow goal
- Allowed actions
- Current context
- Action history
- Evidence state
- Limits and stop conditions
- Risk rules
- Selection criteria
- Output schema
Example structure:
You are the MWMS Next Action Picker for [AI Employee].
Your job is to choose the next approved action only.
Workflow Goal:
[goal]
Allowed Actions:
[action list]
Current Context:
[context summary]
Action History:
[action history]
Limits:
[step/cost/time/source limits]
Risk Rules:
[review/escalation rules]
Return structured output only.
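The example structure above could be assembled by a small builder so that every picker call uses the same section order. `build_picker_prompt` and its parameters are hypothetical; the section labels are copied from the example structure.

```python
def build_picker_prompt(
    employee: str,
    goal: str,
    allowed_actions: list[str],
    context_summary: str,
    action_history: list[str],
    limits: str,
    risk_rules: str,
) -> str:
    """Fill the example prompt structure with the current loop state."""
    history = "\n".join(f"- {item}" for item in action_history) or "- none"
    sections = [
        f"You are the MWMS Next Action Picker for {employee}.",
        "Your job is to choose the next approved action only.",
        f"Workflow Goal:\n{goal}",
        f"Allowed Actions:\n{', '.join(allowed_actions)}",
        f"Current Context:\n{context_summary}",
        f"Action History:\n{history}",
        f"Limits:\n{limits}",
        f"Risk Rules:\n{risk_rules}",
        "Return structured output only.",
    ]
    return "\n\n".join(sections)
```

Generating the prompt from one function also gives prompt changes a single place to be versioned and tested.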
Evaluation Requirements
Next Action Picker decisions should be evaluated.
Evaluation questions:
- Did it choose the correct action?
- Did it answer too early?
- Did it search unnecessarily?
- Did it inspect the right source?
- Did it avoid duplicate actions?
- Did it respect limits?
- Did it escalate when required?
- Did it preserve the original goal?
- Did it choose from approved actions only?
- Did it provide a useful reason?
- Did it include required output fields?
This links directly to the MWMS AI Employee Evaluation Scorecard Standard.
Deterministic Checks
Some picker outputs can be checked deterministically.
Examples:
- action_type is present
- action_type is approved
- reason is present
- confidence is present
- target is present where required
- requires_tool is present
- human review flag is present
- stop condition flag is present
- output is valid structured format
These should be tested before LLM-as-judge evaluation.
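A deterministic checker for the examples above might look like the following sketch. It assumes the picker output arrives as a JSON string; `check_picker_output` is a hypothetical name, and the `TARGET_REQUIRED` set (which actions need a target) is an assumption, since this standard does not enumerate them.

```python
import json

APPROVED = {
    "search", "inspect_source", "query_database", "evaluate_evidence", "answer",
    "route", "escalate", "retry", "park", "reject", "request_human_review",
    "request_more_information", "stop",
}
# Assumption: these actions are meaningless without a target.
TARGET_REQUIRED = {"search", "inspect_source", "query_database", "route"}
FLAG_FIELDS = ("requires_tool", "requires_human_review", "stop_condition_triggered")


def check_picker_output(raw: str) -> list[str]:
    """Return every deterministic failure; an empty list means the output passes."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return ["output is not valid JSON"]
    if not isinstance(data, dict):
        return ["output is not a JSON object"]
    errors = []
    action = data.get("action_type")
    if not isinstance(action, str) or not action:
        errors.append("action_type is missing")
        action = None
    elif action not in APPROVED:
        errors.append(f"action_type '{action}' is not approved")
    if not str(data.get("reason") or "").strip():
        errors.append("reason is missing")
    if "confidence" not in data:
        errors.append("confidence is missing")
    if action in TARGET_REQUIRED and not str(data.get("target") or "").strip():
        errors.append("target is missing")
    for flag in FLAG_FIELDS:
        if not isinstance(data.get(flag), bool):
            errors.append(f"{flag} is missing or not a boolean")
    return errors
```

Outputs that fail here can be rejected without spending an LLM-as-judge call on them.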
LLM As Judge Checks
Some picker decisions require judgement.
Examples:
- Was search actually needed?
- Was the chosen source worth inspecting?
- Was it too early to answer?
- Should it have escalated?
- Did it misunderstand the original goal?
- Was the action reason strong?
- Did the selected action move the workflow forward?
These can use LLM-as-judge evals with structured categories.
Suggested Picker Judgement Categories
| Category | Meaning |
|---|---|
| Correct Action | Best available next action |
| Acceptable Action | Reasonable, though not ideal |
| Weak Action | May work but misses a better option |
| Premature Answer | Tried to answer too early |
| Unnecessary Search | Searched when enough context existed |
| Missed Escalation | Continued when review was required |
| Goal Drift | Action did not support the original task |
| Invalid Action | Chose an action outside the approved list |
| Limit Violation | Ignored step, cost, time, or permission limits |
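Keeping the judge's labels inside a closed set makes eval results countable across runs. The sketch below assumes the judge returns the category name as plain text; `PickerJudgement` and `summarise_judgements` are hypothetical names.

```python
from collections import Counter
from enum import Enum


class PickerJudgement(str, Enum):
    CORRECT_ACTION = "Correct Action"
    ACCEPTABLE_ACTION = "Acceptable Action"
    WEAK_ACTION = "Weak Action"
    PREMATURE_ANSWER = "Premature Answer"
    UNNECESSARY_SEARCH = "Unnecessary Search"
    MISSED_ESCALATION = "Missed Escalation"
    GOAL_DRIFT = "Goal Drift"
    INVALID_ACTION = "Invalid Action"
    LIMIT_VIOLATION = "Limit Violation"


def summarise_judgements(labels: list[str]) -> Counter:
    """Count judge labels; a label outside the category table raises ValueError."""
    return Counter(PickerJudgement(label) for label in labels)
```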
Failure Conditions
A Next Action Picker decision should be marked failed if:
- it chooses an unapproved action
- it chooses a tool the AI Employee cannot use
- it answers without required evidence
- it ignores stop conditions
- it ignores escalation rules
- it repeats failed actions without reason
- it routes to the wrong Brain
- it loses the original goal
- it fails to output structured data
- it hides uncertainty
- it chooses an action with no clear reason
- it causes avoidable cost waste
- it bypasses human review requirements
Action Picker Trace Requirements
Every Next Action Picker decision should be logged.
Recommended trace fields:
- trace ID
- loop ID
- task ID
- Brain
- AI Employee
- workflow type
- step number
- allowed actions
- selected action
- reason
- target
- confidence
- risk note
- requires tool
- requires human review
- stop condition triggered
- escalation required
- previous action
- next status
- evaluation result
- failure reason if any
This connects to the MWMS AI Observability Metadata Standard.
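A trace entry could be built as one JSON record per decision. This sketch covers a subset of the recommended trace fields; `build_trace_record` is a hypothetical helper, the key names are assumptions, and generating the trace ID with a random UUID is one option among many.

```python
import json
import uuid
from typing import Optional


def build_trace_record(
    loop_id: str,
    task_id: str,
    brain: str,
    step_number: int,
    allowed_actions: list[str],
    decision: dict,
    previous_action: Optional[str] = None,
) -> str:
    """Serialise one picker decision as a JSON trace record."""
    record = {
        "trace_id": str(uuid.uuid4()),
        "loop_id": loop_id,
        "task_id": task_id,
        "brain": brain,
        "step_number": step_number,
        "allowed_actions": allowed_actions,
        "selected_action": decision["action_type"],
        "reason": decision["reason"],
        "target": decision.get("target", ""),
        "confidence": decision.get("confidence"),
        "requires_human_review": decision.get("requires_human_review", False),
        "stop_condition_triggered": decision.get("stop_condition_triggered", False),
        "escalation_required": decision.get("escalation_required", False),
        "previous_action": previous_action,
    }
    return json.dumps(record)
```

Recording the allowed actions next to the selected action lets a later evaluation confirm the choice was inside the approved set at that step.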
Relationship To Agent Action Registry
This standard may later require an MWMS Agent Action Registry.
The registry would define:
- approved action names
- action descriptions
- allowed tools
- required inputs
- required outputs
- allowed Brains
- risk level
- review requirement
- evaluation rules
Do not create the registry until implementation requires it.
For now, the action list inside this standard is sufficient.
Relationship To Agent Stop Conditions
This standard may later require an MWMS Agent Stop Condition Standard.
The stop condition page would define:
- max steps
- max searches
- max sources
- max retries
- cost limits
- time limits
- source conflict rules
- tool failure rules
- escalation thresholds
- forced final answer rules
Do not create the stop condition page until the course or implementation gives more detail.
For now, stop condition rules inside this standard are sufficient.
Relationship To Agent Loop Context
This standard depends on the loop context container defined in the MWMS Agent Loop Control Framework.
The picker cannot make good decisions without:
- original request
- workflow goal
- current context
- action history
- evidence state
- limits
- risk rules
- allowed actions
If the context container is weak, the picker will be weak.
Relationship To Prompt Optimisation
Prompt changes to the picker must be tested.
Do not randomly rewrite the picker prompt without checking:
- action choice accuracy
- answer timing
- search behaviour
- escalation behaviour
- routing behaviour
- cost impact
- regression failures
Prompt optimisation must be evaluation-driven.
Relationship To Kaizen
Every failed or weak Next Action Picker decision should become Kaizen learning.
Kaizen notes may include:
- picker answered too early
- picker searched unnecessarily
- picker missed source conflict
- picker ignored freshness
- picker failed to escalate
- picker routed incorrectly
- picker repeated action
- picker chose expensive path
- picker misunderstood goal
These failures should feed:
- prompt improvements
- action list changes
- stop condition changes
- regression tests
- AI Employee restriction or promotion decisions
Minimum Starting Implementation
MWMS does not need a complex picker system immediately.
The first version should require:
- approved action list
- structured action_type output
- reason field
- confidence field
- human review flag
- stop condition flag
- action history logging
- invalid action detection
- basic evaluation of action choice
This is enough to begin controlled agent loops.
Future Enhancements
Future enhancements may include:
- MWMS Agent Action Registry
- MWMS Agent Stop Condition Standard
- MWMS Agent Loop Context Schema
- MWMS Next Action Picker Eval Dataset
- MWMS Agent Loop Dashboard Specification
- MWMS AI Employee Autonomy Level Standard
- MWMS Prompt Versioning And Parameter Control Standard
- MWMS Agent Loop Regression Test Library
These should be created only when course material or implementation pressure justifies them.
Drift Protection
This standard prevents the following drift:
- AI Employees choosing actions freely
- uncontrolled tool use
- hidden decision logic
- repeated searches
- premature answers
- missed escalation
- weak evidence being treated as enough
- high-cost action loops
- action choices without reasons
- picker decisions that cannot be evaluated
- prompt changes breaking action logic
- agents drifting away from the original task
- Brains receiving wrongly routed work
- human review being bypassed
If the Next Action Picker cannot be controlled, traced, and evaluated, the AI Employee should not be given more autonomy.
Architectural Intent
The architectural intent of this standard is to make AI Employee autonomy safe, bounded, and reviewable.
MWMS does not want uncontrolled AI agents.
MWMS wants controlled AI Employees that can choose the next best action within a governed system.
The Next Action Picker is the decision gate inside the agent loop.
It gives MWMS a practical way to balance autonomy and control.
The AI can think.
MWMS sets the boundaries.
HeadOffice governs the system.
Change Log
v1.0 Initial Draft
Created the MWMS Next Action Picker Standard based on absorbed insights from Matt Pocock AIhero Build DeepSearch In TypeScript.
Integrated principles from course blocks covering:
- controlled agent-loop design
- replacing overloaded prompts
- approved action selection
- next-action picker pattern
- structured action output
- action confidence
- stop condition awareness
- escalation awareness
- search, scrape, answer, route, stop, and review decisions
- action-level observability
- action-level evaluation
- prompt optimisation through eval results
- Kaizen routing for failed action decisions
Established this standard as the MWMS governance page for controlled next-action selection inside AI Employee agent loops.