MWMS Next Action Picker Standard

System: MWMS
Document Type: Standard
Status: Draft For MCR
Authority: HeadOffice
Applies To: All MWMS Brains, AI Employees, Agent Loops, Deep Search Workflows, Research Workflows, Affiliate Evaluation, Ads Review, Content Workflows, Experimentation Brain, HeadOffice Intelligence, Future Client Facing AI Systems
Primary Location: MCR
Future Operational Destination: mwmsbrain.site, mwmsheadofficebrain.site, AI Employee Dashboards, Future Agent Control Dashboards
Parent Page: HeadOffice
Source Of Truth: MCR
Related Frameworks: MWMS Agent Loop Control Framework, MWMS Deep Search Quality And Observability Framework, MWMS AI Observability Metadata Standard, MWMS AI Employee Evaluation Scorecard Standard
Course Source: Matt Pocock AIhero Build DeepSearch In TypeScript
Absorption Status: Approved For Integration

Purpose

The purpose of this standard is to define how MWMS AI Employees choose their next action inside a controlled agent loop.

A Next Action Picker is the decision layer that determines what the AI Employee should do next based on the workflow goal, current context, previous actions, evidence gathered, limits, failures, and success criteria.

This standard ensures AI Employees do not wander freely, call tools randomly, answer too early, search forever, ignore weak evidence, or bypass MWMS governance.

The AI Employee may choose the next action, but only from approved MWMS action types.

Scope

This standard applies to any MWMS workflow where an AI Employee performs multi-step work.

This includes:

Deep Search workflows
Research Brain source investigation
Affiliate Brain offer evaluation
Ads Brain compliance review
Content Brain research and production workflows
HeadOffice Intelligence processing
Newsletter Intelligence routing
Experimentation Brain analysis
Data Brain validation
Finance Brain cost review
Brain Room AI assistance
Dev Console AI assistance
future client-facing AIBS agents
future autonomous AI Employees
future AI workflows using search, scraping, databases, routing, escalation, or task creation

This standard does not define the exact code implementation.

It defines the MWMS governance standard for controlled action selection.

Core Rule

The core rule is:

An AI Employee may choose what to do next, but only from actions MWMS has approved for that workflow, Brain, role, and risk level.

The Next Action Picker must not invent new actions.

It must not use tools outside its permission boundary.

It must not continue working when stop, escalation, or review conditions are met.

It must not answer if the workflow requires more evidence.

It must not search or scrape if the cost, time, source, or step limit has been reached.

Definition Of A Next Action Picker

A Next Action Picker is a controlled decision component inside an AI Employee loop.

It reviews the current workflow state and returns a structured decision such as:

search
inspect source
query database
answer
route
escalate
retry
park
stop
request human review

The Next Action Picker does not perform the action itself.

It chooses the next approved action and explains why that action is appropriate.

Why MWMS Needs A Next Action Picker

Without a controlled Next Action Picker, AI Employees may:

search when they should answer
answer when they should gather evidence
inspect weak sources
repeat the same search
miss required database context
ignore the original request
continue after the budget is exceeded
fail to escalate risky decisions
call tools they should not use
stop without producing a controlled outcome
hide uncertainty
route to the wrong Brain
create untraceable outputs

A Next Action Picker gives MWMS controlled autonomy.

It allows the AI to make decisions while keeping those decisions inside MWMS boundaries.

Relationship To Agent Loop Control

This standard is a companion to the MWMS Agent Loop Control Framework.

The Agent Loop Control Framework defines the full loop.

The Next Action Picker Standard defines the decision point inside that loop.

The normal loop pattern is:

Load workflow goal
Load current context
Review action history
Check limits and stop conditions
Use Next Action Picker
Execute selected action
Record result
Update context
Evaluate progress
Continue, answer, route, escalate, or stop

The Next Action Picker is the moment where the system decides what happens next.
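
The loop pattern above can be sketched in TypeScript as follows. This is an illustrative sketch only, not the MWMS implementation: the `LoopState` and `LoopDecision` shapes, the `pickNextAction` and `executeAction` callbacks, and the choice of which actions end the loop are all assumptions made for this example.

```typescript
// Hypothetical shapes for illustration; this standard does not define them.
type LoopAction =
  | "search" | "inspect_source" | "query_database"
  | "answer" | "route" | "escalate" | "stop";

interface LoopDecision {
  action_type: LoopAction;
  reason: string;
}

interface LoopState {
  goal: string;
  step: number;
  maxSteps: number;
  history: LoopDecision[];
  evidence: string[];
}

// Assumed set of actions that end the loop with a controlled outcome.
const TERMINAL_ACTIONS: readonly LoopAction[] = ["answer", "route", "escalate", "stop"];

function runLoop(
  state: LoopState,
  pickNextAction: (s: LoopState) => LoopDecision,
  executeAction: (d: LoopDecision, s: LoopState) => string,
): LoopDecision {
  while (true) {
    // Check limits and stop conditions before asking the picker again.
    if (state.step >= state.maxSteps) {
      const stop: LoopDecision = { action_type: "stop", reason: "max steps reached" };
      state.history.push(stop);
      return stop;
    }
    const decision = pickNextAction(state); // use the Next Action Picker
    state.history.push(decision);           // record the decision
    state.step += 1;
    if (TERMINAL_ACTIONS.some((a) => a === decision.action_type)) return decision;
    state.evidence.push(executeAction(decision, state)); // execute and update context
  }
}
```

The picker and the executor are passed in as separate functions to reflect the rule that the picker chooses an action but does not perform it.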

Required Inputs

A Next Action Picker should receive enough context to make a safe and useful decision.

Recommended inputs:

original user request
workflow goal
Brain name
AI Employee name
workflow type
task ID
thread ID
current step
maximum steps
previous actions
current evidence
search history
source inspection history
database records already used
tool failures
retry count
current confidence
cost used
time used
allowed actions
allowed tools
stop conditions
escalation conditions
required output type
success criteria
risk level
human review rules

The Next Action Picker must always know the original user request or workflow goal.

A picker that does not know the goal cannot reliably choose the correct next action.
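
As a rough sketch of how a loop might guard this rule before calling the picker, the example below checks a small subset of the recommended inputs. The `PickerInput` shape, its field names, and the `validatePickerInput` helper are hypothetical and only illustrate the idea.

```typescript
// Hypothetical subset of the recommended inputs; field names are illustrative.
interface PickerInput {
  originalRequest?: string;
  workflowGoal?: string;
  brainName: string;
  currentStep: number;
  maxSteps: number;
  allowedActions: string[];
  riskLevel: "low" | "medium" | "high";
}

// Returns a list of problems; an empty list means the picker may be called.
function validatePickerInput(input: PickerInput): string[] {
  const problems: string[] = [];
  const hasGoal =
    Boolean(input.originalRequest?.trim()) || Boolean(input.workflowGoal?.trim());
  if (!hasGoal) problems.push("missing original request or workflow goal");
  if (input.allowedActions.length === 0) problems.push("no allowed actions supplied");
  if (input.currentStep > input.maxSteps) problems.push("current step exceeds maximum steps");
  return problems;
}
```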

Required Output

The Next Action Picker must return structured output.

Recommended output structure:

{
  "action_type": "",
  "reason": "",
  "target": "",
  "input_summary": "",
  "confidence": 0,
  "requires_tool": false,
  "requires_human_review": false,
  "expected_result": "",
  "risk_note": "",
  "stop_condition_triggered": false,
  "escalation_required": false
}

This structure allows MWMS to log, evaluate, and improve action decisions.
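
One way to hold this structure in code is a typed interface plus a small parser that rejects anything that does not match. The sketch below assumes the picker returns JSON text. The TypeScript types are inferred from the example values above, the confidence scale is left open because this standard does not specify one, and `parsePickerOutput` is a hypothetical helper name.

```typescript
// Field names follow the recommended output structure; the types are assumptions.
interface PickerOutput {
  action_type: string;
  reason: string;
  target: string;
  input_summary: string;
  confidence: number; // scale is not specified by this standard
  requires_tool: boolean;
  requires_human_review: boolean;
  expected_result: string;
  risk_note: string;
  stop_condition_triggered: boolean;
  escalation_required: boolean;
}

const STRING_FIELDS = [
  "action_type", "reason", "target", "input_summary", "expected_result", "risk_note",
];
const BOOLEAN_FIELDS = [
  "requires_tool", "requires_human_review", "stop_condition_triggered", "escalation_required",
];

// Parses raw picker text into a PickerOutput, or returns null if it is not usable.
function parsePickerOutput(raw: string): PickerOutput | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch {
    return null; // not valid JSON
  }
  if (typeof data !== "object" || data === null) return null;
  const d = data as Record<string, unknown>;
  const valid =
    STRING_FIELDS.every((k) => typeof d[k] === "string") &&
    BOOLEAN_FIELDS.every((k) => typeof d[k] === "boolean") &&
    typeof d["confidence"] === "number";
  return valid ? (d as unknown as PickerOutput) : null;
}
```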

Required Output Fields

action_type

Defines the selected next action.

Examples:

search
inspect_source
query_database
answer
route
escalate
retry
park
stop
request_human_review

reason

Explains why this action was selected.

This should be short but specific.

Bad reason:

Need more info.

Better reason:

The current evidence only comes from search snippets, so one official source should be inspected before answering.

target

Defines what the action should act on.

Examples:

search query
URL
database table
source record
task ID
Brain name
workflow stage

input_summary

Summarises the input that should be passed to the action.

confidence

States the picker’s confidence that this is the correct next action.

This is not final answer confidence.

It is action-selection confidence.

requires_tool

Marks whether the next action requires a tool.

requires_human_review

Marks whether the next action requires human approval or review.

expected_result

Defines what the selected action is expected to produce.

risk_note

Captures any risk attached to the selected action.

stop_condition_triggered

Marks whether a stop condition has been reached.

escalation_required

Marks whether escalation is required.

Approved Action Types

MWMS should maintain approved action types.

Initial approved action types include:

Action Type	Purpose
search	Find external sources or information
inspect_source	Open, scrape, crawl, or inspect a selected source
query_database	Retrieve MWMS internal records
evaluate_evidence	Judge whether evidence is enough
answer	Produce final or partial answer
route	Send result or task to another Brain
escalate	Raise to human or higher authority
retry	Repeat a failed action under limits
park	Pause or hold for later
reject	Reject unsuitable item or path
request_human_review	Ask human to review before proceeding
request_more_information	Ask for missing user/system information
stop	End the loop with reason

Future actions may be added only when HeadOffice approves them.
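
A closed list like this maps naturally onto a TypeScript union type, which makes unapproved action names detectable at runtime and at compile time. The sketch below takes the action names from the table above. The `isApprovedAction` and `isSelectable` helpers and the idea of a per-workflow allowlist parameter are assumptions for illustration.

```typescript
// The thirteen initial action types from the table above.
const APPROVED_ACTIONS = [
  "search", "inspect_source", "query_database", "evaluate_evidence", "answer",
  "route", "escalate", "retry", "park", "reject",
  "request_human_review", "request_more_information", "stop",
] as const;

type ApprovedAction = (typeof APPROVED_ACTIONS)[number];

// Type guard: true only for names on the approved list.
function isApprovedAction(name: string): name is ApprovedAction {
  return APPROVED_ACTIONS.some((a) => a === name);
}

// An action is selectable only if it is approved by MWMS AND allowed for this workflow.
function isSelectable(name: string, allowedForWorkflow: readonly ApprovedAction[]): boolean {
  return isApprovedAction(name) && allowedForWorkflow.some((a) => a === name);
}
```

For example, `isSelectable("search", ["answer", "stop"])` is false even though search is an approved action, because the workflow does not allow it.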

Action Type: Search

Use search when external information is needed and current context is insufficient.

Search is appropriate when:

the task depends on current information
source evidence is required
internal records are not enough
market, platform, policy, or product facts may have changed
source diversity is needed
the workflow is a Deep Search task

Search is not appropriate when:

the answer can be produced from existing verified context
the workflow has already searched enough
search budget has been reached
the query is too vague and clarification is required first
the action would duplicate an earlier search

The picker should output:

search query
reason for search
expected evidence type
freshness need
source preference if relevant

Action Type: Inspect Source

Use inspect_source when search snippets are not enough.

Inspect source is appropriate when:

a source appears important
final answer needs evidence
source details must be verified
source freshness must be checked
source reliability must be assessed
a vendor page, policy page, or official page must be inspected
the workflow depends on page content, not just search results

Inspect source is not appropriate when:

the source is clearly low quality
the source is irrelevant
the source inspection limit has been reached
the workflow does not require source evidence
a higher-trust source should be searched first

The picker should output:

URL or source ID
reason for inspection
evidence expected
source risk note
whether official source is preferred

Action Type: Query Database

Use query_database when internal MWMS records may answer or support the workflow.

Query database is appropriate when:

task history matters
source records already exist
offer records already exist
campaign records already exist
previous evaluations exist
Brain Room context is needed
newsletter records are needed
experiment records are needed
cost or finance records are needed
prior regression failures are relevant

Query database is not appropriate when:

no internal context is required
the AI Employee lacks permission
the query would access unrelated data
external current information is required first

The picker should output:

database or system target
record type
reason for query
expected record needed
permission requirement

Action Type: Evaluate Evidence

Use evaluate_evidence when the system must decide whether collected evidence is enough to answer.

Evaluate evidence is appropriate when:

multiple sources have been inspected
sources conflict
evidence may be weak
confidence needs calibration
a decision depends on source quality
the system may need to continue or stop

The picker should output:

evidence to evaluate
evaluation purpose
required criteria
expected decision

Action Type: Answer

Use answer when enough context exists to produce a final or partial response.

Answer is appropriate when:

evidence is sufficient
required searches are complete
source quality is acceptable
database context is available
confidence is high enough
limits require a final answer with caveats
no further action is useful

Answer is not appropriate when:

evidence is weak
key sources have not been inspected
the task is unclear
current information is required but not confirmed
safety or compliance review is needed first
the loop has missing required metadata

The picker should output:

answer type
confidence basis
limitation note if needed
whether human review is required

Action Type: Route

Use route when another Brain, workflow, or AI Employee should take over.

Route is appropriate when:

source evaluation belongs to Research Brain
offer decision belongs to Affiliate Brain
campaign issue belongs to Ads Brain
cost issue belongs to Finance Brain
experiment issue belongs to Experimentation Brain
system issue belongs to Operations Brain
compliance issue belongs to Risk Brain
framework update belongs in MCR

The picker should output:

destination Brain
reason for routing
priority
urgency
expected next action

Action Type: Escalate

Use escalate when the workflow requires higher authority.

Escalate is appropriate when:

financial risk exists
compliance risk exists
client-facing risk exists
evidence conflicts
confidence is low
tool failures block progress
database failures block progress
loop limits were reached
AI Employee lacks authority
human judgement is required

The picker should output:

escalation reason
risk level
current findings
missing information
recommended human action

Action Type: Retry

Use retry only when a failed action has a reasonable chance of succeeding.

Retry is appropriate when:

temporary tool failure occurred
source fetch failed once
database call failed temporarily
output format failed and can be corrected
rate limit may reset within allowed workflow logic

Retry is not appropriate when:

max retries reached
failure is not temporary
permission was denied
source is blocked
action is structurally wrong
repeated retry would waste cost

The picker should output:

failed action
retry reason
retry count
changed input if any
fallback if retry fails
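
The retry conditions above can be expressed as a small eligibility check. This is a sketch under assumptions: the `FailureKind` labels are hypothetical names for the conditions listed in this section, and the split into retryable and non-retryable kinds follows the two lists above.

```typescript
// Hypothetical failure labels; this standard lists conditions, not codes.
type FailureKind =
  | "temporary_tool_failure"
  | "source_fetch_failed"
  | "database_timeout"
  | "output_format_invalid"
  | "rate_limited"
  | "permission_denied"
  | "source_blocked"
  | "structurally_wrong";

// Failure kinds treated as temporary, per the "appropriate when" list.
const RETRYABLE: readonly FailureKind[] = [
  "temporary_tool_failure", "source_fetch_failed", "database_timeout",
  "output_format_invalid", "rate_limited",
];

interface RetryDecision {
  retry: boolean;
  reason: string;
}

function shouldRetry(kind: FailureKind, retryCount: number, maxRetries: number): RetryDecision {
  if (retryCount >= maxRetries) return { retry: false, reason: "max retries reached" };
  if (!RETRYABLE.some((k) => k === kind)) {
    return { retry: false, reason: `${kind} is not a temporary failure` };
  }
  return { retry: true, reason: `${kind} is treated as temporary; retry ${retryCount + 1} of ${maxRetries}` };
}
```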

Action Type: Park

Use park when the workflow should pause without being rejected.

Park is appropriate when:

information is not available yet
source freshness cannot be confirmed
user decision is needed later
task is useful but not urgent
budget is not available
workflow depends on external timing
more course or business context is needed

The picker should output:

park reason
recommended review date if known
missing condition
destination record

Action Type: Reject

Use reject when the item, source, path, or opportunity should not proceed.

Reject is appropriate when:

evidence shows poor fit
source is unreliable
offer is unsuitable
compliance risk is too high
task is outside scope
duplicate item exists
action would waste resources

The picker should output:

rejection reason
evidence summary
risk note
whether record should be archived

Action Type: Request Human Review

Use request_human_review when the AI should not proceed alone.

Human review is required when:

output affects budget
output affects campaign launch
output affects public claims
output affects client-facing work
compliance risk exists
financial risk exists
evidence is weak
sources conflict
trace quality is low
AI confidence is low
AI confidence appears too high for evidence

The picker should output:

review reason
what the human must decide
current evidence
risk level
recommended next action

Action Type: Request More Information

Use request_more_information when the workflow cannot proceed safely without missing information.

This is appropriate when:

user request is ambiguous
key business context is missing
required input is unavailable
the AI cannot identify the correct Brain or workflow
the output would be guesswork

The picker should output:

missing information
why it matters
suggested question or field needed

This action should not be overused when the AI can make a safe, bounded assumption.

Action Type: Stop

Use stop when the loop must end.

Stop is appropriate when:

max steps reached
max cost reached
max time reached
repeated failures occurred
task completed
required escalation occurred
no useful next action exists
workflow is invalid
permission boundary blocks progress

Stop must include a reason.

The loop should then produce one of:

final answer
partial answer
escalation record
parked record
failed record
routed task

No silent stop.

Stop Vs Answer Rule

Stop and answer are not the same.

Answer means the system has enough information to produce a response.

Stop means the loop should not continue.

A loop can stop because:

answer is ready
escalation is required
limit was reached
task failed
task was parked
no safe action exists

If the picker chooses stop, it must also define the controlled outcome.
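
A stop decision that always carries a reason and a controlled outcome could be modelled as below. The outcome names are snake_case versions of the six outcomes listed under Action Type: Stop. The `outcome` field and the `makeStop` helper are hypothetical additions and are not part of the recommended output structure.

```typescript
// The six controlled outcomes listed for the stop action.
type ControlledOutcome =
  | "final_answer" | "partial_answer" | "escalation_record"
  | "parked_record" | "failed_record" | "routed_task";

interface StopDecision {
  action_type: "stop";
  reason: string;
  outcome: ControlledOutcome; // hypothetical field for illustration
}

// Builds a stop decision and refuses a silent stop (empty reason).
function makeStop(reason: string, outcome: ControlledOutcome): StopDecision {
  const trimmed = reason.trim();
  if (trimmed === "") throw new Error("stop requires a reason");
  return { action_type: "stop", reason: trimmed, outcome };
}
```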

Escalation Priority Rule

Escalation overrides normal action selection when risk is high.

The picker must prefer escalation over continued autonomous action when:

compliance risk is high
financial risk is high
public claim risk is high
client impact is high
source conflict is unresolved
required permissions are missing
evaluation failure is serious
trace quality is insufficient for the workflow risk
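
The override can be sketched as a function that runs after the picker proposes an action and replaces the proposal with escalate when a listed condition holds. The `RiskSignals` shape is hypothetical, and for brevity it covers only six of the eight conditions above, leaving out evaluation failure and trace quality.

```typescript
type Level = "low" | "medium" | "high";

// Hypothetical risk snapshot; field names mirror the conditions listed above.
interface RiskSignals {
  complianceRisk: Level;
  financialRisk: Level;
  publicClaimRisk: Level;
  clientImpact: Level;
  unresolvedSourceConflict: boolean;
  missingPermissions: boolean;
}

// Returns "escalate" when any override condition holds, otherwise the proposed action.
function applyEscalationPriority(proposed: string, risk: RiskSignals): string {
  const highRisk =
    risk.complianceRisk === "high" ||
    risk.financialRisk === "high" ||
    risk.publicClaimRisk === "high" ||
    risk.clientImpact === "high";
  const mustEscalate = highRisk || risk.unresolvedSourceConflict || risk.missingPermissions;
  return mustEscalate ? "escalate" : proposed;
}
```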

Evidence Sufficiency Rule

Before choosing answer, the picker must consider whether evidence is sufficient.

Evidence sufficiency depends on:

task risk
source quality
source freshness
source count
source diversity
inspected source depth
internal record support
conflict detection
confidence threshold
required decision quality

For low-risk tasks, one internal record may be enough.

For Deep Search or compliance tasks, multiple inspected and current sources may be required.
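
A simplified sufficiency check might look like the sketch below. It considers only three of the factors listed above (internal records, inspected current sources, and unresolved conflicts). The threshold numbers are placeholders chosen for illustration, not values set by this standard.

```typescript
type TaskRisk = "low" | "medium" | "high";

interface EvidenceState {
  internalRecords: number;         // internal MWMS records supporting the answer
  inspectedCurrentSources: number; // sources opened and confirmed current
  unresolvedConflicts: number;     // conflicts between sources not yet resolved
}

// Placeholder thresholds for illustration; real values would be set per workflow.
const MIN_INSPECTED_SOURCES: Record<TaskRisk, number> = { low: 0, medium: 1, high: 2 };

function isEvidenceSufficient(risk: TaskRisk, e: EvidenceState): boolean {
  if (e.unresolvedConflicts > 0) return false;
  // Low-risk tasks: one internal record (or one inspected source) may be enough.
  if (risk === "low") return e.internalRecords >= 1 || e.inspectedCurrentSources >= 1;
  return e.inspectedCurrentSources >= MIN_INSPECTED_SOURCES[risk];
}
```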

Cost And Limit Awareness Rule

The Next Action Picker must be aware of limits.

It should consider:

current step
maximum steps
searches already used
sources already inspected
retries already used
cost already used
time already used
token budget
tool rate limits

The picker should not choose expensive actions when the workflow value does not justify the cost.
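
One way to make the picker limit-aware is to remove exhausted actions from the allowed list before the picker sees it. The sketch below assumes a `Usage` record with five counters, a `costUsd` unit, and a `CONSUMES` map of which budgets each action draws on. All of these are hypothetical and cover only part of the list above.

```typescript
interface Usage {
  steps: number;
  searches: number;
  sourcesInspected: number;
  retries: number;
  costUsd: number; // assumed unit
}

// Assumed mapping from action type to the budgets it consumes.
const CONSUMES: Record<string, (keyof Usage)[]> = {
  search: ["searches", "costUsd"],
  inspect_source: ["sourcesInspected", "costUsd"],
  retry: ["retries"],
};

// Returns the name of every limit that has been reached or exceeded.
function reachedLimits(used: Usage, max: Usage): (keyof Usage)[] {
  const keys: (keyof Usage)[] = ["steps", "searches", "sourcesInspected", "retries", "costUsd"];
  return keys.filter((k) => used[k] >= max[k]);
}

// Removes actions that would draw on an exhausted budget.
function filterByLimits(allowed: string[], used: Usage, max: Usage): string[] {
  const hit = reachedLimits(used, max);
  if (hit.some((k) => k === "steps")) {
    // Out of steps: only loop-ending actions remain.
    return allowed.filter((a) => a === "answer" || a === "escalate" || a === "stop");
  }
  return allowed.filter((a) => {
    const budgets = CONSUMES[a] || [];
    return !budgets.some((b) => hit.some((h) => h === b));
  });
}
```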

No Duplicate Action Rule

The picker should avoid repeating actions unless there is a clear reason.

Examples of duplicate action problems:

searching the same query again
inspecting the same URL again
retrying a permanently failed tool
asking the same clarification twice
routing to the same Brain repeatedly
generating final answer multiple times without new evidence

Duplicate actions should be flagged in action history.
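
A basic duplicate check compares the candidate action against action history by action type and normalised target. In this sketch the `PastAction` shape and both helper names are hypothetical. Lowercasing the target is a simplification that suits search queries but may wrongly merge URLs whose paths are case-sensitive.

```typescript
interface PastAction {
  action_type: string;
  target: string;
}

// Normalises a target so trivially different spellings compare equal.
// Simplification: lowercasing may be too aggressive for case-sensitive URLs.
function normaliseTarget(target: string): string {
  return target.trim().toLowerCase().replace(/\s+/g, " ");
}

// True when the same action type was already used on the same target.
function isDuplicate(candidate: PastAction, history: PastAction[]): boolean {
  const key = normaliseTarget(candidate.target);
  return history.some(
    (h) => h.action_type === candidate.action_type && normaliseTarget(h.target) === key,
  );
}
```

A duplicate result does not have to block the action outright. It can instead be recorded as a flag in action history so that a retry with a clear reason is still possible.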

Original Goal Preservation Rule

The picker must preserve the original workflow goal.

It should not drift into related but irrelevant tasks.

Before selecting an action, it should check:

Does this action move the original task forward?
Does this action answer the real question?
Does this action support the expected output?
Does this action belong to the assigned Brain?
Does this action help reach a decision?

This prevents agent drift.

Risk Level Rule

Action selection must consider risk level.

Low Risk

The picker may choose answer sooner if context is enough.

Medium Risk

The picker should prefer source or database support before answering.

High Risk

The picker should prefer evidence, review, or escalation before final output.

High-risk workflows include:

compliance review
ad claim review
budget recommendation
affiliate offer approval
client-facing advice
policy-sensitive decisions
legal, financial, or medical topics

Confidence Rule

Action selection confidence must be separate from final answer confidence.

Example:

The picker may be highly confident that the next action should be search.
But the final answer confidence may still be low because evidence has not been gathered.

The picker confidence should reflect:

clarity of next step
evidence state
risk level
available actions
previous failures
remaining limits

Tool Permission Rule

The picker must check tool permission before choosing a tool-requiring action.

If the action requires a tool not allowed for the AI Employee, the picker must choose one of:

route
escalate
request human review
answer with limitations
stop with reason

It must not bypass permission rules.
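
The permission check can be sketched as a gate that either passes the selected action through or swaps it for one of the fallback actions listed above. The tool names used with it (such as "web_search") and the `enforceToolPermission` helper are hypothetical. "Answer with limitations" and "stop with reason" are represented here by the plain action names answer and stop.

```typescript
// The fallback choices listed above, as action names.
type Fallback = "route" | "escalate" | "request_human_review" | "answer" | "stop";

interface ToolCheck {
  allowed: boolean;
  action_type: string;
  reason: string;
}

function enforceToolPermission(
  action: string,
  requiredTool: string | null,
  allowedTools: string[],
  fallback: Fallback,
): ToolCheck {
  if (requiredTool === null || allowedTools.some((t) => t === requiredTool)) {
    return { allowed: true, action_type: action, reason: "tool permitted or not required" };
  }
  return {
    allowed: false,
    action_type: fallback,
    reason: `tool "${requiredTool}" is outside the permission boundary`,
  };
}
```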

Human Review Flag Rule

The picker should flag human review early when risk appears.

Human review should not only happen after the final answer.

If the picker detects risk mid-loop, it should select one of:

request_human_review
escalate

This avoids wasted tool calls and unsafe autonomous continuation.

Output Format Rule

The Next Action Picker must return predictable output.

Freeform explanations are not enough.

A valid picker output should include:

action type
reason
target
confidence
review flag
expected result

This allows MWMS to evaluate whether the picker made a good decision.

Suggested Prompt Structure

A Next Action Picker prompt should include:

Role
Workflow goal
Allowed actions
Current context
Action history
Evidence state
Limits and stop conditions
Risk rules
Selection criteria
Output schema

Example structure:

You are the MWMS Next Action Picker for [AI Employee].

Your job is to choose the next approved action only.

Workflow Goal:
[goal]

Allowed Actions:
[action list]

Current Context:
[context summary]

Action History:
[action history]

Limits:
[step/cost/time/source limits]

Risk Rules:
[review/escalation rules]

Return structured output only.
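
The example structure above could be filled in by a small template function such as the sketch below. The `PromptParts` shape and `buildPickerPrompt` name are hypothetical, and the section wording simply mirrors the example. It does not cover the evidence state, selection criteria, or output schema sections, which would be added in the same way.

```typescript
// Hypothetical container for the values that fill the prompt template.
interface PromptParts {
  employee: string;
  goal: string;
  allowedActions: string[];
  contextSummary: string;
  actionHistory: string[];
  limits: string;
  riskRules: string;
}

function buildPickerPrompt(p: PromptParts): string {
  // Show an explicit marker on the first step, when there is no history yet.
  const history = p.actionHistory.length > 0 ? p.actionHistory.join("\n") : "(none)";
  return [
    `You are the MWMS Next Action Picker for ${p.employee}.`,
    "Your job is to choose the next approved action only.",
    `Workflow Goal:\n${p.goal}`,
    `Allowed Actions:\n${p.allowedActions.join(", ")}`,
    `Current Context:\n${p.contextSummary}`,
    `Action History:\n${history}`,
    `Limits:\n${p.limits}`,
    `Risk Rules:\n${p.riskRules}`,
    "Return structured output only.",
  ].join("\n\n");
}
```

Building the prompt from one function keeps every picker call on the same structure, which makes prompt changes easier to version and test.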

Evaluation Requirements

Next Action Picker decisions should be evaluated.

Evaluation questions:

Did it choose the correct action?
Did it answer too early?
Did it search unnecessarily?
Did it inspect the right source?
Did it avoid duplicate actions?
Did it respect limits?
Did it escalate when required?
Did it preserve the original goal?
Did it choose from approved actions only?
Did it provide a useful reason?
Did it include required output fields?

This links directly to the MWMS AI Employee Evaluation Scorecard Standard.

Deterministic Checks

Some picker outputs can be checked deterministically.

Examples:

action_type is present
action_type is approved
reason is present
confidence is present
target is present where required
requires_tool is present
human review flag is present
stop condition flag is present
output is valid structured format

These should be tested before LLM-as-judge evaluation.
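
The deterministic checks above can be run as plain code that returns a list of named failures. In this sketch the `deterministicFailures` helper is hypothetical, and the list of actions that require a target is an assumption, because this standard does not say which actions need one.

```typescript
// Assumption: which actions need a target is not defined by this standard.
const ACTIONS_NEEDING_TARGET = ["search", "inspect_source", "query_database", "route"];

// Returns the names of failed checks; an empty list means all checks passed.
function deterministicFailures(output: Record<string, unknown>, approved: string[]): string[] {
  const failures: string[] = [];
  const action = output["action_type"];
  if (typeof action !== "string" || action === "") {
    failures.push("action_type missing");
  } else if (!approved.some((a) => a === action)) {
    failures.push("action_type not approved");
  }
  if (typeof output["reason"] !== "string" || output["reason"] === "") {
    failures.push("reason missing");
  }
  if (typeof output["confidence"] !== "number") failures.push("confidence missing");
  const needsTarget = ACTIONS_NEEDING_TARGET.some((a) => a === action);
  if (needsTarget && (typeof output["target"] !== "string" || output["target"] === "")) {
    failures.push("target missing");
  }
  for (const flag of ["requires_tool", "requires_human_review", "stop_condition_triggered"]) {
    if (typeof output[flag] !== "boolean") failures.push(`${flag} missing`);
  }
  return failures;
}
```

Outputs that fail any of these checks can be rejected cheaply, so only structurally valid decisions are passed on for judgement-based evaluation.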

LLM As Judge Checks

Some picker decisions require judgement.

Examples:

Was search actually needed?
Was the chosen source worth inspecting?
Was it too early to answer?
Should it have escalated?
Did it misunderstand the original goal?
Was the action reason strong?
Did the selected action move the workflow forward?

These can use LLM-as-judge evals with structured categories.

Suggested Picker Judgement Categories

Category	Meaning
Correct Action	Best available next action
Acceptable Action	Reasonable, though not ideal
Weak Action	May work but misses a better option
Premature Answer	Tried to answer too early
Unnecessary Search	Searched when enough context existed
Missed Escalation	Continued when review was required
Goal Drift	Action did not support the original task
Invalid Action	Chose action outside approved list
Limit Violation	Ignored step, cost, time, or permission limits

Failure Conditions

A Next Action Picker decision should be marked failed if:

it chooses an unapproved action
it chooses a tool the AI Employee cannot use
it answers without required evidence
it ignores stop conditions
it ignores escalation rules
it repeats failed actions without reason
it routes to the wrong Brain
it loses the original goal
it fails to output structured data
it hides uncertainty
it chooses an action with no clear reason
it causes avoidable cost waste
it bypasses human review requirements

Action Picker Trace Requirements

Every Next Action Picker decision should be logged.

Recommended trace fields:

trace ID
loop ID
task ID
Brain
AI Employee
workflow type
step number
allowed actions
selected action
reason
target
confidence
risk note
requires tool
requires human review
stop condition triggered
escalation required
previous action
next status
evaluation result
failure reason if any

This connects to the MWMS AI Observability Metadata Standard.

Relationship To Agent Action Registry

This standard may later require an MWMS Agent Action Registry.

The registry would define:

approved action names
action descriptions
allowed tools
required inputs
required outputs
allowed Brains
risk level
review requirement
evaluation rules

Do not create the registry until implementation requires it.

For now, the action list inside this standard is sufficient.

Relationship To Agent Stop Conditions

This standard may later require an MWMS Agent Stop Condition Standard.

The stop condition page would define:

max steps
max searches
max sources
max retries
cost limits
time limits
source conflict rules
tool failure rules
escalation thresholds
forced final answer rules

Do not create the stop condition page until the course or implementation gives more detail.

For now, stop condition rules inside this standard are sufficient.

Relationship To Agent Loop Context

This standard depends on the loop context container defined in the MWMS Agent Loop Control Framework.

The picker cannot make good decisions without:

original request
workflow goal
current context
action history
evidence state
limits
risk rules
allowed actions

If the context container is weak, the picker will be weak.

Relationship To Prompt Optimisation

Prompt changes to the picker must be tested.

Do not rewrite the picker prompt without checking the effect on:

action choice accuracy
answer timing
search behaviour
escalation behaviour
routing behaviour
cost impact
regression failures

Prompt optimisation must be evaluation-driven.

Relationship To Kaizen

Every failed or weak Next Action Picker decision should become Kaizen learning.

Kaizen notes may include:

picker answered too early
picker searched unnecessarily
picker missed source conflict
picker ignored freshness
picker failed to escalate
picker routed incorrectly
picker repeated action
picker chose expensive path
picker misunderstood goal

These failures should feed:

prompt improvements
action list changes
stop condition changes
regression tests
AI Employee restriction or promotion decisions

Minimum Starting Implementation

MWMS does not need a complex picker system immediately.

The first version should require:

approved action list
structured action_type output
reason field
confidence field
human review flag
stop condition flag
action history logging
invalid action detection
basic evaluation of action choice

This is enough to begin controlled agent loops.

Future Enhancements

Future enhancements may include:

MWMS Agent Action Registry
MWMS Agent Stop Condition Standard
MWMS Agent Loop Context Schema
MWMS Next Action Picker Eval Dataset
MWMS Agent Loop Dashboard Specification
MWMS AI Employee Autonomy Level Standard
MWMS Prompt Versioning And Parameter Control Standard
MWMS Agent Loop Regression Test Library

These should be created only when course material or implementation pressure justifies them.

Drift Protection

This standard prevents the following drift:

AI Employees choosing actions freely
uncontrolled tool use
hidden decision logic
repeated searches
premature answers
missed escalation
weak evidence being treated as enough
high-cost action loops
action choices without reasons
picker decisions that cannot be evaluated
prompt changes breaking action logic
agents drifting away from the original task
Brains receiving wrongly routed work
human review being bypassed

If the Next Action Picker cannot be controlled, traced, and evaluated, the AI Employee should not be given more autonomy.

Architectural Intent

The architectural intent of this standard is to make AI Employee autonomy safe, bounded, and reviewable.

MWMS does not want uncontrolled AI agents.

MWMS wants controlled AI Employees that can choose the next best action within a governed system.

The Next Action Picker is the decision gate inside the agent loop.

It gives MWMS a practical way to balance autonomy and control.

The AI can think.

MWMS sets the boundaries.

HeadOffice governs the system.

Change Log

v1.0 Initial Draft

Created the MWMS Next Action Picker Standard based on absorbed insights from Matt Pocock AIhero Build DeepSearch In TypeScript.

Integrated principles from course blocks covering:

controlled agent-loop design
replacing overloaded prompts
approved action selection
next-action picker pattern
structured action output
action confidence
stop condition awareness
escalation awareness
search, scrape, answer, route, stop, and review decisions
action-level observability
action-level evaluation
prompt optimisation through eval results
Kaizen routing for failed action decisions

Established this standard as the MWMS governance page for controlled next-action selection inside AI Employee agent loops.