System: MWMS
Document Type: Standard
Status: Draft For MCR
Authority: HeadOffice
Applies To: All MWMS Brains, AI Employees, Agent Loops, Deep Search Workflows, Research Workflows, Affiliate Evaluation, Ads Review, Content Workflows, Newsletter Intelligence, HeadOffice Intelligence, Future Client Facing AI Systems
Primary Location: MCR
Future Operational Destination: mwmsbrain.site, mwmsheadofficebrain.site, Brain Room, Dev Console, AI Employee Dashboards, HeadOffice Dashboards, Finance Brain, Future Client Portals
Parent Page: HeadOffice
Source Of Truth: MCR
Related Frameworks: MWMS AI Guardrail And Preflight Check Standard, MWMS Source Visibility And Evidence Display Standard, MWMS AI Work Session Persistence Standard, MWMS AI Observability Metadata Standard, MWMS Agent Loop Control Framework, MWMS AI Employee Evaluation Scorecard Standard, MWMS Deep Search Quality And Observability Framework
Course Source: Matt Pocock AIhero Build DeepSearch In TypeScript
Absorption Status: Approved For Integration
Purpose
The purpose of this standard is to define how MWMS tracks, displays, reviews, and controls AI usage and cost across AI Employees, Brains, workflows, sessions, tools, and future client-facing systems.
AI work is not free.
Every AI Employee action may consume:
- model tokens
- tool usage
- search actions
- source inspections
- database actions
- streaming time
- evaluator actions
- storage
- human review time
- workflow overhead
This standard ensures AI usage is visible before it becomes a cost problem.
MWMS must be able to see not only whether an AI Employee produced a good answer, but whether that answer was worth the cost.
Scope
This standard applies to any MWMS system where AI work creates usage or cost.
This includes:
- Brain Room
- Dev Console
- HeadOffice Intelligence
- Newsletter Intelligence
- Deep Search workflows
- Research Brain investigations
- Affiliate Brain offer evaluations
- Ads Brain compliance reviews
- Content Brain research and drafting
- Data Brain validation
- Experimentation Brain analysis
- Finance Brain cost reviews
- Agent loop workflows
- Source inspection workflows
- Evaluator workflows
- Future client-facing AIBS systems
- Future AI Employee dashboards
- Future paid client accounts
This standard does not define exact pricing models, accounting software, API billing logic, or provider-specific cost calculations.
It defines the MWMS governance standard for usage and cost visibility.
Core Rule
The core rule is:
AI usage should be visible before it becomes a cost problem.
If an AI Employee performs meaningful work, MWMS should eventually know:
- what was used
- who used it
- which Brain used it
- which Employee used it
- which workflow used it
- what model was used
- what tools were used
- how many actions occurred
- how many sources were inspected
- how much it cost
- whether the cost was justified
- whether the output succeeded or failed
A useful answer at an unsustainable cost is not a scalable MWMS workflow.
Definition Of AI Usage
AI usage refers to any measurable resource consumed by an AI workflow.
Usage may include:
- input tokens
- output tokens
- total tokens
- model actions
- tool actions
- search actions
- source inspections
- database actions
- file reads
- evaluator actions
- retries
- streamed response duration
- storage writes
- trace records
- human review time
- external API usage
Usage is the raw activity.
Cost is the financial or operational consequence of that activity.
Definition Of AI Cost Visibility
AI Cost Visibility means MWMS can see the estimated or actual cost of AI work in a useful operational format.
Cost visibility should answer:
- How much did this session cost?
- How much did this AI Employee cost today?
- Which Brain is using the most?
- Which workflow is expensive?
- Which tool is driving cost?
- Which failed runs wasted money?
- Which successful runs are worth scaling?
- Which client or project is consuming the most?
- Which model is too expensive for the task?
- Which actions should be consolidated or restricted?
Cost visibility turns AI usage into business intelligence.
Why MWMS Needs This Standard
Without usage and cost visibility, MWMS risks:
- hidden model spend
- expensive research loops
- excessive source inspection
- repeated failed actions
- costly retries
- oversized prompts
- unnecessary model use
- expensive evaluator loops
- poor client pricing
- weak Finance Brain reporting
- scaling workflows that lose money
- not knowing which AI Employees are efficient
- not knowing which tasks are worth automating
With usage and cost visibility, MWMS gains:
- better HeadOffice control
- better Finance Brain oversight
- better AI Employee evaluation
- better workflow optimisation
- better pricing for future clients
- better cost-per-output understanding
- better guardrails
- better scaling decisions
Usage Visibility Versus Observability
Observability shows what happened.
Usage visibility shows what resources were consumed.
| Observability | Usage Visibility |
|---|---|
| what actions happened | what those actions consumed |
| model/tool trace | token and tool counts |
| source and DB activity | cost per action |
| failure trace | cost of failure |
| debugging | budget control |
| workflow review | scalability review |
Both are required.
An AI Employee may be observable but still financially inefficient.
Usage Visibility Versus Finance Reporting
Finance reporting tracks business money.
Usage visibility tracks AI operational consumption.
Usage visibility data will later feed Finance Brain reporting.
Examples:
| Usage Visibility | Finance Reporting |
|---|---|
| tokens used per session | monthly AI provider bill |
| cost per AI Employee | AI system operating cost |
| cost per successful output | profit margin analysis |
| cost per client | client account profitability |
| failed run cost | waste reduction |
Finance Brain should eventually receive summarised usage data from HeadOffice.
Required Usage Metrics
MWMS should eventually capture the following usage metrics where available.
| Metric | Description |
|---|---|
| input_tokens | Tokens sent into the model |
| output_tokens | Tokens generated by the model |
| total_tokens | Combined input and output tokens |
| model_actions | Number of model requests |
| tool_actions | Number of tool actions |
| search_actions | Number of searches |
| source_inspections | Number of inspected sources |
| database_actions | Number of DB actions |
| evaluator_actions | Number of eval/judge actions |
| retries | Number of repeated attempts |
| failed_actions | Number of failed actions |
| successful_actions | Number of successful actions |
| session_duration | Total time spent |
| stream_duration | Response streaming time |
| workflow_steps | Number of workflow steps |
Required Cost Metrics
MWMS should eventually capture or estimate:
| Metric | Description |
|---|---|
| model_cost | Estimated model cost |
| tool_cost | Estimated external tool cost |
| search_cost | Estimated search cost |
| source_inspection_cost | Estimated scrape/crawl cost |
| evaluation_cost | Cost of evaluator/judge actions |
| retry_cost | Cost caused by retries |
| failed_run_cost | Cost of failed workflow |
| successful_output_cost | Cost of completed output |
| session_cost | Total session cost |
| workflow_cost | Total workflow cost |
| employee_cost | Cost by AI Employee |
| brain_cost | Cost by Brain |
| client_cost | Cost by client/account |
| daily_cost | Daily total |
| monthly_cost | Monthly total |
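
As a minimal sketch of how `model_cost` and `session_cost` might be estimated from the usage metrics above, the following assumes a price table expressed per one million tokens. The `PRICES` table, the model name `example-model`, the rates, and both function names are illustrative assumptions, not provider rates and not part of this standard.

```python
from decimal import Decimal

# Hypothetical prices per one million tokens. Real rates vary by provider
# and model and must come from the provider's own price list.
PRICES = {
    "example-model": {"input": Decimal("1.00"), "output": Decimal("4.00")},
}

MILLION = Decimal(1_000_000)


def estimate_model_cost(model_name, input_tokens, output_tokens):
    """Return the estimated model cost, or None when the price is unknown."""
    price = PRICES.get(model_name)
    if price is None:
        return None  # report "unknown" instead of guessing
    return (
        Decimal(input_tokens) * price["input"]
        + Decimal(output_tokens) * price["output"]
    ) / MILLION


def estimate_session_cost(model_cost, tool_cost, search_cost):
    """Sum the component costs; any unknown component makes the total unknown."""
    parts = [model_cost, tool_cost, search_cost]
    if any(part is None for part in parts):
        return None
    return sum(parts, Decimal(0))
```

`Decimal` is used here so that small per-action amounts add up without floating-point drift. An unknown price propagates as `None`, so a session total is never reported as lower than it really was.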
Usage Visibility Levels
Not every workflow needs the same level of usage reporting.
Level 1: Basic Usage
Shows:
- model used
- total tokens
- estimated session cost
- status
Use for:
- simple internal AI replies
- early testing
- low-risk workflows
Level 2: Operational Usage
Shows Level 1 plus:
- model actions
- tool actions
- search count
- source count
- duration
- cost by session
- success/failure status
Use for:
- Brain Room
- Dev Console
- HeadOffice Intelligence
- Newsletter Intelligence
- internal AI Employees
Level 3: Governance Usage
Shows Level 2 plus:
- cost by Brain
- cost by AI Employee
- cost by workflow type
- cost per successful output
- failed run cost
- retry cost
- evaluator cost
- human review requirement
Use for:
- Deep Search
- Research Brain
- Affiliate Brain
- Ads Brain
- Experimentation Brain
- Finance Brain visibility
Level 4: Commercial Usage
Shows Level 3 plus:
- cost by client
- cost by account
- usage allowance
- usage overage
- margin estimate
- billing support
- plan limits
- client-facing usage summary
Use for:
- future client-facing AIBS systems
- paid client portals
- white-label delivery
- consultant-led systems
Session-Level Usage
Each AI work session should eventually show a usage summary.
Recommended fields:
| Field | Description |
|---|---|
| session_id | Work session ID |
| brain_name | Brain responsible |
| ai_employee_name | Employee responsible |
| workflow_type | Type of workflow |
| model_used | Model or model group |
| input_tokens | Input tokens |
| output_tokens | Output tokens |
| total_tokens | Total tokens |
| tool_actions | Number of tool actions |
| search_actions | Number of searches |
| source_inspections | Number of sources inspected |
| evaluator_actions | Number of evals |
| retries | Number of retries |
| duration | Total time |
| estimated_cost | Estimated total cost |
| status | Completed, failed, parked, routed |
| cost_status | Normal, elevated, high, excessive, unknown |
AI Employee-Level Usage
MWMS should eventually show usage by AI Employee.
Recommended fields:
- AI Employee name
- owning Brain
- number of sessions
- number of successful outputs
- number of failed outputs
- total tokens
- total model actions
- total tool actions
- total source inspections
- total estimated cost
- average cost per session
- average cost per successful output
- failure cost
- most expensive workflow
- most common cost driver
- cost trend
- efficiency rating
This allows HeadOffice to see which AI Employees are cost-efficient and which need tuning.
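
The Employee-level fields above could be derived from session records roughly as follows. This is a sketch only: the record keys (`ai_employee_name`, `estimated_cost`, `output_success`) follow the field names used in this standard, while the function name and the shape of the returned summary are assumptions.

```python
from collections import defaultdict


def summarise_by_employee(sessions):
    """Aggregate session records into per-Employee usage summaries.

    Each session is a dict with the keys "ai_employee_name",
    "estimated_cost" and "output_success".
    """
    totals = defaultdict(lambda: {"sessions": 0, "successes": 0,
                                  "total_cost": 0.0, "failure_cost": 0.0})
    for session in sessions:
        row = totals[session["ai_employee_name"]]
        row["sessions"] += 1
        row["total_cost"] += session["estimated_cost"]
        if session["output_success"]:
            row["successes"] += 1
        else:
            # Failed runs stay in the totals and are also tracked separately.
            row["failure_cost"] += session["estimated_cost"]

    summary = {}
    for name, row in totals.items():
        summary[name] = {
            **row,
            "avg_cost_per_session": row["total_cost"] / row["sessions"],
            # Total cost (failures included) divided by successes; None
            # when nothing succeeded, so the figure is never misleading.
            "cost_per_successful_output": (
                row["total_cost"] / row["successes"] if row["successes"] else None
            ),
        }
    return summary
```

Dividing total cost, including failed runs, by the number of successful outputs is one possible definition of cost per successful output. It makes an Employee that fails often look more expensive, which matches the intent of keeping failure cost visible.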
Brain-Level Usage
MWMS should eventually show usage by Brain.
Recommended Brain-level metrics:
- total AI cost by Brain
- total sessions by Brain
- successful outputs by Brain
- failed outputs by Brain
- cost per workflow type
- cost per AI Employee
- top cost drivers
- usage trend
- cost per useful decision
- cost per routed action
- review-required cost
- wasted cost from failures
This helps HeadOffice and Finance Brain understand which Brains consume resources.
Workflow-Level Usage
Some workflows will be more expensive than others.
Workflow-level reporting should show:
- workflow name
- workflow type
- average cost
- average duration
- average tool actions
- average source inspections
- pass rate
- fail rate
- human review rate
- cost per approved output
- cost per rejected output
- cost per parked output
- cost per useful decision
This helps MWMS decide which workflows should be simplified, consolidated, limited, or improved.
Tool-Level Usage
Tool-level usage should show which tools drive cost and complexity.
Tool metrics may include:
- tool name
- tool type
- number of actions
- success rate
- failure rate
- average latency
- estimated cost
- retry count
- output usefulness
- workflows using the tool
- AI Employees using the tool
This matters for:
- search tools
- scraper tools
- browser tools
- evaluator tools
- analytics tools
- ad platform tools
- external APIs
- future MCP tools
Source Inspection Usage
Deep Search workflows can become expensive when source inspection is uncontrolled.
Source usage should track:
- number of sources found
- number of sources inspected
- number of sources summarised
- number of failed inspections
- cost per source inspection
- cost per useful source
- source inspection limit
- source inspection overuse
- used-in-final-answer count
Rule
If many sources are inspected but few are used, the research planning or source selection process should be improved.
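
One way to make this rule measurable is to compare the number of sources inspected with the used-in-final-answer count. The sketch below assumes a review threshold of 0.3; that value and both function names are illustrative, not set by this standard.

```python
def source_inspection_efficiency(source_inspections, used_in_final_answer):
    """Share of inspected sources that reached the final answer (0.0 to 1.0)."""
    if source_inspections == 0:
        return None  # nothing inspected, so efficiency is undefined
    return used_in_final_answer / source_inspections


def needs_planning_review(source_inspections, used_in_final_answer,
                          min_efficiency=0.3):
    """Flag sessions where many sources were inspected but few were used.

    min_efficiency is an assumed threshold, not a value set by this standard.
    """
    efficiency = source_inspection_efficiency(source_inspections,
                                              used_in_final_answer)
    return efficiency is not None and efficiency < min_efficiency
```

For example, a session that inspected 20 sources and used 2 has an efficiency of 0.1 and would be flagged under the assumed threshold.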
Evaluator Usage
LLM-as-judge and evaluator workflows also cost money.
Evaluator usage should track:
- number of evaluator actions
- evaluation model used
- evaluation token usage
- evaluation cost
- eval pass/fail result
- cost per evaluation
- whether eval prevented a bad output
- whether eval created a regression case
- whether eval caused rerun or revision
Rule
Evaluators are valuable, but they must also be cost-visible.
Retry And Failure Cost
Retries and failures must be visible.
Failure costs include:
- failed tool actions
- failed model outputs
- invalid structured outputs
- failed source inspections
- failed database writes
- repeated retries
- human review caused by poor AI output
- abandoned sessions
- failed final answers
Rule
Failed runs should not disappear from cost reporting.
A failed workflow still consumed resources.
Cost Status Labels
MWMS should classify cost status.
| Cost Status | Meaning |
|---|---|
| Normal | Cost is expected for workflow |
| Elevated | Higher than expected but acceptable |
| High | Requires review |
| Excessive | Should trigger investigation |
| Unknown | Cost could not be calculated |
Cost status should consider workflow value.
A high-cost workflow may be acceptable for a high-value decision.
A low-value task should not consume high-cost resources.
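
A simple way to take workflow value into account is to compare actual cost with an expected cost set per workflow, so that a high-value workflow carries a higher expected cost. The following sketch assumes ratio thresholds of 1.25, 2.0, and 4.0; these numbers and the function name are illustrative only.

```python
def classify_cost_status(actual_cost, expected_cost):
    """Map a session cost to a Cost Status label.

    The ratio thresholds (1.25, 2.0, 4.0) are illustrative assumptions.
    """
    if actual_cost is None or expected_cost is None or expected_cost <= 0:
        return "Unknown"
    ratio = actual_cost / expected_cost
    if ratio <= 1.25:
        return "Normal"
    if ratio <= 2.0:
        return "Elevated"
    if ratio <= 4.0:
        return "High"
    return "Excessive"
```

Because the label depends on the ratio rather than the absolute amount, the same spend can be Normal for a major decision and Excessive for a simple internal reply.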
Usage Limits
MWMS should support usage limits at multiple levels.
Possible limit types:
- per session
- per task
- per user
- per AI Employee
- per Brain
- per workflow
- per client
- per day
- per month
- per tool
- per source inspection
- per evaluator run
Limits may include:
- token limits
- cost limits
- model action limits
- search limits
- source inspection limits
- retry limits
- evaluator limits
Usage Limit Responses
When a usage limit is reached, the system should produce a controlled outcome.
Possible outcomes:
- answer with limitations
- pause and request approval
- route to human review
- downgrade to cheaper model
- reduce source count
- stop with reason
- park for later
- create Finance Brain review
- create Kaizen note
The system should not fail silently.
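
A limit check that never fails silently might look like the sketch below. The `LimitDecision` type, the function name, and the choice of "pause and request approval" as the response are assumptions; any of the outcomes listed above could be returned instead.

```python
from dataclasses import dataclass


@dataclass
class LimitDecision:
    allowed: bool
    outcome: str  # "continue", or one of the controlled outcomes listed above
    reason: str


def check_cost_limit(current_cost, next_action_cost, cost_limit):
    """Decide whether the next action fits inside the session cost limit."""
    projected = current_cost + next_action_cost
    if projected <= cost_limit:
        return LimitDecision(True, "continue", "within limit")
    # Never fail silently: always return an outcome and a reason.
    return LimitDecision(
        False,
        "pause and request approval",
        f"projected cost {projected:.2f} exceeds limit {cost_limit:.2f}",
    )
```

Returning a reason alongside the outcome gives operators something to display and gives Kaizen notes something to record.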
Cost Preflight Rule
Before starting expensive workflows, the AI Guardrail And Preflight Check should estimate likely cost level.
Cost preflight should ask:
- Is this worth Deep Search?
- Is this worth source inspection?
- Is this worth evaluator scoring?
- Can existing context answer it?
- Can a cheaper workflow be used?
- Is human clarification needed first?
- Is the workflow budget available?
This prevents expensive workflows from starting unnecessarily.
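
The preflight questions can be read as an ordered series of checks, cheapest option first. The sketch below is one hedged interpretation: the function name, its parameters, and the returned decision strings are assumptions, and a real preflight would likely weigh more signals.

```python
def cost_preflight(can_answer_from_context, needs_clarification,
                   estimated_cost, remaining_budget):
    """Return a preflight decision before an expensive workflow starts."""
    if needs_clarification:
        return "ask clarifying question first"
    if can_answer_from_context:
        return "answer from existing context"
    if estimated_cost is None:
        return "route to human review"  # unknown cost is not assumed cheap
    if estimated_cost > remaining_budget:
        return "pause and request approval"
    return "start workflow"
```

The ordering is deliberate: clarification and existing context cost little, so they are checked before any budget is committed.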
Usage Display In The Frontend
Usage should eventually be visible to operators.
Operator-facing usage display may include:
- token count
- estimated cost
- number of model actions
- number of tool actions
- number of sources inspected
- number of evaluator actions
- workflow duration
- cost status
- warning if cost is high
- comparison to expected cost
This should be shown in:
- Brain Room where useful
- Dev Console
- HeadOffice dashboards
- AI Employee dashboards
- Deep Search sessions
- future client portals where appropriate
Usage Display By Audience
Different users need different usage visibility.
| Audience | Suggested View |
|---|---|
| Martyn / HeadOffice | full operational cost and usage |
| M / Developer | tokens, actions, errors, technical usage |
| Finance Brain | cost summaries and trends |
| AI Employee dashboard | cost by Employee and workflow |
| Client | simplified usage allowance and summary |
| Public user | usually hidden or simplified |
Internal MWMS usage detail should usually be richer than client-facing usage detail.
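
Audience-specific views can be sketched as a field filter over a single usage record. The audience keys, the field sets, and the `sessions_used` and `allowance_remaining` names below are hypothetical; the actual mapping would be decided per dashboard.

```python
# Hypothetical mapping of audience to visible fields. None means "all fields".
AUDIENCE_FIELDS = {
    "headoffice": None,
    "developer": {"input_tokens", "output_tokens", "tool_actions",
                  "failed_actions"},
    "finance_brain": {"total_cost_estimate", "currency", "cost_status"},
    "client": {"sessions_used", "allowance_remaining"},
}


def usage_view(audience, record):
    """Return only the usage fields the given audience should see."""
    if audience not in AUDIENCE_FIELDS:
        return {}  # unknown audiences, including public users, see nothing
    allowed = AUDIENCE_FIELDS[audience]
    if allowed is None:
        return dict(record)
    return {key: value for key, value in record.items() if key in allowed}
```

Defaulting unknown audiences to an empty view keeps internal detail hidden unless a view has been explicitly defined.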
Client-Facing Usage Visibility
Future client-facing systems may need usage visibility.
Possible client-facing fields:
- number of AI sessions used
- usage allowance remaining
- monthly usage summary
- heavy usage warning
- plan limit status
- overage warning if applicable
Client-facing usage should be simple and not expose internal MWMS details.
Usage And Pricing Strategy
Usage visibility will support future pricing decisions.
MWMS can use usage data to understand:
- cost per client
- cost per workflow
- cost per AI Employee
- cost per support request
- cost per report
- cost per research output
- gross margin
- plan limits
- fair usage policies
- when to charge more
- when to restrict expensive features
This is important for AIBS and future white-label systems.
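
As a worked illustration of two of these figures, gross margin and usage overage can be computed from a plan price, a client cost, and a session allowance. The function names and parameters are assumptions for illustration; this standard does not define a pricing model.

```python
def gross_margin(plan_price, client_cost):
    """Gross margin as a fraction of plan price, or None if price is not positive."""
    if plan_price <= 0:
        return None
    return (plan_price - client_cost) / plan_price


def usage_overage(sessions_used, allowance):
    """Number of sessions used beyond the plan allowance (never negative)."""
    return max(0, sessions_used - allowance)
```

For example, a plan priced at 100 with an AI cost of 25 has a gross margin of 0.75, and a client who used 120 sessions against an allowance of 100 has an overage of 20.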
Usage And AI Employee Evaluation
Usage should be part of AI Employee evaluation.
An AI Employee is not only judged by output quality.
It should also be judged by:
- cost per useful output
- cost per failed output
- cost per decision
- unnecessary tool use
- retry rate
- source inspection efficiency
- evaluator cost
- latency
- cost trend over time
This connects to the MWMS AI Employee Evaluation Scorecard Standard.
Usage And Kaizen
Usage problems should create Kaizen opportunities.
Kaizen notes may include:
- prompt too large
- too many source inspections
- weak query plan created waste
- unnecessary evaluator actions
- repeated retries
- expensive model used for simple task
- tool failure caused wasted cost
- response too long
- workflow should be deterministic
- action consolidation needed
- cheaper model possible
Usage visibility turns cost into system learning.
Cost Reduction Strategies
MWMS should use usage data to reduce waste.
Possible strategies:
- use cheaper models for simple actions
- reserve expensive models for synthesis or judgement
- consolidate actions that always belong together
- improve research planning
- limit source inspections
- reduce retries
- cache reusable source summaries
- use deterministic checks before LLM judges
- compact context
- improve prompts
- ask clarifying questions before research
- stop early when evidence is sufficient
- avoid Deep Search for simple requests
Cost And Quality Balance Rule
MWMS should not optimise only for low cost.
The goal is:
Best useful outcome at acceptable cost.
Low-cost bad output is waste.
High-cost excellent output may be justified for major decisions.
The correct question is:
Was the result worth the cost for this workflow?
Usage Data Quality
Usage reporting is only useful if data quality is acceptable.
Usage data should include:
- source of estimate
- provider used
- model used
- timestamp
- workflow ID
- session ID
- whether cost is exact or estimated
- currency if applicable
- confidence in estimate
If cost cannot be calculated, mark it as unknown rather than reporting a guess or a default of zero.
Recommended Usage Object
{
  "usage_id": "",
  "session_id": "",
  "task_id": "",
  "trace_id": "",
  "brain_name": "",
  "ai_employee_name": "",
  "workflow_type": "",
  "user_id": "",
  "client_id": "",
  "model_provider": "",
  "model_name": "",
  "input_tokens": 0,
  "output_tokens": 0,
  "total_tokens": 0,
  "model_actions": 0,
  "tool_actions": 0,
  "search_actions": 0,
  "source_inspections": 0,
  "evaluator_actions": 0,
  "retries": 0,
  "failed_actions": 0,
  "successful_actions": 0,
  "duration_ms": 0,
  "model_cost_estimate": 0,
  "tool_cost_estimate": 0,
  "search_cost_estimate": 0,
  "source_inspection_cost_estimate": 0,
  "evaluation_cost_estimate": 0,
  "total_cost_estimate": 0,
  "currency": "",
  "cost_status": "",
  "usage_limit_status": "",
  "workflow_status": "",
  "output_success": false,
  "review_required": false,
  "kaizen_note": ""
}
This is conceptual only.
Exact implementation can be adapted later.
Minimum Starting Implementation
MWMS does not need perfect cost tracking immediately.
Minimum starting fields:
- session ID
- Brain
- AI Employee
- workflow type
- model used
- input tokens
- output tokens
- total tokens
- tool actions
- source inspections
- estimated cost
- workflow status
- success or failure
- cost status
- Kaizen note if waste is detected
This is enough to begin usage visibility.
Relationship To Guardrail And Preflight
The AI Guardrail And Preflight Check Standard decides whether the work is worth starting.
Usage and cost visibility provides the data needed to improve that decision.
Relationship:
Preflight asks:
Is this worth the likely cost?
Usage visibility later shows:
Was it actually worth the cost?
Relationship To Work Session Persistence
Usage data should attach to AI work sessions.
A work session should eventually show:
- total tokens
- total cost
- actions used
- tools used
- sources inspected
- evaluator actions
- duration
- cost status
- whether the session was worth the cost
This turns sessions into reusable business records.
Relationship To Observability Metadata
Usage data is part of observability metadata.
Observability shows what happened.
Usage data shows what it consumed.
The two should share identifiers such as:
- trace ID
- session ID
- task ID
- Brain
- AI Employee
- workflow type
Relationship To Finance Brain
Finance Brain should eventually receive summarised usage and cost data.
Finance Brain may use it to analyse:
- AI system operating cost
- cost per Brain
- cost per AI Employee
- cost per client
- cost per workflow
- monthly usage trends
- margin risk
- budget allocation
- scaling readiness
HeadOffice owns the standard.
Finance Brain uses the cost intelligence.
Relationship To Source Visibility
Source visibility shows which sources were used.
Usage visibility shows what those source inspections cost.
Together they answer:
Did the evidence justify the source inspection cost?
This helps MWMS improve Deep Search workflows.
Relationship To Evaluation Scorecards
AI Employee evaluation should include cost efficiency.
An output may pass factuality and relevancy but still fail cost efficiency.
Evaluation should ask:
- Was the cost justified?
- Did the workflow overuse tools?
- Did it inspect too many sources?
- Did it retry unnecessarily?
- Did it use an expensive model unnecessarily?
- Did it fail after consuming high cost?
Relationship To Agent Loop Control
Agent loops must include usage limits and cost stop conditions.
The loop should know:
- current cost
- cost limit
- source inspection count
- model action count
- retry count
- evaluator count
The Next Action Picker should not choose expensive actions blindly.
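
A loop with cost stop conditions could be sketched as follows. The list of `(name, estimated_cost)` pairs stands in for whatever a Next Action Picker would propose; the function name, its parameters, and the stop-reason strings are assumptions.

```python
def run_agent_loop(actions, cost_limit, max_model_actions):
    """Run queued actions until done or a usage stop condition is hit.

    `actions` is a list of (name, estimated_cost) pairs standing in for
    the actions a Next Action Picker would propose.
    """
    current_cost = 0.0
    model_actions = 0
    completed = []
    for name, estimated_cost in actions:
        if model_actions >= max_model_actions:
            return completed, current_cost, "stopped: model action limit reached"
        if current_cost + estimated_cost > cost_limit:
            return completed, current_cost, "stopped: cost limit would be exceeded"
        current_cost += estimated_cost
        model_actions += 1
        completed.append(name)
    return completed, current_cost, "completed"
```

The cost check runs before each action rather than after it, so the loop stops before the limit is crossed and always returns a stop reason.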
Relationship To Future Client Systems
Future client systems may need cost and usage controls before public release.
Client-facing AI systems should define:
- usage allowance
- plan limits
- fair use limits
- overage rules
- expensive feature gating
- admin visibility
- client summary display
- internal margin tracking
This prevents client-facing AI tools from becoming unprofitable.
Failure Conditions
Usage visibility should be marked failed or weak if:
- token usage is missing
- model name is missing
- session cost is unknown for high-cost workflows
- failed runs are not counted
- retries are invisible
- source inspection count is missing
- evaluator cost is missing
- cost cannot be tied to Brain or AI Employee
- cost cannot be tied to session or task
- client usage cannot be separated from internal usage
- high-cost workflows have no review trigger
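
Several of these conditions can be checked mechanically against a usage record. The sketch below uses the field names from the Recommended Usage Object and treats a value of `None` or an absent key as missing; the function name and the wording of the returned messages are assumptions, and only a subset of the conditions is covered.

```python
def usage_visibility_gaps(record):
    """Return the reasons a usage record counts as weak or failed."""
    gaps = []
    for field in ("input_tokens", "output_tokens", "model_name",
                  "source_inspections", "retries"):
        if record.get(field) is None:
            gaps.append(f"{field} is missing")
    # Flag the record if either owner is absent (a strict reading of the rule).
    if not (record.get("brain_name") and record.get("ai_employee_name")):
        gaps.append("cost cannot be tied to Brain or AI Employee")
    if not (record.get("session_id") or record.get("task_id")):
        gaps.append("cost cannot be tied to session or task")
    if record.get("total_cost_estimate") is None:
        gaps.append("session cost is unknown")
    return gaps
```

An empty list means none of the checked conditions apply. A count of zero is treated as present, since zero retries is valid data and not a gap.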
Human Review Triggers
Human or HeadOffice review should be triggered when:
- cost status is high or excessive
- usage limit is reached
- failed run cost is high
- retry count is high
- source inspection count is excessive
- evaluator cost is excessive
- a workflow becomes more expensive over time
- an AI Employee cost trend increases sharply
- a client account exceeds fair use
- cost is unknown for a high-value workflow
- quality is low but cost is high
Drift Protection
This standard prevents the following drift:
- hidden AI spend
- uncontrolled Deep Search cost
- expensive failed workflows
- repeated retries without review
- source inspection overuse
- evaluator overuse
- expensive models used for simple tasks
- client-facing systems without margin awareness
- Finance Brain lacking AI usage data
- HeadOffice scaling workflows without cost evidence
- AI Employees judged only on output quality, not efficiency
If MWMS cannot see usage, it cannot control cost.
If MWMS cannot control cost, it cannot scale safely.
Architectural Intent
The architectural intent of this standard is to make AI cost visible, governable, and improvable.
MWMS is building an AI-powered business operating system.
That system must not only be intelligent.
It must be economically sustainable.
AI usage and cost visibility allow MWMS to decide:
- what to automate
- what to restrict
- what to improve
- what to price
- what to scale
- what to stop
This standard ensures AI Employees are judged not only by whether they work, but by whether they work at a cost MWMS can sustain.
Change Log
v1.0 Initial Draft
Created the MWMS AI Usage And Cost Visibility Standard based on absorbed insights from the final block of Matt Pocock AIhero Build DeepSearch In TypeScript.
Integrated principles from course sections covering:
- showing usage in the frontend
- token visibility
- model action visibility
- tool and source inspection usage
- cost awareness
- expensive Deep Search workflow control
- evaluator cost tracking
- usage visibility for operators
- cost governance before scaling
- future client-facing usage and pricing awareness
Established this standard as the MWMS governance page for tracking, displaying, reviewing, and controlling AI usage and cost across Brains, AI Employees, workflows, sessions, tools, and future client systems.