MWMS AI Usage And Cost Visibility Standard

System: MWMS
Document Type: Standard
Status: Draft For MCR
Authority: HeadOffice
Applies To: All MWMS Brains, AI Employees, Agent Loops, Deep Search Workflows, Research Workflows, Affiliate Evaluation, Ads Review, Content Workflows, Newsletter Intelligence, HeadOffice Intelligence, Future Client Facing AI Systems
Primary Location: MCR
Future Operational Destination: mwmsbrain.site, mwmsheadofficebrain.site, Brain Room, Dev Console, AI Employee Dashboards, HeadOffice Dashboards, Finance Brain, Future Client Portals
Parent Page: HeadOffice
Source Of Truth: MCR
Related Frameworks: MWMS AI Guardrail And Preflight Check Standard, MWMS Source Visibility And Evidence Display Standard, MWMS AI Work Session Persistence Standard, MWMS AI Observability Metadata Standard, MWMS Agent Loop Control Framework, MWMS AI Employee Evaluation Scorecard Standard, MWMS Deep Search Quality And Observability Framework
Course Source: Matt Pocock AIhero Build DeepSearch In TypeScript
Absorption Status: Approved For Integration


Purpose

This standard defines how MWMS tracks, displays, reviews, and controls AI usage and cost across AI Employees, Brains, workflows, sessions, tools, and future client-facing systems.

AI work is not free.

Every AI Employee action may consume:

  • model tokens
  • tool usage
  • search actions
  • source inspections
  • database actions
  • streaming time
  • evaluator actions
  • storage
  • human review time
  • workflow overhead

This standard ensures AI usage is visible before it becomes a cost problem.

MWMS must be able to see not only whether an AI Employee produced a good answer, but whether that answer was worth the cost.


Scope

This standard applies to any MWMS system where AI work creates usage or cost.

This includes:

  • Brain Room
  • Dev Console
  • HeadOffice Intelligence
  • Newsletter Intelligence
  • Deep Search workflows
  • Research Brain investigations
  • Affiliate Brain offer evaluations
  • Ads Brain compliance reviews
  • Content Brain research and drafting
  • Data Brain validation
  • Experimentation Brain analysis
  • Finance Brain cost reviews
  • Agent loop workflows
  • Source inspection workflows
  • Evaluator workflows
  • Future client-facing AIBS systems
  • Future AI Employee dashboards
  • Future paid client accounts

This standard does not define exact pricing models, accounting software, API billing logic, or provider-specific cost calculations.

It defines the MWMS governance standard for usage and cost visibility.


Core Rule

The core rule is:

AI usage should be visible before it becomes a cost problem.

If an AI Employee performs meaningful work, MWMS should eventually know:

  • what was used
  • who used it
  • which Brain used it
  • which Employee used it
  • which workflow used it
  • what model was used
  • what tools were used
  • how many actions occurred
  • how many sources were inspected
  • how much it cost
  • whether the cost was justified
  • whether the output succeeded or failed

A useful answer at an unsustainable cost is not a scalable MWMS workflow.


Definition Of AI Usage

AI usage refers to any measurable resource consumed by an AI workflow.

Usage may include:

  • input tokens
  • output tokens
  • total tokens
  • model actions
  • tool actions
  • search actions
  • source inspections
  • database actions
  • file reads
  • evaluator actions
  • retries
  • streamed response duration
  • storage writes
  • trace records
  • human review time
  • external API usage

Usage is the raw activity.

Cost is the financial or operational consequence of that activity.
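
The step from usage to cost can be sketched as follows. This is a minimal illustration only: the model name and the per-million-token prices are hypothetical placeholders, and this standard does not define provider-specific pricing. The function returns no figure at all when a rate is not known, rather than guessing.

```python
from decimal import Decimal

# Hypothetical prices per one million tokens. Real rates vary by provider and model.
ASSUMED_PRICE_PER_MILLION = {
    "example-model": {"input": Decimal("2.00"), "output": Decimal("8.00")},
}


def estimate_model_cost(model_name, input_tokens, output_tokens):
    """Turn raw token usage into an estimated cost, or None if no rate is known."""
    rates = ASSUMED_PRICE_PER_MILLION.get(model_name)
    if rates is None:
        return None  # cost stays unknown instead of being guessed
    token_cost = (Decimal(input_tokens) * rates["input"]
                  + Decimal(output_tokens) * rates["output"])
    return token_cost / Decimal(1_000_000)


# Usage (1,000 input tokens and 500 output tokens) becomes an estimated cost.
print(estimate_model_cost("example-model", 1000, 500))
```

Decimal is used here so that small per-token amounts are not distorted by floating-point rounding when many sessions are summed.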


Definition Of AI Cost Visibility

AI Cost Visibility means MWMS can see the estimated or actual cost of AI work in a form that operators can act on.

Cost visibility should answer:

  • How much did this session cost?
  • How much did this AI Employee cost today?
  • Which Brain is using the most?
  • Which workflow is expensive?
  • Which tool is driving cost?
  • Which failed runs wasted money?
  • Which successful runs are worth scaling?
  • Which client or project is consuming the most?
  • Which model is too expensive for the task?
  • Which actions should be consolidated or restricted?

Cost visibility turns AI usage into business intelligence.


Why MWMS Needs This Standard

Without usage and cost visibility, MWMS risks:

  • hidden model spend
  • expensive research loops
  • excessive source inspection
  • repeated failed actions
  • costly retries
  • oversized prompts
  • unnecessary model use
  • expensive evaluator loops
  • poor client pricing
  • weak Finance Brain reporting
  • scaling workflows that lose money
  • not knowing which AI Employees are efficient
  • not knowing which tasks are worth automating

With usage and cost visibility, MWMS gains:

  • better HeadOffice control
  • better Finance Brain oversight
  • better AI Employee evaluation
  • better workflow optimisation
  • better pricing for future clients
  • better cost-per-output understanding
  • better guardrails
  • better scaling decisions

Usage Visibility Versus Observability

Observability shows what happened.

Usage visibility shows what resources were consumed.

Observability          | Usage Visibility
what actions happened  | what those actions consumed
model/tool trace       | token and tool counts
source and DB activity | cost per action
failure trace          | cost of failure
debugging              | budget control
workflow review        | scalability review

Both are required.

An AI Employee may be observable but still financially inefficient.


Usage Visibility Versus Finance Reporting

Finance reporting tracks business money.

Usage visibility tracks AI operational consumption.

Usage visibility will later feed Finance Brain.

Examples:

Usage Visibility           | Finance Reporting
tokens used per session    | monthly AI provider bill
cost per AI Employee       | AI system operating cost
cost per successful output | profit margin analysis
cost per client            | client account profitability
failed run cost            | waste reduction

Finance Brain should eventually receive summarised usage data from HeadOffice.


Required Usage Metrics

MWMS should eventually capture the following usage metrics where available.

Metric             | Description
input_tokens       | Tokens sent into the model
output_tokens      | Tokens generated by the model
total_tokens       | Combined token usage
model_actions      | Number of model requests
tool_actions       | Number of tool actions
search_actions     | Number of searches
source_inspections | Number of inspected sources
database_actions   | Number of DB actions
evaluator_actions  | Number of eval/judge actions
retries            | Number of repeated attempts
failed_actions     | Number of failed actions
successful_actions | Number of successful actions
session_duration   | Total time spent
stream_duration    | Response streaming time
workflow_steps     | Number of workflow steps

Required Cost Metrics

MWMS should eventually capture or estimate:

Metric                 | Description
model_cost             | Estimated model cost
tool_cost              | Estimated external tool cost
search_cost            | Estimated search cost
source_inspection_cost | Estimated scrape/crawl cost
evaluation_cost        | Cost of evaluator/judge actions
retry_cost             | Cost caused by retries
failed_run_cost        | Cost of failed workflow
successful_output_cost | Cost of completed output
session_cost           | Total session cost
workflow_cost          | Total workflow cost
employee_cost          | Cost by AI Employee
brain_cost             | Cost by Brain
client_cost            | Cost by client/account
daily_cost             | Daily total
monthly_cost           | Monthly total

Usage Visibility Levels

Not every workflow needs the same level of usage reporting.

Level 1: Basic Usage

Shows:

  • model used
  • total tokens
  • estimated session cost
  • status

Use for:

  • simple internal AI replies
  • early testing
  • low-risk workflows

Level 2: Operational Usage

Shows Level 1 plus:

  • model actions
  • tool actions
  • search count
  • source count
  • duration
  • cost by session
  • success/failure status

Use for:

  • Brain Room
  • Dev Console
  • HeadOffice Intelligence
  • Newsletter Intelligence
  • internal AI Employees

Level 3: Governance Usage

Shows Level 2 plus:

  • cost by Brain
  • cost by AI Employee
  • cost by workflow type
  • cost per successful output
  • failed run cost
  • retry cost
  • evaluator cost
  • human review requirement

Use for:

  • Deep Search
  • Research Brain
  • Affiliate Brain
  • Ads Brain
  • Experimentation Brain
  • Finance Brain visibility

Level 4: Commercial Usage

Shows Level 3 plus:

  • cost by client
  • cost by account
  • usage allowance
  • usage overage
  • margin estimate
  • billing support
  • plan limits
  • client-facing usage summary

Use for:

  • future client-facing AIBS systems
  • paid client portals
  • white-label delivery
  • consultant-led systems

Session-Level Usage

Each AI work session should eventually show a usage summary.

Recommended fields:

Field             | Description
session_id        | Work session ID
brain_name        | Brain responsible
ai_employee_name  | Employee responsible
workflow_type     | Type of workflow
model_used        | Model or model group
input_tokens      | Input tokens
output_tokens     | Output tokens
total_tokens      | Total tokens
tool_actions      | Number of tool actions
search_actions    | Number of searches
sources_inspected | Number of sources inspected
evaluator_actions | Number of evals
retries           | Number of retries
duration          | Total time
estimated_cost    | Estimated total cost
status            | Completed, failed, parked, routed
cost_status       | Normal, Elevated, High, Excessive, Unknown

AI Employee-Level Usage

MWMS should eventually show usage by AI Employee.

Recommended fields:

  • AI Employee name
  • owning Brain
  • number of sessions
  • number of successful outputs
  • number of failed outputs
  • total tokens
  • total model actions
  • total tool actions
  • total source inspections
  • total estimated cost
  • average cost per session
  • average cost per successful output
  • failure cost
  • most expensive workflow
  • most common cost driver
  • cost trend
  • efficiency rating

This allows HeadOffice to see which AI Employees are cost-efficient and which need tuning.
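
A few of the fields above can be derived from session-level usage records. The sketch below is illustrative only: it assumes each record is a dictionary carrying the ai_employee_name, total_cost_estimate, and output_success fields from the Recommended Usage Object, and the function name is hypothetical.

```python
from collections import defaultdict


def summarise_by_employee(usage_records):
    """Group usage records by AI Employee and compute basic cost figures."""
    totals = defaultdict(
        lambda: {"sessions": 0, "successful_outputs": 0,
                 "total_cost": 0.0, "failure_cost": 0.0}
    )
    for record in usage_records:
        t = totals[record["ai_employee_name"]]
        t["sessions"] += 1
        t["total_cost"] += record["total_cost_estimate"]
        if record["output_success"]:
            t["successful_outputs"] += 1
        else:
            # Failed runs stay in the totals because they still consumed resources.
            t["failure_cost"] += record["total_cost_estimate"]

    summary = {}
    for name, t in totals.items():
        successes = t["successful_outputs"]
        summary[name] = dict(
            t,
            average_cost_per_session=t["total_cost"] / t["sessions"],
            # Total spend, including failures, divided by successful outputs.
            average_cost_per_successful_output=(
                t["total_cost"] / successes if successes else None
            ),
        )
    return summary
```

Dividing total spend by successful outputs, rather than only the spend of successful sessions, is a design choice in this sketch: it makes an Employee with many failed runs look as expensive as it really is.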


Brain-Level Usage

MWMS should eventually show usage by Brain.

Recommended Brain-level metrics:

  • total AI cost by Brain
  • total sessions by Brain
  • successful outputs by Brain
  • failed outputs by Brain
  • cost per workflow type
  • cost per AI Employee
  • top cost drivers
  • usage trend
  • cost per useful decision
  • cost per routed action
  • review-required cost
  • wasted cost from failures

This helps HeadOffice and Finance Brain understand which Brains consume the most resources and why.


Workflow-Level Usage

Some workflows will be more expensive than others.

Workflow-level reporting should show:

  • workflow name
  • workflow type
  • average cost
  • average duration
  • average tool actions
  • average source inspections
  • pass rate
  • fail rate
  • human review rate
  • cost per approved output
  • cost per rejected output
  • cost per parked output
  • cost per useful decision

This helps MWMS decide which workflows should be simplified, consolidated, limited, or improved.


Tool-Level Usage

Tool-level usage should show which tools drive cost and complexity.

Tool metrics may include:

  • tool name
  • tool type
  • number of actions
  • success rate
  • failure rate
  • average latency
  • estimated cost
  • retry count
  • output usefulness
  • workflows using the tool
  • AI Employees using the tool

This matters for:

  • search tools
  • scraper tools
  • browser tools
  • evaluator tools
  • analytics tools
  • ad platform tools
  • external APIs
  • future MCP tools

Source Inspection Usage

Deep Search workflows can become expensive when source inspection is uncontrolled.

Source usage should track:

  • number of sources found
  • number of sources inspected
  • number of sources summarised
  • number of failed inspections
  • cost per source inspection
  • cost per useful source
  • source inspection limit
  • source inspection overuse
  • used-in-final-answer count

Rule

If many sources are inspected but few are used, the research planning or source selection process should be improved.
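
The rule above can be expressed as a simple ratio of sources used in the final answer to sources inspected. The sketch below is illustrative: the function names and the 0.25 threshold are assumptions, not values set by this standard.

```python
def source_usage_ratio(sources_inspected, used_in_final_answer):
    """Share of inspected sources that ended up in the final answer."""
    if sources_inspected == 0:
        return None  # nothing was inspected, so there is no ratio to report
    return used_in_final_answer / sources_inspected


def needs_source_selection_review(sources_inspected, used_in_final_answer,
                                  minimum_ratio=0.25):
    """Flag sessions where many sources were inspected but few were used."""
    ratio = source_usage_ratio(sources_inspected, used_in_final_answer)
    return ratio is not None and ratio < minimum_ratio


# 20 sources inspected but only 2 used: research planning should be reviewed.
print(needs_source_selection_review(20, 2))
```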


Evaluator Usage

LLM-as-judge and evaluator workflows also cost money.

Evaluator usage should track:

  • number of evaluator actions
  • evaluation model used
  • evaluation token usage
  • evaluation cost
  • eval pass/fail result
  • cost per evaluation
  • whether eval prevented a bad output
  • whether eval created a regression case
  • whether eval caused rerun or revision

Rule

Evaluators are valuable, but they must also be cost-visible.


Retry And Failure Cost

Retries and failures must be visible.

Failure costs include:

  • failed tool actions
  • failed model outputs
  • invalid structured outputs
  • failed source inspections
  • failed database writes
  • repeated retries
  • human review caused by poor AI output
  • abandoned sessions
  • failed final answers

Rule

Failed runs should not disappear from cost reporting.

A failed workflow still consumed resources.


Cost Status Labels

MWMS should classify cost status.

Cost Status | Meaning
Normal      | Cost is expected for workflow
Elevated    | Higher than expected but acceptable
High        | Requires review
Excessive   | Should trigger investigation
Unknown     | Cost could not be calculated

Cost status should consider workflow value.

A high-cost workflow may be acceptable for a high-value decision.

A low-value task should not consume high-cost resources.
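
One possible way to assign these labels is to compare actual cost against the expected cost for the workflow. The sketch below is illustrative: the ratio thresholds are assumptions, and workflow value is reflected only indirectly, by giving high-value workflows a higher expected cost.

```python
def classify_cost_status(actual_cost, expected_cost,
                         elevated_at=1.25, high_at=2.0, excessive_at=4.0):
    """Label a cost relative to what the workflow is expected to cost."""
    if actual_cost is None or expected_cost is None or expected_cost <= 0:
        return "Unknown"  # cost could not be calculated or compared
    ratio = actual_cost / expected_cost
    if ratio >= excessive_at:
        return "Excessive"
    if ratio >= high_at:
        return "High"
    if ratio >= elevated_at:
        return "Elevated"
    return "Normal"


# The same spend can be Normal for a major decision and High for a simple task.
print(classify_cost_status(2.0, expected_cost=2.0))
print(classify_cost_status(2.0, expected_cost=1.0))
```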


Usage Limits

MWMS should support usage limits at multiple levels.

Possible limit types:

  • per session
  • per task
  • per user
  • per AI Employee
  • per Brain
  • per workflow
  • per client
  • per day
  • per month
  • per tool
  • per source inspection
  • per evaluator run

Limits may include:

  • token limits
  • cost limits
  • model action limits
  • search limits
  • source inspection limits
  • retry limits
  • evaluator limits

Usage Limit Responses

When a usage limit is reached, the system should produce a controlled outcome.

Possible outcomes:

  • answer with limitations
  • pause and request approval
  • route to human review
  • downgrade to cheaper model
  • reduce source count
  • stop with reason
  • park for later
  • create Finance Brain review
  • create Kaizen note

The system should not fail silently.
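
A limit check that always returns a controlled outcome might look like the sketch below. The outcome names mirror the list above. The policy of pausing on a cost breach and stopping on any other breach is an assumption for illustration, as is the function name.

```python
def check_usage_limits(usage, limits):
    """Compare usage counters with limits and return a controlled outcome."""
    breached = sorted(
        name for name, limit in limits.items() if usage.get(name, 0) >= limit
    )
    if not breached:
        return {"outcome": "continue", "reason": ""}
    # Assumed policy: a cost breach pauses for approval, anything else stops.
    if "total_cost_estimate" in breached:
        outcome = "pause_and_request_approval"
    else:
        outcome = "stop_with_reason"
    # The reason is always filled in, so the limit is never hit silently.
    return {"outcome": outcome, "reason": "limit reached: " + ", ".join(breached)}


usage = {"source_inspections": 10, "total_cost_estimate": 0.2}
limits = {"source_inspections": 10, "total_cost_estimate": 1.0}
print(check_usage_limits(usage, limits))
```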


Cost Preflight Rule

Before starting an expensive workflow, the AI Guardrail And Preflight Check should estimate the likely cost level.

Cost preflight should ask:

  • Is this worth Deep Search?
  • Is this worth source inspection?
  • Is this worth evaluator scoring?
  • Can existing context answer it?
  • Can a cheaper workflow be used?
  • Is human clarification needed first?
  • Is the workflow budget available?

This prevents expensive workflows from starting unnecessarily.
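
A few of the preflight questions can be reduced to a small decision function. This is a sketch under stated assumptions: the inputs are supplied by the caller, the decision names are hypothetical, and the order of the checks is one reasonable choice rather than a rule of this standard.

```python
def cost_preflight(needs_clarification, can_answer_from_context,
                   estimated_cost, remaining_budget):
    """Decide whether an expensive workflow should start."""
    if needs_clarification:
        return "ask_clarifying_question"  # cheapest step comes first
    if can_answer_from_context:
        return "answer_from_existing_context"
    if estimated_cost > remaining_budget:
        return "request_approval"  # workflow budget is not available
    return "start_workflow"


print(cost_preflight(False, False, estimated_cost=0.5, remaining_budget=2.0))
```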


Usage Display In The Frontend

Usage should eventually be visible to operators.

Operator-facing usage display may include:

  • token count
  • estimated cost
  • number of model actions
  • number of tool actions
  • number of sources inspected
  • number of evaluator actions
  • workflow duration
  • cost status
  • warning if cost is high
  • comparison to expected cost

This should be shown in:

  • Brain Room where useful
  • Dev Console
  • HeadOffice dashboards
  • AI Employee dashboards
  • Deep Search sessions
  • future client portals where appropriate

Usage Display By Audience

Different users need different usage visibility.

Audience              | Suggested View
Martyn / HeadOffice   | full operational cost and usage
M / Developer         | tokens, actions, errors, technical usage
Finance Brain         | cost summaries and trends
AI Employee dashboard | cost by Employee and workflow
Client                | simplified usage allowance and summary
Public user           | usually hidden or simplified

Internal MWMS usage detail should usually be richer than client-facing usage detail.


Client-Facing Usage Visibility

Future client-facing systems may need usage visibility.

Possible client-facing fields:

  • number of AI sessions used
  • usage allowance remaining
  • monthly usage summary
  • heavy usage warning
  • plan limit status
  • overage warning if applicable

Client-facing usage displays should be simple and should not expose internal MWMS details.


Usage And Pricing Strategy

Usage visibility will support future pricing decisions.

MWMS can use usage data to understand:

  • cost per client
  • cost per workflow
  • cost per AI Employee
  • cost per support request
  • cost per report
  • cost per research output
  • gross margin
  • plan limits
  • fair usage policies
  • when to charge more
  • when to restrict expensive features

This is important for AIBS and future white-label systems.


Usage And AI Employee Evaluation

Usage should be part of AI Employee evaluation.

An AI Employee is judged not only by output quality.

It should also be judged by:

  • cost per useful output
  • cost per failed output
  • cost per decision
  • unnecessary tool use
  • retry rate
  • source inspection efficiency
  • evaluator cost
  • latency
  • cost trend over time

This connects to the MWMS AI Employee Evaluation Scorecard Standard.


Usage And Kaizen

Usage problems should create Kaizen opportunities.

Kaizen notes may include:

  • prompt too large
  • too many source inspections
  • weak query plan created waste
  • unnecessary evaluator actions
  • repeated retries
  • expensive model used for simple task
  • tool failure caused wasted cost
  • response too long
  • workflow should be deterministic
  • action consolidation needed
  • cheaper model possible

Usage visibility turns cost into system learning.


Cost Reduction Strategies

MWMS should use usage data to reduce waste.

Possible strategies:

  • use cheaper models for simple actions
  • reserve expensive models for synthesis or judgement
  • consolidate actions that always belong together
  • improve research planning
  • limit source inspections
  • reduce retries
  • cache reusable source summaries
  • use deterministic checks before LLM judges
  • compact context
  • improve prompts
  • ask clarifying questions before research
  • stop early when evidence is sufficient
  • avoid Deep Search for simple requests

Cost And Quality Balance Rule

MWMS should not optimise only for low cost.

The goal is:

Best useful outcome at acceptable cost.

Low-cost bad output is waste.

High-cost excellent output may be justified for major decisions.

The correct question is:

Was the result worth the cost for this workflow?


Usage Data Quality

Usage reporting is only useful if data quality is acceptable.

Usage data should include:

  • source of estimate
  • provider used
  • model used
  • timestamp
  • workflow ID
  • session ID
  • whether cost is exact or estimated
  • currency if applicable
  • confidence in estimate

If cost cannot be calculated, mark it as unknown rather than reporting a guessed figure.


Recommended Usage Object

{
  "usage_id": "",
  "session_id": "",
  "task_id": "",
  "trace_id": "",
  "brain_name": "",
  "ai_employee_name": "",
  "workflow_type": "",
  "user_id": "",
  "client_id": "",
  "model_provider": "",
  "model_name": "",
  "input_tokens": 0,
  "output_tokens": 0,
  "total_tokens": 0,
  "model_actions": 0,
  "tool_actions": 0,
  "search_actions": 0,
  "source_inspections": 0,
  "evaluator_actions": 0,
  "retries": 0,
  "failed_actions": 0,
  "successful_actions": 0,
  "duration_ms": 0,
  "model_cost_estimate": 0,
  "tool_cost_estimate": 0,
  "search_cost_estimate": 0,
  "source_inspection_cost_estimate": 0,
  "evaluation_cost_estimate": 0,
  "total_cost_estimate": 0,
  "currency": "",
  "cost_status": "",
  "usage_limit_status": "",
  "workflow_status": "",
  "output_success": false,
  "review_required": false,
  "kaizen_note": ""
}

This is conceptual only.

Exact implementation can be adapted later.
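
As one possible adaptation, a subset of the usage object could be held in a typed record where totals are derived rather than stored, so they cannot drift from their parts. The class name and the choice of fields are assumptions for illustration.

```python
from dataclasses import asdict, dataclass


@dataclass
class UsageRecord:
    """A reduced, illustrative version of the conceptual usage object."""
    session_id: str
    brain_name: str
    ai_employee_name: str
    workflow_type: str
    model_name: str
    input_tokens: int = 0
    output_tokens: int = 0
    model_cost_estimate: float = 0.0
    tool_cost_estimate: float = 0.0
    cost_status: str = "Unknown"
    output_success: bool = False

    @property
    def total_tokens(self):
        # Derived from its parts instead of being entered separately.
        return self.input_tokens + self.output_tokens

    @property
    def total_cost_estimate(self):
        return self.model_cost_estimate + self.tool_cost_estimate

    def to_dict(self):
        """Export stored fields plus the derived totals."""
        data = asdict(self)
        data["total_tokens"] = self.total_tokens
        data["total_cost_estimate"] = self.total_cost_estimate
        return data


record = UsageRecord("s-001", "Research Brain", "Example Researcher",
                     "deep_search", "example-model",
                     input_tokens=1200, output_tokens=300,
                     model_cost_estimate=0.5, tool_cost_estimate=0.25)
print(record.to_dict())
```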


Minimum Starting Implementation

MWMS does not need perfect cost tracking immediately.

Minimum starting fields:

  • session ID
  • Brain
  • AI Employee
  • workflow type
  • model used
  • input tokens
  • output tokens
  • total tokens
  • tool actions
  • source inspections
  • estimated cost
  • workflow status
  • success or failure
  • cost status
  • Kaizen note if waste is detected

This is enough to begin usage visibility.


Relationship To Guardrail And Preflight

The AI Guardrail And Preflight Check Standard decides whether the work is worth starting.

Usage and cost visibility provides the data needed to improve that decision.

Relationship:

Preflight asks:

Is this worth the likely cost?

Usage visibility later shows:

Was it actually worth the cost?


Relationship To Work Session Persistence

Usage data should attach to AI work sessions.

A work session should eventually show:

  • total tokens
  • total cost
  • actions used
  • tools used
  • sources inspected
  • evaluator actions
  • duration
  • cost status
  • whether the session was worth the cost

This turns sessions into reusable business records.


Relationship To Observability Metadata

Usage data is part of observability metadata.

Observability shows what happened.

Usage data shows what it consumed.

The two should share identifiers such as:

  • trace ID
  • session ID
  • task ID
  • Brain
  • AI Employee
  • workflow type

Relationship To Finance Brain

Finance Brain should eventually receive summarised usage and cost data.

Finance Brain may use it to analyse:

  • AI system operating cost
  • cost per Brain
  • cost per AI Employee
  • cost per client
  • cost per workflow
  • monthly usage trends
  • margin risk
  • budget allocation
  • scaling readiness

HeadOffice owns the standard.

Finance Brain uses the cost intelligence.


Relationship To Source Visibility

Source visibility shows which sources were used.

Usage visibility shows what those source inspections cost.

Together they answer:

Did the evidence justify the source inspection cost?

This helps MWMS improve Deep Search workflows.


Relationship To Evaluation Scorecards

AI Employee evaluation should include cost efficiency.

An output may pass factuality and relevancy but still fail cost efficiency.

Evaluation should ask:

  • Was the cost justified?
  • Did the workflow overuse tools?
  • Did it inspect too many sources?
  • Did it retry unnecessarily?
  • Did it use an expensive model unnecessarily?
  • Did it fail after consuming high cost?

Relationship To Agent Loop Control

Agent loops must include usage limits and cost stop conditions.

The loop should know:

  • current cost
  • cost limit
  • source inspection count
  • model action count
  • retry count
  • evaluator count

The Next Action Picker should not choose expensive actions blindly.
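
A loop with cost stop conditions can be sketched as follows. The example is simplified and rests on assumptions: each planned action carries its own estimated cost, actions are not really executed, and only the cost limit and model action limit are enforced.

```python
def run_agent_loop(planned_actions, cost_limit, model_action_limit):
    """Run planned actions in order, stopping before a usage limit is exceeded."""
    current_cost = 0.0
    model_actions = 0
    status, reason = "completed", ""
    for action in planned_actions:
        if model_actions >= model_action_limit:
            status, reason = "stopped", "model action limit reached"
            break
        if current_cost + action["estimated_cost"] > cost_limit:
            status = "stopped"
            reason = "cost limit would be exceeded by " + action["name"]
            break
        # A real loop would execute the action here and record its actual cost.
        current_cost += action["estimated_cost"]
        model_actions += 1
    return {"status": status, "reason": reason,
            "current_cost": current_cost, "model_actions": model_actions}


plan = [
    {"name": "search", "estimated_cost": 0.25},
    {"name": "inspect_source", "estimated_cost": 0.25},
    {"name": "synthesise", "estimated_cost": 0.5},
]
print(run_agent_loop(plan, cost_limit=0.75, model_action_limit=10))
```

Checking the limit before each action, rather than after, means the loop stops with a stated reason instead of discovering the overspend once the money is already gone.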


Relationship To Future Client Systems

Future client systems may need cost and usage controls before public release.

Client-facing AI systems should define:

  • usage allowance
  • plan limits
  • fair use limits
  • overage rules
  • expensive feature gating
  • admin visibility
  • client summary display
  • internal margin tracking

This prevents client-facing AI tools from becoming unprofitable.


Failure Conditions

Usage visibility should be marked as failed or weak if:

  • token usage is missing
  • model name is missing
  • session cost is unknown for high-cost workflows
  • failed runs are not counted
  • retries are invisible
  • source inspection count is missing
  • evaluator cost is missing
  • cost cannot be tied to Brain or AI Employee
  • cost cannot be tied to session or task
  • client usage cannot be separated from internal usage
  • high-cost workflows have no review trigger
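
Several of these conditions amount to missing fields on a usage record and can be checked mechanically. The sketch below is illustrative: the list of required fields is an assumption drawn from the Recommended Usage Object, and a count of zero is treated as present, not missing.

```python
# Assumed minimum set of fields that a usage record must carry.
REQUIRED_FIELDS = (
    "model_name", "total_tokens", "session_id", "brain_name",
    "ai_employee_name", "source_inspections", "retries",
)


def find_visibility_gaps(record):
    """Return the required fields that are absent or empty on a usage record."""
    return [name for name in REQUIRED_FIELDS if record.get(name) in (None, "")]


record = {
    "model_name": "",
    "total_tokens": 1500,
    "session_id": "s-001",
    "brain_name": "Research Brain",
    "ai_employee_name": "Example Researcher",
    "source_inspections": 0,
}
print(find_visibility_gaps(record))
```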

Human Review Triggers

Human or HeadOffice review should be triggered when:

  • cost status is high or excessive
  • usage limit is reached
  • failed run cost is high
  • retry count is high
  • source inspection count is excessive
  • evaluator cost is excessive
  • a workflow becomes more expensive over time
  • an AI Employee cost trend increases sharply
  • a client account exceeds fair use
  • cost is unknown for a high-value workflow
  • quality is low but cost is high
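
Some of these triggers can be evaluated directly from a usage record. The sketch below covers four of them. The retry threshold, the "reached" value for usage_limit_status, and the high_value_workflow flag are assumptions for illustration.

```python
def review_reasons(record, retry_threshold=3):
    """List the reasons a usage record should go to human or HeadOffice review."""
    reasons = []
    cost_status = record.get("cost_status")
    if cost_status in ("High", "Excessive"):
        reasons.append("cost status is high or excessive")
    if record.get("usage_limit_status") == "reached":
        reasons.append("usage limit is reached")
    if record.get("retries", 0) >= retry_threshold:
        reasons.append("retry count is high")
    if cost_status == "Unknown" and record.get("high_value_workflow", False):
        reasons.append("cost is unknown for a high-value workflow")
    return reasons


# An empty list means no review trigger fired for this record.
print(review_reasons({"cost_status": "High", "retries": 5}))
```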

Drift Protection

This standard prevents the following drift:

  • hidden AI spend
  • uncontrolled Deep Search cost
  • expensive failed workflows
  • repeated retries without review
  • source inspection overuse
  • evaluator overuse
  • expensive models used for simple tasks
  • client-facing systems without margin awareness
  • Finance Brain lacking AI usage data
  • HeadOffice scaling workflows without cost evidence
  • AI Employees judged only on output quality, not efficiency

If MWMS cannot see usage, it cannot control cost.

If MWMS cannot control cost, it cannot scale safely.


Architectural Intent

The architectural intent of this standard is to make AI cost visible, governable, and improvable.

MWMS is building an AI-powered business operating system.

That system must be not only intelligent but also economically sustainable.

AI usage and cost visibility allow MWMS to decide:

  • what to automate
  • what to restrict
  • what to improve
  • what to price
  • what to scale
  • what to stop

This standard ensures AI Employees are judged not only by whether they work, but by whether they work at a cost MWMS can sustain.


Change Log

v1.0 Initial Draft

Created the MWMS AI Usage And Cost Visibility Standard based on absorbed insights from the final block of Matt Pocock AIhero Build DeepSearch In TypeScript.

Integrated principles from course sections covering:

  • showing usage in the frontend
  • token visibility
  • model action visibility
  • tool and source inspection usage
  • cost awareness
  • expensive Deep Search workflow control
  • evaluator cost tracking
  • usage visibility for operators
  • cost governance before scaling
  • future client-facing usage and pricing awareness

Established this standard as the MWMS governance page for tracking, displaying, reviewing, and controlling AI usage and cost across Brains, AI Employees, workflows, sessions, tools, and future client systems.