MWMS AI Observability Metadata Standard

System: MWMS
Document Type: Standard
Status: Draft For MCR
Authority: HeadOffice
Applies To: All MWMS Brains, AI Employees, AI Workflows, Brain Room, Dev Console, HeadOffice Intelligence, Research Workflows, Deep Search Workflows, Future Client Facing AI Systems
Primary Location: MCR
Future Operational Destination: mwmsbrain.site, mwmsheadofficebrain.site, AI Employee Dashboards, Future HeadOffice Trace Dashboards
Parent Page: HeadOffice
Source Of Truth: MCR
Related Framework: MWMS Deep Search Quality And Observability Framework
Course Source: Matt Pocock AIhero Build DeepSearch In TypeScript
Absorption Status: Approved For Integration


Purpose

The purpose of this standard is to define the metadata MWMS must capture whenever an AI Employee, Brain workflow, Deep Search process, automation, or AI powered system performs meaningful work.

Metadata is the business and operational context attached to an AI trace.

Without metadata, logs only show technical activity.

With metadata, HeadOffice can understand:

  • who triggered the work
  • which Brain handled it
  • which AI Employee acted
  • what task or workflow it belonged to
  • what tools were used
  • what sources were inspected
  • what the system cost
  • what failed
  • what decision was made
  • whether the output should be trusted
  • whether the process should be improved

This standard ensures MWMS AI activity is not hidden, disconnected, or impossible to audit.


Scope

This standard applies to all MWMS systems where AI performs, supports, records, analyses, or recommends work.

This includes:

  • Brain Room messages
  • Dev Console AI replies
  • HeadOffice Intelligence processing
  • Newsletter Intelligence
  • Research Brain workflows
  • Affiliate Brain offer evaluations
  • Ads Brain campaign or compliance research
  • Content Brain research and production workflows
  • Data Brain validation processes
  • Experimentation Brain test analysis
  • Finance Brain cost or budget analysis
  • Future AI Employees
  • Future client facing AIBS systems
  • Future Deep Search agents
  • Future WordPress plugin AI actions
  • Future Supabase task execution flows
  • Future Make.com or n8n AI automations

This standard does not define an exact database implementation.

It defines what must be captured conceptually so M, future developers, and future AI Employees can build consistent trace records.


Core Rule

Every meaningful AI action must carry enough metadata for HeadOffice to answer:

What happened, who caused it, which system handled it, what it affected, what it cost, what evidence supported it, what decision resulted, and whether it needs review?

If an AI output cannot be linked to its metadata, it should not be treated as a trusted MWMS record.

No orphaned AI work.

No anonymous production traces.

No important AI decision without context.


Definition Of Observability Metadata

Observability metadata is the structured context attached to an AI workflow trace.

It is not the full content of the AI answer.

It is the information around the answer that explains:

  • identity
  • ownership
  • system location
  • workflow purpose
  • tools used
  • source records used
  • model behaviour
  • cost
  • latency
  • confidence
  • review status
  • business outcome
  • improvement notes

A normal log might say:

Tool call completed.

An MWMS metadata rich trace should say:

Research Brain used the Web Search tool during Affiliate Offer Evaluation Task 184 for Offer X, triggered by Martyn, using GPT model Y, inspected three sources, cost $0.07, completed successfully, confidence 82 percent, routed to Affiliate Brain for review.

That is the difference between technical logging and HeadOffice observability.


Metadata Categories

MWMS observability metadata is divided into nine categories:

  1. Identity Metadata
  2. System Metadata
  3. Workflow Metadata
  4. AI Model Metadata
  5. Tool Metadata
  6. Source And Evidence Metadata
  7. Database And Record Metadata
  8. Performance And Cost Metadata
  9. Review And Decision Metadata

Each category exists so MWMS can inspect, govern, troubleshoot, and improve AI work.


1. Identity Metadata

Identity metadata defines who or what triggered the AI workflow.

Recommended fields:

Field | Description
user_id | Internal user ID where available
user_name | Human readable operator name
user_role | Example: Martyn, M, Admin, Developer, Client, Future Employee
client_id | Client account ID if relevant
client_name | Client name if relevant
organisation_id | Future organisation or account grouping
session_id | Current user session if available
request_origin | Where the request came from

Request Origin Examples

  • Brain Room
  • Dev Console
  • HeadOffice Dashboard
  • Newsletter Intelligence
  • Make.com scenario
  • WordPress admin screen
  • Supabase task queue
  • AI Employee automation
  • Manual user prompt
  • Future client portal
  • Future API request

Rule

Every AI trace should identify the human, system, or automation that triggered it wherever possible.

If the user is unknown, the trace must clearly mark the origin as anonymous, system generated, scheduled, or external.
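
As a minimal sketch of the rule above, the helper below fills in identity metadata and explicitly marks the trigger when no user can be identified. The names `IdentityInput`, `IdentityMetadata`, `resolveIdentity`, the `unknown_origin_kind` field, and the "Unspecified" placeholder are hypothetical and are not defined by this standard.

```typescript
// Hypothetical sketch: type and function names are illustrative only.
type UnknownOriginKind = "anonymous" | "system generated" | "scheduled" | "external";

interface IdentityInput {
  user_id?: string;
  user_name?: string;
  request_origin?: string;
}

interface IdentityMetadata {
  user_id: string | null;
  user_name: string | null;
  request_origin: string;
  // Assumed extra field: populated only when no user could be identified.
  unknown_origin_kind: UnknownOriginKind | null;
}

// Never leaves the trigger blank: an unidentified user is explicitly marked.
function resolveIdentity(input: IdentityInput, fallback: UnknownOriginKind): IdentityMetadata {
  const hasUser = Boolean(input.user_id || input.user_name);
  return {
    user_id: input.user_id ?? null,
    user_name: input.user_name ?? null,
    request_origin: input.request_origin ?? "Unspecified",
    unknown_origin_kind: hasUser ? null : fallback,
  };
}

// A scheduled automation with no human operator attached.
const scheduledRun = resolveIdentity({ request_origin: "Make.com scenario" }, "scheduled");
console.log(scheduledRun.unknown_origin_kind);
```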


2. System Metadata

System metadata defines where the AI work happened inside MWMS.

Recommended fields:

Field | Description
brain_name | The Brain responsible for the work
brain_site | Site where the work occurred
system_layer | MCR, Brain Site, HeadOffice, Plugin, Automation, External Tool
ai_employee_name | Name of the AI Employee or agent
ai_employee_role | Role or function of the AI Employee
module_name | Specific module or workflow area
environment | Local, test, staging, production
source_of_truth | Where the canonical record belongs
parent_system | Higher level system that owns the workflow

Brain Name Examples

  • HeadOffice Brain
  • Research Brain
  • Affiliate Brain
  • Ads Brain
  • Content Brain
  • Data Brain
  • Experimentation Brain
  • Finance Brain
  • Operations Brain
  • Risk Brain
  • Strategy Brain

Rule

Every trace must make clear which Brain or system owns the action.

If ownership is unclear, the trace should be routed for HeadOffice review.


3. Workflow Metadata

Workflow metadata defines what type of work was being performed.

Recommended fields:

Field | Description
workflow_type | The category of workflow
workflow_name | Specific workflow name
task_id | Related task ID
thread_id | Related conversation or Brain Room thread
run_id | Unique workflow run ID
parent_record_id | Parent offer, experiment, newsletter, report, or source record
workflow_stage | Current stage of the process
trigger_type | Manual, automatic, scheduled, event based, routed
priority | Low, medium, high, urgent
urgency | Monitor, test, act now, emergency
status | Pending, running, completed, failed, routed, parked, rejected
retry_count | Number of attempts
escalation_required | Yes or no

Workflow Type Examples

  • Newsletter extraction
  • Offer evaluation
  • Deep Search research
  • Source inspection
  • Compliance review
  • Campaign analysis
  • Experiment analysis
  • Task execution
  • Brain Room reply
  • Dev Console support
  • Content research
  • Report generation
  • Data validation
  • Cost analysis

Rule

Every meaningful AI action must belong to a workflow type.

If the system cannot classify the workflow, it should assign:

workflow_type: Unclassified AI Work

and route it for HeadOffice review.
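
The fallback above could be sketched as follows. The workflow type list is taken from the examples in this section; the function name `classifyWorkflow` and the use of `escalation_required` as the flag that routes a trace for HeadOffice review are assumptions made for illustration.

```typescript
// Workflow types listed in this standard; the helper itself is a hypothetical sketch.
const KNOWN_WORKFLOW_TYPES: string[] = [
  "Newsletter extraction", "Offer evaluation", "Deep Search research",
  "Source inspection", "Compliance review", "Campaign analysis",
  "Experiment analysis", "Task execution", "Brain Room reply",
  "Dev Console support", "Content research", "Report generation",
  "Data validation", "Cost analysis",
];

const UNCLASSIFIED = "Unclassified AI Work";

interface WorkflowClassification {
  workflow_type: string;
  // Assumption: this flag is what routes the trace for HeadOffice review.
  escalation_required: boolean;
}

// Matches case-insensitively; anything unrecognised falls back to the unclassified label.
function classifyWorkflow(candidate: string | undefined): WorkflowClassification {
  const wanted = (candidate ?? "").trim().toLowerCase();
  for (const known of KNOWN_WORKFLOW_TYPES) {
    if (known.toLowerCase() === wanted) {
      return { workflow_type: known, escalation_required: false };
    }
  }
  return { workflow_type: UNCLASSIFIED, escalation_required: true };
}

console.log(classifyWorkflow("offer evaluation"));
console.log(classifyWorkflow("something new"));
```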


4. AI Model Metadata

AI model metadata defines which model was used and how it behaved.

Recommended fields:

Field | Description
model_provider | OpenAI, Anthropic, Google, local model, other
model_name | Exact model name where available
model_role | Reasoning, extraction, summarisation, classification, tool calling
prompt_version | Version of the system or task prompt
system_prompt_id | ID or reference for the system prompt
temperature | Model temperature if available
max_tokens | Token limit if available
input_tokens | Input token count
output_tokens | Output token count
total_tokens | Total token count
model_cost_estimate | Estimated model cost
confidence_score | AI or system confidence score
output_format | Text, JSON, structured record, report, task, recommendation
tool_calling_enabled | Yes or no

Rule

Whenever an AI result affects a decision, workflow, source record, or task, the model used should be recorded.

If the exact model is unknown, the trace should mark:

model_name: Unknown

and this should be treated as a trace quality issue.


5. Tool Metadata

Tool metadata defines which tools the AI used.

Recommended fields:

Field | Description
tool_name | Name of the tool used
tool_type | Search, browser, scraper, database, file, email, calendar, analytics, ads, automation
tool_provider | Provider or internal system
tool_input_summary | Short summary of the input arguments
tool_output_summary | Short summary of returned result
tool_status | Pending, completed, failed, skipped
tool_error | Error message if failed
tool_latency_ms | Tool execution time
tool_cost_estimate | Cost if available
tool_authorised | Yes or no
tool_result_used | Whether result was used in final answer
tool_retry_count | Number of retries
tool_risk_level | Low, medium, high

Tool Type Examples

  • Web Search
  • Browser
  • Scraper
  • Crawler
  • Supabase query
  • WordPress REST call
  • Gmail
  • Google Calendar
  • Google Drive
  • Make.com scenario
  • n8n workflow
  • Ad platform API
  • Analytics platform
  • Future MCP tool

Rule

Every tool call should be visible and reviewable.

If an AI Employee used a tool that it was not authorised to use, the trace must be marked for review.
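
One way to apply this rule is to compare each recorded tool call against the AI Employee's allowed tool list and flag the trace when any call falls outside it. This is a sketch under stated assumptions: `reviewToolCalls` and the `allowedTools` parameter are hypothetical, and `human_review_required` is borrowed from the review metadata fields as the review marker.

```typescript
interface ToolCall {
  tool_name: string;
  tool_status: "Pending" | "Completed" | "Failed" | "Skipped";
}

interface ReviewedToolCall extends ToolCall {
  tool_authorised: boolean;
}

interface ToolReview {
  calls: ReviewedToolCall[];
  human_review_required: boolean;
}

// Stamps every call with tool_authorised and marks the trace for review
// if at least one call used a tool outside the allowed list.
function reviewToolCalls(calls: ToolCall[], allowedTools: string[]): ToolReview {
  const reviewed = calls.map((call) => ({
    ...call,
    tool_authorised: allowedTools.indexOf(call.tool_name) !== -1,
  }));
  return {
    calls: reviewed,
    human_review_required: reviewed.some((call) => !call.tool_authorised),
  };
}

const exampleReview = reviewToolCalls(
  [
    { tool_name: "Web Search", tool_status: "Completed" },
    { tool_name: "Gmail", tool_status: "Completed" },
  ],
  ["Web Search", "Supabase query"],
);
console.log(exampleReview.human_review_required);
```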


6. Source And Evidence Metadata

Source and evidence metadata defines what external information supported the AI output.

Recommended fields:

Field | Description
source_count | Number of sources found
inspected_source_count | Number of sources opened or inspected
source_ids | IDs of stored source records
source_urls | URLs where relevant
source_titles | Source titles where available
source_types | Official, expert, commercial, user generated, unknown
source_trust_rating | Low, medium, high
source_freshness_rating | Outdated, acceptable, current, unknown
publication_date | Source publication date if available
retrieved_at | When source was retrieved
extraction_status | Successful, partial, failed, blocked
evidence_summary | Short summary of evidence used
conflicting_sources_detected | Yes or no
source_used_in_final_output | Yes or no
citation_required | Yes or no
evidence_sufficiency | Weak, acceptable, strong

Source Type Examples

  • Official documentation
  • Government source
  • Vendor page
  • News article
  • Blog post
  • Research report
  • Affiliate sales page
  • Review page
  • Forum discussion
  • Reddit thread
  • Social media post
  • Unknown source

Rule

A Deep Search output should never be treated as high confidence if source metadata is missing, weak, outdated, or uninspected.


7. Database And Record Metadata

Database and record metadata defines what internal MWMS records were read, created, updated, or linked.

Recommended fields:

Field | Description
database_provider | Supabase, WordPress, Google Sheets, other
table_name | Table or storage location
record_id | Main record affected
parent_record_id | Parent object
read_record_ids | Records read
written_record_ids | Records created
updated_record_ids | Records updated
deleted_record_ids | Records deleted if applicable
event_log_id | Related event log
task_event_id | Related task event
queue_record_id | Related queue item
source_record_id | Related source record
output_record_id | Final saved output
db_operation_status | Successful, partial, failed
db_error | Error if failed
duplicate_detected | Yes or no
permission_status | Allowed, denied, unknown

Rule

Database activity matters because AI workflows often fail outside the model.

A trace that shows a good AI answer but failed storage, broken linkage, or missing task update is not a complete success.
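
To illustrate, the sketch below derives a run's overall status from more than the model output alone. The flags `model_output_valid`, `records_linked`, and `task_updated`, and the function `deriveRunOutcome`, are hypothetical names introduced for this example; only `db_operation_status`, `status`, and `failure_reason` come from the field lists in this standard.

```typescript
type DbOperationStatus = "Successful" | "Partial" | "Failed";

interface RunOutcomeInput {
  model_output_valid: boolean; // hypothetical flag: the AI answer itself was acceptable
  db_operation_status: DbOperationStatus;
  records_linked: boolean; // hypothetical flag: output is linked to its parent record
  task_updated: boolean; // hypothetical flag: the related task was updated
}

interface RunOutcome {
  status: "completed" | "failed";
  failure_reason: string;
}

// A run only counts as completed when the answer, storage, linkage,
// and task update all succeeded.
function deriveRunOutcome(input: RunOutcomeInput): RunOutcome {
  const problems: string[] = [];
  if (!input.model_output_valid) problems.push("invalid model output");
  if (input.db_operation_status !== "Successful") {
    problems.push("storage " + input.db_operation_status.toLowerCase());
  }
  if (!input.records_linked) problems.push("broken record linkage");
  if (!input.task_updated) problems.push("missing task update");
  if (problems.length === 0) {
    return { status: "completed", failure_reason: "" };
  }
  return { status: "failed", failure_reason: problems.join("; ") };
}

// A good answer that never reached the database is still a failed run.
console.log(deriveRunOutcome({
  model_output_valid: true,
  db_operation_status: "Failed",
  records_linked: true,
  task_updated: true,
}));
```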


8. Performance And Cost Metadata

Performance and cost metadata defines whether the AI workflow was efficient enough.

Recommended fields:

Field | Description
started_at | Workflow start time
completed_at | Workflow completion time
total_latency_ms | Total run time
model_latency_ms | Model call time
tool_latency_ms | Tool call time
crawler_latency_ms | Crawler or scraper time
database_latency_ms | Database operation time
total_cost_estimate | Estimated total cost
model_cost_estimate | Estimated model cost
tool_cost_estimate | Estimated tool cost
cost_per_query | Query cost where applicable
cost_per_user | User level cost where applicable
cost_per_successful_output | Cost divided by successful outputs
budget_group | Related budget or cost group
cost_status | Normal, high, excessive, unknown

Rule

A Deep Search AI Employee is not successful if it produces a useful answer at an unsustainable cost.

Cost and speed must be tracked early, not only after scaling.
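
As a worked example of `cost_per_successful_output`, the sketch below divides the total estimated cost of a set of runs by the number of successful outputs. It assumes the cost of failed runs stays in the numerator, which this standard does not state explicitly; the `succeeded` flag and the function name are also hypothetical.

```typescript
interface RunCost {
  total_cost_estimate: number; // estimated cost of one run, in dollars
  succeeded: boolean; // hypothetical flag: the run produced a usable output
}

// Returns null when there are no successful outputs, since the ratio is undefined.
// Assumption: failed runs still count towards total cost.
function costPerSuccessfulOutput(runs: RunCost[]): number | null {
  const totalCost = runs.reduce((sum, run) => sum + run.total_cost_estimate, 0);
  const successes = runs.filter((run) => run.succeeded).length;
  return successes === 0 ? null : totalCost / successes;
}

// Three runs costing $0.25, $0.25 and $0.50, of which two succeeded:
// total cost is $1.00, so each successful output effectively cost $0.50.
console.log(costPerSuccessfulOutput([
  { total_cost_estimate: 0.25, succeeded: true },
  { total_cost_estimate: 0.25, succeeded: false },
  { total_cost_estimate: 0.5, succeeded: true },
]));
```

Tracking this figure rather than raw cost per run makes expensive failure patterns visible early.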


9. Review And Decision Metadata

Review and decision metadata defines what happened after the AI produced output.

Recommended fields:

Field | Description
output_status | Draft, final, pending review, rejected, approved
decision_outcome | Proceed, reject, park, route, test, monitor, escalate
human_review_required | Yes or no
reviewed_by | Reviewer name or ID
reviewed_at | Review time
review_result | Approved, rejected, needs revision, more research needed
confidence_score | Confidence level
evaluation_score | Eval score if available
failure_reason | Why the output failed if applicable
risk_level | Low, medium, high
compliance_risk | Yes or no
financial_risk | Yes or no
business_impact | Low, medium, high
kaizen_note | Improvement note
next_action | Required next step
routed_to_brain | Brain receiving follow up
routed_to_employee | AI Employee receiving follow up

Rule

AI work is not complete until the result is one of the following:

  • approved
  • rejected
  • parked
  • routed
  • converted into a task
  • logged for Kaizen
  • stored as final intelligence
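
The completion rule above can be expressed as a simple membership check. This is a hedged sketch: the state labels mirror the list above, while `COMPLETION_STATES` and `isWorkComplete` are hypothetical names, and a real implementation would likely map these labels onto `output_status`, `decision_outcome`, and related fields.

```typescript
// Closing states taken from the rule above; the constant name is illustrative.
const COMPLETION_STATES: string[] = [
  "approved", "rejected", "parked", "routed",
  "converted into a task", "logged for Kaizen", "stored as final intelligence",
];

// Work with no recorded closing state, or an unrecognised one, is still open.
function isWorkComplete(resultState: string | null | undefined): boolean {
  if (!resultState) return false;
  return COMPLETION_STATES.indexOf(resultState) !== -1;
}

console.log(isWorkComplete("parked"));
console.log(isWorkComplete("draft"));
```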

Minimum Required Metadata

Not every workflow needs every field from day one.

However, every meaningful AI workflow should capture the following minimum metadata:

Field | Requirement
trace_id | Required
created_at | Required
brain_name | Required
workflow_type | Required
task_id or thread_id | Required where applicable
ai_employee_name | Required where applicable
user or request_origin | Required
model_name | Required where available
tools_used | Required where applicable
source_count | Required for research workflows
status | Required
confidence_score | Required where available
final_output_location | Required where output is stored
decision_outcome | Required where decision is made
review_status | Required where human review applies
kaizen_note | Recommended
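
A minimal validation sketch for the unconditional requirements in the table above is shown below. It assumes "user" means either `user_id` or `user_name`, and it leaves the "where applicable" and "where available" fields unchecked because their applicability depends on the workflow. The names `missingMinimumMetadata`, `isBlank`, and the `isResearchWorkflow` parameter are hypothetical.

```typescript
type TraceRecord = { [field: string]: unknown };

function isBlank(value: unknown): boolean {
  return value === undefined || value === null || value === "";
}

// Fields marked simply "Required" in the table above.
const ALWAYS_REQUIRED: string[] = [
  "trace_id", "created_at", "brain_name", "workflow_type", "status",
];

// Returns the names of missing minimum fields; an empty array means the minimum is met.
function missingMinimumMetadata(trace: TraceRecord, isResearchWorkflow: boolean): string[] {
  const missing = ALWAYS_REQUIRED.filter((field) => isBlank(trace[field]));
  // Assumption: "user" is satisfied by either user_id or user_name.
  const hasUser = !isBlank(trace["user_id"]) || !isBlank(trace["user_name"]);
  if (!hasUser && isBlank(trace["request_origin"])) {
    missing.push("user or request_origin");
  }
  // source_count is only required for research workflows.
  if (isResearchWorkflow && isBlank(trace["source_count"])) {
    missing.push("source_count");
  }
  return missing;
}

console.log(missingMinimumMetadata(
  { trace_id: "t-1", created_at: "2025-01-01T00:00:00Z", brain_name: "Research Brain" },
  true,
));
```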

Recommended Trace Record Structure

The following structure can be used as the conceptual record shape for future implementation.

{
  "trace_id": "",
  "created_at": "",
  "environment": "",
  "brain_name": "",
  "brain_site": "",
  "ai_employee_name": "",
  "workflow_type": "",
  "workflow_name": "",
  "task_id": "",
  "thread_id": "",
  "run_id": "",
  "parent_record_id": "",
  "user_name": "",
  "user_role": "",
  "request_origin": "",
  "priority": "",
  "urgency": "",
  "status": "",
  "model_provider": "",
  "model_name": "",
  "prompt_version": "",
  "input_tokens": 0,
  "output_tokens": 0,
  "total_tokens": 0,
  "tools_used": [],
  "source_ids": [],
  "source_count": 0,
  "inspected_source_count": 0,
  "source_trust_rating": "",
  "source_freshness_rating": "",
  "database_records_read": [],
  "database_records_written": [],
  "event_log_id": "",
  "started_at": "",
  "completed_at": "",
  "total_latency_ms": 0,
  "total_cost_estimate": 0,
  "confidence_score": 0,
  "evaluation_score": 0,
  "risk_level": "",
  "decision_outcome": "",
  "human_review_required": false,
  "review_status": "",
  "routed_to_brain": "",
  "final_output_location": "",
  "failure_reason": "",
  "kaizen_note": ""
}

This structure does not have to be implemented exactly as written.

It is the recommended conceptual model for consistent implementation.


Metadata Quality Levels

MWMS should classify trace quality.

Level 1: Basic Trace

Captures:

  • trace ID
  • Brain
  • workflow type
  • user or origin
  • model
  • status
  • final output location

Use case:

  • early testing
  • simple internal workflows
  • low risk actions

Level 2: Operational Trace

Captures Level 1 plus:

  • task or thread ID
  • AI Employee
  • tools used
  • source count
  • database records
  • confidence
  • cost estimate
  • latency
  • decision outcome

Use case:

  • normal MWMS AI Employee work
  • Brain Room
  • HeadOffice Intelligence
  • Research workflows
  • Affiliate evaluations

Level 3: Governance Trace

Captures Level 2 plus:

  • source trust
  • source freshness
  • inspected sources
  • evaluation score
  • risk level
  • review status
  • compliance risk
  • financial risk
  • Kaizen note
  • routing outcome

Use case:

  • high value decisions
  • campaign launch decisions
  • compliance sensitive work
  • client facing systems
  • budget decisions
  • Deep Search outputs

Level 4: Full Observability Trace

Captures Level 3 plus:

  • prompt version
  • token usage
  • model latency
  • tool latency
  • database latency
  • crawler latency
  • tool input and output summaries
  • database operation status
  • failure details
  • retry counts
  • full workflow chain

Use case:

  • production AI Employees
  • scaling systems
  • debugging
  • automated evaluation
  • advanced HeadOffice dashboards
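
The four levels above could be checked mechanically along the following lines. This is a sketch under several assumptions: the mapping from each bullet to a concrete field name is one possible reading, "captured" is taken to mean the field is present and not null, and "full workflow chain" is left out because this standard does not define a single field for it. The names `LEVEL_REQUIREMENTS`, `isCaptured`, and `traceQualityLevel` are hypothetical.

```typescript
type Trace = { [field: string]: unknown };

// One entry per level. Each inner array lists alternative fields;
// any one of them satisfies that requirement.
const LEVEL_REQUIREMENTS: string[][][] = [
  // Level 1: Basic Trace
  [["trace_id"], ["brain_name"], ["workflow_type"],
   ["user_id", "user_name", "request_origin"],
   ["model_name"], ["status"], ["final_output_location"]],
  // Level 2: Operational Trace
  [["task_id", "thread_id"], ["ai_employee_name"], ["tools_used"], ["source_count"],
   ["database_records_read", "database_records_written"], ["confidence_score"],
   ["total_cost_estimate"], ["total_latency_ms"], ["decision_outcome"]],
  // Level 3: Governance Trace
  [["source_trust_rating"], ["source_freshness_rating"], ["inspected_source_count"],
   ["evaluation_score"], ["risk_level"], ["review_status"], ["compliance_risk"],
   ["financial_risk"], ["kaizen_note"], ["routed_to_brain"]],
  // Level 4: Full Observability Trace ("full workflow chain" omitted, see lead-in)
  [["prompt_version"], ["total_tokens"], ["model_latency_ms"], ["tool_latency_ms"],
   ["database_latency_ms"], ["crawler_latency_ms"], ["tool_input_summary"],
   ["tool_output_summary"], ["db_operation_status"], ["failure_reason"], ["retry_count"]],
];

// Assumption: a field counts as captured when it is present and not null.
function isCaptured(trace: Trace, field: string): boolean {
  return trace[field] !== undefined && trace[field] !== null;
}

// Returns 0 when even Level 1 is not met, otherwise the highest level
// whose requirements, and those of every level below it, are all satisfied.
function traceQualityLevel(trace: Trace): number {
  let level = 0;
  for (const requirements of LEVEL_REQUIREMENTS) {
    const met = requirements.every((alternatives) =>
      alternatives.some((field) => isCaptured(trace, field)));
    if (!met) break;
    level += 1;
  }
  return level;
}

console.log(traceQualityLevel({
  trace_id: "t-1", brain_name: "Research Brain", workflow_type: "Source inspection",
  request_origin: "Brain Room", model_name: "Unknown", status: "completed",
  final_output_location: "MCR",
}));
```

Because the levels are cumulative, a trace that carries Level 3 fields but is missing a Level 1 field is still scored below Level 1.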

AI Employee Metadata Requirements

Each AI Employee should eventually have a metadata profile defining what it must capture.

The profile should include:

  • Employee name
  • owning Brain
  • allowed workflows
  • required trace level
  • allowed tools
  • required source metadata
  • required cost metadata
  • required review metadata
  • escalation rules
  • Kaizen logging rules

Example:

AI Employee | Required Trace Level
Newsletter Intelligence Extractor | Level 2 or 3
Research Brain Source Analyst | Level 3
Affiliate Offer Evaluator | Level 3
Ads Compliance Reviewer | Level 3 or 4
HeadOffice Decision Assistant | Level 3
Dev Console Helper | Level 2
Future Client Facing AI Agent | Level 4

Workflow Metadata Requirements

Different workflows require different metadata depth.

Workflow Type | Minimum Trace Level
Simple Brain Room reply | Level 1
Dev Console support reply | Level 2
Newsletter extraction | Level 2
Newsletter routing decision | Level 3
Offer evaluation | Level 3
Deep Search research | Level 3
Compliance review | Level 4
Budget recommendation | Level 4
Campaign launch recommendation | Level 4
Client facing AI output | Level 4
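
The minimums above lend themselves to a lookup table. In the sketch below, the level numbers come from the table, while `MIN_TRACE_LEVEL`, `meetsMinimumTraceLevel`, and the choice to treat an unlisted workflow type as requiring Level 4 are assumptions made for illustration.

```typescript
// Minimum trace level per workflow type, as listed in the table above.
const MIN_TRACE_LEVEL: { [workflowType: string]: number } = {
  "Simple Brain Room reply": 1,
  "Dev Console support reply": 2,
  "Newsletter extraction": 2,
  "Newsletter routing decision": 3,
  "Offer evaluation": 3,
  "Deep Search research": 3,
  "Compliance review": 4,
  "Budget recommendation": 4,
  "Campaign launch recommendation": 4,
  "Client facing AI output": 4,
};

// Assumption: workflow types not in the table are held to the strictest level.
function meetsMinimumTraceLevel(workflowType: string, actualLevel: number): boolean {
  const listed = MIN_TRACE_LEVEL[workflowType];
  const required = listed === undefined ? 4 : listed;
  return actualLevel >= required;
}

console.log(meetsMinimumTraceLevel("Offer evaluation", 2));
console.log(meetsMinimumTraceLevel("Newsletter extraction", 2));
```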

Confidence Metadata Rule

Confidence must not be treated as a feeling.

Confidence should be based on:

  • source quality
  • source freshness
  • evidence sufficiency
  • tool success
  • database success
  • model certainty
  • conflict detection
  • eval result
  • review status

A high confidence score should not be allowed when:

  • no sources were inspected
  • source freshness is unknown
  • tools failed
  • database writes failed
  • evidence was weak
  • sources conflicted
  • the output contains unsupported claims
  • the workflow has compliance risk
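
The blocking conditions above could be enforced by capping the score whenever any of them applies. This is a sketch, not a prescribed scoring method: it assumes a 0 to 100 confidence scale (matching the "82 percent" example earlier in this standard), and the cap value of 70, the flag names `tools_failed`, `database_writes_failed`, and `unsupported_claims`, and both function names are hypothetical.

```typescript
interface ConfidenceEvidence {
  inspected_source_count: number;
  source_freshness_rating: "Outdated" | "Acceptable" | "Current" | "Unknown";
  evidence_sufficiency: "Weak" | "Acceptable" | "Strong";
  conflicting_sources_detected: boolean;
  compliance_risk: boolean;
  tools_failed: boolean; // hypothetical flag
  database_writes_failed: boolean; // hypothetical flag
  unsupported_claims: boolean; // hypothetical flag
}

// Lists every condition from the rule above that currently applies.
function highConfidenceBlockers(e: ConfidenceEvidence): string[] {
  const blockers: string[] = [];
  if (e.inspected_source_count === 0) blockers.push("no sources were inspected");
  if (e.source_freshness_rating === "Unknown") blockers.push("source freshness is unknown");
  if (e.tools_failed) blockers.push("tools failed");
  if (e.database_writes_failed) blockers.push("database writes failed");
  if (e.evidence_sufficiency === "Weak") blockers.push("evidence was weak");
  if (e.conflicting_sources_detected) blockers.push("sources conflicted");
  if (e.unsupported_claims) blockers.push("output contains unsupported claims");
  if (e.compliance_risk) blockers.push("workflow has compliance risk");
  return blockers;
}

// Assumption: scores run from 0 to 100 and the cap is a policy parameter.
function capConfidence(score: number, e: ConfidenceEvidence, capWhenBlocked: number = 70): number {
  return highConfidenceBlockers(e).length > 0 ? Math.min(score, capWhenBlocked) : score;
}

const weakEvidence: ConfidenceEvidence = {
  inspected_source_count: 0,
  source_freshness_rating: "Unknown",
  evidence_sufficiency: "Acceptable",
  conflicting_sources_detected: false,
  compliance_risk: false,
  tools_failed: false,
  database_writes_failed: false,
  unsupported_claims: false,
};
console.log(capConfidence(95, weakEvidence), highConfidenceBlockers(weakEvidence));
```

Returning the list of blockers alongside the capped score keeps the reason for the downgrade visible in the trace.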

Cost Metadata Rule

Cost must be visible before scaling.

Each Brain or AI Employee should eventually report:

  • daily cost
  • weekly cost
  • monthly cost
  • cost per workflow type
  • cost per successful output
  • cost per failed output
  • cost per client
  • cost per AI Employee
  • cost by model
  • cost by tool
  • cost by source inspection

HeadOffice must be able to see if an AI Employee is becoming expensive before it becomes a business problem.
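
Most of the reports listed above reduce to grouping trace costs by one metadata field. The sketch below shows that grouping for three of the dimensions; `CostEntry`, `CostDimension`, and `sumCostBy` are hypothetical names, and time based reports (daily, weekly, monthly) would additionally need to bucket by `created_at`.

```typescript
interface CostEntry {
  ai_employee_name: string;
  workflow_type: string;
  model_name: string;
  total_cost_estimate: number;
}

type CostDimension = "ai_employee_name" | "workflow_type" | "model_name";

// Sums total_cost_estimate for each distinct value of the chosen dimension.
function sumCostBy(entries: CostEntry[], dimension: CostDimension): { [group: string]: number } {
  const totals: { [group: string]: number } = {};
  for (const entry of entries) {
    const group = entry[dimension];
    totals[group] = (totals[group] ?? 0) + entry.total_cost_estimate;
  }
  return totals;
}

// Illustrative data only; the model names are placeholders.
const exampleEntries: CostEntry[] = [
  { ai_employee_name: "Affiliate Offer Evaluator", workflow_type: "Offer evaluation", model_name: "model-a", total_cost_estimate: 0.25 },
  { ai_employee_name: "Affiliate Offer Evaluator", workflow_type: "Offer evaluation", model_name: "model-b", total_cost_estimate: 0.5 },
  { ai_employee_name: "Dev Console Helper", workflow_type: "Dev Console support", model_name: "model-a", total_cost_estimate: 0.125 },
];
console.log(sumCostBy(exampleEntries, "ai_employee_name"));
console.log(sumCostBy(exampleEntries, "model_name"));
```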


Failure Metadata Rule

Failures must be recorded clearly.

Failure metadata should record where the failure happened and what type of failure it was:

  • model failure
  • tool failure
  • crawler failure
  • database failure
  • permission failure
  • source failure
  • timeout
  • rate limit
  • invalid output
  • failed validation
  • human rejection
  • routing failure
  • unknown failure

Failures should not disappear inside final answers.

A failed workflow with a confident answer is dangerous.
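
A structured failure record keeps failures attached to the trace rather than buried in the answer text. In the sketch below, the failure types come from the list above; the `failure_location` field, the `failures` array, the threshold of 80 on a 0 to 100 scale, and the function name are assumptions made for this example.

```typescript
type FailureType =
  | "model failure" | "tool failure" | "crawler failure" | "database failure"
  | "permission failure" | "source failure" | "timeout" | "rate limit"
  | "invalid output" | "failed validation" | "human rejection"
  | "routing failure" | "unknown failure";

interface FailureRecord {
  failure_location: string; // hypothetical field: where in the workflow it happened
  failure_type: FailureType;
  failure_reason: string;
}

interface TraceSummary {
  confidence_score: number; // assumed 0 to 100 scale
  failures: FailureRecord[];
}

// Flags the dangerous case: at least one recorded failure alongside a confident answer.
function isConfidentDespiteFailure(trace: TraceSummary, highConfidence: number = 80): boolean {
  return trace.failures.length > 0 && trace.confidence_score >= highConfidence;
}

console.log(isConfidentDespiteFailure({
  confidence_score: 90,
  failures: [
    { failure_location: "source inspection", failure_type: "timeout", failure_reason: "crawler timed out" },
  ],
}));
```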


Review Metadata Rule

Human review metadata must be attached where decisions matter.

Human review is required for:

  • high risk outputs
  • compliance sensitive outputs
  • financial recommendations
  • campaign launch decisions
  • offer approval decisions
  • client facing outputs
  • low confidence outputs
  • failed or partial traces
  • weak source evidence
  • expensive workflow runs
  • repeated AI Employee failure

Review metadata should show:

  • who reviewed it
  • when it was reviewed
  • what decision was made
  • what changed
  • what next action was created
  • whether Kaizen was required
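
The review triggers above can be collected into a single check that returns the reasons a human must look at the output. This is a sketch under stated assumptions: the thresholds in `DEFAULT_POLICY`, the flags `client_facing` and `launch_or_offer_decision`, the counter `recent_failure_count`, and the mapping of `financial_risk` to "financial recommendation" are all hypothetical choices, not definitions from this standard.

```typescript
interface ReviewInputs {
  risk_level: "Low" | "Medium" | "High";
  compliance_risk: boolean;
  financial_risk: boolean;
  launch_or_offer_decision: boolean; // hypothetical flag
  client_facing: boolean; // hypothetical flag
  confidence_score: number; // assumed 0 to 100 scale
  status: string;
  db_operation_status: "Successful" | "Partial" | "Failed";
  evidence_sufficiency: "Weak" | "Acceptable" | "Strong";
  cost_status: "Normal" | "High" | "Excessive" | "Unknown";
  recent_failure_count: number; // hypothetical counter
}

interface ReviewPolicy {
  lowConfidenceBelow: number;
  repeatedFailureCount: number;
}

// Assumed thresholds; real values would be set by HeadOffice.
const DEFAULT_POLICY: ReviewPolicy = { lowConfidenceBelow: 60, repeatedFailureCount: 3 };

// An empty result means no review trigger from the list above applies.
function humanReviewReasons(t: ReviewInputs, policy: ReviewPolicy = DEFAULT_POLICY): string[] {
  const reasons: string[] = [];
  if (t.risk_level === "High") reasons.push("high risk output");
  if (t.compliance_risk) reasons.push("compliance sensitive output");
  if (t.financial_risk) reasons.push("financial recommendation");
  if (t.launch_or_offer_decision) reasons.push("campaign launch or offer approval decision");
  if (t.client_facing) reasons.push("client facing output");
  if (t.confidence_score < policy.lowConfidenceBelow) reasons.push("low confidence output");
  if (t.status === "failed" || t.db_operation_status !== "Successful") {
    reasons.push("failed or partial trace");
  }
  if (t.evidence_sufficiency === "Weak") reasons.push("weak source evidence");
  if (t.cost_status === "High" || t.cost_status === "Excessive") {
    reasons.push("expensive workflow run");
  }
  if (t.recent_failure_count >= policy.repeatedFailureCount) {
    reasons.push("repeated AI Employee failure");
  }
  return reasons;
}

const exampleInputs: ReviewInputs = {
  risk_level: "Low", compliance_risk: false, financial_risk: false,
  launch_or_offer_decision: false, client_facing: true, confidence_score: 55,
  status: "completed", db_operation_status: "Successful",
  evidence_sufficiency: "Acceptable", cost_status: "Normal", recent_failure_count: 0,
};
console.log(humanReviewReasons(exampleInputs));
```

The returned reasons can be stored with the trace so the reviewer sees why the output was routed to them.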

Kaizen Metadata Rule

Every meaningful AI workflow should leave behind improvement data.

Kaizen metadata should include:

  • what worked
  • what failed
  • what was unclear
  • what was too expensive
  • what was too slow
  • what source was weak
  • what prompt needs improvement
  • what tool needs improvement
  • what workflow needs improvement
  • whether the AI Employee needs retraining, restriction, or promotion

This supports the MWMS Kaizen loop:

Reflect
→ Reduce
→ Refine
→ Record


Security And Privacy Rule

Metadata must not expose sensitive information unnecessarily.

MWMS should avoid storing unnecessary:

  • passwords
  • API keys
  • private keys
  • full payment details
  • private personal data
  • excessive client confidential data
  • private health or legal information unless required and authorised

Where sensitive data is involved, metadata should use:

  • IDs instead of raw values
  • summaries instead of full text
  • redaction where appropriate
  • permission controlled access
  • minimal retention where possible

Observability must improve governance without creating a new privacy or security risk.
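
As a minimal sketch of redaction before storage, the helper below replaces the values of known sensitive fields with a placeholder. The field name list and the "[REDACTED]" marker are assumptions made for illustration, and the check is shallow: nested objects and sensitive values embedded in free text would need separate handling.

```typescript
type Metadata = { [field: string]: unknown };

// Assumed list of sensitive field names; a real deployment would maintain its own.
const SENSITIVE_FIELDS: string[] = ["password", "api_key", "private_key", "card_number"];

// Returns a copy with sensitive values replaced; the input object is not modified.
// Only top-level field names are checked (shallow redaction).
function redactMetadata(metadata: Metadata): Metadata {
  const safe: Metadata = {};
  for (const field of Object.keys(metadata)) {
    const sensitive = SENSITIVE_FIELDS.indexOf(field.toLowerCase()) !== -1;
    safe[field] = sensitive ? "[REDACTED]" : metadata[field];
  }
  return safe;
}

console.log(redactMetadata({
  trace_id: "t-1",
  client_id: "c-42",
  api_key: "example-value",
}));
```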


Drift Protection

This standard prevents the following drift:

  • AI outputs with no owner
  • AI decisions with no trace
  • tool calls hidden from HeadOffice
  • database failures being ignored
  • cost growth becoming invisible
  • AI Employees acting without workflow classification
  • source based decisions with no source metadata
  • business decisions with only technical logs
  • high confidence outputs with weak evidence
  • human review not being recorded
  • Kaizen improvements being lost
  • future developers inventing inconsistent trace structures

If metadata is missing, the trace quality must be downgraded.

If trace quality is too low for the decision type, the output must be reviewed before being trusted.


Relationship To Other MWMS Standards

This standard supports and should align with:

  • MWMS Deep Search Quality And Observability Framework
  • MWMS AI Agent Operations Core
  • MWMS AI Tool Permission And Access Framework
  • MWMS AI Agent Deployment Readiness Checklist
  • MWMS AI Workflow Pipeline Standard
  • MWMS AI Schema And Decision Ready Output Framework
  • MWMS AI Output Validation Standard
  • MWMS Agentic Reporting Standard
  • MWMS Supabase Event Schema
  • MWMS Brain Room Architecture
  • HeadOffice Operational Intelligence Framework
  • HeadOffice Newsletter Intelligence Operating Protocol
  • Research Brain Source Evaluation Framework
  • Data Brain Measurement Integrity Framework
  • Experimentation Brain Canon
  • MWMS Kaizen Continuous Improvement Loop
  • MWMS System Change Log

This standard does not replace those pages.

It defines the metadata layer needed to make those systems observable and governable.


Implementation Notes For M And Future Developers

This page is a governance standard, not a code specification.

Developers do not need to implement every metadata field immediately.

The correct build approach is:

  1. Start with minimum required metadata.
  2. Add task and thread linkage.
  3. Add Brain and AI Employee ownership.
  4. Add model and tool metadata.
  5. Add source and database metadata for research workflows.
  6. Add cost and latency metadata.
  7. Add review and decision metadata.
  8. Add Kaizen metadata.
  9. Later, build dashboards from the trace data.

Do not block development waiting for perfect observability.

But do not build AI Employees that cannot be traced.


Minimum Starting Implementation

For current MWMS systems, the first practical metadata fields should be:

  • trace_id
  • created_at
  • brain_name
  • ai_employee_name
  • workflow_type
  • task_id
  • thread_id
  • request_origin
  • model_name
  • tools_used
  • status
  • confidence_score
  • final_output_location
  • decision_outcome
  • review_status
  • kaizen_note

This is enough to start creating useful HeadOffice visibility while leaving room for deeper observability later.


Future Enhancements

Future pages or system upgrades may include:

  • MWMS Deep Search Source Record Standard
  • MWMS AI Employee Evaluation Scorecard Standard
  • MWMS AI Trace Dashboard Specification
  • MWMS HeadOffice AI Observability Dashboard Specification
  • MWMS AI Employee Cost Reporting Standard
  • MWMS AI Tool Call Trace Schema
  • MWMS Source Evidence Record Schema
  • MWMS AI Employee Failure Pattern Registry
  • MWMS AI Employee Promotion And Restriction Standard
  • MWMS Client Facing AI Trace Visibility Standard

Architectural Intent

The architectural intent of this standard is to ensure MWMS AI work becomes visible, measurable, reviewable, and improvable.

MWMS is not building simple AI chat windows.

MWMS is building a business operating system with AI Employees.

AI Employees must not act like invisible assistants.

They must act like accountable workers whose activity can be reviewed by HeadOffice.

The metadata layer is what turns AI activity into operational intelligence.

Without metadata, MWMS has conversations.

With metadata, MWMS has a governable AI workforce.


Change Log

v1.0 Initial Draft

Created the MWMS AI Observability Metadata Standard as a companion page to the MWMS Deep Search Quality And Observability Framework.

Defined the required metadata categories for MWMS AI traces:

  • identity metadata
  • system metadata
  • workflow metadata
  • model metadata
  • tool metadata
  • source and evidence metadata
  • database and record metadata
  • performance and cost metadata
  • review and decision metadata

Established metadata quality levels from basic trace through full observability trace.

Defined minimum required metadata for current MWMS implementation and future expansion paths for AI Employee dashboards, HeadOffice observability, source records, evaluation scorecards, and Kaizen improvement loops.