System: MWMS
Document Type: Framework
Authority Level: MCR Source Of Truth
Status: Draft For MCR
Primary Location: MCR
Future Operational Destination: HeadOffice Brain, MWMS Brain, Brain Room, Newsletter Intelligence, Course Absorption System, Research Brain, Content Brain, AI Business Systems Brain
Parent Page: HeadOffice
Owner: Martyn
Developer Boundary: Do Not Touch M’s Active Build Areas Unless Specifically Assigned
Source Of Truth: MCR

Purpose

This document defines the MWMS Messy Input Normalization Framework.

This framework establishes how MWMS converts messy, inconsistent, incomplete, noisy, or unstructured inputs into clean, structured, usable intelligence.

MWMS receives information from many sources.

These sources are rarely clean.

They may include newsletters, course transcripts, PDFs, copied web pages, screenshots, sales pages, affiliate offers, spreadsheets, emails, Brain Room messages, Supabase records, Make.com outputs, Google Ads data, WordPress page lists, developer notes, and client materials.

If these inputs are not cleaned and normalized before analysis, MWMS risks producing weak intelligence, wrong decisions, bad routing, poor reports, duplicate pages, and unreliable automation.

This framework exists to make sure raw input is converted into usable MWMS structure before it becomes operational intelligence.

Scope

This framework applies to all MWMS workflows where raw information must be processed before it can be analyzed, routed, saved, reported, or acted upon.

This includes:

Course Absorption
Newsletter Intelligence
Research Brain workflows
Content Brain workflows
Affiliate offer evaluation
Brain Room requests
Dev Console requests
Finance data review
Experimentation data review
Ads Brain campaign review
HeadOffice reporting
MCR page creation
Supabase task/event review
Future AIBS client workflows

This framework applies to both manual and automated intake.

Manual intake includes files, pasted text, screenshots, course notes, and user instructions.

Automated intake includes newsletters, emails, Supabase rows, Make.com outputs, future task records, dashboards, and client-system feeds.

Core Definition

Messy Input Normalization is the process of converting raw, noisy, unstructured, or inconsistent source material into clean, structured, classified, validated, and routable MWMS intelligence.

A messy input may be:

too long
incomplete
duplicated
badly formatted
full of filler
mixed with ads or footers
copied from a broken page
missing context
full of irrelevant sections
badly transcribed
spread across multiple files
inconsistent in field names
unclear in ownership
not ready for analysis

Normalization prepares the input for useful work.

The goal is not to summarize too early.

The goal is to clean, structure, classify, and prepare the source so the correct AI Employee or Brain can work on it.

Core Principle

The core principle of this framework is:

Dirty input creates dirty intelligence.

If MWMS analyzes poor input without cleaning it first, the output inherits the noise, gaps, and errors of the input.

If MWMS routes noisy input into the wrong Brain, the workflow becomes unreliable.

If MWMS stores messy input as if it were structured intelligence, dashboards and reports become cluttered.

Therefore, raw input must be normalized before it becomes trusted MWMS intelligence.

Why Messy Input Normalization Matters

MWMS is an intelligence system.

The quality of the system depends heavily on the quality of what enters it.

Weak input causes:

bad summaries
wrong Brain routing
poor decisions
duplicate page creation
missed insights
dashboard noise
incorrect priorities
false urgency
weak course absorption
poor offer evaluation
bad developer instructions
unreliable automation

Clean input creates:

better extraction
better analysis
better routing
stronger validation
clearer reporting
better handoffs
better learning
safer automation

Normalization is therefore a core operating layer, not a cleanup afterthought.

Normalization Workflow

The MWMS Messy Input Normalization Framework uses seven stages:

Capture
Extract
Clean
Structure
Classify
Validate
Route
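
The seven stages can be sketched as an ordered pipeline. This is a minimal illustration, not an MWMS implementation. The names Stage, Handler, run_pipeline, and passthrough are hypothetical, and the working record is assumed to be a plain dictionary.

```python
from enum import Enum
from typing import Callable, Dict, List


class Stage(Enum):
    """The seven normalization stages, in the order the framework lists them."""
    CAPTURE = 1
    EXTRACT = 2
    CLEAN = 3
    STRUCTURE = 4
    CLASSIFY = 5
    VALIDATE = 6
    ROUTE = 7


# Hypothetical handler shape: takes the working record, returns the updated record.
Handler = Callable[[dict], dict]


def run_pipeline(raw: str, handlers: Dict[Stage, Handler]) -> dict:
    """Apply one handler per stage, in order, and keep a trail of completed stages."""
    record: dict = {"raw": raw, "trail": []}
    for stage in Stage:  # Enum iteration follows definition order
        handler = handlers.get(stage)
        if handler is None:
            # No stage may be skipped silently.
            raise ValueError(f"no handler registered for stage {stage.name}")
        record = handler(record)
        record["trail"].append(stage.name)
    return record


def passthrough(record: dict) -> dict:
    """Placeholder handler that leaves the record unchanged."""
    return record


def demo_pipeline_trail() -> List[str]:
    """Run the pipeline with placeholder handlers and return the stage trail."""
    handlers = {stage: passthrough for stage in Stage}
    return run_pipeline("example input", handlers)["trail"]
```

The trail makes it possible to see later which stages an input actually passed through.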

1. Capture

Capture preserves the raw input and identifies where it came from.

The capture stage should record:

source type
source name
date received
original format
sender or origin where relevant
associated Brain or system
file name where relevant
task or workflow trigger
whether the source is complete or partial

Examples of source types include:

newsletter
course transcript
PDF
HTML file
screenshot
sales page
affiliate offer
Brain Room message
Supabase row
Google Sheet row
Make.com module output
developer note
WordPress page list
client process document

The Capture rule is:

MWMS must know what came in before it tries to interpret it.
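
A capture record could be sketched as follows. The field names mirror the capture list above. The class name CaptureRecord, the content hash, and the UTC timestamp default are assumptions added for illustration.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)  # frozen: the raw capture should not be edited after intake
class CaptureRecord:
    source_type: str          # e.g. "newsletter", "course transcript", "PDF"
    source_name: str
    original_format: str
    raw_content: str          # the untouched input, preserved as received
    is_complete: bool         # whether the source is complete or partial
    date_received: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
    origin: Optional[str] = None            # sender or origin where relevant
    associated_brain: Optional[str] = None  # associated Brain or system
    file_name: Optional[str] = None
    trigger: Optional[str] = None           # task or workflow trigger

    @property
    def content_hash(self) -> str:
        """SHA-256 of the raw content (an assumed way to spot duplicate inputs)."""
        return hashlib.sha256(self.raw_content.encode("utf-8")).hexdigest()
```

Keeping the raw content unchanged means later stages can always be re-run against the original.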

2. Extract

Extraction pulls the useful content out of the raw input.

Extraction may include:

extracting text from a PDF
reading a transcript
separating useful lesson content from timestamps
pulling body content from an email
extracting table fields
identifying key headings
copying relevant screenshot text
separating source data from surrounding clutter
pulling offer data from a sales page
extracting campaign metrics from pasted reports

The extraction stage should avoid interpretation too early.

The job is to pull out what is actually present.

The Extract rule is:

Extract first. Interpret second.

3. Clean

Cleaning removes clutter, noise, duplication, and irrelevant material.

Cleaning may include removing:

newsletter footers
unsubscribe text
repeated headers
navigation menus
broken HTML
unrelated ads
tracking links
excessive timestamps
repeated transcript fragments
duplicate sections
irrelevant platform boilerplate
empty lines
corrupted characters
sales page fluff where not useful
copied page clutter

Cleaning does not mean deleting useful evidence.

It means removing noise so the useful material can be processed correctly.

The Clean rule is:

Remove clutter without destroying meaning.
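
A cleaning pass might look like the sketch below. The noise patterns and the exact-duplicate rule are illustrative assumptions, not an MWMS rule set. Removed lines are returned alongside the cleaned text so that nothing disappears without a trace.

```python
import re
from typing import List, Tuple

# Assumed example patterns; a real list would come from observed noise.
NOISE_PATTERNS = [
    re.compile(r"unsubscribe", re.IGNORECASE),
    re.compile(r"view (this email )?in (your )?browser", re.IGNORECASE),
]


def clean_text(raw: str) -> Tuple[str, List[str]]:
    """Return (cleaned text, removed lines).

    Drops empty lines, lines matching a noise pattern, and exact repeats
    of a line already kept. Removed lines are reported for audit.
    """
    kept: List[str] = []
    removed: List[str] = []
    seen = set()
    for line in raw.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # empty lines carry no content
        if any(pattern.search(stripped) for pattern in NOISE_PATTERNS):
            removed.append(stripped)
            continue
        if stripped in seen:
            removed.append(stripped)  # exact duplicate of an earlier line
            continue
        seen.add(stripped)
        kept.append(stripped)
    return "\n".join(kept), removed
```

Reviewing the removed list is one way to check that cleaning has not destroyed meaning.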

4. Structure

Structure converts the cleaned input into a usable format.

Depending on the workflow, structure may include:

title
source
date
topic
key sections
extracted facts
claims
data points
entities
tools mentioned
workflows mentioned
risks mentioned
possible actions
impacted Brain
output type
priority marker
follow-up need

For course absorption, structured input may include:

course name
module
lesson title
core concept
frameworks
practical systems
tools mentioned
MWMS relevance
ignore/absorb/park decision

For newsletter intelligence, structured input may include:

headline
source
insight
underlying pattern
primary Brain
supporting Brains
recommended action
confidence
urgency
priority

For offer evaluation, structured input may include:

offer name
network
payout
niche
claim type
funnel type
traffic fit
compliance risk
market angle
test suitability

The Structure rule is:

Clean content must be converted into a format MWMS can use.

5. Classify

Classification identifies what the structured input is and where it belongs.

Classification should determine:

input type
owning Brain
supporting Brains
workflow type
risk level
urgency
priority
required AI Employee
validation level
likely destination

Classification examples:

A course transcript may classify as:

Course Absorption
HeadOffice Brain
AI Agent Operations Core
Medium risk
Human review required before MCR save

A newsletter may classify as:

Newsletter Intelligence
HeadOffice Brain
Primary Brain: Ads Brain
Action: Monitor or Test
Queue review required

An offer page may classify as:

Offer Evaluation
Affiliate Brain
Supporting Brains: Research, Ads, Finance, Experimentation
High risk
Human review required

The Classification rule is:

MWMS must know what the input is before deciding what to do with it.
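
Classification can begin as a simple lookup table. The sketch below encodes two of the examples above. The names Classification, RULES, and UNCLASSIFIED are hypothetical, and the fallback values for unknown input are an assumption.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Classification:
    workflow_type: str
    owning_brain: str
    supporting_brains: Tuple[str, ...]
    risk_level: str
    human_review_required: bool


# Two rules taken from the classification examples in this section.
RULES: Dict[str, Classification] = {
    "course transcript": Classification(
        "Course Absorption", "HeadOffice Brain", (), "Medium", True
    ),
    "offer page": Classification(
        "Offer Evaluation",
        "Affiliate Brain",
        ("Research", "Ads", "Finance", "Experimentation"),
        "High",
        True,
    ),
}

# Assumed fallback: unknown input stays unassigned and goes to a human.
UNCLASSIFIED = Classification("Unclassified", "Unassigned", (), "Unknown", True)


def classify(source_type: str) -> Classification:
    """Look up the classification for a source type, ignoring case and padding."""
    return RULES.get(source_type.strip().lower(), UNCLASSIFIED)
```

An explicit fallback avoids guessing an owner for input the rules do not cover.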

6. Validate

Validation checks whether the normalized input is good enough to proceed.

Validation should ask:

Is the source complete enough?
Is the content readable?
Was important context preserved?
Was noise removed safely?
Is the structure usable?
Is the classification sensible?
Is the owning Brain correct?
Is the risk level appropriate?
Is human review needed?
Is more input required?
Is the input too weak to process?

Validation may lead to:

proceed
request more input
clean again
classify again
park
reject
escalate

The Validate rule is:

Do not send weak normalized input into serious analysis.
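
The validation outcomes can be modeled as a fixed set of values. In the sketch below the Outcome values come from the list above. The Checks fields and the order in which they are evaluated are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    PROCEED = "proceed"
    REQUEST_MORE_INPUT = "request more input"
    CLEAN_AGAIN = "clean again"
    CLASSIFY_AGAIN = "classify again"
    PARK = "park"
    REJECT = "reject"
    ESCALATE = "escalate"


@dataclass
class Checks:
    """Hypothetical subset of the validation questions, answered as booleans."""
    readable: bool
    complete_enough: bool
    noise_removed: bool
    classification_sensible: bool
    needs_human_review: bool


def validate(checks: Checks) -> Outcome:
    """Map check results to one outcome. The precedence here is an assumption."""
    if not checks.readable:
        return Outcome.REJECT
    if not checks.complete_enough:
        return Outcome.REQUEST_MORE_INPUT
    if not checks.noise_removed:
        return Outcome.CLEAN_AGAIN
    if not checks.classification_sensible:
        return Outcome.CLASSIFY_AGAIN
    if checks.needs_human_review:
        return Outcome.ESCALATE
    return Outcome.PROCEED
```

PARK is defined but not produced by this sketch, because parking is usually a judgment call rather than a failed check.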

7. Route

Routing sends the normalized input to the correct next stage.

Possible destinations include:

Course Absorption Agent
Newsletter Signal Extraction Agent
Research Agent
Offer Evaluation Agent
Brain Room Task Builder Agent
HeadOffice Validation Agent
MCR page draft
Supabase task record
dashboard queue
human review queue
Parking System
archive

The Route rule is:

Normalized input must move to the correct destination or be deliberately parked.
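
The Route rule can be sketched as a lookup with a deliberate fallback. The destination names come from the list above. The pairing of workflow types with destinations, and the choice to send flagged input to human review first, are assumptions.

```python
from typing import Dict, Tuple

# Assumed pairing of workflow types with destinations from the list above.
ROUTES: Dict[str, str] = {
    "Course Absorption": "Course Absorption Agent",
    "Newsletter Intelligence": "Newsletter Signal Extraction Agent",
    "Offer Evaluation": "Offer Evaluation Agent",
    "Research": "Research Agent",
}


def route(workflow_type: str, human_review_required: bool) -> Tuple[str, str]:
    """Return (destination, reason). Input is never dropped silently."""
    if human_review_required:
        return "human review queue", "human review flagged"
    destination = ROUTES.get(workflow_type)
    if destination is None:
        # Deliberate parking, with the reason recorded.
        return "Parking System", f"no route defined for workflow type {workflow_type!r}"
    return destination, "matched routing table"
```

Returning a reason with every destination keeps the routing decision loggable.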

Messy Input Categories

MWMS should recognize different types of messy input.

1. Noisy Text Input

Examples:

newsletters with ads and footers
copied web pages
long emails
transcript exports
HTML text
social posts

Main risk:

useful signal is buried inside clutter.

Normalization focus:

remove noise
preserve meaning
extract key sections
classify insight value

2. Broken Format Input

Examples:

bad PDF extraction
strange line breaks
corrupted characters
missing headings
copied table data
poor transcript timing

Main risk:

AI misreads structure or merges unrelated sections.

Normalization focus:

reconstruct sections
preserve source meaning
clarify fields
mark uncertain areas

3. Overloaded Input

Examples:

massive course blocks
large reports
many newsletter items
multiple pasted documents
long sales pages

Main risk:

AI compresses too aggressively and misses important details.

Normalization focus:

split into sections
process in stages
extract by category
avoid premature summarization
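
Splitting overloaded input into sections can be sketched as below. This assumes paragraphs are separated by blank lines and that size is measured in characters. The function name and the default limit are hypothetical.

```python
from typing import List


def split_into_sections(text: str, max_chars: int = 2000) -> List[str]:
    """Group blank-line-separated paragraphs into sections of at most max_chars.

    Paragraphs are never cut. A single paragraph longer than max_chars
    becomes its own oversized section rather than being truncated.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    sections: List[str] = []
    current = ""
    for paragraph in paragraphs:
        candidate = paragraph if not current else current + "\n\n" + paragraph
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            sections.append(current)  # close the section before it overflows
            current = paragraph
    if current:
        sections.append(current)
    return sections
```

Each section can then be extracted and classified on its own, which avoids compressing the whole input in one pass.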

4. Incomplete Input

Examples:

partial screenshots
missing course files
incomplete offer data
broken email body
cut-off transcript
missing metrics

Main risk:

AI fills gaps with assumptions.

Normalization focus:

identify missing pieces
mark assumptions
request more input if required
avoid unsupported conclusions

5. Mixed-Purpose Input

Examples:

a newsletter containing tools, news, ads, offers, opinions, and compliance signals
a course lesson mixing tactics, philosophy, tool steps, and frameworks
a sales page containing claims, bonuses, scarcity, and testimonials

Main risk:

AI treats everything as equally important.

Normalization focus:

separate categories
classify business relevance
route each part properly

6. Operational Data Input

Examples:

Supabase rows
Google Ads stats
Finance numbers
experiment results
task logs
dashboard data

Main risk:

bad decisions from misunderstood fields.

Normalization focus:

preserve numeric accuracy
define fields
check missing data
separate data from interpretation
validate before decision-making
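
For operational data, field definition and numeric accuracy can be sketched as follows. The field names spend, clicks, and conversions, the alias table, and the number format (dollar sign, comma as thousands separator) are all assumptions for illustration. Decimal is used so values are not rounded the way binary floats can be.

```python
from decimal import Decimal, InvalidOperation
from typing import Dict, List, Optional, Tuple

# Hypothetical aliases mapping inconsistent column names to one defined field.
FIELD_ALIASES = {
    "cost": "spend",
    "spend": "spend",
    "clicks": "clicks",
    "conv": "conversions",
    "conversions": "conversions",
}
REQUIRED = ("spend", "clicks", "conversions")


def normalize_row(row: Dict[str, str]) -> Tuple[Dict[str, Optional[Decimal]], List[str]]:
    """Return (values by defined field, problems found). Nothing is guessed."""
    values: Dict[str, Optional[Decimal]] = {name: None for name in REQUIRED}
    problems: List[str] = []
    seen = set()
    for key, raw in row.items():
        name = FIELD_ALIASES.get(key.strip().lower())
        if name is None:
            problems.append(f"unknown field: {key}")
            continue
        seen.add(name)
        # Assumed format: optional leading "$", commas as thousands separators.
        text = raw.strip().replace(",", "").lstrip("$")
        try:
            values[name] = Decimal(text)
        except InvalidOperation:
            problems.append(f"unparseable value for {name}: {raw!r}")
    for name in REQUIRED:
        if name not in seen:
            problems.append(f"missing field: {name}")
    return values, problems
```

The problems list keeps data separate from interpretation: a missing metric is reported, not filled in.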

7. Development Input

Examples:

code snippets
plugin files
screenshots of errors
WordPress page lists
M’s notes
site status updates

Main risk:

vague or wrong developer instructions.

Normalization focus:

identify exact site
identify exact file or screen
identify current state
identify expected result
avoid assumptions
protect save point

Normalized Input Record

A normalized input record should include:

Input Title:
Source Type:
Source Name:
Date Received:
Original Format:
Workflow Type:
Owning Brain:
Supporting Brains:
Cleaned Content Summary:
Key Extracted Elements:
Known Gaps:
Risk Level:
Priority:
Required AI Employee:
Validation Level:
Next Destination:
Recommended Action:
Human Review Required:
Logging Required:

This record may be simplified for low-risk tasks.

It should not be skipped for high-value or high-risk workflows.
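
The record above could be represented as a typed structure. The sketch below mirrors the listed fields. The class name, the field types, and the defaults are assumptions, including defaulting human review and logging to required.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class NormalizedInputRecord:
    input_title: str
    source_type: str
    source_name: str
    date_received: str
    original_format: str
    workflow_type: str
    owning_brain: str
    supporting_brains: List[str] = field(default_factory=list)
    cleaned_content_summary: str = ""
    key_extracted_elements: List[str] = field(default_factory=list)
    known_gaps: List[str] = field(default_factory=list)
    risk_level: str = ""
    priority: str = ""
    required_ai_employee: str = ""
    validation_level: str = ""
    next_destination: str = ""
    recommended_action: str = ""
    human_review_required: bool = True  # assumed safe default
    logging_required: bool = True       # assumed safe default

    def empty_fields(self) -> List[str]:
        """Names of fields still left blank, useful before a high-risk handoff."""
        return [
            f.name
            for f in fields(self)
            if getattr(self, f.name) in ("", [], None)
        ]
```

For a low-risk task some fields may stay blank. For a high-value or high-risk workflow, empty_fields gives a quick list of what is still missing.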

Application To Course Absorption

Course absorption depends heavily on messy input normalization.

Course files may include:

PDFs
SRT transcripts
HTML descriptions
lesson outlines
screenshots
module notes
repeated course platform text

The course normalization process should be:

Capture lesson/module identity
Extract useful lesson content
Remove course platform noise
Separate concepts from tool steps
Identify frameworks, rules, workflows, and principles
Classify MWMS relevance
Decide absorb, park, ignore, or request more files
Route to Course Absorption Agent or Page Builder Agent

Course rule:

Do not absorb course content until it has been normalized and judged for MWMS value.

Application To Newsletter Intelligence

Newsletter input is often noisy.

Newsletters may contain:

headlines
commentary
ads
sponsored tools
links
repeated footers
unsubscribe blocks
weak news
strong market signals
tool updates
compliance warnings
monetization opportunities

The newsletter normalization process should be:

Capture sender, subject, date, body, snippet
Clean footer and ad clutter
Extract distinct news/tool/policy/business items
Separate generic news from MWMS-relevant signal
Classify primary Brain and supporting Brains
Assign an action label: ACT NOW, TEST, MONITOR, PARK, or REJECT
Validate usefulness and specificity
Route to queue, dashboard, or parking system

Newsletter rule:

A newsletter is not intelligence until useful signal has been separated from noise.

Application To Brain Room

Brain Room messages can be messy because they may contain:

quick notes
incomplete instructions
mixed topics
emotional context
developer references
task ideas
strategic questions
build concerns
pasted snippets

The Brain Room normalization process should be:

Capture message and thread context
Identify whether it is chat, decision, task, bug, idea, or handoff
Extract the actionable request
Identify owning Brain
Check developer boundary
Create Agentic Work Unit if needed
Route to AI Manager, human review, or parking

Brain Room rule:

Brain Room messages must be converted into structured work before they enter operational workflows.

Application To Offer Evaluation

Affiliate offers are messy because they include hype, claims, bonuses, vendor positioning, affiliate metrics, funnel promises, and incomplete data.

The offer normalization process should be:

Capture offer name, network, niche, payout, and source
Extract actual offer facts
Separate vendor claims from evidence
Identify claims, mechanism, audience, and funnel style
Identify missing data
Classify risk level and traffic fit
Route to Research Brain, Finance Brain, Ads Brain, Experimentation Brain, or rejection

Offer rule:

Vendor hype must be normalized into evidence before MWMS makes offer decisions.

Application To Research Brain

Research input is messy because sources differ in reliability, freshness, and bias.

Research input may contain:

facts
opinions
old data
biased claims
vendor claims
affiliate claims
unsupported estimates
conflicting evidence

Research normalization should:

Capture source and date
Identify source type
Extract claims
Separate evidence from opinion
Check freshness
Identify contradictions
Classify confidence
Route to analysis or validation

Research rule:

Evidence must be separated from opinion before research becomes decision support.
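
The freshness check in the research steps can be sketched as a date comparison. The function name and the 365-day default are assumptions. The right threshold depends on the topic and is not defined by this framework.

```python
from datetime import date


def is_fresh(source_date: date, as_of: date, max_age_days: int = 365) -> bool:
    """Return True if the source is no older than max_age_days on the as_of date."""
    if source_date > as_of:
        # A source dated in the future points to a capture error.
        raise ValueError("source_date is after as_of")
    return (as_of - source_date).days <= max_age_days
```

Passing the comparison date explicitly keeps the check repeatable when old records are reviewed later.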

Application To Content Brain

Content input is messy when it comes from:

transcripts
blog drafts
old articles
newsletters
social posts
campaign scripts
SEO notes
customer research
AI drafts

Content normalization should:

Capture source
Identify content type
Extract useful ideas
Remove repetition
Classify content purpose
Identify target audience
Identify channel fit
Route to production, refresh, repurposing, or archive

Content rule:

Content ideas must be normalized before they enter production queues.

Application To AIBS Client Systems

Future client systems will receive messy business input.

Examples:

customer emails
internal documents
spreadsheets
reports
meeting notes
process documents
support tickets
sales messages
finance notes

AIBS normalization should:

capture the source
extract the useful business content
remove irrelevant noise
classify workflow type
assign AI Employee role
validate before action
route into report, task, dashboard, or human review

AIBS rule:

Client automation must begin with input normalization before AI action.

This is a major quality and safety layer for future MWMS client delivery.

Normalization Quality Checklist

Before input is passed into analysis, check:

Is the source identified?
Is the original format known?
Is the input complete enough?
Has obvious noise been removed?
Has useful meaning been preserved?
Are sections or fields clear?
Are missing pieces marked?
Has the input type been classified?
Is the owning Brain identified?
Is risk level assigned?
Is the next AI Employee identified?
Is human review needed?
Is the destination clear?
Is the input safe to analyze?
Should the input be parked or rejected instead?

Normalization Failure Modes

MWMS must watch for the following failure modes:

Analyzing raw input before cleaning it
Removing useful context during cleaning
Summarizing too early
Mixing facts with assumptions
Treating vendor claims as evidence
Routing input to the wrong Brain
Ignoring missing source data
Over-compressing long course material
Letting newsletters become dashboard noise
Creating tasks from incomplete Brain Room messages
Misreading copied tables or broken PDFs
Using old data as current data
Allowing unstructured client input into automation
Forgetting to log source and origin
Treating messy input as validated output

Any workflow showing these failure modes should be paused and corrected.

Human Review Rule

Human review is required when messy input affects:

MCR canon
live systems
developer instructions
financial decisions
offer testing
compliance-sensitive outputs
public-facing content
client-facing workflows
paid traffic decisions
cross-Brain routing
major architecture changes

Human review may also be required when the input is incomplete, corrupted, contradictory, or too important to normalize automatically.

Automation Readiness Rule

Input normalization should be automated only after the manual pattern is stable.

Before automating normalization, confirm:

input source is predictable
common noise patterns are known
cleaned structure is defined
classification rules exist
validation rules exist
failure cases are understood
human review rules are defined
output destination is clear
logging is available

The automation rule is:

Do not automate messy input until the cleanup pattern is understood.
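
The readiness confirmation can be treated as an all-or-nothing gate. In the sketch below the check names are copied from the list above. The function name and the dictionary input are assumptions.

```python
from typing import Dict, List

READINESS_CHECKS = (
    "input source is predictable",
    "common noise patterns are known",
    "cleaned structure is defined",
    "classification rules exist",
    "validation rules exist",
    "failure cases are understood",
    "human review rules are defined",
    "output destination is clear",
    "logging is available",
)


def automation_blockers(confirmed: Dict[str, bool]) -> List[str]:
    """Return every readiness check not yet confirmed. Empty means ready."""
    # A check missing from the input counts as unconfirmed.
    return [check for check in READINESS_CHECKS if not confirmed.get(check, False)]
```

Treating an unanswered check as a blocker means automation cannot proceed by omission.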

Governance Role

HeadOffice owns the MWMS Messy Input Normalization Framework.

HeadOffice is responsible for:

defining normalization rules
protecting MWMS from dirty input
setting validation expectations
ensuring input is routed correctly
preventing dashboard noise
protecting MCR from weak source material
protecting M’s active build from vague instructions
ensuring future AIBS systems normalize client input before acting

Individual Brains may define their own specialized normalization rules, but they must align with this framework.

Relationship To Other MWMS Standards

This framework supports and must align with:

MWMS AI Agent Operations Core
MWMS Agentic Work Unit Standard
MWMS AI Employee Role Card Standard
MWMS AI Agent Orchestration Framework
MWMS AI Workflow Pipeline Standard
MWMS AI Output Validation Standard
MWMS Brain Routing Rule
MWMS Brain To Brain Request Protocol
MWMS AI Output Standard Full File Delivery Rule
MWMS Brain Header Schema Standard
MWMS Page Naming Standard
MWMS Document Structure Standard
MWMS Architecture Registry
MWMS Brain Interaction Map
MWMS System Data Flow Map
MWMS Supabase Event Schema
HeadOffice Newsletter Intelligence Operating Protocol
HeadOffice Newsletter Intelligence Output Validation Protocol
MWMS Course Absorption Operating Rule
MWMS Opportunity System Operating Protocol
AI Business Systems Brain Blueprint

This framework sits early in the workflow chain because normalization happens before analysis, validation, routing, and reporting.

Drift Protection

This framework protects MWMS from the following forms of drift:

Treating raw input as clean intelligence
Summarizing before extracting
Analyzing noisy newsletters without cleaning
Absorbing course material without filtering
Treating vendor claims as facts
Creating developer instructions from incomplete context
Routing vague Brain Room messages into live workflows
Filling dashboards with low-quality source material
Automating client workflows on messy inputs
Losing source origin
Ignoring missing data
Letting broken transcripts distort learning
Allowing unstructured data to drive business decisions
Confusing input cleanup with final validation
Mistaking quantity of source material for quality of intelligence

Any MWMS workflow that processes messy input without normalization should be reviewed.

Architectural Intent

The architectural intent of the MWMS Messy Input Normalization Framework is to protect the intelligence quality of the entire ecosystem.

MWMS depends on information.

Information enters the system in messy forms.

The stronger MWMS becomes, the more inputs it will receive.

Without normalization, the system becomes noisy.

With normalization, the system becomes intelligent.

The long-term goal is that MWMS can take messy input from any source and move it through a controlled path:

capture → extract → clean → structure → classify → validate → route

This allows MWMS to turn raw material into usable work, reports, decisions, dashboards, tasks, and learning records.

A mature MWMS normalization system should always be able to answer:

What came in?
Where did it come from?
Was it complete?
What useful content was extracted?
What noise was removed?
What structure was applied?
What type of input is it?
Which Brain owns it?
Is it safe to analyze?
Where should it go next?
What should be logged?

When MWMS can answer those questions consistently, it can turn messy information into reliable operational intelligence.

Change Log

v1.0 — Initial Draft

Created the MWMS Messy Input Normalization Framework as the standard for converting raw, noisy, incomplete, duplicated, or unstructured inputs into clean, structured, classified, validated, and routable MWMS intelligence.

This framework supports the MWMS AI Agent Operations Core, Agentic Work Unit Standard, AI Employee Role Card Standard, AI Agent Orchestration Framework, AI Workflow Pipeline Standard, and AI Output Validation Standard by defining how input should be captured, extracted, cleaned, structured, classified, validated, and routed before serious AI analysis or automation occurs.