System: MWMS
Document Type: Framework
Authority Level: MCR Source Of Truth
Status: Draft For MCR
Primary Location: MCR
Future Operational Destination: HeadOffice Brain, MWMS Brain, Brain Room, Newsletter Intelligence, Course Absorption System, Research Brain, Content Brain, AI Business Systems Brain
Parent Page: HeadOffice
Owner: Martyn
Developer Boundary: Do Not Touch M’s Active Build Areas Unless Specifically Assigned
Source Of Truth: MCR
Purpose
This document defines the MWMS Messy Input Normalization Framework.
This framework establishes how MWMS converts messy, inconsistent, incomplete, noisy, or unstructured inputs into clean, structured, usable intelligence.
MWMS receives information from many sources.
These sources are rarely clean.
They may include newsletters, course transcripts, PDFs, copied web pages, screenshots, sales pages, affiliate offers, spreadsheets, emails, Brain Room messages, Supabase records, Make.com outputs, Google Ads data, WordPress page lists, developer notes, and client materials.
If these inputs are not cleaned and normalized before analysis, MWMS risks producing weak intelligence, wrong decisions, bad routing, poor reports, duplicate pages, and unreliable automation.
This framework exists to make sure raw input is converted into usable MWMS structure before it becomes operational intelligence.
Scope
This framework applies to all MWMS workflows where raw information must be processed before it can be analyzed, routed, saved, reported, or acted upon.
This includes:
- Course Absorption
- Newsletter Intelligence
- Research Brain workflows
- Content Brain workflows
- Affiliate offer evaluation
- Brain Room requests
- Dev Console requests
- Finance data review
- Experimentation data review
- Ads Brain campaign review
- HeadOffice reporting
- MCR page creation
- Supabase task/event review
- Future AI Business Systems (AIBS) client workflows
This framework applies to both manual and automated intake.
Manual intake includes files, pasted text, screenshots, course notes, and user instructions.
Automated intake includes newsletters, emails, Supabase rows, Make.com outputs, future task records, dashboards, and client-system feeds.
Core Definition
Messy Input Normalization is the process of converting raw, noisy, unstructured, or inconsistent source material into clean, structured, classified, validated, and routable MWMS intelligence.
A messy input may be:
- too long
- incomplete
- duplicated
- badly formatted
- full of filler
- mixed with ads or footers
- copied from a broken page
- missing context
- full of irrelevant sections
- badly transcribed
- spread across multiple files
- inconsistent in field names
- unclear in ownership
- not ready for analysis
Normalization prepares the input for useful work.
The goal is not to summarize too early.
The goal is to clean, structure, classify, and prepare the source so the correct AI Employee or Brain can work on it.
Core Principle
The core principle of this framework is:
Dirty input creates dirty intelligence.
If MWMS analyzes poor input without cleaning it first, the analysis inherits the noise, gaps, and errors of the source.
If MWMS routes noisy input into the wrong Brain, the workflow becomes unreliable.
If MWMS stores messy input as if it were structured intelligence, dashboards and reports become cluttered.
Therefore, raw input must be normalized before it becomes trusted MWMS intelligence.
Why Messy Input Normalization Matters
MWMS is an intelligence system.
The quality of the system depends heavily on the quality of what enters it.
Weak input causes:
- bad summaries
- wrong Brain routing
- poor decisions
- duplicate page creation
- missed insights
- dashboard noise
- incorrect priorities
- false urgency
- weak course absorption
- poor offer evaluation
- bad developer instructions
- unreliable automation
Clean input creates:
- better extraction
- better analysis
- better routing
- stronger validation
- clearer reporting
- better handoffs
- better learning
- safer automation
Normalization is therefore a core operating layer, not a cleanup afterthought.
Normalization Workflow
The MWMS Messy Input Normalization Framework uses seven stages, applied in this order:
- Capture
- Extract
- Clean
- Structure
- Classify
- Validate
- Route
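The stage order above can be sketched in code. This is a minimal illustration, not an MWMS implementation. The names Stage, next_stage, and stage_path are hypothetical.

```python
from enum import Enum
from typing import Optional


class Stage(Enum):
    """The seven normalization stages, in the order the framework lists them."""
    CAPTURE = 1
    EXTRACT = 2
    CLEAN = 3
    STRUCTURE = 4
    CLASSIFY = 5
    VALIDATE = 6
    ROUTE = 7


def next_stage(current: Stage) -> Optional[Stage]:
    """Return the stage that follows `current`, or None after ROUTE."""
    if current is Stage.ROUTE:
        return None
    return Stage(current.value + 1)


def stage_path() -> str:
    """Render the full path as a lowercase, arrow-separated string."""
    return " -> ".join(stage.name.lower() for stage in Stage)
```

Keeping the order in one place means a workflow cannot skip from Capture to Classify without the skip being visible in code.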
1. Capture
Capture preserves the raw input and identifies where it came from.
The capture stage should record:
- source type
- source name
- date received
- original format
- sender or origin where relevant
- associated Brain or system
- file name where relevant
- task or workflow trigger
- whether the source is complete or partial
Examples of source types include:
- newsletter
- course transcript
- HTML file
- screenshot
- sales page
- affiliate offer
- Brain Room message
- Supabase row
- Google Sheet row
- Make.com module output
- developer note
- WordPress page list
- client process document
The Capture rule is:
MWMS must know what came in before it tries to interpret it.
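A capture record could look like the sketch below. The field names are hypothetical Python spellings of the items listed above, and raw_content is an assumed field for holding the untouched input. The refusal to capture without a source type and name is also an assumption.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class CaptureRecord:
    """Raw input plus provenance. Frozen so later stages cannot alter the capture."""
    raw_content: str                      # the untouched input (assumed field)
    source_type: str                      # e.g. "newsletter", "course transcript"
    source_name: str
    date_received: date
    original_format: str                  # e.g. "email", "SRT", "PDF"
    is_complete: bool                     # False when the source is known to be partial
    sender_or_origin: Optional[str] = None
    associated_brain: Optional[str] = None
    file_name: Optional[str] = None
    workflow_trigger: Optional[str] = None


def capture(raw_content: str, source_type: str, source_name: str,
            date_received: date, original_format: str,
            is_complete: bool, **optional_fields) -> CaptureRecord:
    """Build a capture record, refusing input whose source cannot be identified."""
    if not source_type.strip() or not source_name.strip():
        raise ValueError("source type and source name are required at capture")
    return CaptureRecord(raw_content, source_type, source_name, date_received,
                         original_format, is_complete, **optional_fields)
```

The record is frozen so that extraction and cleaning work on copies and the original input stays available as evidence.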
2. Extract
Extraction pulls the useful content out of the raw input.
Extraction may include:
- extracting text from a PDF
- reading a transcript
- separating useful lesson content from timestamps
- pulling body content from an email
- extracting table fields
- identifying key headings
- copying relevant screenshot text
- separating source data from surrounding clutter
- pulling offer data from a sales page
- extracting campaign metrics from pasted reports
The extraction stage should avoid interpretation too early.
The job is to pull out what is actually present.
The Extract rule is:
Extract first. Interpret second.
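As one example of extraction without interpretation, the sketch below separates spoken lines from cue numbers and timings in an SRT transcript. It assumes the common SRT layout of a cue number, a timing line, then text, with blank lines between cues. The function name extract_srt_text is hypothetical.

```python
import re
from typing import List

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
_TIMING = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def extract_srt_text(srt: str) -> List[str]:
    """Return the spoken lines of an SRT transcript, in order, unmodified."""
    lines: List[str] = []
    at_cue_boundary = True  # the start of the file behaves like a blank line
    for raw in srt.splitlines():
        line = raw.strip()
        if not line:
            at_cue_boundary = True
            continue
        if _TIMING.match(line):
            at_cue_boundary = False
            continue
        # A bare integer directly after a blank line is a cue number, not speech.
        if at_cue_boundary and line.isdigit():
            at_cue_boundary = False
            continue
        at_cue_boundary = False
        lines.append(line)
    return lines
```

The function only pulls out what is present. It does not merge, reorder, or summarize the lines.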
3. Clean
Cleaning removes clutter, noise, duplication, and irrelevant material.
Cleaning may include removing:
- newsletter footers
- unsubscribe text
- repeated headers
- navigation menus
- broken HTML
- unrelated ads
- tracking links
- excessive timestamps
- repeated transcript fragments
- duplicate sections
- irrelevant platform boilerplate
- empty lines
- corrupted characters
- sales page fluff where not useful
- copied page clutter
Cleaning does not mean deleting useful evidence.
It means removing noise so the useful material can be processed correctly.
The Clean rule is:
Remove clutter without destroying meaning.
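A cleaning pass might be sketched as follows. The two noise patterns are hypothetical placeholders, and a real list would come from noise actually observed in each source. Dropping exact repeated lines is also an assumption and may be too aggressive for some inputs, which is why removed lines are returned rather than discarded.

```python
import re
from typing import List, Tuple

# Hypothetical noise patterns, for illustration only.
NOISE_PATTERNS = [
    re.compile(r"unsubscribe", re.IGNORECASE),
    re.compile(r"view (this email )?in (your )?browser", re.IGNORECASE),
]


def clean_lines(text: str) -> Tuple[List[str], List[str]]:
    """Split text into (kept, removed) lines.

    Drops empty lines, lines matching a noise pattern, and exact repeats of a
    line already kept. Removed lines are returned so they can be reviewed.
    """
    kept: List[str] = []
    removed: List[str] = []
    seen = set()
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # empty lines carry no content, so they are not logged
        if any(p.search(line) for p in NOISE_PATTERNS) or line in seen:
            removed.append(line)
            continue
        seen.add(line)
        kept.append(line)
    return kept, removed
```

Returning the removed lines alongside the kept lines lets a reviewer confirm that cleaning removed clutter and not evidence.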
4. Structure
Structure converts the cleaned input into a usable format.
Depending on the workflow, structure may include:
- title
- source
- date
- topic
- key sections
- extracted facts
- claims
- data points
- entities
- tools mentioned
- workflows mentioned
- risks mentioned
- possible actions
- impacted Brain
- output type
- priority marker
- follow-up need
For course absorption, structured input may include:
- course name
- module
- lesson title
- core concept
- frameworks
- practical systems
- tools mentioned
- MWMS relevance
- ignore/absorb/park decision
For newsletter intelligence, structured input may include:
- headline
- source
- insight
- underlying pattern
- primary Brain
- supporting Brains
- recommended action
- confidence
- urgency
- priority
For offer evaluation, structured input may include:
- offer name
- network
- payout
- niche
- claim type
- funnel type
- traffic fit
- compliance risk
- market angle
- test suitability
The Structure rule is:
Clean content must be converted into a format MWMS can use.
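The newsletter field list above could be represented as a simple data structure. This is a sketch under assumptions: the class name NewsletterItem, the choice of which fields are required, and the helper unfilled_fields are all hypothetical.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional


@dataclass
class NewsletterItem:
    """One structured newsletter item, using the fields listed above."""
    headline: str
    source: str
    insight: str
    underlying_pattern: Optional[str] = None
    primary_brain: Optional[str] = None
    supporting_brains: List[str] = field(default_factory=list)
    recommended_action: Optional[str] = None
    confidence: Optional[str] = None
    urgency: Optional[str] = None
    priority: Optional[str] = None


def unfilled_fields(item: NewsletterItem) -> List[str]:
    """Names of fields still empty, so later stages can see what is missing."""
    return [f.name for f in fields(item)
            if getattr(item, f.name) in (None, "", [])]
```

Fields that the Classify stage fills later start empty, and unfilled_fields makes the remaining gaps explicit instead of hiding them.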
5. Classify
Classification identifies what the structured input is and where it belongs.
Classification should determine:
- input type
- owning Brain
- supporting Brains
- workflow type
- risk level
- urgency
- priority
- required AI Employee
- validation level
- likely destination
Classification examples:
A course transcript may classify as:
- Course Absorption
- HeadOffice Brain
- AI Agent Operations Core
- Medium risk
- Human review required before MCR save
A newsletter may classify as:
- Newsletter Intelligence
- HeadOffice Brain
- Primary Brain: Ads Brain
- Action: Monitor or Test
- Queue review required
An offer page may classify as:
- Offer Evaluation
- Affiliate Brain
- Supporting Brains: Research, Ads, Finance, Experimentation
- High risk
- Human review required
The Classification rule is:
MWMS must know what the input is before deciding what to do with it.
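The three classification examples above can be written as a small lookup table. This is only a sketch. The dictionary keys are hypothetical source-type labels, and the values are copied from the examples. An unknown source type returns None so that nothing is classified by guesswork.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Classification:
    workflow_type: str
    brain: str                   # first Brain named in the example
    risk_level: Optional[str]    # None where the example gives no risk level
    review: str                  # review requirement, quoted from the example


# Values copied from the classification examples; keys are hypothetical labels.
EXAMPLE_RULES: Dict[str, Classification] = {
    "course transcript": Classification(
        "Course Absorption", "HeadOffice Brain", "Medium",
        "Human review required before MCR save"),
    "newsletter": Classification(
        "Newsletter Intelligence", "HeadOffice Brain", None,
        "Queue review required"),
    "offer page": Classification(
        "Offer Evaluation", "Affiliate Brain", "High",
        "Human review required"),
}


def classify(source_type: str) -> Optional[Classification]:
    """Look up a classification. None means unknown: do not guess."""
    return EXAMPLE_RULES.get(source_type.strip().lower())
```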
6. Validate
Validation checks whether the normalized input is good enough to proceed.
Validation should ask:
- Is the source complete enough?
- Is the content readable?
- Was important context preserved?
- Was noise removed safely?
- Is the structure usable?
- Is the classification sensible?
- Is the owning Brain correct?
- Is the risk level appropriate?
- Is human review needed?
- Is more input required?
- Is the input too weak to process?
Validation may lead to:
- proceed
- request more input
- clean again
- classify again
- park
- reject
- escalate
The Validate rule is:
Do not send weak normalized input into serious analysis.
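The validation outcomes can be modeled as a fixed set of values. The sketch below maps a few of the validation questions to one outcome. The order of the checks is an assumption made for illustration, and the framework does not define a precedence. PARK is listed as an outcome but is never returned by this sketch.

```python
from enum import Enum


class Outcome(Enum):
    PROCEED = "proceed"
    REQUEST_MORE_INPUT = "request more input"
    CLEAN_AGAIN = "clean again"
    CLASSIFY_AGAIN = "classify again"
    PARK = "park"
    REJECT = "reject"
    ESCALATE = "escalate"


def validate(*, readable: bool, complete_enough: bool,
             noise_removed_safely: bool, classification_sensible: bool,
             human_review_needed: bool) -> Outcome:
    """Return one outcome. The precedence of checks is assumed, not specified."""
    if not readable:
        return Outcome.REJECT
    if not complete_enough:
        return Outcome.REQUEST_MORE_INPUT
    if not noise_removed_safely:
        return Outcome.CLEAN_AGAIN
    if not classification_sensible:
        return Outcome.CLASSIFY_AGAIN
    if human_review_needed:
        return Outcome.ESCALATE
    return Outcome.PROCEED
```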
7. Route
Routing sends the normalized input to the correct next stage.
Possible destinations include:
- Course Absorption Agent
- Newsletter Signal Extraction Agent
- Research Agent
- Offer Evaluation Agent
- Brain Room Task Builder Agent
- HeadOffice Validation Agent
- MCR page draft
- Supabase task record
- dashboard queue
- human review queue
- Parking System
- archive
The Route rule is:
Normalized input must move to the correct destination or be deliberately parked.
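A routing step that parks unknown input instead of dropping it might look like this. The routing table is hypothetical. Only the destination names come from the list above.

```python
from typing import Dict, Tuple

PARKING = "Parking System"

# Hypothetical routing table keyed by workflow type.
ROUTES: Dict[str, str] = {
    "Course Absorption": "Course Absorption Agent",
    "Newsletter Intelligence": "Newsletter Signal Extraction Agent",
    "Offer Evaluation": "Offer Evaluation Agent",
}


def route(workflow_type: str, human_review_required: bool = False) -> Tuple[str, str]:
    """Return (destination, reason). Unknown workflows are parked, never dropped."""
    if human_review_required:
        return "human review queue", "human review required"
    destination = ROUTES.get(workflow_type)
    if destination is None:
        return PARKING, f"no route defined for workflow type {workflow_type!r}"
    return destination, "matched routing table"
```

Returning a reason with every destination makes parking a deliberate, logged decision.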
Messy Input Categories
MWMS should recognize different types of messy input.
1. Noisy Text Input
Examples:
- newsletters with ads and footers
- copied web pages
- long emails
- transcript exports
- HTML text
- social posts
Main risk:
- useful signal is buried inside clutter.
Normalization focus:
- remove noise
- preserve meaning
- extract key sections
- classify insight value
2. Broken Format Input
Examples:
- bad PDF extraction
- strange line breaks
- corrupted characters
- missing headings
- copied table data
- poor transcript timing
Main risk:
- AI misreads structure or merges unrelated sections.
Normalization focus:
- reconstruct sections
- preserve source meaning
- clarify fields
- mark uncertain areas
3. Overloaded Input
Examples:
- massive course blocks
- large reports
- many newsletter items
- multiple pasted documents
- long sales pages
Main risk:
- AI compresses too aggressively and misses important details.
Normalization focus:
- split into sections
- process in stages
- extract by category
- avoid premature summarization
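Splitting overloaded input into sections can be sketched as below. The function groups whole paragraphs into sections under a size limit and never cuts a paragraph. The default limit of 2000 characters is an arbitrary assumption, and split_into_sections is a hypothetical name.

```python
from typing import List


def split_into_sections(text: str, max_chars: int = 2000) -> List[str]:
    """Group paragraphs into sections of at most `max_chars` characters.

    Paragraphs are never cut. One longer than `max_chars` becomes its own section.
    """
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    sections: List[str] = []
    current = ""
    for paragraph in paragraphs:
        candidate = paragraph if not current else current + "\n\n" + paragraph
        if current and len(candidate) > max_chars:
            sections.append(current)   # close the section before it overflows
            current = paragraph
        else:
            current = candidate
    if current:
        sections.append(current)
    return sections
```

Each section can then be processed in its own stage, so no content is compressed away before extraction.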
4. Incomplete Input
Examples:
- partial screenshots
- missing course files
- incomplete offer data
- broken email body
- cut-off transcript
- missing metrics
Main risk:
- AI fills gaps with assumptions.
Normalization focus:
- identify missing pieces
- mark assumptions
- request more input if required
- avoid unsupported conclusions
5. Mixed-Purpose Input
Examples:
- a newsletter containing tools, news, ads, offers, opinions, and compliance signals
- a course lesson mixing tactics, philosophy, tool steps, and frameworks
- a sales page containing claims, bonuses, scarcity, and testimonials
Main risk:
- AI treats everything as equally important.
Normalization focus:
- separate categories
- classify business relevance
- route each part properly
6. Operational Data Input
Examples:
- Supabase rows
- Google Ads stats
- Finance numbers
- experiment results
- task logs
- dashboard data
Main risk:
- bad decisions from misunderstood fields.
Normalization focus:
- preserve numeric accuracy
- define fields
- check missing data
- separate data from interpretation
- validate before decision-making
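For operational data, numeric accuracy and missing-data checks could be handled as in this sketch. It parses named fields with Python's decimal module so values are not rounded by floating point. Treating commas as thousands separators is an assumption that does not hold for every locale, and parse_metrics is a hypothetical name.

```python
from decimal import Decimal, InvalidOperation
from typing import Dict, List, Optional, Tuple


def parse_metrics(row: Dict[str, str], numeric_fields: List[str]
                  ) -> Tuple[Dict[str, Optional[Decimal]], List[str]]:
    """Parse named numeric fields exactly and return (values, problems).

    Missing or unparseable fields become None and are reported, never guessed.
    """
    values: Dict[str, Optional[Decimal]] = {}
    problems: List[str] = []
    for name in numeric_fields:
        # Assumption: commas are thousands separators, as in "1,234.50".
        raw = (row.get(name) or "").strip().replace(",", "")
        if not raw:
            values[name] = None
            problems.append(f"{name}: missing")
            continue
        try:
            values[name] = Decimal(raw)
        except InvalidOperation:
            values[name] = None
            problems.append(f"{name}: not a number ({raw!r})")
    return values, problems
```

The problems list keeps data separate from interpretation: a missing metric is reported as missing and is not replaced with zero.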
7. Development Input
Examples:
- code snippets
- plugin files
- screenshots of errors
- WordPress page lists
- M’s notes
- site status updates
Main risk:
- vague or wrong developer instructions.
Normalization focus:
- identify exact site
- identify exact file or screen
- identify current state
- identify expected result
- avoid assumptions
- protect save point
Normalized Input Record
A normalized input record should include:
Input Title:
Source Type:
Source Name:
Date Received:
Original Format:
Workflow Type:
Owning Brain:
Supporting Brains:
Cleaned Content Summary:
Key Extracted Elements:
Known Gaps:
Risk Level:
Priority:
Required AI Employee:
Validation Level:
Next Destination:
Recommended Action:
Human Review Required:
Logging Required:
This record may be simplified for low-risk tasks.
It should not be skipped for high-value or high-risk workflows.
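The record template above can be held as a plain mapping. In this sketch the field labels are copied from the template, and the helper names blank_record, missing_fields, and render are hypothetical.

```python
from typing import Dict, List

# Field labels exactly as listed in the record template.
RECORD_FIELDS: List[str] = [
    "Input Title", "Source Type", "Source Name", "Date Received",
    "Original Format", "Workflow Type", "Owning Brain", "Supporting Brains",
    "Cleaned Content Summary", "Key Extracted Elements", "Known Gaps",
    "Risk Level", "Priority", "Required AI Employee", "Validation Level",
    "Next Destination", "Recommended Action", "Human Review Required",
    "Logging Required",
]


def blank_record() -> Dict[str, str]:
    """A record with every field present and empty."""
    return {name: "" for name in RECORD_FIELDS}


def missing_fields(record: Dict[str, str]) -> List[str]:
    """Fields that are absent or blank, in template order."""
    return [name for name in RECORD_FIELDS if not record.get(name, "").strip()]


def render(record: Dict[str, str]) -> str:
    """Render the record in the same 'Label: value' layout as the template."""
    return "\n".join(f"{name}: {record.get(name, '')}".rstrip()
                     for name in RECORD_FIELDS)
```

For a high-risk workflow, a non-empty result from missing_fields is a signal that the record is not ready to leave normalization.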
Application To Course Absorption
Course absorption depends heavily on messy input normalization.
Course files may include:
- PDFs
- SRT transcripts
- HTML descriptions
- lesson outlines
- screenshots
- module notes
- repeated course platform text
The course normalization process should be:
- Capture lesson/module identity
- Extract useful lesson content
- Remove course platform noise
- Separate concepts from tool steps
- Identify frameworks, rules, workflows, and principles
- Classify MWMS relevance
- Decide absorb, park, ignore, or request more files
- Route to Course Absorption Agent or Page Builder Agent
Course rule:
Do not absorb course content until it has been normalized and judged for MWMS value.
Application To Newsletter Intelligence
Newsletter input is often noisy.
Newsletters may contain:
- headlines
- commentary
- ads
- sponsored tools
- links
- repeated footers
- unsubscribe blocks
- weak news
- strong market signals
- tool updates
- compliance warnings
- monetization opportunities
The newsletter normalization process should be:
- Capture sender, subject, date, body, and snippet
- Clean footer and ad clutter
- Extract distinct news/tool/policy/business items
- Separate generic news from MWMS-relevant signal
- Classify primary Brain and supporting Brains
- Assign one action label: ACT NOW, TEST, MONITOR, PARK, or REJECT
- Validate usefulness and specificity
- Route to queue, dashboard, or parking system
Newsletter rule:
A newsletter is not intelligence until useful signal has been separated from noise.
Application To Brain Room
Brain Room messages can be messy because they may contain:
- quick notes
- incomplete instructions
- mixed topics
- emotional context
- developer references
- task ideas
- strategic questions
- build concerns
- pasted snippets
The Brain Room normalization process should be:
- Capture message and thread context
- Identify whether it is chat, decision, task, bug, idea, or handoff
- Extract the actionable request
- Identify owning Brain
- Check developer boundary
- Create Agentic Work Unit if needed
- Route to AI Manager, human review, or parking
Brain Room rule:
Brain Room messages must be converted into structured work before they enter operational workflows.
Application To Offer Evaluation
Affiliate offers are messy because they include hype, claims, bonuses, vendor positioning, affiliate metrics, funnel promises, and incomplete data.
The offer normalization process should be:
- Capture offer name, network, niche, payout, and source
- Extract actual offer facts
- Separate vendor claims from evidence
- Identify claims, mechanism, audience, and funnel style
- Identify missing data
- Classify risk level and traffic fit
- Route to Research Brain, Finance Brain, Ads Brain, Experimentation Brain, or rejection
Offer rule:
Vendor hype must be normalized into evidence before MWMS makes offer decisions.
Application To Research Brain
Research sources are messy because they differ in reliability, freshness, and bias.
Research input may contain:
- facts
- opinions
- old data
- biased claims
- vendor claims
- affiliate claims
- unsupported estimates
- conflicting evidence
Research normalization should:
- Capture source and date
- Identify source type
- Extract claims
- Separate evidence from opinion
- Check freshness
- Identify contradictions
- Classify confidence
- Route to analysis or validation
Research rule:
Evidence must be separated from opinion before research becomes decision support.
Application To Content Brain
Content input is messy when it comes from:
- transcripts
- blog drafts
- old articles
- newsletters
- social posts
- campaign scripts
- SEO notes
- customer research
- AI drafts
Content normalization should:
- Capture source
- Identify content type
- Extract useful ideas
- Remove repetition
- Classify content purpose
- Identify target audience
- Identify channel fit
- Route to production, refresh, repurposing, or archive
Content rule:
Content ideas must be normalized before they enter production queues.
Application To AIBS Client Systems
Future client systems will receive messy business input.
Examples:
- customer emails
- internal documents
- spreadsheets
- reports
- meeting notes
- process documents
- support tickets
- sales messages
- finance notes
AIBS normalization should:
- capture the source
- extract the useful business content
- remove irrelevant noise
- classify workflow type
- assign AI Employee role
- validate before action
- route into report, task, dashboard, or human review
AIBS rule:
Client automation must begin with input normalization before AI action.
This is a major quality and safety layer for future MWMS client delivery.
Normalization Quality Checklist
Before input is passed into analysis, check:
- Is the source identified?
- Is the original format known?
- Is the input complete enough?
- Has obvious noise been removed?
- Has useful meaning been preserved?
- Are sections or fields clear?
- Are missing pieces marked?
- Has the input type been classified?
- Is the owning Brain identified?
- Is risk level assigned?
- Is the next AI Employee identified?
- Is human review needed?
- Is the destination clear?
- Is the input safe to analyze?
- Should the input be parked or rejected instead?
Normalization Failure Modes
MWMS must watch for the following failure modes:
- Analyzing raw input before cleaning it
- Removing useful context during cleaning
- Summarizing too early
- Mixing facts with assumptions
- Treating vendor claims as evidence
- Routing input to the wrong Brain
- Ignoring missing source data
- Over-compressing long course material
- Letting newsletters become dashboard noise
- Creating tasks from incomplete Brain Room messages
- Misreading copied tables or broken PDFs
- Using old data as current data
- Allowing unstructured client input into automation
- Forgetting to log source and origin
- Treating messy input as validated output
Any workflow showing these failure modes should be paused and corrected.
Human Review Rule
Human review is required when messy input affects:
- MCR canon
- live systems
- developer instructions
- financial decisions
- offer testing
- compliance-sensitive outputs
- public-facing content
- client-facing workflows
- paid traffic decisions
- cross-Brain routing
- major architecture changes
Human review may also be required when the input is incomplete, corrupted, contradictory, or too important to normalize automatically.
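The Human Review Rule could be checked in code as sketched below. The area names are copied from the list above. The framework says review may also be required for incomplete, corrupted, or contradictory input. Treating those conditions as always requiring review is a conservative assumption of this sketch.

```python
from typing import Iterable

# Areas copied from the Human Review Rule list, lowercased for matching.
REVIEW_AREAS = frozenset({
    "mcr canon", "live systems", "developer instructions",
    "financial decisions", "offer testing", "compliance-sensitive outputs",
    "public-facing content", "client-facing workflows",
    "paid traffic decisions", "cross-brain routing",
    "major architecture changes",
})


def human_review_required(affected_areas: Iterable[str], *,
                          incomplete: bool = False,
                          corrupted: bool = False,
                          contradictory: bool = False) -> bool:
    """True when a listed area is affected or the input itself is damaged."""
    touches_listed_area = any(area.strip().lower() in REVIEW_AREAS
                              for area in affected_areas)
    # Assumption: damaged input always triggers review in this sketch.
    return touches_listed_area or incomplete or corrupted or contradictory
```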
Automation Readiness Rule
Input normalization should be automated only after the manual pattern is stable.
Before automating normalization, confirm:
- input source is predictable
- common noise patterns are known
- cleaned structure is defined
- classification rules exist
- validation rules exist
- failure cases are understood
- human review rules are defined
- output destination is clear
- logging is available
The automation rule is:
Do not automate messy input until the cleanup pattern is understood.
Governance Role
HeadOffice owns the MWMS Messy Input Normalization Framework.
HeadOffice is responsible for:
- defining normalization rules
- protecting MWMS from dirty input
- setting validation expectations
- ensuring input is routed correctly
- preventing dashboard noise
- protecting MCR from weak source material
- protecting M’s active build from vague instructions
- ensuring future AIBS systems normalize client input before acting
Individual Brains may define their own specialized normalization rules, but they must align with this framework.
Relationship To Other MWMS Standards
This framework supports and must align with:
- MWMS AI Agent Operations Core
- MWMS Agentic Work Unit Standard
- MWMS AI Employee Role Card Standard
- MWMS AI Agent Orchestration Framework
- MWMS AI Workflow Pipeline Standard
- MWMS AI Output Validation Standard
- MWMS Brain Routing Rule
- MWMS Brain To Brain Request Protocol
- MWMS AI Output Standard Full File Delivery Rule
- MWMS Brain Header Schema Standard
- MWMS Page Naming Standard
- MWMS Document Structure Standard
- MWMS Architecture Registry
- MWMS Brain Interaction Map
- MWMS System Data Flow Map
- MWMS Supabase Event Schema
- HeadOffice Newsletter Intelligence Operating Protocol
- HeadOffice Newsletter Intelligence Output Validation Protocol
- MWMS Course Absorption Operating Rule
- MWMS Opportunity System Operating Protocol
- AI Business Systems Brain Blueprint
This framework sits early in the workflow chain because normalization must finish before downstream analysis, output validation, onward routing, and reporting begin.
Drift Protection
This framework protects MWMS from the following forms of drift:
- Treating raw input as clean intelligence
- Summarizing before extracting
- Analyzing noisy newsletters without cleaning
- Absorbing course material without filtering
- Treating vendor claims as facts
- Creating developer instructions from incomplete context
- Routing vague Brain Room messages into live workflows
- Filling dashboards with low-quality source material
- Automating client workflows on messy inputs
- Losing source origin
- Ignoring missing data
- Letting broken transcripts distort learning
- Allowing unstructured data to drive business decisions
- Confusing input cleanup with final validation
- Mistaking quantity of source material for quality of intelligence
Any MWMS workflow that processes messy input without normalization should be reviewed.
Architectural Intent
The architectural intent of the MWMS Messy Input Normalization Framework is to protect the intelligence quality of the entire ecosystem.
MWMS depends on information.
Information enters the system in messy forms.
The stronger MWMS becomes, the more inputs it will receive.
Without normalization, the system becomes noisy.
With normalization, the system becomes intelligent.
The long-term goal is that MWMS can take messy input from any source and move it through a controlled path:
capture → extract → clean → structure → classify → validate → route
This allows MWMS to turn raw material into usable work, reports, decisions, dashboards, tasks, and learning records.
A mature MWMS normalization system should always be able to answer:
- What came in?
- Where did it come from?
- Was it complete?
- What useful content was extracted?
- What noise was removed?
- What structure was applied?
- What type of input is it?
- Which Brain owns it?
- Is it safe to analyze?
- Where should it go next?
- What should be logged?
When MWMS can answer those questions consistently, it can turn messy information into reliable operational intelligence.
Change Log
v1.0 — Initial Draft
Created the MWMS Messy Input Normalization Framework as the standard for converting raw, noisy, incomplete, duplicated, or unstructured inputs into clean, structured, classified, validated, and routable MWMS intelligence.
This framework supports the MWMS AI Agent Operations Core, Agentic Work Unit Standard, AI Employee Role Card Standard, AI Agent Orchestration Framework, AI Workflow Pipeline Standard, and AI Output Validation Standard by defining how input should be captured, extracted, cleaned, structured, classified, validated, and routed before serious AI analysis or automation occurs.