System: MWMS
Document Type: Operating Framework
Authority Level: MCR Source Of Truth
Status: Draft For MCR
Version: v1.0
Primary Location: MCR
Future Operational Destination: Prompting Framework, HeadOffice Brain, Automation Brain, AIBS Brain, Content Brain, Ads Brain, Research Brain, Data Brain, Experimentation Brain, AI Employee Canon, Compliance Brain, Risk Brain
Parent Page: Prompting Framework
Owner: Martyn
Developer Boundary: Do Not Touch M’s Active Build Areas Unless Specifically Assigned
Source Of Truth: MCR
Last Reviewed: 2026-06-08
Source / Origin: AI Automations by Jack Master Prompting And Prompt System Design Block / Master Prompting w Devin Part 1 / Master Prompting w Devin Part 2
MWMS Classification: Prompt Architecture Framework / Automation Output Reliability Standard / AI Employee Prompt Governance / Prompt Chain Design System / Prompt Quality Control Framework
Primary Brain: MWMS Prompting Framework
Supporting Brains: HeadOffice Brain, Automation Brain, AIBS Brain, Content Brain, Ads Brain, Research Brain, Data Brain, Experimentation Brain, AI Employee Canon, Compliance Brain, Risk Brain
Related Pages: MWMS Prompting Framework, MWMS AI Employee Evaluation Scorecard Standard, MWMS AI Observability Metadata Standard, MWMS AI Work Session Persistence Standard, MWMS Agent Loop Control Framework, MWMS Next Action Picker Standard, MWMS AI Usage And Cost Visibility Standard, MWMS Source Visibility And Evidence Display Standard, MWMS Buyer First Authority Content And Channel Growth Framework, MWMS AIBS Business Diagnostic And Opportunity Discovery Framework, MWMS AIOS Lead Capture And Conversion Infrastructure Framework, HeadOffice Kaizen Continuous Improvement Loop
Source Evidence: This framework is derived from the Master Prompting w Devin Part 1 and Part 2 training inside AI Automations by Jack. The lessons covered conversational prompting versus one-shot prompting, atomic and compound prompts, prompt deconstruction, prompt stacking, tell-and-show examples, import method, chain-of-thought/planning methods, anti-keyword staining, prompt chaining, model selection, output quality testing, iteration, and prompt engineering as a system-building skill for reliable AI automations.
Purpose
The purpose of the MWMS Prompt Architecture And Automation Output Reliability Framework is to define how MWMS designs, tests, improves, and governs prompts used inside AI Employees, automations, content systems, research systems, client diagnostics, and operational workflows.
This framework exists because prompts inside automations are not casual chat messages.
A casual prompt may work once.
An automation prompt must work repeatedly.
A casual prompt can be adjusted manually.
An automation prompt must behave predictably without constant human correction.
A casual prompt can be exploratory.
An automation prompt must be structured, tested, versioned, and reliable.
MWMS must therefore treat prompts as system assets.
The core purpose is:
To make every important MWMS prompt reusable, testable, observable, reliable, cost-aware, and suitable for automation.
Core Doctrine
The MWMS doctrine is:
A prompt used inside an automation is not a conversation. It is system architecture.
MWMS should not rely on vague prompts such as:
- “Write this better.”
- “Summarize this.”
- “Make a good post.”
- “Analyze this file.”
- “Create a script.”
- “Find the best ideas.”
- “Give me a report.”
- “Act like an expert.”
- “Make it sound professional.”
Those may work in a chat, but they are weak inside repeatable systems.
For automations, MWMS needs prompts that define:
- role
- task
- context
- input variables
- output format
- quality standards
- examples
- constraints
- failure handling
- model choice
- cost boundaries
- expected structure
- evaluation method
- version history
The prompt must be designed to reduce guessing.
The less the AI has to guess, the more reliable the automation becomes.
Strategic Importance
This framework is strategically important because MWMS is building a system of Brains and AI Employees.
Those AI Employees will depend on prompts.
If prompts are weak, the AI Employees will be weak.
If prompts are inconsistent, the outputs will be inconsistent.
If prompts are not tested, errors will enter the system.
If prompt chains are badly designed, automations will become unreliable.
If prompts are too expensive, scaling becomes costly.
If prompts are too vague, M and Martyn will waste time correcting outputs.
Prompt quality affects:
- AI Employee performance
- content output quality
- automation reliability
- research quality
- client diagnostic quality
- AIBS recommendations
- ad creative quality
- newsletter intelligence extraction
- course absorption quality
- data classification
- report generation
- sales and outreach workflows
- client-facing deliverables
- system trust
This framework therefore becomes a core infrastructure layer for the whole MWMS ecosystem.
The strategic lesson is:
Prompt quality is system quality.
Definition
Prompt architecture is the structured design of a prompt or prompt chain so it reliably performs a defined task inside a repeatable workflow.
A prompt asset is a tested, reusable, documented, versioned prompt that improves workflow performance.
A prompt liability is an informal or poorly structured prompt that requires constant rewriting or manual correction, or that different people interpret inconsistently.
An atomic prompt is a small prompt designed for a narrow task such as formatting, classification, extraction, sentiment tagging, or simple rewriting.
A compound prompt is a larger structured prompt that combines multiple guidelines, context sections, examples, formatting instructions, and task logic to produce a more nuanced output.
A prompt chain is a sequence in which the output of one prompt becomes the input or context for another prompt.
MWMS Definition
The MWMS Prompt Architecture And Automation Output Reliability Framework is:
Prompting Framework’s standard for designing, testing, chaining, versioning, and governing prompts so MWMS AI Employees and automations produce consistent, high-quality, cost-aware, and reliable outputs.
Scope
This framework applies to:
- AI Employee prompts
- automation prompts
- Make.com prompts
- n8n prompts
- OpenAI API prompts
- Claude prompts
- Gemini prompts
- content generation prompts
- research prompts
- classification prompts
- extraction prompts
- newsletter analysis prompts
- course absorption prompts
- sales prompts
- cold email prompts
- LinkedIn prompts
- ad creative prompts
- YouTube script prompts
- landing page prompts
- AIBS diagnostic prompts
- client report prompts
- dashboard insight prompts
- data cleaning prompts
- prompt chains
- model selection
- prompt cost control
- prompt testing
- prompt versioning
- prompt observability
- future Prompt Vault systems
This framework applies whenever MWMS creates a prompt that may be reused or automated.
Core Principle
The core principle is:
Build prompts like reusable systems, not disposable messages.
A prompt should not be considered complete just because it produces one good output.
It should be considered complete only when it can repeatedly produce the right output under realistic input variation.
Rule
A prompt is not reliable until it has been tested against multiple realistic inputs.
The MWMS Prompt Architecture And Automation Output Reliability Model
Every important prompt system should be designed across twelve layers:
- Prompt Purpose Layer
- Prompt Type Layer
- Input And Variable Layer
- Context And Knowledge Layer
- Guideline And Constraint Layer
- Example And Tell And Show Layer
- Deconstruction And Chain Layer
- Output Format Layer
- Model Selection Layer
- Testing And Iteration Layer
- Cost Latency And Scale Layer
- Observability And Governance Layer
1. Prompt Purpose Layer
Every prompt must have a clear job.
A prompt should not exist because “AI can do it.”
It should exist because MWMS needs a specific task performed.
Prompt Purpose Questions
Ask:
- What is this prompt supposed to do?
- What workflow does it support?
- Which Brain or AI Employee uses it?
- What business outcome does it support?
- What input will it receive?
- What output must it produce?
- Who or what consumes the output?
- What happens if the output is wrong?
- How often will this prompt run?
- Is this exploratory or production-grade?
- Is this prompt temporary or reusable?
- Does this need human review?
Prompt Purpose Examples
A prompt may exist to:
- classify a lead
- extract course insights
- summarize a newsletter
- score an offer
- write a YouTube hook
- analyze a competitor page
- generate a sales email
- create an AIBS diagnostic report section
- turn transcript content into a content brief
- identify compliance risk
- route a task to a Brain
- extract structured data from a messy file
- create a buyer question map
- generate a client opportunity score
Rule
If the prompt purpose is vague, the output will be vague.
2. Prompt Type Layer
MWMS must choose the right prompt type for the task.
Not every task needs a large prompt.
Not every task should be broken into many prompts.
Prompt Type 1: Conversational Prompt
Used for:
- exploration
- brainstorming
- early thinking
- clarification
- manual coaching
- one-off analysis
- interactive development
Weakness:
- inconsistent
- hard to automate
- depends on human steering
- poor as a reusable asset
Prompt Type 2: One Shot Prompt
Used for:
- repeatable automations
- standard tasks
- structured outputs
- system workflows
- AI Employee operations
- API calls
Strength:
- reusable
- testable
- can be versioned
- supports automation
Prompt Type 3: Atomic Prompt
Used for:
- formatting
- classification
- tagging
- small extraction
- simple transformation
- routing
- binary decisions
- sentiment detection
Examples:
- “Classify this lead as qualified, unqualified, or needs review.”
- “Extract the company name.”
- “Add line breaks for mobile readability.”
- “Return only JSON.”
Prompt Type 4: Compound Prompt
Used for:
- complex writing
- diagnostic reports
- content generation
- sales page analysis
- structured research
- multi-criteria reasoning
- nuanced output control
Examples:
- course absorption framework prompt
- AIBS diagnostic report prompt
- authority content generation prompt
- LinkedIn Sales Navigator query parser
- ad creative analysis prompt
Rule
Use the smallest prompt that reliably performs the task, and no smaller.
3. Input And Variable Layer
A reliable prompt separates fixed instructions from dynamic inputs.
Fixed instructions should define the task.
Dynamic inputs should contain the changing data.
Common Dynamic Variables
Variables may include:
- source text
- transcript
- buyer avatar
- product name
- offer details
- client name
- industry
- target market
- hook
- outline
- prior output
- examples
- tone
- content type
- platform
- desired format
- data record
- CRM fields
- campaign name
- previous result
- task metadata
Variable Design Questions
Ask:
- What changes each run?
- What stays the same?
- Which variables are required?
- Which variables are optional?
- What happens if a variable is missing?
- Does the model know where the input starts and ends?
- Does the prompt separate instructions from data?
- Can this be safely used in an automation?
- Is the variable name clear to a developer or future AI Employee?
Variable Rule
Dynamic input must be clearly separated from prompt instructions.
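The separation of fixed instructions from dynamic input can be sketched in Python as follows. This is a minimal illustration, not a prescribed MWMS implementation: the helper name build_prompt, the tag-style delimiters, and the variable names lead_record and industry are all assumptions made for the example.

```python
# Hypothetical sketch: helper name, delimiter tags, and variables are illustrative assumptions.
FIXED_INSTRUCTIONS = (
    "Classify the lead described in the <lead_record> block as "
    "qualified, unqualified, or needs review. Return only the label."
)

REQUIRED_VARIABLES = ("lead_record",)
OPTIONAL_VARIABLES = ("industry",)


def build_prompt(variables: dict[str, str]) -> str:
    """Combine fixed instructions with clearly delimited dynamic inputs."""
    missing = [name for name in REQUIRED_VARIABLES if not variables.get(name, "").strip()]
    if missing:
        # Fail before the model is called instead of letting it guess.
        raise ValueError(f"Missing required variables: {', '.join(missing)}")

    blocks = []
    for name in REQUIRED_VARIABLES + OPTIONAL_VARIABLES:
        value = variables.get(name, "").strip()
        if value:
            # Named open/close tags show the model where each input starts and ends.
            blocks.append(f"<{name}>\n{value}\n</{name}>")
    return FIXED_INSTRUCTIONS + "\n\n" + "\n\n".join(blocks)


prompt = build_prompt({"lead_record": "Owner of a 12-person plumbing firm, asked for pricing."})
print(prompt)
```

Keeping the instruction text in a constant means only the variables change between runs, which makes the prompt easier to version and test.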
4. Context And Knowledge Layer
AI needs context to reduce guessing.
The model is a pattern recognition and prediction system.
If MWMS does not provide the right patterns, the model will use generic patterns from training data.
That can create weak, generic, or misleading outputs.
Context Types
Provide context such as:
- business context
- buyer context
- offer context
- Brain context
- task context
- source context
- industry context
- platform context
- audience context
- prior decisions
- examples
- definitions
- frameworks
- constraints
- known risks
Import Method
The import method means bringing specialist knowledge into the prompt.
This may come from:
- course material
- internal SOPs
- expert interviews
- past winning examples
- client documents
- platform rules
- research notes
- sales call notes
- content swipe files
- proven frameworks
- product documentation
- market research
- MWMS Canon pages
- MCR pages
Import Method Rule
When public model knowledge is too generic, MWMS must import specialist knowledge.
5. Guideline And Constraint Layer
Prompts need clear rules.
Guidelines tell the AI what to do.
Constraints tell the AI what not to do.
Older models often struggled with negative instructions; stronger modern models follow both positive and negative rules more reliably.
MWMS should still prefer clear positive instructions and use negative constraints where necessary.
Guideline Types
Use:
- style guidelines
- formatting guidelines
- reasoning guidelines
- evidence guidelines
- output guidelines
- tone guidelines
- audience guidelines
- compliance guidelines
- exclusion guidelines
- quality guidelines
- workflow guidelines
Constraint Examples
Do not:
- invent facts
- include unsupported claims
- use hype
- add unverified statistics
- mention banned product details
- claim legal or medical certainty
- output outside the requested format
- include irrelevant commentary
- use platform-risk wording
- expose private data
- change the title format
Rule
Guidelines and constraints should remove ambiguity before the model creates the output.
6. Example And Tell And Show Layer
The tell and show method is one of the most important prompt quality controls.
Do not only tell the AI what to do.
Show it what good looks like.
Tell And Show Structure
Use:
- Explain the rule.
- Show a good example.
- Show a bad example if useful.
- Explain why the good example is better.
- Ask the model to follow the pattern.
Example Types
Examples may include:
- ideal output
- bad output
- before/after rewrite
- correct JSON format
- desired paragraph style
- classification examples
- hook examples
- CTA examples
- email examples
- report sections
- analysis examples
- tone examples
- formatting examples
Example Quality Questions
Ask:
- Does this example reflect the output we actually want?
- Is the example current?
- Is the example relevant to this task?
- Does the example include the desired structure?
- Does it show the correct tone?
- Does it show what not to do?
- Is the example too generic?
- Is the example legally or compliance safe?
Rule
When output style or structure matters, include examples.
7. Deconstruction And Chain Layer
Complex tasks should often be broken into smaller prompts.
This is the deconstruction method.
Instead of asking AI to complete a complex task in one pass, MWMS should split the task into staged outputs.
Deconstruction Examples
For a YouTube script:
- Analyze source material.
- Extract buyer pain.
- Generate hook options.
- Select strongest hook.
- Create outline.
- Write opening.
- Write body sections.
- Write CTA.
- Review compliance.
- Final polish.
For a course absorption block:
- Identify source themes.
- Extract valuable frameworks.
- Compare against MWMS existing knowledge.
- Decide absorb / merge / park / ignore.
- Generate page candidates.
- Draft full page.
- Create registry entry.
- Park deferred updates.
For an AIBS diagnostic:
- Read intake.
- Identify business context.
- Map leakage categories.
- Score opportunities.
- Assess AI readiness.
- Recommend first project.
- Draft diagnostic report.
- Draft proposal path.
Prompt Chaining
Prompt chaining means the output of one prompt becomes the input to another.
Use chaining when:
- quality improves through stages
- the task needs deep focus
- the output is too complex for one prompt
- each stage needs separate evaluation
- different models may suit different stages
- human approval is needed between steps
Rule
Break complex tasks into prompt chains when one prompt cannot reliably produce high-quality output.
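A prompt chain can be sketched as a list of named steps where each output is passed to the next step. In this minimal Python sketch the two step functions are stand-ins for real model calls, and the names run_chain, extract_pain, and write_hook are hypothetical; a production chain would add validation and human review points where needed.

```python
from typing import Callable

# Hypothetical sketch: each step is a plain function standing in for a model call.
Step = Callable[[str], str]


def extract_pain(source: str) -> str:
    # Stand-in for a model call that extracts buyer pain from source material.
    return f"PAIN: {source.strip().lower()}"


def write_hook(pain: str) -> str:
    # Stand-in for a model call that turns the extracted pain into a hook.
    return f"HOOK based on [{pain}]"


def run_chain(steps: list[tuple[str, Step]], initial_input: str) -> tuple[str, list[dict[str, str]]]:
    """Run steps in order, feeding each output into the next step, and keep a trace."""
    trace = []
    current = initial_input
    for name, step in steps:
        output = step(current)
        if not output.strip():
            # Stop early so an empty stage does not contaminate later stages.
            raise RuntimeError(f"Step '{name}' returned empty output")
        trace.append({"step": name, "input": current, "output": output})
        current = output
    return current, trace


final, trace = run_chain(
    [("extract_pain", extract_pain), ("write_hook", write_hook)],
    "  Leads Go Cold After The First Call  ",
)
print(final)
```

The trace keeps every intermediate input and output, so each stage can be evaluated separately when the final output is weak.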
8. Output Format Layer
Automation prompts need predictable output.
The output format must be clearly defined.
If the next system expects JSON, the prompt must output JSON.
If the next system expects a report, the prompt must output the right report structure.
If the next system expects classification, the prompt must output only the allowed labels.
Output Format Types
Use:
- plain text
- markdown
- JSON
- table
- bullet list
- scored result
- label only
- sectioned report
- summary block
- email format
- script format
- page format
- CSV-like structure
- WordPress-ready page output
Output Format Questions
Ask:
- Who or what uses this output next?
- Does the output need to be parsed by software?
- Does it need to be copied into WordPress?
- Does it need to be read by a human?
- Does it need exact headings?
- Does it need a fixed schema?
- Does it need to avoid extra commentary?
- Does it need error handling?
- Does it need a confidence field?
- Does it need source references?
Rule
The prompt must define the output format as tightly as the workflow requires.
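When the next system expects JSON, the output can be checked before it is passed on. The sketch below assumes a hypothetical two-field schema (label and confidence) and reuses the lead labels from the atomic prompt examples; the function name validate_output is illustrative only.

```python
import json

ALLOWED_LABELS = {"qualified", "unqualified", "needs review"}
REQUIRED_KEYS = {"label", "confidence"}  # hypothetical schema for this example


def validate_output(raw: str) -> dict:
    """Parse model output and reject anything outside the agreed schema."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Output is not valid JSON: {exc}") from exc
    if not isinstance(data, dict) or set(data) != REQUIRED_KEYS:
        raise ValueError(f"Expected exactly the keys {sorted(REQUIRED_KEYS)}")
    if data["label"] not in ALLOWED_LABELS:
        raise ValueError(f"Label not allowed: {data['label']!r}")
    confidence = data["confidence"]
    is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
    if not is_number or not 0 <= confidence <= 1:
        raise ValueError("confidence must be a number between 0 and 1")
    return data


print(validate_output('{"label": "qualified", "confidence": 0.9}'))
```

A failed validation is a natural trigger for a retry or a human review step, depending on how the workflow handles failures.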
9. Model Selection Layer
Different models perform differently on different tasks.
MWMS should not assume the newest or most expensive model is always best.
Some tasks need the strongest reasoning model.
Some tasks need fast low-cost classification.
Some tasks need long context.
Some tasks need writing quality.
Some tasks need strict formatting.
Some tasks need low latency.
Model Selection Criteria
Choose based on:
- task complexity
- context length
- output length
- instruction-following
- formatting reliability
- cost
- latency
- creativity needed
- reasoning needed
- classification consistency
- language support
- privacy requirements
- tool compatibility
- API availability
Model Testing Questions
Ask:
- Which model gives the most consistent output?
- Which model follows format best?
- Which model handles the context length?
- Which model is affordable at scale?
- Which model is fast enough?
- Which model fails least often?
- Which model handles the language best?
- Which model works best for this specific prompt?
Rule
Model choice must be tested against the use case, not assumed.
10. Testing And Iteration Layer
Prompt engineering is iterative.
A prompt should be improved through testing, not guesswork.
Prompt Testing Process
Use:
- Define the expected output.
- Create test inputs.
- Run the prompt.
- Review the output.
- Identify failure patterns.
- Adjust prompt structure.
- Add examples where needed.
- Adjust constraints.
- Test another model if needed.
- Repeat until reliable enough.
Prompt Testing Inputs
Test against:
- ideal input
- messy input
- short input
- long input
- ambiguous input
- missing data
- conflicting data
- edge cases
- high-risk examples
- real production samples
- past failure examples
Scientific Method Standard
Prompt improvement should follow:
- hypothesis
- test
- observation
- adjustment
- retest
- record
Rule
Do not deploy an important automation prompt after one successful test.
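Testing against multiple realistic inputs can be sketched as a small harness that runs one prompt over a set of named cases and reports the failures. In this Python sketch fake_classifier is a stand-in for a real model call, and the case names, inputs, and the run_test_suite helper are assumptions made for illustration.

```python
from typing import Callable


def run_test_suite(
    prompt_fn: Callable[[str], str],
    cases: list[dict],
    check: Callable[[str, dict], bool],
) -> dict:
    """Run one prompt function over many inputs and report which cases failed."""
    if not cases:
        raise ValueError("At least one test case is required")
    failures = []
    for case in cases:
        output = prompt_fn(case["input"])
        if not check(output, case):
            failures.append({"name": case["name"], "output": output})
    passed = len(cases) - len(failures)
    return {"total": len(cases), "passed": passed, "pass_rate": passed / len(cases), "failures": failures}


def fake_classifier(text: str) -> str:
    # Stand-in for a real model call: a naive rule that breaks on missing data.
    if not text.strip():
        return ""
    return "qualified" if "budget" in text.lower() else "needs review"


CASES = [
    {"name": "ideal input", "input": "Has budget and wants a call.", "expected": "qualified"},
    {"name": "ambiguous input", "input": "Maybe interested later.", "expected": "needs review"},
    {"name": "missing data", "input": "", "expected": "needs review"},
]

report = run_test_suite(fake_classifier, CASES, lambda out, case: out == case["expected"])
print(report["passed"], "of", report["total"], "passed")
print([failure["name"] for failure in report["failures"]])
```

Here the stand-in passes the ideal and ambiguous cases but fails on missing data, which is the kind of failure pattern a single successful test would hide.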
11. Cost Latency And Scale Layer
Prompt quality must be balanced against cost and speed.
A prompt that works well once may become too expensive at scale.
A prompt chain that produces excellent output may be too slow for a real-time workflow.
MWMS must decide the right balance.
Cost Factors
Costs may increase with:
- long prompts
- large examples
- long context
- chain-of-thought/planning outputs
- multiple prompt steps
- expensive models
- repeated context in each step
- large output length
- retries
- failed outputs
Latency Factors
Latency may increase with:
- large context
- multi-step chains
- slow models
- long output
- tool calls
- external API calls
- validation steps
- human approval stages
Quality Versus Cost Questions
Ask:
- How often will this prompt run?
- How much does each run cost?
- What does a failed output cost?
- Is this output client-facing?
- Is this output revenue-related?
- Is this output high-risk?
- Can a cheaper model do the task?
- Can context be reduced?
- Can prompts be stacked safely?
- Should prompts be deconstructed for quality?
- Is speed more important than depth?
Rule
High-value outputs can justify higher prompt cost. Low-value repetitive outputs need cost discipline.
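The cost questions above reduce to simple arithmetic once token counts and prices are known. The Python sketch below uses made-up prices and volumes, not real vendor pricing, and the function names are hypothetical.

```python
def cost_per_run(
    input_tokens: int,
    output_tokens: int,
    price_in_per_million: float,
    price_out_per_million: float,
) -> float:
    """Cost of one model call, given per-million-token prices."""
    return (input_tokens * price_in_per_million + output_tokens * price_out_per_million) / 1_000_000


def monthly_cost(
    per_run: float,
    runs_per_day: int,
    steps_in_chain: int = 1,
    retry_rate: float = 0.0,
    days: int = 30,
) -> float:
    """Scale one call to a month, counting chain steps and the expected share of retries."""
    return per_run * steps_in_chain * (1 + retry_rate) * runs_per_day * days


# Hypothetical prices and volumes, chosen only to make the arithmetic easy to follow.
single = cost_per_run(input_tokens=2_000, output_tokens=500, price_in_per_million=1.0, price_out_per_million=4.0)
month = monthly_cost(single, runs_per_day=1_000, steps_in_chain=3, retry_rate=0.1)
print(round(single, 6), round(month, 2))
```

With these assumed numbers a single call costs about 0.004 currency units, while a three-step chain run 1,000 times a day with a 10 percent retry rate costs about 396 per month. This simplified model treats every chain step as costing the same as the single call.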
12. Observability And Governance Layer
MWMS must track prompt performance.
A prompt hidden inside an automation should not become invisible.
Important prompts need metadata, logging, versioning, and review.
Prompt Metadata Fields
Track:
Prompt Name:
Prompt Version:
Brain / Employee:
Workflow:
Prompt Type:
Model Used:
Input Variables:
Output Format:
Test Status:
Average Cost:
Average Latency:
Failure Modes:
Last Reviewed:
Owner:
Change Notes:
Observability Questions
Ask:
- Which prompt generated this output?
- Which version was used?
- Which model was used?
- What input was passed?
- How much did it cost?
- How long did it take?
- Did it pass validation?
- Did it fail formatting?
- Was human review required?
- Was the output accepted or corrected?
- What changed since the last version?
Rule
A production prompt should be traceable.
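Traceability can be sketched as one structured record per prompt run, written to an append-only log. The Python sketch below covers a subset of the metadata fields listed above; the class name PromptRunRecord, the field names, and the example values are all assumptions.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class PromptRunRecord:
    """One logged run of a production prompt (hypothetical field set)."""
    prompt_name: str
    prompt_version: str
    model_used: str
    workflow: str
    cost: float
    latency_seconds: float
    passed_validation: bool
    human_review_required: bool = False
    timestamp: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())


def to_log_line(record: PromptRunRecord) -> str:
    """Serialize one run as a single JSON line suitable for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)


record = PromptRunRecord(
    prompt_name="lead_classifier",  # hypothetical values throughout
    prompt_version="v1.2",
    model_used="example-model",
    workflow="lead intake",
    cost=0.004,
    latency_seconds=1.8,
    passed_validation=True,
)
line = to_log_line(record)
print(line)
```

With the prompt name and version stored on every run, the question of which prompt and which version generated an output can be answered from the log alone.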
Prompt Asset Standard
A prompt becomes an MWMS prompt asset only when it has:
- clear purpose
- defined owner
- defined Brain or Employee
- stable prompt text
- input variables
- output format
- quality criteria
- examples where needed
- test inputs
- model selection notes
- version number
- cost/latency awareness
- failure handling
- review date
Rule
Prompt assets should be stored, versioned, and reused.
Prompt Liability Warning
A prompt becomes a liability when it:
- is rewritten every time
- lives only in chat history
- has no version
- has no owner
- has no test examples
- creates inconsistent outputs
- requires manual fixing
- mixes instructions and data poorly
- lacks output format
- uses vague wording
- cannot be audited
- cannot be reused
- creates cost without visibility
Rule
Prompt liabilities must be converted into prompt assets or removed from production workflows.
Atomic Prompt Standard
Use atomic prompts for narrow tasks.
Good Atomic Prompt Uses
Use for:
- classification
- formatting
- line breaks
- tag assignment
- sentiment label
- yes/no decision
- extracting one field
- routing a task
- simple rewrite
- compliance flag
- deduplication check
Atomic Prompt Requirements
An atomic prompt should define:
- allowed output labels
- exact output format
- examples if classification matters
- what to do if uncertain
- a rule against extra commentary
Rule
Atomic prompts should be small, clear, and easy to validate.
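The atomic prompt requirements can be illustrated with a label-only classification prompt and a small normalizer for its output. This is a minimal Python sketch: the prompt wording, the fallback choice, and the name normalize_label are assumptions, and no model call is made here.

```python
ALLOWED = ("qualified", "unqualified", "needs review")
FALLBACK = "needs review"  # what to return when the output cannot be trusted

# Hypothetical prompt text covering allowed labels, uncertainty handling, and no commentary.
ATOMIC_PROMPT = (
    "Classify the lead as exactly one of: qualified, unqualified, needs review.\n"
    "If you are uncertain, answer: needs review.\n"
    "Return only the label, with no extra commentary.\n\n"
    "Lead:\n{lead}"
)


def normalize_label(raw_output: str) -> str:
    """Map raw model output onto an allowed label, falling back when it does not match."""
    cleaned = raw_output.strip().strip(".\"'").lower()
    return cleaned if cleaned in ALLOWED else FALLBACK


print(ATOMIC_PROMPT.format(lead="Asked for pricing twice this week."))
print(normalize_label(" Qualified.\n"))
print(normalize_label("I think this lead is qualified"))
```

Routing anything that is not an exact allowed label to the fallback keeps extra commentary from leaking into the next workflow step.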
Compound Prompt Standard
Use compound prompts for complex tasks.
Good Compound Prompt Uses
Use for:
- structured reports
- content creation
- sales page analysis
- diagnostic output
- course absorption
- research synthesis
- competitor analysis
- prompt-to-query conversion
- detailed rewrite
- strategy creation
- multi-factor scoring
Compound Prompt Requirements
A compound prompt should include:
- identity
- task
- context
- input variables
- guidelines
- examples
- output format
- scoring rules if needed
- constraints
- failure instructions
Rule
Compound prompts should be structured in clear sections.
Deconstruction Method Standard
Use deconstruction when a task is too complex for one prompt.
Deconstruction Steps
- Identify the full task.
- Break it into smaller thinking steps.
- Decide which steps need separate prompts.
- Decide which outputs feed the next step.
- Add validation or human review where needed.
- Test each step separately.
- Test the full chain.
- Record failure points.
Deconstruction Rule
If one prompt produces generic or inconsistent output, break the task into stages.
Stacking Method Standard
Use stacking when multiple related instructions or prompt sections can safely live inside one prompt.
Stacking can reduce:
- duplicated context
- repeated API cost
- latency
- post-processing complexity
- unnecessary prompt calls
But stacking can reduce quality if the prompt becomes overloaded.
Stacking Questions
Ask:
- Can one prompt reliably handle this?
- Does stacking reduce cost?
- Does stacking reduce latency?
- Does stacking reduce output quality?
- Does stacking make the prompt harder to debug?
- Does each task still get enough attention?
- Would deconstruction produce better quality?
Rule
Stack only when output quality remains stable.
Tell And Show Method Standard
Use tell and show when output style, structure, or classification accuracy matters.
Tell And Show Template
Instruction:
Describe the rule.
Good Example:
Show the desired output.
Why It Works:
Explain the pattern.
Bad Example:
Show what to avoid if useful.
Task:
Ask the model to apply the pattern.
Rule
When the model keeps missing the target, add better examples before adding more vague instructions.
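The template above can be filled in programmatically so that every tell-and-show prompt has the same shape. In this Python sketch the function name render_tell_and_show and the example hook text are hypothetical; the section labels follow the template.

```python
from typing import Optional


def render_tell_and_show(
    instruction: str,
    good_example: str,
    why_it_works: str,
    task: str,
    bad_example: Optional[str] = None,
) -> str:
    """Render the tell-and-show sections in template order, omitting the bad example if unused."""
    sections = [
        ("Instruction", instruction),
        ("Good Example", good_example),
        ("Why It Works", why_it_works),
    ]
    if bad_example:
        sections.append(("Bad Example", bad_example))
    sections.append(("Task", task))
    return "\n\n".join(f"{title}:\n{body.strip()}" for title, body in sections)


rendered = render_tell_and_show(
    instruction="Write a single-sentence attention hook that names the pain and implies a specific benefit.",
    good_example="Your leads go cold after the first call, and one follow-up message can fix that.",  # invented example
    why_it_works="It names one concrete pain and points to one concrete benefit.",
    bad_example="Unlock explosive growth today!",  # invented example
    task="Write one hook for the offer described in the input.",
)
print(rendered)
```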
Import Method Standard
Use the import method when generic model knowledge is not enough.
Import Sources
Import from:
- MWMS Canon pages
- MCR pages
- course notes
- expert interviews
- SOPs
- client documents
- winning ads
- winning content
- proven sales emails
- industry rules
- platform documentation
- research reports
- audience language
- customer reviews
- competitor examples
Import Rule
The better the imported knowledge, the better the prompt can perform.
Planning Method Standard
The planning method asks the model to think through the task before creating the final output.
This is useful for:
- analysis
- classification
- content planning
- report generation
- diagnostic scoring
- research synthesis
- opportunity discovery
Planning Output Caution
Planning can improve quality but increase cost and output length.
For production systems, MWMS may need to:
- keep planning internal
- parse only the final answer
- use shorter planning steps
- use a cheaper model for planning
- suppress unnecessary reasoning in final output
Rule
Use planning when quality matters more than minimal token cost.
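One way to parse only the final answer is to ask the model to put its planning and its answer in separate tagged blocks, then keep just the answer. The Python sketch below assumes hypothetical plan and answer tags; the source training does not prescribe a tag format.

```python
import re

# Hypothetical instruction and tag names, used only for this illustration.
PLANNING_INSTRUCTION = (
    "First think through the task inside <plan> tags. "
    "Then give the final result inside <answer> tags."
)

_ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def extract_final_answer(model_output: str) -> str:
    """Return only the content of the first <answer> block, discarding the planning text."""
    match = _ANSWER_RE.search(model_output)
    if match is None:
        raise ValueError("No <answer> block found in model output")
    return match.group(1).strip()


sample_output = "<plan>Compare budget, urgency, and fit.</plan>\n<answer>\nqualified\n</answer>"
print(extract_final_answer(sample_output))
```

The planning text still costs output tokens, so this keeps the final output clean but does not by itself reduce cost.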
Anti Keyword Staining Standard
Some common words can bias the model toward weak generic outputs.
For example, words such as:
- tweet
- post
- headline
- blog
- article
- caption
- sales email
- motivational
- viral
may cause the model to imitate low-quality public training data.
Anti Keyword Staining Method
Instead of relying on generic labels, describe the actual output.
Examples:
Instead of:
- “Write a tweet.”
Use:
- “Write a concise short-form piece of copy designed to create curiosity and one clear takeaway.”
Instead of:
- “Write a headline.”
Use:
- “Write a single-sentence attention hook that names the pain and implies a specific benefit.”
Instead of:
- “Write a blog post.”
Use:
- “Write a structured answer-first guide for a problem-aware buyer.”
Rule
Use task-specific language when generic content labels produce generic outputs.
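A simple check for staining keywords can run during prompt review. This Python sketch scans prompt text for the example words listed above; the function name find_staining_keywords is hypothetical, and a match is only a flag for human attention, not proof that the prompt is weak.

```python
import re

# Example words from the list above; extend or trim per workflow.
STAINING_KEYWORDS = (
    "tweet", "post", "headline", "blog", "article",
    "caption", "sales email", "motivational", "viral",
)


def find_staining_keywords(prompt_text: str) -> list[str]:
    """Return listed keywords that appear as whole words (singular or plural) in the prompt."""
    lowered = prompt_text.lower()
    return [
        keyword
        for keyword in STAINING_KEYWORDS
        if re.search(rf"\b{re.escape(keyword)}s?\b", lowered)
    ]


print(find_staining_keywords("Write a viral tweet about plumbing."))
print(find_staining_keywords(
    "Write a concise short-form piece of copy designed to create curiosity and one clear takeaway."
))
```

The whole-word match avoids false flags on words such as "postpone" that merely contain a listed keyword.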
Prompt Chain Standard
Every prompt chain should define:
Chain Name:
Workflow:
Brain / Employee:
Step 1 Prompt:
Step 1 Output:
Step 2 Prompt:
Step 2 Output:
Step 3 Prompt:
Step 3 Output:
Human Review Point:
Validation Rules:
Failure Handling:
Final Output:
Prompt Chain Rule
Each step in a prompt chain should have a clear reason to exist.
Model Testing Standard
Before deploying a prompt, test multiple models where appropriate.
Model Testing Template
Prompt Name:
Task:
Test Input:
Model Tested:
Output Quality Score:
Formatting Score:
Consistency Score:
Cost:
Latency:
Failure Notes:
Decision: Use / Reject / Retest
Rule
The best model is the model that performs best for the specific task, not the newest model by default.
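Model test results recorded with the template above can be compared in code. The Python sketch below assumes 0 to 10 scores, hypothetical model names, and a simple selection rule (drop models over the cost or latency limit, then take the highest combined score); MWMS may weight the criteria differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelTestResult:
    model: str  # hypothetical model names in the example below
    quality: float  # 0-10
    formatting: float  # 0-10
    consistency: float  # 0-10
    cost_per_run: float
    latency_seconds: float


def choose_model(
    results: list[ModelTestResult], max_cost: float, max_latency: float
) -> Optional[ModelTestResult]:
    """Pick the highest-scoring model that stays within the cost and latency limits."""
    eligible = [
        r for r in results
        if r.cost_per_run <= max_cost and r.latency_seconds <= max_latency
    ]
    if not eligible:
        return None
    return max(eligible, key=lambda r: r.quality + r.formatting + r.consistency)


results = [
    ModelTestResult("model-a", 9.5, 9.0, 9.0, cost_per_run=0.050, latency_seconds=6.0),
    ModelTestResult("model-b", 8.5, 9.0, 8.5, cost_per_run=0.010, latency_seconds=2.0),
    ModelTestResult("model-c", 6.0, 7.0, 6.5, cost_per_run=0.001, latency_seconds=0.8),
]
best = choose_model(results, max_cost=0.02, max_latency=3.0)
print(best.model if best else "no model meets the limits")
```

In this invented data set the strongest model is excluded by the cost and latency limits, so the second-strongest model is chosen.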
Prompt Iteration Log
Every important prompt should keep an iteration log.
Iteration Log Template
Prompt Name:
Version:
Date:
Change Made:
Reason For Change:
Test Inputs Used:
Result:
Failure Fixed:
New Failure Created:
Decision: Keep / Revert / Retest
Owner:
Rule
Prompt improvements should be recorded so MWMS does not lose learning.
Prompt Quality Scorecard
Score important prompts out of 100.
Score Categories
Purpose Clarity: 10
Input Clarity: 10
Context Quality: 10
Guideline Strength: 10
Example Quality: 10
Output Format Reliability: 10
Model Fit: 10
Testing Coverage: 10
Cost / Latency Fit: 10
Observability / Versioning: 10
Interpretation
85–100: Production ready
70–84: Good; monitor and improve
55–69: Usable with human review
40–54: Needs rewrite before automation
Below 40: Do not deploy
Rule
A prompt should pass the scorecard before becoming part of an important automation.
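The scorecard total and its interpretation band can be computed directly from the ten category scores. In this Python sketch the category names and bands come from the scorecard above, while the function name score_prompt and the example scores are assumptions.

```python
CATEGORIES = (
    "Purpose Clarity", "Input Clarity", "Context Quality", "Guideline Strength",
    "Example Quality", "Output Format Reliability", "Model Fit", "Testing Coverage",
    "Cost / Latency Fit", "Observability / Versioning",
)
MAX_PER_CATEGORY = 10

# (minimum total, interpretation), checked from the highest band down.
BANDS = (
    (85, "Production ready"),
    (70, "Good; monitor and improve"),
    (55, "Usable with human review"),
    (40, "Needs rewrite before automation"),
    (0, "Do not deploy"),
)


def score_prompt(scores: dict[str, int]) -> tuple[int, str]:
    """Sum the category scores and map the total onto its interpretation band."""
    if set(scores) != set(CATEGORIES):
        raise ValueError("Scores must cover exactly the ten scorecard categories")
    for name, value in scores.items():
        if not 0 <= value <= MAX_PER_CATEGORY:
            raise ValueError(f"{name} must be between 0 and {MAX_PER_CATEGORY}")
    total = sum(scores.values())
    verdict = next(label for floor, label in BANDS if total >= floor)
    return total, verdict


example = {name: 8 for name in CATEGORIES}  # hypothetical scores
example["Testing Coverage"] = 5
print(score_prompt(example))
```

In this example nine categories at 8 and one at 5 total 77, which falls in the "Good; monitor and improve" band.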
Automation Prompt Readiness Checklist
Before a prompt is used in automation, confirm:
Purpose
- task is clear
- workflow is clear
- owner is clear
- Brain / Employee is clear
Input
- variables are defined
- required inputs are clear
- missing input handling exists
- source boundaries are clear
Instructions
- guidelines are specific
- constraints are clear
- examples are included where needed
- output format is defined
Testing
- multiple test inputs used
- edge cases tested
- output quality reviewed
- model choice tested
- cost and latency checked
Governance
- prompt version recorded
- failure modes documented
- human review point defined if needed
- observability fields defined
- change log started
Rule
No important automation prompt should go live without readiness review.
Content Prompt Flow Standard
Content prompts should usually be chained, not written in one pass.
Example Content Flow
- Source content collection
- Proven content analysis
- Audience and psychographic extraction
- Hook generation
- Hook selection
- Outline generation
- Body section generation
- Persuasion layer
- CTA generation
- Editing and compliance review
- Final output
Rule
For content systems, first validate the process manually, then convert the proven process into prompts.
AIBS Diagnostic Prompt Flow Standard
AIBS prompts should support diagnostic-first thinking.
Example AIBS Diagnostic Flow
- Parse client intake.
- Identify business model.
- Identify stated problem.
- Identify possible deeper problems.
- Map leakage categories.
- Review data readiness.
- Score AI readiness.
- Score opportunities.
- Recommend first project.
- Draft diagnostic report.
- Draft proposal summary.
Rule
AIBS prompts should diagnose before recommending AI implementation.
Research Prompt Flow Standard
Research prompts should separate extraction, interpretation, and recommendation.
Example Research Flow
- Extract facts.
- Identify source type.
- Identify claims.
- Score credibility.
- Summarize key findings.
- Identify business relevance.
- Route to Brain.
- Recommend action.
Rule
Do not mix raw extraction and strategic recommendation unless the prompt is tested for both.
Compliance Prompt Flow Standard
Compliance prompts should be conservative.
Compliance Prompt Requirements
Compliance prompts should:
- identify claims
- classify risk
- identify missing proof
- flag sensitive categories
- avoid legal certainty
- recommend human review where needed
- preserve source evidence
- output clear risk labels
Rule
Compliance prompts should flag risk, not pretend to replace professional legal review.
Prompt Failure Modes
Common prompt failures include:
- generic output
- format drift
- hallucinated facts
- missing sections
- inconsistent scoring
- wrong tone
- overlong output
- too-short output
- ignoring constraints
- mixing examples into output
- poor parsing
- invalid JSON
- weak classification
- too much creativity
- not enough specificity
- output not suitable for next workflow step
Rule
Failure modes should be recorded and used to improve prompt versions.
Prompt Debugging Checklist
When a prompt fails, ask:
- Was the purpose clear?
- Was the input clear?
- Was the context enough?
- Were examples provided?
- Was the output format too vague?
- Was the task too complex for one prompt?
- Should the task be deconstructed?
- Was the prompt overloaded?
- Should parts be stacked or separated?
- Was the model wrong for the task?
- Was the temperature too high?
- Was there conflicting instruction?
- Did generic keywords bias the output?
- Was specialist knowledge missing?
- Was the prompt tested on enough examples?
Rule
When the model fails, first inspect the prompt architecture before blaming the model.
Prompt Governance Roles
Prompt Owner
Responsible for prompt purpose, quality, and updates.
Prompt User
Uses the prompt inside a workflow.
Prompt Reviewer
Tests output quality and failure modes.
Compliance Reviewer
Reviews sensitive prompts for risk.
Data Reviewer
Checks inputs and outputs where structured data is involved.
HeadOffice
Approves important prompt standards and prevents unowned, untracked prompts from spreading across MWMS.
Rule
Important prompts need ownership.
Application To Prompting Framework
Prompting Framework owns this standard.
Prompting Framework should use it to:
- structure reusable prompts
- define prompt assets
- prevent prompt liabilities
- standardize prompt chains
- guide model testing
- improve output reliability
Prompting Framework Rule
Prompting Framework must make prompt quality repeatable.
Application To AI Employee Canon
AI Employee Canon should use this framework to define prompt requirements for every AI Employee.
Each AI Employee should have:
- role prompt
- task prompts
- output templates
- failure handling
- evaluation criteria
- prompt version history
- model choice notes
AI Employee Rule
An AI Employee is only as reliable as its prompt architecture.
Application To Automation Brain
Automation Brain should use this framework before deploying prompt-based automations.
Automation Brain should check:
- prompt type
- chain structure
- input variables
- output parsing
- model cost
- latency
- failure handling
- human review points
Automation Brain Rule
Automation Brain must not automate unstable prompts.
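The output parsing, failure handling, and human review checks above can be combined in one wrapper around a prompt call. The following is a sketch under stated assumptions: `run_with_validation`, its parameters, and the returned keys are hypothetical, and `call_model` and `validate` are supplied by the caller.

```python
from typing import Callable, Dict, Union


def run_with_validation(call_model: Callable[[str], str],
                        prompt: str,
                        validate: Callable[[str], bool],
                        max_retries: int = 2) -> Dict[str, Union[str, int, bool]]:
    """Call the model, retry on invalid output, and flag for human review if retries run out."""
    output = ""
    for attempt in range(max_retries + 1):
        output = call_model(prompt)
        if validate(output):
            return {"output": output, "retry_count": attempt, "needs_human_review": False}
    # Every attempt failed validation: hand the last output to a person instead of passing it on.
    return {"output": output, "retry_count": max_retries, "needs_human_review": True}
```

A prompt that regularly exhausts its retries is a sign of instability and should go back to testing rather than stay in the automation.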
Application To AIBS Brain
AIBS Brain should use this framework for client diagnostics, reports, and AIOS workflows.
AIBS prompts must:
- diagnose before recommending
- use client context safely
- respect privacy boundaries
- output structured reports
- support opportunity scoring
- avoid unsupported claims
AIBS Rule
AIBS prompt systems must be reliable enough for client-facing work.
Application To Content Brain
Content Brain should use prompt chains for high-quality content production.
Content prompts should use:
- proven content analysis
- imported specialist knowledge
- examples
- deconstruction
- hook generation
- outline generation
- persuasive layer
- compliance review
Content Brain Rule
Content Brain should not rely on one-pass generic content prompts for important assets.
Application To Ads Brain
Ads Brain should use prompt architecture for ad creative generation and analysis.
Ads prompts should define:
- platform
- buyer
- awareness level
- hook type
- compliance constraints
- output format
- variation count
- testing hypothesis
Ads Brain Rule
Ads prompts should create testable creative assets, not random ad copy.
Application To Research Brain
Research Brain should use structured prompts for extraction, synthesis, and recommendation.
Research prompts should:
- separate fact extraction from interpretation
- preserve source context
- classify business relevance
- route insights to the correct Brain
- identify uncertainty
Research Brain Rule
Research prompts must protect evidence quality.
Application To Data Brain
Data Brain should use this framework for structured extraction, classification, and metadata creation.
Data prompts should:
- output parseable structure
- define field names
- handle missing data
- avoid invented fields
- record confidence where needed
Data Brain Rule
Data prompts should produce structured, auditable outputs.
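The data prompt requirements above can be enforced after the model responds. The sketch below normalizes one extracted record against a fixed field list: missing fields become `None`, fields the prompt did not ask for are dropped and reported, and confidence is checked. `ALLOWED_FIELDS`, its example field names, and `normalize_record` are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List, Tuple

# Example field definition (assumption); a real Data prompt would define its own.
ALLOWED_FIELDS = ("company_name", "industry", "employee_count", "confidence")


def normalize_record(raw: Dict[str, Any]) -> Tuple[Dict[str, Any], List[str]]:
    """Return a record containing only the defined fields, plus a list of issues found."""
    issues: List[str] = []
    invented = sorted(set(raw) - set(ALLOWED_FIELDS))
    if invented:
        issues.append("invented fields dropped: " + ", ".join(invented))
    # Missing data is stored explicitly as None instead of being guessed.
    record = {name: raw.get(name) for name in ALLOWED_FIELDS}
    confidence = record["confidence"]
    if not isinstance(confidence, (int, float)) or not 0.0 <= confidence <= 1.0:
        issues.append("confidence missing or out of range")
        record["confidence"] = None
    return record, issues
```

Returning the issues alongside the record keeps the output auditable: the cleaned data and the reasons it was changed travel together.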
Application To Experimentation Brain
Experimentation Brain should test prompt performance like any other system experiment.
Experimentation should test:
- model choice
- prompt version
- example count
- output format
- chain design
- cost
- latency
- accuracy
- consistency
Experimentation Brain Rule
Prompt changes should be treated as experiments when they affect important outputs.
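A prompt experiment needs at least a consistency measure and an accuracy measure per version. The sketch below compares prompt versions by running each several times on the same input and scoring the outputs. The metric definitions here (share of runs matching the most common output, share of runs matching an expected answer) are simple assumptions suited to classification-style prompts, and the function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def consistency(outputs: List[str]) -> float:
    """Share of runs that agree with the most common output (1.0 = fully consistent)."""
    if not outputs:
        return 0.0
    most_common_count = Counter(outputs).most_common(1)[0][1]
    return most_common_count / len(outputs)


def accuracy(outputs: List[str], expected: str) -> float:
    """Share of runs whose output exactly matches the expected answer."""
    if not outputs:
        return 0.0
    return sum(1 for output in outputs if output == expected) / len(outputs)


def compare_versions(runs: Dict[str, List[str]], expected: str) -> Dict[str, Dict[str, float]]:
    """Score every prompt version on the same test input."""
    return {
        version: {"consistency": consistency(outputs), "accuracy": accuracy(outputs, expected)}
        for version, outputs in runs.items()
    }
```

For example, `compare_versions({"v1": ["A", "B", "A", "A"], "v2": ["A", "A", "A", "A"]}, expected="A")` scores v1 at 0.75 on both measures and v2 at 1.0 on both. Free-text outputs would need a looser comparison than exact string equality.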
Application To Compliance And Risk Brain
Compliance and Risk Brain should review prompts that affect sensitive outputs.
Review prompts for:
- claims
- privacy
- regulated topics
- financial assumptions
- health claims
- legal claims
- affiliate claims
- client data use
- AI processing risk
- hallucination risk
Compliance Rule
Sensitive prompts need compliance-aware constraints and review.
Application To HeadOffice Brain
HeadOffice governs prompt quality across MWMS.
HeadOffice should ask:
- is this prompt reusable?
- is it tested?
- is it versioned?
- is it observable?
- is it too expensive?
- is it reliable enough?
- does it create risk?
- does it support MWMS strategy?
- does it protect M from unnecessary rework?
HeadOffice Rule
HeadOffice must prevent MWMS from building on unstable prompt foundations.
Deferred Update And Parking Lot Section
This page creates update needs for the related pages listed below. These updates are deferred until each page is next revised.
Later Update 1: MWMS Prompting Framework
Add:
- prompt assets versus prompt liabilities
- one-shot prompt standard for automations
- prompt deconstruction and chaining
- tell and show examples
- import method
- anti-keyword staining
- model selection testing
- prompt iteration logs
Later Update 2: MWMS AI Employee Evaluation Scorecard Standard
Add:
- prompt reliability score
- output consistency score
- model fit score
- prompt iteration count
- example coverage score
- formatting reliability score
- failure-case testing
- prompt chain quality
Later Update 3: MWMS AI Observability Metadata Standard
Add:
- prompt version
- prompt chain step
- model used
- input token estimate
- output token estimate
- cost
- latency
- retry count
- validation status
- human review flag
- revision history
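The metadata fields listed for Later Update 3 could be captured as one record per prompt call. The sketch below is one possible shape, not the AI Observability Metadata Standard itself: the class name `PromptRunMetadata`, the units implied by `cost_usd` and `latency_ms`, and the default values are assumptions.

```python
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class PromptRunMetadata:
    prompt_version: str
    prompt_chain_step: str
    model_used: str
    input_token_estimate: int
    output_token_estimate: int
    cost_usd: float      # unit is an assumption
    latency_ms: int      # unit is an assumption
    retry_count: int = 0
    validation_status: str = "not_validated"
    human_review_flag: bool = False
    revision_history: List[str] = field(default_factory=list)


# Example record with placeholder values; asdict() gives a plain dict for logging.
example_run = PromptRunMetadata(
    prompt_version="v1.2",
    prompt_chain_step="extraction",
    model_used="example-model",
    input_token_estimate=1200,
    output_token_estimate=300,
    cost_usd=0.004,
    latency_ms=850,
)
example_log_entry = asdict(example_run)
```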
Later Update 4: MWMS AI Usage And Cost Visibility Standard
Add:
- prompt chain cost tracking
- high-volume prompt review
- context duplication warning
- stacking versus deconstruction cost comparison
- model cost comparison
- cost per successful output
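Cost per successful output divides total spend by the number of outputs that passed validation, not by the number of calls made. A minimal sketch, with a hypothetical function name and made-up figures used only to show the arithmetic:

```python
def cost_per_successful_output(total_cost: float, successful_runs: int) -> float:
    """Total spend divided by outputs that passed validation; infinite if none passed."""
    if successful_runs <= 0:
        return float("inf")
    return total_cost / successful_runs


# Illustrative numbers only: 100 calls on each model.
cheap_model = cost_per_successful_output(total_cost=1.00, successful_runs=40)
stronger_model = cost_per_successful_output(total_cost=2.00, successful_runs=95)
```

In this made-up case the cheaper model costs 0.025 per usable output and the stronger model about 0.021, so the model with the higher call price is the cheaper one once failures are counted.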
Later Update 5: MWMS Buyer First Authority Content And Channel Growth Framework
Add:
- proven content deconstruction
- scrape/analyze/repurpose workflow
- example-led content prompting
- caption/video/script prompt flows
- validate manually before automation
- output quality review against real performance data
Later Update 6: MWMS AIBS Business Diagnostic And Opportunity Discovery Framework
Add:
- diagnostic prompt chain
- client intake parsing prompt
- opportunity scoring prompt
- AI readiness prompt
- report generation prompt
- privacy-aware prompt constraints
Future Employee Ideas
- Prompt Architecture Auditor
- Prompt Chain Designer
- Prompt Quality Evaluator
- Prompt Cost And Latency Analyst
- Specialist Knowledge Injector
- Prompt Observability Steward
- AI Employee Prompt Reviewer
- Prompt Failure Mode Analyst
- Model Selection Tester
- Prompt Asset Librarian
Drift Protection
This framework protects MWMS from:
- treating prompts as casual chat messages
- building automations on vague prompts
- relying on one good output as proof
- ignoring prompt failures
- not versioning prompts
- not testing models
- not tracking cost
- not tracking latency
- creating prompt chains with no structure
- putting too much into one prompt
- splitting prompts unnecessarily
- using generic training data when specialist knowledge is needed
- forgetting examples
- creating outputs that cannot be parsed
- making AI Employees unreliable
- forcing M to fix prompt-driven mistakes manually
- deploying client-facing prompts before testing
- losing prompt knowledge inside chat history
Drift Signals
Watch for:
- “Just ask ChatGPT to do it.”
- “The prompt worked once, so it is ready.”
- “We can fix it manually later.”
- “No need to version it.”
- “The model should know what I mean.”
- “The prompt is long, so it must be good.”
- “The prompt is short, so it must be efficient.”
- “We do not need examples.”
- “We do not know which model is being used.”
- “We do not know what this prompt costs.”
- “The automation output changes every time.”
- “The output format keeps breaking.”
- “The prompt lives only in a chat thread.”
- “The AI Employee is unreliable but nobody knows why.”
Rule
If the prompt cannot be tested, versioned, and explained, it is not ready for serious automation.
Strategic Summary
This framework captures the most useful lessons from the Master Prompting w Devin block.
The key lesson is:
Prompt engineering for automations is not about clever wording. It is about designing reliable prompt systems.
MWMS should treat prompts as assets that can be:
- built
- tested
- improved
- chained
- versioned
- scored
- logged
- monitored
- reused
- governed
The block showed that powerful AI automation depends on:
- one-shot prompts for repeatability
- atomic prompts for small tasks
- compound prompts for complex tasks
- deconstruction for quality
- stacking for efficiency
- tell and show examples
- imported specialist knowledge
- planning before output
- anti-keyword staining where generic terms hurt quality
- chaining outputs across steps
- model testing
- daily practice and iteration
For MWMS, this strengthens every Brain.
It improves AI Employees.
It improves course absorption.
It improves content production.
It improves AIBS diagnostics.
It improves automation reliability.
It improves observability and cost control.
The most important system-level standard is:
Every important AI automation must have prompt architecture, not just a prompt.
Final Standard
The MWMS final standard is:
Every important MWMS prompt used in an AI Employee, automation, report, content system, diagnostic system, research system, or client-facing workflow must be designed as a reusable prompt asset with clear purpose, structured inputs, imported context where needed, specific guidelines, examples, defined output format, model testing, iteration history, cost and latency awareness, failure handling, and observability metadata.
A valid MWMS prompt asset must define:
- prompt name
- purpose
- Brain / Employee
- workflow
- prompt type
- input variables
- context
- guidelines
- examples
- output format
- model
- test inputs
- quality criteria
- cost / latency notes
- failure modes
- version
- owner
- last reviewed date
That is the MWMS Prompt Architecture And Automation Output Reliability standard.
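The required fields above lend themselves to a simple completeness check before a prompt is accepted as an asset. The sketch below assumes prompt assets are stored as dictionaries; the snake_case key names and the function `missing_asset_fields` are hypothetical mappings of the listed fields, not a defined MWMS schema.

```python
from typing import Any, Dict, List

# One key per field in the list above (key names are assumptions).
REQUIRED_ASSET_FIELDS = (
    "prompt_name", "purpose", "brain_or_employee", "workflow", "prompt_type",
    "input_variables", "context", "guidelines", "examples", "output_format",
    "model", "test_inputs", "quality_criteria", "cost_latency_notes",
    "failure_modes", "version", "owner", "last_reviewed_date",
)


def missing_asset_fields(asset: Dict[str, Any]) -> List[str]:
    """Return the required fields that are absent or empty, in the standard's order."""
    empty_values = (None, "", [], {})
    return [name for name in REQUIRED_ASSET_FIELDS if asset.get(name) in empty_values]
```

An empty result means the asset defines every required field. It does not mean the content of those fields is good, which still depends on testing and review.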
Change Log
Version: v1.0
Date: 2026-06-08
Author: HeadOffice
Change:
Created the MWMS Prompt Architecture And Automation Output Reliability Framework from the AI Automations by Jack Master Prompting And Prompt System Design Block.
Captured the strongest lessons from:
- Master Prompting w Devin Part 1
- Master Prompting w Devin Part 2
Defined the MWMS Prompt Architecture And Automation Output Reliability Model with twelve layers:
- Prompt Purpose Layer
- Prompt Type Layer
- Input And Variable Layer
- Context And Knowledge Layer
- Guideline And Constraint Layer
- Example And Tell And Show Layer
- Deconstruction And Chain Layer
- Output Format Layer
- Model Selection Layer
- Testing And Iteration Layer
- Cost Latency And Scale Layer
- Observability And Governance Layer
Added key operating sections:
- Prompt Asset Standard
- Prompt Liability Warning
- Atomic Prompt Standard
- Compound Prompt Standard
- Deconstruction Method Standard
- Stacking Method Standard
- Tell And Show Method Standard
- Import Method Standard
- Planning Method Standard
- Anti Keyword Staining Standard
- Prompt Chain Standard
- Model Testing Standard
- Prompt Iteration Log
- Prompt Quality Scorecard
- Automation Prompt Readiness Checklist
- Content Prompt Flow Standard
- AIBS Diagnostic Prompt Flow Standard
- Research Prompt Flow Standard
- Compliance Prompt Flow Standard
- Prompt Failure Modes
- Prompt Debugging Checklist
- Prompt Governance Roles
- Deferred Update And Parking Lot Section
Mapped the framework across:
- Prompting Framework
- AI Employee Canon
- Automation Brain
- AIBS Brain
- Content Brain
- Ads Brain
- Research Brain
- Data Brain
- Experimentation Brain
- Compliance Brain
- Risk Brain
- HeadOffice Brain
Purpose of creation:
To establish a formal MWMS standard for designing, testing, chaining, versioning, and governing prompt systems so MWMS AI Employees and automations produce reliable, consistent, cost-aware, observable, and high-quality outputs.
END — MWMS PROMPT ARCHITECTURE AND AUTOMATION OUTPUT RELIABILITY FRAMEWORK v1.0