System: MWMS
Document Type: Operating Framework
Authority Level: MCR Source Of Truth
Status: Draft For MCR
Version: v1.0
Primary Location: MCR
Future Operational Destination: Data Brain, Research Brain, AIBS Brain, Client Intelligence Systems, Automation Brain, HeadOffice Brain, Compliance Brain, Risk Brain
Parent Page: Data Brain
Owner: Martyn
Developer Boundary: Do Not Touch M’s Active Build Areas Unless Specifically Assigned
Source Of Truth: MCR
Last Reviewed: 2026-06-08
Source / Origin: AI Automations by Jack AI Native Entrepreneur Architecture And Tool Decision Block
MWMS Classification: RAG Framework / Knowledge Base Infrastructure Framework / Client Memory Framework / Vector Memory Standard / Source Based AI Retrieval Standard
Primary Brain: Data Brain
Supporting Brains: Research Brain, AIBS Brain, Automation Brain, HeadOffice Brain, Compliance Brain, Risk Brain, Sales Brain, Content Brain, Product Brain, UX Brain, Prompting Framework
Related Pages: MWMS Supabase RAG And Vector Memory Framework, MWMS Client Intelligence And Business Memory Automation Framework, MWMS AIBS Automation Audit And Opportunity Mapping Framework, MWMS Automation Architecture And Tool Selection Framework, MWMS Source Visibility And Evidence Display Standard, MWMS AI Observability Metadata Standard, MWMS AI Automation Security And Risk Checklist, MWMS Prompt Architecture And Automation Output Reliability Framework, MWMS Client Intelligence Report Automation Framework
Purpose
The purpose of the MWMS RAG Knowledge Base And Client Memory Infrastructure Framework is to define how MWMS creates reliable, source based knowledge systems that allow AI Employees, client assistants, diagnostic systems, report generators, support agents, voice agents, and business intelligence tools to answer from approved knowledge rather than generic model memory.
This framework exists because many MWMS systems will need to work from:
- uploaded documents
- client files
- website content
- SOPs
- sales decks
- case studies
- training materials
- FAQs
- call transcripts
- emails
- meeting notes
- Google Drive files
- internal MCR pages
- business knowledge bases
- client intelligence records
- research documents
- course transcripts
- support histories
- customer feedback
- product documentation
The core purpose is:
To help MWMS build RAG and knowledge base systems that retrieve the right information, preserve source context, avoid stale knowledge, reduce hallucination, protect client data, and create useful business memory.
Core Doctrine
The MWMS doctrine is:
AI should answer from approved knowledge when the task depends on business facts.
Generic AI is useful for general thinking.
But client systems, internal Brains, business reports, support agents, and AIBS diagnostics need grounded memory.
A good RAG system allows AI to:
- search approved sources
- retrieve relevant context
- answer from business-specific material
- avoid pretending to know what is missing
- cite or reference source material where needed
- update when documents change
- remove outdated knowledge
- keep client memory separate
- preserve confidence and source metadata
- support better diagnosis and decision-making
The key doctrine is:
Memory without source control becomes misinformation.
Strategic Importance
This framework is strategically important because RAG is one of the infrastructure foundations for the MWMS ecosystem.
It supports:
- AIBS client memory
- AI Business Systems Brain
- Research Brain
- Data Brain
- HeadOffice Brain
- Content Brain
- Sales Brain
- Client Intelligence reports
- knowledge-based chatbots
- WhatsApp assistants
- AI voice agents
- internal support assistants
- proposal assistants
- business diagnostic tools
- client onboarding systems
- prompt vault retrieval
- MCR page retrieval
- course absorption intelligence
- future AI Employee memory
The AI Native Entrepreneur RAG material showed a practical pattern: documents are placed into a Google Drive folder, processed into database records, embedded into a vector searchable structure, and moved into a processed folder. When a document is later placed into a delete folder, its stored records are deleted from memory.
That matters because a knowledge base is not only about adding information.
It must also support:
- processing
- storage
- retrieval
- source metadata
- deletion
- freshness
- confidence
- governance
- human review
The strategic upgrade is:
MWMS should not build AI assistants that merely sound smart. MWMS should build AI assistants that know where their answers came from.
Definition
RAG means retrieval augmented generation.
In MWMS terms, RAG means:
The AI retrieves relevant approved knowledge before it answers, drafts, recommends, or acts.
Knowledge base means an organized collection of approved information that AI systems are allowed to search.
Embedding means converting text into numerical form so similar ideas can be retrieved by meaning rather than only exact keywords.
Vector memory means stored embedded knowledge that can be searched semantically.
Client memory means approved business-specific knowledge about a client that can be retrieved for diagnostics, reports, support, sales, and automation.
Source metadata means information attached to each stored record so MWMS knows where it came from, when it was captured, who approved it, what client it belongs to, and how reliable it is.
MWMS Definition
The MWMS RAG Knowledge Base And Client Memory Infrastructure Framework is:
Data Brain’s standard for creating, maintaining, retrieving, deleting, and governing source based AI knowledge systems so MWMS Brains and AI Employees can answer from approved business memory instead of unsupported general model assumptions.
Scope
This framework applies to:
- RAG systems
- vector memory systems
- Supabase vector systems
- Pinecone style memory systems
- client knowledge bases
- internal MWMS knowledge bases
- document upload pipelines
- Google Drive based knowledge intake
- MCR page retrieval
- business memory automation
- client support assistants
- voice agent knowledge bases
- WhatsApp assistant memory
- website chatbot memory
- proposal generation memory
- report generation memory
- content generation from approved sources
- AI employee memory
- prompt vault retrieval
- client intelligence systems
- AIBS diagnostic systems
This framework does not provide development instructions.
It defines the operating rules for safe RAG infrastructure and knowledge governance.
Core Principle
The core principle is:
Retrieval is only useful when the source is trusted, current, relevant, and allowed.
A RAG system is not automatically reliable.
It can still fail if:
- poor documents are uploaded
- old documents are not removed
- sources are not tagged
- client data is mixed
- private information is exposed
- retrieval pulls the wrong chunk
- AI overstates retrieved context
- metadata is missing
- deletion is not handled
- human review is skipped
- source confidence is unknown
Rule
RAG must be treated as governed infrastructure, not a magic memory folder.
The MWMS RAG Knowledge Base And Client Memory Model
Every MWMS RAG system should be designed across twelve layers:
- Knowledge Purpose Layer
- Source Intake Layer
- Permission And Sensitivity Layer
- Chunking And Processing Layer
- Embedding And Vector Layer
- Metadata And Source Layer
- Storage And Database Layer
- Retrieval And Ranking Layer
- Answer Generation Layer
- Deletion And Freshness Layer
- Observability And Testing Layer
- Governance And Improvement Layer
1. Knowledge Purpose Layer
Every RAG system must start with a purpose.
Do not build a knowledge base just because documents exist.
Purpose Questions
Ask:
- what will this knowledge base answer
- who will use it
- which Brain owns it
- is it internal or client-facing
- is it for support
- is it for diagnostics
- is it for reports
- is it for content
- is it for sales
- is it for voice agents
- is it for employee training
- is it for client intelligence
- what decision or workflow does it improve
Valid Purpose Examples
Use RAG to:
- answer customer questions
- answer team questions
- support AI voice agents
- support WhatsApp assistants
- create client reports
- generate business diagnostics
- draft proposals
- retrieve SOPs
- search training materials
- support course absorption
- support research synthesis
- support content creation from approved sources
- support internal Brain memory
Weak Purpose Examples
Avoid:
- “store everything”
- “make AI smarter”
- “dump all client files in”
- “build a second brain without structure”
- “let the AI figure it out”
- “we might need it later”
Rule
No RAG system should be created without a defined knowledge purpose.
2. Source Intake Layer
The quality of the RAG system depends on source quality.
Source Types
Sources may include:
- PDFs
- Google Docs
- Word documents
- spreadsheets
- website pages
- service pages
- sales decks
- case studies
- SOPs
- training documents
- FAQs
- product documentation
- onboarding files
- call transcripts
- meeting notes
- email exports
- chat transcripts
- support tickets
- review data
- social content
- newsletter content
- MCR pages
- course transcripts
- client reports
- competitor pages
- public articles
- approved internal notes
Source Intake Questions
Ask:
- what is the source
- who owns it
- who approved it
- is it current
- is it complete
- is it accurate
- is it sensitive
- does it belong to a client
- does it need redaction
- does it need summarizing first
- is it worth storing
- should it be temporary or long term
- does it need deletion later
Source Folder Pattern
A simple source intake system may use:
- To Add folder
- Added or Processed folder
- Delete folder
- Error folder
- Review Needed folder
The course example used a Google Drive style intake: documents placed into the To Add folder were processed into the database and then moved into the Added folder, so the operator could see at a glance which files had been processed.
Rule
Every source must enter through a controlled intake path.
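As a rough illustration of the folder pattern above (not a build instruction), the sketch below routes a file out of the intake folder according to its processing outcome. It assumes a local filesystem standing in for Google Drive. The folder names, the outcome labels, and the functions `ensure_folders` and `route_file` are hypothetical.

```python
from pathlib import Path
import shutil

# Hypothetical folder names that mirror the pattern above.
FOLDERS = ("to_add", "processed", "delete", "error", "review_needed")

# Hypothetical outcome labels mapped to the folder a file should land in.
OUTCOME_TO_FOLDER = {
    "ok": "processed",
    "failed": "error",
    "unsure": "review_needed",
}


def ensure_folders(root: Path) -> None:
    """Create every intake folder under root if it does not exist yet."""
    for name in FOLDERS:
        (root / name).mkdir(parents=True, exist_ok=True)


def route_file(root: Path, file_path: Path, outcome: str) -> Path:
    """Move a file out of the intake folder so its status is visible."""
    if outcome not in OUTCOME_TO_FOLDER:
        raise ValueError(f"unknown outcome: {outcome}")
    destination = root / OUTCOME_TO_FOLDER[outcome] / file_path.name
    shutil.move(str(file_path), str(destination))
    return destination
```

Because every outcome has a destination folder, a file never sits silently in the intake folder after a failure.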
3. Permission And Sensitivity Layer
RAG systems can expose sensitive information if poorly governed.
Permission Types
Sources may be:
- public
- internal MWMS approved
- client approved
- client confidential
- customer sensitive
- employee sensitive
- regulated
- temporary processing only
- excluded from AI use
Sensitivity Levels
Use:
Level 1 Public
Public website, public article, public marketing content.
Level 2 Internal
MWMS internal frameworks, non-sensitive SOPs, course notes.
Level 3 Client Confidential
Client sales decks, internal SOPs, private business documents.
Level 4 Customer Sensitive
Customer records, support conversations, emails, personal data.
Level 5 Restricted
Health, legal, financial, employee, payment, or highly sensitive records.
Permission Questions
Ask:
- are we allowed to process this source
- are we allowed to store this source
- are we allowed to use it in AI output
- are we allowed to expose answers to staff
- are we allowed to expose answers to customers
- does the client know this is being used
- is AI processing approved
- should the source be redacted
- should it be excluded from retrieval
- who can access the memory
Rule
Private data must not enter RAG without permission and sensitivity labeling.
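The five sensitivity levels and the rule above could be expressed as a simple gate, sketched here under stated assumptions. The `Sensitivity` enum and `may_enter_rag` function are hypothetical names, and treating every level above Public as requiring confirmed permission is an assumption made for this sketch, not a policy stated in this framework.

```python
from enum import IntEnum
from typing import Optional


class Sensitivity(IntEnum):
    """The five sensitivity levels listed above."""
    PUBLIC = 1
    INTERNAL = 2
    CLIENT_CONFIDENTIAL = 3
    CUSTOMER_SENSITIVE = 4
    RESTRICTED = 5


def may_enter_rag(sensitivity: Optional[Sensitivity],
                  permission_confirmed: bool) -> bool:
    """Return True only if the source is labeled and, when not public, permitted."""
    if sensitivity is None:
        # Unlabeled sources are rejected outright.
        return False
    if sensitivity == Sensitivity.PUBLIC:
        return True
    # Assumption: every non-public level needs confirmed permission.
    return permission_confirmed
```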
4. Chunking And Processing Layer
Documents must be processed into useful pieces.
Poor chunking creates poor retrieval.
Processing Steps
Processing may include:
- file detection
- text extraction
- OCR only when necessary
- cleaning
- splitting into chunks
- preserving headings
- preserving document title
- preserving page or section reference
- tagging
- summarizing
- metadata creation
- embedding
- storage
- processed file movement
Chunking Questions
Ask:
- how large should each chunk be
- should chunks overlap
- are headings preserved
- is page reference preserved
- are tables handled correctly
- are images or diagrams important
- is the text clean
- are repeated headers removed
- are irrelevant sections excluded
- should the document be summarized first
Chunk Quality Rules
Good chunks should:
- contain one coherent idea
- preserve enough context
- include source identity
- avoid being too long
- avoid being too short
- retain useful headings
- allow retrieval by meaning
- support accurate answering
Rule
Chunking should preserve meaning, not just split text mechanically.
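One way to picture chunking that preserves headings and document title is sketched below. It assumes, purely for illustration, that headings are lines starting with "# " and that chunk size is measured in words. The `Chunk` type, the `chunk_document` function, and the default sizes are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    document_title: str
    heading: str
    text: str


def chunk_document(title: str, text: str,
                   max_words: int = 120, overlap: int = 20) -> List[Chunk]:
    """Split text into overlapping word windows, one heading section at a time."""
    if overlap >= max_words:
        raise ValueError("overlap must be smaller than max_words")
    chunks: List[Chunk] = []
    heading = ""
    buffer: List[str] = []

    def flush() -> None:
        # Emit the buffered section as windows that never cross a heading.
        words = " ".join(buffer).split()
        start = 0
        while start < len(words):
            window = words[start:start + max_words]
            chunks.append(Chunk(title, heading, " ".join(window)))
            if start + max_words >= len(words):
                break
            start += max_words - overlap
        buffer.clear()

    for line in text.splitlines():
        if line.startswith("# "):  # assumed heading marker
            flush()
            heading = line[2:].strip()
        elif line.strip():
            buffer.append(line.strip())
    flush()
    return chunks
```

Each chunk carries its document title and heading, so a retrieved piece can still be traced back to its section.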
5. Embedding And Vector Layer
Embedding converts text into searchable meaning.
The RAG material explained embeddings as numerical representations that allow the system to find the right drawer of information when a user asks a question.
Embedding Questions
Ask:
- which embedding model is used
- what dimension is required
- what language support is needed
- what cost applies
- what database stores the vector
- can vectors be deleted
- can vectors be updated
- is metadata stored with the vector
- is the embedding model suitable for the content
Vector Store Options
Possible vector stores include:
- Supabase vector
- Pinecone
- other vector databases
- managed RAG platforms
- internal MWMS future vector infrastructure
Rule
Embeddings are not the answer. They are the retrieval map.
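To make "retrieval by meaning" concrete, the sketch below ranks stored vectors by cosine similarity to a query vector. It is illustrative only. The vectors in the usage note are hand-made toy values, not the output of a real embedding model, and `cosine_similarity` and `nearest` are hypothetical helper names.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query: Sequence[float],
            store: Dict[str, Sequence[float]]) -> List[str]:
    """Return record IDs ordered from most to least similar to the query."""
    return sorted(
        store,
        key=lambda record_id: cosine_similarity(query, store[record_id]),
        reverse=True,
    )
```

For example, with a toy store of `{"refund_policy": [0.9, 0.1, 0.0], "opening_hours": [0.0, 0.2, 0.9]}` and a query of `[1.0, 0.0, 0.0]`, `nearest` places `refund_policy` first. The ranking only points to candidate records. It does not decide whether a record is allowed or current.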
6. Metadata And Source Layer
Metadata is what makes memory governable.
Without metadata, RAG becomes a black box.
Required Metadata Fields
Each knowledge record should include:
Client Or Brain:
Source Name:
Source Type:
Source URL Or Location:
Document Title:
Section Or Page:
Date Captured:
Last Reviewed:
Permission Level:
Sensitivity Level:
Topic:
Tags:
Owner:
Source Confidence:
Retention Rule:
Version:
Hash Or Source ID:
Processed Status:
Deletion Status:
Useful Metadata Fields
Optional fields:
- department
- workflow area
- customer journey stage
- offer
- service
- audience
- document category
- language
- country
- compliance flag
- human verified
- stale after date
- replaced by source
- source priority
Metadata Use Cases
Metadata allows MWMS to:
- retrieve only one client’s information
- filter by topic
- exclude sensitive sources
- remove outdated documents
- delete all chunks from one file
- preserve source confidence
- route outputs to the correct Brain
- separate facts from assumptions
- track stale knowledge
Rule
Every chunk must know where it came from.
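A minimal sketch of how a Hash Or Source ID might tie every chunk back to its file is shown below. Using a SHA-256 hash of the file contents as the source ID is an assumption for illustration, and `source_hash` and `tag_chunks` are hypothetical names.

```python
import hashlib
from typing import Dict, List


def source_hash(content: bytes) -> str:
    """Return a stable hex digest for the source file contents."""
    return hashlib.sha256(content).hexdigest()


def tag_chunks(chunks: List[str], source_name: str,
               content: bytes) -> List[Dict[str, object]]:
    """Attach the same source ID to every chunk produced from one file."""
    source_id = source_hash(content)
    return [
        {
            "source_id": source_id,
            "source_name": source_name,
            "chunk_index": index,
            "text": text,
        }
        for index, text in enumerate(chunks)
    ]
```

Because all chunks from one file share a source ID, filtering by client or deleting all chunks from one file becomes a lookup rather than a search.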
7. Storage And Database Layer
The RAG system needs clear storage decisions.
Storage Types
Use:
- source file storage
- extracted text storage
- chunk storage
- vector storage
- metadata storage
- chat history storage
- query log storage
- output log storage
- deletion log storage
- review status storage
Database Options
Use:
- Supabase
- Pinecone
- WordPress database
- Google Drive
- Airtable
- Google Sheets
- custom database
- future MWMS memory infrastructure
Storage Questions
Ask:
- where are original files stored
- where are chunks stored
- where are embeddings stored
- where is metadata stored
- where is chat history stored
- where are outputs logged
- where are delete requests logged
- is storage client separated
- is access controlled
- can data be exported
- can data be deleted
Rule
The original source, extracted knowledge, embedding, and metadata must not become disconnected.
8. Retrieval And Ranking Layer
Retrieval is where RAG succeeds or fails.
Retrieval Questions
Ask:
- what question is being asked
- what source set should be searched
- what client or Brain should be searched
- what sources should be excluded
- how many chunks should be retrieved
- should metadata filters be applied
- should recent sources rank higher
- should verified sources rank higher
- should low-confidence sources be excluded
- should the answer include source references
- should the AI say when nothing relevant is found
Retrieval Filters
Use filters such as:
- client
- Brain
- source type
- topic
- date
- permission level
- sensitivity level
- source confidence
- document status
- verified status
- retention status
Retrieval Failure Examples
Failures include:
- wrong client retrieved
- old document retrieved
- irrelevant chunk retrieved
- sensitive source retrieved
- too many chunks retrieved
- too few chunks retrieved
- AI answers from general knowledge instead of source
- AI invents missing details
Rule
Retrieval must be filtered by context, not only similarity.
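The rule above can be sketched as "filter first, then rank". The example assumes a similarity score has already been computed upstream for each candidate. The `Candidate` type, its field names, and the `retrieve` function are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    record_id: str
    client: str
    sensitivity_level: int
    status: str
    similarity: float  # assumed to be computed by the vector search step


def retrieve(candidates: List[Candidate], client: str,
             max_sensitivity: int, top_k: int = 3) -> List[Candidate]:
    """Keep only allowed records for this client, then rank by similarity."""
    allowed = [
        c for c in candidates
        if c.client == client
        and c.sensitivity_level <= max_sensitivity
        and c.status == "Active"
    ]
    allowed.sort(key=lambda c: c.similarity, reverse=True)
    return allowed[:top_k]
```

In this sketch a highly similar chunk from another client, or a stale one, is never returned, however close the semantic match.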
9. Answer Generation Layer
The AI must answer from retrieved knowledge responsibly.
Answering Rules
The AI should:
- answer from retrieved context
- identify when context is missing
- avoid unsupported claims
- avoid overconfidence
- preserve source distinction
- include citations or source references where required
- separate fact from interpretation
- ask for more information when needed
- recommend human review for high-risk outputs
- avoid exposing sensitive information unnecessarily
Answer Types
RAG outputs may include:
- direct answer
- summary
- report section
- proposal draft
- support reply
- customer reply
- internal note
- content draft
- sales call prep
- diagnostic finding
- opportunity recommendation
Rule
If retrieved context does not support the answer, the AI must not pretend that it does.
10. Deletion And Freshness Layer
RAG systems must remove or replace old information.
The RAG material showed that deleting the original file from a folder is not enough; the corresponding stored database records must also be deleted.
Deletion Reasons
Delete or retire knowledge when:
- document is outdated
- client revokes permission
- source is wrong
- source is replaced
- customer data should not be stored
- retention period ends
- privacy request is received
- duplicate data exists
- test data is no longer needed
- project is closed
Freshness Questions
Ask:
- when was this source captured
- is it still current
- has the business changed
- has pricing changed
- has the offer changed
- has the SOP changed
- has the website changed
- has the source been superseded
- should this be reprocessed
- should old chunks be removed
Deletion Standard
Deletion should remove:
- original file if required
- extracted text
- chunk records
- embeddings
- metadata records
- related source references where required
- access to the deleted source
Deletion should preserve:
- deletion log
- who deleted it
- why it was deleted
- date deleted
- source ID or hash
- audit note where needed
Rule
A RAG system without deletion control becomes dangerous over time.
11. Observability And Testing Layer
RAG systems must be tested.
Test Questions
Ask:
- does the system retrieve the right source
- does it ignore wrong sources
- does it answer only from context
- does it say when it does not know
- does it handle deleted sources
- does it handle updated sources
- does it separate clients
- does it preserve sensitive boundaries
- does it log retrieved chunks
- does it handle poor questions
- does it handle contradictory sources
- does it hallucinate
Observability Fields
Track:
User Query:
Retrieved Source IDs:
Retrieved Chunks:
Confidence:
Answer Generated:
Model Used:
Prompt Version:
Human Review Needed:
Accepted Or Corrected:
Error:
Date:
Rule
If MWMS cannot see what the RAG system retrieved, MWMS cannot trust the output.
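The observability fields above could be captured as one structured log entry per query. This is a sketch only. The function name `build_retrieval_log` and the snake_case keys are hypothetical, and the entry is a plain dictionary so it can be stored wherever logs live.

```python
import json
from datetime import datetime, timezone
from typing import Dict, List, Optional


def build_retrieval_log(user_query: str,
                        retrieved_source_ids: List[str],
                        retrieved_chunks: List[str],
                        confidence: str,
                        answer_generated: str,
                        model_used: str,
                        prompt_version: str,
                        human_review_needed: bool,
                        error: Optional[str] = None) -> Dict[str, object]:
    """Build one log entry covering the observability fields listed above."""
    return {
        "user_query": user_query,
        "retrieved_source_ids": retrieved_source_ids,
        "retrieved_chunks": retrieved_chunks,
        "confidence": confidence,
        "answer_generated": answer_generated,
        "model_used": model_used,
        "prompt_version": prompt_version,
        "human_review_needed": human_review_needed,
        "accepted_or_corrected": None,  # filled in after human review
        "error": error,
        "date": datetime.now(timezone.utc).isoformat(),
    }


def to_json_line(entry: Dict[str, object]) -> str:
    """Serialize an entry as a single JSON line for append-only logging."""
    return json.dumps(entry)
```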
12. Governance And Improvement Layer
RAG systems need ongoing governance.
Governance Questions
Ask:
- who owns the knowledge base
- who can add sources
- who can delete sources
- who approves private sources
- who reviews stale information
- who checks hallucination risk
- who maintains metadata
- who monitors failures
- who handles client deletion requests
- who decides when memory becomes production grade
Improvement Inputs
Improve the system using:
- failed queries
- bad retrieval examples
- human corrections
- user feedback
- stale source reviews
- new documents
- updated prompts
- better metadata
- better chunking
- better filters
- better source confidence scoring
Rule
A knowledge base must be maintained or it becomes a liability.
Knowledge Base Types
MWMS may create several types of RAG knowledge bases.
Type 1: Internal MWMS Knowledge Base
Purpose:
- retrieve MCR standards
- support course absorption
- support AI Employees
- support HeadOffice decisions
- support internal strategy
Primary users:
- Martyn
- M
- AI Employees
- future MWMS operators
Risk level:
- medium, depending on source content
Type 2: AIBS Client Knowledge Base
Purpose:
- support client diagnostics
- answer business questions
- generate reports
- map opportunities
- support proposals
Primary users:
- AIBS Brain
- Client Intelligence systems
- Sales Brain
- client-facing reports
Risk level:
- medium to high
Type 3: Customer Support Knowledge Base
Purpose:
- answer customer questions
- support chatbots
- support WhatsApp assistants
- support voice agents
- reduce support workload
Primary users:
- customer service AI
- support staff
- customers
Risk level:
- high if customer-facing
Type 4: Sales And Proposal Knowledge Base
Purpose:
- retrieve case studies
- retrieve offer details
- retrieve pricing logic
- retrieve objections
- support proposal drafting
Primary users:
- Sales Brain
- AIBS Brain
- proposal assistants
Risk level:
- medium
Type 5: Content Knowledge Base
Purpose:
- retrieve approved content
- preserve brand voice
- repurpose source material
- create educational assets
- support AI visibility
Primary users:
- Content Brain
- Social Media Brain
- Affiliate Brain
- AIBS Brain
Risk level:
- medium
Type 6: Voice Agent Knowledge Base
Purpose:
- give voice agents business knowledge
- answer caller questions
- route calls
- qualify leads
- support appointment booking
Primary users:
- AI voice agents
- Sales Brain
- AIBS Brain
- local business systems
Risk level:
- high because output is live conversation
RAG Intake Checklist
Before adding a knowledge source, confirm:
Source
- source name
- source type
- source owner
- source location
- source date
- source purpose
- source quality
- source status
Permission
- public or private
- client approval
- AI processing approval
- storage approval
- allowed users
- excluded users
- retention rule
Sensitivity
- personal data
- customer data
- staff data
- confidential business data
- regulated data
- public data
- internal data
Processing
- extract method
- chunking rule
- metadata fields
- topic tags
- source confidence
- stale after date
- deletion rule
Rule
No source should enter RAG without metadata and permission review.
Knowledge Record Standard
Each stored knowledge record should include:
Record ID:
Client Or Brain:
Source ID:
Source Name:
Source Type:
Document Title:
Chunk Text:
Embedding:
Topic:
Tags:
Permission Level:
Sensitivity Level:
Source Confidence:
Date Captured:
Last Reviewed:
Stale After:
Retention Rule:
Hash Or File ID:
Status: Active / Stale / Replaced / Deleted / Review Needed
Human Verified: Yes / No
Rule
A knowledge record must be traceable, filterable, and deletable.
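The record standard above maps naturally onto a typed structure. The sketch below is illustrative and not a schema decision. The field types, the defaults, and the `is_usable` check are assumptions.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import List, Optional


class Status(Enum):
    ACTIVE = "Active"
    STALE = "Stale"
    REPLACED = "Replaced"
    DELETED = "Deleted"
    REVIEW_NEEDED = "Review Needed"


@dataclass
class KnowledgeRecord:
    record_id: str
    client_or_brain: str
    source_id: str
    source_name: str
    source_type: str
    document_title: str
    chunk_text: str
    embedding: List[float]
    topic: str
    permission_level: str
    sensitivity_level: int
    source_confidence: str
    date_captured: date
    last_reviewed: date
    retention_rule: str
    hash_or_file_id: str
    tags: List[str] = field(default_factory=list)
    stale_after: Optional[date] = None
    status: Status = Status.ACTIVE
    human_verified: bool = False

    def is_usable(self, today: date) -> bool:
        """Assumed rule: only Active records that are not past Stale After."""
        if self.status is not Status.ACTIVE:
            return False
        return self.stale_after is None or today <= self.stale_after
```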
Source Confidence Standard
RAG systems must label source confidence.
High Confidence Sources
Examples:
- official client documents
- current approved SOPs
- signed proposals
- current service pages
- current product documentation
- verified call transcripts
- approved MCR pages
- client supplied training material
Medium Confidence Sources
Examples:
- public website pages
- social posts
- public reviews
- old but still useful reports
- staff notes
- informal meeting summaries
- competitor pages
Low Confidence Sources
Examples:
- unverified AI summaries
- copied notes with no source
- outdated pages
- incomplete transcripts
- unclear third-party content
- rough workshop notes
- unapproved scraped content
Rule
Low-confidence sources should not drive high-confidence recommendations without human review.
Retrieval Safety Standard
Before using retrieved context, the system should check:
- client match
- Brain match
- source confidence
- permission level
- sensitivity level
- freshness
- relevance
- contradiction
- output risk
- human review requirement
Rule
The system should retrieve the best allowed context, not merely the closest semantic match.
RAG Answering Standard
When answering from RAG, the AI should follow this pattern:
- Understand the user question.
- Retrieve relevant approved context.
- Check whether context is sufficient.
- Answer from the retrieved context.
- State uncertainty when context is weak.
- Avoid unsupported general claims.
- Include source reference where required.
- Recommend human review when output is high risk.
- Log retrieval and output.
Rule
RAG answers must be source-led.
Document Add Process
A standard document add process should include:
- File placed into approved intake location.
- Automation detects new file.
- File text is extracted.
- Text is cleaned.
- Text is split into chunks.
- Metadata is added.
- Embeddings are created.
- Records are stored.
- Original file is moved to processed folder.
- Processing log is created.
- Errors are routed to review.
Rule
The operator should be able to see whether a file has been processed successfully.
Document Delete Process
A standard document delete process should include:
- File or source ID is marked for deletion.
- System identifies related stored chunks.
- System identifies matching source hash or file ID.
- Related vectors and records are deleted or retired.
- Original file is removed or archived where required.
- Deletion log is created.
- Retrieval tests confirm deleted content is no longer used.
Rule
Deleting a file from storage is not enough if its chunks remain in vector memory.
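The delete steps above can be sketched against an in-memory store. This is an illustration under stated assumptions: a real vector store would be deleted by metadata filter rather than through a Python dictionary, and `delete_source` and the log field names are hypothetical.

```python
from datetime import date
from typing import Dict, List


def delete_source(store: Dict[str, Dict[str, str]],
                  deletion_log: List[Dict[str, object]],
                  source_id: str, deleted_by: str, reason: str) -> int:
    """Remove every stored chunk from one source and record the deletion."""
    # Identify all stored chunks that came from the source being deleted.
    related = [
        record_id for record_id, record in store.items()
        if record["source_id"] == source_id
    ]
    for record_id in related:
        del store[record_id]
    # Keep an audit trail even though the content itself is gone.
    deletion_log.append({
        "source_id": source_id,
        "deleted_by": deleted_by,
        "reason": reason,
        "date_deleted": date.today().isoformat(),
        "records_removed": len(related),
    })
    return len(related)
```

A follow-up retrieval test would then confirm that no remaining record carries the deleted source ID.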
Chat History Standard
Some RAG systems should store chat history.
Chat History Uses
Chat history can support:
- better follow-up answers
- user context
- support review
- client intelligence
- content ideas
- product improvement
- common question analysis
- frustration detection
- sales insight
- report generation
Chat History Risks
Risks include:
- storing personal data
- storing sensitive questions
- unclear retention
- client confidentiality
- customer privacy
- using chat history without consent
Rule
Chat history should be stored only when it has a clear purpose and retention rule.
Hallucination Protection Standard
RAG reduces hallucination but does not eliminate it.
Protection Rules
Use:
- source-only answering prompts
- no context, no answer rule
- uncertainty statements
- source confidence labels
- human review
- retrieval logs
- output validation
- answer quality testing
- stale source warnings
- conflicting source warnings
Required Instruction
Important RAG systems should include a rule such as:
If the retrieved context does not contain the answer, say that the knowledge base does not currently contain enough information.
Rule
RAG systems must be designed to admit missing knowledge.
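A no context, no answer rule might look like the sketch below. The similarity threshold of 0.75 is an arbitrary placeholder, and `generate` is a hypothetical callable standing in for whatever model call a real system would use.

```python
from typing import Callable, List, Tuple

FALLBACK = ("The knowledge base does not currently contain enough "
            "information to answer this question.")


def answer_or_fallback(question: str,
                       retrieved: List[Tuple[str, float]],
                       generate: Callable[[str, List[str]], str],
                       min_score: float = 0.75) -> str:
    """Only call the model when at least one chunk clears the threshold."""
    # Keep the text of chunks whose similarity score is high enough.
    context = [text for text, score in retrieved if score >= min_score]
    if not context:
        # No usable context: admit the gap instead of guessing.
        return FALLBACK
    return generate(question, context)
```

Placing the check before the model call means a missing source produces the fallback message rather than an improvised answer.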
Client Facing RAG Standard
Client-facing RAG systems need stronger controls.
Client Facing Requirements
Required:
- approved source list
- sensitivity review
- user access control
- answer boundaries
- fallback message
- human escalation
- audit log
- retrieval log
- delete process
- privacy note
- testing
- client approval
Rule
A client-facing knowledge assistant must be safer than an internal research assistant.
Voice Agent RAG Standard
Voice agents using RAG need extra caution because answers happen live.
Voice Agent Requirements
Use:
- short retrieved context
- low latency retrieval
- verified knowledge base
- fallback to human
- disclosure rules
- call logging
- post-call analysis
- answer boundaries
- appointment or routing limits
- no unsupported claims
Rule
Voice agents must not improvise beyond approved knowledge when answering business-specific questions.
AIBS Diagnostic RAG Standard
AIBS can use RAG to support client diagnostics.
Diagnostic Uses
Use RAG to retrieve:
- client SOPs
- sales decks
- service descriptions
- customer feedback
- reviews
- call notes
- meeting notes
- workflow documents
- prior reports
- competitor findings
Diagnostic Rules
The system should:
- preserve source evidence
- separate fact from inference
- highlight missing data
- support opportunity mapping
- support first project selection
- require human review before proposal
Rule
AIBS diagnostic RAG should support recommendations, not blindly make them.
RAG Quality Scorecard
Score each RAG system out of 100.
Score Categories
Knowledge Purpose Clarity: 10
Source Quality: 10
Permission Safety: 10
Metadata Completeness: 10
Chunk Quality: 10
Retrieval Accuracy: 10
Freshness And Deletion Control: 10
Answer Grounding: 10
Observability: 10
Governance: 10
Interpretation
85–100: Strong RAG system
70–84: Good system with minor improvements
55–69: Internal use only with human review
40–54: Too weak for client use
Below 40: Do not use yet
Rule
Client-facing RAG systems require a high score and strong testing.
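The scorecard arithmetic is simple enough to sketch: ten categories worth up to 10 points each, summed and mapped to the interpretation bands above. The function names `total_score` and `interpret` are hypothetical.

```python
from typing import Dict

CATEGORIES = (
    "Knowledge Purpose Clarity",
    "Source Quality",
    "Permission Safety",
    "Metadata Completeness",
    "Chunk Quality",
    "Retrieval Accuracy",
    "Freshness And Deletion Control",
    "Answer Grounding",
    "Observability",
    "Governance",
)


def total_score(scores: Dict[str, int]) -> int:
    """Sum the ten category scores after checking each is between 0 and 10."""
    missing = [name for name in CATEGORIES if name not in scores]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    for name in CATEGORIES:
        if not 0 <= scores[name] <= 10:
            raise ValueError(f"{name} must be between 0 and 10")
    return sum(scores[name] for name in CATEGORIES)


def interpret(total: int) -> str:
    """Map a total out of 100 to the interpretation bands above."""
    if total >= 85:
        return "Strong RAG system"
    if total >= 70:
        return "Good system with minor improvements"
    if total >= 55:
        return "Internal use only with human review"
    if total >= 40:
        return "Too weak for client use"
    return "Do not use yet"
```

For instance, a system scoring 7 in every category totals 70 and falls in the "Good system with minor improvements" band.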
RAG Build Readiness Checklist
Before building a RAG system, confirm:
Purpose
- use case defined
- owner defined
- user defined
- output type defined
- success metric defined
Sources
- source list defined
- source permission confirmed
- sensitivity reviewed
- source confidence assigned
- source update pattern understood
Data
- database selected
- metadata fields defined
- client separation defined
- retention rule defined
- deletion process defined
AI
- embedding model selected
- retrieval method defined
- prompt rules defined
- no context, no answer rule added
- human review points defined
Security
- access control planned
- API keys protected
- private data boundaries defined
- user permissions defined
- audit logs planned
Testing
- retrieval tests planned
- deletion tests planned
- stale data tests planned
- hallucination tests planned
- client boundary tests planned
Rule
Do not build RAG until source and permission rules are clear.
Application To Data Brain
Data Brain owns this framework.
Data Brain should define:
- schemas
- metadata fields
- source IDs
- vector storage rules
- retention rules
- deletion rules
- source confidence
- client separation
- retrieval filters
- observability fields
Data Brain Rule
RAG is data infrastructure first and AI output second.
Application To Research Brain
Research Brain should use RAG to preserve and retrieve research evidence.
Research Brain can use RAG for:
- source libraries
- competitor knowledge
- market research
- course notes
- trend archives
- newsletter intelligence
- evidence retrieval
- insight synthesis
Research Brain Rule
Research RAG must preserve source visibility and confidence.
Application To AIBS Brain
AIBS Brain should use RAG for client memory and diagnostics.
AIBS can use RAG to:
- understand client documents
- support business audits
- create client reports
- power assistants
- support proposals
- retrieve SOPs
- support staff knowledge
- identify value leaks
AIBS Rule
Client memory must be permission-safe, source-led, and separate by client.
Application To Automation Brain
Automation Brain should build the processing and retrieval workflows only after governance is clear.
Automation Brain should manage:
- intake workflows
- processing workflows
- embeddings
- storage
- deletion
- retrieval
- logs
- error routing
- review queues
Automation Brain Rule
RAG automation must include deletion and error handling, not only upload.
Application To Compliance Brain And Risk Brain
Compliance Brain and Risk Brain should review RAG systems for:
- private data
- customer data
- sensitive documents
- AI processing permission
- retention
- deletion rights
- client separation
- data exposure
- hallucination risk
- source confidence
- client-facing output
Compliance Rule
A RAG system that stores client data is a governance responsibility, not a toy.
Application To Content Brain
Content Brain may use RAG to create source-led content.
Content Brain can retrieve:
- approved brand voice
- approved offers
- approved claims
- approved FAQs
- approved case studies
- approved research
- content source libraries
Content Brain Rule
Content generated from RAG still requires claim and source review.
Application To Sales Brain
Sales Brain may use RAG for proposals and follow-up.
Sales Brain can retrieve:
- case studies
- client notes
- sales decks
- objections
- pricing logic
- prior proposals
- diagnostic findings
- offer scope
Sales Brain Rule
RAG can draft sales material, but humans approve claims, price, and scope.
Application To HeadOffice Brain
HeadOffice should approve major memory systems.
HeadOffice should ask:
- does this memory system support MWMS strategy
- is it safe
- who owns it
- who maintains it
- what data enters it
- what data is excluded
- is it internal or client-facing
- does it need M
- does it deserve build priority
HeadOffice Rule
Not every knowledge base deserves infrastructure.
What Not To Do
Do not:
- dump all documents into memory without purpose
- mix client data
- skip permission review
- store sensitive data unnecessarily
- rely on old documents
- forget deletion workflows
- retrieve without metadata filters
- answer from low-confidence sources as if certain
- let AI invent missing context
- create client-facing RAG without testing
- use RAG as an excuse to avoid human review
- store chat history without purpose
- expose service role keys
- treat embeddings as human-readable business records
- build complex RAG before proving the use case
Rule
A messy knowledge base creates confident wrong answers.
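One of the items above, relying on old documents, can be guarded against with a simple freshness check. The sketch below assumes each source carries a `last_reviewed` date and that 180 days is the review window; both the field name and the window are illustrative assumptions, not MWMS settings.

```python
from datetime import date, timedelta
from typing import Dict, List


def stale_sources(
    sources: List[Dict],
    today: date,
    max_age_days: int = 180,  # assumed review window, for illustration only
) -> List[str]:
    """Return the IDs of sources that have no review date or an outdated one."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        s["source_id"]
        for s in sources
        # A source that was never reviewed is treated as stale.
        if s.get("last_reviewed") is None or s["last_reviewed"] < cutoff
    ]
```

A check like this could run on a schedule and feed a review queue, so that stale sources are reviewed or removed before they are retrieved.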
Deferred Update And Parking Lot Section
This page creates the following update needs for other MWMS pages.
Later Update 1: MWMS Supabase RAG And Vector Memory Framework
Add:
- broader client memory governance
- source confidence scoring
- deletion workflow requirements
- stale document handling
- client-facing RAG readiness score
- no context no answer rule
- source metadata requirements
Later Update 2: MWMS Client Intelligence And Business Memory Automation Framework
Add:
- RAG as client memory infrastructure
- document add and delete lifecycle
- source confidence labels
- vector memory retrieval
- chat history governance
- source freshness review
Later Update 3: MWMS Source Visibility And Evidence Display Standard
Add:
- retrieved source IDs
- chunk source references
- fact versus inference separation
- evidence-backed answers
- confidence labels
- missing source warning
Later Update 4: MWMS AI Observability Metadata Standard
Add:
- retrieved chunk IDs
- retrieval score
- source ID
- vector store name
- embedding model
- prompt version
- answer grounded flag
- no context flag
- human review flag
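The observability fields listed above could be captured as one structured log entry per answer. The sketch below is a hypothetical shape: the class name, field names, and types are assumptions, and the values in any real system would come from the retrieval and generation steps.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class RetrievalLogEntry:
    # Field names mirror the list above; the types are illustrative assumptions.
    retrieved_chunk_ids: List[str]
    retrieval_scores: List[float]
    source_ids: List[str]
    vector_store_name: str
    embedding_model: str
    prompt_version: str
    answer_grounded: bool
    no_context: bool
    human_review: bool


def to_log_line(entry: RetrievalLogEntry) -> str:
    """Serialize one entry as a single JSON line with stable key order."""
    return json.dumps(asdict(entry), sort_keys=True)
```

Writing one JSON line per answer keeps the log easy to search later, for example to find every answer that was produced with the no context flag set.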
Later Update 5: MWMS AI Automation Security And Risk Checklist
Add:
- vector memory privacy risk
- client memory separation
- source permission review
- deletion request handling
- service role key caution
- private file ingestion controls
- chat history retention rules
Later Update 6: MWMS AIBS Automation Audit And Opportunity Mapping Framework
Add:
- audit documents as RAG sources
- client diagnostic memory
- opportunity map retrieval
- report evidence retrieval
- business knowledge base as audit deliverable
Later Update 7: MWMS AI Voice Agent Design Testing And Governance Framework
Add:
- voice agent knowledge base rules
- live answer boundaries
- retrieved context limits
- fallback when knowledge is missing
- post-call retrieval review
Later Update 8: MWMS Prompt Architecture And Automation Output Reliability Framework
Add:
- RAG answering prompt rules
- source-only response instructions
- no context no answer instruction
- retrieved context formatting
- citation and uncertainty handling
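The prompt rules above, especially the no context no answer instruction, can be sketched as a guard that runs before the model is called. Everything named here is an assumption for illustration: the `0.75` score threshold, the fallback wording, the chunk dictionary shape, and the `generate` callable that stands in for whatever model call a system uses.

```python
from typing import Callable, Dict, List, Optional

NO_CONTEXT_MESSAGE = "I do not have approved source material to answer that."


def build_answer_prompt(
    question: str, chunks: List[Dict], min_score: float = 0.75
) -> Optional[str]:
    """Format retrieved context into a source-only prompt, or return None."""
    usable = [c for c in chunks if c["score"] >= min_score]
    if not usable:
        return None  # nothing trustworthy was retrieved
    context = "\n\n".join(f"[{c['source_id']}] {c['text']}" for c in usable)
    return (
        "Answer only from the sources below. Cite source IDs in brackets. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )


def answer(question: str, chunks: List[Dict], generate: Callable[[str], str]) -> Dict:
    """Apply the no context no answer rule before calling the model."""
    prompt = build_answer_prompt(question, chunks)
    if prompt is None:
        # The model is never called when there is no usable context.
        return {"answer": NO_CONTEXT_MESSAGE, "no_context": True}
    return {"answer": generate(prompt), "no_context": False}
```

Enforcing the rule in code, rather than only in the prompt text, means a missing context produces the fallback message even if the model would otherwise have improvised.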
Future AI Employee Ideas
These AI Employee ideas are parked candidates only.
RAG Knowledge Base Architect
Primary Brain: Data Brain / Research Brain
Status: Parked Candidate
Purpose: Designs RAG knowledge base structure, source intake, metadata, chunking, retrieval, deletion, and governance standards.
Source Intake Controller
Primary Brain: Data Brain / Compliance Brain
Status: Parked Candidate
Purpose: Reviews sources before they enter memory and assigns permission level, sensitivity level, topic, confidence, and retention rules.
Vector Memory Curator
Primary Brain: Data Brain
Status: Parked Candidate
Purpose: Maintains vector memory, removes stale records, checks duplicates, validates metadata, and protects source quality.
Retrieval Quality Tester
Primary Brain: Research Brain / Data Brain
Status: Parked Candidate
Purpose: Tests whether RAG systems retrieve correct, current, allowed, and relevant context.
Knowledge Deletion Steward
Primary Brain: Data Brain / Risk Brain
Status: Parked Candidate
Purpose: Ensures deleted or replaced sources are removed from files, records, embeddings, and retrieval paths.
Source Confidence Analyst
Primary Brain: Research Brain
Status: Parked Candidate
Purpose: Scores source reliability and decides whether information can support answers, reports, proposals, or client-facing recommendations.
RAG Answer Quality Reviewer
Primary Brain: Prompting Framework / Research Brain
Status: Parked Candidate
Purpose: Reviews RAG outputs for source grounding, uncertainty handling, hallucination risk, and answer usefulness.
Client Memory Privacy Reviewer
Primary Brain: Compliance Brain / Risk Brain
Status: Parked Candidate
Purpose: Reviews client memory systems for privacy, data separation, sensitive content, retention, and deletion obligations.
Drift Protection
This framework protects MWMS from:
- ungoverned memory systems
- stale knowledge
- source confusion
- mixed client data
- unsupported AI answers
- poor retrieval
- missing metadata
- forgotten deletion
- overconfidence
- private data exposure
- chat history misuse
- source-free reports
- AI voice agents making things up
- client assistants answering beyond approved knowledge
- dumping documents into vector memory without purpose
- treating RAG as magic
Drift Signals
Watch for:
- “Just upload everything.”
- “The AI will find what it needs.”
- “We do not need metadata.”
- “We can delete the file manually later.”
- “The old document is probably still fine.”
- “No need to separate clients.”
- “The chatbot sounds accurate.”
- “We do not need source references.”
- “Let it answer from general knowledge.”
- “We can add deletion later.”
- “The client will not care where the answer came from.”
- “The voice agent can improvise.”
- “Let’s store all chat history just in case.”
Rule
When these drift signals appear, return to source permission, metadata, deletion, and answer grounding.
Strategic Summary
The AI Native Entrepreneur Architecture And Tool Decision Block reinforced a core MWMS lesson:
Useful AI memory is not created by simply uploading documents. It is created by controlled source intake, clean processing, meaningful metadata, reliable retrieval, deletion control, and governed answer generation.
RAG is essential for MWMS because future AI Employees and client systems must work from real business context.
But RAG is also risky when poorly governed.
This framework establishes the discipline needed for MWMS to use RAG safely across internal Brains, AIBS client systems, support agents, voice agents, reports, and productized AIOS modules.
The strategic upgrade is:
MWMS memory must be source-led, permission-safe, current, retrievable, deletable, and observable.
Final Standard
The MWMS final standard is:
No MWMS RAG, knowledge base, client memory, voice agent memory, support assistant, or business intelligence memory system should be used until its purpose, source list, permission level, sensitivity level, metadata, chunking, storage, retrieval rules, answer boundaries, deletion process, freshness review, observability, and human review requirements are defined.
A valid MWMS RAG system must define:
- purpose
- owner
- user
- source list
- permission rules
- sensitivity levels
- source confidence
- chunking method
- embedding model
- vector store
- metadata fields
- storage location
- retrieval filters
- answer rules
- no context no answer rule
- deletion process
- stale source review
- chat history rules
- observability fields
- human review points
- governance owner
That is the MWMS RAG Knowledge Base And Client Memory Infrastructure standard.
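The definition list above lends itself to a simple readiness check. The sketch below assumes a system specification is held as a dictionary whose keys are snake_case versions of the listed items; the key names and the `missing_definitions` function are illustrative, not an existing MWMS tool.

```python
from typing import Dict, List

# Keys mirror the definition list above, in the same order.
REQUIRED_DEFINITIONS = [
    "purpose", "owner", "user", "source_list", "permission_rules",
    "sensitivity_levels", "source_confidence", "chunking_method",
    "embedding_model", "vector_store", "metadata_fields", "storage_location",
    "retrieval_filters", "answer_rules", "no_context_no_answer_rule",
    "deletion_process", "stale_source_review", "chat_history_rules",
    "observability_fields", "human_review_points", "governance_owner",
]


def missing_definitions(spec: Dict) -> List[str]:
    """Return every required item that is absent or left empty in the spec."""
    return [key for key in REQUIRED_DEFINITIONS if not spec.get(key)]


def is_ready(spec: Dict) -> bool:
    """A system counts as ready only when nothing is missing."""
    return not missing_definitions(spec)
```

A check like this only confirms that each item has been written down. Whether the definitions are adequate still needs human review.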
Change Log
Version: v1.0
Date: 2026-06-08
Author: HeadOffice
Change:
Created the MWMS RAG Knowledge Base And Client Memory Infrastructure Framework from the AI Automations by Jack AI Native Entrepreneur Architecture And Tool Decision Block.
Captured the strongest lessons from practical and strategic workshop material involving:
- Build Your First RAG AI Agent from Scratch
- retrieval augmented generation
- document intake through Google Drive
- “to add”, “added”, and “delete” folder workflow
- embeddings
- vector storage
- Supabase based knowledge storage
- Pinecone style vector memory
- document hash and source ID matching
- metadata based deletion
- source freshness
- chat history
- company specific knowledge retrieval
- no hallucination positioning
- client knowledge base systems
- business memory for AIBS
- voice agent and support assistant knowledge bases
Defined the MWMS RAG Knowledge Base And Client Memory Model with twelve layers:
- Knowledge Purpose Layer
- Source Intake Layer
- Permission And Sensitivity Layer
- Chunking And Processing Layer
- Embedding And Vector Layer
- Metadata And Source Layer
- Storage And Database Layer
- Retrieval And Ranking Layer
- Answer Generation Layer
- Deletion And Freshness Layer
- Observability And Testing Layer
- Governance And Improvement Layer
Added key operating sections:
- Knowledge Base Types
- RAG Intake Checklist
- Knowledge Record Standard
- Source Confidence Standard
- Retrieval Safety Standard
- RAG Answering Standard
- Document Add Process
- Document Delete Process
- Chat History Standard
- Hallucination Protection Standard
- Client Facing RAG Standard
- Voice Agent RAG Standard
- AIBS Diagnostic RAG Standard
- RAG Quality Scorecard
- RAG Build Readiness Checklist
- Deferred Update And Parking Lot Section
- Future AI Employee Ideas
Mapped the framework across:
- Data Brain
- Research Brain
- AIBS Brain
- Automation Brain
- HeadOffice Brain
- Compliance Brain
- Risk Brain
- Sales Brain
- Content Brain
- Product Brain
- UX Brain
- Prompting Framework
Purpose of creation:
To establish a formal MWMS standard for building source based RAG, knowledge base, vector memory, and client memory systems that allow AI Employees and client systems to retrieve approved knowledge, preserve source context, avoid stale information, reduce hallucination, protect client data, and support better diagnostics, reports, support, sales, content, and decision-making.
END — MWMS RAG KNOWLEDGE BASE AND CLIENT MEMORY INFRASTRUCTURE FRAMEWORK v1.0