MWMS RAG Knowledge Base And Client Memory Infrastructure Framework

System: MWMS
Document Type: Operating Framework
Authority Level: MCR Source Of Truth
Status: Draft For MCR
Version: v1.0
Primary Location: MCR
Future Operational Destination: Data Brain, Research Brain, AIBS Brain, Client Intelligence Systems, Automation Brain, HeadOffice Brain, Compliance Brain, Risk Brain
Parent Page: Data Brain
Owner: Martyn
Developer Boundary: Do Not Touch M’s Active Build Areas Unless Specifically Assigned
Source Of Truth: MCR
Last Reviewed: 2026-06-08
Source / Origin: AI Automations by Jack AI Native Entrepreneur Architecture And Tool Decision Block
MWMS Classification: RAG Framework / Knowledge Base Infrastructure Framework / Client Memory Framework / Vector Memory Standard / Source Based AI Retrieval Standard
Primary Brain: Data Brain
Supporting Brains: Research Brain, AIBS Brain, Automation Brain, HeadOffice Brain, Compliance Brain, Risk Brain, Sales Brain, Content Brain, Product Brain, UX Brain, Prompting Framework

Related Pages: MWMS Supabase RAG And Vector Memory Framework, MWMS Client Intelligence And Business Memory Automation Framework, MWMS AIBS Automation Audit And Opportunity Mapping Framework, MWMS Automation Architecture And Tool Selection Framework, MWMS Source Visibility And Evidence Display Standard, MWMS AI Observability Metadata Standard, MWMS AI Automation Security And Risk Checklist, MWMS Prompt Architecture And Automation Output Reliability Framework, MWMS Client Intelligence Report Automation Framework


Purpose

This framework defines how MWMS creates reliable, source based knowledge systems so that AI Employees, client assistants, diagnostic systems, report generators, support agents, voice agents, and business intelligence tools answer from approved knowledge rather than generic model memory.

This framework exists because many MWMS systems will need to work from:

  • uploaded documents
  • client files
  • website content
  • SOPs
  • sales decks
  • case studies
  • training materials
  • FAQs
  • call transcripts
  • emails
  • meeting notes
  • Google Drive files
  • internal MCR pages
  • business knowledge bases
  • client intelligence records
  • research documents
  • course transcripts
  • support histories
  • customer feedback
  • product documentation

The core purpose is:

To help MWMS build RAG and knowledge base systems that retrieve the right information, preserve source context, avoid stale knowledge, reduce hallucination, protect client data, and create useful business memory.


Core Doctrine

The MWMS doctrine is:

AI should answer from approved knowledge when the task depends on business facts.

Generic AI is useful for general thinking, but client systems, internal Brains, business reports, support agents, and AIBS diagnostics need grounded memory.

A good RAG system allows AI to:

  • search approved sources
  • retrieve relevant context
  • answer from business-specific material
  • avoid pretending to know what is missing
  • cite or reference source material where needed
  • update when documents change
  • remove outdated knowledge
  • keep client memory separate
  • preserve confidence and source metadata
  • support better diagnosis and decision-making

The key doctrine is:

Memory without source control becomes misinformation.


Strategic Importance

This framework is strategically important because RAG is one of the infrastructure foundations for the MWMS ecosystem.

It supports:

  • AIBS client memory
  • AI Business Systems Brain
  • Research Brain
  • Data Brain
  • HeadOffice Brain
  • Content Brain
  • Sales Brain
  • Client Intelligence reports
  • knowledge based chatbots
  • WhatsApp assistants
  • AI voice agents
  • internal support assistants
  • proposal assistants
  • business diagnostic tools
  • client onboarding systems
  • prompt vault retrieval
  • MCR page retrieval
  • course absorption intelligence
  • future AI Employee memory

The AI Native Entrepreneur RAG material showed a practical pattern: documents are placed into a Google Drive folder, processed into database records, embedded into a vector-searchable structure, and moved into a processed folder. When a document is later placed into a delete folder, its records are deleted from memory.

That matters because a knowledge base is not only about adding information.

It must also support:

  • processing
  • storage
  • retrieval
  • source metadata
  • deletion
  • freshness
  • confidence
  • governance
  • human review

The strategic upgrade is:

MWMS should not build AI assistants that merely sound smart. MWMS should build AI assistants that know where their answers came from.


Definition

RAG means retrieval-augmented generation.

In MWMS terms, RAG means:

The AI retrieves relevant approved knowledge before it answers, drafts, recommends, or acts.

Knowledge base means an organized collection of approved information that AI systems are allowed to search.

Embedding means converting text into numerical form so similar ideas can be retrieved by meaning rather than only exact keywords.

Vector memory means stored embedded knowledge that can be searched semantically.

Client memory means approved business-specific knowledge about a client that can be retrieved for diagnostics, reports, support, sales, and automation.

Source metadata means information attached to each stored record so MWMS knows where it came from, when it was captured, who approved it, what client it belongs to, and how reliable it is.

MWMS Definition

The MWMS RAG Knowledge Base And Client Memory Infrastructure Framework is:

Data Brain’s standard for creating, maintaining, retrieving, deleting, and governing source based AI knowledge systems so MWMS Brains and AI Employees can answer from approved business memory instead of unsupported general model assumptions.


Scope

This framework applies to:

  • RAG systems
  • vector memory systems
  • Supabase vector systems
  • Pinecone style memory systems
  • client knowledge bases
  • internal MWMS knowledge bases
  • document upload pipelines
  • Google Drive based knowledge intake
  • MCR page retrieval
  • business memory automation
  • client support assistants
  • voice agent knowledge bases
  • WhatsApp assistant memory
  • website chatbot memory
  • proposal generation memory
  • report generation memory
  • content generation from approved sources
  • AI employee memory
  • prompt vault retrieval
  • client intelligence systems
  • AIBS diagnostic systems

This framework does not provide development instructions.

It defines the operating rules for safe RAG infrastructure and knowledge governance.


Core Principle

The core principle is:

Retrieval is only useful when the source is trusted, current, relevant, and allowed.

A RAG system is not automatically reliable.

It can still fail if:

  • poor documents are uploaded
  • old documents are not removed
  • sources are not tagged
  • client data is mixed
  • private information is exposed
  • retrieval pulls the wrong chunk
  • AI overstates retrieved context
  • metadata is missing
  • deletion is not handled
  • human review is skipped
  • source confidence is unknown

Rule

RAG must be treated as governed infrastructure, not a magic memory folder.


The MWMS RAG Knowledge Base And Client Memory Model

Every MWMS RAG system should be designed across twelve layers:

  1. Knowledge Purpose Layer
  2. Source Intake Layer
  3. Permission And Sensitivity Layer
  4. Chunking And Processing Layer
  5. Embedding And Vector Layer
  6. Metadata And Source Layer
  7. Storage And Database Layer
  8. Retrieval And Ranking Layer
  9. Answer Generation Layer
  10. Deletion And Freshness Layer
  11. Observability And Testing Layer
  12. Governance And Improvement Layer

1. Knowledge Purpose Layer

Every RAG system must start with a purpose.

Do not build a knowledge base just because documents exist.

Purpose Questions

Ask:

  • what will this knowledge base answer
  • who will use it
  • which Brain owns it
  • is it internal or client-facing
  • is it for support
  • is it for diagnostics
  • is it for reports
  • is it for content
  • is it for sales
  • is it for voice agents
  • is it for employee training
  • is it for client intelligence
  • what decision or workflow does it improve

Valid Purpose Examples

Use RAG to:

  • answer customer questions
  • answer team questions
  • support AI voice agents
  • support WhatsApp assistants
  • create client reports
  • generate business diagnostics
  • draft proposals
  • retrieve SOPs
  • search training materials
  • support course absorption
  • support research synthesis
  • support content creation from approved sources
  • support internal Brain memory

Weak Purpose Examples

Avoid:

  • “store everything”
  • “make AI smarter”
  • “dump all client files in”
  • “build a second brain without structure”
  • “let the AI figure it out”
  • “we might need it later”

Rule

No RAG system should be created without a defined knowledge purpose.


2. Source Intake Layer

The quality of the RAG system depends on source quality.

Source Types

Sources may include:

  • PDFs
  • Google Docs
  • Word documents
  • spreadsheets
  • website pages
  • service pages
  • sales decks
  • case studies
  • SOPs
  • training documents
  • FAQs
  • product documentation
  • onboarding files
  • call transcripts
  • meeting notes
  • email exports
  • chat transcripts
  • support tickets
  • review data
  • social content
  • newsletter content
  • MCR pages
  • course transcripts
  • client reports
  • competitor pages
  • public articles
  • approved internal notes

Source Intake Questions

Ask:

  • what is the source
  • who owns it
  • who approved it
  • is it current
  • is it complete
  • is it accurate
  • is it sensitive
  • does it belong to a client
  • does it need redaction
  • does it need summarizing first
  • is it worth storing
  • should it be temporary or long term
  • does it need deletion later

Source Folder Pattern

A simple source intake system may use:

  • To Add folder
  • Added or Processed folder
  • Delete folder
  • Error folder
  • Review Needed folder

The course example used a Google Drive style intake: documents placed into a To Add folder were processed into the database and then moved into an Added folder, so the operator could see at a glance that the files had been processed.

Rule

Every source must enter through a controlled intake path.


3. Permission And Sensitivity Layer

RAG systems can expose sensitive information if poorly governed.

Permission Types

Sources may be:

  • public
  • internal MWMS approved
  • client approved
  • client confidential
  • customer sensitive
  • employee sensitive
  • regulated
  • temporary processing only
  • excluded from AI use

Sensitivity Levels

Use:

Level 1 Public

Public website, public article, public marketing content.

Level 2 Internal

MWMS internal frameworks, non-sensitive SOPs, course notes.

Level 3 Client Confidential

Client sales decks, internal SOPs, private business documents.

Level 4 Customer Sensitive

Customer records, support conversations, emails, personal data.

Level 5 Restricted

Health, legal, financial, employee, payment, or highly sensitive records.

Permission Questions

Ask:

  • are we allowed to process this source
  • are we allowed to store this source
  • are we allowed to use it in AI output
  • are we allowed to expose answers to staff
  • are we allowed to expose answers to customers
  • does the client know this is being used
  • is AI processing approved
  • should the source be redacted
  • should it be excluded from retrieval
  • who can access the memory

Rule

Private data must not enter RAG without permission and sensitivity labeling.
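
The permission and sensitivity rule above could be enforced as a small gate at intake. The following is a minimal sketch for illustration only, not a build instruction. The names `Sensitivity`, `IntakeRequest`, and `may_enter_rag` are hypothetical, and the policy that every level above Public needs both AI processing approval and storage approval is an assumption, not something this framework specifies.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional, Tuple


class Sensitivity(IntEnum):
    """The five sensitivity levels named in this framework."""
    PUBLIC = 1
    INTERNAL = 2
    CLIENT_CONFIDENTIAL = 3
    CUSTOMER_SENSITIVE = 4
    RESTRICTED = 5


@dataclass
class IntakeRequest:
    """Hypothetical intake record for one source."""
    source_name: str
    sensitivity: Optional[Sensitivity]  # None means "not yet labeled"
    ai_processing_approved: bool
    storage_approved: bool


def may_enter_rag(req: IntakeRequest) -> Tuple[bool, str]:
    """Return (allowed, reason) for a source waiting at intake.

    Assumption: unlabeled sources are always rejected, and any source
    above PUBLIC needs both approvals before it can be stored.
    """
    if req.sensitivity is None:
        return False, "missing sensitivity label"
    needs_approval = req.sensitivity > Sensitivity.PUBLIC
    if needs_approval and not (req.ai_processing_approved and req.storage_approved):
        return False, "private source lacks processing or storage approval"
    return True, "ok"
```

A rejected source would typically be routed to the Review Needed folder rather than silently dropped.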


4. Chunking And Processing Layer

Documents must be processed into useful pieces.

Poor chunking creates poor retrieval.

Processing Steps

Processing may include:

  • file detection
  • text extraction
  • OCR only when necessary
  • cleaning
  • splitting into chunks
  • preserving headings
  • preserving document title
  • preserving page or section reference
  • tagging
  • summarizing
  • metadata creation
  • embedding
  • storage
  • processed file movement

Chunking Questions

Ask:

  • how large should each chunk be
  • should chunks overlap
  • are headings preserved
  • is page reference preserved
  • are tables handled correctly
  • are images or diagrams important
  • is the text clean
  • are repeated headers removed
  • are irrelevant sections excluded
  • should the document be summarized first

Chunk Quality Rules

Good chunks should:

  • contain one coherent idea
  • preserve enough context
  • include source identity
  • avoid being too long
  • avoid being too short
  • retain useful headings
  • allow retrieval by meaning
  • support accurate answering

Rule

Chunking should preserve meaning, not just split text mechanically.
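
One way to read the rule above is that chunks should follow the document's own structure. The sketch below keeps paragraphs whole and attaches the document title and nearest heading to every chunk. It is an illustration under stated assumptions: `Chunk` and `chunk_document` are hypothetical names, headings are assumed to be blank-line-separated lines starting with `#`, and the 800 character limit is an arbitrary placeholder.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    document_title: str
    heading: str
    text: str


def chunk_document(title: str, text: str, max_chars: int = 800) -> List[Chunk]:
    """Split on blank lines, keep paragraphs whole, carry the nearest heading.

    Assumption: a heading is a block that starts with '#'. Real documents
    may need a different heading detector.
    """
    chunks: List[Chunk] = []
    heading = ""
    buffer: List[str] = []

    def flush() -> None:
        # Close the current chunk, tagging it with title and heading.
        if buffer:
            chunks.append(Chunk(title, heading, "\n\n".join(buffer)))
            buffer.clear()

    for block in text.split("\n\n"):
        block = block.strip()
        if not block:
            continue
        if block.startswith("#"):
            flush()  # a new section starts, so the previous chunk ends
            heading = block.lstrip("#").strip()
            continue
        current_len = sum(len(p) for p in buffer)
        if buffer and current_len + len(block) > max_chars:
            flush()  # adding this paragraph would exceed the size limit
        buffer.append(block)
    flush()
    return chunks
```

Because a chunk never crosses a heading, each one stays tied to a single section, which supports the "one coherent idea" and "retain useful headings" rules.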


5. Embedding And Vector Layer

Embedding converts text into searchable meaning.

The RAG material explained embeddings as numerical representations that allow the system to find the right drawer of information when a user asks a question.

Embedding Questions

Ask:

  • which embedding model is used
  • what dimension is required
  • what language support is needed
  • what cost applies
  • what database stores the vector
  • can vectors be deleted
  • can vectors be updated
  • is metadata stored with the vector
  • is the embedding model suitable for the content

Vector Store Options

Possible vector stores include:

  • Supabase vector
  • Pinecone
  • other vector databases
  • managed RAG platforms
  • internal MWMS future vector infrastructure

Rule

Embeddings are not the answer. They are the retrieval map.
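
To make the "retrieval map" idea concrete, the sketch below compares a query vector against stored vectors using cosine similarity, a common similarity measure for embeddings. The three-dimensional vectors and the names `toy_store` and `nearest` are invented for illustration. Real embedding models produce vectors with hundreds or thousands of dimensions, and this framework does not prescribe a specific model or measure.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def nearest(query: Sequence[float], store: Dict[str, Sequence[float]]) -> str:
    """Return the key of the stored vector most similar to the query."""
    return max(store, key=lambda key: cosine_similarity(query, store[key]))


# Toy vectors, invented for illustration only.
toy_store = {
    "refund policy": [0.9, 0.1, 0.0],
    "opening hours": [0.0, 0.2, 0.9],
}
```

The result of `nearest` only points to a stored chunk. The answer still has to come from the chunk text and its source metadata.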


6. Metadata And Source Layer

Metadata is what makes memory governable.

Without metadata, RAG becomes a black box.

Required Metadata Fields

Each knowledge record should include:

Client Or Brain:
Source Name:
Source Type:
Source URL Or Location:
Document Title:
Section Or Page:
Date Captured:
Last Reviewed:
Permission Level:
Sensitivity Level:
Topic:
Tags:
Owner:
Source Confidence:
Retention Rule:
Version:
Hash Or Source ID:
Processed Status:
Deletion Status:

Useful Metadata Fields

Optional fields:

  • department
  • workflow area
  • customer journey stage
  • offer
  • service
  • audience
  • document category
  • language
  • country
  • compliance flag
  • human verified
  • stale after date
  • replaced by source
  • source priority

Metadata Use Cases

Metadata allows MWMS to:

  • retrieve only one client’s information
  • filter by topic
  • exclude sensitive sources
  • remove outdated documents
  • delete all chunks from one file
  • preserve source confidence
  • route outputs to the correct Brain
  • separate facts from assumptions
  • track stale knowledge

Rule

Every chunk must know where it came from.


7. Storage And Database Layer

The RAG system needs clear storage decisions.

Storage Types

Use:

  • source file storage
  • extracted text storage
  • chunk storage
  • vector storage
  • metadata storage
  • chat history storage
  • query log storage
  • output log storage
  • deletion log storage
  • review status storage

Database Options

Options include:

  • Supabase
  • Pinecone
  • WordPress database
  • Google Drive
  • Airtable
  • Google Sheets
  • custom database
  • future MWMS memory infrastructure

Storage Questions

Ask:

  • where are original files stored
  • where are chunks stored
  • where are embeddings stored
  • where is metadata stored
  • where is chat history stored
  • where are outputs logged
  • where are delete requests logged
  • is storage client separated
  • is access controlled
  • can data be exported
  • can data be deleted

Rule

The original source, extracted knowledge, embedding, and metadata must not become disconnected.


8. Retrieval And Ranking Layer

Retrieval is where RAG succeeds or fails.

Retrieval Questions

Ask:

  • what question is being asked
  • what source set should be searched
  • what client or Brain should be searched
  • what sources should be excluded
  • how many chunks should be retrieved
  • should metadata filters be applied
  • should recent sources rank higher
  • should verified sources rank higher
  • should low-confidence sources be excluded
  • should the answer include source references
  • should the AI say when nothing relevant is found

Retrieval Filters

Use filters such as:

  • client
  • Brain
  • source type
  • topic
  • date
  • permission level
  • sensitivity level
  • source confidence
  • document status
  • verified status
  • retention status

Retrieval Failure Examples

Failures include:

  • wrong client retrieved
  • old document retrieved
  • irrelevant chunk retrieved
  • sensitive source retrieved
  • too many chunks retrieved
  • too few chunks retrieved
  • AI answers from general knowledge instead of source
  • AI invents missing details

Rule

Retrieval must be filtered by context, not only similarity.
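
The rule above can be sketched as "filter first, then rank". In this illustration, candidates are restricted by client, sensitivity level, and status before similarity is considered at all. `Candidate` and `select_context` are hypothetical names, and the choice of these three filters is an assumption drawn from the filter list above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    chunk_id: str
    client: str
    sensitivity_level: int  # 1 (public) to 5 (restricted)
    status: str             # for example "Active", "Stale", "Deleted"
    similarity: float       # score returned by the vector search


def select_context(
    candidates: List[Candidate],
    client: str,
    max_sensitivity: int,
    top_k: int = 3,
) -> List[Candidate]:
    """Keep only allowed chunks, then return the top_k most similar ones."""
    allowed = [
        c for c in candidates
        if c.client == client
        and c.sensitivity_level <= max_sensitivity
        and c.status == "Active"
    ]
    return sorted(allowed, key=lambda c: c.similarity, reverse=True)[:top_k]
```

A chunk from another client is never returned, however high its similarity score. In practice the same filters would usually be applied inside the vector store query itself, so that disallowed chunks never leave the database.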


9. Answer Generation Layer

The AI must answer from retrieved knowledge responsibly.

Answering Rules

The AI should:

  • answer from retrieved context
  • identify when context is missing
  • avoid unsupported claims
  • avoid overconfidence
  • preserve source distinction
  • include citations or source references where required
  • separate fact from interpretation
  • ask for more information when needed
  • recommend human review for high-risk outputs
  • avoid exposing sensitive information unnecessarily

Answer Types

RAG outputs may include:

  • direct answer
  • summary
  • report section
  • proposal draft
  • support reply
  • customer reply
  • internal note
  • content draft
  • sales call prep
  • diagnostic finding
  • opportunity recommendation

Rule

If retrieved context does not support the answer, the AI must not pretend that it does.


10. Deletion And Freshness Layer

RAG systems must remove or replace old information.

The RAG material showed that deleting the original file from a folder is not enough: the corresponding stored database records must also be deleted.

Deletion Reasons

Delete or retire knowledge when:

  • document is outdated
  • client revokes permission
  • source is wrong
  • source is replaced
  • customer data should not be stored
  • retention period ends
  • privacy request is received
  • duplicate data exists
  • test data is no longer needed
  • project is closed

Freshness Questions

Ask:

  • when was this source captured
  • is it still current
  • has the business changed
  • has pricing changed
  • has the offer changed
  • has the SOP changed
  • has the website changed
  • has the source been superseded
  • should this be reprocessed
  • should old chunks be removed

Deletion Standard

Deletion should remove:

  • original file if required
  • extracted text
  • chunk records
  • embeddings
  • metadata records
  • related source references where required
  • access to the deleted source

Deletion should preserve:

  • deletion log
  • who deleted it
  • why it was deleted
  • date deleted
  • source ID or hash
  • audit note where needed

Rule

A RAG system without deletion control becomes dangerous over time.
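
Freshness can be checked mechanically when each record carries a stale-after date. The sketch below is a minimal illustration. `freshness_status` is a hypothetical helper, and sending records with no stale-after date to review is an assumption rather than a rule stated in this framework.

```python
from datetime import date
from typing import Optional


def freshness_status(stale_after: Optional[date], today: date) -> str:
    """Classify a record by its stale-after date.

    Assumption: a record with no stale-after date cannot be trusted as
    current, so it is sent to review rather than treated as active.
    """
    if stale_after is None:
        return "Review Needed"
    return "Stale" if today > stale_after else "Active"
```

A scheduled job could run this over every record and exclude anything not marked Active from retrieval until a human has reviewed it.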


11. Observability And Testing Layer

RAG systems must be tested.

Test Questions

Ask:

  • does the system retrieve the right source
  • does it ignore wrong sources
  • does it answer only from context
  • does it say when it does not know
  • does it handle deleted sources
  • does it handle updated sources
  • does it separate clients
  • does it preserve sensitive boundaries
  • does it log retrieved chunks
  • does it handle poor questions
  • does it handle contradictory sources
  • does it hallucinate

Observability Fields

Track:

User Query:
Retrieved Source IDs:
Retrieved Chunks:
Confidence:
Answer Generated:
Model Used:
Prompt Version:
Human Review Needed:
Accepted Or Corrected:
Error:
Date:

Rule

If MWMS cannot see what the RAG system retrieved, MWMS cannot trust the output.
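
The observability fields above map naturally onto one structured log entry per query. The sketch below writes each entry as a JSON string. `build_retrieval_log` and the snake_case key names are hypothetical, and where the entries are stored is left open.

```python
import json
from datetime import datetime, timezone
from typing import List, Optional


def build_retrieval_log(
    user_query: str,
    retrieved_source_ids: List[str],
    retrieved_chunks: List[str],
    confidence: str,
    answer_generated: str,
    model_used: str,
    prompt_version: str,
    human_review_needed: bool,
    accepted_or_corrected: Optional[str] = None,
    error: Optional[str] = None,
) -> str:
    """Return one JSON log line covering the observability fields."""
    record = {
        "user_query": user_query,
        "retrieved_source_ids": retrieved_source_ids,
        "retrieved_chunks": retrieved_chunks,
        "confidence": confidence,
        "answer_generated": answer_generated,
        "model_used": model_used,
        "prompt_version": prompt_version,
        "human_review_needed": human_review_needed,
        "accepted_or_corrected": accepted_or_corrected,  # filled in after review
        "error": error,
        "date": datetime.now(timezone.utc).isoformat(),
    }
    return json.dumps(record, ensure_ascii=False)
```

Logging retrieved chunk text may itself store sensitive data, so the log should follow the same permission and retention rules as the knowledge base.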


12. Governance And Improvement Layer

RAG systems need ongoing governance.

Governance Questions

Ask:

  • who owns the knowledge base
  • who can add sources
  • who can delete sources
  • who approves private sources
  • who reviews stale information
  • who checks hallucination risk
  • who maintains metadata
  • who monitors failures
  • who handles client deletion requests
  • who decides when memory becomes production grade

Improvement Inputs

Improve the system using:

  • failed queries
  • bad retrieval examples
  • human corrections
  • user feedback
  • stale source reviews
  • new documents
  • updated prompts
  • better metadata
  • better chunking
  • better filters
  • better source confidence scoring

Rule

A knowledge base must be maintained or it becomes a liability.


Knowledge Base Types

MWMS may create several types of RAG knowledge bases.

Type 1: Internal MWMS Knowledge Base

Purpose:

  • retrieve MCR standards
  • support course absorption
  • support AI Employees
  • support HeadOffice decisions
  • support internal strategy

Primary users:

  • Martyn
  • M
  • AI Employees
  • future MWMS operators

Risk level:

  • medium, depending on source content

Type 2: AIBS Client Knowledge Base

Purpose:

  • support client diagnostics
  • answer business questions
  • generate reports
  • map opportunities
  • support proposals

Primary users:

  • AIBS Brain
  • Client Intelligence systems
  • Sales Brain
  • client facing reports

Risk level:

  • medium to high

Type 3: Customer Support Knowledge Base

Purpose:

  • answer customer questions
  • support chatbots
  • support WhatsApp assistants
  • support voice agents
  • reduce support workload

Primary users:

  • customer service AI
  • support staff
  • customers

Risk level:

  • high if customer facing

Type 4: Sales And Proposal Knowledge Base

Purpose:

  • retrieve case studies
  • retrieve offer details
  • retrieve pricing logic
  • retrieve objections
  • support proposal drafting

Primary users:

  • Sales Brain
  • AIBS Brain
  • proposal assistants

Risk level:

  • medium

Type 5: Content Knowledge Base

Purpose:

  • retrieve approved content
  • preserve brand voice
  • repurpose source material
  • create educational assets
  • support AI visibility

Primary users:

  • Content Brain
  • Social Media Brain
  • Affiliate Brain
  • AIBS Brain

Risk level:

  • medium

Type 6: Voice Agent Knowledge Base

Purpose:

  • give voice agents business knowledge
  • answer caller questions
  • route calls
  • qualify leads
  • support appointment booking

Primary users:

  • AI voice agents
  • Sales Brain
  • AIBS Brain
  • local business systems

Risk level:

  • high because output is live conversation

RAG Intake Checklist

Before adding a knowledge source, confirm:

Source

  • source name
  • source type
  • source owner
  • source location
  • source date
  • source purpose
  • source quality
  • source status

Permission

  • public or private
  • client approval
  • AI processing approval
  • storage approval
  • allowed users
  • excluded users
  • retention rule

Sensitivity

  • personal data
  • customer data
  • staff data
  • confidential business data
  • regulated data
  • public data
  • internal data

Processing

  • extract method
  • chunking rule
  • metadata fields
  • topic tags
  • source confidence
  • stale after date
  • deletion rule

Rule

No source should enter RAG without metadata and permission review.


Knowledge Record Standard

Each stored knowledge record should include:

Record ID:
Client Or Brain:
Source ID:
Source Name:
Source Type:
Document Title:
Chunk Text:
Embedding:
Topic:
Tags:
Permission Level:
Sensitivity Level:
Source Confidence:
Date Captured:
Last Reviewed:
Stale After:
Retention Rule:
Hash Or File ID:
Status: Active / Stale / Replaced / Deleted / Review Needed
Human Verified: Yes / No

Rule

A knowledge record must be traceable, filterable, and deletable.
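
The record standard above could be expressed as a typed structure so that missing traceability fields are caught before storage. This is a sketch only. `KnowledgeRecord` and `problems` are hypothetical names, dates are assumed to be ISO 8601 strings, and the choice of which fields count as mandatory for traceability is an assumption.

```python
from dataclasses import dataclass
from typing import List

ALLOWED_STATUSES = {"Active", "Stale", "Replaced", "Deleted", "Review Needed"}


@dataclass
class KnowledgeRecord:
    record_id: str
    client_or_brain: str
    source_id: str
    source_name: str
    source_type: str
    document_title: str
    chunk_text: str
    embedding: List[float]
    topic: str
    tags: List[str]
    permission_level: str
    sensitivity_level: int
    source_confidence: str
    date_captured: str   # assumed ISO 8601 date string
    last_reviewed: str   # assumed ISO 8601 date string
    stale_after: str     # assumed ISO 8601 date string
    retention_rule: str
    hash_or_file_id: str
    status: str = "Review Needed"
    human_verified: bool = False

    def problems(self) -> List[str]:
        """List reasons this record is not traceable, filterable, or deletable."""
        issues: List[str] = []
        if self.status not in ALLOWED_STATUSES:
            issues.append(f"unknown status: {self.status}")
        # Assumption: these four fields are the minimum needed to trace a
        # chunk back to its source and to delete every chunk of that source.
        for name in ("record_id", "client_or_brain", "source_id", "hash_or_file_id"):
            if not getattr(self, name):
                issues.append(f"missing {name}")
        return issues
```

New records default to Review Needed and unverified, so nothing becomes Active without an explicit decision.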


Source Confidence Standard

RAG systems must label source confidence.

High Confidence Sources

Examples:

  • official client documents
  • current approved SOPs
  • signed proposals
  • current service pages
  • current product documentation
  • verified call transcripts
  • approved MCR pages
  • client supplied training material

Medium Confidence Sources

Examples:

  • public website pages
  • social posts
  • public reviews
  • old but still useful reports
  • staff notes
  • informal meeting summaries
  • competitor pages

Low Confidence Sources

Examples:

  • unverified AI summaries
  • copied notes with no source
  • outdated pages
  • incomplete transcripts
  • unclear third-party content
  • rough workshop notes
  • unapproved scraped content

Rule

Low-confidence sources should not drive high-confidence recommendations without human review.


Retrieval Safety Standard

Before using retrieved context, the system should check:

  • client match
  • Brain match
  • source confidence
  • permission level
  • sensitivity level
  • freshness
  • relevance
  • contradiction
  • output risk
  • human review requirement

Rule

The system should retrieve the best allowed context, not merely the closest semantic match.


RAG Answering Standard

When answering from RAG, the AI should follow this pattern:

  1. Understand the user question.
  2. Retrieve relevant approved context.
  3. Check whether context is sufficient.
  4. Answer from the retrieved context.
  5. State uncertainty when context is weak.
  6. Avoid unsupported general claims.
  7. Include source reference where required.
  8. Recommend human review when output is high risk.
  9. Log retrieval and output.

Rule

RAG answers must be source-led.


Document Add Process

A standard document add process should include:

  1. File placed into approved intake location.
  2. Automation detects new file.
  3. File text is extracted.
  4. Text is cleaned.
  5. Text is split into chunks.
  6. Metadata is added.
  7. Embeddings are created.
  8. Records are stored.
  9. Original file is moved to processed folder.
  10. Processing log is created.
  11. Errors are routed to review.

Rule

The operator should be able to see whether a file has been processed successfully.
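
The add process above can be sketched with local folders standing in for the Google Drive folders. Everything here is illustrative: `process_intake` is a hypothetical function, only plain `.txt` files are handled, the store is an in-memory list, paragraph splitting stands in for real chunking, and the `embed` callable is a placeholder for whichever embedding model is chosen.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path
from typing import Callable, Dict, List


def process_intake(
    to_add: Path,
    processed: Path,
    error: Path,
    store: List[Dict],
    embed: Callable[[str], List[float]],
) -> List[Dict]:
    """Process every .txt file in to_add and return a processing log."""
    log: List[Dict] = []
    for path in sorted(to_add.glob("*.txt")):
        try:
            text = path.read_text(encoding="utf-8")
            # A content hash serves as the source ID (an assumption).
            source_id = hashlib.sha256(text.encode("utf-8")).hexdigest()
            chunks = [p.strip() for p in text.split("\n\n") if p.strip()]
            if not chunks:
                raise ValueError("no extractable text")
            records = [
                {"source_id": source_id, "source_name": path.name,
                 "chunk_index": i, "chunk_text": chunk, "embedding": embed(chunk)}
                for i, chunk in enumerate(chunks)
            ]
            store.extend(records)  # stored only once every chunk succeeded
            shutil.move(str(path), str(processed / path.name))
            log.append({"file": path.name, "status": "processed", "chunks": len(chunks)})
        except Exception as exc:
            # Failed files go to the error folder for human review.
            shutil.move(str(path), str(error / path.name))
            log.append({"file": path.name, "status": "error", "detail": str(exc)})
    return log


# Demo in a throwaway directory. The fake embedder is a stand-in, not a real model.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    folders = {name: root / name for name in ("to_add", "processed", "error")}
    for folder in folders.values():
        folder.mkdir()
    (folders["to_add"] / "faq.txt").write_text("Open 9 to 5.\n\nClosed Sundays.", encoding="utf-8")
    (folders["to_add"] / "empty.txt").write_text("", encoding="utf-8")
    demo_store: List[Dict] = []
    demo_log = process_intake(
        folders["to_add"], folders["processed"], folders["error"],
        demo_store, embed=lambda t: [float(len(t))],
    )
    processed_names = sorted(p.name for p in folders["processed"].iterdir())
    error_names = sorted(p.name for p in folders["error"].iterdir())
```

In the demo, the empty file ends up in the error folder and the FAQ file in the processed folder, which gives the operator the visual confirmation the rule asks for. A production version would also need to roll back stored records if the final file move fails.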


Document Delete Process

A standard document delete process should include:

  1. File or source ID is marked for deletion.
  2. System identifies related stored chunks.
  3. System identifies matching source hash or file ID.
  4. Related vectors and records are deleted or retired.
  5. Original file is removed or archived where required.
  6. Deletion log is created.
  7. Retrieval tests confirm deleted content is no longer used.

Rule

Deleting a file from storage is not enough if its chunks remain in vector memory.
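
The delete process above depends on every chunk carrying its source ID. The sketch below removes all records for one source from an in-memory store, writes a deletion log entry, and offers a check that the source can no longer be retrieved. `delete_source` and `still_retrievable` are hypothetical names, and the list-of-dicts store is a stand-in for a real vector database.

```python
from datetime import datetime, timezone
from typing import Dict, List


def delete_source(
    store: List[Dict],
    source_id: str,
    deleted_by: str,
    reason: str,
    deletion_log: List[Dict],
) -> int:
    """Remove every record for source_id and log the deletion. Returns the count removed."""
    before = len(store)
    store[:] = [r for r in store if r["source_id"] != source_id]
    removed = before - len(store)
    # The log keeps the source ID and the reason, but none of the deleted content.
    deletion_log.append({
        "source_id": source_id,
        "deleted_by": deleted_by,
        "reason": reason,
        "records_removed": removed,
        "date_deleted": datetime.now(timezone.utc).isoformat(),
    })
    return removed


def still_retrievable(store: List[Dict], source_id: str) -> bool:
    """Post-deletion test: does any record for this source remain?"""
    return any(r["source_id"] == source_id for r in store)
```

Running `still_retrievable` after each deletion is a simple form of the retrieval test in step 7.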


Chat History Standard

Some RAG systems may store chat history.

Chat History Uses

Chat history can support:

  • better follow-up answers
  • user context
  • support review
  • client intelligence
  • content ideas
  • product improvement
  • common question analysis
  • frustration detection
  • sales insight
  • report generation

Chat History Risks

Risks include:

  • storing personal data
  • storing sensitive questions
  • unclear retention
  • client confidentiality
  • customer privacy
  • using chat history without consent

Rule

Chat history should be stored only when it has a clear purpose and retention rule.


Hallucination Protection Standard

RAG reduces hallucination but does not eliminate it.

Protection Rules

Use:

  • source-only answering prompts
  • no context, no answer rule
  • uncertainty statements
  • source confidence labels
  • human review
  • retrieval logs
  • output validation
  • answer quality testing
  • stale source warnings
  • conflicting source warnings

Required Instruction

Important RAG systems should include a rule such as:

If the retrieved context does not contain the answer, say that the knowledge base does not currently contain enough information.

Rule

RAG systems must be designed to admit missing knowledge.
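
The "no context, no answer" rule can be enforced in code before the model is ever called. The sketch below is one possible shape, not a prescribed prompt. `answer_from_context` is a hypothetical function, `generate` stands in for whichever model call is used, and the prompt wording is an assumption based on the required instruction above.

```python
from typing import Callable, List

FALLBACK = "The knowledge base does not currently contain enough information."


def answer_from_context(
    question: str,
    chunks: List[str],
    generate: Callable[[str], str],
) -> str:
    """Answer only when retrieved context exists; otherwise return the fallback."""
    usable = [c.strip() for c in chunks if c.strip()]
    if not usable:
        return FALLBACK  # no context, no answer: the model is never called
    prompt = (
        "Answer only from the context below. "
        f"If the context does not contain the answer, reply exactly: {FALLBACK}\n\n"
        "Context:\n" + "\n---\n".join(usable) + f"\n\nQuestion: {question}"
    )
    return generate(prompt)
```

The code-level check covers the empty-retrieval case with certainty. The prompt instruction covers the harder case where chunks were retrieved but do not actually contain the answer, which still needs testing and human review.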


Client Facing RAG Standard

Client-facing RAG systems need stronger controls.

Client Facing Requirements

Required:

  • approved source list
  • sensitivity review
  • user access control
  • answer boundaries
  • fallback message
  • human escalation
  • audit log
  • retrieval log
  • delete process
  • privacy note
  • testing
  • client approval

Rule

A client-facing knowledge assistant must be safer than an internal research assistant.


Voice Agent RAG Standard

Voice agents using RAG need extra caution because answers happen live.

Voice Agent Requirements

Use:

  • short retrieved context
  • low latency retrieval
  • verified knowledge base
  • fallback to human
  • disclosure rules
  • call logging
  • post-call analysis
  • answer boundaries
  • appointment or routing limits
  • no unsupported claims

Rule

Voice agents must not improvise beyond approved knowledge when answering business-specific questions.


AIBS Diagnostic RAG Standard

AIBS can use RAG to support client diagnostics.

Diagnostic Uses

Use RAG to retrieve:

  • client SOPs
  • sales decks
  • service descriptions
  • customer feedback
  • reviews
  • call notes
  • meeting notes
  • workflow documents
  • prior reports
  • competitor findings

Diagnostic Rules

The system should:

  • preserve source evidence
  • separate fact from inference
  • highlight missing data
  • support opportunity mapping
  • support first project selection
  • require human review before proposal

Rule

AIBS diagnostic RAG should support recommendations, not blindly make them.


RAG Quality Scorecard

Score each RAG system out of 100.

Score Categories

Knowledge Purpose Clarity: 10
Source Quality: 10
Permission Safety: 10
Metadata Completeness: 10
Chunk Quality: 10
Retrieval Accuracy: 10
Freshness And Deletion Control: 10
Answer Grounding: 10
Observability: 10
Governance: 10

Interpretation

85–100: Strong RAG system
70–84: Good system with minor improvements
55–69: Internal use only with human review
40–54: Too weak for client use
Below 40: Do not use yet

Rule

Client-facing RAG systems require a high score and strong testing.
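
The scorecard above is simple arithmetic: ten categories, each scored from 0 to 10, summed to a total out of 100 and mapped to a band. A minimal sketch follows. `score_system` and `interpret` are hypothetical names, and the category names and bands are taken directly from the lists above.

```python
from typing import Dict, Tuple

CATEGORIES = [
    "Knowledge Purpose Clarity",
    "Source Quality",
    "Permission Safety",
    "Metadata Completeness",
    "Chunk Quality",
    "Retrieval Accuracy",
    "Freshness And Deletion Control",
    "Answer Grounding",
    "Observability",
    "Governance",
]


def interpret(total: int) -> str:
    """Map a total score out of 100 to the framework's interpretation bands."""
    if total >= 85:
        return "Strong RAG system"
    if total >= 70:
        return "Good system with minor improvements"
    if total >= 55:
        return "Internal use only with human review"
    if total >= 40:
        return "Too weak for client use"
    return "Do not use yet"


def score_system(scores: Dict[str, int]) -> Tuple[int, str]:
    """Validate the ten category scores and return (total, interpretation)."""
    missing = [c for c in CATEGORIES if c not in scores]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    for category in CATEGORIES:
        if not 0 <= scores[category] <= 10:
            raise ValueError(f"{category} must be between 0 and 10")
    total = sum(scores[c] for c in CATEGORIES)
    return total, interpret(total)
```

For example, a system scoring 8 in every category totals 80 and falls in the "Good system with minor improvements" band.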


RAG Build Readiness Checklist

Before building a RAG system, confirm:

Purpose

  • use case defined
  • owner defined
  • user defined
  • output type defined
  • success metric defined

Sources

  • source list defined
  • source permission confirmed
  • sensitivity reviewed
  • source confidence assigned
  • source update pattern understood

Data

  • database selected
  • metadata fields defined
  • client separation defined
  • retention rule defined
  • deletion process defined

AI

  • embedding model selected
  • retrieval method defined
  • prompt rules defined
  • no context, no answer rule added
  • human review points defined

Security

  • access control planned
  • API keys protected
  • private data boundaries defined
  • user permissions defined
  • audit logs planned

Testing

  • retrieval tests planned
  • deletion tests planned
  • stale data tests planned
  • hallucination tests planned
  • client boundary tests planned

Rule

Do not build RAG until source and permission rules are clear.


Application To Data Brain

Data Brain owns this framework.

Data Brain should define:

  • schemas
  • metadata fields
  • source IDs
  • vector storage rules
  • retention rules
  • deletion rules
  • source confidence
  • client separation
  • retrieval filters
  • observability fields

Data Brain Rule

RAG is data infrastructure first and AI output second.


Application To Research Brain

Research Brain should use RAG to preserve and retrieve research evidence.

Research Brain can use RAG for:

  • source libraries
  • competitor knowledge
  • market research
  • course notes
  • trend archives
  • newsletter intelligence
  • evidence retrieval
  • insight synthesis

Research Brain Rule

Research RAG must preserve source visibility and confidence.


Application To AIBS Brain

AIBS Brain should use RAG for client memory and diagnostics.

AIBS can use RAG to:

  • understand client documents
  • support business audits
  • create client reports
  • power assistants
  • support proposals
  • retrieve SOPs
  • support staff knowledge
  • identify value leaks

AIBS Rule

Client memory must be permission-safe, source-led, and separate by client.
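
A minimal sketch of what "separate by client" can mean at retrieval time is shown below. It assumes every stored chunk carries a client_id metadata field and that the client filter is applied before similarity ranking, so another client's chunk can never be returned even when it is the closest match. The Chunk fields and function names are hypothetical, and the in-memory list stands in for a real vector store.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    chunk_id: str
    client_id: str   # hypothetical metadata field used for separation
    source_id: str
    embedding: tuple
    text: str


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_for_client(chunks, client_id, query_embedding, top_k=3):
    """Filter to one client's chunks first, then rank by similarity."""
    allowed = [c for c in chunks if c.client_id == client_id]
    ranked = sorted(
        allowed,
        key=lambda c: cosine(c.embedding, query_embedding),
        reverse=True,
    )
    return ranked[:top_k]
```

Filtering before ranking is the design choice that matters here: a filter applied after ranking can still leak another client's text into logs or intermediate results.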


Application To Automation Brain

Automation Brain should build the processing and retrieval workflows only after governance is clear.

Automation Brain should manage:

  • intake workflows
  • processing workflows
  • embeddings
  • storage
  • deletion
  • retrieval
  • logs
  • error routing
  • review queues

Automation Brain Rule

RAG automation must include deletion and error handling, not only upload.
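
The rule above can be sketched as a deletion routine that visits every store and routes failures to a review queue instead of ignoring them. This is an illustration under assumptions: the deleters mapping, the report shape, and the review queue entries are hypothetical interfaces, not an existing MWMS or vendor API.

```python
def delete_source_everywhere(source_id, deleters, review_queue):
    """Delete one source from every store and route failures to review.

    deleters maps a store name to a function that removes all records for
    source_id and returns how many it removed (hypothetical interface).
    """
    report = {}
    for store_name, delete_fn in deleters.items():
        try:
            report[store_name] = delete_fn(source_id)
        except Exception as exc:
            # A failed delete must not be silently skipped: the source may
            # still be retrievable from this store.
            report[store_name] = None
            review_queue.append(
                {"source_id": source_id, "store": store_name, "error": str(exc)}
            )
    return report


def fully_deleted(report):
    """True only when every store confirmed the delete without error."""
    return all(count is not None for count in report.values())
```

A source should only be marked as deleted when fully_deleted returns True. Any entry in the review queue means a human needs to confirm removal from that store.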


Application To Compliance And Risk Brain

Compliance Brain and Risk Brain should review RAG systems for:

  • private data
  • customer data
  • sensitive documents
  • AI processing permission
  • retention
  • deletion rights
  • client separation
  • data exposure
  • hallucination risk
  • source confidence
  • client-facing output

Compliance Rule

A RAG system that stores client data is a governance responsibility, not a toy.


Application To Content Brain

Content Brain may use RAG to create source-led content.

Content Brain can retrieve:

  • approved brand voice
  • approved offers
  • approved claims
  • approved FAQs
  • approved case studies
  • approved research
  • content source libraries

Content Brain Rule

Content generated from RAG still requires claim and source review.


Application To Sales Brain

Sales Brain may use RAG for proposals and follow-up.

Sales Brain can retrieve:

  • case studies
  • client notes
  • sales decks
  • objections
  • pricing logic
  • prior proposals
  • diagnostic findings
  • offer scope

Sales Brain Rule

RAG can draft sales material, but humans approve claims, price, and scope.


Application To HeadOffice Brain

HeadOffice should approve major memory systems.

HeadOffice should ask:

  • does this memory system support MWMS strategy
  • is it safe
  • who owns it
  • who maintains it
  • what data enters it
  • what data is excluded
  • is it internal or client-facing
  • does it need M
  • does it deserve build priority

HeadOffice Rule

Not every knowledge base deserves infrastructure.


What Not To Do

Do not:

  • dump all documents into memory without purpose
  • mix client data
  • skip permission review
  • store sensitive data unnecessarily
  • rely on old documents
  • forget deletion workflows
  • retrieve without metadata filters
  • answer from low-confidence sources as if certain
  • let AI invent missing context
  • create client-facing RAG without testing
  • use RAG as an excuse to avoid human review
  • store chat history without purpose
  • expose service role keys
  • treat embeddings as human-readable business records
  • build complex RAG before proving the use case

Rule

A messy knowledge base creates confident wrong answers.


Deferred Update And Parking Lot Section

This page creates update needs for the related pages listed below. These updates are deferred until those pages are next revised.

Later Update 1: MWMS Supabase RAG And Vector Memory Framework

Add:

  • broader client memory governance
  • source confidence scoring
  • deletion workflow requirements
  • stale document handling
  • client-facing RAG readiness score
  • no context no answer rule
  • source metadata requirements

Later Update 2: MWMS Client Intelligence And Business Memory Automation Framework

Add:

  • RAG as client memory infrastructure
  • document add and delete lifecycle
  • source confidence labels
  • vector memory retrieval
  • chat history governance
  • source freshness review

Later Update 3: MWMS Source Visibility And Evidence Display Standard

Add:

  • retrieved source IDs
  • chunk source references
  • fact versus inference separation
  • evidence-backed answers
  • confidence labels
  • missing source warning

Later Update 4: MWMS AI Observability Metadata Standard

Add:

  • retrieved chunk IDs
  • retrieval score
  • source ID
  • vector store name
  • embedding model
  • prompt version
  • answer grounded flag
  • no context flag
  • human review flag
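
As a rough sketch, the fields listed above could be captured as one log record per RAG answer. The class name, field names, and types below are assumptions made for illustration. The MWMS AI Observability Metadata Standard remains the place where the real field definitions belong.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RetrievalLogRecord:
    """One log entry per RAG answer; field names are illustrative only."""
    retrieved_chunk_ids: list = field(default_factory=list)
    retrieval_scores: list = field(default_factory=list)
    source_ids: list = field(default_factory=list)
    vector_store_name: str = ""
    embedding_model: str = ""
    prompt_version: str = ""
    answer_grounded: bool = False
    no_context: bool = False
    human_review: bool = False

    def to_json(self):
        """Serialize with sorted keys so log lines are easy to compare."""
        return json.dumps(asdict(self), sort_keys=True)


if __name__ == "__main__":
    record = RetrievalLogRecord(
        retrieved_chunk_ids=["chunk-001"],
        retrieval_scores=[0.82],
        source_ids=["source-001"],
        vector_store_name="example_store",            # placeholder value
        embedding_model="example-embedding-model",    # placeholder value
        prompt_version="v1",
        answer_grounded=True,
    )
    print(record.to_json())
```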

Later Update 5: MWMS AI Automation Security And Risk Checklist

Add:

  • vector memory privacy risk
  • client memory separation
  • source permission review
  • deletion request handling
  • service role key caution
  • private file ingestion controls
  • chat history retention rules

Later Update 6: MWMS AIBS Automation Audit And Opportunity Mapping Framework

Add:

  • audit documents as RAG sources
  • client diagnostic memory
  • opportunity map retrieval
  • report evidence retrieval
  • business knowledge base as audit deliverable

Later Update 7: MWMS AI Voice Agent Design Testing And Governance Framework

Add:

  • voice agent knowledge base rules
  • live answer boundaries
  • retrieved context limits
  • fallback when knowledge is missing
  • post-call retrieval review

Later Update 8: MWMS Prompt Architecture And Automation Output Reliability Framework

Add:

  • RAG answering prompt rules
  • source-only response instructions
  • no context no answer instruction
  • retrieved context formatting
  • citation and uncertainty handling
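
The prompt rules listed above can be sketched as a small prompt builder. This is a hedged illustration only: the wording of the instructions, the NO_CONTEXT_REPLY text, and the (source_id, text) chunk format are assumptions, and no model call is made.

```python
NO_CONTEXT_REPLY = "I do not have approved source material to answer that."


def build_rag_prompt(question, chunks):
    """Return a source-only prompt, or None when no context was retrieved.

    chunks is a list of (source_id, text) pairs. Returning None tells the
    caller to send NO_CONTEXT_REPLY instead of calling the model.
    """
    if not chunks:
        return None
    context = "\n".join(f"[{source_id}] {text}" for source_id, text in chunks)
    return (
        "Answer using only the sources below. Cite the source ID in square "
        "brackets for each claim. If the sources do not contain the answer, "
        f"reply exactly: {NO_CONTEXT_REPLY}\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    prompt = build_rag_prompt(
        "What are the opening hours?",
        [("source-001", "Open 9 to 5 on weekdays.")],
    )
    print(prompt if prompt is not None else NO_CONTEXT_REPLY)
```

The no context no answer rule is enforced twice here: in code, by refusing to build a prompt without retrieved chunks, and in the prompt itself, for cases where chunks were retrieved but do not contain the answer.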

Future AI Employee Ideas

These AI Employee ideas are parked candidates only.

RAG Knowledge Base Architect

Primary Brain: Data Brain / Research Brain
Status: Parked Candidate
Purpose: Designs RAG knowledge base structure, source intake, metadata, chunking, retrieval, deletion, and governance standards.


Source Intake Controller

Primary Brain: Data Brain / Compliance Brain
Status: Parked Candidate
Purpose: Reviews sources before they enter memory and assigns permission level, sensitivity level, topic, confidence, and retention rules.


Vector Memory Curator

Primary Brain: Data Brain
Status: Parked Candidate
Purpose: Maintains vector memory, removes stale records, checks duplicates, validates metadata, and protects source quality.


Retrieval Quality Tester

Primary Brain: Research Brain / Data Brain
Status: Parked Candidate
Purpose: Tests whether RAG systems retrieve correct, current, allowed, and relevant context.


Knowledge Deletion Steward

Primary Brain: Data Brain / Risk Brain
Status: Parked Candidate
Purpose: Ensures deleted or replaced sources are removed from files, records, embeddings, and retrieval paths.


Source Confidence Analyst

Primary Brain: Research Brain
Status: Parked Candidate
Purpose: Scores source reliability and decides whether information can support answers, reports, proposals, or client-facing recommendations.


RAG Answer Quality Reviewer

Primary Brain: Prompting Framework / Research Brain
Status: Parked Candidate
Purpose: Reviews RAG outputs for source grounding, uncertainty handling, hallucination risk, and answer usefulness.


Client Memory Privacy Reviewer

Primary Brain: Compliance Brain / Risk Brain
Status: Parked Candidate
Purpose: Reviews client memory systems for privacy, data separation, sensitive content, retention, and deletion obligations.


Drift Protection

This framework protects MWMS from:

  • ungoverned memory systems
  • stale knowledge
  • source confusion
  • mixed client data
  • unsupported AI answers
  • poor retrieval
  • missing metadata
  • forgotten deletion
  • overconfidence
  • private data exposure
  • chat history misuse
  • source-free reports
  • AI voice agents making things up
  • client assistants answering beyond approved knowledge
  • dumping documents into vector memory without purpose
  • treating RAG as magic

Drift Signals

Watch for:

  • “Just upload everything.”
  • “The AI will find what it needs.”
  • “We do not need metadata.”
  • “We can delete the file manually later.”
  • “The old document is probably still fine.”
  • “No need to separate clients.”
  • “The chatbot sounds accurate.”
  • “We do not need source references.”
  • “Let it answer from general knowledge.”
  • “We can add deletion later.”
  • “The client will not care where the answer came from.”
  • “The voice agent can improvise.”
  • “Let’s store all chat history just in case.”

Rule

When these drift signals appear, return to source permission, metadata, deletion, and answer grounding.


Strategic Summary

The AI Native Entrepreneur Architecture And Tool Decision Block reinforced a core MWMS lesson:

Useful AI memory is not created by simply uploading documents. It is created by controlled source intake, clean processing, meaningful metadata, reliable retrieval, deletion control, and governed answer generation.

RAG is essential for MWMS because future AI Employees and client systems must work from real business context.

But RAG is also risky when poorly governed.

This framework establishes the discipline needed for MWMS to use RAG safely across internal Brains, AIBS client systems, support agents, voice agents, reports, and productized AIOS modules.

The strategic upgrade is:

MWMS memory must be source-led, permission-safe, current, retrievable, deletable, and observable.


Final Standard

The MWMS final standard is:

No MWMS RAG, knowledge base, client memory, voice agent memory, support assistant, or business intelligence memory system should be used until its purpose, source list, permission level, sensitivity level, metadata, chunking, storage, retrieval rules, answer boundaries, deletion process, freshness review, observability, and human review requirements are defined.

A valid MWMS RAG system must define:

  • purpose
  • owner
  • user
  • source list
  • permission rules
  • sensitivity levels
  • source confidence
  • chunking method
  • embedding model
  • vector store
  • metadata fields
  • storage location
  • retrieval filters
  • answer rules
  • no context no answer rule
  • deletion process
  • stale source review
  • chat history rules
  • observability fields
  • human review points
  • governance owner

That is the MWMS RAG Knowledge Base And Client Memory Infrastructure standard.


Change Log

Version: v1.0

Date: 2026-06-08
Author: HeadOffice

Change:
Created the MWMS RAG Knowledge Base And Client Memory Infrastructure Framework from the AI Automations by Jack AI Native Entrepreneur Architecture And Tool Decision Block.

Captured the strongest lessons from practical and strategic workshop material involving:

  • Build Your First RAG AI Agent from Scratch
  • retrieval augmented generation
  • document intake through Google Drive
  • “to add”, “added”, and “delete” folder workflow
  • embeddings
  • vector storage
  • Supabase-based knowledge storage
  • Pinecone-style vector memory
  • document hash and source ID matching
  • metadata-based deletion
  • source freshness
  • chat history
  • company-specific knowledge retrieval
  • no hallucination positioning
  • client knowledge base systems
  • business memory for AIBS
  • voice agent and support assistant knowledge bases

Defined the MWMS RAG Knowledge Base And Client Memory Model with twelve layers:

  1. Knowledge Purpose Layer
  2. Source Intake Layer
  3. Permission And Sensitivity Layer
  4. Chunking And Processing Layer
  5. Embedding And Vector Layer
  6. Metadata And Source Layer
  7. Storage And Database Layer
  8. Retrieval And Ranking Layer
  9. Answer Generation Layer
  10. Deletion And Freshness Layer
  11. Observability And Testing Layer
  12. Governance And Improvement Layer

Added key operating sections:

  • Knowledge Base Types
  • RAG Intake Checklist
  • Knowledge Record Standard
  • Source Confidence Standard
  • Retrieval Safety Standard
  • RAG Answering Standard
  • Document Add Process
  • Document Delete Process
  • Chat History Standard
  • Hallucination Protection Standard
  • Client Facing RAG Standard
  • Voice Agent RAG Standard
  • AIBS Diagnostic RAG Standard
  • RAG Quality Scorecard
  • RAG Build Readiness Checklist
  • Deferred Update And Parking Lot Section
  • Future AI Employee Ideas

Mapped the framework across:

  • Data Brain
  • Research Brain
  • AIBS Brain
  • Automation Brain
  • HeadOffice Brain
  • Compliance Brain
  • Risk Brain
  • Sales Brain
  • Content Brain
  • Product Brain
  • UX Brain
  • Prompting Framework

Purpose of creation:
To establish a formal MWMS standard for building source based RAG, knowledge base, vector memory, and client memory systems that allow AI Employees and client systems to retrieve approved knowledge, preserve source context, avoid stale information, reduce hallucination, protect client data, and support better diagnostics, reports, support, sales, content, and decision-making.

END — MWMS RAG KNOWLEDGE BASE AND CLIENT MEMORY INFRASTRUCTURE FRAMEWORK v1.0