Document Type: Framework
Status: Canon
Authority: HeadOffice
Applies To: Data Brain, Product Brain, Experimentation Brain, Finance Brain, Affiliate Brain, HeadOffice
Parent: Data Brain Canon
Version: v1.0
Last Reviewed: 2026-05-03
Purpose
The Data Blending And Source Integrity Framework defines how MWMS combines data from multiple sources while preserving accuracy, consistency, and trust.
As MWMS grows, data will come from:
- WordPress (UI + forms)
- Supabase (events + tasks)
- Google Ads (traffic + cost)
- Affiliate platforms (revenue)
- email systems
- product usage systems
- payment systems
Blending this data incorrectly leads to:
- false insights
- incorrect decisions
- broken reporting
- system-wide data mistrust
This framework ensures:
- reliable combined datasets
- clear ownership of data
- consistent definitions
- safe integration of multiple sources
Core Principle
Blended data is only as reliable as its weakest source and its join logic.
Role In MWMS System
This framework protects:
- Data Brain → accuracy
- Finance Brain → revenue truth
- Affiliate Brain → performance tracking
- Experimentation Brain → test validity
- HeadOffice → decision confidence
Definition
Data Blending
Combining data from multiple sources into a unified dataset.
Source Integrity
Ensuring each data source:
- is accurate
- is consistent
- has a defined owner
- is used correctly
Data Source Types
MWMS data sources include:
1. Product Data
- events
- usage
- feature interaction
2. Marketing Data
- traffic
- campaigns
- ads
3. Financial Data
- revenue
- cost
- payouts
4. Customer Data
- user profiles
- segments
- lifecycle
5. External Data
- third-party tools
- affiliate networks
- integrations
Source Ownership Rule
Each metric must have:
→ one primary source of truth
Examples
- revenue → payment system
- clicks → ad platform
- events → product analytics
Rule
Never mix sources for the same metric without clear reconciliation logic
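The ownership rule above can be sketched as a small lookup. This is only an illustration: the registry name PRIMARY_SOURCE, the source labels, and the check_source helper are hypothetical and not part of any existing MWMS code.

```python
# Hypothetical registry: metric name -> primary source of truth.
# The entries mirror the examples in this framework; labels are assumed.
PRIMARY_SOURCE = {
    "revenue": "payment_system",
    "clicks": "ad_platform",
    "events": "product_analytics",
}


def check_source(metric: str, source: str, reconciled: bool = False) -> bool:
    """Return True if `source` may supply `metric`.

    A non-primary source is accepted only when the caller states that
    documented reconciliation logic exists (reconciled=True).
    """
    primary = PRIMARY_SOURCE.get(metric)
    if primary is None:
        raise KeyError(f"no primary source registered for metric {metric!r}")
    return source == primary or reconciled
```

Raising on an unregistered metric is a design choice in this sketch: a metric with no declared owner is treated as an error rather than silently accepted.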
Join Key Requirement
All data blending must use:
→ consistent identifiers
Common Join Keys
- user_id
- session_id
- email (hashed)
- transaction_id
- campaign_id
Rule
If join keys do not align:
→ data must not be blended
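One way to test whether join keys align is to measure how many key values on one side also exist on the other, and to refuse the blend below a threshold. The sketch below assumes rows are plain dictionaries; the helper names and the 0.95 threshold are illustrative assumptions, not MWMS standards.

```python
import hashlib


def hash_email(email: str) -> str:
    """Normalize, then hash, so both sources produce the same key."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def key_match_rate(left, right, key):
    """Share of distinct left-side key values that also appear on the right."""
    left_keys = {row[key] for row in left if row.get(key) is not None}
    right_keys = {row[key] for row in right if row.get(key) is not None}
    if not left_keys:
        return 0.0
    return len(left_keys & right_keys) / len(left_keys)


def can_blend(left, right, key, min_rate=0.95):
    """Allow blending only when enough keys align (threshold is assumed)."""
    return key_match_rate(left, right, key) >= min_rate
```

Normalizing the email before hashing matters: two sources that hash differently cased or padded addresses will never match, even though they describe the same person.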
Data Scope Rule
Data must be blended at the correct level:
Levels
- user level
- session level
- event level
- transaction level
Rule
Never combine data at different scopes without first transforming it to a common scope
Example
Incorrect:
- mixing session data with user data directly
Correct:
- aggregate session → user level
- then combine
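The correct path in the example above (aggregate session to user level, then combine) might look like the following. The sample rows and field names such as page_views and plan are invented for illustration only.

```python
from collections import defaultdict

# Illustrative sample data; field names are assumptions.
sessions = [
    {"session_id": "s1", "user_id": "u1", "page_views": 3},
    {"session_id": "s2", "user_id": "u1", "page_views": 5},
    {"session_id": "s3", "user_id": "u2", "page_views": 2},
]
users = [
    {"user_id": "u1", "plan": "pro"},
    {"user_id": "u2", "plan": "free"},
]


def sessions_to_user_level(rows):
    """Aggregate session-level rows into one summary per user."""
    totals = defaultdict(lambda: {"sessions": 0, "page_views": 0})
    for row in rows:
        agg = totals[row["user_id"]]
        agg["sessions"] += 1
        agg["page_views"] += row["page_views"]
    return dict(totals)


def blend_user_level(users, session_totals):
    """Combine user rows with user-level session summaries (one row per user)."""
    empty = {"sessions": 0, "page_views": 0}
    return [{**u, **session_totals.get(u["user_id"], empty)} for u in users]


blended = blend_user_level(users, sessions_to_user_level(sessions))
```

Because the session data is reduced to one row per user before the combine, the result keeps exactly one row per user and no user attribute is repeated per session.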
Data Alignment Rule
Before blending, confirm:
- timeframes match
- definitions match
- granularity matches
Rule
Misaligned data produces false insights
Time Consistency Rule
All data must use:
- consistent timezone
- aligned time windows
Rule
Time mismatch creates:
- false trends
- incorrect attribution
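A minimal sketch of timezone alignment, assuming each source reports naive local timestamps with a known fixed UTC offset. Real sources may use named timezones with daylight saving shifts, which this sketch does not handle; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_utc(local_iso: str, utc_offset_hours: float) -> datetime:
    """Convert a naive local ISO timestamp to an aware UTC datetime."""
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    local = datetime.fromisoformat(local_iso).replace(tzinfo=local_tz)
    return local.astimezone(timezone.utc)


def in_window(ts: datetime, start: datetime, end: datetime) -> bool:
    """Half-open window [start, end) so adjacent windows never overlap."""
    return start <= ts < end


# A click logged late on 2 May in a UTC-8 account falls on 3 May in UTC.
click_utc = to_utc("2026-05-02T23:30:00", -8)
day_start = datetime(2026, 5, 3, tzinfo=timezone.utc)
day_end = day_start + timedelta(days=1)
```

Without the conversion, this click would be reported under 2 May in one source and compared against product events from 3 May in another, which is the kind of mismatch that produces false trends.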
Data Transformation Layer
Blended data must pass through a transformation layer.
Purpose
- normalize formats
- align fields
- standardize values
Rule
Never blend raw incompatible data
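As an illustration of the three purposes listed above (normalize formats, align fields, standardize values), a transformation step could look like this. The alias table, the country mapping, and the field names are assumptions made for the example.

```python
from decimal import Decimal

# Hypothetical mappings; real sources will need their own tables.
FIELD_ALIASES = {"userId": "user_id", "uid": "user_id", "amount_usd": "revenue"}
COUNTRY_VALUES = {"usa": "US", "united states": "US", "us": "US"}


def normalize_record(raw: dict) -> dict:
    """Return a copy of `raw` with aligned field names and standard values."""
    out = {}
    for field, value in raw.items():
        name = FIELD_ALIASES.get(field, field)  # align fields
        if isinstance(value, str):
            value = value.strip()               # normalize formats
        out[name] = value
    if "revenue" in out:
        cleaned = str(out["revenue"]).replace("$", "").replace(",", "")
        out["revenue"] = Decimal(cleaned)       # exact money arithmetic
    if "country" in out:
        country = str(out["country"])
        out["country"] = COUNTRY_VALUES.get(country.lower(), country.upper())
    return out
```

Decimal is used for revenue in this sketch because binary floats can introduce rounding drift when many amounts are summed.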
Attribution Integrity Rule
When combining marketing and product data:
- attribution logic must be defined
- the attribution model must be named: last-click, multi-touch, or custom
Rule
Attribution must be consistent across reports
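To make "defined attribution logic" concrete, here is a sketch of one of the named models, last-click. It assumes each touch is a dictionary with a campaign_id and an integer timestamp ts (for example epoch seconds in UTC); this shape is an assumption, not an MWMS schema.

```python
def last_click(touches, conversion_ts):
    """Return the campaign_id of the latest touch at or before conversion.

    Returns None when no touch precedes the conversion, so unattributed
    conversions stay visible instead of being assigned arbitrarily.
    """
    eligible = [t for t in touches if t["ts"] <= conversion_ts]
    if not eligible:
        return None
    return max(eligible, key=lambda t: t["ts"])["campaign_id"]
```

Keeping the model in a single shared function is one way to satisfy the consistency rule: every report that calls it attributes the same conversion to the same campaign.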
Data Validation Protocol
Before blended data is used:
Must Confirm
- data completeness
- correct joins
- no duplication
- correct aggregation
- expected values
Rule
Unvalidated data must not be used
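Several of the checks above can be automated. The sketch below covers completeness, duplication, and aggregation against a total taken from the primary source; join correctness and expected-value ranges would need separate checks. The function name, parameters, and the 1% tolerance are illustrative assumptions.

```python
from collections import Counter


def validate_blend(rows, key, required, value_field, source_total, tolerance=0.01):
    """Return a list of human-readable issues; an empty list means no issue found."""
    issues = []

    # Completeness: every required field must be present and non-null.
    for i, row in enumerate(rows):
        missing = [f for f in required if row.get(f) is None]
        if missing:
            issues.append(f"row {i}: missing {missing}")

    # Duplication: the key should identify each row uniquely.
    counts = Counter(row.get(key) for row in rows)
    dupes = sorted(str(k) for k, n in counts.items() if n > 1)
    if dupes:
        issues.append(f"duplicate {key}: {dupes}")

    # Aggregation: the blended total should match the primary source total.
    blended_total = sum(row.get(value_field) or 0 for row in rows)
    if abs(blended_total - source_total) > tolerance * max(abs(source_total), 1):
        issues.append(
            f"{value_field} total {blended_total} != source total {source_total}"
        )
    return issues
```

A caller would treat any non-empty result as a block on use, in line with the rule that unvalidated data must not be used.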
Error Detection Signals
Watch for:
- sudden spikes
- unexpected drops
- inconsistent ratios
- duplicate records
- missing data
Rule
Anomalies must be investigated before decisions are made
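Sudden spikes and unexpected drops can be flagged with a simple comparison against a trailing average. This is a deliberately basic sketch, not a statistical anomaly detector; the 7-day window and the 50% change threshold are assumed values.

```python
from statistics import mean


def flag_anomalies(daily_values, window=7, max_change=0.5):
    """Flag days whose value deviates from the trailing mean by more than max_change.

    Returns a list of (index, relative_change) tuples.
    """
    flagged = []
    for i in range(window, len(daily_values)):
        baseline = mean(daily_values[i - window:i])
        if baseline == 0:
            continue  # relative change is undefined against a zero baseline
        change = (daily_values[i] - baseline) / baseline
        if abs(change) > max_change:
            flagged.append((i, round(change, 2)))
    return flagged
```

A flagged day is only a prompt for investigation. The cause may be a real change in behavior or a blending error such as a duplicated join.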
Data Blending Failure Modes
1. Broken Joins
Incorrect matching of records
2. Double Counting
Same data counted multiple times
3. Missing Data
Incomplete datasets
4. Misaligned Timeframes
Comparing different periods
5. Mixed Definitions
Different meanings of the same metric
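Double counting often comes from a one-to-many join. The sketch below uses invented sample rows to show how joining one order to two click rows inflates revenue, and one possible guard that counts revenue once per transaction_id.

```python
# Illustrative sample data: one order, two clicks by the same user.
orders = [{"transaction_id": "t1", "user_id": "u1", "revenue": 100}]
clicks = [
    {"user_id": "u1", "campaign_id": "c1"},
    {"user_id": "u1", "campaign_id": "c2"},
]


def naive_join(orders, clicks):
    """Join on user_id; each order repeats once per matching click."""
    return [
        {**o, **c}
        for o in orders
        for c in clicks
        if o["user_id"] == c["user_id"]
    ]


def revenue_once_per_transaction(rows):
    """Sum revenue while counting each transaction_id only once."""
    seen = {}
    for row in rows:
        seen.setdefault(row["transaction_id"], row["revenue"])
    return sum(seen.values())


joined = naive_join(orders, clicks)
inflated = sum(r["revenue"] for r in joined)   # order counted twice
actual = revenue_once_per_transaction(joined)  # order counted once
```

Here the naive sum reports 200 for a single 100 order. Comparing the blended total with the primary source total is what exposes the problem.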
Drift Protection
The system must prevent:
- blending without validation
- mixing data sources blindly
- ignoring source ownership
- inconsistent definitions
- silent data changes
Operational Rules
Rule 1: Start With One Source
Understand a single source before blending
Rule 2: Add Sources Gradually
Blend step-by-step
Rule 3: Validate Each Step
Check data at each stage
Rule 4: Document Logic
All blending logic must be recorded
Cross Brain Integration
Data Brain
→ owns blending logic
Finance Brain
→ validates revenue
Affiliate Brain
→ validates campaign data
Experimentation Brain
→ validates test data
Product Brain
→ validates usage data
HeadOffice
→ governs
Architectural Intent
This framework ensures MWMS:
- maintains data trust
- prevents false insights
- supports complex analysis
- scales safely
Final Rule
If data cannot be trusted:
→ decisions cannot be trusted
Change Log
Version: v1.0
Date: 2026-05-03
Author: HeadOffice
Change:
Created Data Blending And Source Integrity Framework defining rules for combining multi-source data safely across MWMS.
Change Impact Declaration
Pages Created:
Data Brain Data Blending And Source Integrity Framework
Pages Updated:
None
Pages Deprecated:
None
Registries Requiring Update:
MWMS Architecture Registry
Data Brain Page Registry
Canon Version Update Required:
No
Change Log Entry Required:
Yes