Data Brain Data Blending And Source Integrity Framework

Document Type: Framework
Status: Canon
Authority: HeadOffice
Applies To: Data Brain, Product Brain, Experimentation Brain, Finance Brain, Affiliate Brain, HeadOffice
Parent: Data Brain Canon
Version: v1.0
Last Reviewed: 2026-05-03


Purpose

The Data Blending And Source Integrity Framework defines how MWMS combines data from multiple sources while preserving accuracy, consistency, and trust.

As MWMS grows, data will come from:

  • WordPress (UI + forms)
  • Supabase (events + tasks)
  • Google Ads (traffic + cost)
  • Affiliate platforms (revenue)
  • email systems
  • product usage systems
  • payment systems

Blending this data incorrectly leads to:

  • false insights
  • incorrect decisions
  • broken reporting
  • system-wide data mistrust

This framework ensures:

  • reliable combined datasets
  • clear ownership of data
  • consistent definitions
  • safe integration of multiple sources

Core Principle

Blended data is only as reliable as its weakest source and its join logic.


Role In MWMS System

This framework protects:

  • Data Brain → accuracy
  • Finance Brain → revenue truth
  • Affiliate Brain → performance tracking
  • Experimentation Brain → test validity
  • HeadOffice → decision confidence

Definitions

Data Blending

Combining data from multiple sources into a unified dataset.


Source Integrity

Ensuring each data source:

  • is accurate
  • is consistent
  • has a defined owner
  • is used correctly

Data Source Types

MWMS data sources include:


1. Product Data

  • events
  • usage
  • feature interaction

2. Marketing Data

  • traffic
  • campaigns
  • ads

3. Financial Data

  • revenue
  • cost
  • payouts

4. Customer Data

  • user profiles
  • segments
  • lifecycle

5. External Data

  • third-party tools
  • affiliate networks
  • integrations

Source Ownership Rule

Each metric must have:

→ one primary source of truth


Examples

  • revenue → payment system
  • clicks → ad platform
  • events → product analytics

Rule

Never mix sources for the same metric without clear reconciliation logic
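
The ownership rule can be sketched as a lookup that refuses to fall back to a secondary source. This is a minimal illustration, not an MWMS implementation: the registry name PRIMARY_SOURCE, the source labels, and the function select_metric are all hypothetical.

```python
# Hypothetical registry: each metric maps to exactly one primary source of truth.
PRIMARY_SOURCE = {
    "revenue": "payment_system",
    "clicks": "ad_platform",
    "events": "product_analytics",
}


def select_metric(metric, readings):
    """Return the value reported by the metric's primary source.

    `readings` maps a source name to the value that source reported.
    Values from other sources are ignored, never averaged or substituted.
    """
    if metric not in PRIMARY_SOURCE:
        raise KeyError(f"no primary source registered for {metric!r}")
    source = PRIMARY_SOURCE[metric]
    if source not in readings:
        raise LookupError(f"primary source {source!r} missing for {metric!r}")
    return readings[source]


# Two sources disagree on revenue; only the payment system is trusted.
revenue = select_metric(
    "revenue",
    {"payment_system": 1250.0, "affiliate_network": 1310.0},
)
```

If the primary source is absent, the sketch raises instead of silently using another source, which is where explicit reconciliation logic would have to be written.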


Join Key Requirement

All data blending must use:

→ consistent identifiers


Common Join Keys

  • user_id
  • session_id
  • email (hashed)
  • transaction_id
  • campaign_id

Rule

If join keys do not align:

→ data must not be blended
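
One way to test whether join keys align is to measure how many keys on one side find a match on the other, and refuse to blend below a threshold. The sketch below assumes rows are plain dictionaries and that the right-hand dataset holds one row per key; the function names and the 0.95 threshold are illustrative assumptions, not values set by this framework.

```python
def join_key_coverage(left_rows, right_rows, key):
    """Share of distinct left-hand keys that also appear on the right."""
    left_keys = {row[key] for row in left_rows if row.get(key) is not None}
    right_keys = {row[key] for row in right_rows if row.get(key) is not None}
    if not left_keys:
        return 0.0
    return len(left_keys & right_keys) / len(left_keys)


def blend_on_key(left_rows, right_rows, key, min_coverage=0.95):
    """Join two datasets only when their keys align well enough.

    Assumes `right_rows` has at most one row per key value.
    """
    coverage = join_key_coverage(left_rows, right_rows, key)
    if coverage < min_coverage:
        raise ValueError(
            f"join key {key!r} coverage {coverage:.0%} is below {min_coverage:.0%}"
        )
    right_by_key = {
        row[key]: row for row in right_rows if row.get(key) is not None
    }
    return [
        {**row, **right_by_key[row[key]]}
        for row in left_rows
        if row.get(key) in right_by_key
    ]


orders = [{"user_id": "u1", "amount": 10}, {"user_id": "u2", "amount": 5}]
profiles = [{"user_id": "u1", "segment": "a"}, {"user_id": "u2", "segment": "b"}]
blended = blend_on_key(orders, profiles, "user_id")
```

A formatting mismatch such as "U1" versus "u1" drops coverage to zero for that key, so the blend is rejected rather than silently losing rows.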


Data Scope Rule

Data must be blended at the correct level:


Levels

  • user level
  • session level
  • event level
  • transaction level

Rule

Never combine data at different scopes without first transforming it to a common level


Example

Incorrect:

  • joining session-level rows directly to user-level rows

Correct:

  • aggregate session → user level
  • then combine
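
The correct path above can be sketched as follows. The field names (page_views, segment) and both function names are hypothetical; the point is only that session rows are rolled up to one row per user before they meet user-level data.

```python
from collections import defaultdict

sessions = [
    {"user_id": "u1", "session_id": "s1", "page_views": 3},
    {"user_id": "u1", "session_id": "s2", "page_views": 5},
    {"user_id": "u2", "session_id": "s3", "page_views": 2},
]
users = [
    {"user_id": "u1", "segment": "trial"},
    {"user_id": "u2", "segment": "paid"},
]


def aggregate_sessions_to_users(session_rows):
    """Roll session-level rows up to one summary per user."""
    totals = defaultdict(lambda: {"sessions": 0, "page_views": 0})
    for row in session_rows:
        totals[row["user_id"]]["sessions"] += 1
        totals[row["user_id"]]["page_views"] += row["page_views"]
    return dict(totals)


def combine_at_user_level(user_rows, session_rows):
    """Combine user data with session data only after aggregation."""
    per_user = aggregate_sessions_to_users(session_rows)
    empty = {"sessions": 0, "page_views": 0}
    return [{**u, **per_user.get(u["user_id"], empty)} for u in user_rows]


user_level = combine_at_user_level(users, sessions)
```

The result keeps exactly one row per user, so user counts stay correct no matter how many sessions each user has.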

Data Alignment Rule

Before blending, confirm:

  • timeframes match
  • definitions match
  • granularity matches

Rule

Misaligned data produces false insights
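
The three alignment checks can be expressed as a pre-blend comparison of dataset descriptions. This is a sketch under assumptions: DatasetSpec and alignment_issues are hypothetical names, and the definition field stands in for whatever identifier a metric definition carries.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DatasetSpec:
    name: str
    start: date
    end: date
    granularity: str  # e.g. "day" or "week"
    definition: str   # identifier of the metric definition in use


def alignment_issues(a, b):
    """List the reasons two datasets should not yet be blended."""
    issues = []
    if (a.start, a.end) != (b.start, b.end):
        issues.append("timeframes differ")
    if a.definition != b.definition:
        issues.append("definitions differ")
    if a.granularity != b.granularity:
        issues.append("granularity differs")
    return issues


ads = DatasetSpec("ads", date(2026, 4, 1), date(2026, 4, 30), "day", "clicks_v1")
product = DatasetSpec("product", date(2026, 4, 1), date(2026, 4, 30), "week", "clicks_v1")
issues = alignment_issues(ads, product)
```

An empty list means the datasets pass these three checks; any entry is a reason to transform one side first.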


Time Consistency Rule

All data must use:

  • consistent timezone
  • aligned time windows

Rule

Time mismatch creates:

  • false trends
  • incorrect attribution
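
A short illustration of why timezone consistency matters: the same moment can fall on different calendar days depending on the source's timezone. The sketch assumes UTC as the common timezone, which is an assumption of this example rather than a stated MWMS standard, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def to_utc(ts):
    """Convert an aware timestamp to UTC; reject naive timestamps."""
    if ts.tzinfo is None:
        raise ValueError("naive timestamp: source timezone must be known before blending")
    return ts.astimezone(timezone.utc)


def utc_day(ts):
    """Calendar day (ISO format) of a timestamp once normalized to UTC."""
    return to_utc(ts).date().isoformat()


# An ad click recorded in a UTC-5 account and the product event it caused.
ad_click = datetime(2026, 5, 2, 23, 30, tzinfo=timezone(timedelta(hours=-5)))
product_event = datetime(2026, 5, 3, 4, 30, tzinfo=timezone.utc)

local_day = ad_click.date().isoformat()  # day as the ad platform reports it
normalized_day = utc_day(ad_click)       # day after normalization
```

Grouped by local day, the click and the event land on different dates; after normalization both fall on the same UTC day.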

Data Transformation Layer

Blended data must pass through a transformation layer.


Purpose

  • normalize formats
  • align fields
  • standardize values

Rule

Never blend raw incompatible data
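
A transformation step might look like the sketch below, which normalizes one raw payment record before it is allowed into a blend. The field names, the STATUS_MAP values, and the choice of SHA-256 for hashed emails are assumptions made for illustration.

```python
import hashlib
from decimal import Decimal

# Hypothetical mapping from source-specific status labels to one standard set.
STATUS_MAP = {
    "paid": "completed",
    "success": "completed",
    "complete": "completed",
    "refunded": "refunded",
}


def hash_email(email):
    """Normalize an email (trim, lowercase) and hash it for use as a join key."""
    return hashlib.sha256(email.strip().lower().encode("utf-8")).hexdigest()


def normalize_payment(raw):
    """Normalize formats, align field names, and standardize values."""
    status = STATUS_MAP.get(raw["status"].strip().lower())
    if status is None:
        raise ValueError(f"unknown status {raw['status']!r}")
    return {
        "transaction_id": str(raw["transaction_id"]).strip(),
        "email_hash": hash_email(raw["email"]),
        # Decimal avoids float rounding when converting to integer cents.
        "amount_cents": int(Decimal(str(raw["amount"])) * 100),
        "status": status,
    }


clean = normalize_payment(
    {"transaction_id": " t-1001 ", "email": " A@Example.com ", "amount": "19.99", "status": "Paid"}
)
```

Unknown status values raise an error instead of passing through, so incompatible raw values cannot reach the blended dataset unnoticed.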


Attribution Integrity Rule

When combining marketing and product data:

  • attribution logic must be defined before blending
  • the model must be named explicitly: last-click, multi-touch, or custom

Rule

Attribution must be consistent across reports
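
Consistency is easier to hold when every report calls the same attribution function. Below is a minimal last-click sketch; the function name, the touch record shape, and the campaign identifiers are hypothetical, and multi-touch or custom models would need their own shared definitions.

```python
from datetime import datetime, timezone


def last_click_campaign(touches, converted_at):
    """Credit the most recent click at or before the conversion time.

    Returns None when no click precedes the conversion.
    """
    eligible = [t for t in touches if t["clicked_at"] <= converted_at]
    if not eligible:
        return None
    return max(eligible, key=lambda t: t["clicked_at"])["campaign_id"]


touches = [
    {"campaign_id": "c-search", "clicked_at": datetime(2026, 5, 1, 9, 0, tzinfo=timezone.utc)},
    {"campaign_id": "c-email", "clicked_at": datetime(2026, 5, 2, 18, 0, tzinfo=timezone.utc)},
    {"campaign_id": "c-retarget", "clicked_at": datetime(2026, 5, 4, 8, 0, tzinfo=timezone.utc)},
]
converted_at = datetime(2026, 5, 3, 12, 0, tzinfo=timezone.utc)
credited = last_click_campaign(touches, converted_at)
```

The click after the conversion is ignored, so credit goes to the email campaign.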


Data Validation Protocol

Before blended data is used:


Must Confirm

  • data completeness
  • correct joins
  • no duplication
  • correct aggregation
  • expected values

Rule

Unvalidated data must not be used
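
The checks above can be sketched as a single gate that returns a list of problems. The function name, the field names, and the idea of reconciling against a total taken from the primary source are assumptions for illustration; "correct joins" is covered here only indirectly, through the key and total checks.

```python
def validate_blend(blended_rows, source_total, key="transaction_id",
                   amount_field="amount_cents"):
    """Return a list of validation problems; an empty list means usable."""
    problems = []

    keys = [row.get(key) for row in blended_rows]
    if any(k is None for k in keys):
        problems.append("missing join key")          # completeness
    if len(set(keys)) != len(keys):
        problems.append("duplicate records")         # no duplication

    amounts = [row.get(amount_field) for row in blended_rows]
    if any(a is None for a in amounts):
        problems.append("incomplete amounts")        # completeness
    else:
        if any(a < 0 for a in amounts):
            problems.append("unexpected negative amount")  # expected values
        if sum(amounts) != source_total:
            problems.append("total does not reconcile with source")  # aggregation
    return problems


clean_rows = [
    {"transaction_id": "t1", "amount_cents": 1999},
    {"transaction_id": "t2", "amount_cents": 500},
]
duplicated_rows = clean_rows + [{"transaction_id": "t1", "amount_cents": 1999}]

clean_problems = validate_blend(clean_rows, source_total=2499)
duplicate_problems = validate_blend(duplicated_rows, source_total=2499)
```

A blended dataset would be released for use only when the returned list is empty.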


Error Detection Signals

Watch for:

  • sudden spikes
  • unexpected drops
  • inconsistent ratios
  • duplicate records
  • missing data

Rule

Anomalies must be investigated before decisions
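
Spikes and drops can be flagged automatically by comparing the latest value with a recent baseline. The sketch below uses the median of prior values and a 50% tolerance; both choices, and the function name, are illustrative assumptions rather than thresholds defined by this framework.

```python
from statistics import median


def flag_anomaly(history, latest, tolerance=0.5):
    """Compare the latest value with the median of prior values.

    Returns a signal label, or None when the value looks normal.
    """
    if not history:
        return "missing data"
    baseline = median(history)
    if baseline == 0:
        return "sudden spike" if latest > 0 else None
    change = (latest - baseline) / baseline
    if change > tolerance:
        return "sudden spike"
    if change < -tolerance:
        return "unexpected drop"
    return None


daily_clicks = [100, 110, 90, 105]
spike = flag_anomaly(daily_clicks, 300)
drop = flag_anomaly(daily_clicks, 20)
normal = flag_anomaly(daily_clicks, 100)
```

A flagged value is a prompt to investigate the source and the join, not proof that the data is wrong.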


Data Blending Failure Modes


1. Broken Joins

Incorrect matching of records


2. Double Counting

Same data counted multiple times


3. Missing Data

Incomplete datasets


4. Misaligned Timeframes

Comparing or combining data from different time periods as if they matched


5. Mixed Definitions

Different meanings of the same metric
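
Double counting often comes from a one-to-many join. The sketch below shows a single order joined to two sessions for the same user, which doubles revenue when the joined rows are summed. The data and the naive_join helper are hypothetical.

```python
orders = [{"user_id": "u1", "revenue": 100}]
sessions = [
    {"user_id": "u1", "session_id": "s1"},
    {"user_id": "u1", "session_id": "s2"},
]


def naive_join(left, right, key):
    """Join every left row to every matching right row (one-to-many fan-out)."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


joined = naive_join(orders, sessions, "user_id")

true_revenue = sum(o["revenue"] for o in orders)
inflated_revenue = sum(row["revenue"] for row in joined)

# A simple guard: a join that adds rows has changed the scope of the data.
fan_out_detected = len(joined) > len(orders)
```

Comparing row counts and totals before and after the join catches this failure mode before the inflated number reaches a report.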


Drift Protection

The system must prevent:

  • blending without validation
  • mixing data sources blindly
  • ignoring source ownership
  • inconsistent definitions
  • silent data changes

Operational Rules


Rule 1: Start With One Source

Understand each source on its own before blending


Rule 2: Add Sources Gradually

Blend step-by-step


Rule 3: Validate Each Step

Check data at each stage


Rule 4: Document Logic

All blending logic must be recorded


Cross Brain Integration

Data Brain
→ owns blending logic

Finance Brain
→ validates revenue

Affiliate Brain
→ validates campaign data

Experimentation Brain
→ validates test data

Product Brain
→ validates usage data

HeadOffice
→ governs this framework


Architectural Intent

This framework ensures MWMS:

  • maintains data trust
  • prevents false insights
  • supports complex analysis
  • scales safely

Final Rule

If data cannot be trusted:

→ decisions cannot be trusted


Change Log

Version: v1.0
Date: 2026-05-03
Author: HeadOffice

Change:
Created Data Blending And Source Integrity Framework defining rules for combining multi-source data safely across MWMS.


Change Impact Declaration

Pages Created:
Data Brain Data Blending And Source Integrity Framework

Pages Updated:
None

Pages Deprecated:
None

Registries Requiring Update:
MWMS Architecture Registry
Data Brain Page Registry

Canon Version Update Required:
No

Change Log Entry Required:
Yes


END DATA BRAIN DATA BLENDING AND SOURCE INTEGRITY FRAMEWORK v1.0