Data Brain Raw Data Access Framework


Document Type: Framework
Status: Active
Authority: Data Brain
Parent: Data Brain Architecture
Applies To: All MWMS environments where analytics, experimentation, or performance data is used for analysis or decision-making
Version: v1.0
Last Reviewed: 2026-04-23


Purpose

The Data Brain Raw Data Access Framework defines how MWMS accesses, validates, and uses raw event-level data from underlying data systems.

Most analytics platforms present:

• aggregated data
• sampled data
• estimated data
• thresholded data

These representations are useful for trend analysis but may not reflect exact behaviour.

Raw data access provides:

• full event-level visibility
• exact counts
• complete behavioural logs
• direct query control

This framework ensures MWMS uses raw data where accuracy is required, especially for experimentation and decision-critical analysis.


Core Principle

Interface data is not guaranteed truth.

Raw data provides the closest representation of actual behaviour.

If decision accuracy requires precision:

→ raw data must be used


Position in MWMS System

This framework operates within:

• Data Brain → data access and validation
• Experimentation Brain → test analysis
• HeadOffice → decision control

It supports:

• Data Trust Framework
• Measurement Integrity Framework
• Attribution Reliability Framework
• Statistical Confidence Framework


Raw Data Definition

Raw data refers to:

• event-level records
• unaggregated behavioural data
• system-exported logs
• direct database tables

Examples:

• GA4 BigQuery event tables
• database event logs
• server-side tracking logs

Raw data represents:

→ what actually happened at the event level


Interface Data vs Raw Data


Interface Data

Characteristics:

• aggregated
• sampled
• estimated
• simplified

Advantages:

• fast
• easy to interpret
• accessible

Limitations:

• may not be exact
• may change when the same report is viewed at a later time
• may hide small differences


Raw Data

Characteristics:

• unaggregated
• event-level
• exact counts
• fully queryable

Advantages:

• high accuracy
• full flexibility
• complete visibility

Limitations:

• requires technical querying
• higher complexity
• requires interpretation


Raw Data Requirement Rule

Raw data must be used when:

• analysing A/B tests
• validating measurement accuracy
• investigating anomalies
• resolving discrepancies
• validating attribution differences
• making high-impact decisions

Interface data may be used for:

• trend monitoring
• directional insights
• high-level reporting
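
The rule above can be read as a simple lookup. The following is a minimal sketch only: the use-case labels and the function name required_source are hypothetical and not part of MWMS, and the fallback to raw data for unlisted use cases is an assumption that follows the stricter option.

```python
# Hypothetical use-case labels derived from the Raw Data Requirement Rule.
RAW_REQUIRED = {
    "ab_test_analysis",
    "measurement_validation",
    "anomaly_investigation",
    "discrepancy_resolution",
    "attribution_validation",
    "high_impact_decision",
}
INTERFACE_ALLOWED = {
    "trend_monitoring",
    "directional_insight",
    "high_level_reporting",
}


def required_source(use_case: str) -> str:
    """Return 'raw' or 'interface' for a given use-case label."""
    if use_case in RAW_REQUIRED:
        return "raw"
    if use_case in INTERFACE_ALLOWED:
        return "interface"
    # Assumption: anything not explicitly listed falls back to raw data.
    return "raw"
```

A caller would check required_source("ab_test_analysis") before choosing where to pull numbers from.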


🔴 Critical Use Case — Experimentation

A/B testing requires precise data.

Small differences in:

• conversions
• users
• events

can change test outcomes.

If interface data is:

• sampled
• thresholded
• estimated

→ test results may be incorrect

Therefore:

→ raw data is required for experiment validation
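
To illustrate how sensitive a test outcome can be to small count differences, the sketch below runs the same pooled two-proportion z-test twice: once on exact counts and once on counts that are each off by four conversions. The numbers are invented for illustration, and the z-test is chosen here as an example; this framework does not prescribe a specific statistical test.

```python
from math import erfc, sqrt


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value of a pooled two-proportion z-test."""
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (conv_b / n_b - conv_a / n_a) / std_err
    # For a standard normal, the two-sided tail probability is erfc(|z| / sqrt(2)).
    return erfc(abs(z) / sqrt(2))


# Illustrative numbers only, not real MWMS data.
raw_p = two_proportion_p_value(100, 2000, 131, 2000)        # exact event-level counts
interface_p = two_proportion_p_value(104, 2000, 127, 2000)  # each count off by 4

print(f"raw p-value: {raw_p:.3f}, interface p-value: {interface_p:.3f}")
```

With these inputs the raw counts fall below a 0.05 significance threshold while the slightly shifted counts do not, so the same experiment would be called differently depending on the data source.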


🔴 Critical Use Case — Discrepancy Resolution

When systems disagree:

• GA4 vs Ads
• analytics vs backend
• dashboards vs reports

Raw data must be used to:

• identify source of discrepancy
• validate counts
• confirm correct behaviour
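
One way to locate the source of a discrepancy is to compare record identifiers from the two systems directly. The sketch below assumes both systems expose a shared identifier (for example a transaction ID); the function name reconcile and the sample IDs are hypothetical.

```python
from collections import Counter


def reconcile(analytics_ids, backend_ids):
    """Compare identifiers from two systems and report where they disagree."""
    analytics_counts = Counter(analytics_ids)
    analytics_set = set(analytics_counts)
    backend_set = set(backend_ids)
    return {
        # Recorded by analytics but unknown to the backend.
        "only_in_analytics": sorted(analytics_set - backend_set),
        # Present in the backend but never tracked by analytics.
        "only_in_backend": sorted(backend_set - analytics_set),
        # Tracked more than once, which inflates analytics counts.
        "duplicated_in_analytics": sorted(
            key for key, count in analytics_counts.items() if count > 1
        ),
    }


report = reconcile(["t1", "t2", "t2", "t4"], ["t1", "t2", "t3"])
```

Each key in the report points at a different cause: phantom events, tracking gaps, or double firing.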


Raw Data Access Methods


1. Data Warehouse Access

Primary method for raw data access.

Examples:

• BigQuery
• other cloud data warehouses
• internal databases

Provides:

• full event logs
• structured tables
• query flexibility


2. Query-Based Access

Data is retrieved using structured queries.

Examples:

• SQL queries
• filtered extraction
• aggregation queries

Allows:

• precise data selection
• custom segmentation
• experiment analysis
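
As a sketch of query-based access, the example below runs a filtered aggregation query against an in-memory SQLite table standing in for a warehouse. The table name events, its columns, and the sample rows are assumptions made for illustration; a real warehouse schema will differ.

```python
import sqlite3

# Hypothetical event rows: (event_date, user_id, event_name, variant).
SAMPLE_ROWS = [
    ("20260401", "u1", "page_view", "A"),
    ("20260401", "u1", "purchase", "A"),
    ("20260402", "u2", "page_view", "A"),
    ("20260402", "u3", "page_view", "B"),
    ("20260403", "u3", "purchase", "B"),
    ("20260403", "u4", "purchase", "B"),
    ("20260410", "u5", "purchase", "B"),  # outside the queried date range
]

QUERY = """
SELECT variant,
       COUNT(DISTINCT user_id) AS users,
       COUNT(DISTINCT CASE WHEN event_name = 'purchase' THEN user_id END) AS converters
FROM events
WHERE event_date BETWEEN ? AND ?
GROUP BY variant
ORDER BY variant
"""


def experiment_counts(conn, start, end):
    """Return exact (variant, users, converters) rows for a date range."""
    return conn.execute(QUERY, (start, end)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (event_date TEXT, user_id TEXT, event_name TEXT, variant TEXT)"
)
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?)", SAMPLE_ROWS)
counts = experiment_counts(conn, "20260401", "20260407")
```

The date filter and the per-variant grouping are what give the analyst control over exactly which events are counted.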


3. Export Pipelines

Raw data may be accessed through:

• automated exports
• streaming pipelines
• data integrations

Purpose:

• continuous data availability


Raw Data Structure Awareness

Raw data is often structured as:

• event logs
• timestamped records
• nested parameter structures

Important considerations:

• events may contain multiple parameters
• parameters must be extracted correctly
• data may require transformation

Incorrect interpretation of raw structure leads to incorrect analysis.


🔴 Nested Data Rule

Raw data may contain nested structures.

Examples:

• key-value parameters
• event attributes stored within columns

These require:

→ correct extraction logic

If nested data is not handled properly:

→ analysis results become incorrect
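
The sketch below shows one form of extraction logic for key-value parameters. It is modelled on the GA4 BigQuery export layout, where event parameters are a repeated key/value record with typed value fields; treat the exact field names as an assumption to verify against the actual schema. The helper name get_param and the sample event are hypothetical.

```python
VALUE_FIELDS = ("string_value", "int_value", "float_value", "double_value")


def get_param(event, key):
    """Return the value of a nested event parameter, or None if absent."""
    for param in event.get("event_params", []):
        if param.get("key") != key:
            continue
        value = param.get("value") or {}
        for field in VALUE_FIELDS:
            # Compare against None so that legitimate values such as 0 are kept.
            if value.get(field) is not None:
                return value[field]
        return None
    return None


sample_event = {
    "event_name": "purchase",
    "event_params": [
        {"key": "currency", "value": {"string_value": "USD", "int_value": None}},
        {"key": "ga_session_id", "value": {"int_value": 12345}},
        {"key": "engagement_time_msec", "value": {"int_value": 0}},
    ],
}
```

Checking for None rather than truthiness matters here: a parameter whose value is 0 is valid data, and dropping it would silently bias the analysis.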


Raw Data Validation Rule

Raw data must still be validated.

Raw does not automatically mean correct.

Validation includes:

• event integrity checks
• duplication checks
• missing data checks
• timestamp validation
• session validation


Raw Data Limitations

Raw data may still have limitations:

• tracking gaps
• missing events
• system-level visibility gaps
• delayed data availability
• export limitations

Raw data improves accuracy, but does not eliminate all limitations.


Data Latency Consideration

Raw data may not be immediately complete.

Examples:

• delayed exports
• backfilled data
• streaming vs batch differences

Recent data must be treated as provisional until exports and backfills have completed.
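
One way to apply this caution is to separate events that are old enough to be considered settled from those that may still change. In the sketch below, the 72-hour settle window is a placeholder assumption and should be set per data source; the function name and event layout are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def split_by_settle_window(events, now, settle=timedelta(hours=72)):
    """Split events into (settled, provisional) based on their timestamp."""
    cutoff = now - settle
    settled = [event for event in events if event["timestamp"] <= cutoff]
    provisional = [event for event in events if event["timestamp"] > cutoff]
    return settled, provisional


now = datetime(2026, 4, 23, tzinfo=timezone.utc)
recent_events = [
    {"event_id": "e1", "timestamp": datetime(2026, 4, 19, tzinfo=timezone.utc)},
    {"event_id": "e2", "timestamp": datetime(2026, 4, 22, tzinfo=timezone.utc)},
]
settled, provisional = split_by_settle_window(recent_events, now)
```

Decision-critical analysis would then use only the settled list, with the provisional list reported separately or excluded.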


Cost Awareness Rule

Raw data querying may incur:

• compute costs
• storage costs

Queries must be:

• efficient
• targeted
• controlled
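
In columnar warehouses that bill by data scanned, selecting fewer columns and a bounded date range generally reduces cost. The sketch below shows one possible guardrail that refuses unbounded or SELECT * queries. The function name, the 31-day limit, and the event_date column are assumptions, and the table and column names are expected to come from trusted code, not user input.

```python
from datetime import date

MAX_DAYS = 31  # hypothetical guardrail on the queried date range


def build_targeted_query(table, columns, start, end):
    """Build a query limited to explicit columns and a bounded date range."""
    if not columns or "*" in columns:
        raise ValueError("list explicit columns instead of selecting everything")
    days = (end - start).days + 1
    if days < 1 or days > MAX_DAYS:
        raise ValueError(f"date range must cover 1 to {MAX_DAYS} days, got {days}")
    column_list = ", ".join(columns)
    return (
        f"SELECT {column_list} FROM {table} "
        f"WHERE event_date BETWEEN '{start:%Y%m%d}' AND '{end:%Y%m%d}'"
    )


query = build_targeted_query(
    "events", ["user_id", "event_name"], date(2026, 4, 1), date(2026, 4, 7)
)
```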


Automation and Monitoring Integration

Raw data should feed into:

• automated reporting systems
• dashboards
• experiment monitoring
• decision surfaces

Automation reduces manual querying.


Relationship to Other Frameworks

Supports:

• Data Brain Data Trust Framework
• Data Brain Measurement Integrity Framework
• Data Brain Attribution Reliability Framework
• Data Brain Signal Flow Framework
• Experimentation Brain Statistical Confidence Framework


Failure Modes Prevented

• relying on estimated data
• incorrect test conclusions
• misinterpreting small differences
• ignoring discrepancies
• over-trusting interface reports


Drift Protection

The system must prevent:

• reliance on interface data for critical decisions
• ignoring raw data when it is available
• degradation of query accuracy
• incorrect interpretation of event structures


Architectural Intent

The Data Brain Raw Data Access Framework ensures MWMS operates with:

evidence-based decision inputs

It transforms:

analytics consumption → data interrogation


Final Rule

If decision accuracy requires precision:

→ raw data must be used


Change Log

Version: v1.0
Date: 2026-04-23
Author: Data Brain

Change:
Initial creation of Raw Data Access Framework defining how MWMS accesses and uses event-level data for accurate decision-making.


Change Impact Declaration

Pages Created:
Data Brain Raw Data Access Framework

Pages Updated:
None

Pages Deprecated:
None

Registries Requiring Update:
MWMS Architecture Registry

Canon Version Update Required:
No

Change Log Entry Required:
Yes