SYSTEM Data Brain Product Analytics Data Collection Standard

Document Type: Standard
Status: Canon
Authority: HeadOffice
Applies To: Data Brain, Product Brain, Experimentation Brain, SIT Brain
Parent: Data Brain Canon
Version: v1.0
Last Reviewed: 2026-05-03


Purpose

The Product Analytics Data Collection Standard defines how all product-related data must be:

  • structured
  • captured
  • validated
  • maintained

within the MWMS ecosystem.

This standard ensures:

  • data reliability
  • cross-brain consistency
  • accurate analysis
  • correct decision-making

Without a strict data collection standard, all downstream systems become unreliable.


Core Principle

All analytics quality is determined at the point of data collection.

If data is:

  • inconsistent
  • incomplete
  • incorrectly defined

then:

  • reporting becomes misleading
  • experiments become invalid
  • decisions become flawed

Role In MWMS System

Product Analytics Data Collection sits at the foundation of:

  • Data Brain (data integrity)
  • Product Brain (feature usage analysis)
  • Experimentation Brain (test evaluation)
  • SIT Brain (system validation)
  • HeadOffice (decision oversight)

Data Collection Objectives

The system must enable:

  1. Accurate event tracking
  2. Consistent naming across the system
  3. User-level visibility
  4. Cross-session tracking
  5. Future analysis flexibility
  6. Low dependency on developers (where possible)

Data Collection Structure

All product analytics data must follow this hierarchy:

1. User Level

Represents:

  • unique user
  • account or identity
  • cross-session behaviour

Examples:

  • user_id
  • account_id
  • subscription_status

2. Session Level

Represents:

  • a continuous interaction period

Examples:

  • session_start
  • session_end
  • session_duration
  • device_type
  • traffic_source

3. Event Level

Represents:

  • specific actions taken by the user

Examples:

  • button_click
  • form_submit
  • feature_used
  • page_view
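
As a rough illustration of the hierarchy above, the three levels can be modelled as nested records. This is a minimal sketch, not a prescribed schema: the class names, the event_count helper, and the sample values are assumptions made for the example, while the field names come from the examples listed for each level.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Event:
    """Event level: one specific action taken by the user."""
    event_name: str
    event_timestamp: str


@dataclass
class Session:
    """Session level: one continuous interaction period."""
    session_id: str
    device_type: str
    traffic_source: str
    events: List[Event] = field(default_factory=list)


@dataclass
class User:
    """User level: a unique identity that persists across sessions."""
    user_id: str
    account_id: str
    subscription_status: str
    sessions: List[Session] = field(default_factory=list)

    def event_count(self) -> int:
        # Cross-session behaviour: walk every session owned by this user.
        return sum(len(s.events) for s in self.sessions)


# Hypothetical sample data for illustration only.
user = User("12345", "acct-1", "trial")
first = Session("abc123", "desktop", "youtube")
first.events.append(Event("page_view", "2026-05-03T10:00:00"))
first.events.append(Event("feature_used", "2026-05-03T10:01:00"))
second = Session("def456", "mobile", "direct")
second.events.append(Event("form_submit", "2026-05-04T09:30:00"))
user.sessions.extend([first, second])
```

Each event belongs to exactly one session and each session to exactly one user, so any level can be reached from the one above it without guessing.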

Data Layer Standard

All tracked data must originate from a structured data layer.

Data Layer Requirements

The data layer must be:

  • tool-agnostic
  • consistent across the system
  • human-readable
  • developer-controlled but analyst-defined

Data Layer Structure Example

{
  user_id: "12345",
  session_id: "abc123",
  event_name: "feature_click",
  event_category: "feature_interaction",
  event_timestamp: "2026-05-03T10:00:00",
  feature_name: "offer_intelligence",
  device_type: "desktop",
  traffic_source: "youtube"
}
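
One way to keep the data layer tool-agnostic is to store plain records and let each analytics tool subscribe as an adapter. The sketch below assumes a hypothetical DataLayer class with subscribe and push methods; it is an illustration of the requirement, not the MWMS implementation, and the two "tools" are stand-in Python lists.

```python
from typing import Callable, Dict, List

Record = Dict[str, str]


class DataLayer:
    """Hypothetical tool-agnostic data layer: plain dict records, tools attach as adapters."""

    def __init__(self) -> None:
        self.records: List[Record] = []
        self._adapters: List[Callable[[Record], None]] = []

    def subscribe(self, adapter: Callable[[Record], None]) -> None:
        self._adapters.append(adapter)

    def push(self, record: Record) -> None:
        # Store a copy so later mutation by the caller cannot alter history.
        stored = dict(record)
        self.records.append(stored)
        # Each adapter gets its own copy, so no tool can modify another tool's data.
        for adapter in self._adapters:
            adapter(dict(stored))


layer = DataLayer()
tool_a: List[Record] = []   # stand-in for one analytics tool
tool_b: List[str] = []      # stand-in for a second tool that keeps only event names
layer.subscribe(tool_a.append)
layer.subscribe(lambda r: tool_b.append(r["event_name"]))

layer.push({
    "user_id": "12345",
    "session_id": "abc123",
    "event_name": "feature_click",
    "feature_name": "offer_intelligence",
})
```

Swapping or adding a tool then means adding an adapter, while the record structure stays unchanged.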

Event Design Standard

Every event must follow a consistent structure.

Required Event Fields

  • event_name
  • event_category
  • event_timestamp
  • user_id
  • session_id

Optional (But Recommended)

  • feature_name
  • page_name
  • traffic_source
  • device_type
  • experiment_id
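
The required and recommended fields above can be checked mechanically before an event is accepted. The following is a minimal sketch; the function name check_event_fields and the shape of its return value are assumptions made for illustration.

```python
REQUIRED_FIELDS = ("event_name", "event_category", "event_timestamp", "user_id", "session_id")
RECOMMENDED_FIELDS = ("feature_name", "page_name", "traffic_source", "device_type", "experiment_id")


def check_event_fields(event: dict) -> dict:
    """Report required fields that are missing or empty, and recommended fields that are absent."""
    missing_required = [f for f in REQUIRED_FIELDS if event.get(f) in (None, "")]
    missing_recommended = [f for f in RECOMMENDED_FIELDS if f not in event]
    return {
        "valid": not missing_required,  # only required fields decide validity
        "missing_required": missing_required,
        "missing_recommended": missing_recommended,
    }


# Hypothetical sample events for illustration only.
complete = {
    "event_name": "feature_click",
    "event_category": "feature_interaction",
    "event_timestamp": "2026-05-03T10:00:00",
    "user_id": "12345",
    "session_id": "abc123",
    "feature_name": "offer_intelligence",
}
incomplete = {"event_name": "form_submit", "user_id": "12345"}

complete_report = check_event_fields(complete)
incomplete_report = check_event_fields(incomplete)
```

Missing recommended fields are reported but do not make the event invalid, which mirrors the split between the two lists.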

Event Naming Rules

All events must be:

  • lowercase
  • underscore-separated
  • descriptive
  • consistent across the system

Example

Correct:

  • feature_click
  • form_submit
  • offer_view

Incorrect:

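  • ClickButton (not lowercase, not underscore-separated)
  • formSubmit123 (camelCase with a meaningless numeric suffix)
  • random_event (correct format, but not descriptive)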
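
The format part of these rules (lowercase, underscore-separated) can be enforced with a pattern check. The sketch below assumes a two-word minimum as a rough proxy for "descriptive"; that threshold is an assumption of this example, not part of the standard. A pattern cannot judge meaning, so a name such as random_event passes the format check and still needs human review.

```python
import re

# Lowercase words joined by single underscores, at least two words.
# The two-word minimum is an assumption used as a rough proxy for "descriptive".
EVENT_NAME_PATTERN = re.compile(r"[a-z]+(_[a-z]+)+")


def is_valid_event_name(name: str) -> bool:
    """Check format only; whether the name is descriptive still needs a reviewer."""
    return EVENT_NAME_PATTERN.fullmatch(name) is not None


format_results = {
    name: is_valid_event_name(name)
    for name in ["feature_click", "form_submit", "offer_view",
                 "ClickButton", "formSubmit123", "random_event"]
}
```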

Metrics Collection Standard

All metrics must be derived from events.

Metric Types

1. Engagement Metrics

  • sessions
  • active users
  • feature usage

2. Conversion Metrics

  • form completions
  • trial-to-paid conversion
  • click-through rates

3. Retention Metrics

  • churn
  • repeat usage
  • cohort retention

4. Value Metrics

  • CAC (customer acquisition cost)
  • CLTV (customer lifetime value)
  • revenue per user
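
To illustrate what "derived from events" can look like in practice, the sketch below computes one engagement metric and one conversion metric directly from an event list. The function names, the form_view event, and the sample data are assumptions made for the example.

```python
# Hypothetical event stream for illustration only.
events = [
    {"user_id": "u1", "event_name": "form_view"},
    {"user_id": "u1", "event_name": "form_submit"},
    {"user_id": "u2", "event_name": "form_view"},
    {"user_id": "u2", "event_name": "feature_click", "feature_name": "offer_intelligence"},
    {"user_id": "u3", "event_name": "form_view"},
    {"user_id": "u3", "event_name": "feature_click", "feature_name": "offer_intelligence"},
]


def active_users(events: list) -> int:
    """Engagement: number of distinct users with at least one event."""
    return len({e["user_id"] for e in events})


def feature_usage(events: list, feature_name: str) -> int:
    """Engagement: number of events recorded against one feature."""
    return sum(1 for e in events if e.get("feature_name") == feature_name)


def completion_rate(events: list, start_event: str, goal_event: str) -> float:
    """Conversion: share of users who fired start_event and also fired goal_event."""
    started = {e["user_id"] for e in events if e["event_name"] == start_event}
    completed = {e["user_id"] for e in events if e["event_name"] == goal_event}
    if not started:
        return 0.0  # avoid dividing by zero when nobody started
    return len(started & completed) / len(started)


form_rate = completion_rate(events, "form_view", "form_submit")
```

Because every metric is a function of the same event list, two reports that use the same function cannot disagree about the definition.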

Retroactive vs Classic Tracking Rule

MWMS must prefer retroactive tracking and restrict classic tracking to limited use.

Retroactive Tracking (Preferred)

  • capture as much data as possible
  • define events later

Benefit

  • flexibility
  • future analysis capability
  • less risk of permanent data loss

Classic Tracking (Limited Use)

  • predefined events only

Risk

  • permanently missing data
  • limited analysis
  • high dependency on dev changes
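
The difference between the two approaches can be sketched as follows: raw interactions are captured once, and event definitions are applied to them afterwards. The record shape (type, target), the derive_events function, and the matching rules are all assumptions made for this illustration, not a description of any particular analytics tool.

```python
# Raw interactions captured up front (hypothetical shape, for illustration only).
raw_interactions = [
    {"type": "pageload", "target": "/dashboard"},
    {"type": "click", "target": "btn-offer-intelligence"},
    {"type": "click", "target": "link-help"},
    {"type": "submit", "target": "form-signup"},
]

# Event definitions written after capture: event name -> rule over a raw record.
definitions = {
    "page_view": lambda r: r["type"] == "pageload",
    "feature_click": lambda r: r["type"] == "click" and r["target"].startswith("btn-"),
}


def derive_events(raw: list, definitions: dict) -> list:
    """Apply every definition to every raw record and emit the named events that match."""
    derived = []
    for record in raw:
        for event_name, matches in definitions.items():
            if matches(record):
                derived.append({**record, "event_name": event_name})
    return derived


initial = derive_events(raw_interactions, definitions)

# A new event is defined later; the already-captured data still answers it.
definitions["form_submit"] = lambda r: r["type"] == "submit"
later = derive_events(raw_interactions, definitions)
```

Under classic tracking the form submission would only exist from the day the form_submit event was implemented; here it is recovered from data already collected.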

Data Integrity Rules

1. No Undefined Events

Every event must:

  • have a defined purpose
  • map to a business objective

2. No Duplicate Event Definitions

The same action must always use the same event name.


3. No Silent Data Changes

All changes must:

  • be logged
  • be version controlled

4. Consistent Definitions

Metrics must:

  • be calculated the same way over time
  • never change logic mid-stream
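
Rules 1 to 3 lend themselves to an event registry that refuses incomplete or duplicate definitions and logs every change. The sketch below is one possible shape; the EventDefinition and EventRegistry names, the action key, and the change-log format are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class EventDefinition:
    event_name: str
    action: str              # hypothetical key identifying the user action being tracked
    purpose: str
    business_objective: str


class EventRegistry:
    """Hypothetical registry enforcing rules 1 to 3 at registration time."""

    def __init__(self) -> None:
        self._by_name: Dict[str, EventDefinition] = {}
        self._name_by_action: Dict[str, str] = {}
        self.change_log: List[Tuple[int, str, str]] = []  # (version, change, event_name)

    def register(self, definition: EventDefinition) -> None:
        # Rule 1: every event needs a purpose and a business objective.
        if not definition.purpose or not definition.business_objective:
            raise ValueError("event needs a purpose and a business objective")
        # Rule 2: one name per action, one action per name.
        if definition.event_name in self._by_name:
            raise ValueError(f"event name already defined: {definition.event_name}")
        existing = self._name_by_action.get(definition.action)
        if existing is not None:
            raise ValueError(f"action already tracked as: {existing}")
        self._by_name[definition.event_name] = definition
        self._name_by_action[definition.action] = definition.event_name
        # Rule 3: no silent changes; every accepted change gets a version entry.
        self.change_log.append((len(self.change_log) + 1, "register", definition.event_name))


registry = EventRegistry()
registry.register(EventDefinition(
    "form_submit", "signup_form_submitted",
    "measure signup completion", "grow trial signups",
))
```

Attempting to register a second name for signup_form_submitted, or an event with an empty purpose, raises ValueError and leaves the change log untouched.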

Data Ownership Rule

Each metric must have a single source of truth.

Examples:

  • sessions → Google Analytics
  • revenue → payment system
  • user events → product analytics tool

Never mix sources for the same metric.
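
The ownership rule can be expressed as a lookup that rejects any value arriving from a non-owning source. This is a minimal sketch: the source identifiers and the accept_metric function are hypothetical names chosen for the example, with the metric-to-source pairs taken from the examples above.

```python
# Owning source per metric, taken from the examples above.
# The identifier strings themselves are hypothetical.
METRIC_SOURCES = {
    "sessions": "google_analytics",
    "revenue": "payment_system",
    "user_events": "product_analytics_tool",
}


def accept_metric(metric: str, source: str, value: float) -> float:
    """Return the value only if it comes from the metric's single source of truth."""
    owner = METRIC_SOURCES.get(metric)
    if owner is None:
        raise KeyError(f"no source of truth defined for metric: {metric}")
    if source != owner:
        raise ValueError(f"{metric} must come from {owner}, not {source}")
    return value


revenue = accept_metric("revenue", "payment_system", 1250.0)
```

A revenue figure reported by the product analytics tool would be rejected rather than silently blended with the payment system's figure.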


Data Scope Protection Rule

Never mix:

  • user-level data
  • session-level data
  • event-level data

without clear transformation logic.
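
"Clear transformation logic" can be as simple as one named function per step between scopes. The sketch below rolls event-level rows up to session level, then session-level rows up to user level; the function names and output columns are assumptions made for illustration.

```python
# Hypothetical event-level rows for illustration only.
events = [
    {"user_id": "u1", "session_id": "s1", "event_name": "page_view"},
    {"user_id": "u1", "session_id": "s1", "event_name": "feature_click"},
    {"user_id": "u1", "session_id": "s2", "event_name": "page_view"},
    {"user_id": "u2", "session_id": "s3", "event_name": "form_submit"},
]


def events_to_sessions(events: list) -> list:
    """Event level -> session level: one row per session with its event count."""
    sessions = {}
    for e in events:
        row = sessions.setdefault(
            e["session_id"],
            {"session_id": e["session_id"], "user_id": e["user_id"], "event_count": 0},
        )
        row["event_count"] += 1
    return list(sessions.values())


def sessions_to_users(session_rows: list) -> list:
    """Session level -> user level: one row per user with session and event totals."""
    users = {}
    for s in session_rows:
        row = users.setdefault(
            s["user_id"],
            {"user_id": s["user_id"], "session_count": 0, "event_count": 0},
        )
        row["session_count"] += 1
        row["event_count"] += s["event_count"]
    return list(users.values())


session_rows = events_to_sessions(events)
user_rows = sessions_to_users(session_rows)
```

Each table holds rows of exactly one scope, and the only way to move between scopes is through a function whose logic can be reviewed.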


Data Collection Governance

Responsibilities

Data Brain

  • defines standards
  • ensures integrity

Product Brain

  • defines feature tracking needs

Experimentation Brain

  • defines test tracking requirements

SIT Brain

  • validates implementation

HeadOffice

  • enforces compliance

Validation Protocol

Before any data is accepted:

Must Confirm

  • event fires correctly
  • event data is accurate
  • naming matches standard
  • data appears in analytics tools
  • data matches expected behaviour
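
The five confirmations above can be run as a checklist against a test action. The sketch below assumes the tester performs the action a known number of times and then compares what the data layer fired with what the analytics tool received; the validate_event function, its parameters, and the interpretation of each check are assumptions made for this example.

```python
import re

# Format check only: lowercase words joined by underscores (two-word minimum is an assumption).
NAME_PATTERN = re.compile(r"[a-z]+(_[a-z]+)+")


def validate_event(expected: dict, fired: list, received: list, expected_count: int = 1) -> dict:
    """Run the five confirmations for one event and report each result.

    expected: field values the event should carry
    fired:    events observed leaving the data layer during the test
    received: events observed inside the analytics tool during the test
    """
    name = expected["event_name"]
    fired_match = [e for e in fired if e.get("event_name") == name]
    received_match = [e for e in received if e.get("event_name") == name]
    checks = {
        "fires_correctly": len(fired_match) == expected_count,
        "data_is_accurate": bool(fired_match) and all(
            e.get(k) == v for e in fired_match for k, v in expected.items()
        ),
        "naming_matches_standard": NAME_PATTERN.fullmatch(name) is not None,
        "appears_in_tool": len(received_match) >= 1,
        "matches_expected_behaviour": len(received_match) == expected_count,
    }
    checks["accepted"] = all(checks.values())
    return checks


# Hypothetical test run: the tester submits the signup form once.
expected = {"event_name": "form_submit", "user_id": "12345", "page_name": "signup"}
observed = {"event_name": "form_submit", "user_id": "12345",
            "page_name": "signup", "session_id": "abc123"}

clean_run = validate_event(expected, fired=[observed], received=[observed])
double_fire = validate_event(expected, fired=[observed, observed], received=[observed, observed])
```

The second run models a common defect, an event firing twice for one action: naming and field values are still correct, but the data is not accepted.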

Failure Consequences

If this standard is not followed:

  • incorrect insights
  • failed experiments
  • wasted ad spend
  • poor product decisions
  • system-wide data corruption

Key Principle

You cannot fix bad data later.

You can only:

  • prevent it
  • detect it early
  • enforce standards

Summary

This standard ensures:

  • structured data collection
  • reliable analytics
  • scalable system intelligence

It is the foundation of:

  • MWMS decision-making
  • MWMS automation
  • MWMS future AI systems

Status

Approved for integration into Data Brain Canon