Experimentation Brain Sample Size Governance Framework

Document Type: Framework
Status: Canon
Authority: HeadOffice
Applies To: Experimentation Brain, Affiliate Brain, Ads Brain, Conversion Brain, Data Brain, Finance Brain, HeadOffice
Parent: Experimentation Brain Canon
Version: v1.0
Last Reviewed: 2026-05-07


Purpose

The Sample Size Governance Framework defines how MWMS determines whether an experiment has collected enough observations to support reliable decision-making.

This framework ensures MWMS understands that sample size is not:

  • a vanity number
  • a random stopping target
  • a “gut feeling” estimate

It is a controlled statistical requirement connected to:

  • uncertainty reduction
  • signal reliability
  • detectable effect size
  • confidence quality
  • business risk management

The framework governs how MWMS plans and validates experiment sample requirements before scaling decisions are made.


Core Principle

Reliable decisions require sufficient evidence volume.


Definition

Sample size is the number of observations required for an experiment to detect a meaningful effect with acceptable statistical confidence and risk tolerance.


Structural Role

This framework connects:

Experimentation Brain
→ experiment planning systems

Affiliate Brain
→ offer testing governance

Ads Brain
→ campaign and creative testing

Conversion Brain
→ funnel experimentation systems

Data Brain
→ statistical reliability systems

Finance Brain
→ traffic and budget allocation

HeadOffice
→ experimentation oversight


Sample Size Reality

Small samples frequently create:

  • unstable outcomes
  • false confidence
  • noisy results
  • exaggerated lifts
  • inconsistent scaling decisions

Rule

Low evidence volume weakens decision quality.


Governance Principle

Sample size must be planned before traffic allocation.


Required Planning Inputs

Before estimating sample size, MWMS should define:

  • baseline performance
  • minimum meaningful effect
  • confidence threshold
  • acceptable risk (tolerated false positive and false negative rates)
  • expected traffic volume
  • experiment structure

Rule

Sample size planning must occur before experimentation begins.
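
As a minimal sketch of how the planning inputs above combine, the following Python function estimates the visitors needed per variant for a conversion-rate test, using the standard normal-approximation formula for comparing two proportions. The function name, the 80% power default, the equal traffic split, and the example figures are illustrative assumptions, not MWMS standards.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, relative_lift,
                            confidence=0.95, power=0.80):
    """Approximate visitors per variant for a two-sided two-proportion z-test.

    Assumes equal traffic per variant; power=0.80 is an assumed default.
    """
    p_control = baseline_rate
    p_variant = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p_control * (1 - p_control) + p_variant * (1 - p_variant)
    effect = p_variant - p_control
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / effect ** 2)


# Hypothetical inputs: 5% baseline conversion, 10% relative lift.
print(sample_size_per_variant(0.05, 0.10))
```

With these hypothetical inputs the estimate is roughly 31,000 visitors per variant, which is why the inputs need to be fixed before traffic is allocated.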


Baseline Performance Layer

Sample requirements depend on current performance levels.


Examples

  • baseline conversion rate
  • baseline CTR
  • baseline CPA
  • baseline revenue per visitor

Rule

Lower baseline rates usually require larger samples to detect the same relative lift.


Minimum Effect Layer

Smaller detectable effects require larger samples.


Examples

  • 1% lift
  • 5% lift
  • 10% lift

Rule

The smaller the minimum effect that must be detected, the larger the required evidence set.
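
The interaction between baseline rate and minimum effect can be illustrated with a small grid. This is a sketch only: it treats each lift as a relative lift over the baseline, uses a normal-approximation formula for two proportions, and assumes 95% confidence and 80% power. The function name and all input values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_variant(baseline_rate, relative_lift, confidence=0.95, power=0.80):
    """Normal-approximation sample size for a two-sided two-proportion test."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2) + NormalDist().inv_cdf(power)
    return ceil(z ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p2 - p1) ** 2)


# Hypothetical baselines and relative lifts, chosen only for illustration.
baselines = (0.01, 0.05, 0.20)
lifts = (0.01, 0.05, 0.10)

print("baseline  " + "".join(f"{lift:>14.0%}" for lift in lifts))
for rate in baselines:
    cells = "".join(f"{visitors_per_variant(rate, lift):>14,}" for lift in lifts)
    print(f"{rate:>8.0%}  " + cells)
```

Under these assumptions, halving the detectable lift roughly quadruples the required sample, and a lower baseline pushes every cell in its row higher.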


Confidence Layer

Higher confidence thresholds increase required sample size.


Examples

  • 80% confidence
  • 90% confidence
  • 95% confidence
  • 99% confidence

Rule

Confidence strength increases statistical cost.
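
To make the statistical cost concrete, the sketch below compares confidence thresholds by the sample size they imply relative to a 95% reference, holding everything else constant. It interprets "confidence" as one minus the two-sided significance level and assumes 80% power; both are assumptions of this example, as is the function name.

```python
from statistics import NormalDist


def relative_sample_cost(confidence, power=0.80, reference_confidence=0.95):
    """Sample size multiplier versus the reference confidence level, all else equal."""

    def z_sum_squared(level):
        z_alpha = NormalDist().inv_cdf(1 - (1 - level) / 2)
        z_beta = NormalDist().inv_cdf(power)
        return (z_alpha + z_beta) ** 2

    return z_sum_squared(confidence) / z_sum_squared(reference_confidence)


for level in (0.80, 0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> {relative_sample_cost(level):.2f}x the 95% sample")
```

Under these assumptions, moving from 95% to 99% confidence raises the requirement by roughly half, while dropping to 80% cuts it by a little over 40%.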


Variability Layer

High-variance environments require more observations.


Examples

  • unstable traffic
  • inconsistent user behavior
  • seasonal volatility
  • fluctuating conversion rates

Rule

Noise increases evidence requirements.
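
For continuous metrics such as revenue per visitor, the effect of noise can be sketched with the usual normal-approximation formula for comparing two means. The function name, the 80% power default, and the standard deviation and difference values are hypothetical; the formula assumes equal variance and equal traffic in both arms.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(std_dev, min_detectable_diff, confidence=0.95, power=0.80):
    """Approximate visitors per variant to detect a difference in means."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 * std_dev ** 2 / min_detectable_diff ** 2)


# Hypothetical revenue-per-visitor noise levels, same minimum difference of 0.50.
for std_dev in (4.0, 8.0, 16.0):
    print(f"std dev {std_dev:>5.1f}: {sample_size_for_mean(std_dev, 0.50):,} per variant")
```

Because the standard deviation enters squared, doubling the noise roughly quadruples the sample needed for the same detectable difference.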


Traffic Segmentation Layer

Sample planning must reflect actual audience segmentation.


Examples

  • mobile users
  • desktop users
  • cold traffic
  • retargeting traffic
  • geographic traffic
  • paid traffic only

Rule

Incorrect segmentation assumptions weaken planning validity.


Multi-Variant Layer

Each additional variant increases total sample requirements, because traffic is split across more arms and more comparisons are made against the control.


Examples

  • A/B testing
  • A/B/n testing
  • multiple creative tests
  • multiple offer tests

Rule

More variants dilute available evidence.
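
A rough sketch of this dilution follows. It assumes one control plus several challengers, each compared with control, and applies a Bonferroni correction to the significance level. Bonferroni is only one (conservative) way to handle multiple comparisons and is an assumption of this example, as are the function name and input values.

```python
from math import ceil
from statistics import NormalDist


def multi_variant_plan(baseline_rate, relative_lift, challengers,
                       confidence=0.95, power=0.80):
    """Return (visitors per arm, total visitors) for one control plus `challengers` arms.

    Each challenger is compared with control at a Bonferroni-adjusted
    significance level (an assumed, deliberately conservative correction).
    """
    alpha = (1 - confidence) / challengers
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z = NormalDist().inv_cdf(1 - alpha / 2) + NormalDist().inv_cdf(power)
    per_arm = ceil(z ** 2 * (p1 * (1 - p1) + p2 * (1 - p2)) / (p2 - p1) ** 2)
    return per_arm, per_arm * (challengers + 1)


for challengers in (1, 2, 3, 4):
    per_arm, total = multi_variant_plan(0.05, 0.10, challengers)
    print(f"{challengers} challenger(s): {per_arm:,} per arm, {total:,} total")
```

Total traffic grows faster than the number of arms, since every arm needs more visitors once the significance level is tightened.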


Sequential Testing Layer

Repeatedly checking interim results and stopping at the first significant reading inflates the false positive rate unless the stopping rule is defined in advance.


Rule

Sequential evaluation requires predefined stopping governance.
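
The distortion can be demonstrated with a small simulation of A/A tests, where both arms share the same true conversion rate and any "winner" is therefore a false positive. This is an illustrative sketch: the number of runs, looks, visitors per look, conversion rate, and seed are arbitrary assumptions, and the test used is a simple pooled two-proportion z-test.

```python
import random
from math import sqrt
from statistics import NormalDist

Z_CRIT = NormalDist().inv_cdf(0.975)  # two-sided 95% threshold


def is_significant(conv_a, conv_b, n):
    """Pooled two-proportion z-test with n visitors in each arm."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = sqrt(2 * pooled * (1 - pooled) / n)
    if se == 0:
        return False
    return abs(conv_a - conv_b) / n / se > Z_CRIT


def simulate_aa_tests(runs=1000, looks=5, per_look=200, rate=0.10, seed=7):
    """Return (share flagged at any look, share flagged at the final look only)."""
    rng = random.Random(seed)
    peeking_hits = 0
    final_hits = 0
    for _ in range(runs):
        conv_a = conv_b = n = 0
        any_look = False
        last_look = False
        for _ in range(looks):
            conv_a += sum(rng.random() < rate for _ in range(per_look))
            conv_b += sum(rng.random() < rate for _ in range(per_look))
            n += per_look
            last_look = is_significant(conv_a, conv_b, n)
            any_look = any_look or last_look
        peeking_hits += any_look
        final_hits += last_look
    return peeking_hits / runs, final_hits / runs


peeking_rate, final_rate = simulate_aa_tests()
print(f"false positives when stopping at any look: {peeking_rate:.1%}")
print(f"false positives when judged once at the end: {final_rate:.1%}")
```

Judging only at the planned end keeps the false positive rate near the nominal 5%, while acting on any interim reading typically pushes it well above that. Predefined stopping governance, for example a fixed horizon or a formal sequential design, is what closes that gap.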


Duration Planning Layer

Sample size influences expected experiment duration.


Examples

  • low-traffic tests may require extended timelines
  • high-traffic tests may reach validity rapidly

Rule

Sample sufficiency and duration are structurally connected.
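
The connection can be sketched as simple arithmetic: total required sample divided by eligible daily traffic. Rounding up to full weeks is an assumption of this example (a common way to cover weekday and weekend behavior), and the function name and traffic figures are hypothetical.

```python
from math import ceil


def estimated_duration_days(per_variant_sample, num_arms, daily_eligible_visitors,
                            round_to_full_weeks=True):
    """Estimate how many days are needed to reach the planned sample."""
    total_sample = per_variant_sample * num_arms
    days = ceil(total_sample / daily_eligible_visitors)
    if round_to_full_weeks:
        # Assumed convention: run whole weeks so each weekday is equally represented.
        days = ceil(days / 7) * 7
    return days


# Hypothetical plan: 30,000 visitors per variant, two arms.
print(estimated_duration_days(30_000, 2, daily_eligible_visitors=3_000))
print(estimated_duration_days(30_000, 2, daily_eligible_visitors=500))
```

If the estimated duration is longer than the business can accept, the honest options are to test for a larger minimum effect, accept more risk, or find more traffic, not to stop early.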


Practical Business Layer

Statistical perfection is not always operationally optimal.

MWMS must balance:

  • rigor
  • speed
  • traffic availability
  • business urgency
  • decision impact

Rule

Business context influences acceptable sample thresholds.


False Positive Layer

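Small samples raise the risk of acting on false positives, because random fluctuations are large relative to true effects and early readings are often judged before the planned sample is reached.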


Examples

  • random spikes mistaken for winners
  • unstable lifts interpreted as scalable outcomes

Rule

Weak evidence can create dangerous scaling decisions.


False Negative Layer

Insufficient samples may hide true improvements.


Examples

  • profitable creatives rejected
  • valuable offers abandoned
  • winning funnels killed prematurely

Rule

Insufficient evidence creates hidden opportunity loss.
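
The size of this hidden loss can be estimated as statistical power: the chance that a test of a given size detects a real improvement. The sketch below uses a normal approximation for a two-sided two-proportion test; the function name and the 5% baseline with a true 10% relative lift are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def achieved_power(baseline_rate, relative_lift, visitors_per_variant, confidence=0.95):
    """Approximate chance that a two-sided z-test detects a true lift of the given size."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    standard_error = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / visitors_per_variant)
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # The negligible chance of a significant result in the wrong direction is ignored.
    return NormalDist().cdf(abs(p2 - p1) / standard_error - z_alpha)


for visitors in (2_000, 5_000, 15_000, 30_000):
    power = achieved_power(0.05, 0.10, visitors)
    print(f"{visitors:>6,} per variant: power {power:.0%}, miss rate {1 - power:.0%}")
```

In this hypothetical case, a test stopped at 5,000 visitors per variant would miss a genuine 10% lift about four times out of five.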


Stopping Governance Layer

Before launch, experiments should define:

  • minimum sample targets
  • stopping conditions
  • review intervals
  • acceptable uncertainty


Rule

Stopping decisions must not become emotional reactions.


Experiment Type Layer

Different experiment structures require different evidence levels.


Examples

  • landing page tests
  • CTA tests
  • pricing tests
  • offer tests
  • traffic source tests
  • VSL tests

Rule

Experiment complexity influences sample requirements.


Traffic Allocation Layer

Traffic distribution impacts sample accumulation speed.


Examples

  • equal split testing
  • weighted traffic allocation
  • exploration vs exploitation logic

Rule

Traffic allocation affects evidence reliability.
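
One way to quantify this is to compare an unequal split with an even one. For a two-arm test with similar variance in both arms (an assumption of this sketch), the total traffic needed for the same precision grows by a factor of 1 / (4 × share × (1 − share)), where share is the fraction of traffic sent to the variant. The function name is hypothetical.

```python
def total_sample_inflation(variant_share):
    """Factor by which total traffic must grow versus a 50/50 split for equal precision.

    Assumes two arms with similar outcome variance.
    """
    if not 0 < variant_share < 1:
        raise ValueError("variant_share must be strictly between 0 and 1")
    return 1 / (4 * variant_share * (1 - variant_share))


for share in (0.50, 0.30, 0.20, 0.10):
    print(f"{share:.0%} to variant -> {total_sample_inflation(share):.2f}x total traffic")
```

Weighted allocation can still be the right business choice, for example to limit exposure to an unproven offer, but the extra traffic and time it costs should be part of the plan.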


Statistical Integrity Layer

Sample size governance strengthens:

  • predictive reliability
  • result stability
  • confidence quality
  • scaling confidence
  • decision defensibility

Rule

Strong evidence improves long-term optimization systems.


Measurement Layer

MWMS should track:

  • target sample size
  • achieved sample size
  • sample completion rate
  • confidence progression
  • variance observations
  • estimated uncertainty

Rule

Experiment sufficiency must remain measurable.
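
A minimal sketch of how these measurements might be held for one experiment arm is shown below. The class name, field names, and example numbers are hypothetical, and the uncertainty estimate uses a simple normal-approximation (Wald) interval on the conversion rate.

```python
from dataclasses import dataclass
from math import sqrt
from statistics import NormalDist


@dataclass
class VariantProgress:
    """Hypothetical tracking record for one experiment arm."""
    name: str
    target_sample: int
    visitors: int
    conversions: int

    @property
    def completion_rate(self) -> float:
        """Achieved sample as a fraction of the target sample."""
        return self.visitors / self.target_sample

    @property
    def conversion_rate(self) -> float:
        return self.conversions / self.visitors if self.visitors else 0.0

    def interval_half_width(self, confidence: float = 0.95) -> float:
        """Half-width of a normal-approximation (Wald) interval on the conversion rate."""
        if self.visitors == 0:
            return float("inf")
        z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
        p = self.conversion_rate
        return z * sqrt(p * (1 - p) / self.visitors)


arm = VariantProgress("control", target_sample=30_000, visitors=7_500, conversions=375)
print(f"{arm.name}: {arm.completion_rate:.0%} of target, "
      f"rate {arm.conversion_rate:.2%} +/- {arm.interval_half_width():.2%}")
```

Tracking the interval half-width alongside completion rate shows how estimated uncertainty narrows as the achieved sample approaches the target.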


Cross Brain Integration

Experimentation Brain
→ owns sample size governance

Affiliate Brain
→ applies sample governance to offer testing

Ads Brain
→ governs campaign and creative evidence quality

Conversion Brain
→ applies sample governance to funnel experiments

Data Brain
→ validates statistical assumptions and variance

Finance Brain
→ evaluates testing cost efficiency

HeadOffice
→ governance and experimentation oversight


Failure Modes Prevented

This framework prevents:

  • premature scaling decisions
  • false confidence
  • underpowered testing
  • random noise interpretation
  • unstable optimization systems
  • wasted traffic allocation


Drift Protection

The system must prevent:

  • arbitrary stopping decisions
  • emotional interpretation of small samples
  • uncontrolled multi-variant testing
  • ignoring variance
  • weak segmentation assumptions
  • traffic starvation during testing


Architectural Intent

This framework transforms MWMS experimentation thinking from:

→ traffic guessing systems

into:

→ governed evidence accumulation systems

It ensures MWMS develops:

  • reliable scaling logic
  • evidence-based experimentation
  • statistically defensible optimization
  • structured uncertainty management
  • long-term experimentation stability


Final Rule

If the sample is insufficient:

→ the confidence behind the decision becomes unstable.


Change Log

Version: v1.0

Date: 2026-05-07
Author: HeadOffice

Change:
Created Sample Size Governance Framework defining evidence sufficiency planning, variance-aware experimentation, stopping governance, segmentation-aware sample planning, and structured statistical reliability systems.


Change Impact Declaration

Pages Created:
Experimentation Brain Sample Size Governance Framework

Pages Updated:
None

Pages Deprecated:
None

Registries Requiring Update:
MWMS Architecture Registry
Experimentation Brain Page Registry

Canon Version Update Required:
No

Change Log Entry Required:
Yes


END EXPERIMENTATION BRAIN SAMPLE SIZE GOVERNANCE FRAMEWORK v1.0