Document Type: Framework
Status: Canon
Authority: HeadOffice
Applies To: Experimentation Brain, Affiliate Brain, Ads Brain, Conversion Brain, Data Brain, Finance Brain, HeadOffice
Parent: Experimentation Brain Canon
Version: v1.0
Last Reviewed: 2026-05-07
Purpose
The Sample Size Governance Framework defines how MWMS determines whether an experiment has sufficient observational data to support reliable decision-making.
This framework establishes that, within MWMS, sample size is not:
- a vanity number
- a random stopping target
- a “gut feeling” estimate
It is a controlled statistical requirement connected to:
- uncertainty reduction
- signal reliability
- detectable effect size
- confidence quality
- business risk management
The framework governs how MWMS plans and validates experiment sample requirements before scaling decisions are made.
Core Principle
Reliable decisions require sufficient evidence volume.
Definition
Sample size is the number of observations an experiment requires to detect a predefined minimum meaningful effect at the chosen confidence threshold, with an acceptable risk of missing a real effect.
Structural Role
This framework connects:
Experimentation Brain
→ experiment planning systems
Affiliate Brain
→ offer testing governance
Ads Brain
→ campaign and creative testing
Conversion Brain
→ funnel experimentation systems
Data Brain
→ statistical reliability systems
Finance Brain
→ traffic and budget allocation
HeadOffice
→ experimentation oversight
Sample Size Reality
Small samples frequently create:
- unstable outcomes
- false confidence
- noisy results
- exaggerated lifts
- inconsistent scaling decisions
Rule
Low evidence volume weakens decision quality.
Governance Principle
Sample size must be planned before traffic allocation.
Required Planning Inputs
Before estimating sample size, MWMS should define:
- baseline performance
- minimum meaningful effect
- confidence threshold
- acceptable risk
- expected traffic volume
- experiment structure
Rule
Sample size planning must occur before experimentation begins.
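The planning inputs above can be combined into a single estimate. The following is a minimal sketch, assuming a conversion-rate metric, a two-sided two-proportion z-test, and an equal traffic split. The function name, parameter names, and default values (5% significance, 80% power) are illustrative assumptions, not MWMS standards.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant (two-sided two-proportion z-test)."""
    p1 = baseline_rate                          # baseline performance
    p2 = baseline_rate * (1 + relative_lift)    # minimum meaningful effect
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # confidence threshold
    z_beta = NormalDist().inv_cdf(power)           # acceptable risk of a miss
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p2 - p1) ** 2)


# Hypothetical plan: 5% baseline conversion, 10% relative lift worth detecting.
print(sample_size_per_variant(baseline_rate=0.05, relative_lift=0.10))
```

Under these assumed inputs the estimate is roughly 31,000 visitors per variant, which shows why the inputs need to be fixed before traffic is committed.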
Baseline Performance Layer
Sample requirements depend on current performance levels.
Examples
- baseline conversion rate
- baseline CTR
- baseline CPA
- baseline revenue per visitor
Rule
For the same relative lift, lower baseline rates generally require larger samples.
Minimum Effect Layer
Smaller detectable effects require larger samples.
Examples
- 1% lift
- 5% lift
- 10% lift
Rule
The smaller the expected effect, the larger the required evidence set.
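The relationship between baseline, effect size, and sample can be written as a standard normal-approximation formula. This is a sketch for a conversion-rate metric with equal-sized variants. The symbols are assumptions introduced here: n is the sample per variant, p the baseline rate, δ the absolute difference to detect, r the relative lift (so δ = p·r), α the significance level, and 1−β the statistical power.

```latex
n \approx \frac{2\,\bigl(z_{1-\alpha/2} + z_{1-\beta}\bigr)^{2}\, p\,(1-p)}{\delta^{2}}
\qquad\text{and, with } \delta = p\,r:\qquad
n \approx \frac{2\,\bigl(z_{1-\alpha/2} + z_{1-\beta}\bigr)^{2}\,(1-p)}{p\,r^{2}}
```

Because the effect appears squared in the denominator, halving the lift to be detected roughly quadruples the required sample. At a fixed relative lift, a lower baseline p also raises n.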
Confidence Layer
Higher confidence thresholds increase required sample size.
Examples
- 80% confidence
- 90% confidence
- 95% confidence
- 99% confidence
Rule
Confidence strength increases statistical cost.
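The statistical cost of a stricter threshold can be estimated before launch. The sketch below assumes a two-sided test at 80% power and expresses each confidence level as a multiple of the sample needed at 95%. The function name and the choice of 95% as the reference are illustrative assumptions.

```python
from statistics import NormalDist


def relative_sample_cost(confidence, power=0.80, reference_confidence=0.95):
    """Sample size at `confidence` as a multiple of the reference level."""
    def factor(conf):
        z_alpha = NormalDist().inv_cdf(1 - (1 - conf) / 2)
        z_beta = NormalDist().inv_cdf(power)
        # Required sample is proportional to this squared sum of z-values.
        return (z_alpha + z_beta) ** 2

    return factor(confidence) / factor(reference_confidence)


for level in (0.80, 0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence -> {relative_sample_cost(level):.2f}x the 95% sample")
```

The multiplier rises faster as the threshold approaches 99%, which is the trade-off the rule above describes.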
Variability Layer
High-variance environments require more observations.
Examples
- unstable traffic
- inconsistent user behavior
- seasonal volatility
- fluctuating conversion rates
Rule
Noise increases evidence requirements.
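For continuous metrics such as revenue per visitor, the effect of noise can be made explicit. This is a minimal sketch, assuming a two-sided comparison of means with equal variance in both variants. The function name and the example figures are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_mean(std_dev, min_detectable_diff, alpha=0.05, power=0.80):
    """Approximate observations per variant to detect a difference in means."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    # Variance (std_dev squared) scales the requirement linearly.
    return ceil(2 * std_dev ** 2 * (z_alpha + z_beta) ** 2 / min_detectable_diff ** 2)


# Hypothetical: detect a 0.25 change in revenue per visitor.
stable = sample_size_for_mean(std_dev=4.0, min_detectable_diff=0.25)
noisy = sample_size_for_mean(std_dev=8.0, min_detectable_diff=0.25)
print(stable, noisy)
```

Doubling the standard deviation roughly quadruples the required observations, so variance estimates from Data Brain feed directly into the plan.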
Traffic Segmentation Layer
Sample planning must reflect actual audience segmentation.
Examples
- mobile users
- desktop users
- cold traffic
- retargeting traffic
- geographic traffic
- paid traffic only
Rule
Incorrect segmentation assumptions weaken planning validity.
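When results will be read per segment, the smallest segment sets the overall traffic requirement. The sketch below assumes every segment needs the same per-segment sample and that traffic arrives in fixed proportions. The function name and the segment shares are hypothetical.

```python
from math import ceil


def total_traffic_for_segments(required_per_segment, segment_shares):
    """Total visitors needed so that every segment reaches its own target."""
    # The segment with the smallest traffic share is the binding constraint.
    return max(ceil(required_per_segment / share) for share in segment_shares.values())


shares = {"mobile": 0.6, "desktop": 0.3, "retargeting": 0.1}
print(total_traffic_for_segments(10_000, shares))
```

A plan that assumes 10,000 visitors is enough overall would leave the retargeting segment with about a tenth of the evidence it needs.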
Multi Variant Layer
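Each additional variant needs its own full sample, and comparing several variants against a control raises the chance that at least one appears to win by chance.
Examples
- A/B testing
- A/B/n testing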
- multiple creative tests
- multiple offer tests
Rule
More variants dilute available evidence.
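One common way to account for extra variants is a Bonferroni correction, which divides the significance level by the number of comparisons against control. The sketch below uses that correction as an assumption. It is a conservative choice, not a method prescribed by this framework, and all names are illustrative.

```python
from math import ceil
from statistics import NormalDist


def per_arm_sample(p1, p2, alpha, power):
    """Per-arm sample for a two-sided two-proportion z-test (unpooled variance)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def multi_variant_plan(baseline, relative_lift, n_variants, alpha=0.05, power=0.80):
    """Plan an A/B/n test where each variant is compared against one control."""
    adjusted_alpha = alpha / n_variants  # Bonferroni: one comparison per variant
    per_arm = per_arm_sample(baseline, baseline * (1 + relative_lift), adjusted_alpha, power)
    return {
        "adjusted_alpha": adjusted_alpha,
        "per_arm": per_arm,
        "total": per_arm * (n_variants + 1),  # variants plus control
    }


ab = multi_variant_plan(0.05, 0.10, n_variants=1)
abn = multi_variant_plan(0.05, 0.10, n_variants=3)
print(ab["total"], abn["total"])
```

Total traffic grows for two reasons at once: there are more arms to fill, and each arm needs more observations under the stricter threshold.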
Sequential Testing Layer
Checking results repeatedly and stopping at the first significant reading inflates the false positive rate above the planned threshold unless interim checks are governed in advance.
Rule
Sequential evaluation requires predefined stopping governance.
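The distortion from ungoverned interim checks can be demonstrated with a small simulation. The sketch below runs A/A tests, where both arms share the same true conversion rate, and compares how often a "winner" is declared when every interim look counts versus only the final look. All parameters (500 runs, 5 looks, 200 visitors per arm per look, 10% rate) are arbitrary assumptions chosen to keep the run short.

```python
import random
from math import sqrt
from statistics import NormalDist


def z_statistic(conv_a, conv_b, n):
    """Pooled two-proportion z statistic for two arms of equal size n."""
    pooled = (conv_a + conv_b) / (2 * n)
    se = sqrt(2 * pooled * (1 - pooled) / n)
    return 0.0 if se == 0 else (conv_b / n - conv_a / n) / se


def simulate_aa_tests(runs=500, looks=5, batch=200, rate=0.10, alpha=0.05, seed=7):
    """Return (false positive rate with peeking, false positive rate at final look)."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    peeking_hits = final_hits = 0
    for run in range(runs):
        conv_a = conv_b = n = 0
        crossed_at_any_look = False
        for look in range(looks):
            # Both arms draw from the same rate, so any "win" is a false positive.
            conv_a += sum(rng.random() < rate for _ in range(batch))
            conv_b += sum(rng.random() < rate for _ in range(batch))
            n += batch
            significant = abs(z_statistic(conv_a, conv_b, n)) > critical
            crossed_at_any_look = crossed_at_any_look or significant
        peeking_hits += crossed_at_any_look
        final_hits += significant  # result at the last, planned look only
    return peeking_hits / runs, final_hits / runs


peeking_rate, final_rate = simulate_aa_tests()
print(f"stop at any significant look: {peeking_rate:.3f}, final look only: {final_rate:.3f}")
```

Because the final look is one of the interim looks, the peeking rate can never be lower than the final-look rate. Predefined stopping governance, such as a fixed analysis point or a stricter per-look threshold, is what keeps the planned error rate intact.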
Duration Planning Layer
Required sample size and available traffic together determine expected experiment duration.
Examples
- low traffic tests may require extended timelines
- high traffic tests may reach validity rapidly
Rule
Sample sufficiency and duration are structurally connected.
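The link between sample and duration is simple arithmetic. The sketch below assumes steady daily eligible traffic and rounds up to full weeks so that every weekday is represented equally. The rounding convention and all names are assumptions, not MWMS rules.

```python
from math import ceil


def estimated_duration_days(sample_per_variant, n_arms, daily_eligible_visitors,
                            round_to_full_weeks=True):
    """Days needed to fill every arm, given total daily traffic entering the test."""
    total_needed = sample_per_variant * n_arms
    days = ceil(total_needed / daily_eligible_visitors)
    if round_to_full_weeks:
        days = ceil(days / 7) * 7  # cover complete weekly cycles
    return days


# Hypothetical: 31,000 per variant, control plus one variant, 4,000 visitors per day.
print(estimated_duration_days(31_000, n_arms=2, daily_eligible_visitors=4_000))  # 21
```

If the estimated duration exceeds what the business can tolerate, the planning inputs should be revisited rather than the test stopped early.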
Practical Business Layer
Statistical perfection is not always operationally optimal.
MWMS must balance:
- rigor
- speed
- traffic availability
- business urgency
- decision impact
Rule
Business context influences acceptable sample thresholds.
False Positive Layer
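Small samples make results volatile, so random spikes are more easily read as winners, especially when a test is stopped early or judged without a fixed significance threshold.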
Examples
- random spikes mistaken for winners
- unstable lifts interpreted as scalable outcomes
Rule
Weak evidence can create dangerous scaling decisions.
False Negative Layer
Insufficient samples may hide true improvements.
Examples
- profitable creatives rejected
- valuable offers abandoned
- winning funnels killed prematurely
Rule
Insufficient evidence creates hidden opportunity loss.
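The risk of a hidden miss can be quantified as statistical power: the probability that a test of a given size detects a real effect. This is a minimal sketch, assuming a conversion-rate metric and a two-sided two-proportion z-test. The function name and example figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def achieved_power(baseline_rate, relative_lift, n_per_variant, alpha=0.05):
    """Approximate chance of detecting a true lift with n_per_variant visitors per arm."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    se = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_variant)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # Ignores the negligible chance of significance in the wrong direction.
    return NormalDist().cdf(abs(p2 - p1) / se - z_alpha)


# Hypothetical: a real 10% lift on a 5% baseline, tested with 5,000 visitors per arm.
print(f"{achieved_power(0.05, 0.10, 5_000):.2f}")
```

Under these assumptions the power is roughly 20 percent, meaning a genuinely better variant would be missed about four times in five.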
Stopping Governance Layer
Experiments should define:
- minimum sample targets
- stopping conditions
- review intervals
- acceptable uncertainty
before launch.
Rule
Stopping decisions must not become emotional reactions.
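A stopping plan can be recorded as data before launch and then applied mechanically at each review. The sketch below is one hypothetical shape for such a plan. The class, field names, thresholds, and decision order are assumptions for illustration and do not describe an existing MWMS component.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoppingPlan:
    """Stopping governance agreed before launch (all fields hypothetical)."""
    min_sample_per_variant: int
    max_duration_days: int
    review_interval_days: int
    max_ci_half_width: float  # acceptable uncertainty on the measured rate


def stopping_decision(plan, sample_per_variant, days_running, ci_half_width):
    """Apply the predefined plan; no inputs other than the agreed ones are used."""
    if days_running >= plan.max_duration_days:
        return "stop: maximum duration reached"
    if days_running % plan.review_interval_days != 0:
        return "continue: not a scheduled review"
    if sample_per_variant < plan.min_sample_per_variant:
        return "continue: minimum sample not reached"
    if ci_half_width > plan.max_ci_half_width:
        return "continue: uncertainty above agreed limit"
    return "stop: predefined conditions met"


plan = StoppingPlan(min_sample_per_variant=31_000, max_duration_days=28,
                    review_interval_days=7, max_ci_half_width=0.005)
print(stopping_decision(plan, sample_per_variant=12_000, days_running=14, ci_half_width=0.009))
```

Freezing the plan object makes it harder to adjust thresholds mid-test in response to an exciting or disappointing interim result.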
Experiment Type Layer
Different experiment structures require different evidence levels.
Examples
- landing page tests
- CTA tests
- pricing tests
- offer tests
- traffic source tests
- VSL tests
Rule
Experiment complexity influences sample requirements.
Traffic Allocation Layer
Traffic distribution determines how quickly each variant accumulates its sample; the variant receiving the least traffic sets the pace.
Examples
- equal split testing
- weighted traffic allocation
- exploration vs exploitation logic
Rule
Traffic allocation affects evidence reliability.
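The cost of an unequal split can be estimated directly. The sketch below assumes two arms with similar variance and returns how much more total traffic a weighted split needs to match the precision of a 50/50 split. The function name is illustrative.

```python
def allocation_inflation(treatment_share):
    """Total-traffic multiplier versus an equal split, for the same precision."""
    if not 0 < treatment_share < 1:
        raise ValueError("treatment_share must be strictly between 0 and 1")
    # The variance of the difference is proportional to 1 / (w * (1 - w)),
    # which is smallest at w = 0.5.
    return 1 / (4 * treatment_share * (1 - treatment_share))


for share in (0.5, 0.3, 0.1):
    print(f"{share:.0%} to treatment -> {allocation_inflation(share):.2f}x total traffic")
```

A 90/10 split needs close to three times the total traffic of an equal split, which is the trade-off behind exploration versus exploitation logic.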
Statistical Integrity Layer
Sample size governance strengthens:
- predictive reliability
- result stability
- confidence quality
- scaling confidence
- decision defensibility
Rule
Strong evidence improves long-term optimization systems.
Measurement Layer
MWMS should track:
- target sample size
- achieved sample size
- sample completion rate
- confidence progression
- variance observations
- estimated uncertainty
Rule
Experiment sufficiency must remain measurable.
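Several of the tracked measures can be derived from three raw counts. This is a minimal sketch for a single variant with a conversion-rate metric, using a normal-approximation confidence interval. The function and key names are assumptions and do not correspond to an existing MWMS schema.

```python
from math import sqrt
from statistics import NormalDist


def sufficiency_snapshot(target_sample, achieved_sample, conversions, confidence=0.95):
    """Summarize how far one variant is from its planned evidence level."""
    if achieved_sample <= 0:
        raise ValueError("achieved_sample must be positive")
    rate = conversions / achieved_sample
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return {
        "sample_completion_rate": achieved_sample / target_sample,
        "observed_rate": rate,
        # Half-width of the confidence interval: the estimated uncertainty.
        "ci_half_width": z * sqrt(rate * (1 - rate) / achieved_sample),
    }


snapshot = sufficiency_snapshot(target_sample=20_000, achieved_sample=5_000, conversions=250)
print(snapshot)
```

Recording the snapshot at each review interval gives a progression of uncertainty over time, which can be compared against the acceptable uncertainty defined before launch.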
Cross Brain Integration
Experimentation Brain
→ owns sample size governance
Affiliate Brain
→ applies sample governance to offer testing
Ads Brain
→ governs campaign and creative evidence quality
Conversion Brain
→ applies sample governance to funnel experiments
Data Brain
→ validates statistical assumptions and variance
Finance Brain
→ evaluates testing cost efficiency
HeadOffice
→ governance and experimentation oversight
Failure Modes Prevented
This framework prevents:
- premature scaling decisions
- false confidence
- underpowered testing
- random noise interpretation
- unstable optimization systems
- wasted traffic allocation
Drift Protection
The system must prevent:
- arbitrary stopping decisions
- emotional interpretation of small samples
- uncontrolled multi-variant testing
- ignoring variance
- weak segmentation assumptions
- traffic starvation during testing
Architectural Intent
This framework transforms MWMS experimentation thinking from:
→ traffic guessing systems
into:
→ governed evidence accumulation systems
It ensures MWMS develops:
- reliable scaling logic
- evidence-based experimentation
- statistically defensible optimization
- structured uncertainty management
- long-term experimentation stability
Final Rule
If the sample is insufficient:
→ the confidence behind the decision becomes unstable.
Change Log
Version: v1.0
Date: 2026-05-07
Author: HeadOffice
Change:
Created Sample Size Governance Framework defining evidence sufficiency planning, variance-aware experimentation, stopping governance, segmentation-aware sample planning, and structured statistical reliability systems.
Change Impact Declaration
Pages Created:
Experimentation Brain Sample Size Governance Framework
Pages Updated:
None
Pages Deprecated:
None
Registries Requiring Update:
MWMS Architecture Registry
Experimentation Brain Page Registry
Canon Version Update Required:
No
Change Log Entry Required:
Yes