Experimentation Brain Statistical Power Framework

Document Type: Framework
Status: Canon
Authority: HeadOffice
Applies To: Experimentation Brain, Data Brain, Affiliate Brain, Ads Brain, Conversion Brain, Finance Brain, Research Brain, HeadOffice
Parent: Experimentation Brain Canon
Version: v1.0
Last Reviewed: 2026-05-07


Purpose

The Statistical Power Framework defines how MWMS governs the probability that an experiment detects a meaningful effect when a true improvement exists.

This framework ensures MWMS understands that weak experimentation systems may fail not because improvements are absent, but because:

  • evidence volume is insufficient
  • variance is excessive
  • measurement is unstable
  • true effects are too small to detect
  • experimentation environments are underpowered

The framework governs how MWMS builds sufficiently sensitive experimentation systems while balancing operational cost and decision reliability.


Core Principle

An experiment cannot reliably detect what it does not have enough power to observe.


Definition

Statistical power is the probability that an experimentation system will correctly detect a meaningful effect when that effect genuinely exists.
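
In conventional hypothesis-testing notation, the definition above can be written as follows. The symbols are the standard ones, not MWMS-specific terms: H_0 is the hypothesis of no effect, H_1 is the hypothesis that a meaningful effect exists, and β is the false negative (Type II error) rate.

```latex
\text{Power} = 1 - \beta = P(\text{reject } H_0 \mid H_1 \text{ is true})
```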


Structural Role

This framework connects:

Experimentation Brain
→ experimentation sensitivity governance

Data Brain
→ variance and evidence reliability systems

Affiliate Brain
→ offer validation systems

Ads Brain
→ creative and campaign testing governance

Conversion Brain
→ optimization detection reliability

Finance Brain
→ resource allocation and experimentation efficiency

Research Brain
→ interpretation discipline systems

HeadOffice
→ governance and operational oversight


Power Reality

Many failed tests are actually:

  • underpowered tests
  • low-evidence environments
  • variance-heavy systems
  • poorly structured experiments

rather than true negative outcomes.


Rule

Failure to detect an improvement does not prove that no improvement exists.


Core Components Of Statistical Power

Statistical power depends on:

  • sample size
  • effect size
  • variance level
  • measurement quality
  • significance threshold

Rule

Power reflects overall experimentation sensitivity.
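
As a rough illustration of how the components listed above combine, the sketch below approximates power for a two-sided z-test on two conversion rates. The function name, the choice of test, and the default significance threshold are assumptions made for illustration; MWMS canon does not prescribe a specific test.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_power(p_control, p_variant, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two conversion rates.

    Hypothetical helper: assumes equal traffic per arm and a normal
    approximation, which is reasonable only for reasonably large samples.
    """
    z = NormalDist()
    # Standard error of the difference in rates (variance level + sample size).
    se = sqrt(
        (p_control * (1 - p_control) + p_variant * (1 - p_variant)) / n_per_arm
    )
    # Effect size expressed in standard-error units.
    shift = abs(p_variant - p_control) / se
    # Critical value set by the significance threshold.
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Probability that the test statistic falls beyond either critical value.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    for n in (1_000, 5_000, 10_000):
        print(n, round(two_proportion_power(0.05, 0.06, n), 3))
```

Under these assumptions, the same 5% to 6% lift moves from poorly powered to well powered purely by increasing the number of visitors per arm.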


Sample Size Layer

Larger evidence volume improves detection capability.


Examples

  • more impressions
  • more clicks
  • more conversions
  • longer observation periods

Rule

Small samples reduce detection reliability.
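
A minimal sketch of a sample size estimate, assuming a two-sided z-test on conversion rates, equal allocation, and the common 80% power target. The helper name and defaults are illustrative assumptions, not MWMS standards.

```python
from math import ceil
from statistics import NormalDist


def required_sample_per_arm(p_control, p_variant, alpha=0.05, power=0.80):
    """Approximate visitors needed per arm to detect p_variant vs p_control."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # significance threshold
    z_beta = z.inv_cdf(power)           # target power
    variance_sum = p_control * (1 - p_control) + p_variant * (1 - p_variant)
    delta = p_variant - p_control       # absolute effect size
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / delta ** 2)


if __name__ == "__main__":
    print(required_sample_per_arm(0.05, 0.06))  # small lift
    print(required_sample_per_arm(0.05, 0.10))  # large lift
```

With these inputs, detecting a one-point lift from a 5% baseline needs on the order of 8,000 visitors per arm, while a five-point lift needs only a few hundred.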


Effect Size Layer

Larger improvements are easier to detect reliably.


Examples

Easy to detect:

  • major conversion improvements

Hard to detect:

  • tiny optimization lifts

Rule

Small effects require stronger experimentation sensitivity.


Variance Layer

High variance weakens detection reliability.


Examples

  • fluctuating ROAS
  • unstable traffic quality
  • inconsistent conversion behavior

Rule

Noise reduces statistical power.
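
The effect of noise can be sketched for a continuous metric such as ROAS or revenue per visitor. The example assumes a two-sided z-test on means with equal variance in both arms; the function name and inputs are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def required_sample_continuous(std_dev, min_difference, alpha=0.05, power=0.80):
    """Approximate observations per arm to detect a difference in means."""
    z = NormalDist()
    z_total = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    # Required sample grows with the square of the noise-to-effect ratio.
    return ceil(2 * (z_total * std_dev / min_difference) ** 2)


if __name__ == "__main__":
    for sd in (1.0, 2.0, 4.0):
        print(sd, required_sample_continuous(std_dev=sd, min_difference=0.5))
```

Under this approximation, doubling the standard deviation roughly quadruples the evidence required to detect the same difference.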


Measurement Integrity Layer

Reliable tracking improves experimentation sensitivity.


Examples

  • accurate attribution
  • stable event collection
  • consistent conversion recording

Rule

Weak measurement systems reduce detection quality.


Threshold Layer

Stricter evidence thresholds reduce power at a fixed sample size, so holding power constant requires more evidence.


Examples

High confidence requirements:

  • larger sample needs

Lower confidence requirements:

  • reduced evidence burden

Rule

Confidence rigor determines how much evidence is required.
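
To make the threshold trade-off concrete, the sketch below compares the relative evidence burden at different significance levels, holding power at 80%. It assumes a two-sided z-test, where required sample size scales with the squared sum of the two z-values; the function name is illustrative.

```python
from statistics import NormalDist


def evidence_multiplier(alpha, power=0.80):
    """Factor that required sample size is proportional to, for a z-test."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2


if __name__ == "__main__":
    baseline = evidence_multiplier(0.10)
    for alpha in (0.10, 0.05, 0.01):
        ratio = evidence_multiplier(alpha) / baseline
        print(f"alpha={alpha}: {ratio:.2f}x the sample needed at alpha=0.10")
```

Under these assumptions, moving from a 10% to a 1% significance threshold nearly doubles the sample needed for the same effect and power.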


Underpowered Experiment Layer

Underpowered systems frequently produce:

  • inconclusive results
  • false negatives
  • unstable interpretation
  • weak optimization reliability

Rule

Low power weakens learning efficiency.


Resource Efficiency Layer

Higher power requires greater:

  • traffic
  • time
  • budget
  • operational patience

Rule

Higher power increases operational cost.


Exploratory Testing Layer

Exploratory environments may tolerate lower power conditions.


Examples

  • creative ideation
  • directional learning
  • early market exploration

Rule

Exploration and scaling require different sensitivity standards.


Scaling Validation Layer

Scaling decisions require stronger statistical power than exploratory testing.


Examples

  • major budget increases
  • infrastructure dependency
  • automation rollout
  • market expansion

Rule

Scaling requires stronger detection confidence.


Minimum Detectable Effect Layer

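Power and the minimum detectable effect (MDE) are linked: at a fixed sample size and significance threshold, the MDE is the smallest true effect the experiment can detect at the target power.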


Examples

  • detecting small profitability lifts requires stronger sensitivity
  • detecting large conversion shifts requires less sensitivity

Rule

Smaller target effects require more evidence.
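
A minimal sketch of a minimum detectable effect estimate for a conversion-rate test, assuming a two-sided z-test, equal allocation, and the baseline variance applied to both arms. The helper name and defaults are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_effect(p_control, n_per_arm, alpha=0.05, power=0.80):
    """Approximate smallest absolute lift detectable at the given power."""
    z = NormalDist()
    z_total = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    # Simplification: uses the control-arm variance for both arms.
    return z_total * sqrt(2 * p_control * (1 - p_control) / n_per_arm)


if __name__ == "__main__":
    for n in (2_500, 10_000, 40_000):
        mde = minimum_detectable_effect(0.05, n)
        print(f"{n} per arm: absolute MDE of about {mde:.4f}")
```

Because the MDE shrinks with the square root of the sample, halving the detectable effect requires roughly four times the evidence.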


Concurrent Experimentation Layer

Multiple simultaneous tests dilute available evidence volume.


Examples

  • excessive creative variants
  • overlapping audience tests
  • fragmented traffic allocation

Rule

Fragmentation weakens experimentation power.
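
The dilution effect can be sketched by splitting a fixed traffic pool across a control and several variants. The example assumes equal allocation, a two-sided z-test per variant against control, and a Bonferroni correction for multiple comparisons; the correction is an assumption introduced here, not something the framework specifies.

```python
from math import sqrt
from statistics import NormalDist


def power_per_variant(total_visitors, n_variants, p_control, p_variant, alpha=0.05):
    """Approximate power of each variant-vs-control comparison."""
    z = NormalDist()
    n_per_arm = total_visitors / (n_variants + 1)  # control plus variants
    adjusted_alpha = alpha / n_variants            # Bonferroni (assumed)
    se = sqrt(
        (p_control * (1 - p_control) + p_variant * (1 - p_variant)) / n_per_arm
    )
    shift = abs(p_variant - p_control) / se
    z_crit = z.inv_cdf(1 - adjusted_alpha / 2)
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)


if __name__ == "__main__":
    for k in (1, 2, 4, 8):
        print(k, round(power_per_variant(20_000, k, 0.05, 0.06), 3))
```

With the same 20,000 visitors, a single-variant test is well powered for this lift under these assumptions, while a four-variant test is not.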


Time Horizon Layer

Longer observation periods may improve detection reliability.


Examples

  • seasonal stabilization
  • delayed conversion tracking
  • repeat purchase evaluation

Rule

Some effects require extended observation windows.
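
A rough way to translate a sample requirement into an observation window is sketched below. Rounding up to whole weeks is an assumption intended to cover weekly traffic cycles; it does not account for delayed conversions or seasonality, which may call for longer windows. All names and inputs are hypothetical.

```python
from math import ceil


def estimate_duration_days(required_per_arm, n_arms, daily_visitors,
                           round_to_weeks=True):
    """Estimate how many days a test must run to reach its sample target."""
    days = ceil(required_per_arm * n_arms / daily_visitors)
    if round_to_weeks:
        # Round up to full weeks so every weekday is sampled equally.
        days = ceil(days / 7) * 7
    return days


if __name__ == "__main__":
    print(estimate_duration_days(8_155, n_arms=2, daily_visitors=1_500))
```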


AI Governance Layer

AI Employees should:

  • identify underpowered conditions
  • classify detection limitations
  • flag weak evidence environments
  • recommend evidence expansion when required

Rule

AI systems must remain sensitivity-aware.
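
One way an AI Employee might flag underpowered conditions is sketched below. The status labels, the 0.80 and 0.50 cut-offs, and the recommended actions are hypothetical values chosen for illustration; the framework does not define numeric thresholds.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical cut-offs; MWMS canon does not define numeric power thresholds.
ADEQUATE_POWER = 0.80
MARGINAL_POWER = 0.50


def classify_power(p_control, p_variant, n_per_arm, alpha=0.05):
    """Classify an experiment design by its approximate power (z-test)."""
    z = NormalDist()
    se = sqrt(
        (p_control * (1 - p_control) + p_variant * (1 - p_variant)) / n_per_arm
    )
    shift = abs(p_variant - p_control) / se
    z_crit = z.inv_cdf(1 - alpha / 2)
    power = z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

    if power >= ADEQUATE_POWER:
        status, action = "adequate", "proceed with standard confidence claims"
    elif power >= MARGINAL_POWER:
        status, action = "marginal", "extend the test or reduce confidence claims"
    else:
        status, action = "underpowered", "expand evidence before interpreting a null result"
    return {"power": round(power, 3), "status": status, "recommended_action": action}


if __name__ == "__main__":
    print(classify_power(0.05, 0.06, n_per_arm=1_000))
```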


Reporting Layer

Experiment reports should communicate:

  • evidence sufficiency
  • variance conditions
  • power limitations
  • detectable effect assumptions
  • confidence implications

Rule

Sensitivity limitations should remain operationally visible.


Decision Governance Layer

Weakly powered experiments may require:

  • extended testing
  • additional traffic
  • reduced confidence claims
  • broader validation

Rule

Weak sensitivity should slow irreversible decisions.


Measurement Layer

MWMS should monitor:

  • evidence sufficiency
  • variance exposure
  • detectable effect capability
  • false negative frequency
  • confidence stability
  • experimentation efficiency

Rule

Experimentation sensitivity must remain measurable.


Cross Brain Integration

Experimentation Brain
→ owns statistical power governance

Data Brain
→ governs variance and measurement reliability

Affiliate Brain
→ validates offer evidence sufficiency

Ads Brain
→ governs creative testing sensitivity

Conversion Brain
→ evaluates optimization detectability

Finance Brain
→ governs experimentation efficiency and resource allocation

Research Brain
→ governs interpretation discipline

HeadOffice
→ governance and oversight


Failure Modes Prevented

This framework prevents:

  • underpowered experimentation
  • false negative interpretation
  • weak learning systems
  • unstable optimization decisions
  • fragmented evidence environments
  • unreliable scaling governance


Drift Protection

The system must prevent:

  • low-evidence scaling
  • ignoring variance exposure
  • weak sensitivity environments
  • fragmented traffic allocation
  • false certainty from weak detection systems
  • AI overconfidence in underpowered environments


Architectural Intent

This framework transforms MWMS experimentation thinking from:

→ surface-level testing systems

into:

→ governed evidence sensitivity systems

It ensures MWMS develops:

  • scalable experimentation reliability
  • sensitivity-aware optimization systems
  • evidence-efficient testing architectures
  • disciplined detection governance
  • long-term learning stability


Final Rule

If experimentation systems lack sufficient power:

→ meaningful improvements may remain invisible.


Change Log

Version: v1.0

Date: 2026-05-07
Author: HeadOffice

Change:
Created Statistical Power Framework defining experimentation sensitivity governance, evidence sufficiency systems, variance-aware detection logic, and scalable learning reliability architecture.


Change Impact Declaration

Pages Created:
Experimentation Brain Statistical Power Framework

Pages Updated:
None

Pages Deprecated:
None

Registries Requiring Update:
MWMS Architecture Registry
Experimentation Brain Page Registry

Canon Version Update Required:
No

Change Log Entry Required:
Yes


END EXPERIMENTATION BRAIN STATISTICAL POWER FRAMEWORK v1.0