---
title: "The Importance of Coordination"
share:
permalink: "https://book.martinez.fyi/coordination.html"
description: "Understanding and managing experimental interference in business settings"
linkedin: true
email: true
mastodon: true
---
## Introduction
In large companies, multiple teams often run experiments simultaneously. It is
common practice in such organizations to run these experiments independently,
without coordination, on the belief that independent randomization makes doing
so safe. In reality, the lack of coordination can lead to unanticipated
consequences, including contaminated control groups and obscured interaction
effects. This chapter explains why coordinating experimental designs is crucial
for obtaining reliable business insights and making sound business decisions.
## A Motivating Example
Let's explore a scenario where two teams are independently trying to improve the
same outcome (e.g., conversion rate, customer engagement, sales):
- Team A is testing a monetary incentive (e.g., a discount, a gift card)
- Team B is testing a non-monetary incentive (e.g., a personalized message,
early access to features)
With proper coordination, these teams could run a more informative 2x2 factorial
experiment with four arms:
1. Monetary Incentive Only
2. Non-Monetary Incentive Only
3. Both Incentives Combined
4. Control (No Incentives)
This design enables understanding both individual effects and their interaction:
Does combining incentives create synergy (greater than the sum), antagonism
(less than the sum), or no interaction?
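As a minimal sketch (the user count and assignment probabilities here are
illustrative, not prescriptive), a coordinated randomization simply assigns
each user to each factor independently, which produces all four arms at once:
```{r}
#| label: coordinated-assignment
# Hypothetical coordinated randomization: each user is independently
# assigned to each factor, yielding the four arms of the 2x2 design.
set.seed(123)
users <- data.frame(user_id = 1:2000)
users$monetary <- rbinom(nrow(users), size = 1, prob = 0.5)
users$non_monetary <- rbinom(nrow(users), size = 1, prob = 0.5)
table(users$monetary, users$non_monetary) # roughly 500 users per arm
```
In practice the assignment would live in the experimentation platform; the
point is that a single randomization scheme serves both teams.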
## The Problem with Uncoordinated Experiments
Without coordination, each team typically runs a two-arm experiment (their
incentive vs. control). This creates two major issues:
1. Each team's "control" group is contaminated by the other team's experiment
2. The true interaction effects between interventions remain unknown
Let's demonstrate this with synthetic data. We'll assume:
- Monetary Incentive Effect: +2 units
- Non-Monetary Incentive Effect: +1 unit
- Combined Effect: +2 units (showing a ceiling effect)
- Baseline (No Incentives): 0 units
```{r}
#| label: setup
#| message: false
library(tidyverse)
# Set seed for reproducibility
set.seed(42)
# Generate synthetic data
n <- 500 # observations per group
# Create data frame for coordinated experiment
coordinated_data <- data.frame(
  monetary = rep(c(0, 1), each = 2 * n),
  non_monetary = rep(c(0, 1, 0, 1), each = n)
)
# Define treatment effects
monetary_effect <- 2
non_monetary_effect <- 1
interaction_effect <- -1 # Negative interaction
# Generate outcome
coordinated_data$outcome <- with(
  coordinated_data,
  0 + # Baseline
    monetary * monetary_effect +
    non_monetary * non_monetary_effect +
    monetary * non_monetary * interaction_effect +
    rnorm(4 * n, mean = 0, sd = 1) # Random noise
)
```
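As a quick sanity check, the four cell means of the simulated data should be
close to the assumed values of 0, 1, 2, and 2:
```{r}
#| label: cell-means
#| message: false
# Average outcome in each of the four arms
coordinated_data %>%
  group_by(monetary, non_monetary) %>%
  summarise(mean_outcome = mean(outcome), .groups = "drop") %>%
  knitr::kable(digits = 2, caption = "Cell Means of the Simulated Data")
```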
## Analyzing Uncoordinated Experiments
Let's see how each team might analyze their data in isolation:
```{r}
#| label: team-a-analysis
#| warning: false
# Team A's regression
lm(outcome ~ monetary, data = coordinated_data) %>%
  broom::tidy(conf.int = TRUE) %>%
  select(term, estimate, conf.low, conf.high, p.value) %>%
  knitr::kable(
    caption = "Team A's Analysis (Monetary Incentive)",
    digits = 3
  )
```
In the absence of coordination, Team A’s analysis would likely lead them to
conclude that monetary incentives increase the outcome of interest by about 1.5
units compared to no incentives, a statistically significant result. However,
their model is misspecified: half of their control units and half of their
treated units are simultaneously receiving non-monetary incentives. Their
estimate is therefore an average of the monetary effect without the
non-monetary incentive (+2) and with it (+2 - 1 = +1).
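A quick check on the simulated data makes the two stratum-specific effects
explicit:
```{r}
#| label: team-a-by-stratum
#| warning: false
# Monetary effect among units NOT receiving the non-monetary incentive (~2)
coef(lm(outcome ~ monetary,
        data = subset(coordinated_data, non_monetary == 0)))["monetary"]
# Monetary effect among units receiving the non-monetary incentive (~1)
coef(lm(outcome ~ monetary,
        data = subset(coordinated_data, non_monetary == 1)))["monetary"]
```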
The code below shows that the same would happen to the non-monetary incentives
team:
```{r}
#| label: team-b-analysis
#| warning: false
# Team B's regression
lm(outcome ~ non_monetary, data = coordinated_data) %>%
  broom::tidy(conf.int = TRUE) %>%
  select(term, estimate, conf.low, conf.high, p.value) %>%
  knitr::kable(
    caption = "Team B's Analysis (Non-Monetary Incentive)",
    digits = 3
  )
```
Team B, in turn, would conclude that the impact of non-monetary incentives is
only about 0.5 units: the average of the effect without the monetary incentive
(+1) and with it (+1 - 1 = 0).
## The Power of Coordination
Now let's analyze the data properly using a coordinated approach:
```{r}
#| label: coordinated-analysis
#| warning: false
# Full factorial analysis
lm(outcome ~ monetary * non_monetary, data = coordinated_data) %>%
  broom::tidy(conf.int = TRUE) %>%
  select(term, estimate, conf.low, conf.high, p.value) %>%
  knitr::kable(
    caption = "Coordinated Analysis (Full Factorial Design)",
    digits = 3
  )
```
With this analysis, the teams can correctly communicate what would happen if
the company deployed either incentive alone or both together. Moreover,
although the impact of the non-monetary incentive is smaller, its ROI may well
be higher, in which case running only the non-monetary incentive is the right
decision for the company.
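As a purely hypothetical illustration (the per-user costs below are invented
for this example; the effects are the assumed values from the simulation), the
cheaper incentive can deliver the higher return despite its smaller effect:
```{r}
#| label: roi-sketch
# Hypothetical per-user costs; effects are the assumed simulation values.
tibble::tibble(
  option = c("Monetary only", "Non-monetary only", "Both"),
  effect = c(2, 1, 2), # assumed true effects
  cost = c(1.0, 0.1, 1.1) # invented per-user costs
) %>%
  mutate(roi = effect / cost) %>%
  knitr::kable(digits = 2, caption = "Hypothetical ROI Comparison")
```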
## The Deceptive Case of Sub-additive Interactions
The dangers of uncoordinated experiments become even more pronounced, and
potentially misleading, when interventions exhibit **sub-additive
interactions**. This occurs when the combined effect of multiple incentives is
*less* than the sum of their individual effects. In some unfortunate scenarios,
the combined effect can even be less than the effect of the most potent
incentive applied in isolation.
Imagine, for instance, that our monetary and non-monetary incentives are email
communications. If a customer receives *both* a discount offer and a
personalized message, they might perceive this as excessive or even spammy.
Perhaps the user's spam filter becomes more aggressive, or the sheer volume of
communication leads to both emails being overlooked or dismissed. In such cases,
the combined impact could be diminished, or even negated, compared to using just
one incentive type.
To simulate this sub-additive interaction, let's adjust our synthetic data.
We'll keep the monetary incentive effect strong and positive, and give the
non-monetary incentive a small positive effect in isolation. However, we'll
introduce a *negative* interaction effect large enough that the combined effect
(2 + 0.5 - 1.5 = 1) is smaller than the effect of the monetary incentive alone:
```{r}
#| label: sub-additive-effects
#| message: false
#| warning: false
# Create data frame for coordinated experiment
coordinated_data_modified <- data.frame(
  monetary = rep(c(0, 1), each = 2 * n),
  non_monetary = rep(c(0, 1, 0, 1), each = n)
)
# Define treatment effects - SUB-ADDITIVE EFFECTS
monetary_effect <- 2
non_monetary_effect <- 0.5 # Small positive effect
interaction_effect <- -1.5 # Strong negative interaction
# Generate outcome
coordinated_data_modified$outcome <- with(
  coordinated_data_modified,
  0 + # Baseline
    monetary * monetary_effect +
    non_monetary * non_monetary_effect +
    monetary * non_monetary * interaction_effect +
    rnorm(4 * n, mean = 0, sd = 1) # Random noise
)
```
Now, let's consider what happens when the non-monetary incentive team analyzes
their experimental data in isolation, oblivious to the monetary incentive
experiment running concurrently. They would perform their standard regression
analysis, focusing solely on the `non_monetary` variable:
```{r}
#| label: team-b-analysis-sub-additive
#| warning: false
# Team B's regression (Non-Monetary Incentive - Sub-additive Interaction)
lm(outcome ~ non_monetary, data = coordinated_data_modified) %>%
  broom::tidy(conf.int = TRUE) %>%
  select(term, estimate, conf.low, conf.high, p.value) %>%
  knitr::kable(
    caption = "Team B's Analysis (Non-Monetary Incentive) - Sub-additive Interaction",
    digits = 3
  )
```
Examining Team B's isolated analysis, we observe a troubling outcome. Their
naive estimate is the average of the non-monetary effect without the monetary
incentive (+0.5) and with it (0.5 - 1.5 = -1), or about -0.25. Team B would
conclude that their intervention hurts the outcome when, in isolation, its
impact is actually positive.
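Had the teams coordinated, the same full factorial analysis used earlier would
recover the positive main effect and the strong negative interaction:
```{r}
#| label: coordinated-analysis-sub-additive
#| warning: false
# Full factorial analysis on the sub-additive data
lm(outcome ~ monetary * non_monetary, data = coordinated_data_modified) %>%
  broom::tidy(conf.int = TRUE) %>%
  select(term, estimate, conf.low, conf.high, p.value) %>%
  knitr::kable(
    caption = "Coordinated Analysis (Sub-additive Interaction)",
    digits = 3
  )
```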
## Beyond the Simple Case: Real-World Complexities
This example, while illustrative, simplifies reality. Consider these additional
complexities:
- **Sequential Experiments and Carryover:** If Team A runs their experiment in
Q1 and Team B in Q2, Team A might incorrectly conclude that their effect decays
over time, when the apparent decay is actually driven by Team B's intervention
(a carryover effect). This is particularly damaging for ROI comparisons: Team
A's intervention might be superior long-term, but this would be masked.
- **Interference Beyond Shared Outcomes:** Even experiments targeting
different outcomes can interfere. For example, if both teams use [randomized
encouragement designs](https://book.martinez.fyi/iv.html) (even for
different actions), they are competing for limited user attention and
decision-making capacity. Encouraging action A can impact the likelihood of
action B, and vice versa. This creates subtle but important dependencies.
- **The Difficulty of Estimating Interactions:** As [Andrew Gelman points
out](https://statmodeling.stat.columbia.edu/2018/03/15/need16/), you may need
16 times the sample size to estimate an interaction as to estimate a main
effect. Trying to do too much at once can lead to noisy, inconclusive results.
Prioritize the most critical business questions, and be prepared to make
trade-offs. The sketch below shows where a factor like 16 comes from.
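A back-of-the-envelope calculation using the 2x2 design from this chapter: with
n units per cell, the interaction estimate (a difference of differences) has
twice the standard error of a main effect, so matching the main effect's
precision on an interaction half its size requires 4 squared, or 16, times the
sample.
```{r}
#| label: interaction-se
# Standard errors in a 2x2 design with n units per cell (sigma = 1)
se_main <- sqrt(1 / n) # contrast between two halves of the 4n units
se_interaction <- sqrt(4 / n) # difference of differences: twice as large
c(main = se_main, interaction = se_interaction)
```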
## Building a Coordination Framework
To address these challenges, organizations need a systematic approach to
experimental coordination. The following framework combines practical management
steps with institutional practices to ensure reliable, interpretable results:
### Pre-registration: The Foundation of Effective Coordination
Pre-registration involves publicly documenting your study's design, hypotheses,
and analysis plan before collecting or analyzing data. The practice has gained
significant traction across research fields for good reason. In medical
research, a significant milestone was the 2005 requirement by the International
Committee of Medical Journal Editors (ICMJE) that clinical trials be registered
[@zarin2005trial]. This landmark decision aimed to combat publication bias and
selective reporting of trial outcomes, ensuring that the full scope of research
efforts, regardless of the results, would be publicly accessible.
In social sciences, including psychology and economics, the pre-registration
movement gained momentum in the early 2010s, fueled by growing concerns about
replicability. The launch of registries like the [American Economic Association
(AEA) RCT Registry](https://www.aeaweb.org/news/rct-registry-over-1000) in 2013
played a crucial role in facilitating pre-registration specifically among
economists.
For large companies where multiple teams are continuously running experiments,
pre-registration serves as the cornerstone of coordination by:
1. **Creating Transparency**: A centralized pre-registration system acts as a
company-wide experiment registry, making all planned and ongoing experiments
visible to relevant stakeholders.
2. **Preventing Interference**: Teams can identify potential overlap or
interference before launching experiments, enabling proactive coordination (a
minimal registry sketch follows this list).
3. **Reducing Bias**: By documenting hypotheses and analysis plans beforehand,
teams are less likely to engage in selective reporting or post-hoc
rationalization.
4. **Improving Design Quality**: The act of pre-registering forces more
thoughtful experimental design and clarifies the key questions being
addressed.
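There is no single standard format for such a registry. The sketch below uses a
hypothetical schema (all field names and entries are invented) to show how even
a lightweight table makes overlap checks mechanical:
```{r}
#| label: registry-sketch
#| warning: false
# A hypothetical registry: one row per experiment
registry <- tibble::tribble(
  ~experiment_id, ~team, ~start_date, ~end_date, ~primary_outcome,
  "exp-001", "Team A", "2025-01-01", "2025-03-31", "conversion",
  "exp-002", "Team B", "2025-02-01", "2025-04-30", "conversion"
)
# Flag pairs of experiments that share an outcome and overlap in time
registry %>%
  inner_join(registry, by = "primary_outcome") %>%
  filter(
    experiment_id.x < experiment_id.y,
    start_date.x <= end_date.y,
    start_date.y <= end_date.x
  ) %>%
  select(experiment_id.x, experiment_id.y, primary_outcome)
```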
### Coordination Checklist
In addition to pre-registration, follow these key steps for effective
experimental coordination:
1. **Map the Experimental Landscape**: Before launching any experiment, use the
pre-registration system to identify potential interference with other
ongoing or planned tests.
2. **Consider All Interaction Types**: Think about both direct (shared outcome)
and indirect (competing for resources, attention, or carryover)
interactions.
3. **Embrace Factorial Designs**: Whenever feasible, use factorial designs to
disentangle main effects and interactions. This provides a much richer
understanding.
4. **Prioritize and Simplify**: Balance the desire for complete understanding
with the need for statistical power. Focus on the most important
interactions first.
5. **Document, Document, Document**: Clearly state all assumptions about
potential interference in your experimental design and analysis plan. This
promotes transparency and facilitates learning.
6. **Communicate**: If full coordination isn't possible, at least communicate
with other teams. Sharing experimental designs and timelines can help
mitigate some of the risks.
## Conclusion: The Business Value of Coordinated Experimentation
Coordinated experimentation isn't just a technical best practice—it delivers
tangible business value. By implementing the framework described in this
chapter, organizations can:
1. **Make Better-Informed Decisions**: Understanding true treatment effects and
interactions leads to more accurate ROI calculations and better
prioritization.
2. **Avoid Costly Mistakes**: Prevent the adoption of interventions that might
appear beneficial in isolation but have negative interactions with other
initiatives.
3. **Accelerate Learning**: A systematic approach to coordination builds
institutional knowledge more efficiently than siloed experimentation.
4. **Optimize Resource Allocation**: By understanding interaction effects,
companies can identify where complementary initiatives deliver outsized
returns or where efforts are redundant.
By embracing pre-registration and following a structured coordination framework,
business data scientists can avoid the pitfalls of uncoordinated
experimentation, leading to more reliable results, better decisions, and
ultimately, a greater impact on the business.