---
title: "The Importance of Coordination"
share:
permalink: "https://book.martinez.fyi/coordination.html"
description: "Understanding and managing experimental interference in business settings"
linkedin: true
email: true
mastodon: true
---
## Introduction
In large companies, multiple teams often run experiments simultaneously. It is
common practice in such organizations to run these experiments independently,
without coordination, on the belief that independent randomization makes doing
so safe. In reality, the lack of coordination can lead to unanticipated
consequences, including contaminated control groups and obscured interaction
effects. This chapter explains why coordinating experimental designs is crucial
for obtaining reliable business insights and making sound business decisions.
## A Motivating Example
Let's explore a scenario where two teams are independently trying to improve the
same outcome (e.g., conversion rate, customer engagement, sales):
- Team A is testing a monetary incentive (e.g., a discount, a gift card)
- Team B is testing a non-monetary incentive (e.g., a personalized message,
early access to features)
With proper coordination, these teams could run a more informative 2x2 factorial
experiment with four arms:
1. Monetary Incentive Only
2. Non-Monetary Incentive Only
3. Both Incentives Combined
4. Control (No Incentives)
This design enables understanding both individual effects and their interaction:
Does combining incentives create synergy (greater than the sum), antagonism
(less than the sum), or no interaction?
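As a minimal sketch (the user count and assignment probabilities here are
illustrative, not prescriptive), a coordinated randomization simply assigns
each user to each factor independently, which produces all four arms at once:
```{r}
#| label: coordinated-assignment
# Hypothetical coordinated randomization: each user is independently
# assigned to each factor, yielding the four arms of the 2x2 design.
set.seed(123)
users <- data.frame(user_id = 1:2000)
users$monetary <- rbinom(nrow(users), size = 1, prob = 0.5)
users$non_monetary <- rbinom(nrow(users), size = 1, prob = 0.5)
table(users$monetary, users$non_monetary) # roughly 500 users per arm
```
In practice the assignment would live in the experimentation platform; the
point is that a single randomization scheme serves both teams.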
## The Problem with Uncoordinated Experiments
Without coordination, each team typically runs a two-arm experiment (their
incentive vs. control). This creates two major issues:
1. Each team's "control" group is contaminated by the other team's experiment
2. The true interaction effects between interventions remain unknown
Let's demonstrate this with synthetic data. We'll assume:
- Monetary Incentive Effect: +2 units
- Non-Monetary Incentive Effect: +1 unit
- Combined Effect: +2 units (showing a ceiling effect)
- Baseline (No Incentives): 0 units
```{r}
#| label: setup
#| message: false
library(tidyverse)
# Set seed for reproducibility
set.seed(42)
# Generate synthetic data
n <- 500 # observations per group
# Create data frame for coordinated experiment
coordinated_data <- data.frame(
  monetary = rep(c(0, 1), each = 2 * n),
  non_monetary = rep(c(0, 1, 0, 1), each = n)
)
# Define treatment effects
monetary_effect <- 2
non_monetary_effect <- 1
interaction_effect <- -1 # Negative interaction
# Generate outcome
coordinated_data$outcome <- with(
  coordinated_data,
  0 + # Baseline
    monetary * monetary_effect +
    non_monetary * non_monetary_effect +
    monetary * non_monetary * interaction_effect +
    rnorm(4 * n, mean = 0, sd = 1) # Random noise
)
```
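As a quick sanity check, the four cell means of the simulated data should be
close to the assumed values of 0, 1, 2, and 2:
```{r}
#| label: cell-means
#| message: false
# Average outcome in each of the four arms
coordinated_data %>%
  group_by(monetary, non_monetary) %>%
  summarise(mean_outcome = mean(outcome), .groups = "drop") %>%
  knitr::kable(digits = 2, caption = "Cell Means of the Simulated Data")
```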
## Analyzing Uncoordinated Experiments
Let's see how each team might analyze their data in isolation:
```{r}
#| label: team-a-analysis
#| warning: false
# Team A's regression
lm(outcome ~ monetary, data = coordinated_data) %>%
  broom::tidy(conf.int = TRUE) %>%
  select(term, estimate, conf.low, conf.high, p.value) %>%
  knitr::kable(
    caption = "Team A's Analysis (Monetary Incentive)",
    digits = 3
  )
```
In the absence of coordination, Team A’s analysis would likely lead them to
conclude that monetary incentives increase the outcome of interest by about 1.5
units compared to no incentives, a statistically significant result. However,
their model is misspecified: half of their control units and half of their
treated units are simultaneously receiving non-monetary incentives. Their
estimate is therefore an average of the monetary effect without the
non-monetary incentive (+2) and with it (+2 - 1 = +1).
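A quick check on the simulated data makes the two stratum-specific effects
explicit:
```{r}
#| label: team-a-by-stratum
#| warning: false
# Monetary effect among units NOT receiving the non-monetary incentive (~2)
coef(lm(outcome ~ monetary,
        data = subset(coordinated_data, non_monetary == 0)))["monetary"]
# Monetary effect among units receiving the non-monetary incentive (~1)
coef(lm(outcome ~ monetary,
        data = subset(coordinated_data, non_monetary == 1)))["monetary"]
```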
The code below shows that the same would happen to the non-monetary incentives
team:
```{r}
#| label: team-b-analysis
#| warning: false
# Team B's regression
lm(outcome ~ non_monetary, data = coordinated_data) %>%
  broom::tidy(conf.int = TRUE) %>%
  select(term, estimate, conf.low, conf.high, p.value) %>%
  knitr::kable(
    caption = "Team B's Analysis (Non-Monetary Incentive)",
    digits = 3
  )
```
Team B, in turn, would conclude that the impact of non-monetary incentives is
only about 0.5 units: the average of the effect without the monetary incentive
(+1) and with it (+1 - 1 = 0).
## The Power of Coordination
Now let's analyze the data properly using a coordinated approach:
```{r}
#| label: coordinated-analysis
#| warning: false
# Full factorial analysis
lm(outcome ~ monetary * non_monetary, data = coordinated_data) %>%
  broom::tidy(conf.int = TRUE) %>%
  select(term, estimate, conf.low, conf.high, p.value) %>%
  knitr::kable(
    caption = "Coordinated Analysis (Full Factorial Design)",
    digits = 3
  )
```
With this analysis, the teams can correctly communicate what would happen if
the company deployed either incentive alone or both together. Moreover,
although the impact of the non-monetary incentive is smaller, its ROI may well
be higher, in which case running only the non-monetary incentive is the right
decision for the company.
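As a purely hypothetical illustration (the per-user costs below are invented
for this example; the effects are the assumed values from the simulation), the
cheaper incentive can deliver the higher return despite its smaller effect:
```{r}
#| label: roi-sketch
# Hypothetical per-user costs; effects are the assumed simulation values.
tibble::tibble(
  option = c("Monetary only", "Non-monetary only", "Both"),
  effect = c(2, 1, 2), # assumed true effects
  cost = c(1.0, 0.1, 1.1) # invented per-user costs
) %>%
  mutate(roi = effect / cost) %>%
  knitr::kable(digits = 2, caption = "Hypothetical ROI Comparison")
```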
## The Deceptive Case of Sub-additive Interactions
The dangers of uncoordinated experiments become even more pronounced, and
potentially misleading, when interventions exhibit **sub-additive
interactions**. This occurs when the combined effect of multiple incentives is
*less* than the sum of their individual effects. In some unfortunate scenarios,
the combined effect can even be less than the effect of the most potent
incentive applied in isolation.
Imagine, for instance, that our monetary and non-monetary incentives are email
communications. If a customer receives *both* a discount offer and a
personalized message, they might perceive this as excessive or even spammy.
Perhaps the user's spam filter becomes more aggressive, or the sheer volume of
communication leads to both emails being overlooked or dismissed. In such cases,
the combined impact could be diminished, or even negated, compared to using just
one incentive type.
To simulate this sub-additive interaction, let's adjust our synthetic data.
We'll keep the monetary incentive effect strong and positive, and give the
non-monetary incentive a small positive effect in isolation. However, we'll
introduce a *negative* interaction effect large enough that the combined effect
(2 + 0.5 - 1.5 = 1) is smaller than the effect of the monetary incentive alone:
```{r}
#| label: sub-additive-effects
#| message: false
#| warning: false
# Create data frame for coordinated experiment
coordinated_data_modified <- data.frame(
  monetary = rep(c(0, 1), each = 2 * n),
  non_monetary = rep(c(0, 1, 0, 1), each = n)
)
# Define treatment effects - SUB-ADDITIVE EFFECTS
monetary_effect <- 2
non_monetary_effect <- 0.5 # Small positive effect
interaction_effect <- -1.5 # Strong negative interaction
# Generate outcome
coordinated_data_modified$outcome <- with(
  coordinated_data_modified,
  0 + # Baseline
    monetary * monetary_effect +
    non_monetary * non_monetary_effect +
    monetary * non_monetary * interaction_effect +
    rnorm(4 * n, mean = 0, sd = 1) # Random noise
)
```
Now, let's consider what happens when the non-monetary incentive team analyzes
their experimental data in isolation, oblivious to the monetary incentive
experiment running concurrently. They would perform their standard regression
analysis, focusing solely on the `non_monetary` variable:
```{r}
#| label: team-b-analysis-sub-additive
#| warning: false
# Team B's regression (Non-Monetary Incentive - Sub-additive Interaction)
lm(outcome ~ non_monetary, data = coordinated_data_modified) %>%
  broom::tidy(conf.int = TRUE) %>%
  select(term, estimate, conf.low, conf.high, p.value) %>%
  knitr::kable(
    caption = "Team B's Analysis (Non-Monetary Incentive) - Sub-additive Interaction",
    digits = 3
  )
```
Examining Team B's isolated analysis, we observe a troubling outcome. Their
naive estimate is the average of the non-monetary effect without the monetary
incentive (+0.5) and with it (0.5 - 1.5 = -1), or about -0.25. Team B would
conclude that their intervention hurts the outcome when, in isolation, its
impact is actually positive.
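Had the teams coordinated, the same full factorial analysis used earlier would
recover the positive main effect and the strong negative interaction:
```{r}
#| label: coordinated-analysis-sub-additive
#| warning: false
# Full factorial analysis on the sub-additive data
lm(outcome ~ monetary * non_monetary, data = coordinated_data_modified) %>%
  broom::tidy(conf.int = TRUE) %>%
  select(term, estimate, conf.low, conf.high, p.value) %>%
  knitr::kable(
    caption = "Coordinated Analysis (Sub-additive Interaction)",
    digits = 3
  )
```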
## Beyond the Simple Case: Real-World Complexities
This example, while illustrative, simplifies reality. Consider these additional
complexities:
- **Sequential Experiments and Carryover:** If Team A runs their experiment in
Q1 and Team B in Q2, Team A might incorrectly conclude that their effect decays
over time, when the apparent decay is actually driven by Team B's intervention
(a carryover effect). This is particularly damaging for ROI comparisons: Team
A's intervention might be superior long-term, but this would be masked.
- **Interference Beyond Shared Outcomes:** Even experiments targeting
different outcomes can interfere. For example, if both teams use [randomized
encouragement designs](https://book.martinez.fyi/iv.html) (even for
different actions), they are competing for limited user attention and
decision-making capacity. Encouraging action A can impact the likelihood of
action B, and vice versa. This creates subtle but important dependencies.
- **The Difficulty of Estimating Interactions:** As [Andrew Gelman points
out](https://statmodeling.stat.columbia.edu/2018/03/15/need16/), you may need
16 times the sample size to estimate an interaction as to estimate a main
effect. Trying to do too much at once can lead to noisy, inconclusive results.
Prioritize the most critical business questions, and be prepared to make
trade-offs. The sketch below shows where a factor like 16 comes from.
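A back-of-the-envelope calculation using the 2x2 design from this chapter: with
n units per cell, the interaction estimate (a difference of differences) has
twice the standard error of a main effect, so matching the main effect's
precision on an interaction half its size requires 4 squared, or 16, times the
sample.
```{r}
#| label: interaction-se
# Standard errors in a 2x2 design with n units per cell (sigma = 1)
se_main <- sqrt(1 / n) # contrast between two halves of the 4n units
se_interaction <- sqrt(4 / n) # difference of differences: twice as large
c(main = se_main, interaction = se_interaction)
```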
## Building a Coordination Framework
To address these challenges, organizations need a systematic approach to
experimental coordination. The following framework combines practical management
steps with institutional practices to ensure reliable, interpretable results:
### Pre-registration: The Foundation of Effective Coordination
Pre-registration involves publicly documenting your study's design, hypotheses,
and analysis plan before collecting or analyzing data. The practice has gained
significant traction across research fields for good reason. In medical
research, a significant milestone was the 2005 requirement by the International
Committee of Medical Journal Editors (ICMJE) that clinical trials be registered
[@zarin2005trial]. This landmark decision aimed to combat publication bias and
selective reporting of trial outcomes, ensuring that the full scope of research
efforts, regardless of the results, would be publicly accessible.
In social sciences, including psychology and economics, the pre-registration
movement gained momentum in the early 2010s, fueled by growing concerns about
replicability. The launch of registries like the [American Economic Association
(AEA) RCT Registry](https://www.aeaweb.org/news/rct-registry-over-1000) in 2013
played a crucial role in facilitating pre-registration specifically among
economists.
For large companies where multiple teams are continuously running experiments,
pre-registration serves as the cornerstone of coordination by:
1. **Creating Transparency**: A centralized pre-registration system acts as a
company-wide experiment registry, making all planned and ongoing experiments
visible to relevant stakeholders.
2. **Preventing Interference**: Teams can identify potential overlap or
interference before launching experiments, enabling proactive coordination (a
minimal registry sketch follows this list).
3. **Reducing Bias**: By documenting hypotheses and analysis plans beforehand,
teams are less likely to engage in selective reporting or post-hoc
rationalization.
4. **Improving Design Quality**: The act of pre-registering forces more
thoughtful experimental design and clarifies the key questions being
addressed.
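There is no single standard format for such a registry. The sketch below uses a
hypothetical schema (all field names and entries are invented) to show how even
a lightweight table makes overlap checks mechanical:
```{r}
#| label: registry-sketch
#| warning: false
# A hypothetical registry: one row per experiment
registry <- tibble::tribble(
  ~experiment_id, ~team, ~start_date, ~end_date, ~primary_outcome,
  "exp-001", "Team A", "2025-01-01", "2025-03-31", "conversion",
  "exp-002", "Team B", "2025-02-01", "2025-04-30", "conversion"
)
# Flag pairs of experiments that share an outcome and overlap in time
registry %>%
  inner_join(registry, by = "primary_outcome") %>%
  filter(
    experiment_id.x < experiment_id.y,
    start_date.x <= end_date.y,
    start_date.y <= end_date.x
  ) %>%
  select(experiment_id.x, experiment_id.y, primary_outcome)
```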
### Coordination Checklist
In addition to pre-registration, follow these key steps for effective
experimental coordination:
1. **Map the Experimental Landscape**: Before launching any experiment, use the
pre-registration system to identify potential interference with other
ongoing or planned tests.
2. **Consider All Interaction Types**: Think about both direct (shared outcome)
and indirect (competing for resources, attention, or carryover)
interactions.
3. **Embrace Factorial Designs**: Whenever feasible, use factorial designs to
disentangle main effects and interactions. This provides a much richer
understanding.
4. **Prioritize and Simplify**: Balance the desire for complete understanding
with the need for statistical power. Focus on the most important
interactions first.
5. **Document, Document, Document**: Clearly state all assumptions about
potential interference in your experimental design and analysis plan. This
promotes transparency and facilitates learning.
6. **Communicate**: If full coordination isn't possible, at least communicate
with other teams. Sharing experimental designs and timelines can help
mitigate some of the risks.
## Conclusion: The Business Value of Coordinated Experimentation
Coordinated experimentation isn't just a technical best practice—it delivers
tangible business value. By implementing the framework described in this
chapter, organizations can:
1. **Make Better-Informed Decisions**: Understanding true treatment effects and
interactions leads to more accurate ROI calculations and better
prioritization.
2. **Avoid Costly Mistakes**: Prevent the adoption of interventions that might
appear beneficial in isolation but have negative interactions with other
initiatives.
3. **Accelerate Learning**: A systematic approach to coordination builds
institutional knowledge more efficiently than siloed experimentation.
4. **Optimize Resource Allocation**: By understanding interaction effects,
companies can identify where complementary initiatives deliver outsized
returns or where efforts are redundant.
By embracing pre-registration and following a structured coordination framework,
business data scientists can avoid the pitfalls of uncoordinated
experimentation, leading to more reliable results, better decisions, and
ultimately, a greater impact on the business.