
Survival Analysis Step 2: Subgroup and Interaction
survival-analysis-step2.RmdWhat this step is for
This tutorial is the second survival-analysis step in
leo.ukb. After the main Cox model, the next practical
question is usually:
does the exposure-outcome association differ across groups, and if so, on what scale?
That question can lead to three related but different outputs:
- a formal multiplicative interaction test
- a subgroup table that describes the pattern of hazard ratios
- an additive-interaction summary for binary-binary contrasts
The key is not to mix them up. Step 2 is where we learn how to choose the right one and how to read the results without over-interpreting them.
Start from the data and analytic roles
We keep using survival::lung, but this time we build the
data frame around teaching roles rather than only model fitting.
lung_df <- dplyr::transmute(
survival::lung,
outcome = as.integer(status == 2),
outcome_censor = time / 365.25,
age = age,
age_high = factor(
ifelse(age >= 65, "Older", "Younger"),
levels = c("Younger", "Older")
),
sex = factor(sex, levels = c(1, 2), labels = c("Male", "Female")),
ecog_group3 = factor(
dplyr::case_when(
ph.ecog == 0 ~ "ECOG0",
ph.ecog == 1 ~ "ECOG1",
ph.ecog >= 2 ~ "ECOG2plus",
TRUE ~ NA_character_
),
levels = c("ECOG0", "ECOG1", "ECOG2plus")
),
ecog_binary = factor(
ifelse(ph.ecog >= 2, "Poor", "Good"),
levels = c("Good", "Poor")
)
)Before fitting anything, look at the structure the way you would in a real paper:
dplyr::glimpse(lung_df)
#> Rows: 228
#> Columns: 7
#> $ outcome <int> 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
#> $ outcome_censor <dbl> 0.8377823, 1.2457221, 2.7652293, 0.5749487, 2.4175222, …
#> $ age <dbl> 74, 68, 56, 57, 60, 74, 68, 71, 53, 61, 57, 68, 68, 60,…
#> $ age_high <fct> Older, Older, Younger, Younger, Younger, Older, Older, …
#> $ sex <fct> Male, Male, Male, Male, Male, Male, Female, Female, Mal…
#> $ ecog_group3 <fct> ECOG1, ECOG0, ECOG0, ECOG1, ECOG0, ECOG1, ECOG2plus, EC…
#> $ ecog_binary <fct> Good, Good, Good, Good, Good, Good, Poor, Poor, Good, P…| Variable | Type | Teaching role |
|---|---|---|
outcome |
Binary event indicator | Outcome event |
outcome_censor |
Follow-up time in years | Outcome time |
age |
Continuous exposure | Main exposure in multiplicative examples |
sex |
Binary modifier | Binary modifier for multiplicative interaction |
ecog_group3 |
3-level ECOG performance-status grouping | Multi-level modifier for subgroup display |
age_high |
Binary age grouping | Binary age variable for additive interaction |
ecog_binary |
Binary ECOG performance-status grouping | Binary ECOG modifier for additive interaction |
It is also worth checking the subgroup and joint-cell sizes before talking about effect modification.
knitr::kable(
dplyr::count(lung_df, ecog_group3, outcome),
caption = "Outcome counts across the multi-level subgroup"
)| ecog_group3 | outcome | n |
|---|---|---|
| ECOG0 | 0 | 26 |
| ECOG0 | 1 | 37 |
| ECOG1 | 0 | 31 |
| ECOG1 | 1 | 82 |
| ECOG2plus | 0 | 6 |
| ECOG2plus | 1 | 45 |
| NA | 1 | 1 |
knitr::kable(
dplyr::count(lung_df, age_high, ecog_binary),
caption = "Four joint cells for the additive-interaction example"
)| age_high | ecog_binary | n |
|---|---|---|
| Younger | Good | 107 |
| Younger | Poor | 20 |
| Younger | NA | 1 |
| Older | Good | 69 |
| Older | Poor | 31 |
The second table is especially important because additive interaction needs a binary-by-binary setup with all four cells present.
Choose the scale before fitting
The easiest way to avoid confusion is to decide what question you are asking before you pick the function.
decision_table <- data.frame(
Goal = c(
"Formal multiplicative interaction",
"Subgroup display",
"Additive interaction"
),
Exposure_types = c(
"Continuous, binary, or multi-level categorical",
"Continuous, binary, or categorical",
"Binary only"
),
Modifier_types = c(
"Continuous, binary, or multi-level categorical",
"Binary or multi-level categorical",
"Binary only"
),
Main_function = c(
"leo_cox_interaction()",
"leo_cox_subgroup()",
"leo_cox_add_interaction()"
),
Main_question = c(
"Does the association differ on the hazard-ratio scale?",
"What does the HR look like within each stratum?",
"Does the joint effect exceed what we expect under additivity?"
)
)
knitr::kable(
decision_table,
caption = "Which Step 2 tool matches which scientific question?"
)| Goal | Exposure_types | Modifier_types | Main_function | Main_question |
|---|---|---|---|---|
| Formal multiplicative interaction | Continuous, binary, or multi-level categorical | Continuous, binary, or multi-level categorical | leo_cox_interaction() | Does the association differ on the hazard-ratio scale? |
| Subgroup display | Continuous, binary, or categorical | Binary or multi-level categorical | leo_cox_subgroup() | What does the HR look like within each stratum? |
| Additive interaction | Binary only | Binary only | leo_cox_add_interaction() | Does the joint effect exceed what we expect under additivity? |
This table is the practical rule of thumb:
- multiplicative interaction is the most general option
- subgroup tables need a categorical modifier because they need explicit strata
- additive interaction is currently the most restrictive because
RERI,AP, andSrequire a binary-by-binary contrast
If a modifier is continuous, leo_cox_interaction() can
still test it formally, but leo_cox_subgroup() is only
appropriate after a clinically meaningful grouping has been defined.
Multiplicative interaction is the general formal test
In a Cox model, multiplicative interaction asks whether the exposure effect differs across levels of a modifier on the hazard-ratio scale.
Example 1: continuous exposure by binary modifier
Here we ask whether the age-outcome association differs by sex.
res_inter_sex <- leo_cox_interaction(
df = lung_df,
y_out = c("outcome", "outcome_censor"),
x_exp = "age",
x_inter = "sex",
x_cov = list(
"Crude" = NULL,
"Model A" = c("ecog_group3")
),
verbose = FALSE
)
res_inter_sex$result
#> Model Exposure Interaction N Case N Control N Person-years Interaction DF
#> 1 Crude age sex 227 164 63 190.3409 1
#> 2 Model A age sex 227 164 63 190.3409 1
#> P for interaction Exposure class Interaction class
#> 1 0.602 Continuous Binary
#> 2 0.180 Continuous BinaryHow to read this result:
-
Exposure class = Continuousmeansageis handled as a continuous exposure -
Interaction class = Binarymeanssexis treated as a binary modifier -
P for interactionis the formal heterogeneity test
In this teaching example, the adjusted P for interaction
is 0.18, so there is no clear evidence that the age-outcome
association differs by sex on the multiplicative hazard-ratio scale.
That is already a useful result. A non-significant interaction does not mean “nothing happened”; it means the data do not strongly support effect modification on this scale.
Example 2: continuous exposure by multi-level categorical modifier
The same function also works for a multi-level subgroup variable.
res_inter_ecog <- leo_cox_interaction(
df = lung_df,
y_out = c("outcome", "outcome_censor"),
x_exp = "age",
x_inter = "ecog_group3",
x_cov = list("Crude" = NULL),
verbose = FALSE
)
res_inter_ecog$result
#> Model Exposure Interaction N Case N Control N Person-years Interaction DF
#> 1 Crude age ecog_group3 227 164 63 190.3409 2
#> P for interaction Exposure class Interaction class
#> 1 0.037 Continuous Categorical (3 levels)Here the modifier is a 3-level categorical variable, and the
interaction test returns P for interaction = 0.037.
That means the age-outcome association appears to differ across ECOG groups on the multiplicative hazard-ratio scale. Because the modifier is categorical, a subgroup table is now a natural follow-up.
Subgroup analysis describes the pattern after the formal test
leo_cox_subgroup() is not a replacement for the
interaction test. It is the display layer that helps you describe what
the pattern looks like once a categorical modifier has been chosen.
res_sub_ecog <- leo_cox_subgroup(
df = lung_df,
y_out = c("outcome", "outcome_censor"),
x_exp = "age",
x_subgroup = "ecog_group3",
x_cov = list("Crude" = NULL),
verbose = FALSE
)
res_sub_ecog$result
#> Subgroup Level N Model Exposure Outcome Exposure level Case N
#> 1 ecog_group3 ECOG0 63 Crude age outcome Per unit increase 37
#> 2 ecog_group3 ECOG1 113 Crude age outcome Per unit increase 82
#> 3 ecog_group3 ECOG2plus 51 Crude age outcome Per unit increase 45
#> Control N Person-years HR 95% CI P value P for interaction
#> 1 26 60.69268 1.059 1.013, 1.107 0.011 0.037
#> 2 31 97.28131 1.004 0.980, 1.029 0.731 0.037
#> 3 6 32.36687 0.994 0.958, 1.030 0.733 0.037This subgroup table tells a more descriptive story:
- in
ECOG0, the HR per one-year increase in age is1.059 - in
ECOG1, the HR is close to null at1.004 - in
ECOG2plus, the HR is also close to null at0.994 - the matching
P for interaction = 0.037reminds us that the subgroup pattern should be interpreted through the formal interaction test
In manuscript language, the safest interpretation is:
There was evidence that the age-outcome association differed by ECOG performance status on the multiplicative scale. The age-associated hazard increase appeared stronger in
ECOG0than inECOG1orECOG2plus.
This is much better than writing “one subgroup was significant and the others were not”, which is not a proper interaction argument.
Additive interaction answers a different question
Multiplicative and additive interaction are not the same. Additive interaction asks whether the joint effect of two binary factors is more or less than what we would expect from adding their separate effects.
That is why the current teaching workflow in leo.ukb is
binary-by-binary.
Here we use:
-
ecog_binary: poorer versus better ECOG performance status -
age_high: older versus younger age
The binary age grouping is only for teaching additive interaction. In a real analysis, the cutpoint should be clinically justified in advance rather than chosen after looking at the results.
res_add <- leo_cox_add_interaction(
df = lung_df,
y_out = c("outcome", "outcome_censor"),
x_exp = "ecog_binary",
x_inter = "age_high",
x_cov = list("Crude" = NULL),
verbose = FALSE
)
res_add$result
#> Model Exposure Interaction N Case N Control N Person-years
#> 1 Crude ecog_binary age_high 227 164 63 190.3409
#> Reference group Recode applied Multiplicative interaction
#> 1 ecog_binary=Good, age_high=Younger FALSE 0.454
#> Multiplicative 95% CI P for interaction RERI RERI 95% CI AP
#> 1 0.226, 0.911 0.029 -1.475 -3.199, 0.249 -0.739
#> AP 95% CI S S 95% CI
#> 1 -1.723, 0.245 0.403 0.154, 1.056This output should be read in layers.
First, the setup columns:
-
Reference grouptells you which joint category is treated as the baseline -
Recode appliedtells you whether the function had to recode the binary variables internally so the reference structure was interpretable
Then the scale-specific measures:
-
Multiplicative interaction: interaction on the hazard-ratio scale -
RERI: relative excess risk due to interaction -
AP: attributable proportion due to interaction -
S: synergy index
The null values differ by measure:
- multiplicative interaction:
1 -
RERI:0 -
AP:0 -
S:1
In this teaching example:
- the multiplicative interaction estimate is
0.454withP for interaction = 0.029 -
RERI = -1.475andAP = -0.739both point toward negative interaction on the additive scale - but the confidence intervals for
RERI,AP, andSall include their null values
The first bullet is easy to misread unless we spell out what the
number is. Multiplicative interaction is not the
joint-group hazard ratio itself. It is:
HR11 / (HR10 × HR01)
So a value below 1 does not mean the joint group is
automatically protective or that the joint-group hazard ratio must be
below 1. It means:
the observed joint effect is smaller than what we would expect if the two separate hazard ratios combined by simple multiplication.
For a concrete example:
- if
ecog_binaryalone hadHR = 2 - and
age_highalone hadHR = 3 - then under no multiplicative interaction the joint-group HR would be
expected to be
2 × 3 = 6 - if the multiplicative interaction term were
0.454, the joint-group HR would instead be about2 × 3 × 0.454 = 2.724
That joint-group HR is still above 1; it is just weaker
than the pure multiplicative expectation. In other words, this points to
negative interaction, or antagonism, on the multiplicative scale.
So the most careful interpretation is:
The binary contrast showed evidence of interaction on the multiplicative scale, but additive-interaction estimates were imprecise and did not provide clear evidence on the additive scale.
This is a good teaching result because it shows that the answer can depend on the scale. A significant multiplicative interaction does not guarantee a clear additive interaction, and vice versa.
How to interpret common result patterns
The practical reading rules are:
-
Multiplicative interaction supported, additive interaction not supported
State that effect modification is supported on the hazard-ratio scale, while additive-interaction evidence remains uncertain.
-
Additive interaction supported, multiplicative interaction not supported
State that the joint effect exceeds additivity, even though the hazard-ratio scale does not show clear multiplicative deviation.
-
Both supported
State explicitly that the association differs on both scales.
-
Neither supported
State that there is no clear evidence of effect modification in the current data.
And one rule should always stay in place:
“one subgroup was significant and another was not” is not itself evidence of interaction.
How to write Step 2 results in a paper
Here are manuscript-style templates that match the outputs above.
For a null multiplicative interaction:
There was no clear evidence that the association between age and the event differed by sex after adjustment (
P for interaction = 0.18).
For a supported multiplicative interaction followed by a subgroup table:
The association between age and the event differed by ECOG performance status (
P for interaction = 0.037). In subgroup analyses, the hazard ratio per one-year increase in age was1.059(95% CI 1.013, 1.107) inECOG0, but was close to null inECOG1andECOG2plus.
For additive interaction:
For the binary contrast between poor performance status and older age, the multiplicative interaction term was
0.454(95% CI 0.226, 0.911), indicating that the joint effect was smaller than expected under pure multiplication of the two separate hazard ratios; this does not itself mean that the joint-group HR was below1. Additive-interaction estimates remained imprecise (RERI -1.475,95% CI -3.199, 0.249).
This wording keeps the scale explicit and avoids over-claiming.
What to take away from Step 2
Step 2 should leave you with five practical rules:
- multiplicative interaction is the most general formal test in the Cox workflow
- subgroup tables are descriptive and work best when the modifier is categorical
- additive interaction currently belongs to binary-by-binary contrasts
- the same data can look different on multiplicative and additive scales
- interpretation should follow the scale actually tested, not only the
raw subgroup
P values
Step 3 then asks a different question:
could part of the exposure-outcome association operate through a mediator?