Skip to contents

What this step is for

This tutorial is the second survival-analysis step in leo.ukb. After the main Cox model, the next practical question is usually:

does the exposure-outcome association differ across groups, and if so, on what scale?

That question can lead to three related but different outputs:

  1. a formal multiplicative interaction test
  2. a subgroup table that describes the pattern of hazard ratios
  3. an additive-interaction summary for binary-binary contrasts

The key is not to mix them up. Step 2 is where we learn how to choose the right one and how to read the results without over-interpreting them.

Start from the data and analytic roles

We keep using survival::lung, but this time we build the data frame around teaching roles rather than only model fitting.

lung_df <- dplyr::transmute(
  survival::lung,
  outcome = as.integer(status == 2),
  outcome_censor = time / 365.25,
  age = age,
  age_high = factor(
    ifelse(age >= 65, "Older", "Younger"),
    levels = c("Younger", "Older")
  ),
  sex = factor(sex, levels = c(1, 2), labels = c("Male", "Female")),
  ecog_group3 = factor(
    dplyr::case_when(
      ph.ecog == 0 ~ "ECOG0",
      ph.ecog == 1 ~ "ECOG1",
      ph.ecog >= 2 ~ "ECOG2plus",
      TRUE ~ NA_character_
    ),
    levels = c("ECOG0", "ECOG1", "ECOG2plus")
  ),
  ecog_binary = factor(
    ifelse(ph.ecog >= 2, "Poor", "Good"),
    levels = c("Good", "Poor")
  )
)

Before fitting anything, look at the structure the way you would in a real paper:

dplyr::glimpse(lung_df)
#> Rows: 228
#> Columns: 7
#> $ outcome        <int> 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1…
#> $ outcome_censor <dbl> 0.8377823, 1.2457221, 2.7652293, 0.5749487, 2.4175222, 
#> $ age            <dbl> 74, 68, 56, 57, 60, 74, 68, 71, 53, 61, 57, 68, 68, 60,
#> $ age_high       <fct> Older, Older, Younger, Younger, Younger, Older, Older, 
#> $ sex            <fct> Male, Male, Male, Male, Male, Male, Female, Female, Mal…
#> $ ecog_group3    <fct> ECOG1, ECOG0, ECOG0, ECOG1, ECOG0, ECOG1, ECOG2plus, EC…
#> $ ecog_binary    <fct> Good, Good, Good, Good, Good, Good, Poor, Poor, Good, P…
Analytic roles in the Step 2 teaching data frame
Variable Type Teaching role
outcome Binary event indicator Outcome event
outcome_censor Follow-up time in years Outcome time
age Continuous exposure Main exposure in multiplicative examples
sex Binary modifier Binary modifier for multiplicative interaction
ecog_group3 3-level ECOG performance-status grouping Multi-level modifier for subgroup display
age_high Binary age grouping Binary age variable for additive interaction
ecog_binary Binary ECOG performance-status grouping Binary ECOG modifier for additive interaction

It is also worth checking the subgroup and joint-cell sizes before talking about effect modification.

knitr::kable(
  dplyr::count(lung_df, ecog_group3, outcome),
  caption = "Outcome counts across the multi-level subgroup"
)
Outcome counts across the multi-level subgroup
ecog_group3 outcome n
ECOG0 0 26
ECOG0 1 37
ECOG1 0 31
ECOG1 1 82
ECOG2plus 0 6
ECOG2plus 1 45
NA 1 1
knitr::kable(
  dplyr::count(lung_df, age_high, ecog_binary),
  caption = "Four joint cells for the additive-interaction example"
)
Four joint cells for the additive-interaction example
age_high ecog_binary n
Younger Good 107
Younger Poor 20
Younger NA 1
Older Good 69
Older Poor 31

The second table is especially important because additive interaction needs a binary-by-binary setup with all four cells present.

Choose the scale before fitting

The easiest way to avoid confusion is to decide what question you are asking before you pick the function.

decision_table <- data.frame(
  Goal = c(
    "Formal multiplicative interaction",
    "Subgroup display",
    "Additive interaction"
  ),
  Exposure_types = c(
    "Continuous, binary, or multi-level categorical",
    "Continuous, binary, or categorical",
    "Binary only"
  ),
  Modifier_types = c(
    "Continuous, binary, or multi-level categorical",
    "Binary or multi-level categorical",
    "Binary only"
  ),
  Main_function = c(
    "leo_cox_interaction()",
    "leo_cox_subgroup()",
    "leo_cox_add_interaction()"
  ),
  Main_question = c(
    "Does the association differ on the hazard-ratio scale?",
    "What does the HR look like within each stratum?",
    "Does the joint effect exceed what we expect under additivity?"
  )
)

knitr::kable(
  decision_table,
  caption = "Which Step 2 tool matches which scientific question?"
)
Which Step 2 tool matches which scientific question?
Goal Exposure_types Modifier_types Main_function Main_question
Formal multiplicative interaction Continuous, binary, or multi-level categorical Continuous, binary, or multi-level categorical leo_cox_interaction() Does the association differ on the hazard-ratio scale?
Subgroup display Continuous, binary, or categorical Binary or multi-level categorical leo_cox_subgroup() What does the HR look like within each stratum?
Additive interaction Binary only Binary only leo_cox_add_interaction() Does the joint effect exceed what we expect under additivity?

This table is the practical rule of thumb:

  • multiplicative interaction is the most general option
  • subgroup tables need a categorical modifier because they need explicit strata
  • additive interaction is currently the most restrictive because RERI, AP, and S require a binary-by-binary contrast

If a modifier is continuous, leo_cox_interaction() can still test it formally, but leo_cox_subgroup() is only appropriate after a clinically meaningful grouping has been defined.

Multiplicative interaction is the general formal test

In a Cox model, multiplicative interaction asks whether the exposure effect differs across levels of a modifier on the hazard-ratio scale.

Example 1: continuous exposure by binary modifier

Here we ask whether the age-outcome association differs by sex.

res_inter_sex <- leo_cox_interaction(
  df = lung_df,
  y_out = c("outcome", "outcome_censor"),
  x_exp = "age",
  x_inter = "sex",
  x_cov = list(
    "Crude" = NULL,
    "Model A" = c("ecog_group3")
  ),
  verbose = FALSE
)

res_inter_sex$result
#>     Model Exposure Interaction   N Case N Control N Person-years Interaction DF
#> 1   Crude      age         sex 227    164        63     190.3409              1
#> 2 Model A      age         sex 227    164        63     190.3409              1
#>   P for interaction Exposure class Interaction class
#> 1             0.602     Continuous            Binary
#> 2             0.180     Continuous            Binary

How to read this result:

  • Exposure class = Continuous means age is handled as a continuous exposure
  • Interaction class = Binary means sex is treated as a binary modifier
  • P for interaction is the formal heterogeneity test

In this teaching example, the adjusted P for interaction is 0.18, so there is no clear evidence that the age-outcome association differs by sex on the multiplicative hazard-ratio scale.

That is already a useful result. A non-significant interaction does not mean “nothing happened”; it means the data do not strongly support effect modification on this scale.

Example 2: continuous exposure by multi-level categorical modifier

The same function also works for a multi-level subgroup variable.

res_inter_ecog <- leo_cox_interaction(
  df = lung_df,
  y_out = c("outcome", "outcome_censor"),
  x_exp = "age",
  x_inter = "ecog_group3",
  x_cov = list("Crude" = NULL),
  verbose = FALSE
)

res_inter_ecog$result
#>   Model Exposure Interaction   N Case N Control N Person-years Interaction DF
#> 1 Crude      age ecog_group3 227    164        63     190.3409              2
#>   P for interaction Exposure class      Interaction class
#> 1             0.037     Continuous Categorical (3 levels)

Here the modifier is a 3-level categorical variable, and the interaction test returns P for interaction = 0.037.

That means the age-outcome association appears to differ across ECOG groups on the multiplicative hazard-ratio scale. Because the modifier is categorical, a subgroup table is now a natural follow-up.

Subgroup analysis describes the pattern after the formal test

leo_cox_subgroup() is not a replacement for the interaction test. It is the display layer that helps you describe what the pattern looks like once a categorical modifier has been chosen.

res_sub_ecog <- leo_cox_subgroup(
  df = lung_df,
  y_out = c("outcome", "outcome_censor"),
  x_exp = "age",
  x_subgroup = "ecog_group3",
  x_cov = list("Crude" = NULL),
  verbose = FALSE
)

res_sub_ecog$result
#>      Subgroup     Level   N Model Exposure Outcome    Exposure level Case N
#> 1 ecog_group3     ECOG0  63 Crude      age outcome Per unit increase     37
#> 2 ecog_group3     ECOG1 113 Crude      age outcome Per unit increase     82
#> 3 ecog_group3 ECOG2plus  51 Crude      age outcome Per unit increase     45
#>   Control N Person-years    HR       95% CI P value P for interaction
#> 1        26     60.69268 1.059 1.013, 1.107   0.011             0.037
#> 2        31     97.28131 1.004 0.980, 1.029   0.731             0.037
#> 3         6     32.36687 0.994 0.958, 1.030   0.733             0.037

This subgroup table tells a more descriptive story:

  • in ECOG0, the HR per one-year increase in age is 1.059
  • in ECOG1, the HR is close to null at 1.004
  • in ECOG2plus, the HR is also close to null at 0.994
  • the matching P for interaction = 0.037 reminds us that the subgroup pattern should be interpreted through the formal interaction test

In manuscript language, the safest interpretation is:

There was evidence that the age-outcome association differed by ECOG performance status on the multiplicative scale. The age-associated hazard increase appeared stronger in ECOG0 than in ECOG1 or ECOG2plus.

This is much better than writing “one subgroup was significant and the others were not”, which is not a proper interaction argument.

Additive interaction answers a different question

Multiplicative and additive interaction are not the same. Additive interaction asks whether the joint effect of two binary factors is more or less than what we would expect from adding their separate effects.

That is why the current teaching workflow in leo.ukb is binary-by-binary.

Here we use:

  • ecog_binary: poorer versus better ECOG performance status
  • age_high: older versus younger age

The binary age grouping is only for teaching additive interaction. In a real analysis, the cutpoint should be clinically justified in advance rather than chosen after looking at the results.

res_add <- leo_cox_add_interaction(
  df = lung_df,
  y_out = c("outcome", "outcome_censor"),
  x_exp = "ecog_binary",
  x_inter = "age_high",
  x_cov = list("Crude" = NULL),
  verbose = FALSE
)

res_add$result
#>   Model    Exposure Interaction   N Case N Control N Person-years
#> 1 Crude ecog_binary    age_high 227    164        63     190.3409
#>                      Reference group Recode applied Multiplicative interaction
#> 1 ecog_binary=Good, age_high=Younger          FALSE                      0.454
#>   Multiplicative 95% CI P for interaction   RERI   RERI 95% CI     AP
#> 1          0.226, 0.911             0.029 -1.475 -3.199, 0.249 -0.739
#>       AP 95% CI     S     S 95% CI
#> 1 -1.723, 0.245 0.403 0.154, 1.056

This output should be read in layers.

First, the setup columns:

  • Reference group tells you which joint category is treated as the baseline
  • Recode applied tells you whether the function had to recode the binary variables internally so the reference structure was interpretable

Then the scale-specific measures:

  • Multiplicative interaction: interaction on the hazard-ratio scale
  • RERI: relative excess risk due to interaction
  • AP: attributable proportion due to interaction
  • S: synergy index

The null values differ by measure:

  • multiplicative interaction: 1
  • RERI: 0
  • AP: 0
  • S: 1

In this teaching example:

  • the multiplicative interaction estimate is 0.454 with P for interaction = 0.029
  • RERI = -1.475 and AP = -0.739 both point toward negative interaction on the additive scale
  • but the confidence intervals for RERI, AP, and S all include their null values

The first bullet is easy to misread unless we spell out what the number is. Multiplicative interaction is not the joint-group hazard ratio itself. It is:

HR11 / (HR10 × HR01)

So a value below 1 does not mean the joint group is automatically protective or that the joint-group hazard ratio must be below 1. It means:

the observed joint effect is smaller than what we would expect if the two separate hazard ratios combined by simple multiplication.

For a concrete example:

  • if ecog_binary alone had HR = 2
  • and age_high alone had HR = 3
  • then under no multiplicative interaction the joint-group HR would be expected to be 2 × 3 = 6
  • if the multiplicative interaction term were 0.454, the joint-group HR would instead be about 2 × 3 × 0.454 = 2.724

That joint-group HR is still above 1; it is just weaker than the pure multiplicative expectation. In other words, this points to negative interaction, or antagonism, on the multiplicative scale.

So the most careful interpretation is:

The binary contrast showed evidence of interaction on the multiplicative scale, but additive-interaction estimates were imprecise and did not provide clear evidence on the additive scale.

This is a good teaching result because it shows that the answer can depend on the scale. A significant multiplicative interaction does not guarantee a clear additive interaction, and vice versa.

How to interpret common result patterns

The practical reading rules are:

  1. Multiplicative interaction supported, additive interaction not supported

    State that effect modification is supported on the hazard-ratio scale, while additive-interaction evidence remains uncertain.

  2. Additive interaction supported, multiplicative interaction not supported

    State that the joint effect exceeds additivity, even though the hazard-ratio scale does not show clear multiplicative deviation.

  3. Both supported

    State explicitly that the association differs on both scales.

  4. Neither supported

    State that there is no clear evidence of effect modification in the current data.

And one rule should always stay in place:

“one subgroup was significant and another was not” is not itself evidence of interaction.

How to write Step 2 results in a paper

Here are manuscript-style templates that match the outputs above.

For a null multiplicative interaction:

There was no clear evidence that the association between age and the event differed by sex after adjustment (P for interaction = 0.18).

For a supported multiplicative interaction followed by a subgroup table:

The association between age and the event differed by ECOG performance status (P for interaction = 0.037). In subgroup analyses, the hazard ratio per one-year increase in age was 1.059 (95% CI 1.013, 1.107) in ECOG0, but was close to null in ECOG1 and ECOG2plus.

For additive interaction:

For the binary contrast between poor performance status and older age, the multiplicative interaction term was 0.454 (95% CI 0.226, 0.911), indicating that the joint effect was smaller than expected under pure multiplication of the two separate hazard ratios; this does not itself mean that the joint-group HR was below 1. Additive-interaction estimates remained imprecise (RERI -1.475, 95% CI -3.199, 0.249).

This wording keeps the scale explicit and avoids over-claiming.

What to take away from Step 2

Step 2 should leave you with five practical rules:

  1. multiplicative interaction is the most general formal test in the Cox workflow
  2. subgroup tables are descriptive and work best when the modifier is categorical
  3. additive interaction currently belongs to binary-by-binary contrasts
  4. the same data can look different on multiplicative and additive scales
  5. interpretation should follow the scale actually tested, not only the raw subgroup P values

Step 3 then asks a different question:

could part of the exposure-outcome association operate through a mediator?