Skip to contents

What this step is for

This tutorial is the Step 3.0 survival-analysis article in leo.ukb. After a main Cox model and any subgroup or interaction work, a different question may arise:

does part of the exposure-outcome association operate through a mediator?

That is a mediation question, not a subgroup question and not an interaction question.

First separate mediation from interaction

The current survival-analysis workflow in leo.ukb answers three different questions:

  1. Step 1: is there a main exposure-outcome association?
  2. Step 2: does that association differ across groups, and on which scale?
  3. Step 3.0: is part of that association transmitted through a mediator?

This distinction matters because:

  • interaction is about effect heterogeneity
  • mediation is about a pathway decomposition

In other words, mediation does not replace Step 1 or Step 2. It builds on them.

Start from the data and analytic roles

We use a synthetic example with a rare event, a binary exposure, a binary mediator, and several covariates.

set.seed(20260323)
n <- 500
age <- rnorm(n, 60, 8)
bmi <- rnorm(n, 26, 4)
smoking <- factor(sample(c("Never", "Ever"), n, replace = TRUE))
exposure_num <- rbinom(n, 1, plogis(-0.2 + 0.02 * (age - 60)))
mediator_num <- rbinom(n, 1, plogis(-0.5 + 0.9 * exposure_num + 0.02 * (bmi - 26)))
hazard_rate <- exp(
  -5.8 + 0.55 * exposure_num + 0.40 * mediator_num +
    0.02 * (age - 60) + 0.03 * (bmi - 26) + 0.20 * (smoking == "Ever")
)
event_time <- rexp(n, rate = hazard_rate)
censor_time <- rexp(n, rate = 0.08)

med_df <- data.frame(
  outcome = as.integer(event_time <= censor_time),
  outcome_censor = pmax(pmin(event_time, censor_time), 0.2),
  exposure = factor(exposure_num, levels = c(0, 1), labels = c("Low risk", "High risk")),
  mediator = factor(
    mediator_num,
    levels = c(0, 1),
    labels = c("Low inflammation", "High inflammation")
  ),
  age = age,
  bmi = bmi,
  smoking = smoking
)

Before fitting anything, look at the analytic roles explicitly.

dplyr::glimpse(med_df)
#> Rows: 500
#> Columns: 7
#> $ outcome        <int> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0…
#> $ outcome_censor <dbl> 2.6117515, 6.4186284, 8.6449270, 5.6413310, 12.1900483,
#> $ exposure       <fct> High risk, High risk, Low risk, High risk, Low risk, Lo…
#> $ mediator       <fct> Low inflammation, High inflammation, Low inflammation, 
#> $ age            <dbl> 59.73588, 59.19772, 45.83227, 57.49797, 46.83760, 55.76…
#> $ bmi            <dbl> 27.17832, 27.91178, 30.48299, 24.70933, 22.63096, 28.43…
#> $ smoking        <fct> Never, Never, Never, Never, Never, Ever, Ever, Ever, Ev…
Analytic roles in the Step 3 teaching data frame
Variable Type Step 3 role
outcome Binary event indicator Incident outcome event
outcome_censor Follow-up time Incident outcome follow-up time
exposure Binary exposure Exposure to be decomposed
mediator Binary mediator Candidate pathway variable
age Continuous covariate Adjustment covariate
bmi Continuous covariate Adjustment covariate
smoking Binary covariate Example covariate that requires explicit x_cov_cond

Inspect the analysis-ready data before fitting

As in Step 1 and Step 2, it helps to inspect the analysis-ready cohort before moving to the model.

knitr::kable(utils::head(med_df, 6))
outcome outcome_censor exposure mediator age bmi smoking
0 2.611752 High risk Low inflammation 59.73588 27.17832 Never
0 6.418628 High risk High inflammation 59.19772 27.91178 Never
0 8.644927 Low risk Low inflammation 45.83227 30.48299 Never
0 5.641331 High risk Low inflammation 57.49797 24.70933 Never
0 12.190048 Low risk Low inflammation 46.83760 22.63096 Never
0 20.568418 Low risk Low inflammation 55.76899 28.43722 Ever
event_summary <- data.frame(
  Metric = c("Participants", "Events", "Event rate"),
  Value = c(
    nrow(med_df),
    sum(med_df$outcome),
    sprintf("%.1f%%", 100 * mean(med_df$outcome))
  )
)

knitr::kable(
  event_summary,
  caption = "Overall event burden in the mediation teaching data"
)
Overall event burden in the mediation teaching data
Metric Value
Participants 500
Events 30
Event rate 6.0%
knitr::kable(
  dplyr::count(med_df, exposure, mediator),
  caption = "Exposure-mediator joint counts"
)
Exposure-mediator joint counts
exposure mediator n
Low risk Low inflammation 163
Low risk High inflammation 105
High risk Low inflammation 102
High risk High inflammation 130

These simple checks already answer three important questions:

  • is the event rate low enough that yreg = "auto" will reasonably choose the Cox path?
  • does the mediator distribution differ across exposure groups in a direction consistent with a possible pathway?
  • do the covariates look interpretable enough to define a reproducible evaluation setting later?

Choose the mediation setup before fitting

Mediation analysis is easier to defend when the design choices are explicit before the model is run.

setup_table <- data.frame(
  Component = c(
    "Exposure contrast",
    "Mediator type",
    "Outcome model",
    "Covariate evaluation profile"
  ),
  Current_rule = c(
    "Binary factor by default, or numeric with explicit x_exp_a0 and x_exp_a1",
    "Binary mediator -> logistic; continuous/score-like mediator -> linear",
    "yreg = 'auto' uses survCox when event rate <= 10%, otherwise survAFT_weibull",
    "x_cov_cond should be explicit when x_cov contains non-numeric or binary numeric variables"
  ),
  Why_it_matters = c(
    "All direct and indirect effects are defined relative to this comparison",
    "The mediator model determines how the pathway is represented",
    "The final ratio scale depends on the actual outcome model",
    "The decomposition is always evaluated at a specific covariate profile"
  )
)

knitr::kable(
  setup_table,
  caption = "Choose the mediation setup before fitting"
)
Choose the mediation setup before fitting
Component Current_rule Why_it_matters
Exposure contrast Binary factor by default, or numeric with explicit x_exp_a0 and x_exp_a1 All direct and indirect effects are defined relative to this comparison
Mediator type Binary mediator -> logistic; continuous/score-like mediator -> linear The mediator model determines how the pathway is represented
Outcome model yreg = ‘auto’ uses survCox when event rate <= 10%, otherwise survAFT_weibull The final ratio scale depends on the actual outcome model
Covariate evaluation profile x_cov_cond should be explicit when x_cov contains non-numeric or binary numeric variables The decomposition is always evaluated at a specific covariate profile

Know the method boundary before fitting

leo_cox_mediation() currently wraps regmedint::regmedint() and keeps two model settings parallel:

  • mreg controls the mediator model, i.e. x_med ~ x_exp + x_cov
  • yreg controls the outcome model, i.e. time ~ x_exp + x_med + x_cov
  • when interaction = TRUE, the outcome model becomes time ~ x_exp + x_med + x_exp:x_med + x_cov

The current wrapper is most appropriate when:

  • the outcome is time-to-event
  • the exposure is binary, or numeric with explicit x_exp_a0 and x_exp_a1
  • the mediator is binary or continuous
  • the temporal ordering is scientifically defensible

Two boundaries are especially important in UK Biobank-style analyses that rely heavily on baseline measurements:

  1. if the final yreg (the outcome-model setting here) is survCox, the decomposition relies on the rare-event approximation; in plain terms, this means the split is easier to defend as an approximate result when events are relatively rare, rather than as an exact estimate
  2. if exposure and mediator are measured at the same baseline visit, the data usually cannot show that exposure came before mediator, so the result is better written as decomposition evidence that is consistent with an assumed pathway; without external evidence on temporal ordering, it should not be written up as a strong causal claim

Fit the simplest mediation workflow

As in the Cox tutorials, define the adjustment hierarchy first.

med_models <- list(
  "Model A" = c("age"),
  "Model B" = c("age", "bmi")
)

res_med <- leo_cox_mediation(
  df = med_df,
  y_out = c("outcome", "outcome_censor"),
  x_exp = "exposure",
  x_med = "mediator",
  x_cov = med_models,
  verbose = FALSE
)

res_med$result
#>      Model Effect code                        Effect        Scale Exposure
#> 1  Model A         cde      Controlled direct effect Hazard ratio exposure
#> 2  Model A        pnde    Pure natural direct effect Hazard ratio exposure
#> 3  Model A        tnie Total natural indirect effect Hazard ratio exposure
#> 4  Model A        tnde   Total natural direct effect Hazard ratio exposure
#> 5  Model A        pnie  Pure natural indirect effect Hazard ratio exposure
#> 6  Model A          te                  Total effect Hazard ratio exposure
#> 7  Model A          pm           Proportion mediated   Proportion exposure
#> 8  Model B         cde      Controlled direct effect Hazard ratio exposure
#> 9  Model B        pnde    Pure natural direct effect Hazard ratio exposure
#> 10 Model B        tnie Total natural indirect effect Hazard ratio exposure
#> 11 Model B        tnde   Total natural direct effect Hazard ratio exposure
#> 12 Model B        pnie  Pure natural indirect effect Hazard ratio exposure
#> 13 Model B          te                  Total effect Hazard ratio exposure
#> 14 Model B          pm           Proportion mediated   Proportion exposure
#>    Mediator Outcome     Exposure contrast Mediator reference   N Case N
#> 1  mediator outcome Low risk -> High risk  High inflammation 500     30
#> 2  mediator outcome Low risk -> High risk               <NA> 500     30
#> 3  mediator outcome Low risk -> High risk               <NA> 500     30
#> 4  mediator outcome Low risk -> High risk               <NA> 500     30
#> 5  mediator outcome Low risk -> High risk               <NA> 500     30
#> 6  mediator outcome Low risk -> High risk               <NA> 500     30
#> 7  mediator outcome Low risk -> High risk               <NA> 500     30
#> 8  mediator outcome Low risk -> High risk  High inflammation 500     30
#> 9  mediator outcome Low risk -> High risk               <NA> 500     30
#> 10 mediator outcome Low risk -> High risk               <NA> 500     30
#> 11 mediator outcome Low risk -> High risk               <NA> 500     30
#> 12 mediator outcome Low risk -> High risk               <NA> 500     30
#> 13 mediator outcome Low risk -> High risk               <NA> 500     30
#> 14 mediator outcome Low risk -> High risk               <NA> 500     30
#>    Non-event N Total follow-up Estimate        95% CI P value Mediator model
#> 1          470        5470.759    1.872  0.639, 5.479   0.253       logistic
#> 2          470        5470.759    1.443  0.690, 3.018   0.329       logistic
#> 3          470        5470.759    1.058  0.891, 1.258   0.518       logistic
#> 4          470        5470.759    1.557  0.738, 3.281   0.245       logistic
#> 5          470        5470.759    0.981  0.817, 1.179   0.842       logistic
#> 6          470        5470.759    1.528  0.743, 3.140   0.249       logistic
#> 7          470        5470.759    0.160 -0.356, 0.676   0.543       logistic
#> 8          470        5470.759    1.938  0.661, 5.678   0.228       logistic
#> 9          470        5470.759    1.470  0.698, 3.096   0.311       logistic
#> 10         470        5470.759    1.054  0.870, 1.276   0.593       logistic
#> 11         470        5470.759    1.600  0.761, 3.366   0.215       logistic
#> 12         470        5470.759    0.968  0.791, 1.185   0.753       logistic
#> 13         470        5470.759    1.549  0.753, 3.185   0.234       logistic
#> 14         470        5470.759    0.144 -0.402, 0.689   0.606       logistic
#>    Outcome model
#> 1        survCox
#> 2        survCox
#> 3        survCox
#> 4        survCox
#> 5        survCox
#> 6        survCox
#> 7        survCox
#> 8        survCox
#> 9        survCox
#> 10       survCox
#> 11       survCox
#> 12       survCox
#> 13       survCox
#> 14       survCox

How to read the result table

The mediation output is easier to read if you separate the columns into roles:

  • Effect code and Effect identify which decomposition quantity is being shown
  • Scale reminds you that pm is a proportion, whereas the other rows are ratio measures
  • Exposure contrast shows the x_exp_a0 -> x_exp_a1 comparison
  • Mediator reference is populated for the cde row and echoed in $evaluation
  • Mediator model and Outcome model show the actual mreg and yreg selected

The scientific point is simple:

mediation estimates are not free-floating numbers; they are conditional on a chosen exposure contrast, mediator reference, covariate profile, and model pair

Teach the core settings in leo.ukb language

regmedint uses a compact set of evaluation settings and model settings. leo_cox_mediation() keeps the same scientific meaning while presenting them in more readable leo.ukb language.

Question leo.ukb parameter What it means in practice
Which exposure contrast is being compared? x_exp_a0, x_exp_a1 The exposure comparison used to define the mediation effect, such as Low risk -> High risk or 8.0 -> 9.0.
At what mediator value is the controlled direct effect evaluated? x_med_cde The mediator reference value used for cde.
Under what covariate profile are the effects evaluated? x_cov_cond The covariate setting at which the effects are computed.
How is the mediator modeled? mreg The mediator model for x_med ~ x_exp + x_cov.
How is the survival outcome modeled? yreg The outcome model for time ~ x_exp + x_med + x_cov.
Should the outcome model include x_exp:x_med? interaction If TRUE, the outcome model allows the effect of x_exp to vary with x_med.

In other words:

  • x_exp_a0, x_exp_a1, x_med_cde, and x_cov_cond define where the mediation effects are evaluated
  • mreg, yreg, and interaction define which models generate that decomposition

How mreg = "auto" behaves

The current leo.ukb rule is:

  • binary factor / character / logical mediator -> logistic
  • two-valued integer-like numeric mediator such as 0/1 -> logistic
  • numeric mediator with more than two observed values, or non-integer two-valued coding such as 0.2/0.8 -> linear
  • multi-level categorical mediator stored as factor -> not currently supported

This is the practical interpretation:

  • use auto freely for binary mediators
  • use auto for true continuous biomarkers
  • for ordered scores, such as a low-grade inflammation index from -7 to 7, keep the variable numeric and usually let it behave as a continuous mediator
  • if the scientific question is threshold-based, dichotomize first and then use a logistic mediator model

Decode the abbreviations before interpreting anything

data.frame(
  `Effect code` = c("cde", "pnde", "tnie", "tnde", "pnie", "te", "pm"),
  Meaning = c(
    "Controlled direct effect",
    "Pure natural direct effect",
    "Total natural indirect effect",
    "Total natural direct effect",
    "Pure natural indirect effect",
    "Total effect",
    "Proportion mediated"
  ),
  `Practical reading` = c(
    "Direct effect when the mediator is fixed at the chosen reference value.",
    "Direct pathway not operating through the mediator.",
    "Indirect pathway operating through the mediator.",
    "Direct pathway allowing the mediator to vary naturally.",
    "Indirect pathway under the pure natural definition.",
    "Overall exposure-outcome effect.",
    "Share of the total effect attributed to the mediator pathway."
  )
)
#>   Effect.code                       Meaning
#> 1         cde      Controlled direct effect
#> 2        pnde    Pure natural direct effect
#> 3        tnie Total natural indirect effect
#> 4        tnde   Total natural direct effect
#> 5        pnie  Pure natural indirect effect
#> 6          te                  Total effect
#> 7          pm           Proportion mediated
#>                                                         Practical.reading
#> 1 Direct effect when the mediator is fixed at the chosen reference value.
#> 2                      Direct pathway not operating through the mediator.
#> 3                        Indirect pathway operating through the mediator.
#> 4                 Direct pathway allowing the mediator to vary naturally.
#> 5                     Indirect pathway under the pure natural definition.
#> 6                                        Overall exposure-outcome effect.
#> 7           Share of the total effect attributed to the mediator pathway.

For most applied readers, three rows are especially important:

  • te: how large the overall association is
  • tnie: how large the mediator pathway appears to be
  • pm: what fraction of the total effect may be operating through the mediator

Turn the result into a clinically readable summary

key_rows <- dplyr::filter(
  res_med$result,
  Model == "Model B",
  `Effect code` %in% c("te", "tnie", "pm")
)
key_rows
#>     Model Effect code                        Effect        Scale Exposure
#> 1 Model B        tnie Total natural indirect effect Hazard ratio exposure
#> 2 Model B          te                  Total effect Hazard ratio exposure
#> 3 Model B          pm           Proportion mediated   Proportion exposure
#>   Mediator Outcome     Exposure contrast Mediator reference   N Case N
#> 1 mediator outcome Low risk -> High risk               <NA> 500     30
#> 2 mediator outcome Low risk -> High risk               <NA> 500     30
#> 3 mediator outcome Low risk -> High risk               <NA> 500     30
#>   Non-event N Total follow-up Estimate        95% CI P value Mediator model
#> 1         470        5470.759    1.054  0.870, 1.276   0.593       logistic
#> 2         470        5470.759    1.549  0.753, 3.185   0.234       logistic
#> 3         470        5470.759    0.144 -0.402, 0.689   0.606       logistic
#>   Outcome model
#> 1       survCox
#> 2       survCox
#> 3       survCox

te_row <- dplyr::filter(key_rows, `Effect code` == "te")
tnie_row <- dplyr::filter(key_rows, `Effect code` == "tnie")
pm_row <- dplyr::filter(key_rows, `Effect code` == "pm")
pm_ci <- as.numeric(trimws(strsplit(pm_row$`95% CI`, ",", fixed = TRUE)[[1]]))
ratio_label <- if (te_row$`Outcome model` == "survCox") "HR" else "TR"
ratio_noun <- if (te_row$`Outcome model` == "survCox") "hazard" else "time"
te_pct <- 100 * (te_row$Estimate - 1)
pm_pct <- 100 * pm_row$Estimate

data.frame(
  Quantity = c("Total effect", "Indirect effect through inflammation", "Proportion mediated"),
  `Model B summary` = c(
    sprintf("%s %.3f (%s)", ratio_label, te_row$Estimate, te_row$`95% CI`),
    sprintf("%s %.3f (%s)", ratio_label, tnie_row$Estimate, tnie_row$`95% CI`),
    sprintf("%.1f%% (%.1f%% to %.1f%%)", 100 * pm_row$Estimate, 100 * pm_ci[1], 100 * pm_ci[2])
  )
)
#>                               Quantity         Model.B.summary
#> 1                         Total effect HR 1.549 (0.753, 3.185)
#> 2 Indirect effect through inflammation HR 1.054 (0.870, 1.276)
#> 3                  Proportion mediated 14.4% (-40.2% to 68.9%)

Because the event rate in this example is only 6.0%, yreg = "auto" selected survCox, so the ratio scale is a hazard ratio.

In manuscript-style language, the Model B result can be read as:

  • the High risk group has an overall hazard ratio that is about 54.9% higher than the Low risk group (te = 1.549)
  • the inflammation pathway contributes only a modest indirect effect (tnie = 1.054)
  • the point estimate for the proportion mediated is 14.4%, suggesting that part of the overall association could be operating through the inflammation pathway

The critical scientific qualifier is uncertainty. Here the confidence interval for pm is wide and crosses zero, so the careful interpretation is not “mediation has been established”, but:

the estimates are compatible with partial mediation through inflammation, but the pathway effect remains too imprecise to support a strong mechanistic claim on its own

Inspect the evaluation settings directly

leo_cox_mediation() keeps a compact evaluation summary in $evaluation.

res_med$evaluation
#>     model exposure mediator outcome   n case_n non_event_n followup_total
#> 1 Model A exposure mediator outcome 500     30         470       5470.759
#> 2 Model B exposure mediator outcome 500     30         470       5470.759
#>   event_rate     exposure_contrast exposure_contrast_value x_exp_a0 x_exp_a1
#> 1       0.06 Low risk -> High risk          0.000 -> 1.000        0        1
#> 2       0.06 Low risk -> High risk          0.000 -> 1.000        0        1
#>   mediator_reference         x_med_cde x_med_cde_internal     mreg    yreg
#> 1  High inflammation High inflammation                  1 logistic survCox
#> 2  High inflammation High inflammation                  1 logistic survCox
#>   pm_unstable
#> 1        TRUE
#> 2        TRUE
#>                                                                                                                       pm_note
#> 1 Total effect 95% CI includes the null value; proportion mediated (pm) may be unstable and should be interpreted cautiously.
#> 2 Total effect 95% CI includes the null value; proportion mediated (pm) may be unstable and should be interpreted cautiously.
#>   interaction covariates             x_cov_cond
#> 1        TRUE        age             age=59.523
#> 2        TRUE   age, bmi age=59.523; bmi=25.464

This table is especially useful when you want to audit:

  • which x_exp_a0 -> x_exp_a1 contrast was used
  • which mediator reference was used for the controlled direct effect
  • which covariate profile was used for x_cov_cond
  • which mreg and yreg were used
  • whether the outcome model included x_exp:x_med

Non-numeric covariates need explicit x_cov_cond

When the selected adjustment set contains a non-numeric covariate, the evaluation profile must be given explicitly. The same applies to binary numeric covariates such as 0/1 indicators, because the default median can otherwise create impossible values such as 0.5.

res_med_smoking <- leo_cox_mediation(
  df = med_df,
  y_out = c("outcome", "outcome_censor"),
  x_exp = "exposure",
  x_med = "mediator",
  x_cov = "smoking",
  x_cov_cond = list(smoking = "Never"),
  verbose = FALSE
)

res_med_smoking$evaluation
#>     model exposure mediator outcome   n case_n non_event_n followup_total
#> 1 model_1 exposure mediator outcome 500     30         470       5470.759
#>   event_rate     exposure_contrast exposure_contrast_value x_exp_a0 x_exp_a1
#> 1       0.06 Low risk -> High risk          0.000 -> 1.000        0        1
#>   mediator_reference         x_med_cde x_med_cde_internal     mreg    yreg
#> 1  High inflammation High inflammation                  1 logistic survCox
#>   pm_unstable
#> 1        TRUE
#>                                                                                                                       pm_note
#> 1 Total effect 95% CI includes the null value; proportion mediated (pm) may be unstable and should be interpreted cautiously.
#>   interaction covariates    x_cov_cond
#> 1        TRUE    smoking smoking=Never

This is not just more transparent. It is required whenever x_cov contains a factor or binary numeric covariate, because the evaluation profile has to map to a real person that the reader can reconstruct.

Continuous mediators, including score-like mediators, are also supported

The same workflow can also be used with a continuous mediator.

med_df_cont <- dplyr::mutate(
  med_df,
  mediator_cont = rnorm(
    n,
    mean = 0.5 * as.integer(exposure == "High risk") + 0.02 * (age - 60),
    sd = 1
  )
)

res_med_cont <- leo_cox_mediation(
  df = med_df_cont,
  y_out = c("outcome", "outcome_censor"),
  x_exp = "exposure",
  x_med = "mediator_cont",
  x_cov = "age",
  mreg = "linear",
  verbose = FALSE
)

res_med_cont$result
#>     Model Effect code                        Effect        Scale Exposure
#> 1 model_1         cde      Controlled direct effect Hazard ratio exposure
#> 2 model_1        pnde    Pure natural direct effect Hazard ratio exposure
#> 3 model_1        tnie Total natural indirect effect Hazard ratio exposure
#> 4 model_1        tnde   Total natural direct effect Hazard ratio exposure
#> 5 model_1        pnie  Pure natural indirect effect Hazard ratio exposure
#> 6 model_1          te                  Total effect Hazard ratio exposure
#> 7 model_1          pm           Proportion mediated   Proportion exposure
#>        Mediator Outcome     Exposure contrast Mediator reference   N Case N
#> 1 mediator_cont outcome Low risk -> High risk              0.233 500     30
#> 2 mediator_cont outcome Low risk -> High risk               <NA> 500     30
#> 3 mediator_cont outcome Low risk -> High risk               <NA> 500     30
#> 4 mediator_cont outcome Low risk -> High risk               <NA> 500     30
#> 5 mediator_cont outcome Low risk -> High risk               <NA> 500     30
#> 6 mediator_cont outcome Low risk -> High risk               <NA> 500     30
#> 7 mediator_cont outcome Low risk -> High risk               <NA> 500     30
#>   Non-event N Total follow-up Estimate        95% CI P value Mediator model
#> 1         470        5470.759    1.562  0.760, 3.213   0.225         linear
#> 2         470        5470.759    1.694  0.801, 3.584   0.168         linear
#> 3         470        5470.759    0.917  0.775, 1.086   0.315         linear
#> 4         470        5470.759    1.554  0.747, 3.233   0.238         linear
#> 5         470        5470.759    1.000  0.849, 1.178   0.999         linear
#> 6         470        5470.759    1.554  0.755, 3.201   0.231         linear
#> 7         470        5470.759   -0.252 -0.849, 0.345   0.407         linear
#>   Outcome model
#> 1       survCox
#> 2       survCox
#> 3       survCox
#> 4       survCox
#> 5       survCox
#> 6       survCox
#> 7       survCox

For a continuous mediator, the cde row shows a numeric mediator reference because the controlled direct effect is evaluated at x_med_cde.

This is also the workflow we would usually recommend for an ordered score-like mediator. For example, if a low-grade inflammation index ranges from -7 to 7, the first question is scientific rather than computational:

  • does the index behave like a severity score, where one-unit changes are interpretable enough for a linear approximation?
  • or is the real question about a thresholded state that should be defined as binary first?

For most ordered severity scores, the cleaner first pass is to keep the score numeric and fit a linear mediator model.

Add a paper-style pathway figure

For a single mediator, a compact pathway figure often communicates the result faster than a dense table alone.

leo_cox_mediation_plot(
  x = res_med,
  model = "Model B",
  exposure_label = "Metabolic\nrisk",
  mediator_label = "Inflammation",
  outcome_label = "Incident\noutcome",
  language = "en",
  palette = "jama"
)

This figure is useful because it keeps the main teaching elements in one place:

  • left box: the exposure contrast
  • middle box: the proposed mediator
  • right box: the incident outcome
  • top text: the indirect pathway through the mediator
  • middle text: the estimated proportion of the total effect attributed to that pathway
  • bottom text: the direct pathway not explained by the mediator

One subtle but important point is that the mediated pathway shown here is the full X -> M -> Y pathway. The figure does not print a separate M -> Y coefficient because regmedint reports causal decomposition quantities such as tnie, pnde, and te, rather than a simple product of two regression coefficients.

Compare the built-in journal palettes

The same fitted result can be displayed with several built-in journal-style palettes.

for (pal in c("jama", "jco", "lancet", "nejm")) {
  leo_cox_mediation_plot(
    x = res_med,
    model = "Model B",
    exposure_label = "Metabolic\nrisk",
    mediator_label = "Inflammation",
    outcome_label = "Incident\noutcome",
    language = "en",
    palette = pal
  )
}

How to interpret common result patterns

In applied survival mediation analysis, the most common interpretation patterns are:

  1. te is clearly away from the null, tnie is modest, and pm is positive but imprecise: compatible with partial mediation, but not decisive
  2. te and tnie are both clearly away from the null, and pm is precise: stronger evidence that the pathway may explain a meaningful share of the total effect
  3. te is weak or crosses the null: pm can become unstable and should be treated cautiously
  4. yreg switches to AFT: all non-pm effect rows should now be interpreted as time ratios rather than hazard ratios

This is why Step 3 is not only about producing a decomposition table. It is also about learning when the estimates are scientifically interpretable and when they are still too unstable for a strong claim.

How to write this in a paper

A defensible reporting order is usually:

  1. report the main exposure-outcome association first
  2. then present the mediation analysis as a pathway decomposition
  3. state the exposure contrast, mediator reference, and adjustment profile
  4. report which yreg was used; if it was survCox, mention the rare-event approximation and make clear that this means the decomposition is only being reported as an approximate split when events are relatively rare, and if it was AFT, remember that the ratio scale is a time ratio
  5. if exposure and mediator are measured at the same baseline visit, write the result up as decomposition evidence that is consistent with an assumed pathway, and say explicitly that the data alone do not establish exposure before mediator unless temporal ordering is supported externally

In other words, mediation should usually refine the biological story after the main Cox result is already established.

What to take away from Step 3

This step should leave you with six practical rules:

  1. mediation asks a pathway question, not a heterogeneity question
  2. the data should be checked before fitting, just as in Step 1 and Step 2
  3. x_exp_a0, x_exp_a1, x_med_cde, and x_cov_cond define where the mediation effects are evaluated
  4. mreg controls x_med ~ x_exp + x_cov, yreg controls time ~ x_exp + x_med + x_cov, and interaction = TRUE adds x_exp:x_med to the outcome model
  5. yreg = "auto" chooses survCox for event proportion <= 10% (which means the decomposition then relies on the rare-event approximation and is better read as approximate when events are relatively rare) and survAFT_weibull otherwise
  6. factor mediators should not be forced into mreg = "linear", and the final mediation result should always be interpreted with the actual outcome model, uncertainty, and the study’s temporal ordering in mind