Prior-Data Conflict Diagnostics

Author

Ndoh Penn

Published

May 17, 2026

Why Conflict Diagnostics Matter

Before updating a prior with trial data it is essential to check whether the two are compatible. Substantial prior-data conflict indicates one of three things: the prior was mis-specified, the data are anomalous, or the model is wrong. Regulators increasingly require evidence that conflict was assessed and addressed.

bayprior provides four complementary univariate diagnostics and one multivariate diagnostic, following Box (1980) and related work. The conflict module supports four data types: binary, continuous, Poisson/count, and survival.

Supported Data Types

Data type	`type =` argument	Conjugate update	Typical endpoint
Binary	`"binary"`	Beta–Binomial	Response rate, ORR
Continuous	`"continuous"`	Normal–Normal	Mean difference, HbA1c
Poisson / count	`"poisson"`	Gamma–Poisson	Adverse event rate
Survival	`"survival"`	Gamma–Exponential	Hazard rate, OS, PFS

Univariate Conflict: `prior_conflict()`

Binary Data

prior <- elicit_beta(
  mean = 0.30, sd = 0.10, method = "moments",
  label = "Response rate"
)

# No-conflict: data consistent with prior
cd_none <- prior_conflict(
  prior        = prior,
  data_summary = list(type = "binary", x = 13, n = 40),
  alpha        = 0.05
)
print(cd_none)

# Mild conflict
cd_mild <- prior_conflict(
  prior        = prior,
  data_summary = list(type = "binary", x = 20, n = 40)
)
print(cd_mild)

# Severe conflict
cd_severe <- prior_conflict(
  prior        = prior,
  data_summary = list(type = "binary", x = 35, n = 40)
)
print(cd_severe)

The Four Diagnostics Explained

1. Box’s Prior Predictive P-value

\[p_{\text{Box}} = 2\Phi\!\left(-\left|\frac{\hat\theta - \mu_\pi} {\sqrt{\sigma_\pi^2 + \text{SE}^2}}\right|\right)\]

Interpretation: \(p < 0.05\) flags conflict at the 5% level.

2. Surprise Index

\[S = \left|\frac{\hat\theta - \mu_\pi} {\sqrt{\sigma_\pi^2 + \text{SE}^2}}\right|\]

Interpretation: \(S > 2\) moderate; \(S > 3\) high surprise.

3. KL Divergence

\[\text{KL}(P \| Q) = \log\frac{\sigma_Q}{\sigma_P} + \frac{\sigma_P^2 + (\mu_P - \mu_Q)^2}{2\sigma_Q^2} - \frac12\]

Interpretation: \(> 1\) indicates large information distance.

4. Bhattacharyya Overlap Coefficient

\[\text{BC} = \exp\!\left( -\frac{(\mu_P - \mu_Q)^2}{4(\sigma_P^2 + \sigma_Q^2)}\right) \cdot \left(\frac{2\sigma_P\sigma_Q} {\sigma_P^2 + \sigma_Q^2}\right)^{1/2}\]

Interpretation: 1 = identical distributions; 0 = no overlap.

Threshold Summary

Diagnostic	No conflict	Mild conflict	Severe conflict
Box p-value	>= 0.05	0.01-0.05	< 0.01
Surprise index	< 2	2-3	> 3
KL divergence	< 0.5	0.5-1	> 1
Bhattacharyya overlap	> 0.6	0.3-0.6	< 0.3

Prior-Likelihood-Posterior Overlay

plot_prior_likelihood(
  prior,
  data_summary   = list(type = "binary", x = 13, n = 40),
  show_posterior = TRUE
)

No conflict: prior and likelihood overlap substantially.

plot_prior_likelihood(
  prior,
  data_summary   = list(type = "binary", x = 35, n = 40),
  show_posterior = TRUE
)

Severe conflict: prior (blue) and likelihood (orange) widely separated.

Continuous Endpoint Conflict

prior_norm <- elicit_normal(
  mean = 0.0, sd = 0.3, method = "moments",
  label = "Log odds ratio"
)

# No conflict: observed log OR consistent with prior
prior_conflict(
  prior_norm,
  data_summary = list(type = "continuous", x = 0.15, sd = 0.20, n = 80)
)

Poisson / Count Data Conflict

Poisson/count data arises for endpoints such as adverse event rates, where the observed data consists of a count of events over an exposure period (person-time). The conjugate update uses a Gamma-Poisson model:

\[\text{Prior: Gamma}(\alpha, \beta) \quad\Rightarrow\quad \text{Posterior: Gamma}(\alpha + x,\; \beta + n)\]

where \(x\) = observed events and \(n\) = total person-time.

# Gamma prior on adverse event rate
# Expert believes the AE rate is ~0.15 events per person-year (SD 0.06)
prior_ae <- elicit_gamma(
  mean      = 0.15,
  sd        = 0.06,
  method    = "moments",
  label     = "Adverse event rate (per person-year)",
  expert_id = "Safety Expert"
)
print(prior_ae)

# Observed: 12 AEs over 100 person-years => rate 0.12 — consistent with prior
cd_pois_none <- prior_conflict(
  prior        = prior_ae,
  data_summary = list(type = "poisson", x = 12, n = 100),
  alpha        = 0.05
)
print(cd_pois_none)

# Observed: 40 AEs over 100 person-years => rate 0.40 — much higher than prior
cd_pois_conf <- prior_conflict(
  prior        = prior_ae,
  data_summary = list(type = "poisson", x = 40, n = 100)
)
print(cd_pois_conf)

plot_prior_likelihood(
  prior_ae,
  data_summary   = list(type = "poisson", x = 40, n = 100),
  show_posterior = TRUE
)

Survival / Time-to-Event Data Conflict

Survival data conflict arises for endpoints like OS and PFS hazard rates. The observed data consists of event counts over a total follow-up time (person-months). The conjugate update uses a Gamma-Exponential model:

\[\text{Prior: Gamma}(\alpha, \beta) \quad\Rightarrow\quad \text{Posterior: Gamma}(\alpha + d,\; \beta + T)\]

where \(d\) = observed events and \(T\) = total follow-up time.

An Exponential prior (constant hazard) is equivalent to \(\text{Gamma}(1, \lambda)\).

# Exponential prior on hazard rate
# Expert believes median OS ≈ 20 months => hazard ≈ 0.05
prior_hz <- elicit_exponential(
  mean      = 0.05,
  method    = "moments",
  label     = "OS hazard rate",
  expert_id = "Clinical Expert"
)
print(prior_hz)

# Observed: 20 deaths over 400 person-months => hazard 0.05 — consistent
cd_surv_none <- prior_conflict(
  prior        = prior_hz,
  data_summary = list(type = "survival", x = 20, n = 400),
  alpha        = 0.05
)
print(cd_surv_none)

# Observed: 60 deaths over 400 person-months => hazard 0.15 — much higher
cd_surv_conf <- prior_conflict(
  prior        = prior_hz,
  data_summary = list(type = "survival", x = 60, n = 400)
)
print(cd_surv_conf)

plot_prior_likelihood(
  prior_hz,
  data_summary   = list(type = "survival", x = 60, n = 400),
  show_posterior = TRUE
)

Also works with a Gamma prior on the hazard rate:

prior_hz_gamma <- elicit_gamma(
  mean = 0.05, sd = 0.02, method = "moments",
  label = "OS hazard rate (Gamma prior)"
)
prior_conflict(
  prior_hz_gamma,
  data_summary = list(type = "survival", x = 20, n = 400)
)

Compatibility Validation

bayprior warns when the prior family is atypical for the chosen data type, helping users avoid inadvertent misspecification:

# Warning: Beta prior with Poisson data is unusual
prior_beta <- elicit_beta(mean = 0.15, sd = 0.06, method = "moments",
                           label = "Rate")
# The app shows a warning notification; the function still runs with
# a Normal approximation
cd_warn <- prior_conflict(
  prior_beta,
  data_summary = list(type = "poisson", x = 15, n = 100)
)

Mixture Prior Conflict

All diagnostics work transparently with mixture priors via a Normal approximation from the mixture’s fit_summary:

e1 <- elicit_beta(mean = 0.25, sd = 0.08, method = "moments",
                  expert_id = "E1", label = "ORR")
e2 <- elicit_beta(mean = 0.40, sd = 0.12, method = "moments",
                  expert_id = "E2", label = "ORR")
mix <- aggregate_experts(list(E1 = e1, E2 = e2), weights = c(0.5, 0.5))

prior_conflict(mix, list(type = "binary", x = 18, n = 40))

Multivariate Conflict: `conflict_mahalanobis()`

For trials with co-primary endpoints, the Mahalanobis distance provides an omnibus multivariate conflict test:

conflict_mahalanobis(
  prior_means = c(0.35, 0.60),
  prior_cov   = matrix(c(0.010, 0.003, 0.003, 0.015), 2, 2),
  obs_means   = c(0.55, 0.58),
  obs_cov     = matrix(c(0.008, 0.002, 0.002, 0.010), 2, 2) / 50,
  labels      = c("Response rate", "OS rate"),
  alpha       = 0.05
)


Marginal z-scores per parameter:
Response rate       OS rate 
        1.984        -0.162

Conflict Severity Classification

Severity	Recommended action
None	Proceed — prior is consistent with data
Mild	Consider reporting both prior-weighted and likelihood-only estimates
Severe	Revise prior or use a robust/power prior; report sensitivity

References

Box, G. E. P. (1980). Sampling and Bayes’ inference in scientific modelling and robustness. JRSS-A, 143, 383–430.
Bhattacharyya, A. (1943). On a measure of divergence between two statistical populations. Bulletin of the Calcutta Mathematical Society, 35, 99–109.