Technical Appendix

Author

Max Ghenis

Target population

This analysis applies to adults aged 30-70 years (the primary analysis uses a 40-year-old baseline) drawn from the general population rather than specifically high-risk groups, with estimates derived primarily from US and European cohorts. Individuals with nut allergies (~1-2% of the population) are excluded. The model does not cover secondary prevention (existing CVD), the very elderly (80+), or non-Western populations.

Monte Carlo uncertainty propagation algorithm

The analysis uses a hierarchical forward sampling model with standardized prior draws. The draws are numerically equivalent to the “non-centered parameterization” used in Bayesian MCMC but carry no inference interpretation: because there is no outcome likelihood, the model samples directly from the priors by Monte Carlo rather than running MCMC.

Priors: Nutrient effects follow \beta_{nutrient,pathway} \sim \text{Normal}(\mu_{meta}, \sigma_{meta}) from Table 2; the hierarchical shrinkage scale follows \tau_{pathway} \sim \text{HalfNormal}(0.015); standardized deviations follow z_{nut,pathway} \sim \text{Normal}(0, 1); and the confounding fraction follows c \sim \text{Beta}(1.5, 6.0).

Justification for τ ~ HalfNormal(0.015): Earlier versions let nut-specific residual adjustments do too much work. The smaller scale parameter 0.015 constrains those deviations to remain modest after nutrient composition is already accounted for. This still allows walnuts to retain a small CVD-specific edge without letting food-specific bonuses dominate the model.

Model:

1. Compute nutrient-predicted effect: \theta_{nutrients} = \sum_{n} \beta_n \cdot \text{composition}_{nut,n}
2. Add hierarchical deviation via standardized draws: \theta_{true} = \theta_{nutrients} + \tau \cdot z
3. Apply HR-centered Jensen correction: \theta_{hr} = \theta_{true} - \tfrac{1}{2}\left(\sum_n \sigma_n^2 x_{nut,n}^2 + \tau^2\right), so E[\exp(\theta_{hr})] = \exp(\sum_n \mu_n x_{nut,n}). Here x_{nut,n} is shorthand for \text{composition}_{nut,n}, and \mu_n and \sigma_n are the prior mean and SD of \beta_n.
4. Shrink nut-specific adjustment toward the null by the evidence tier: a_{\text{shrunk}} = 1 + (a - 1)(1 - s_{\text{tier}}), with s_{\text{strong}} = 0.15, s_{\text{moderate}} = 0.30, s_{\text{limited}} = 0.50.
5. Apply nut-specific adjustment: RR_{adjusted} = RR_{hr}^{a_{\text{shrunk}}}
   - On log scale: \theta_{adjusted} = a_{\text{shrunk}} \times \theta_{hr}
   - For protective effects (RR < 1, equivalently \theta < 0), adjustments a > 1 amplify the effect (make RR smaller).
   - Worked example (walnut CVD, strong evidence). Nutrient-predicted \theta_{hr} \approx -0.05. Nominal adjustment a = 1.10. After 15% strong-tier shrinkage, a_{\text{shrunk}} \approx 1.085. Adjusted \theta = 1.085 \times (-0.05) = -0.054, so RR_{\text{adjusted}} = \exp(-0.054) \approx 0.947 (slightly stronger CVD protection than the nutrient model predicts alone). After the 20% confounding shrinkage the sample-mean RR lands near 0.96, consistent with Table 4.
6. Apply confounding: \theta_{causal} = c \cdot \theta_{adjusted}
7. Convert to RR: RR_{pathway} = \exp(\theta_{causal})
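Steps 1–7 can be sketched in vectorized NumPy as follows. This is a minimal illustration rather than the whatnut implementation: the three-nutrient `mu`, `sigma`, and `x` arrays are hypothetical placeholders for the Table 2 priors and the nut's composition, and drawing the adjustment from a Normal centered on the shrunk value with the tabulated SD is an assumption about how the adjustment prior is sampled.

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples = 10_000

# HYPOTHETICAL nutrient priors (log-RR scale) and composition for one nut.
# The real values come from Table 2 and the nut composition data.
mu = np.array([-0.020, -0.015, -0.010])   # prior means mu_n
sigma = np.array([0.010, 0.008, 0.006])   # prior SDs sigma_n
x = np.array([1.0, 1.2, 0.8])             # composition x_{nut,n}

# Walnut CVD adjustment from the adjustment table, strong evidence tier.
a_mean, a_sd, s_tier = 1.10, 0.12, 0.15

# Prior draws.
beta = rng.normal(mu, sigma, size=(n_samples, mu.size))
tau = np.abs(rng.normal(0.0, 0.015, size=n_samples))  # HalfNormal(0.015)
z = rng.standard_normal(n_samples)
c = rng.beta(1.5, 6.0, size=n_samples)

theta_nutrients = beta @ x                                        # step 1
theta_true = theta_nutrients + tau * z                            # step 2
theta_hr = theta_true - 0.5 * (np.sum(sigma**2 * x**2) + tau**2)  # step 3
a_center = 1.0 + (a_mean - 1.0) * (1.0 - s_tier)                  # step 4
# ASSUMPTION: only the center is shrunk; the SD is left intact.
a_shrunk = rng.normal(a_center, a_sd, size=n_samples)
theta_adjusted = a_shrunk * theta_hr                              # step 5
theta_causal = c * theta_adjusted                                 # step 6
rr_pathway = np.exp(theta_causal)                                 # step 7
```

Because every step is an elementwise array operation, all 10,000 draws are processed at once, which is why no inference library is needed.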

Tiered publication-bias shrinkage: Steps 3–4 import two layers from the newer Optiqal framework. HR-centering ensures the aggregate pre-adjustment RR has the expected mean \exp(\sum_n \mu_n x_{nut,n}) rather than one inflated by the half-variance of the log-RR; after the nut-specific adjustment multiplication in step 5, a residual Jensen gap of order \tfrac{1}{2}\mathrm{Var}(a_{\text{shrunk}})(\sum_n \mu_n x_{nut,n})^2 remains that the base case leaves uncorrected. The numerical effect is small (under 0.15 pp on RR, worst case walnut CVD) but the gap is noted for completeness. Tiered shrinkage addresses the well-documented gap between initial effect sizes and replicated effects as evidence quality falls: strong-evidence nuts retain 85% of the nominal central estimate for the nut-specific residual, moderate 70%, limited 50%. Following optiqal’s convention, only the central estimate is shrunk; the adjustment SD is left intact so that uncertainty reflects replication risk rather than the attenuation factor itself. Nutrient priors (Table 2) are already drawn from meta-analyses, so they are assumed pre-shrunk and this layer only touches the nut-specific residual.
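The HR-centering claim can be checked numerically. The sketch below compares the mean of \exp(\theta_{true}) and \exp(\theta_{hr}) against the target \exp(\sum_n \mu_n x_{nut,n}). The two-nutrient priors are hypothetical, with SDs deliberately exaggerated so that the half-variance inflation is visible above Monte Carlo noise.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500_000

# HYPOTHETICAL priors; SDs are exaggerated to make the Jensen gap visible.
mu = np.array([-0.03, -0.02])
sigma = np.array([0.05, 0.04])
x = np.array([1.0, 1.5])

beta = rng.normal(mu, sigma, size=(n, mu.size))
tau = np.abs(rng.normal(0.0, 0.015, size=n))  # HalfNormal(0.015)
z = rng.standard_normal(n)

theta_true = beta @ x + tau * z
# Subtract half the variance of the log-RR (nutrient part plus tau^2).
theta_hr = theta_true - 0.5 * (np.sum(sigma**2 * x**2) + tau**2)

target = np.exp(mu @ x)                  # exp(sum_n mu_n x_n)
uncentered = np.exp(theta_true).mean()   # inflated by the half-variance
centered = np.exp(theta_hr).mean()       # should match the target
```

Under these assumptions the uncentered mean sits above the target while the centered mean matches it up to sampling error.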

Monte Carlo sampling: The model draws 10,000 forward samples from priors (no MCMC is needed since there is no likelihood). A fixed seed of 42 ensures reproducibility; runtime is well under one second using vectorized NumPy with no external inference library required.

Lifecycle integration: For each of the 10,000 samples, the model extracts pathway-specific RRs (already confounding-adjusted), computes age-weighted mortality reduction using CDC life tables, applies the smoothed EQ-5D quality-weight trajectory by age, and computes undiscounted QALYs alongside lifetime costs discounted at 3% per year.
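The lifecycle step for a single sample can be sketched as below. Everything numeric here is a stand-in: the Gompertz-style mortality curve replaces the CDC life table, the linear quality decline replaces the smoothed EQ-5D trajectory, and the all-cause RR of 0.99 is an arbitrary illustrative value. The sketch only shows the shape of the calculation (survival curve, then quality weighting), not the model's inputs or results.

```python
import numpy as np

ages = np.arange(40, 101)

# HYPOTHETICAL annual death probabilities (stand-in for CDC life tables).
q_base = np.minimum(1.0, 0.001 * np.exp(0.085 * (ages - 40)))
# HYPOTHETICAL quality weights (stand-in for the EQ-5D trajectory).
quality = 0.90 - 0.003 * (ages - 40)
# HYPOTHETICAL all-cause relative risk for one Monte Carlo sample.
rr = 0.99

def life_years_and_qalys(q):
    """Undiscounted life years and QALYs from annual death probabilities."""
    # Probability of being alive at the start of each age.
    surv = np.concatenate(([1.0], np.cumprod(1.0 - q)[:-1]))
    return surv.sum(), (surv * quality).sum()

ly_base, qaly_base = life_years_and_qalys(q_base)
ly_nuts, qaly_nuts = life_years_and_qalys(q_base * rr)

life_years_gained = ly_nuts - ly_base
qalys_gained = qaly_nuts - qaly_base
```

In the full model this calculation would be repeated for each of the 10,000 sampled RRs, giving a distribution of gains rather than a single number.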

Nut-specific adjustment priors

These adjustment factors are priors used in the hierarchical model. The adjustment is applied as an exponent on the RR scale: RR_{adjusted} = RR_{nutrients}^{a}. On the log-RR scale, this is multiplicative: \log(RR_{adjusted}) = a \times \log(RR_{nutrients}). In the current version, these adjustments are deliberately small. They encode modest residual evidence beyond nutrient composition rather than large food-specific bonuses.

Derivation of adjustment values

Adjustments capture residual effects from nut-specific RCTs after accounting for nutrient composition. The derivation for walnut’s CVD adjustment illustrates the method:

In the PREDIMED trial (pilot biomarker study: Ros et al. 2008; primary CVD-endpoint results: Estruch et al. 2018), the nut intervention arm, which provided 15g walnuts, 7.5g almonds, and 7.5g hazelnuts daily rather than walnuts alone, showed an approximately 30% reduction in major cardiovascular events. Attributing much of that residual to walnuts specifically is too aggressive: the mixed nut arm does not separately identify walnut-specific causal effects. The revised model therefore treats walnut’s residual edge as modest, using a CVD adjustment of 1.10 with a wide SD (0.12) rather than the much larger multiplier used previously.

Almonds serve as the reference nut (adjustment = 1.00) because their RCT effects are well-explained by nutrient composition (vitamin E, fiber, MUFA). This ensures adjustments represent genuine “beyond-nutrient” effects rather than artifacts.

Independence from nutrient priors: The nutrient priors (Table 2) use effect estimates from studies that pool across food sources (e.g., Naghshi 2021 for ALA includes fish and plant sources). The nut-specific adjustments use residual effects from nut-only RCTs, avoiding double-counting.

Nut	CVD Adj, mean (SD)	Cancer Adj, mean (SD)	Other Adj, mean (SD)	Evidence	Rationale
Walnut	1.10 (0.12)	1.00 (0.08)	1.00 (0.08)	Strong	Modest residual CVD edge beyond nutrients
Pistachio	1.04 (0.07)	1.00 (0.08)	1.00 (0.08)	Moderate	Small lipid edge, otherwise neutral
Almond	1.00 (0.05)	1.00 (0.06)	1.00 (0.05)	Strong	Reference nut
Pecan	1.03 (0.08)	1.00 (0.10)	1.00 (0.10)	Moderate	Small residual CVD effect only
Macadamia	1.02 (0.08)	1.00 (0.10)	1.00 (0.10)	Moderate	MUFA profile, but weak residual evidence
Peanut	0.98 (0.06)	1.00 (0.06)	1.00 (0.06)	Strong	Slightly below tree nuts on CVD
Hazelnut	1.03 (0.07)	1.00 (0.08)	1.00 (0.08)	Moderate	Part of PREDIMED nut arm (Estruch et al. 2018); lipid improvements in Orem et al. (2013)
Cashew	0.97 (0.10)	1.00 (0.10)	1.00 (0.10)	Limited	Mixed RCT evidence, near-neutral

Note on cancer adjustments: Previous versions applied a 10% cancer penalty to peanuts based on aflatoxin concerns. However, US FDA regulations limit aflatoxin to <20 ppb, and epidemiological studies show no excess cancer risk in US peanut consumers (Wu and Khlangwiset 2010). The cancer adjustment is now set to 1.00 (neutral). Similarly, macadamia and pecan cancer adjustments are set to 1.00 given insufficient evidence for deviation from nutrient predictions.

Nuts with weaker residual evidence (macadamia, pecan, cashew; moderate or limited tier in the table above) receive higher SD values to reflect greater uncertainty.

Confounding prior derivation

I adopt a skeptical Beta(1.5, 6.0) prior with mean 0.2 and 95% interval 0.02-0.53 (roughly 2-53%). This prior reflects healthy-user bias in nutrition cohorts, weak Mendelian-randomization support, and the gap between biomarker changes and hard outcomes.
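The stated mean and interval of the Beta(1.5, 6.0) prior can be checked with a short sketch. The mean is computed analytically as \alpha / (\alpha + \beta); the interval is estimated from samples, so the quantiles are approximate and depend on the (arbitrary) sample size and seed chosen here.

```python
import numpy as np

rng = np.random.default_rng(42)
alpha, beta_param = 1.5, 6.0

# Analytic mean of a Beta(alpha, beta) distribution.
analytic_mean = alpha / (alpha + beta_param)

# Sampled 95% interval (approximate; sample size chosen for illustration).
draws = rng.beta(alpha, beta_param, size=200_000)
lower, upper = np.quantile(draws, [0.025, 0.975])
```

The analytic mean is exactly 0.2, and the sampled interval should land close to the 0.02-0.53 range quoted above.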

Three evidence sources inform this choice:

Source	Implied Causal %	Interpretation
LDL pathway calibration	Low double digits	Mechanistic floor
Mendelian randomization	Near zero to small	Mostly null, but weak instruments
Substitution / Golestan evidence	Small to moderate	Prevents the prior from collapsing to zero

The LDL pathway still provides a floor rather than a ceiling, but the model no longer treats broader pathway stories as strong enough to justify a 50% causal prior. Sensitivity analysis across 10-33% causal priors is presented in the main text.

Cost-effectiveness model

Data sources

The cost-effectiveness model draws on CDC National Vital Statistics (2021) life tables for age-specific mortality (NVSR Vol 72 No 12), cause-of-death fractions from Table 6 of Xu et al. (2024), age-varying health-related quality of life weights derived from Sullivan & Ghushchyan (2006) US EQ-5D index, a 3% annual discount rate for costs, and per-nut retail prices retrieved from nuts.com on 2026-04-19 (Ghenis 2026). The raw price snapshot (one row per nut, with product URL, package size, and price) lives at src/whatnut/data/raw/retail_prices/retail_prices.csv; whatnut.data_build.retail_prices reads that CSV, validates the row-level math, and writes the per-nut median to nuts.yaml.

Lifecycle model

For a 40-year-old beginning daily nut consumption, the current model estimates 0.03-0.15 additional life years (0.4-1.8 months) across nut types, corresponding to 0.02-0.10 QALYs under 0% health discounting. ICERs range from approximately $92,106-$453,297 per QALY across nut types.
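The arithmetic behind an ICER of this kind can be sketched as follows. The price, time horizon, and QALY gain are hypothetical placeholders (they are not values from nuts.yaml or from the model output), and the sketch ignores survival weighting of costs for simplicity; only the 28g/day intake and the 3% cost discount rate come from the text.

```python
# HYPOTHETICAL inputs, for illustration only.
price_per_kg = 20.0     # USD per kg (placeholder, not a nuts.yaml value)
years = 40              # placeholder horizon; ignores survival weighting
qalys_gained = 0.05     # placeholder QALY gain

# Inputs taken from the text.
grams_per_day = 28.0
discount_rate = 0.03

annual_cost = price_per_kg * grams_per_day / 1000.0 * 365.0
undiscounted_cost = annual_cost * years
# Discount each year's cost back to the start of consumption.
discounted_cost = sum(annual_cost / (1.0 + discount_rate) ** t
                      for t in range(years))

icer = discounted_cost / qalys_gained  # USD per QALY

def years_to_months(life_years):
    """Convert life years to months, e.g. for reporting small gains."""
    return life_years * 12.0
```

The same conversion helper reproduces the months quoted above: 0.03 and 0.15 life years correspond to about 0.4 and 1.8 months.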

E-value analysis

Per VanderWeele and Ding (2017), the E-value quantifies the minimum strength of association an unmeasured confounder would need with both exposure and outcome to fully explain an observed association.

For a protective exposure with hazard ratio HR, I first convert to relative risk RR = 1/HR, then calculate:

E\text{-value} = RR + \sqrt{RR \times (RR - 1)}

For HR = 0.78:

RR = 1/0.78 = 1.28
E\text{-value} = 1.28 + \sqrt{1.28 \times 0.28} = 1.28 + 0.60 = 1.88

An unmeasured confounder would need RR ≥ 1.88 with both nut consumption and mortality to fully explain the observed effect.
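The calculation above is easy to reproduce. The helper below is a minimal sketch of the point-estimate E-value formula; the function name `e_value` is introduced here for illustration.

```python
import math

def e_value(hr: float) -> float:
    """E-value for a point estimate; protective ratios are inverted first."""
    rr = 1.0 / hr if hr < 1.0 else hr
    return rr + math.sqrt(rr * (rr - 1.0))

e_val = e_value(0.78)
```

Rounded to two decimals, `e_value(0.78)` gives the 1.88 reported above.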

Pathway-specific mortality effects

From Aune et al. (2016):

Cause of Death	Relative Risk	95% CI	Deaths in Meta-Analysis
CHD	0.71	0.63-0.80	20,381
CVD	0.79	0.70-0.88	—
Cancer	0.87	0.80-0.93	21,353
Other	0.90	0.85-0.95	Assumed

Note: CHD = coronary heart disease; CVD = cardiovascular disease (broader category). The model’s “CVD pathway” incorporates both CHD and broader cardiovascular effects.

Age-varying cause fractions

Cause-of-death proportions vary by age, extracted from Table 6 of Xu et al. (2024). whatnut.data_build.cdc_cause_fractions parses the 2021 rows for All causes, Diseases of heart, Cerebrovascular diseases, and Malignant neoplasms; CVD below is defined as heart + cerebrovascular (ICD-10 I00–I09 + I11 + I13 + I20–I51 + I60–I69), narrower than the full ICD-10 I00–I99 block by ~3–5 pp. Rows below report fractions at the lower bound of each NVSR 10-year age group (e.g., the “35” row is the 35–44 band, which contains the 40-year-old baseline):

Age (group start)	CVD	Cancer	Other
25	5.8%	4.4%	89.8%
35	11.9%	9.0%	79.1%
45	18.6%	15.5%	65.8%
55	21.7%	22.6%	55.7%
65	23.3%	24.7%	52.0%
75	25.8%	19.9%	54.4%
85+	33.0%	10.9%	56.2%

In the Aune et al. (2016) meta-analytic estimates above, the CVD pathway carries the lowest (most protective) prior RR (~0.75, between the CHD estimate of 0.71 and the broader CVD estimate of 0.79); the post-shrinkage model RRs in Table 4 are much closer to null.
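One way to see why the age pattern of cause fractions matters is to combine them with cause-specific RRs into an implied all-cause RR per age group, RR_{all}(age) = \sum_c f_c(age) \cdot RR_c. The sketch below uses the cause fractions from the table above and, purely for illustration, the unadjusted Aune et al. (2016) RRs for CVD, cancer, and other causes; the model itself uses the much weaker post-shrinkage RRs, and this fraction-weighted average is a first-order approximation rather than a description of the whatnut code.

```python
import numpy as np

# Cause fractions by age-group start: (CVD, cancer, other), from the table.
fractions = {
    "25": (0.058, 0.044, 0.898),
    "35": (0.119, 0.090, 0.791),
    "45": (0.186, 0.155, 0.658),
    "55": (0.217, 0.226, 0.557),
    "65": (0.233, 0.247, 0.520),
    "75": (0.258, 0.199, 0.544),
    "85+": (0.330, 0.109, 0.562),
}

# ILLUSTRATIVE cause-specific RRs (Aune et al. 2016: CVD, cancer, other).
rr_by_cause = np.array([0.79, 0.87, 0.90])

def all_cause_rr(age_group: str) -> float:
    """Fraction-weighted all-cause RR for one age group."""
    f = np.array(fractions[age_group])
    f = f / f.sum()  # renormalize away rounding error in the table
    return float(f @ rr_by_cause)

all_cause = {group: all_cause_rr(group) for group in fractions}
```

Because the CVD share of deaths rises with age, the implied all-cause RR is more protective in the oldest group than in the youngest under these illustrative inputs.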

Comparison with direct meta-analysis sampling

An alternative approach samples cause-specific relative risks directly from meta-analysis estimates (e.g., log-normal distributions based on Aune et al. (2016)) rather than deriving them from nutrient composition. This simpler approach yields broadly comparable results because the nutrient-derived priors are calibrated to match meta-analysis estimates.
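A minimal sketch of this alternative, assuming the common convention of recovering the log-scale standard error from the width of the 95% CI, is shown below. The helper name `sample_rr` and the sample size are choices made for this illustration; the RRs and intervals are the Aune et al. (2016) values tabulated earlier.

```python
import numpy as np

rng = np.random.default_rng(42)

def sample_rr(point, lower, upper, size):
    """Draw log-normal RRs matching a point estimate and 95% CI."""
    # ASSUMPTION: the CI is symmetric on the log scale (width = 2 * 1.96 * SE).
    log_se = (np.log(upper) - np.log(lower)) / (2 * 1.96)
    return np.exp(rng.normal(np.log(point), log_se, size=size))

cvd_rr = sample_rr(0.79, 0.70, 0.88, 100_000)
cancer_rr = sample_rr(0.87, 0.80, 0.93, 100_000)

cvd_median = np.median(cvd_rr)
cvd_low, cvd_high = np.quantile(cvd_rr, [0.025, 0.975])
```

The sampled median and 95% interval for CVD should sit close to the published 0.79 (0.70-0.88), with small differences because the published interval is not exactly symmetric on the log scale.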

The nutrient-derived approach used in this analysis provides several advantages over direct meta-analysis sampling. First, it offers mechanistic interpretability by attributing effects to specific nutrients (ALA, fiber, magnesium). Second, poorly evidenced nuts shrink toward nutrient-predicted effects through principled hierarchical shrinkage. Third, each prior is traceable to independent meta-analyses, ensuring transparency. Fourth, compositional differences drive differential estimates across nut types: for example, the model can distinguish walnuts from almonds based on ALA content rather than relying on a single pooled nut estimate.

Both approaches use forward Monte Carlo sampling (no MCMC is needed since there is no likelihood function). The nutrient-derived approach is preferred for its mechanistic transparency and ability to differentiate nuts based on composition.

Limitations

The causal fraction estimate remains uncertain even after shrinkage, and the true value could be lower or modestly higher than the base case. Most source studies come from Western populations (US, Europe, Australia), limiting generalizability. The model still assumes sustained 28g/day intake and does not explicitly model calorie substitution, adherence decay, or personalized baseline risk.

References

Aune, Dagfinn, NaNa Keum, Edward Giovannucci, et al. 2016. “Nut Consumption and Risk of Cardiovascular Disease, Total Cancer, All-Cause and Cause-Specific Mortality: A Systematic Review and Dose-Response Meta-Analysis of Prospective Studies.” BMC Medicine 14: 207. https://doi.org/10.1186/s12916-016-0730-3.

Estruch, Ramón, Emilio Ros, Jordi Salas-Salvadó, et al. 2018. “Primary Prevention of Cardiovascular Disease with a Mediterranean Diet Supplemented with Extra-Virgin Olive Oil or Nuts.” New England Journal of Medicine 378 (25): e34. https://doi.org/10.1056/NEJMoa1800389.

Ghenis, Max. 2026. Retail Price Snapshot for Raw Shelled Tree Nuts and Peanuts, Nuts.com, April 2026. https://nuts.com/.

Orem, Asim, Fulya Balaban Yucesan, Cihan Orem, et al. 2013. “Hazelnut-Enriched Diet Improves Cardiovascular Risk Biomarkers Beyond a Lipid-Lowering Effect in Hypercholesterolemic Subjects.” Journal of Clinical Lipidology 7 (2): 123–31. https://doi.org/10.1016/j.jacl.2012.10.005.

Ros, Emilio, Ismael Núñez, Ana Pérez-Heras, et al. 2008. “Effects of a Mediterranean Diet Supplemented with Nuts on Cardiovascular Risk Factors.” Archives of Internal Medicine 168 (22): 2449–58. https://doi.org/10.1001/archinte.168.22.2449.

VanderWeele, Tyler J, and Peng Ding. 2017. “Sensitivity Analysis in Observational Research: Introducing the e-Value.” Annals of Internal Medicine 167 (4): 268–74. https://doi.org/10.7326/M16-2607.

Wu, Felicia, and Pornchai Khlangwiset. 2010. “Aflatoxin B1 and Hepatocellular Carcinoma: A Potential Public Health Concern.” Critical Reviews in Toxicology 40 (10): 790–803. https://doi.org/10.3109/10408444.2010.508723.

Xu, Jiaquan, Sherry L Murphy, Kenneth D Kochanek, and Elizabeth Arias. 2024. “Deaths: Final Data for 2021.” National Vital Statistics Reports (Hyattsville, MD) 73 (8). https://www.cdc.gov/nchs/data/nvsr/nvsr73/nvsr73-08.pdf.