Statistical analyses

Downsampled microbiome functional profiles and taxonomic composition data, metabolite measurements and quantitative clinical phenotype measurements were assessed between and within groups using non-parametric tests (Mann–Whitney U (MWU) test and Spearman correlation), with P values corrected for multiple testing using the Benjamini–Hochberg approach. All tests undertaken as part of the univariate biomarker analyses compared only two groups, with one exception: the comparison between the three study centers, for which we applied a Kruskal–Wallis test. The corresponding non-parametric directional standardized effect sizes were Cliff’s delta (for MWU tests) and Spearman’s rho (for Spearman tests). Classification models were built using multivariate O-PLS-DA with the ropls R package. Receiver operating characteristic (ROC) analysis was performed using the ROCR package. To control for the influence of covariates associated with disease severity, including sex, smoking, dietary indices and drug treatment, a post hoc testing approach was adopted as outlined above. Analyses were run in R version 4.0.2 with RStudio versions 1.4.1717 and 1.2.5033, using R packages including lmtest, orddom, ropls, ROCR, circlize, ggplot2 and PCMCR.
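For readers unfamiliar with the two quantities named above, the following is a minimal illustrative sketch of Cliff’s delta and the Benjamini–Hochberg adjustment. It is written in plain Python for portability and is not the code used in this study, where the analyses were run in R. The function names `cliffs_delta` and `benjamini_hochberg` are hypothetical and chosen only for this illustration.

```python
from typing import List, Sequence


def cliffs_delta(x: Sequence[float], y: Sequence[float]) -> float:
    """Cliff's delta: P(x > y) - P(x < y) over all cross-group pairs.

    Ranges from -1 (every x below every y) to +1 (every x above every y).
    """
    if not x or not y:
        raise ValueError("both groups must contain at least one value")
    greater = sum(1 for a in x for b in y if a > b)
    less = sum(1 for a in x for b in y if a < b)
    return (greater - less) / (len(x) * len(y))


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity and a cap of 1.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical toy values, not data from this study.
    group_a = [1.0, 2.0, 3.0]
    group_b = [4.0, 5.0, 6.0]
    print(cliffs_delta(group_a, group_b))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

In this toy example every value in the first group lies below every value in the second, so Cliff’s delta is -1. The sign of the statistic is what makes it a directional effect size, in contrast to the MWU P value alone.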

Reporting Summary

Further information on research design is available in the Nature Research Reporting Summary linked to this article.