Abstract
Evidence linking parental inflammatory bowel disease (IBD) with autism in children is inconclusive. We conducted four complementary studies to investigate associations between parental IBD and autism in children, and elucidated their underlying etiology. Conducting a nationwide population-based cohort study using Swedish registers, we found evidence of associations between parental diagnoses of IBD and autism in children. Polygenic risk score analyses of the Avon Longitudinal Study of Parents and Children suggested associations between maternal genetic liability to IBD and autistic traits in children. Two-sample Mendelian randomization analyses provided evidence of a potential causal effect of genetic liability to IBD, especially ulcerative colitis, on autism. Linkage disequilibrium score regression did not indicate a genetic correlation between IBD and autism. Triangulating evidence from these four complementary approaches, we found evidence of a potential causal link between parental, particularly maternal, IBD and autism in children. Perinatal immune dysregulation, micronutrient malabsorption and anemia may be implicated.
Main
Autism spectrum disorder (autism) is a chronic neurodevelopmental condition with a highly variable clinical manifestation1. Beyond the core phenotypic expressions of autism (social communication difficulties and restricted interests/repetitive behaviors), emerging evidence suggests that almost half of autistic individuals present with gastrointestinal symptoms (median prevalence 47%, in a review of studies published between 1980 and 2017 (ref. 2)). In addition, a recent study of 48,762 autistic children and 243,810 controls in the United States (US) suggested that children with autism were approximately 47% more likely to be diagnosed with Crohn’s disease (Crohn’s) and 94% more likely to be diagnosed with ulcerative colitis (UC) compared with controls3.
Crohn’s and UC are the major subtypes of inflammatory bowel disease (IBD), a chronic condition associated with immune system dysregulation, intestinal microbiome alterations, micronutrient malabsorption and anemia4,5,6. There is evidence suggesting that these characteristics of IBD might be perinatal factors associated with autism7,8,9,10. On this basis, a potential link between parental IBD and autism in children could be hypothesized. Evidence so far is inconclusive, with only one out of the four registry-based studies in the field11,12,13 indicating an association between maternal UC and autism in children14. Moreover, the underlying etiology of any associations is unclear.
We conducted four complementary studies (Fig. 1 and Table 1) to investigate: (1) associations between parental diagnoses of IBD and autism in children in a nationwide cohort in Sweden; (2) genetic correlation between IBD and autism using genome-wide association study (GWAS) summary statistics; (3) polygenic associations between maternal genetic liability to IBD and autistic traits in children in a large UK birth cohort; and (4) potential causal effects of genetic liability to IBD on autism and the possibility of reverse causation using bidirectional two-sample Mendelian randomization (MR).
Results
Study 1: Parental IBD diagnoses and autism in children
Nationwide population-based registers are powerful resources in etiological epidemiology, as they offer large intergenerational samples, prospectively collected data, and minimal loss to follow up and selection bias (Table 1). ‘Psychiatry Sweden’ is a comprehensive national register linkage including individual-level health, demographic, and socioeconomic data across Sweden.
In a sample of 2,324,227 children born to 1,282,494 mothers and 1,285,719 fathers from ‘Psychiatry Sweden,’ we used logistic regression to assess the associations between parental IBD diagnoses and autism in children (Online Methods, Extended Data Fig. 1, Extended Data Fig. 2, Extended Data Fig. 3, Supplementary Tables S1 and S2). We ran crude models (Model 1), as well as models adjusted for covariates that have been previously identified as associated with autism in the Swedish registers, including parental age at delivery15, migrant status16, education level, family income quintile at birth17, parents’ history of psychiatric diagnosis18 prior to the birth of the child, and child’s sex, birth year, and birth order (Model 2). To avoid potential bias from assortative mating in Model 2, we additionally mutually adjusted for maternal and paternal IBD diagnoses (Model 3)19. Maternal IBD diagnosis was associated with autism in children in crude and adjusted models (any IBD diagnosis: odds ratio (OR)MODEL3 = 1.32; 95% confidence intervals (CIs): 1.25 to 1.40; Table 2). Similar results were observed in analyses of maternal UC and Crohn’s diagnoses and autism in children (Table 2). The paternal IBD associations with autism were weaker (ORMODEL3 = 1.09; 95% CIs 1.02 to 1.17) than the maternal associations (Table 2). Results of the analysis were not sensitive to the choice of parental psychiatric history variable (broad psychiatric history versus parental diagnoses of autism specifically) or to exclusion procedures that aimed to control for neurodevelopmental outcomes that we assumed to have a genetic cause, though point estimates were lower in analyses restricted to parental IBD diagnoses prior to the index person’s birth (any maternal IBD diagnosis: ORMODEL3 = 1.20; 95% CIs: 1.09 to 1.32; Supplementary Table S3).
Point estimates for associations of parental IBD diagnoses to autism without intellectual disabilities (IDs) were higher than those for autism with ID, although CIs overlapped (Supplementary Table S4).
Study 2: Genetic correlation between IBD and autism
Linkage disequilibrium score regression (LDSC) allows the estimation of the genetic correlation between complex traits such as IBD and autism by utilizing GWAS summary data20,21 (Table 1).
Using the latest GWAS summary statistics on IBD (Ncases = 25,042; Ncontrols = 34,915) (ref. 22), Crohn’s (Ncases = 12,194; Ncontrols = 28,072) (ref. 22), UC (Ncases = 12,366; Ncontrols = 33,609) (ref. 22), and autism (Ncases = 18,381; Ncontrols = 27,969) (ref. 23), we performed LDSC. We found no evidence of a genetic correlation between genetic liability to autism and IBD, UC, or Crohn’s (Table 3). Heritability scores (z scores: 8.34–11.75), chi-squares (1.20–1.53), and intercepts (1.01–1.12) satisfied the conditions to provide reliable LDSC estimates (Supplementary Table S5).
Study 3: Polygenic risk for IBD and broad autistic traits
Polygenic Risk Score (PRS) approaches enable the estimation of an individual’s underlying genetic liability to a complex trait. PRSs require individual-level genotype data, and are calculated as the sum of the individual’s risk alleles, weighted by the effect sizes of each variant identified in the GWAS of the trait24. In the context of the present study, individual-level data from the Avon Longitudinal Study of Parents and Children (ALSPAC) were used25,26,27. PRS approaches are particularly important for the triangulation of evidence from traditional observational approaches, since they allow the refinement of the exposure used in the context of the observational study (that is, they can potentially overcome misclassification bias of an observational study) (Table 1)28.
In 7,348 mothers and 7,503 children of the ALSPAC cohort, we calculated PRSs for IBD, Crohn’s, and UC, using the latest available GWAS summary data22, and assessed associations with an available measure of broad autistic traits, autism mean factor score29 (Methods, Extended Data Fig. 4).
Maternal PRS for IBD and broad autistic traits in children
Maternal polygenic risk for UC and Crohn’s was associated with a higher autism factor mean score in the child (UC: βPRS = 0.02; 95%CIs: 0.003 to 0.05; P = 0.03; Crohn’s: βPRS = 0.03; 95%CIs: 0.01 to 0.05; P = 0.004). Similar results were found across other P-value thresholds (0.50–0.05). The effect size of the association between maternal polygenic risk for IBD and autism factor mean score was comparable with that of UC and Crohn’s, although CIs crossed the null (βPRS = 0.02; 95%CIs: −0.004 to 0.040; P = 0.1; R2 = 0.06; Table 4, Extended Data Fig. 5, Supplementary Table S6).
Child’s PRS for IBD and broad autistic traits
There was no evidence of associations between a child’s PRS for IBD, UC, Crohn’s, and autism mean factor score in children (IBD: βPRS = 0.003; 95%CIs: −0.02 to 0.02; P = 0.79; R2 = 0.05; UC: βPRS = 0.001; 95%CIs: −0.02 to 0.02; P = 0.89; R2 = 0.05; Crohn’s: βPRS = 0.007; 95%CIs: −0.01 to 0.03; P = 0.49; R2 = 0.05; Table 4, Extended Data Fig. 6, Supplementary Table S7).
Study 4: Causal effect of genetic liability to IBD on autism
MR is a causal inference approach that can overcome limitations of observational and PRS approaches (Table 1). MR is based on the principles of instrumental variables analyses, utilizing germline genetic variants as instruments for exposures to assess their causal effects on outcomes of interest30,31,32. Since genetic variants are randomly assorted at meiosis and fixed at conception, the method is effective in minimizing confounding and reverse causation bias that hampers observational studies32,33. In contrast to PRS approaches that estimate associations, under certain assumptions that the instruments should satisfy, MR can generate unbiased causal effect estimates. The core assumptions of MR are outlined in the Methods.
Within a two-sample MR33 framework, we extracted common genetic variants robustly associated (P ≤ 5.0 × 10−8) with IBD, Crohn’s, and UC using the latest available GWAS summary data22, and assessed their causal effects on 18,381 autism cases and 27,969 controls of the PGC and the iPSYCH consortia23 (Online Methods, Extended Data Fig. 7, Supplementary Table S8). MR analyses were additionally performed using a subsample of the iPSYCH excluding all ID cases (Ncases = 11,203; Ncontrols = 22,555; Methods, Extended Data Fig. 8, Supplementary Table S9).
The mean F statistics of the IBD, UC, and Crohn’s instruments were 67, 68, and 70, respectively, suggesting adequate strength34. There was evidence of a causal effect of genetic liability to UC on risk of autism (IVWOR = 1.04; 95% CIs: 1.01 to 1.07; P = 0.006). Evidence for the effect of genetic liability to IBD and Crohn’s on autism risk was weaker, although the magnitude and direction of the effect estimates were comparable with the UC results (Table 5).
The magnitude and direction of causal effect estimates were consistent across all sensitivity analyses, and there was no evidence to suggest the influence of horizontal pleiotropy (Supplementary Table S10). Results of analyses with instruments extracted from the autism GWAS excluding ID cases were comparable with our primary effect estimates (Supplementary Table S11).
Causal effects of genetic liability to autism on risk of IBD
We assessed the possibility of reverse causation by performing bidirectional two-sample MR. We extracted common genetic variants associated (P ≤ 5.0 × 10−7) with autism, as well as autism without ID23, and assessed their potential causal effects on IBD (Ncases = 25,042; Ncontrols = 34,915), UC (Ncases = 12,366; Ncontrols = 33,609), and Crohn’s (Ncases = 12,194; Ncontrols = 28,072) (ref. 22) (Online Methods, Extended Data Figs. 6 and 7, Supplementary Tables S8 and S9). The mean F statistic of the autism instruments was 28, suggesting adequate strength. There was no evidence of a causal effect of genetic liability to autism on risk of IBD, UC, or Crohn’s (Table 5). The estimates were consistent across sensitivity analyses, with overlapping confidence intervals, and were unlikely to be influenced by horizontal pleiotropy (Supplementary Table S12). Repeating our analyses with instruments extracted from the autism GWAS excluding all ID cases yielded similar results (Supplementary Table S13).
Discussion
We used four complementary approaches to investigate the associations between parental diagnoses and genetic liability to IBD and autism in children. On conducting a nationwide register-based cohort study in Sweden we found evidence of associations between parental diagnoses of IBD and autism in children. Importantly, the maternal effect sizes were larger than the paternal sizes, without overlapping CIs. PRS analyses in the ALSPAC birth cohort suggested associations between maternal genetic liability to IBD and autism traits in children, while two-sample MR studies provided evidence of a potential causal effect of genetic liability to IBD on autism risk. There was no evidence to suggest a genetic correlation between autism and IBD, as indicated by LDSC analyses.
A number of studies have investigated the potential associations between parental autoimmune conditions and autism. Several parental autoimmune conditions have been previously identified as linked to autism in children, including rheumatoid arthritis35 and psoriasis36. In the case of IBD, evidence from previous studies is inconclusive. In contrast to studies to date, the use of four distinct study designs is a notable strength of our approach. Using study designs with different strengths and sources of bias (Table 1) allowed the triangulation of our findings, rather than relying on arbitrary P-value thresholds28,37. The Swedish nationwide register-based cohort study of over two million parent–child pairs is the largest to date on parental IBD and autism in children. In addition, the present study benefited from the longest to date follow-up period (1987–2016), as well as exposure and outcome ascertainment from both inpatient and outpatient specialist care.
The ALSPAC cohort containing genotype data for over 7,000 mothers and children, as well as broad autistic trait measures for over 13,000 children, is one of the richest resources for the investigation of the potential polygenic associations between maternal polygenic risk for IBD and autism in children. Finally, in the MR analyses, we used the largest GWAS data available for all conditions and conducted several sensitivity analyses to test the robustness of our findings.
Our study has several limitations. First, in the Swedish registers, the possibility of measurement error in IBD diagnoses cannot be excluded. However, this is likely to be nondifferential in relation to our study outcome and would therefore bias our findings towards the null. Second, while PRSs were based on large GWAS samples, it was not possible for us to investigate the variance explained by the PRSs in our target sample. However, based on previous studies38,39, it could be expected that our PRSs potentially explain little variance in the phenotype (≈1.5–3.0%), a limitation that could be overcome with future larger GWAS. Third, the autism mean factor score used in the present analyses was derived from individual measures that were not primarily intended to assess autism. However, the score has been found predictive of a clinical autism diagnosis (measured independent of the variables contributing to the derivation of the mean factor score) and presents associations with autism PRS in ALSPAC, as suggested by previous studies29,40. Fourth, in two-sample MR analyses investigating the effects of genetic liability to autism on risk of IBD, we used a relaxed instrument inclusion P-value threshold. This could potentially result in including weak instruments and therefore bias the causal effect estimates. However, the F statistic of the autism instruments in our analyses suggested that weak instrument bias is unlikely. Fifth, although we performed a series of sensitivity analyses to assess the robustness of the causal effect estimates, the possibility of horizontal pleiotropy influencing the present findings cannot entirely be ruled out, especially considering emerging evidence on the genetic architecture of IBD, implicating immune and endocrine-related genes41. Sixth, using GWAS data, we could only investigate the possible contribution of common variants acting under an additive model and not any contribution from rare variation, which is found to be implicated in autism42,43.
Finally, an important consideration is that the present study has been conducted using samples and GWAS data of predominantly European ancestry individuals. Although a proportion of index children in the registry-based study had at least one parent of non-European descent (10%), the use of European ancestry summary and individual-level genetic data in LDSC, PRS, and MR analyses, was unavoidable considering the largest available GWAS on autism and IBD has been conducted in European ancestry samples. The increasing representation of ancestrally diverse populations in biobanks and health registers will allow future studies to build on the present findings.
Overall, the larger maternal than paternal effect sizes in the registry-based study, together with the associations identified between maternal, but not child’s, PRS for IBD and the child’s autism factor mean score, could potentially indicate in-utero effects. This interpretation is further supported by the absence of evidence for a genetic correlation between autism and IBD. Specifically, based on liability-threshold models of inheritance44,45,46,47 (and assuming that liability to IBD is normally distributed in the population), it could be hypothesized that liability to IBD will be expressed after a threshold has been exceeded, depending on a synergy of genetic variation, environmental factors, and chance. Mothers close to the threshold, but not exceeding it, could be expected to express subphenotypic manifestations of IBD, such as immunological alterations, micronutrient deficiencies, or anemia. These subphenotypic manifestations could influence fetal development. In fact, several immune pathways have been implicated in both Crohn’s and UC (which are strongly genetically correlated: rg = 0.5; P = 2.0 × 10−13 (ref. 20)), including T-helper 1, T-helper 2, and T-helper 17 cytokines48, which are increasingly identified as linked to perinatal complications49,50,51, as well as autism52,53,54. Similarly, micronutrient malabsorption and anemia during pregnancy have been found to be associated with autism in children9,10. The availability of genotype and biospecimen data in autism family cohorts such as the Simons Simplex Collection and the Simons Foundation Powering Autism Research (SPARK)55,56 is expected to allow the integration of genomic, immune, and gut microbiome profiling approaches to elucidate the potential etiology and biological pathways underlying the identified associations.
In conclusion, triangulating evidence from a nationwide register-based cohort study, genetic correlation, PRS analyses, and MR, we found evidence suggesting associations between parental, particularly maternal, diagnoses of IBD, and autism in children. Links between maternal genetic liability to IBD and autism in children may reflect the influence of the maternal genotype on the prenatal/intrauterine environment. Investigating the mechanisms behind these findings may provide valuable insights into the origins of autism.
Methods
Throughout the text, the terms autism and autistic people/individuals are used, in line with recent evidence suggesting that these terms are preferred in the autistic community and are less stigmatizing58,59.
Study 1: Swedish cohort study
We used individual-level data from ‘Psychiatry Sweden’ to investigate whether parental IBD diagnosis is associated with autism diagnosis in children. ‘Psychiatry Sweden’ is a comprehensive national register linkage, with approval from the Stockholm regional ethical review committee (DNR 2010/1185-31/5, 2016/987-32). In line with the standards of all register-based research in Sweden and in keeping with the specific ethical approval for ‘Psychiatry Sweden’, informed patient consent was not required for the analysis of the anonymized data.
All children born in Sweden from 1 January 1987 to 31 December 2010 (N = 2,837,045) were eligible index persons, with follow up to 31 December 2016. Exclusion criteria were: children born outside Sweden (n = 292,023), children not registered in the Medical Birth Register (n = 74,240), children resident in Sweden for under 5 years (n = 23,495), children of multiple pregnancy (n = 67,309), children who were adopted (n = 2,425), children who received a diagnosis of autism or ID who also had a documented genetic/metabolic condition known to cause neurodevelopmental disorders (for example, trisomies) (n = 7,873), or incomplete parental records (n = 45,453) (ref. 60). The study population included 2,324,227 children born to 1,282,494 mothers and 1,285,719 fathers (Extended Data Fig. 1).
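The reported study population can be checked against the exclusion counts. The following is a minimal Python sketch, assuming the exclusions listed above are mutually exclusive (that is, applied sequentially so that no child is counted twice); the dictionary keys are illustrative labels, not register variable names.

```python
# Counts are copied from the text; key names are illustrative only.
eligible = 2_837_045
exclusions = {
    "born_outside_sweden": 292_023,
    "not_in_medical_birth_register": 74_240,
    "resident_under_5_years": 23_495,
    "multiple_pregnancy": 67_309,
    "adopted": 2_425,
    "genetic_or_metabolic_condition": 7_873,
    "incomplete_parental_records": 45_453,
}

# Assumption: exclusion groups do not overlap.
total_excluded = sum(exclusions.values())
study_population = eligible - total_excluded

print(total_excluded)    # 512818
print(study_population)  # 2324227
```

Under that assumption, the remainder matches the 2,324,227 children reported for the study population.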
The National Patient Register (NPR) includes inpatient care records beginning in 1973, outpatient physician visits in specialist care from 1997, outpatient psychiatric diagnoses from 2006, and children and adolescent psychiatric care from 2011. Autism was identified in the NPR using ICD-9 and ICD-10 codes (Extended Data Fig. 9). Lifetime history of parental IBD, Crohn’s disease (Crohn’s) and ulcerative colitis (UC) was identified using ICD-9 and ICD-10 codes in the NPR (Extended Data Fig. 9). We used parental lifetime IBD diagnosis as the primary exposure. This approach was considered appropriate since data from outpatient specialist care were not originally included in the NPR and were added starting in the late 1990s. Extended Data Fig. 9 illustrates the frequency of IBD diagnoses (for mothers and fathers of the study cohort) in the NPR from 1987 to 2010.
Using STATA/MP17, we estimated the odds ratios and 95% CIs of the association of mother’s and father’s diagnosis of IBD (any IBD, Crohn’s, or UC) with autism in children using generalized estimating equation (GEE) logistic models with robust standard errors, accounting for clustering of multiple children born to the same parents.
Model 1 was unadjusted. Model 2 was adjusted for parental age at delivery15, migrant status16, education level, family income quintile at birth17, parents’ history of psychiatric diagnosis prior to the birth of the child, and child’s sex, birth year, and birth order (Supplementary Table S14 for collinearity diagnostics of covariates included in the models). Model 3 was additionally mutually adjusted for maternal and paternal IBD diagnoses to avoid bias from assortative mating19. Additionally, we investigated associations between any parental IBD diagnoses and autism in children with and without ID separately, since these groups may have distinct genetic and environmental risk factors18,61,62,63 and outcomes64,65. Due to the number of analyses run in the study, we applied a Bonferroni correction to account for multiple testing (0.05/42 = 0.0012). We compared the results of three sensitivity analyses with the results of the main analysis. First, we restricted parental IBD diagnoses to those recorded prior to the birth of the index person. Second, we adjusted Models 2 and 3 for parental lifetime autism diagnoses specifically, instead of the broad definition of parental psychiatric history used in the main analysis. Finally, we repeated the analyses without exclusion of the 7,873 children who had a documented genetic/metabolic condition assumed to be causing their neurodevelopmental disorder.
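The Bonferroni threshold quoted above is the nominal significance level divided by the number of tests. A minimal Python sketch of that arithmetic follows; the function name is hypothetical, and the values 0.05 and 42 are the ones given in the text.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be a positive integer")
    return alpha / n_tests


# Values from the text: alpha = 0.05 across 42 tests.
threshold = bonferroni_threshold(0.05, 42)
print(round(threshold, 4))  # 0.0012
```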
Study 2: LDSC
We used LDSC to estimate the genetic correlation between genetic liability to autism and IBD, Crohn’s, and UC.
LDSC allows the estimation of the genetic correlation between polygenic traits using GWAS summary statistics by capitalizing on patterns of linkage disequilibrium among common genetic variants20. We used the latest available GWAS summary data on autism (Ncases = 18,381; Ncontrols = 27,969) (ref. 23), IBD (Ncases = 25,042; Ncontrols = 34,915) (ref. 22), Crohn’s (Ncases = 12,194; Ncontrols = 28,072) (ref. 22), and UC (Ncases = 12,366; Ncontrols = 33,609) (ref. 22). Detailed information on study samples and case definition can be found in the original publications.
We followed the suggested protocol for LDSC analyses (https://github.com/bulik/ldsc/wiki). Using the LDSC (LD score) v.1.0.1 software in Python v.2.7.18, we estimated genetic correlations using pre-computed LD scores from the 1000 Genomes project European data66 (from: https://data.broadinstitute.org/alkesgroup/LDSCORE/eurwld_chr.tar.bz) with an unconstrained intercept term to account for any sample overlap and population stratification.
Ethics committee approval was not required for this analysis of publicly available GWAS summary statistics.
Study 3: Polygenic Risk Score analyses in the ALSPAC cohort
Discovery sample
Common genetic variants, corresponding alleles, effect sizes, and P values were extracted to calculate PRSs from the GWAS summary data of IBD22, UC22, and Crohn’s22 described above.
Target sample
ALSPAC is a UK prospective birth cohort study based in Bristol and surrounding areas, which recruited pregnant women with expected delivery dates from 1 April 1991 to 31 December 1992; 14,541 women were initially enrolled, with 14,062 children born, and 13,988 children alive at 1 year of age. Detailed information on the cohort is available elsewhere25,26,27. A fully searchable study data dictionary is available at: http://www.bristol.ac.uk/alspac/researchers/our-data/. Ethical approval for the study was obtained from the ALSPAC Ethics and Law Committee and the Local Research Ethics Committees.
Genetic data
A total of 10,015 ALSPAC mothers were genotyped on the Illumina Human660W quad genome-wide single nucleotide polymorphism (SNP) genotyping platform at the Centre National de Génotypage, and genotypes were identified using Illumina GenomeStudio. A total of 9,912 ALSPAC children were genotyped on the Illumina HumanHap550 quad chip genotyping platforms by 23andMe subcontracting the Wellcome Trust Sanger Institute, Cambridge, UK, and the Laboratory Corporation of America, Burlington, NC, United States.
PLINK v1.07 was used for quality control filtering67. Specifically, individuals were excluded on the basis of the following filters: (1) gender mismatches; (2) undetermined X chromosome heterozygosity; (3) over 3% missingness (children) or over 5% missingness (mothers); (4) evidence of cryptic relatedness (>10% of shared alleles identical by descent in children and >12.5% of shared alleles identical by descent in mothers); (5) non-European ancestry, assessed by multidimensional scaling analysis compared with HapMap 2 individuals. SNPs were excluded on the basis of the following filters: (1) minor allele frequency < 1%; (2) call rate < 95%; (3) Hardy–Weinberg equilibrium (HWE) P < 5.0 × 10−7. Maternal and offspring genotype data were combined and imputed using Impute v.2.2.2 against 1000 Genomes reference panel (v.1, phase 3, December 2013 release).
After quality control and excluding participants who had withdrawn consent, genetic data were available for 7,921 mothers and 7,977 children of European ancestry. Consent for biological samples was collected in accordance with the Human Tissue Act (2004).
Broad autistic traits: autism factor mean score
We used a measure of broad autistic traits previously estimated in ALSPAC as the mean score of seven factors derived from a factor analysis of 93 measures related to autism in ALSPAC29. The measure was available in 13,103 children, and strongly predictive of the autism diagnosis measured independently via school records, record linkage, and parental reports29. Other autism trait measures or diagnoses were not used, as there were fewer genotyped mothers and children with these measures.
Calculation of PRSs in ALSPAC and statistical analysis
PRSs were calculated using PLINK v.1.9, applying the method described by the Psychiatric Genomics Consortium (PGC)68. SNPs with mismatching alleles between the discovery and target dataset were removed. The Major Histocompatibility Complex (MHC) region was removed (25–34 Mb), except for one SNP representing the strongest signal within the region. Using ALSPAC data as the reference panel, SNPs were clumped with an r2 of 0.25 and a physical distance threshold of 500 kb. The optimal P-value threshold for PRS is dependent on discovery and target sample sizes, as well as SNP inclusion parameters (for example, r2) (refs. 24,69). For this reason, we calculated PRS for each participant across 13 P-value thresholds (P < 5.0 × 10−8 to P < 0.5), standardized by subtracting the mean and dividing by the standard deviation. We defined PRS corresponding to P-value threshold 0.05 as our primary exposure, based on a previous ALSPAC study70. This threshold has been found to have sufficient predictive ability for IBD and its subtypes39. We could not directly assess the predictive power and optimal P-value threshold of the PRSs in our target sample, as there were few UC (n = 12) and Crohn’s cases (n = 16).
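The scoring logic described above (select SNPs below a P-value threshold, sum risk-allele dosages weighted by GWAS effect sizes, then standardize) can be illustrated with a minimal Python sketch. This is not the PLINK procedure used in the study: clumping and MHC handling are omitted, all names and toy values are hypothetical, and the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def select_snps(p_values, threshold):
    """Indices of SNPs whose discovery GWAS P value is below the threshold."""
    return [i for i, p in enumerate(p_values) if p < threshold]


def polygenic_risk_score(dosages, weights, keep):
    """Sum of risk-allele dosages (0-2) weighted by GWAS effect sizes."""
    return sum(dosages[i] * weights[i] for i in keep)


def standardize(scores):
    """Subtract the mean and divide by the (sample) standard deviation."""
    m, sd = mean(scores), stdev(scores)
    if sd == 0:
        raise ValueError("scores have zero variance")
    return [(s - m) / sd for s in scores]


# Hypothetical toy data: three SNPs, three individuals.
weights = [0.10, 0.05, 0.20]   # per-allele effect sizes from a discovery GWAS
p_values = [1e-9, 0.03, 0.20]  # discovery GWAS P values
genotypes = {"id1": [0, 1, 2], "id2": [1, 1, 0], "id3": [2, 0, 1]}

keep = select_snps(p_values, threshold=0.05)  # primary threshold in the text
raw = {pid: polygenic_risk_score(g, weights, keep) for pid, g in genotypes.items()}
z = dict(zip(raw, standardize(list(raw.values()))))
```

Repeating the call to `select_snps` with each of the 13 thresholds would give one standardized score per threshold and participant.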
After constructing PRSs for IBD, UC, and Crohn’s in mothers and children, we performed linear regressions using STATA/MP 15 to examine associations with the standardized autism factor mean score in childhood. Analyses were adjusted for child’s sex and the first ten principal components of the ALSPAC genotype data to avoid population stratification bias24.
Study 4: Two-sample MR
We performed two-sample MR to assess the causal effects of genetic liability to IBD and its subtypes on autism, and vice versa.
MR can be implemented as an instrumental variable approach, utilizing common genetic variants as instruments for exposures of interest, allowing assessment of causal effects and their direction on outcomes. MR relies on the following assumptions: (1) there must be a robust association between the common genetic variants and the exposure; (2) the variants should operate on the outcome entirely via the exposure (that is, no horizontal pleiotropy, the phenomenon in which the genetic variant influences multiple phenotypes through biologically distinct pathways); and (3) the variants should not be associated with any confounders of associations between exposure and outcome71. In this study, we applied two-sample MR, in which the effect sizes and standard errors of the instruments for the exposure and the outcome were extracted from separate GWASs conducted in independent samples from the same underlying population33.
Genetic instruments
Genetic instruments were extracted from the overlapping set of SNPs between the autism23, IBD22, UC22, and Crohn’s22 GWASs. This ensured that all selected genetic instruments would be present in the outcome GWAS.
GWAS summary data were restricted to a common set of SNPs and then clumped in PLINK 1.90 using the 1000 Genomes66 phase 3 European ancestry reference panel, and an r2 = 0.01, within a 10,000 kb window. Among the independent variants, instruments were defined using a genome-wide significance threshold of P ≤ 5 × 10−8. The only exception was autism, as only two independent and genome-wide significant variants were identified. We therefore relaxed the P-value threshold to 5 × 10−7 to improve statistical power, as used previously72. Extended Data Fig. 7 illustrates the process of instrument definition, and Supplementary Table S8 contains information on the genetic instruments used.
Harmonization
We harmonized the alleles of the outcome on the exposure, to ensure SNP-exposure and SNP-outcome effects correspond to the same allele. Variants identified as palindromic were removed, as the effect allele frequencies in the IBD, UC, and Crohn’s GWASs were not provided. Supplementary Tables S15 and S16 contain details of the harmonized datasets.
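The harmonization step can be sketched in Python as follows. This is a simplified illustration, not the procedure implemented in the software used for the study: it assumes both GWASs report alleles on the same strand, and the dictionary field names are hypothetical.

```python
# Allele pairs that read the same on both strands (A/T and C/G).
PALINDROMIC = {frozenset("AT"), frozenset("CG")}


def is_palindromic(allele_1, allele_2):
    """True for A/T or C/G SNPs, whose strand cannot be resolved without frequencies."""
    return frozenset((allele_1.upper(), allele_2.upper())) in PALINDROMIC


def harmonize(exposure, outcome):
    """Return the outcome beta aligned to the exposure effect allele, or None if dropped."""
    ea, oa = exposure["effect_allele"], exposure["other_allele"]
    if is_palindromic(ea, oa):
        return None  # removed, as allele frequencies are unavailable
    outcome_alleles = (outcome["effect_allele"], outcome["other_allele"])
    if outcome_alleles == (ea, oa):
        return outcome["beta"]   # already aligned
    if outcome_alleles == (oa, ea):
        return -outcome["beta"]  # effect allele swapped: flip the sign
    return None                  # alleles do not match: drop the variant


# Hypothetical example: the outcome GWAS reports the effect for the other allele.
exposure_snp = {"effect_allele": "A", "other_allele": "G"}
outcome_snp = {"effect_allele": "G", "other_allele": "A", "beta": 0.02}
aligned_beta = harmonize(exposure_snp, outcome_snp)
print(aligned_beta)  # -0.02
```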
Inverse Variance Weighted MR
The primary MR analysis was the inverse variance weighted (IVW) method which provides an overall causal effect estimate of the exposure on the outcome, estimated as a meta-analysis of the ratios of the SNP-outcome effect to the SNP-exposure effect weighted by each SNP’s relative precision73.
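As an illustration of the IVW estimator described above, the following minimal Python sketch computes per-SNP ratio estimates and combines them with inverse-variance weights. It is a fixed-effect version using a first-order approximation of the ratio standard error (se_outcome / |beta_exposure|), which is an assumption here and may differ from the implementation in the software used for the study; the function name and toy values are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse variance weighted estimate from per-SNP ratios."""
    # Ratio estimate per SNP: SNP-outcome effect divided by SNP-exposure effect.
    ratios = [bo / be for be, bo in zip(beta_exposure, beta_outcome)]
    # Weight = 1 / var(ratio), with var(ratio) approximated by (se_out / beta_exp)^2.
    weights = [(be / so) ** 2 for be, so in zip(beta_exposure, se_outcome)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return estimate, standard_error


# Hypothetical toy data: every SNP implies the same ratio (0.04).
beta_exp = [0.10, 0.20, 0.15]
beta_out = [0.004, 0.008, 0.006]
se_out = [0.002, 0.002, 0.002]

estimate, se = ivw_estimate(beta_exp, beta_out, se_out)
odds_ratio = math.exp(estimate)  # the outcome effect is on the log-odds scale
ci_low = math.exp(estimate - 1.96 * se)
ci_high = math.exp(estimate + 1.96 * se)
```

SNPs with larger exposure effects and more precise outcome effects receive more weight, which is why strongly associated instruments dominate the pooled estimate.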
Sensitivity analyses
We assessed the strength of the instruments by estimating the mean F statistic. As a rule of thumb, the IVW is unlikely to suffer from weak instrument bias if mean F > 10 (ref. 34).
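As a rough illustration of this check, the sketch below uses a common per-SNP approximation, F = (SNP-exposure effect / its standard error)^2, and averages it across instruments. The exact formula used in the study is not stated here, so this approximation, the function name `mean_f_statistic`, and the input values are assumptions.

```python
# Toy instrument-strength check using the approximation F = (beta / se)^2.


def mean_f_statistic(beta_exposure, se_exposure):
    """Average of the approximate per-SNP F statistics."""
    f_values = [(b / s) ** 2 for b, s in zip(beta_exposure, se_exposure)]
    return sum(f_values) / len(f_values)


beta_exposure = [0.1, 0.2]
se_exposure = [0.02, 0.02]

mean_f = mean_f_statistic(beta_exposure, se_exposure)
weak_instrument_concern = mean_f <= 10  # rule of thumb from the text
print(mean_f, weak_instrument_concern)
```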
We assessed the consistency of the IVW causal effect estimates using sensitivity analyses, including MR Egger regression73, weighted median74, and weighted mode75 (Supplementary Table 20).
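Of these sensitivity analyses, MR Egger regression is the simplest to sketch: it is a weighted regression of SNP-outcome effects on SNP-exposure effects that, unlike IVW, includes an intercept. The slope is the causal estimate and a non-zero intercept suggests directional pleiotropy. The version below is a minimal illustration with hypothetical names and data; it assumes inverse-variance weights from the SNP-outcome standard errors and omits standard errors and tests.

```python
# Toy MR Egger point estimates via weighted least squares with an intercept.


def mr_egger(beta_exposure, beta_outcome, se_outcome):
    """Return (slope, intercept) from regressing beta_outcome on
    beta_exposure with weights 1 / se_outcome^2."""
    # Orient every SNP so that its exposure effect is positive.
    oriented = [
        (-bx, -by) if bx < 0 else (bx, by)
        for bx, by in zip(beta_exposure, beta_outcome)
    ]
    x = [bx for bx, _ in oriented]
    y = [by for _, by in oriented]
    w = [1.0 / s ** 2 for s in se_outcome]

    total_w = sum(w)
    x_mean = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_mean = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    s_xy = sum(wi * (xi - x_mean) * (yi - y_mean) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_mean) ** 2 for wi, xi in zip(w, x))

    slope = s_xy / s_xx
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Outcome effects built as 0.01 + 0.2 * exposure effect; the second SNP is
# reported with a negative exposure effect and is re-oriented by the function.
beta_exposure = [0.1, -0.2, 0.3, 0.4]
beta_outcome = [0.03, -0.05, 0.07, 0.09]
se_outcome = [0.01, 0.01, 0.01, 0.01]

slope, intercept = mr_egger(beta_exposure, beta_outcome, se_outcome)
print(slope, intercept)
```

In the toy data the regression recovers a slope of about 0.2 and an intercept of about 0.01; an IVW fit forced through the origin would absorb that intercept into a biased slope.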
The autism GWAS used in our primary analyses included a proportion of autism cases with intellectual disability (ID)23. We tested the consistency of the causal effect estimates using GWAS summary data on a subsample of the iPSYCH cohort76 excluding all ID cases (Ncases = 11,203; Ncontrols = 22,555). Extended Data Fig. 8 visualizes the process of instrument definition, and Supplementary Tables S9, S17 and S18 contain details on the instruments used and the harmonized datasets.
Two-sample MR analyses were performed using the TwoSampleMR R package77 in R v.3.5.1. Ethics committee approval was not required for this analysis of GWAS summary statistics.
Reporting summary
Further information on research design is available in the Nature Research Reporting summary linked to this article.
Data availability
Swedish registry data: Individual-level data from ‘Psychiatry Sweden’ were used under ethics approval from the Stockholm regional ethical review committee (DNR 2010/1185-31/5, 2016/987-32). Due to the sensitive nature of the data, they are not publicly available. Data must remain in the country, according to national laws and registry regulations. Access is restricted to projects approved by the Swedish ethical review authority (https://etikprovningsmyndigheten.se/) and in agreement with the register holders. See https://www.registerforskning.se/en/ for guidance on how to conduct Swedish register-based research. Since there is no central access point for public authority data in Sweden, this process may require coordination with multiple register holders (for example, Statistics Sweden, The National Board of Health and Welfare) and requires, in our experience, at least 1 year from the time of ethical approval, depending on workload for each register holder.
GWAS summary data: GWAS summary data for IBD, UC, Crohn’s, and autism used in the LDSC, PRS and MR analyses are publicly available (IBD: http://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST004001-GCST005000/GCST004131/; UC: http://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST004001-GCST005000/GCST004133/; Crohn’s: http://ftp.ebi.ac.uk/pub/databases/gwas/summary_statistics/GCST004001-GCST005000/GCST004132/; autism: https://www.med.unc.edu/pgc/download-results/). Restrictions apply to the availability of the GWAS summary data for autism without ID, in order to ensure that there is no conflict with ongoing projects, collaborations and iPSYCH’s data-sharing policies. Data can be accessed after correspondence with iPSYCH: https://ipsych.dk/. Researchers will be asked to prepare a short application, briefly describing the proposed study, and responses will typically be provided within 2 weeks.
ALSPAC data: Ethical approval for the study was obtained from the ALSPAC Ethics and Law Committee and the Local Research Ethics Committees. Individual-level data from the ALSPAC birth cohort are not publicly available for reasons of clinical confidentiality. Data can be accessed after application to the ALSPAC Executive Team, who will respond within 10 working days. Application instructions and data use agreements are available at http://www.bristol.ac.uk/alspac/researchers/access/. The minimum dataset for MR analyses is available in Supplementary Tables 8, 9, 15, 16, 17, and 18.
Code availability
Analyses were conducted using established protocols for each analytic approach used in the present study. Specifically, in the case of LDSC, the protocol described at: https://github.com/bulik/ldsc/wiki/Heritability-and-Genetic-Correlation was used. In the case of PRS calculation, the approach described at: https://www.nature.com/articles/nature13595 was applied. Finally, for two-sample MR, the approach described at: https://mrcieu.github.io/TwoSampleMR/articles/introduction.html was applied.