The members of the Systemic Lupus Erythematosus Genetics Consortium who participated are listed at the end of the article.
These authors contributed equally to this work.
John B Harley, Marta E Alarcón-Riquelme, Lindsey A Criswell, Patrick M Gaffney, Chaim O Jacob, Robert P Kimberly, Kathy L M Sivils, Betty P Tsao, Timothy J Vyse and Carl D Langefeld.
Systemic lupus erythematosus (SLE) is a clinically heterogeneous disease affecting multiple organ systems and characterized by autoantibody formation to nuclear components. Although genetic variation within the major histocompatibility complex (MHC) is associated with SLE, its role in the development of clinical manifestations and autoantibody production is not well defined. We conducted a meta-analysis of four independent European SLE case collections for associations between SLE sub-phenotypes and MHC single-nucleotide polymorphism genotypes, human leukocyte antigen (HLA) alleles and variant HLA amino acids. Of the 11 American College of Rheumatology criteria and 7 autoantibody sub-phenotypes examined, anti-Ro/SSA and anti-La/SSB antibody subsets exhibited the highest number and most statistically significant associations. HLA-DRB1*03:01 was significantly associated with both sub-phenotypes. We found evidence of associations independent of MHC class II variants in the anti-Ro subset alone. Conditional analyses showed that anti-Ro and anti-La subsets are independently associated with HLA-DRB1*03:01, and that the HLA-DRB1*03:01 association with SLE is largely but not completely driven by the association of this allele with these sub-phenotypes. Our results provide strong evidence for a multilevel risk model for HLA-DRB1*03:01 in SLE, where the association with anti-Ro and anti-La antibody-positive SLE is much stronger than that with SLE without these autoantibodies.
Systemic lupus erythematosus (SLE; OMIM 152700) is a complex autoimmune disease that can affect multiple organ systems. Processes involving both the innate and adaptive immune systems contribute to its development.
There is overwhelming evidence of a genetic component to SLE risk, with higher concordance rates observed in monozygotic twins (20–40%) than in dizygotic twins (2–5%).
For this study, we collected genetic and sub-phenotype data from 3070 SLE cases of European descent characterized in four genetic association studies of SLE. These cases were previously included in a large meta-analysis of the association between MHC genetic variation and SLE susceptibility.
We examined the 11 ACR classification criteria
The most associated marker (in terms of
The most associated amino acid (AA) was at position 77 in HLA-DRB1 with the common AA threonine having a protective effect (
Owing to the correlation between the most associated SNPs with known associated
A simple stepwise regression analysis including only AA variants indicated associations with Thr77, Leu67 and Gln96 in
Owing to the extended LD, an analysis of the MHC using stepwise regression to find evidence for multiple independently associated variants can lead to many models depending on the first marker conditioned on (used as a covariate for further association analysis). This was discussed previously
There are two main extended MHC haplotypes associated with SLE in northern Europeans that contain the class II alleles
The most strongly associated marker with the anti-La autoantibody sub-phenotype was the SNP rs2894254, in the class III region (
As with the analysis of anti-Ro, we used the BIC as an aid to model comparison. The model including AA variation has the lowest BIC (model C in
We observed significant effects for the
Thus far, we have observed strong evidence of association between
When conditioning on anti-La as a covariate,
We have provided strong evidence for the association between
If the association between
We found a significant difference in
We also found a significant increase in dosage between anti-Ro-negative cases and anti-Ro-positive cases (
We found evidence against the hypothesis that the increase in dosage is additive over the three disease levels (
We found a significant difference in
We also found a significant increase in dosage between anti-La-negative and anti-La-positive cases (
We found evidence against the hypothesis that the increase in dosage is additive over the three disease levels (
Our study was large enough to determine whether the frequency of
Our results confirm, in the largest SLE sub-phenotype genetic association study to date, that the often replicated genetic association at
We do not find conclusive evidence that variant HLA AAs explain the majority of the MHC association signal in anti-Ro and anti-La autoantibody subsets in SLE. This is largely due to the confounding effects of the extended LD displayed by the associated DRB1*03:01 and, to a lesser extent, DRB1*15:01 haplotypes in our study cohorts. These results contrast with those of a recent study in anti-CCP-positive rheumatoid arthritis, where five HLA AA variants were suggested to largely explain the MHC association with disease status.
Limitations of the present study include the heterogeneity in autoantibody testing procedures and sub-phenotype data collection between the four studies. As a result, data were tabulated and analysed in an essentially binary format (that is, individual cases were classified as positive, negative or missing for each trait), to allow meta-analysis. However, in so doing, a degree of noise is inevitable, which would reduce our power to detect true association signals particularly in the less common sub-phenotypes. We were also limited by the imputation required to analyse a consistent set of SNPs across studies and the reliance on HLA imputation. In addition, we are constrained in our conclusions on differences in results for anti-Ro and anti-La antibody subsets given the much smaller sample size available for the anti-La phenotype. Thus, we have confined some of our analyses to the most robust association; that of
In both anti-Ro and anti-La sub-phenotypes, we find evidence of secondary independent associations in the class II region of the MHC after conditioning on
Although we do find evidence of an independent class III association with anti-Ro, there is some uncertainty. We find a significant association with the class III SNP rs3130781 conditional on the AA Thr77-DRB1. However, when conditioning on the markers in model C in
The results of the present study, while enlightening, are confounded by the strong and extended LD present on the principally associated
This study is a meta-analysis of four studies taken from work described in a previous paper.
We only analysed SNPs that passed QC in our previous paper,
We imputed HLA genotypes using HLA*IMP V2.
AA sequences for each HLA allele were extracted from the European Bioinformatics Institute HLA database (
We had data at 338 AA positions that had variable AAs (HLA-A = 67, HLA-B = 75, HLA-C = 71, HLA-DPB1 = 21, HLA-DQA1 = 41, HLA-DQB1 = 61, HLA-DRB1 = 52). Owing to multiple possible AAs at each position, we had 1255 possible position/AA variants in total.
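The bookkeeping behind these counts can be sketched as follows. This is a minimal illustration, assuming aligned protein sequences of equal length per allele; the allele names and sequences are invented toy values, not real HLA data, and the function names are hypothetical.

```python
from collections import defaultdict

# Toy aligned protein sequences keyed by hypothetical allele names.
# These are NOT real HLA sequences; they only illustrate the bookkeeping.
TOY_ALLELES = {
    "TOY*01": "MKTLA",
    "TOY*02": "MKSLA",
    "TOY*03": "MRSLG",
}


def variable_positions(alleles):
    """Map each 1-based position to the set of residues seen there,
    keeping only positions where more than one residue occurs."""
    residues = defaultdict(set)
    for seq in alleles.values():
        for pos, aa in enumerate(seq, start=1):
            residues[pos].add(aa)
    return {pos: aas for pos, aas in residues.items() if len(aas) > 1}


def position_aa_variants(alleles):
    """List every (position, residue) pair at the variable positions."""
    variable = variable_positions(alleles)
    return sorted((pos, aa) for pos, aas in variable.items() for aa in aas)


def encode_carrier(allele_pair, alleles, pos, aa):
    """Count how many of an individual's two alleles carry residue `aa`
    at `pos` (0, 1 or 2), giving a dosage usable as a regression predictor."""
    return sum(alleles[a][pos - 1] == aa for a in allele_pair)


variable = variable_positions(TOY_ALLELES)
variants = position_aa_variants(TOY_ALLELES)
print(len(variable), "variable positions;", len(variants), "position/AA variants")
```

Because a variable position can carry more than two residues, the number of position/AA variants is larger than the number of variable positions, which is why the two totals in the text differ.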
We analysed the data with the statistical computing language R
We examined the 11 ACR criteria
Owing to numerous single-marker associations within the extended LD of the MHC, we used conditional analyses to narrow these associations to those with the best evidence for strength and independence. All analyses utilized logistic regression with ancestry and project covariates (see above) and were halted when the evidence for association with a new term was
A simple forward stepwise approach can lead to over-fitting (selecting many correlated markers) and the results may be misleading because of selected markers potentially tagging two or more independently associated markers.
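The Results use the Bayesian information criterion (BIC) as an aid to comparing such models. A minimal sketch of that comparison is given here, assuming the standard definition BIC = k ln(n) − 2 ln(L), where k is the number of fitted parameters, n the number of observations and L the maximised likelihood. The log-likelihoods, parameter counts and sample size below are invented for illustration and are not results from this study.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood


# Hypothetical fitted models: (maximised log-likelihood, number of parameters).
# The numbers are invented for illustration only.
candidate_models = {
    "toy_1": (-520.0, 4),
    "toy_2": (-512.0, 5),
    "toy_3": (-511.5, 7),
}
n_cases = 1000  # hypothetical sample size

scores = {name: bic(ll, k, n_cases) for name, (ll, k) in candidate_models.items()}
best = min(scores, key=scores.get)
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: BIC = {score:.1f}")
print("lowest BIC:", best)
```

The ln(n) penalty means that an extra marker is retained only if it improves the fit by more than the cost of the added parameter, which counteracts the over-fitting tendency of plain forward selection.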
Given the high degree of correlation between the associated variants identified from the model searches described above, we conducted a haplotype analysis of these variants using PLINK
In the MHC, a Bonferroni adjustment for multiple testing is inappropriate because of the extensive LD and hence correlated variants. In order to determine the number of independent variants, we performed a PC analysis of all SNPs. In our data, we found that 374 PCs had eigenvalues >1 and these PCs explained 96% of the variance. Thus, we used a multiple-testing threshold of
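The counting step described above (PCs with eigenvalue > 1 and the share of variance they explain) can be illustrated on a toy example. This is a sketch under stated assumptions: the four SNP dosage vectors are invented, the PCA is carried out on the SNP correlation matrix, and the eigenvalues are obtained with a small hand-written Jacobi routine so that the example needs only the standard library. It is not the authors' implementation.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix for equal-length, non-constant columns."""
    n = len(columns[0])
    scaled = []
    for col in columns:
        mean = sum(col) / n
        dev = [x - mean for x in col]
        norm = math.sqrt(sum(d * d for d in dev))
        scaled.append([d / norm for d in dev])
    return [[sum(a * b for a, b in zip(ci, cj)) for cj in scaled] for ci in scaled]


def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-12):
    """Eigenvalues of a small symmetric matrix via cyclic Jacobi rotations."""
    a = [row[:] for row in matrix]
    m = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(m) for j in range(m) if i != j)
        if off < tol:
            break
        for p in range(m - 1):
            for q in range(p + 1, m):
                if abs(a[p][q]) < 1e-15:
                    continue
                diff = a[q][q] - a[p][p]
                # Rotation angle that zeroes the (p, q) element.
                theta = math.pi / 4 if diff == 0 else 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(m):  # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(m):  # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted((a[i][i] for i in range(m)), reverse=True)


# Toy dosages for nine individuals: two pairs of SNPs in strong LD.
snp_a = [0, 0, 0, 1, 1, 1, 2, 2, 2]
snp_b = [0, 0, 0, 1, 1, 1, 2, 2, 1]  # differs from snp_a in one individual
snp_c = [0, 1, 2, 0, 1, 2, 0, 1, 2]
snp_d = list(snp_c)                  # perfect LD with snp_c

eigenvalues = jacobi_eigenvalues(correlation_matrix([snp_a, snp_b, snp_c, snp_d]))
retained = [ev for ev in eigenvalues if ev > 1.0]
explained = sum(retained) / sum(eigenvalues)
print(f"{len(retained)} of {len(eigenvalues)} PCs have eigenvalue > 1, "
      f"explaining {explained:.0%} of the variance")
```

In this toy case, four correlated SNPs collapse to far fewer retained components, which is the sense in which the PC count estimates the number of effectively independent tests.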
We fitted a linear regression model with dosage for
Furthermore, we tested the hypothesis that the increase in dosage is additive over the three disease levels (healthy control/case sub-phenotype negative/case sub-phenotype positive). We did this by fitting a model with an additive effect for dosage over the three phenotype levels. This additive model is nested within the model used to test independence of the sub-phenotype association from the SLE-case/healthy-control association, so we performed a likelihood ratio test. A rejection of this additive model, in favour of the three-level factor model (described in the previous paragraph), is evidence that the change in dosage over sub-phenotype within cases differs from the change in dosage between healthy controls and SLE without the sub-phenotype.
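The nested comparison can be sketched as below. This is a simplified illustration, assuming Gaussian linear models with dosage as the outcome, no ancestry or project covariates, and a likelihood ratio statistic n ln(RSS_additive / RSS_factor) referred to a chi-square distribution with 1 degree of freedom. The dosages are invented toy values and the function names are hypothetical.

```python
import math


def rss_factor(dosage, level):
    """Residual sum of squares when each phenotype level has its own mean."""
    rss = 0.0
    for g in set(level):
        ys = [y for y, l in zip(dosage, level) if l == g]
        mean = sum(ys) / len(ys)
        rss += sum((y - mean) ** 2 for y in ys)
    return rss


def rss_additive(dosage, level):
    """Residual sum of squares for a straight line in the 0/1/2 level code."""
    n = len(dosage)
    mx, my = sum(level) / n, sum(dosage) / n
    sxx = sum((x - mx) ** 2 for x in level)
    sxy = sum((x - mx) * (y - my) for x, y in zip(level, dosage))
    slope = sxy / sxx
    intercept = my - slope * mx
    return sum((y - intercept - slope * x) ** 2 for x, y in zip(level, dosage))


def lrt_additive_vs_factor(dosage, level):
    """Likelihood ratio statistic and its 1-df chi-square P value."""
    n = len(dosage)
    stat = n * math.log(rss_additive(dosage, level) / rss_factor(dosage, level))
    stat = max(stat, 0.0)  # guard against tiny negative rounding error
    p_value = math.erfc(math.sqrt(stat / 2.0))  # chi-square survival, 1 df
    return stat, p_value


# Toy dosages (0/1/2 copies of an allele); invented, not study data.
# Level 0 = healthy control, 1 = case without the sub-phenotype, 2 = case with it.
dosage = [0] * 80 + [1] * 20 + [0] * 70 + [1] * 30 + [0] * 40 + [1] * 40 + [2] * 20
level = [0] * 100 + [1] * 100 + [2] * 100
stat, p_value = lrt_additive_vs_factor(dosage, level)
print(f"LRT statistic = {stat:.2f}, P = {p_value:.4f}")
```

In the toy data the step from level 1 to level 2 is much larger than the step from level 0 to level 1, so the additive model fits worse than the factor model and the test rejects additivity.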
We thank the original study participants and their families for their contributions to this research, along with clinical colleagues who facilitated data collection. We thank Alexander Dilthey for his advice during the HLA imputation. We also thank the investigators of IMAGEN (John D Rioux, Philippe Goyette, Timothy J Vyse, Lennart Hammarström, Michelle MA Fernando, Todd Green, Philip L De Jager, Sylvain Foisy, Joanne Wang, Paul IW de Bakker, Stephen Leslie, Gilean McVean, Leonid Padyukov, Lars Alfredsson, Vito Annese, David A Hafler, Qiang Pan-Hammarström, Ritva Matell, Stephen J Sawcer, Alastair D Compston, Bruce AC Cree, Daniel B Mirel, Mark J Daly, Tim W Behrens, Lars Klareskog, Peter K Gregersen, Jorge R Oksenberg and Stephen L Hauser).
A full list of the investigators who contributed to the generation of the Wellcome Trust Case-Control Consortium data is available from the WTCCC website (see
This study was funded by the Swedish Research Council, Instituto de Salud Carlos III (PI12/02558) partly financed by FEDER funds of the EU, and the BIOLUPUS RNP funded by the European Science Foundation to MEA-R; American College of Rheumatology Rheumatology Research Foundation Physician Scientist Development Award and National Institutes of Health, National Center for Advancing Translational Sciences through UCSF-CTSI Grant KL2TR000143 to SAC.
Arthritis Research UK (ARUK) funded a Clinical Scientist Fellowship for MFS (ref 18239) and funded DLM (ref 17761; PI TJV). MEA-R was funded by the Swedish Research Council and Instituto de Salud Carlos III grant number PS09/00129 cofinanced through FEDER funds of the European Union and the Consejería de Salud de Andalucía PI0012.
The IMAGEN consortium was supported by Grant AI067152 from the National Institutes of Allergy and Infectious Diseases.
Funding for the Wellcome Trust Case-Control Consortium project was provided by the Wellcome Trust under awards 076113 and 085475.
Cord blood samples were collected by V L Nimgaonkar’s group at the University of Pittsburgh, as part of a multi-institutional collaborative research project with J Smoller, MD DSc and P Sklar, MD PhD (Massachusetts General Hospital; grant MH 63420).
Support for the Illumina MHC Panel study was provided by the NIH (AR052300, AR02175, AR22804, AR62277, AR42460, AI024717, AI083194, AR62277, AI082714, AI53747, AI31584, DE15223, RR20143, PR094002, AI62629, AR48940, AR19084, AR043274, AI063274, AI40076, AR052125, HG006828, AR048929, and AR049084), research grants from the US Department of Veterans Affairs, US Department of Defense (PR094002), American College of Rheumatology, Alliance for Lupus Research, Rheuminations, the Lupus Foundation of Minnesota and the Mary Kirkland Center for Lupus Research. This study was performed in part in the General Clinical Research Center, Moffitt Hospital, University of California San Francisco, with funds provided by the National Center for Research Resources, 5 M01 RR-00079, US Public Health Service.
A full list of the investigators who contributed to the generation of the WTCCC data is available from
Online Mendelian Inheritance in Man (OMIM):
The authors declare no conflict of interest.
Individual studies with number of genotyped SNPs, number of SLE cases and sample sizes for anti-Ro/La within cases
| Study case collection (genotyping platform) | Genotyped SNPs | SLE cases | Sample sizes (+/−/missing) |
|---|---|---|---|
| Illumina HumanHap550 | 2380 | 1123 | Anti-Ro: 319/796/8 |
| Illumina HumanHap317 | 1522 | 398 | Anti-Ro: 36/107/225 |
| Illumina Combined MHC panel | 2360 | 917 | Anti-Ro: 158/454/305 |
| Illumina custom panel | 1230 | 632 | Anti-Ro: 168/446/18 |
Abbreviations: MHC, major histocompatibility complex; SLE, systemic lupus erythematosus; SNP, single-nucleotide polymorphism. The last column denotes the number of SLE cases who were positive, negative or had missing data for each sub-phenotype.
Number of SNPs on the genotyping platform located on chromosome 6 between 26 000 and 34 000 kb.
See original paper
Forward stepwise regression models for anti-Ro
| Marker | Multiple regression OR (95% CI) | Multiple regression P | Class | Gene | Single-marker OR (95% CI) | Single-marker P |
|---|---|---|---|---|---|---|
| rs3129962 (A<G) | 2.44 (2.08–2.94) | 9.47 × 10^−27 | Class III | BTNL2 | 2.32 (1.98–2.71) | 2.02 × 10^−25 |
| rs9271731 (A<G) | 1.54 (1.30–1.85) | 9.56 × 10^−07 | Class II | DRB1-DQA1 | 1.27 (1.09–1.49) | 2.58 × 10^−03 |
| rs3957146 (G<A) | 0.52 (0.39–0.69) | 5.70 × 10^−06 | Class II | DQB1-DQA2 | 0.38 (0.29–0.50) | 3.10 × 10^−12 |
| Thr77 DRB1 | 0.49 (0.41–0.60) | 2.72 × 10^−13 | Class II | DRB1 | 0.45 (0.39–0.52) | 5.26 × 10^−25 |
| rs9271731 (A<G) | 1.63 (1.37–1.95) | 4.50 × 10^−08 | Class II | DRB1-DQA1 | 1.27 (1.09–1.49) | 2.58 × 10^−03 |
| rs3130781 (G<A) | 1.44 (1.22–1.71) | 1.76 × 10^−05 | Class III | DPCR1 | 1.95 (1.69–2.25) | 2.19 × 10^−20 |
| | 0.56 (0.42–0.73) | 2.49 × 10^−05 | Class II | DQA1 | 0.42 (0.33–0.55) | 1.67 × 10^−10 |
| | 2.22 (1.88–2.66) | 1.11 × 10^−20 | Class II | DRB1 | 2.22 (1.91–2.59) | 9.29 × 10^−25 |
| | 1.54 (1.28–1.85) | 4.11 × 10^−06 | Class II | DRB1 | 1.32 (1.13–1.57) | 7.42 × 10^−04 |
| rs9275582 (A<G) | 0.61 (0.50–0.75) | 2.99 × 10^−06 | Class II | DQB1-DQA2 | 0.45 (0.38–0.55) | 6.12 × 10^−16 |
| | 2.38 (2.02–2.80) | 1.21 × 10^−25 | Class II | DRB1 | 2.22 (1.91–2.59) | 9.29 × 10^−25 |
| | 1.63 (1.36–1.95) | 9.71 × 10^−08 | Class II | DRB1 | 1.32 (1.13–1.57) | 7.42 × 10^−04 |
| | 0.53 (0.41–0.70) | 4.84 × 10^−06 | Class II | DQB1 | 0.42 (0.33–0.55) | 1.67 × 10^−10 |
| Thr77 DRB1 | 0.29 (0.24–0.36) | 4.00 × 10^−32 | Class II | DRB1 | 0.45 (0.39–0.52) | 5.26 × 10^−25 |
| Leu67 DRB1 | 0.64 (0.53–0.77) | 2.81 × 10^−06 | Class II | DRB1 | 1.05 (0.93–1.20) | 4.22 × 10^−01 |
| Gln96 DRB1 | 1.47 (1.23–1.76) | 2.29 × 10^−05 | Class II | DRB1 | 1.29 (1.11–1.51) | 1.28 × 10^−03 |
Abbreviations: BIC, Bayesian information criterion; CI, confidence interval; HLA, human leukocyte antigen; OR, odds ratio; SNP, single-nucleotide polymorphism. SNPs have their minor and major alleles noted in brackets (A<G where A is the minor allele, for example); the OR is with respect to the minor allele. For BIC see Materials and Methods.
Forward stepwise regression models for anti-La
| Marker | Multiple regression OR | 95% CI | Multiple regression P | Class | Gene | Single-marker OR | Single-marker P |
|---|---|---|---|---|---|---|---|
| rs2894254 (C<A) | 3.38 | 2.74–4.16 | 3.40 × 10^−30 | Class III | | 3.38 | 3.40 × 10^−30 |
| HLA-DRB1*03:01 | 2.50 | 2.00–3.13 | 1.40 × 10^−15 | | | 3.15 | 3.31 × 10^−28 |
| rs9268832 (C<A) | 1.64 | 1.32–2.04 | 6.53 × 10^−06 | Class II | | 2.31 | 2.46 × 10^−17 |
| Thr77 DRB1 | 0.40 | 0.32–0.50 | 1.33 × 10^−15 | | | 0.32 | 2.40 × 10^−28 |
| rs2227139 (C<A) | 1.64 | 1.32–2.04 | 6.47 × 10^−06 | Class II | | 2.32 | 1.86 × 10^−17 |
Abbreviations: BIC, Bayesian information criterion; CI, confidence interval; OR, odds ratio. SNPs have their minor and major alleles noted in brackets (A<G where A is the minor allele, for example); the OR is with respect to the minor allele. For BIC see Materials and Methods.
Allele frequencies for
| Status | Anti-La(+) | Anti-La(−) |
|---|---|---|
| Anti-Ro(+) | 0.41 (259) | 0.26 (418) |
| Anti-Ro(−) | 0.28 (22) | 0.18 (1781) |
Frequencies for the
Multi-level model for
| Phenotype | Effect (change in dosage) | 95% CI | P |
|---|---|---|---|
| Anti-Ro(−)/control | 0.10 | 0.08–0.13 | 1.97 × 10^−14 |
| Anti-Ro(+)/anti-Ro(−) | 0.27 | 0.22–0.31 | 2.97 × 10^−33 |
| Anti-La(−)/control | 0.13 | 0.11–0.15 | 3.57 × 10^−25 |
| Anti-La(+)/anti-La(−) | 0.41 | 0.35–0.47 | 2.45 × 10^−39 |
Abbreviation: CI, confidence interval. Effect is the change in dosage for