Conceived and designed the experiments: CM Lill, MB McQueen, JPA Ioannidis, L Bertram. Performed the experiments: CM Lill, JT Roehr, S Bagade, B-M Schjeide, E Meissner, U Zauft, NC Allen, KJ Anderson, G Beecham, D Berg, JM Biernacka, A Brice, AL DeStefano, CB Do, N Eriksson, SA Factor, MJ Farrer, T Foroud, T Gasser, T Hamza, JA Hardy, P Heutink, C Klein, JC Latourelle, DM Maraganore, ER Martin, M Martinez, RH Myers, H Payami, WK Scott, M Sharma, AB Singleton, K Stefansson, T Toda, JY Tung, J Vance, NW Wood, CP Zabetian, 23andMe, GEO-PD, IPDGC, Parkinson's Disease GWAS, WTCC2. Analyzed the data: CM Lill, JT Roehr, MB McQueen, FK Kavvoura, L Bertram. Wrote the paper: CM Lill, JPA Ioannidis, L Bertram. Helped write the manuscript: E Meissner, MJ Farrer, T Foroud, T Gasser, C Klein, DM Maraganore, H Payami, AB Singleton, M Sharma, F Zipp, H Lehrach. Helped analyze the data: S Bagade, T Liu, M Schilling, CB Do, N Eriksson, T Hamza, EM Hill-Burns, MA Nalls, N Pankratz, W Satake, M Sharma. Interpretation of results: CM Lill, JPA Ioannidis, L Bertram. Study coordination: CM Lill, T Foroud, JA Hardy, H Payami, AB Singleton, P Young, RE Tanzi, MJ Khoury, F Zipp, H Lehrach, JPA Ioannidis, L Bertram. Literature searches and data entry: CM Lill, S Bagade, B-M Schjeide, E Meissner, U Zauft, N Allen.
¶ Memberships of the consortia are provided in Text S1.
More than 800 published genetic association studies have implicated dozens of potential risk loci in Parkinson's disease (PD). To facilitate the interpretation of these findings, we have created a dedicated online resource, PDGene, that comprehensively collects and meta-analyzes all published studies in the field. A systematic literature screen of ∼27,000 articles yielded 828 eligible articles from which relevant data were extracted. In addition, individual-level data from three publicly available genome-wide association studies (GWAS) were obtained and subjected to genotype imputation and analysis. Overall, we performed meta-analyses on more than seven million polymorphisms originating from GWAS datasets and/or from smaller-scale PD association studies. Meta-analyses on 147 SNPs were supplemented by unpublished GWAS data from up to 16,452 PD cases and 48,810 controls. Eleven loci showed genome-wide significant (
The genetic basis of Parkinson's disease is complex, i.e. it is determined by a number of different disease-causing and disease-predisposing genes. The latter in particular have proven difficult to find, as evidenced by more than 800 published genetic association studies, which typically show discrepant results. To facilitate the interpretation of this large and continuously increasing body of data, we have created a freely available online database (“PDGene”:
Parkinson's disease (PD) is the second most common neurodegenerative disease, with a prevalence of ∼1% in people over 60 years of age
During the last few years, genome-wide association studies (GWAS)
| GWAS | Design GWAS (Follow-up) | Population GWAS (Follow-up) | # SNPs | # PD GWAS (Follow-up) | # CTRL GWAS (Follow-up) | “Featured” genetic loci |
| Maraganore, 2005 (ref. 9) | Family-based (case-control) | USA-LEAPS (USA) | 198,345 | 443 (332) | 443 (332) | |
| Fung, 2006 (ref. 10) | Case-control (-) | USA-NINDS | 408,803 | 267 (-) | 270 (-) | |
| Pankratz, 2009 (ref. 11) | Case-control (-) | USA-PROGENI/GenePD (-) | 328,189 | 857 (-) | 867 (-) | |
| Simon-Sanchez, 2009 (ref. 12) | Case-control (case-control) | USA-NINDS, Germany (USA, Germany, UK) | 463,185 | 1,745 (3,452) | 4,047 (4,756) | |
| Satake, 2009 (ref. 13) | Case-control (case-control) | Japan (Japan) | 435,470 | 1,078 (993) | 2,628 (15,753) | |
| Edwards, 2010 (ref. 14) | Case-control (-) | USA-HIHG (-) | 491,376 | 604 (-) | 619 (-) | |
| Hamza, 2010 (ref. 15) | Case-control (-) | USA-NGRC (-) | 811,597 | 2,000 (-) | 1,986 (-) | |
| Spencer, 2011 (ref. 16) | Case-control (case-control) | UK-WTCCC2 (France) | 1,733,533 | 1,705 (1,039) | 5,175 (1,984) | |
| Saad, 2011 (ref. 17) | Case-control (case-control) | France (UK-WTCCC2, Australia) | 492,929 | 1,039 (3,232) | 1,984 (7,064) | |
| Simon-Sanchez, 2011 (ref. 18) | Case-control (case-control) | Netherlands | 514,799 | 772 (-) | 2,024 (-) | |
The overview is based on content on the PDGene website (
The results of this research synopsis are based on a freeze of the PDGene database content on March 31st 2011 (available upon request from the authors). At that time, PDGene included details on 828 individual studies across more than 50 different countries and six continents reporting on 3,382 polymorphisms in 890 genetic loci. Data for more than 2,000 SNPs were supplemented by results derived from up to three publicly available GWAS datasets
| Caucasian ethnicity | | | | | | | | | | | |
| Locus | Polymorphism | Location (hg18) | MAF | Allele contrast | N datasets | N samples | OR (95% CI) | P-value | I² (95% CI) | HuGENet | BF |
| | N370S | chr1:153451576 | 0.01 | G vs. A | 15 | 44,851 | 3.51 (2.55–4.83) | 1.44×10⁻¹⁴ | 38 (0–66) | A | 6.6 |
| | chr1:154105678 | chr1:154105678 | 0.02 | T vs. C | 6 | 17,300 | 1.73 (1.48–2.02) | 2.35×10⁻¹² | 0 (0–52) | B* | 8.2 |
| PARK16 | rs947211 | chr1:204019288 | 0.23 | A vs. G | 12 | 69,262 | 0.91 (0.88–0.94) | 8.00×10⁻¹⁰ | 0 (0–66) | A | 6.8 |
| | rs2390669 | chr2:168800188 | 0.13 | C vs. A | 14 | 35,159 | 1.19 (1.12–1.25) | 1.37×10⁻⁹ | 18 (0–56) | A | 4.9* |
| | rs11711441 | chr3:184303969 | 0.14 | A vs. G | 25 | 46,502 | 0.86 (0.82–0.91) | 9.20×10⁻¹⁰ | 18 (0–50) | A | 6.8 |
| | rs11248060 | chr4:954359 | 0.12 | T vs. C | 10 | 57,716 | 1.21 (1.15–1.27) | 3.04×10⁻¹² | 11 (0–52) | A | 9.2 |
| | rs11724635 | chr4:15346199 | 0.43 | C vs. A | 26 | 46,586 | 0.88 (0.84–0.91) | 1.87×10⁻¹⁰ | 43 (10–64) | A | 7.5 |
| | rs356219 | chr4:90856624 | 0.41 | G vs. A | 31 | 79,494 | 1.29 (1.25–1.33) | 6.06×10⁻⁶⁵ | 16 (0–46) | A | 61.0 |
| | rs7077361 | chr10:15601549 | 0.12 | C vs. T | 11 | 61,036 | 0.88 (0.84–0.92) | 1.51×10⁻⁸ | 0 (0–55) | A | 5.7 |
| | rs1491942 | chr12:38907075 | 0.21 | G vs. C | 21 | 34,123 | 1.17 (1.13–1.22) | 6.44×10⁻¹⁵ | 0 (0–38) | A | 11.8 |
| | rs10847864 | chr12:121892551 | 0.39 | T vs. G | 23 | 38,367 | 1.15 (1.11–1.18) | 4.37×10⁻¹⁷ | 0 (0–35) | A | 14.4 |
| | H1H2 | chr17:42131818–41149582 | 0.20 | H2 vs. H1 | 37 | 50,389 | 0.78 (0.75–0.80) | 7.97×10⁻⁵² | 0 (0–29) | A | 48.1 |
| Asian ethnicity | | | | | | | | | | | |
| Locus | Polymorphism | Location (hg18) | MAF | Allele contrast | N datasets | N samples | OR (95% CI) | P-value | I² (95% CI) | HuGENet | BF |
| PARK16 | rs823156 | chr1:204031263 | 0.17 | G vs. A | 5 | 22,870 | 0.74 (0.68–0.81) | 2.09×10⁻¹² | 0 (0–58) | A | 9.2 |
| | rs4538475 | chr4:15347035 | 0.38 | G vs. A | 3 | 20,393 | 0.80 (0.75–0.86) | 9.53×10⁻¹⁰ | 0 (-) | A | 6.8 |
| | rs6532194 | chr4:90999925 | 0.40 | T vs. C | 5 | 22,844 | 1.29 (1.20–1.39) | 4.91×10⁻¹¹ | 31 (0–74) | A | 8.0 |
| | rs34778348 | chr12:39043595 | 0.04 | A vs. G | 13 | 10,441 | 2.23 (1.89–2.63) | 2.97×10⁻²¹ | 0 (0–53) | B* | 15.2 |
Whenever multiple polymorphisms showed genome-wide significant association in the same locus, only the variant with the smallest
The PDGene meta-analyses of the 867 core polymorphisms were based on a median of 7,680 subjects (interquartile range 4,612–16,726). Additional meta-analyses were performed after stratification for Caucasian and Asian ancestry (for details on sample size and included ethnicities for individual meta-analyses see
This summary combines association results from 7,123,986 random-effects meta-analyses based on the March 31st 2011 datafreeze of the PDGene database. Results are plotted as −log10
One-hundred-three meta-analyses across 12 genetic loci (
The above list includes an intronic polymorphism in
Study-specific allelic odds ratios (ORs, black squares) and 95% confidence intervals (CIs, lines) were calculated for each included dataset. The summary OR and CI were calculated using the DerSimonian-Laird random-effects model (grey diamond)
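The calculation described in this legend can be illustrated with a minimal sketch. It assumes that each dataset contributes a 2×2 table of allele counts, that study-specific variances follow the standard Woolf (log-OR) formula, and that the between-study variance is estimated with the DerSimonian-Laird moment estimator. The function names `allelic_log_or` and `dersimonian_laird` and all counts are hypothetical and are not taken from the PDGene code base.

```python
import math


def allelic_log_or(case_a1, case_a2, ctrl_a1, ctrl_a2):
    """Return (log OR, variance) for allele 1 vs. allele 2 from allele counts.

    Uses the Woolf variance estimate; all four counts must be > 0.
    """
    log_or = math.log((case_a1 * ctrl_a2) / (case_a2 * ctrl_a1))
    var = 1 / case_a1 + 1 / case_a2 + 1 / ctrl_a1 + 1 / ctrl_a2
    return log_or, var


def dersimonian_laird(effects, variances):
    """Pool log ORs with the DerSimonian-Laird random-effects model."""
    k = len(effects)
    w = [1 / v for v in variances]                      # fixed-effect weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q statistic measures between-study heterogeneity.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    i2 = max(0.0, (q - (k - 1)) / q) * 100 if q > 0 else 0.0
    # Random-effects weights add tau^2 to every study variance.
    w_re = [1 / (v + tau2) for v in variances]
    swr = sum(w_re)
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / swr
    se = math.sqrt(1 / swr)
    z = pooled / se
    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)),
        "p": math.erfc(abs(z) / math.sqrt(2)),          # two-sided normal P
        "tau2": tau2,
        "i2": i2,
    }


if __name__ == "__main__":
    # Hypothetical allele counts: (case_a1, case_a2, ctrl_a1, ctrl_a2).
    studies = [(420, 580, 380, 620), (210, 290, 170, 330), (95, 105, 90, 110)]
    logs, variances = zip(*(allelic_log_or(*s) for s in studies))
    print(dersimonian_laird(list(logs), list(variances)))
```

When the estimated between-study variance is zero, the random-effects weights reduce to the inverse-variance weights, so the summary estimate coincides with a fixed-effect result.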
In addition to using random-effects models, we also performed exploratory fixed-effect meta-analyses on all eligible polymorphisms. These analyses did not reveal genome-wide significant effect sizes for any additional locus, except
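For comparison, a fixed-effect analysis can be sketched as plain inverse-variance pooling of the study-specific log ORs, followed by a check against the genome-wide significance threshold of P < 5×10⁻⁸ used in this study. This is an illustrative sketch only: the function names and the constant `GENOME_WIDE_ALPHA` are hypothetical, and the two-sided P-value assumes a normal approximation for the pooled estimate.

```python
import math

GENOME_WIDE_ALPHA = 5e-8  # genome-wide significance threshold


def fixed_effect(effects, variances):
    """Inverse-variance fixed-effect pooling of log ORs.

    Returns (pooled log OR, standard error, two-sided P-value).
    """
    weights = [1 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, effects)) / total
    se = math.sqrt(1 / total)
    z = pooled / se
    p = math.erfc(abs(z) / math.sqrt(2))
    return pooled, se, p


def is_genome_wide_significant(p, alpha=GENOME_WIDE_ALPHA):
    """True if the P-value passes the genome-wide threshold."""
    return p < alpha


if __name__ == "__main__":
    # Hypothetical log ORs and variances for four datasets.
    pooled, se, p = fixed_effect([0.20, 0.15, 0.25, 0.18],
                                 [0.004, 0.006, 0.010, 0.003])
    print(math.exp(pooled), p, is_genome_wide_significant(p))
```

Because the fixed-effect model ignores between-study variance, its standard error is never larger than the random-effects one, which is why it can yield smaller P-values when the datasets are heterogeneous.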
To estimate the epidemiologic credibility of associations with polymorphisms showing sub-genome-wide significant association with PD (
There was strong epidemiologic support in both assessments for all loci showing genome-wide significant association. This included several additional polymorphisms in these same loci that only showed sub-genome-wide significant association. However, there was no additional sub-genome-wide significantly associated locus that received unequivocally strong support from both credibility assessments (
The PDGene database represents a comprehensive, regularly updated and freely available online research synopsis of genetic association studies in PD. Detailed summaries of the most compelling findings are provided within an easy-to-use, dedicated online framework, displaying forest plots, cumulative meta-analyses, and an up-to-date ranking of “Top Results”. To allow comparison of PDGene results with association findings from other complex diseases and to facilitate their interpretation with respect to functional genetics data, all meta-analysis results have been ported as a customized track onto the UCSC Genome Browser. This will also allow for integration and visualization
To the best of our knowledge, our study represents the most comprehensive research synopsis in the field of PD genetics. In addition, it represents the first disease-specific genetic database that allows a systematic and exhaustive inclusion of GWAS data, and may serve as a model for similar databases in other complex genetic diseases. Owing to our multi-pronged data retrieval and analysis protocol we were able to perform meta-analyses on the vast majority of PD risk-gene candidates, including those “featured” as top association results in all published GWAS. In particular, this includes the five novel loci featured in the recent GWAS meta-analysis
Of particular interest are loci with unusually large effect sizes. While most loci in PDGene have only small effects on PD risk (with ORs ranging from 1.10 to 1.35, which are typical for complex diseases), for some loci much larger ORs were estimated (i.e.
Interestingly, the meta-analysis results of
The strength of our approach is further exemplified by the identification of genome-wide significant association between disease risk and a SNP in
In summary, we have created a continuously updated online resource for genetic association studies in the field of PD. Synthesizing essentially all available data in the field led to the identification of
Note that the following section only provides a brief summary of the methods applied to our study. A much more detailed description can be found in
For inclusion in PDGene, a study has to meet three criteria: 1) it must evaluate the association between a bi-allelic genetic polymorphism (minor allele frequency ≥0.01 in the healthy control population of at least one study) and Parkinson's disease (PD) risk in datasets comprising both affected (defined as clinically and/or neuropathologically diagnosed “Parkinson's disease”) and unaffected individuals; 2) it must be published in a peer-reviewed journal; 3) it must be published in English. For this manuscript, we also included data on ten SNPs generated in the GEO-PD Consortium datasets
In brief, genetic association data of the following studies were excluded from the meta-analyses (see
Our literature searches until March 31st, 2011, yielded 27,210 articles, which were screened for eligibility using the title, abstract, or full text, as necessary. Additional screening of bibliographies in reviews, published meta-analyses, and original genetic association studies was also performed. Overall, full-text versions of 1,534 articles were obtained. Following the inclusion and exclusion criteria outlined above, 828 articles were included in PDGene until March 31st 2011 (also see
Random-effects allelic meta-analyses
This is of particular importance in meta-analyses of published association data and was carefully addressed here: First, we added
We obtained individual-level genotype data for all publicly available PD GWAS datasets from NCBI's “dbGAP” database (a total of three
After completion of all data-management and analysis steps, all study-specific variables, genotype data (except for GWAS), and meta-analysis plots are posted on a dedicated, publicly available, online adaptation of the PDGene database using the same software and code as our databases for Alzheimer's disease
The database software can easily be ported to other genetically complex diseases and will be made available on a collaborative basis to interested researchers upon request.
QQ plots showing the distribution of expected versus observed P-values for the GWAS-only meta-analysis results. Analyses were performed using the METAL software (ref.
(TIF)
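The coordinates of such a QQ plot can be derived as in the following sketch, which assumes that the expected P-values are the uniform quantiles (i + 0.5)/n of the null distribution. The genomic inflation factor λ is added as a common companion statistic; the text does not state that it was computed, and both function names are hypothetical.

```python
import math
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def qq_points(p_values):
    """Return (expected, observed) -log10(P) pairs, smallest P last."""
    observed = sorted(p_values, reverse=True)
    n = len(observed)
    points = []
    for rank, p in enumerate(observed):
        # Rank 0 is the largest P-value, matched to the largest null quantile.
        expected = (n - rank - 0.5) / n
        points.append((-math.log10(expected), -math.log10(p)))
    return points


def genomic_inflation(p_values):
    """Lambda: median observed 1-df chi-square over its null median."""
    nd = NormalDist()
    # A two-sided P-value p corresponds to z = inv_cdf(p / 2); chi2 = z^2.
    chi2 = [nd.inv_cdf(p / 2) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


if __name__ == "__main__":
    # Hypothetical P-values from five meta-analyses.
    example = [0.91, 0.48, 0.12, 0.03, 2e-9]
    for expected, observed in qq_points(example):
        print(round(expected, 3), round(observed, 3))
    print(genomic_inflation(example))
```

Points that follow the diagonal indicate agreement with the null distribution, while an early, systematic departure from it (λ clearly above 1) suggests inflation of the test statistics.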
Forest plots of allelic meta-analyses for SNPs showing genome-wide significant association (P < 5×10⁻⁸) with PD susceptibility in the March 31st 2011 datafreeze. Study-specific allelic odds ratios (ORs, black squares) and 95% confidence intervals (CIs, lines) were calculated for each included dataset. The summary OR and CI were calculated using random-effects models (grey diamond). Whenever multiple polymorphisms showed genome-wide significant association in the same locus, only the variant with the smallest P-value is listed here for meta-analysis results after stratification for Caucasian and Asian ancestries. For a complete list of meta-analyses performed for the datafreeze, see
(PDF)
Locus plot of the ITGA8 region on chromosome 10p13 (15346353–15801533 bp, hg18). The figure displays association results for ∼1,400 SNPs in the
(TIF)
Forest plots of fixed-effect meta-analyses for SNP rs6723108 in the ACMSD/TMEM163 locus and chr6:32609909 in the HLA locus. Symbols are the same as for
(TIF)
Overview of all 867 polymorphisms meta-analyzed in the March 31st 2011 datafreeze using random-effects allelic models. Random-effects allelic meta-analyses were performed on polymorphisms for which four or more independent datasets were available. Meta-analyses after stratification for different ethnic descent were performed if at least three independent datasets were available in the respective stratum (applicable only to samples of European and Asian descent). Each nominally significant meta-analysis result (
(XLS)
Investigation of the extent of statistical inflation assuming sample overlaps of 1%, 5%, and 10% across cases and controls in datasets originating from the same countries. Hypothetical sample overlap across datasets was assumed between different candidate-gene/replication studies and between candidate-gene/replication studies and GWAS datasets if they originated from the same country. These analyses were performed applying random-effects models and adding the sum of weighted co-variances of overlapping datasets to the overall study variance (see ref.
(DOC)
Supplementary material. This file includes supplementary methods and references as well as the list of members of the GWAS consortia, the GEO-PD Consortium, and consortia-specific acknowledgements.
(PDF)
23andMe acknowledges Elizabeth Dorfman, Amy K. Kiefer, Emily M. Drabant, Uta Francke, Joanna L. Mountain, David Hinds, and Anne Wojcicki from 23andMe, as well as Samuel M. Goldman, Caroline M. Tanner, and J. William Langston from the Parkinson's Institute, Sunnyvale, CA, USA. We also acknowledge the contribution of Mitsutoshi Yamamoto, Nobutaka Hattori, and Miho Murata for sample collection in the Japanese GWAS 1.0
CB Do, N Eriksson, and JY Tung are employed by 23andMe and own stock options in the company. MJ Farrer and Mayo Foundation received royalties from H.Lundbeck A/S and Isis Pharmaceuticals. In addition, MJ Farrer has received an honorarium for a seminar at Genzyme. T Gasser has received consultancy fees from Cephalon and Merck-Serono, grants from Novartis, payments for lectures including service on speakers' bureaus from Boehringer Ingelheim, Merck-Serono, UCB, and Valean, and holds patents NGFN2 and KASPP. JA Hardy has received consulting fees or honoraria from Eisai and his institute has received consulting fees or honoraria from Merck-Serono. DM Maraganore has received extramural research funding support from the National Institutes of Health (2R01 ES10751), the Michael J. Fox Foundation (Linked Efforts to Accelerate Parkinson Solutions Award, Edmond J. Safra Global Genetics Consortia Award), and from Alnylam Pharmaceuticals and Medtronic (observational studies of Parkinson's disease). DM Maraganore has also received intramural research funding support from the Mayo Clinic and from NorthShore University Health System. DM Maraganore filed a provisional patent for a method to predict Parkinson's disease. This provisional patent is unlicensed. He also filed a provisional patent for a method to treat neurodegenerative disorders. That provisional patent has been licensed to Alnylam Pharmaceuticals and DM Maraganore has received royalty payments in total of less than $20,000. K Stefansson has received grants from deCODE.
The main funding for this study was provided by the Michael J. Fox Foundation for Parkinson's Disease (MJFF) with additional financial support by the Cure Alzheimer's Fund (CAF), the National Alliance for Research on Schizophrenia and Depression (NARSAD), Prize4Life, and EMD Serono (all to L Bertram). CM Lill was supported by a fellowship from the Deutscher Akademischer Austauschdienst (DAAD) and Fidelity Biosciences Research Initiative (FBRI). L Bertram is also supported by the German Ministry for Education and Research (BMBF). JPA Ioannidis was supported through the Tufts Clinical and Translational Science Institute (Tufts CTSI) under funding from the National Institute of Health/National Center for Research Resources (UL1 RR025752). Points of view or opinions in this paper are those of the authors and do not necessarily represent the official position or policies of the Tufts CTSI. M Sharma was supported by the Michael J. Fox Foundation. The NeuroGenetics Research Consortium GWAS [15] was funded by the Edmond J. Safra Michael J. Fox Foundation Global Genetics Consortium Initiative and NIH R01 NS 036960. The work of the International Parkinson's Disease Genomics Consortium (IPDGC) was supported in part by the Intramural Research Programs of the National Institute on Aging, National Institute of Neurological Disorders and Stroke, National Institute of Environmental Health Sciences, National Human Genome Research Institute, National Institutes of Health, Department of Health and Human Services: project numbers Z01 AG000949-02 and Z01-ES101986. In addition the work of the IPDGC was supported by the U.S. Department of Defense, award number W81XWH-09-2-0128. Portions of the work of the IPDGC utilized the high-performance computational capabilities of the Biowulf Linux cluster at the National Institutes of Health, Bethesda, Md. (