GWA Test Driver

Summary Report

Genome-wide combined case-control study of rheumatoid arthritis in the NARAC and EIRA datasets

SNP data are from a genome-wide combined case-control study of rheumatoid arthritis in the North American Rheumatoid Arthritis Consortium (NARAC) and the Swedish Epidemiological Investigation of Rheumatoid Arthritis (EIRA). This is a study of 317,503 single-nucleotide polymorphisms (SNPs) in 1,522 case subjects with rheumatoid arthritis and 1,850 matched control subjects on Illumina Infinium HumanHap550, HumanHap300 and HumanHap240S arrays. The primary analyses "were performed on the combined data set from NARAC and EIRA with the use of structured association within homogeneous clusters derived through identity-by-state similarity, implemented in the PLINK tool set as a Cochran-Mantel-Haenszel stratified analysis".

Robert M. Plenge, Mark Seielstad, Leonid Padyukov, Annette T. Lee, Elaine F. Remmers, Bo Ding, Anthony Liew, Houman Khalili, Alamelu Chandrasekaran, Leela R.L. Davies, Wentian Li, Adrian K.S. Tan, Carine Bonnard, Rick T.H. Ong, Anbupalam Thalamuthu, Sven Pettersson, Chunyu Liu, Chao Tian, Wei V. Chen, John P. Carulli, Evan M. Beckman, David Altshuler, Lars Alfredsson, Lindsey A. Criswell, Christopher I. Amos, Michael F. Seldin, Daniel L. Kastner, Lars Klareskog, and Peter K. Gregersen (2007). TRAF1-C5 as a Risk Locus for Rheumatoid Arthritis - A Genomewide Study. NEJM Volume 357:1199-1209.

Dynamic Power Plot and Table

This report tabulates and plots estimates of the expected discovery rate (EDR), both uncorrected and corrected for multiple testing, at pre-selected combinations of sample size and family-wise significance level.
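As a rough illustration of the two standard corrections cited in this report (Bonferroni [2] and the Benjamini-Hochberg FDR procedure [3]), the following Python sketch computes a per-test significance level for each. It is not the tool's implementation: the function names are hypothetical, the p-values are made up, and only the SNP count (317,503) is taken from the study description.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test level that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


def benjamini_hochberg_cutoff(p_values, q):
    """Largest p-value rejected by the Benjamini-Hochberg step-up procedure
    at FDR level q, or None when no hypothesis is rejected."""
    m = len(p_values)
    cutoff = None
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= q * rank / m:
            cutoff = p  # keep the largest p-value that passes its rank-based bound
    return cutoff


N_SNPS = 317_503  # number of SNPs reported for this study
bonferroni_levels = {a: bonferroni_threshold(a, N_SNPS) for a in (0.001, 0.01, 0.05)}

# Made-up p-values, for illustration only (not from the NARAC/EIRA data).
toy_p_values = [0.0001, 0.0004, 0.019, 0.03, 0.2, 0.5, 0.7, 0.9]
bh_cutoff = benjamini_hochberg_cutoff(toy_p_values, q=0.05)

for alpha, level in bonferroni_levels.items():
    print(f"alpha={alpha}: Bonferroni per-test level = {level:.3e}")
print("BH cutoff on toy p-values:", bh_cutoff)
```

With several hundred thousand tests, the Bonferroni per-test level is on the order of 1e-7 or smaller, which is consistent with the Bonferroni-corrected EDR column being the lowest in every row of the table.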

Users can request additional power records by filling in the text fields at the bottom of the table and clicking the submit button. Each requested record is added to the table, and the plot is updated to include it.

Family-wise significance level | Sample size | EDR[1], uncorrected for multiple testing | EDR, Bonferroni[2] | EDR, FDR[3] | EDR, mix-o-matic FP[4]
0.001 | 1000 | 0.120737 | 0.002159 | 0.013518 | 0.015739
0.001 | 2000 | 0.384568 | 0.056265 | 0.135438 | 0.145653
0.001 | 3000 | 0.64805 | 0.269364 | 0.401544 | 0.415084
0.001 | 3372 | 0.726076 | 0.383668 | 0.511967 | 0.524434
0.001 | 4000 | 0.828411 | 0.581157 | 0.680344 | 0.689459
0.001 | 5000 | 0.926217 | 0.823019 | 0.864604 | 0.868384
0.01 | 1000 | 0.247128 | 0.00454 | 0.030394 | 0.034222
0.01 | 2000 | 0.539928 | 0.080341 | 0.199435 | 0.211046
0.01 | 3000 | 0.758563 | 0.316618 | 0.479176 | 0.491751
0.01 | 3372 | 0.815662 | 0.431097 | 0.582136 | 0.593227
0.01 | 4000 | 0.886615 | 0.61911 | 0.730713 | 0.738484
0.01 | 5000 | 0.951081 | 0.839064 | 0.88547 | 0.888688
0.05 | 1000 | 0.404556 | 0.00762 | 0.055594 | 0.059396
0.05 | 2000 | 0.680897 | 0.102953 | 0.265957 | 0.274467
0.05 | 3000 | 0.845276 | 0.3544 | 0.546776 | 0.554746
0.05 | 3372 | 0.884058 | 0.467689 | 0.640985 | 0.647806
0.05 | 4000 | 0.930138 | 0.647301 | 0.771431 | 0.776074
0.05 | 5000 | 0.970114 | 0.850863 | 0.902361 | 0.904293
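One way to read the table is to ask which tabulated sample size first reaches a desired EDR. The Python sketch below does this for the Bonferroni-corrected column at a family-wise significance level of 0.05, using values copied from the table. The function name and the target of 0.8 are illustrative assumptions, not part of the tool. The 3372 row appears to correspond to the study's combined sample (1,522 cases plus 1,850 controls).

```python
# (family-wise significance level, sample size) -> Bonferroni-corrected EDR,
# copied from the 0.05 rows of the table.
BONFERRONI_EDR = {
    (0.05, 1000): 0.00762,
    (0.05, 2000): 0.102953,
    (0.05, 3000): 0.3544,
    (0.05, 3372): 0.467689,
    (0.05, 4000): 0.647301,
    (0.05, 5000): 0.850863,
}


def smallest_sample_size(edr_by_key, alpha, target):
    """Smallest tabulated sample size whose EDR meets the target, or None."""
    sizes = sorted(
        n for (a, n), edr in edr_by_key.items() if a == alpha and edr >= target
    )
    return sizes[0] if sizes else None


# Illustrative target; only tabulated sample sizes are considered (no interpolation).
print(smallest_sample_size(BONFERRONI_EDR, alpha=0.05, target=0.8))
```

Because the lookup only considers tabulated rows, a finer answer would require requesting additional power records at intermediate sample sizes.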

NOTE: The runtime of an additional power record calculation depends on several factors, including the number of p-values in the dataset and the number of other users requesting calculations at the same time. With no competing requests, the expected runtime is less than 1 minute per requested record.

NOTE: A '?' character in an EDR field indicates that the power calculation did not complete. See the software specification for further detail and a description of situations where this might happen (e.g. during the calculation of the FDR-corrected significance level when there is little or no signal).
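When the table is processed outside this page, '?' fields need explicit handling before any arithmetic. A minimal Python sketch, assuming the fields arrive as strings; the function name is hypothetical and the '?' in the sample row is invented here purely to show the handling:

```python
import math


def parse_edr(field):
    """Convert one EDR field to a float; '?' (calculation did not complete) becomes NaN."""
    text = field.strip()
    return math.nan if text == "?" else float(text)


# Hypothetical row: the '?' is a placeholder for illustration, not a reported result.
fields = ["0.05", "1000", "0.404556", "0.00762", "?", "0.059396"]
edr_values = [parse_edr(f) for f in fields[2:]]

# Keep only the estimates that completed.
completed = [v for v in edr_values if not math.isnan(v)]
print(completed)
```

Mapping '?' to NaN rather than 0 keeps an incomplete calculation from being mistaken for an EDR estimate of zero.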

NOTE: For users who want to cut and paste the power table directly into an MS Excel spreadsheet, we have provided a demo video.


[1] Gadbury GL, Page GP, Edwards J, Kayo T, Prolla TA, Weindruch R, Permana PA, Mountz JD, Allison DB. Power and sample size estimation in high dimensional biology. Statistical Methods in Medical Research (2004) 13:325-338.

[2] Bonferroni, C. E. 1936. Teoria statistica delle classi e calcolo delle probabilità. Pubblicazioni del R Istituto Superiore di Scienze Economiche e Commerciali di Firenze 8, 3-62.

[3] Benjamini, Y., and Hochberg, Y. (1995), Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing, Journal of the Royal Statistical Society, Ser. B, 57, 289-300.

[4] Allison, D. B., Gadbury, G. L., Heo, M., Fernández, J. R., Lee, C.-K., Prolla, T. A. and Weindruch, R. (2002). A mixture model approach for the analysis of microarray gene expression data. Comput. Statist. Data Anal. 39 1-20.