Genome-wide association (GWA) studies have been successful in identifying genetic variants associated with targeted phenotypes. Because of the computational burden, many GWA methods consider only the association of a single SNP at a time and report a list of the most significant SNPs or related genes.

However, complex diseases often result from the compound action of multiple risk factors, so single-SNP-based analysis may miss genetic variants that affect risk jointly but have little individual effect. In addition, locus heterogeneity, in which alleles at different loci cause the same disease in different individuals, makes it harder to replicate the association of a single marker [1]. Furthermore, a large number of statistical tests may produce many false positive associations [2]. To resolve these issues, it was suggested to use prior biological knowledge or known pathway information and thus to analyze a set of related SNPs together, which leads to a smaller number of tests. This approach was motivated by gene set analysis (GSA), which is widely used in the analysis of microarray data. GSA focuses on gene sets rather than individual genes and combines weak signals from a number of individual genes in a set when the individual genes are only weakly associated with the trait. In this way, GSA increases the power to detect disease-related genes and helps to interpret the underlying genetic background, which has made it popular.

GSA methods can be classified into non-parametric and parametric approaches. The most popular non-parametric GSA method is gene set enrichment analysis (GSEA) [3]. GSEA uses an enrichment score that represents whether the members of a gene set tend to occur toward the top or bottom of a gene list ranked by correlation with the phenotype. To assess significance, it permutes the phenotype labels and recalculates the enrichment score for each permutation, which is computationally very expensive.
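The permutation procedure described above can be sketched as follows. This is a minimal illustration, not the GSEA implementation: it uses an unweighted running-sum score, ranks genes by a difference in group means as a stand-in for a correlation, and all function names and the toy data layout are assumptions made for this example.

```python
import random

def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score (maximum signed deviation from zero)."""
    hits = [g in gene_set for g in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(ranked_genes) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must contain some, but not all, ranked genes")
    running, best = 0.0, 0.0
    for is_hit in hits:
        # Step up on a set member, step down otherwise.
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best

def rank_genes(values, labels):
    """Rank genes by case-minus-control mean (assumed stand-in for a correlation)."""
    def mean_diff(v):
        case = [x for x, y in zip(v, labels) if y == 1]
        ctrl = [x for x, y in zip(v, labels) if y == 0]
        return sum(case) / len(case) - sum(ctrl) / len(ctrl)
    return sorted(values, key=lambda g: mean_diff(values[g]), reverse=True)

def permutation_p_value(values, labels, gene_set, n_perm=1000, seed=0):
    """Permute phenotype labels and recompute the score each time."""
    rng = random.Random(seed)
    observed = enrichment_score(rank_genes(values, labels), gene_set)
    shuffled = list(labels)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        es = enrichment_score(rank_genes(values, shuffled), gene_set)
        if abs(es) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return observed, (extreme + 1) / (n_perm + 1)
```

Every permutation re-ranks all genes, so the cost grows with the number of permutations times the cost of one full ranking, which is the source of the computational burden noted above.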

On the other hand, parametric GSA can reduce computing time by assuming a specific distribution. A hypergeometric distribution-based test [4, 5] is a typical choice for the parametric method, and binomial, normal, and chi-square distributions are also widely used [6–8].
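As an illustration of the hypergeometric approach, the sketch below computes an over-representation p-value: the probability of observing at least a given overlap between a gene set and a list of selected (for example, significant) genes. The function name and argument names are assumptions for this example and are not taken from the cited methods.

```python
from math import comb

def hypergeom_enrichment_p(n_total, n_in_set, n_selected, n_overlap):
    """P(X >= n_overlap) when n_selected genes are drawn without replacement
    from n_total genes, of which n_in_set belong to the gene set."""
    upper = min(n_in_set, n_selected)
    denom = comb(n_total, n_selected)
    tail = sum(
        comb(n_in_set, k) * comb(n_total - n_in_set, n_selected - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / denom

# Example: 10 genes in total, 3 in the set, 4 selected, at least 2 overlapping.
p = hypergeom_enrichment_p(10, 3, 4, 2)
```

No permutation is needed here; the p-value follows directly from the assumed distribution, which is why parametric tests are fast.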

Several prior works have applied GSA methods to GWA data [1, 2, 7, 9–15]. For simplicity, we refer to all of these methods as GSA-GWA. We address two issues regarding GSA-GWA. The first issue is that there is no widely accepted theory on how to combine the measures of multiple SNPs into a single gene-level measure, and then how to combine the gene-level measures into a single gene-set-level measure. In the original GSA, the gene-level measure is typically a fold change or a correlation that represents the effect of a single gene. In GWA data, however, the association measure of a gene usually has to be calculated by combining SNP-level measures. SNP-level measures include p-values or chi-square test statistics from univariate SNP-to-phenotype association tests. Once the SNP-level measure is chosen, the gene-level summary statistic is derived as the highest SNP-level statistic [10], the sum of the SNP-level statistics [9], or a combined p-value [1].
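Two of the gene-level summaries mentioned above, the maximum and the sum of SNP-level statistics, can be sketched as follows. The SNP identifiers, the SNP-to-gene mapping, and the function name are hypothetical; how SNPs are assigned to genes is not specified here and is left as an input. The combined p-value option is omitted from this sketch.

```python
from collections import defaultdict

def gene_level_statistics(snp_stats, snp_to_gene):
    """Collapse SNP-level statistics (e.g. chi-square values) into per-gene summaries."""
    per_gene = defaultdict(list)
    for snp, stat in snp_stats.items():
        gene = snp_to_gene.get(snp)
        if gene is not None:  # SNPs without a gene annotation are skipped
            per_gene[gene].append(stat)
    return {
        gene: {"max": max(stats), "sum": sum(stats), "n_snps": len(stats)}
        for gene, stats in per_gene.items()
    }

# Toy input: chi-square statistics for four SNPs, one of them unannotated.
snp_stats = {"rs1": 1.0, "rs2": 4.0, "rs3": 2.5, "rs4": 9.0}
snp_to_gene = {"rs1": "G1", "rs2": "G1", "rs3": "G2"}
summary = gene_level_statistics(snp_stats, snp_to_gene)
```

Both summaries depend on the number of SNPs per gene, which is one reason the choice of gene-level measure is not straightforward.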

However, current GSA-GWA methods have some substantial limitations. First, when deriving the summary statistics, the correlation among the SNP-level association measures, which is expected to play an important role, has not been taken into account appropriately. The SNP-level association measures are usually correlated because linkage disequilibrium (LD) exists among SNPs. If this correlation is not correctly adjusted for, the resulting gene-set-level measure will be inflated [1]. Unfortunately, many GSA-GWA methods have not considered the LD structure adequately.
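The inflation can be illustrated with a small calculation. Assuming, for this sketch only, that the SNP-level measures are unit-variance statistics with a known correlation matrix, the variance of their sum equals the sum of all entries of that matrix, which exceeds the number of SNPs whenever the correlations are positive. The helper names and the equicorrelated example are illustrative assumptions.

```python
def variance_of_sum(corr):
    """Variance of the sum of unit-variance statistics with correlation matrix corr."""
    return sum(sum(row) for row in corr)

def equicorrelated(m, rho):
    """m x m correlation matrix with the same correlation rho between every pair."""
    return [[1.0 if i == j else rho for j in range(m)] for i in range(m)]

m, rho = 4, 0.5
true_var = variance_of_sum(equicorrelated(m, rho))   # accounts for correlation
naive_var = variance_of_sum(equicorrelated(m, 0.0))  # independence assumed
inflation = true_var / naive_var
```

A test that standardizes the sum with the naive variance would therefore overstate the significance by the square root of this ratio.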

Second, the computational burden is heavy. Once the gene-level association measures are available, different GSA methods can be applied to obtain various gene-set-level statistics and to evaluate their performance. However, as explained later, the majority of GSA-GWA methods rely on non-parametric permutation to calculate the observed significance, which requires a long computing time.

There have been several efforts to resolve these limitations. As the pioneering work of GSA-GWA, GSEA [3] was extended to GWA data by Wang *et al.* [10], and this extension has been implemented in the GenGen package [11]. It repeats the permutation of sample labels and the calculation of gene set statistics 100 to 1,000 times [2, 10, 12–14]. This permutation-based testing preserves the correlation among the SNP-level measures, but it is computationally very expensive at the genome-wide scale.

To reduce computing time, some GSA-GWA studies use a parametric test. Peng *et al.* [1] used several parametric tests, namely Fisher’s combination test, Sidak’s combination test, Simes’ combination test, and an FDR-based test, under the assumption that the SNP-level p-values are independent. The GLOSSI method developed by Chai *et al.* [9] used Fisher’s combination test under the assumption of correlated p-values.
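The three combination tests named above can be sketched in their textbook forms under the independence assumption. This is a minimal illustration with assumed function names, not the code of the cited studies; it expects p-values in the interval (0, 1], and the FDR-based test is not shown.

```python
from math import exp, log

def chi2_sf_even_df(x, df):
    """Upper tail of a chi-square distribution with even df (closed form)."""
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i  # builds (x/2)^i / i! incrementally
        total += term
    return exp(-half) * total

def fisher_combined_p(p_values):
    """Fisher: -2 * sum(log p) follows chi-square with 2k df if p-values are independent."""
    stat = -2.0 * sum(log(p) for p in p_values)
    return chi2_sf_even_df(stat, 2 * len(p_values))

def sidak_combined_p(p_values):
    """Sidak: adjust the smallest p-value for k independent tests."""
    return 1.0 - (1.0 - min(p_values)) ** len(p_values)

def simes_combined_p(p_values):
    """Simes: minimum over ranks i of k * p_(i) / i, with p_(i) sorted ascending."""
    k = len(p_values)
    return min(k * p / rank for rank, p in enumerate(sorted(p_values), start=1))
```

All three return a p-value in closed form, so no permutation is needed, but the null distributions used here are only valid if the input p-values are independent.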

Recently, Nam *et al.* [15] proposed the Z-statistic method, which compares a specific gene set to the others. This method is an extension of the parametric analysis of gene set enrichment (PAGE) [7], a parametric and competitive GSA method for microarray data. PAGE uses the mean of the association measures in a set as a summary measure and assumes, by the central limit theorem, that it follows a normal distribution when the number of genes is large.
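A PAGE-style Z-statistic can be sketched as below: the mean of the measures in the set is compared with the mean over all genes and scaled by the overall standard deviation and the square root of the set size. The function name, the use of the population standard deviation, and the two-sided p-value are assumptions of this sketch rather than details confirmed by the cited methods.

```python
from math import erfc, sqrt
from statistics import mean, pstdev

def page_z_test(all_scores, gene_set):
    """Z-statistic comparing the mean score of a gene set with the overall mean."""
    scores = list(all_scores.values())
    members = [all_scores[g] for g in gene_set if g in all_scores]
    m = len(members)
    mu = mean(scores)       # mean over all genes
    sigma = pstdev(scores)  # population SD over all genes (an assumption)
    z = (mean(members) - mu) * sqrt(m) / sigma
    # Two-sided normal p-value, justified by the central limit theorem for large m.
    p_two_sided = erfc(abs(z) / sqrt(2.0))
    return z, p_two_sided

# Toy gene-level scores; the set {"g3", "g4"} has a higher mean than the rest.
all_scores = {"g1": 0.0, "g2": 0.0, "g3": 2.0, "g4": 2.0}
z, p = page_z_test(all_scores, {"g3", "g4"})
```

The scaling by the square root of the set size treats the member scores as independent, so correlated scores would make the statistic larger in magnitude than it should be.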

However, these parametric methods, including the Z-statistic method, do not consider the LD structure adequately and assume no correlation between SNP-level p-values. To overcome these limitations of current GSA-GWA methods, we propose SNP-PRAGE, a SNP-based parametric robust analysis of gene-set enrichment, which is based on a simple normality assumption. SNP-PRAGE estimates the LD information among SNPs from a haploblock-wise covariance structure, so that it accounts for the correlation among SNP-level measures without a permutation step.

We compare our method with other GSA-GWA methods in a simulation study in terms of size, power, and computing time. We also demonstrate SNP-PRAGE on two GWA data sets: hypertension data of 8,842 samples from the Korean population and bipolar disorder data of 4,806 samples from the Wellcome Trust Case Control Consortium (WTCCC).