ARTHRITIS & RHEUMATISM
Vol. 63, No. 4, April 2011, pp 884–893
DOI 10.1002/art.30235
© 2011, American College of Rheumatology
Genome-Wide Association Study of Rheumatoid Arthritis
in Koreans
Population-Specific Loci as Well as Overlap With European Susceptibility Loci
Jan Freudenberg,1 Hye-Soon Lee,2 Bok-Ghee Han,3 Hyoung Do Shin,4 Young Mo Kang,5
Yoon-Kyoung Sung,2 Seung-Cheol Shim,6 Chan-Bum Choi,2 Annette T. Lee,1
Peter K. Gregersen,1 and Sang-Cheol Bae2
Objective. To perform a genome-wide association
study (GWAS) in Koreans in order to identify susceptibility loci for rheumatoid arthritis (RA).
Methods. We generated high-quality genotypes
for 441,398 single-nucleotide polymorphisms (SNPs) in
801 RA cases and 757 controls. We then tested 79
markers from 46 loci for replication in an independent
sample of 718 RA cases and 719 controls.
Results. Genome-wide significance (P < 5 × 10^-8) was attained by markers from the major histocompatibility complex region and from the PADI4 gene.
The replication data showed nominal association signals (P < 5 × 10^-2) for markers from 11 of the 46
replicated loci, greatly exceeding random expectation.
Genes that were most significant in the replication stage
and in the combined analysis include the known European RA loci BLK, AFF3, and CCL21. Thus, in addition
to the previously associated STAT4 alleles, variants at
these three loci may contribute to RA not only among
Europeans, but also among Asians. In addition, we
observed replication signals near the genes PTPN2,
FLI1, ARHGEF3, LCP2, GPR137B, TRHDE, and GGA1.
Based on the excess of small P values in the replication
stage study, we estimate that more than half of these loci
are genuine RA susceptibility genes. Finally, we systematically analyzed the presence of association signals in
Koreans at established European RA loci, which showed
a significant enrichment of European RA loci among the
Korean RA loci.
Conclusion. Genetic risk for RA involves both
population-specific loci and many genetic
susceptibility loci that are shared between Asian and European populations.
Supported by the American College of Rheumatology Research and Education Foundation (Within Our Reach research grant
to Dr. Gregersen), the Ministry for Health and Welfare, Republic of
Korea (Korea Healthcare Technology R&D project grants A010252
and A084794 to Drs. H.-S. Lee and S.-C. Bae), and the Eileen Ludwig
Greenland Center for Rheumatoid Arthritis.
1Jan Freudenberg, MD, Annette T. Lee, PhD, Peter K. Gregersen, MD: Feinstein Institute for Medical Research and North Shore–Long Island Jewish Health System, Manhasset, New York; 2Hye-Soon Lee, MD, PhD, Yoon-Kyoung Sung, MD, PhD, MPH, Chan-Bum Choi, MD, PhD, Sang-Cheol Bae, MD, PhD, MPH: Hanyang University Hospital for Rheumatic Diseases, Seoul, South Korea; 3Bok-Ghee Han, PhD: Korea National Institute of Health, Seoul, South Korea; 4Hyoung Do Shin, DVM, PhD: Sogang University, Seoul, South Korea; 5Young Mo Kang, MD, PhD: Kyungpook National University School of Medicine, Daegu, South Korea; 6Seung-Cheol Shim, MD, PhD: Eulji University Hospital, Daejeon, South Korea.
Address correspondence to Peter K. Gregersen, MD,
Robert S. Boas Center for Genomics and Human Genetics, The
Feinstein Institute for Medical Research, 350 Community Drive,
Manhasset, NY 11030 (e-mail: pgregers@nshs.edu); or to Sang-Cheol
Bae, MD, PhD, MPH, Hanyang University Hospital for Rheumatic
Diseases, Seoul 133-792, South Korea (e-mail: scbae@hanyang.ac.kr).
Submitted for publication July 22, 2010; accepted in revised
form December 30, 2010.
A large and growing list of genetic associations
with rheumatoid arthritis (RA) has emerged from
genome-wide association studies (GWAS) performed in
the last few years (1–6). The lists of putative risk genes
have pointed to both the adaptive and innate immune
systems as potential sources of biologic variation that
predispose to disease, with surface and intracellular
signaling molecules as well as cytokines making a major
contribution. The first 2 confirmed non–major histocompatibility complex (non-MHC) associations involved the
PADI4 locus in Asian populations (7) and PTPN22 in
Europeans (8). Intriguingly, neither of these associations
crosses over these 2 ethnic groups. The associations with
PADI4 are extremely weak or absent in most European
studies (4). Conversely, the PTPN22 risk allele, a causative amino acid change from arginine to tryptophan at
codon 620 (R620W), is simply not found in Asian
populations. PTPN22 encodes an intracellular phosphatase that plays a critical role in setting thresholds for
receptor signaling in both T cells and B cells. Extensive
resequencing of PTPN22 in Asian RA populations has
failed to find evidence of any additional risk variants
in this population (9). The PADI4 locus encodes a
peptidylarginine deiminase that is directly involved in the citrullination of proteins, thereby generating a major autoantigen that is the target of a humoral response
quite specific to RA in all major ethnic groups; nevertheless, associations at this locus are largely limited to
Asians.
In contrast, other genetic associations appear to
be common across Asian and European RA patients;
among them are associations at the HLA–DRB1 locus
(10) and STAT4 (11), although the specific HLA alleles
involved differ somewhat among these and other
ethnic groups. In order to explore more comprehensively the genetic differences and overlap between
European and Asian RA, we undertook a GWAS
of RA in the Korean population, with further replication
of the most strongly associated markers. Our data
revealed a complex picture of both shared and
population-specific genetic risk, as well as evidence for a
large background of modest risk that may be common to
both populations.
PATIENTS AND METHODS
Population sample. RA patients analyzed for the
GWAS (n = 801) were taken from a panel of 1,128 Korean
RA patients who were consecutively enrolled at the outpatient clinic of Hanyang University Hospital for Rheumatic
Diseases in Seoul, as described previously (11). A total of
757 controls were likewise taken from a panel of 1,022
ethnically matched controls recruited at the same location.
All patients analyzed in the GWAS were seropositive for
either anti–cyclic citrullinated peptide (anti-CCP) antibodies
(89.6%) or rheumatoid factor (95.8%). All RA patients were
of Korean nationality and met the American College of
Rheumatology 1987 classification criteria for RA (12).
Written informed consent was obtained from all study
participants.
RA cases for the replication stage were recruited from
3 centers in South Korea: Kyungpook National University
School of Medicine, Eulji University Hospital, and Hanyang
University Hospital for Rheumatic Diseases. All replication
cases were positive for anti-CCP antibodies and for rheumatoid factor. Controls for the replication study were
obtained from the DNA BioBank of the Korea National
Institute of Health. Clinical and demographic characteristics
of the RA cases and controls for the GWAS and the
replication study are detailed in Supplementary Table 1
(available on the Arthritis & Rheumatism web site at http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)1529-0131
and on the authors’ web site at http://www.biorep.org/supplementary/freudenberg2010/index.html).
The study was approved by the Institutional Review
Board of Hanyang University Hospital.
Genotyping. Genotyping for the GWAS stage was
carried out at the Feinstein Institute for Medical Research,
using Illumina HapMap 550v3 or 660W genotyping platforms.
Data were imported into GenomeStudio software for initial
review and quality control. SNP markers common to these 2
platforms were combined and subjected to further quality
control analysis, as described below, leaving a set of 441,398
available for analysis.
For the replication stage, genotyping was performed at
Hanyang University Hospital for Rheumatic Diseases at a
multiplex level using the Illumina Golden Gate genotyping
system. Replication SNPs were required to show a genotype
quality score of 0.25, a minimum call rate of 98%, no duplicate
errors, and a Hardy-Weinberg equilibrium test result
of P > 0.01.
Statistical analysis. Data were analyzed using the
program packages Plink (13), Haploview (14), and EigenStrat
(15) and the statistical software R. GWAS genotype data
were subjected to quality control filtering based on SNP
genotype call rates (>90% completeness), minor allele frequency (>1%), and Hardy-Weinberg equilibrium (P > 10^-6).
Subjects with more than 10% missing genotype data and
outlier samples (deviating >6 SEM on any of the
10 major principal components) were excluded. In addition,
we excluded samples showing evidence of relatedness to
another sample or possible DNA contamination (Plink
PI_HAT >0.05). Finally, SNPs with differential missingness
with respect to the presence or absence of RA or with
respect to haplotypes formed with neighboring SNP alleles
were excluded (Plink tests of missing by phenotype or by
genotype P < 10^-6). The remaining SNPs had a nonmissing
data rate of 99.8%. Power calculations were performed with a
genetic power calculator (16). False discovery rates for markers in the replication stage were estimated by the program
Q-value (17).
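The SNP-level filters described above can be sketched in a few lines. This is a minimal illustration, not the Plink implementation used in the study: the function names are hypothetical, the thresholds are the ones quoted in the text, and the Hardy-Weinberg check uses a 1-df chi-square approximation, whereas Plink may apply an exact test.

```python
import math

def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """Chi-square (1 df) P value for Hardy-Weinberg equilibrium from genotype counts."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df
    return math.erfc(math.sqrt(chi2 / 2.0))

def passes_snp_qc(n_aa, n_ab, n_bb, n_samples,
                  min_call_rate=0.90, min_maf=0.01, min_hwe_p=1e-6):
    """Apply the call-rate, minor allele frequency, and HWE filters to one SNP."""
    called = n_aa + n_ab + n_bb
    if called == 0:
        return False
    call_rate = called / n_samples
    freq_a = (2 * n_aa + n_ab) / (2 * called)
    maf = min(freq_a, 1.0 - freq_a)
    if call_rate <= min_call_rate or maf <= min_maf:
        return False
    return hwe_chi2_pvalue(n_aa, n_ab, n_bb) > min_hwe_p
```

A SNP with genotype counts in perfect Hardy-Weinberg proportions passes, whereas one with no heterozygotes at all fails the equilibrium filter.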
To formally evaluate the overlap of Korean and European RA loci, we used a method that we recently proposed for
the category-based analysis of GWAS data; it is described in
more detail elsewhere (18). This method builds on a partitioning of SNPs into separate genetic loci as provided by linkage
disequilibrium blocks from the HapMap database (19) in order
to minimize redundant association signals. In the present
study, we defined candidate loci based on RA association in a
European meta-analysis (4). Then, we calculated the odds
ratios (ORs) for these candidate loci to harbor at least 1 SNP
association in the Korean GWAS data. Thus, the odds that a
European RA locus would harbor an associated SNP was
divided by the odds that any other locus would harbor an
associated SNP. This OR statistic was normalized using permutation of the affected/unaffected status. The resulting normalized enrichment score necessarily depends on the threshold
for which SNPs are called “associated,” but it does not depend
on factors such as locus size, SNP density, or linkage disequilibrium (18).
Figure 1. Manhattan plot of allele association tests of all single-nucleotide polymorphisms that passed stringent quality control in 801 rheumatoid arthritis (RA) cases and 757 controls. Genome-wide significance was attained in the major histocompatibility complex region on chromosome 6 and at the PADI4 gene on chromosome 1. GWAS = genome-wide association study.
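The locus-level odds ratio described in the Statistical analysis paragraph can be illustrated as follows. This is a sketch under stated assumptions: the function names are hypothetical, and the permutation step shown here is a generic empirical P value rather than the exact normalization of the published method (18), which is not reproduced.

```python
def locus_enrichment_odds_ratio(candidate_hit, candidate_total, other_hit, other_total):
    """Odds that a candidate locus harbors an associated SNP, divided by the
    odds that any other locus does."""
    candidate_miss = candidate_total - candidate_hit
    other_miss = other_total - other_hit
    if min(candidate_miss, other_hit, other_miss) <= 0:
        raise ValueError("odds ratio undefined for an empty cell")
    return (candidate_hit / candidate_miss) / (other_hit / other_miss)

def empirical_p(observed, permuted):
    """Fraction of permuted statistics at least as large as the observed one,
    with the usual +1 correction."""
    n_extreme = sum(1 for value in permuted if value >= observed)
    return (1 + n_extreme) / (1 + len(permuted))
```

For example, if 10 of 30 candidate loci and 100 of 1,100 other loci carry an associated SNP, the odds ratio is (10/20)/(100/1,000) = 5.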
RESULTS
Findings of the GWAS of RA in Koreans. SNPs
were genotyped on the Illumina 550K genotyping
platform. After stringent quality control, a total of
441,398 SNPs with a minor allele frequency >1% were
available for comparison in 801 RA cases and 757
controls. Principal components analysis did not reveal
any population stratification or population outliers
(Supplementary Figure S1; available online at http://www.biorep.org/supplementary/freudenberg2010/index.html). Accordingly, association analyses of SNPs
with RA showed an estimated chi-square inflation factor
λ1,000 of 1.04, indicating little genome-wide stratification
between cases and controls.
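For readers unfamiliar with the inflation factor, the sketch below shows one common way to compute it. It assumes the widely used definitions (the median 1-df chi-square statistic divided by its null expectation, then rescaled to a study of 1,000 cases and 1,000 controls); the text does not state which formula the authors applied, and the function names are hypothetical.

```python
import statistics

# Median of a chi-square distribution with 1 degree of freedom
CHI2_1DF_MEDIAN = 0.4549364

def genomic_inflation(chi2_stats):
    """Lambda: median observed chi-square divided by the null median."""
    return statistics.median(chi2_stats) / CHI2_1DF_MEDIAN

def lambda_1000(lam, n_cases, n_controls):
    """Rescale lambda to an equivalent study of 1,000 cases and 1,000 controls."""
    scale = (1.0 / n_cases + 1.0 / n_controls) / (1.0 / 1000 + 1.0 / 1000)
    return 1.0 + (lam - 1.0) * scale
```

The rescaling makes inflation factors comparable between studies of different sample sizes, since lambda grows with sample size for a fixed degree of stratification.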
As expected, the most significant differences
between cases and controls were found in the MHC
region near the HLA–DRB1 gene, as shown in Figure 1.
The 2 most significant SNPs in the MHC were located
near the DRB1 locus: rs7765379 (P = 4.9 × 10^-23, OR
2.51) and rs13192471 (P = 1.1 × 10^-20, OR 2.1). The
latter SNP was also the most significant marker in a
recent GWAS for RA in the Japanese population (6).
Both these SNPs also showed strong associations with
the same alleles in European RA patients (4). In addition, 215 markers in the MHC region showed associations for the threshold P < 10^-3 (Supplementary Figure
S2; available online at http://www.biorep.org/supplementary/freudenberg2010/index.html). Further
analyses with denser marker maps will be required to
tease apart this broad MHC signal and to determine
whether additional signals that are independent of
HLA–DRB1 are located in this region.
After exclusion of SNPs from the MHC region
(chromosome 6:26–35 Mb), case–control differences in
remaining markers still showed a deviation from random
expectation, as shown in Figure 2A. Because this deviation was most prominent for markers with smaller P
values (e.g., P < 10^-3), we consider it unlikely that this
finding is the result of technical artifacts or stratification.
Moreover, we performed a stringent quality control
analysis to minimize this possibility, as detailed above. It
is thus likely that additional true-positive associations
exist outside the MHC region.
The most significant SNPs (outside the MHC region) that we identified in the Korean population are shown in Table 1. The full list of such SNPs for the threshold P < 0.01 is given in Supplementary Table 2 (available on the Arthritis & Rheumatism web site at http://onlinelibrary.wiley.com/journal/10.1002/(ISSN)1529-0131 and on the authors’ web site at http://www.biorep.org/supplementary/freudenberg2010/index.html). Interestingly, none of the loci that have been associated with RA in European populations (4) are contained in the list of top associations in the Korean RA population. However, a number of the established Caucasian risk loci did show evidence of association at lower levels of significance (Table 2). These loci include STAT4, as previously reported, as well as AFF3, TNFAIP3, CCR6, BLK, and TRAF1. Furthermore, PTPN2, which has been established as a risk factor for type 1 diabetes mellitus (20), showed some evidence of association with RA in the Korean population.
Figure 2. Quantile–quantile plot of the chi-square test statistic from the single-nucleotide polymorphism (SNP) allele association tests. A, After excluding the major histocompatibility complex region, a clear deviation from the expectation (straight line) indicates the presence of true-positive association signals. B, When the analysis was further restricted to the 6,726 SNPs with a significance of P < 1.0 × 10^-2 obtained in a recent meta-analysis of rheumatoid arthritis in European population samples and were also genotyped in our genome-wide association study, the deviation from the expectation became more prominent.
Table 1. Loci found at the GWAS stage to be most strongly associated with RA in the Korean population, based on P values from allelic SNP association tests*
SNP          Chromosome   MAF cases, %   MAF controls, %   P          OR     Nearest gene(s)
rs2240335    1            49.69          39.65             2.00E-08   1.50   PADI4
rs17769245   16           27.32          19.18             1.30E-07   1.58   SYCE1L/MON1B
rs2944021    19           28.29          20.19             1.80E-07   1.56   CCDC123
rs2290652    19           25.03          17.51             4.60E-07   1.57   ZNF302
rs1025065    16           21             28.34             2.01E-06   0.67   MPHOSPH6/CDH13
rs7834685    8            30.48          38.34             3.86E-06   0.70   CNBD1/CNGB3
rs1216363    4            8.68           4.56              4.14E-06   1.99   PHF17
rs4823569    22           34.36          42.38             4.44E-06   1.41   GRAMD4
rs7561798    2            36.14          28.64             7.86E-06   1.41   SPHKAP
rs9636786    21           11.19          16.71             8.62E-06   1.59   ADAMTS1
rs11236774   11           3              6.34              9.43E-06   0.46   C11orf30
rs791195     6            22.13          15.92             1.07E-05   1.50   SLC22A1
rs2062583    3            6.12           10.44             1.15E-05   0.56   ARHGEF3
rs12579024   12           13.63          8.71              1.57E-05   0.61   TBX3
rs4583322    18           27.03          34.11             1.81E-05   1.40   KIAA0427
rs10466245   10           3.38           6.69              2.36E-05   0.49   MARCH8
rs6962404    7            9.18           5.24              2.37E-05   1.83   COL28A1/C1GALT1
rs1077773    7            46.67          39.21             2.94E-05   0.74   AHR
rs6702348    1            27.81          21.35             2.99E-05   1.42   GPR137B
rs2159214    19           20.56          14.84             3.39E-05   0.67   NACC1/TRMT1
rs1474581    20           24.78          31.42             3.78E-05   1.39   PLCB1
rs4368165    16           47             45.63             4.02E-05   1.34   GP2/GPR139
rs942880     13           13.16          8.58              4.42E-05   1.61   SPATA13
rs10421853   19           24.03          30.52             4.80E-05   1.39   TSHZ3
rs6590343    11           13.92          19.36             5.11E-05   0.67   FLI1/ETS1
rs4547623    22           15.92          21.58             5.13E-05   0.69   GGA1/LGALS2
rs879036     6            2.18           4.82              5.77E-05   2.27   ETV7
rs12831974   12           48.38          41.2              5.98E-05   1.34   TRHDE
rs6679652    1            31.84          25.36             6.55E-05   0.73   RGS7
rs9916862    17           33.88          27.24             6.64E-05   1.37   ABR
rs1415654    9            1.37           3.57              7.10E-05   0.38   PPP3R2/GRIN3A
rs6815902    4            8.86           5.22              7.41E-05   0.57   PCDH7/STIM2
rs17617822   18           16.92          22.63             7.48E-05   0.70   METTL4
rs1265883    1            10.31          6.38              8.06E-05   1.69   SLAMF6
rs289744     16           29.49          36.15             8.32E-05   0.74   CETP
rs2303025    5            39.64          46.62             8.39E-05   0.75   ANXA6
rs4867947    5            25.88          32.31             8.39E-05   0.73   LCP2/C5orf58
rs17328497   15           27.59          21.53             8.78E-05   1.39   SEMA6D
rs12542184   8            8.24           4.76              8.85E-05   1.80   CSMD1
rs218311     2            28.66          35.2              8.98E-05   1.35   TLK1
rs1126133    14           40.82          34.04             9.37E-05   0.75   PRKCH
rs1541596    19           23.53          17.84             9.43E-05   1.42   CARM1
* For each locus the most significant marker (P < 0.0001), allele frequencies, and nearest gene are shown. P values are given in E notation (e.g., 2.00E-08 = 2.00 × 10^-8). GWAS = genome-wide association study; RA = rheumatoid arthritis; SNP = single-nucleotide polymorphism; MAF = minor allele frequency; OR = odds ratio.
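As a worked check on how the allelic statistics in Table 1 relate to the allele frequencies, the sketch below recomputes an odds ratio and an approximate P value for the PADI4 marker rs2240335. It assumes that allele counts can be approximated as frequency × 2N with no missing genotypes, and it uses a 1-df chi-square test on the 2 × 2 allele table; the function names are hypothetical and the study's own values were produced with Plink.

```python
import math

def allelic_odds_ratio(freq_cases, freq_controls):
    """Odds ratio for an allele from its frequency in cases and in controls."""
    return (freq_cases / (1.0 - freq_cases)) / (freq_controls / (1.0 - freq_controls))

def allelic_chi2_pvalue(freq_cases, freq_controls, n_cases, n_controls):
    """Approximate 1-df chi-square P value from a 2 x 2 table of allele counts."""
    a = freq_cases * 2 * n_cases          # tested allele, cases
    b = 2 * n_cases - a                   # other allele, cases
    c = freq_controls * 2 * n_controls    # tested allele, controls
    d = 2 * n_controls - c                # other allele, controls
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return math.erfc(math.sqrt(chi2 / 2.0))

# rs2240335 (PADI4): 49.69% in 801 cases, 39.65% in 757 controls
or_padi4 = allelic_odds_ratio(0.4969, 0.3965)
p_padi4 = allelic_chi2_pvalue(0.4969, 0.3965, 801, 757)
```

Under these assumptions the odds ratio rounds to 1.50 and the P value is on the order of 10^-8, in line with the tabulated entry. For some rows the tabulated OR is the reciprocal of the value obtained from the listed frequencies, which suggests that the OR there refers to the other allele.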
We also looked at RA loci that were previously established in the Japanese RA population. We did not see any associations with the FCRL3 or CD244 gene. At the PADI gene cluster, we found the strongest signal at PADI4, as expected (Table 1 and Supplementary Table 2; available online at http://www.biorep.org/supplementary/freudenberg2010/index.html). Interestingly, we also found a second association peak at the neighboring PADI2 gene that did not show any linkage disequilibrium with the associated SNPs in PADI4 (Supplementary Figure S3; available online at http://www.biorep.org/supplementary/freudenberg2010/index.html). Although the statistical significance of associated markers at PADI2 (P = 2 × 10^-3, OR 1.25 for rs2075696) was much weaker than that at PADI4 (P = 2 × 10^-8, OR 1.5 for rs2240335), it may be interesting to point out that PADI2 and PADI4 are the only 2 PADI genes that are highly expressed in hematopoietic cells (21).
Table 2. Markers previously implicated in RA susceptibility in European populations that were found at the GWAS stage to be associated at a level of P < 0.005 in the Korean population*
SNP          Chromosome   MAF cases, %   MAF controls, %   P          OR     Gene locus   OR from European meta-analysis
rs10168266   2            35.27          30.25             2.87E-03   1.26   STAT4        1.16
rs2009094    2            48.19          42.59             1.72E-03   1.25   AFF3         0.91
rs12055552   6            22.69          18.01             1.23E-03   1.34   TNFAIP3      1.03
rs204295     6            51.37          44.97             3.54E-04   1.29   CCR6         0.94
rs6984212    8            26.47          31.94             7.85E-04   0.77   BLK          0.93
rs1953126    9            35.33          29.54             6.03E-04   1.30   TRAF1        1.1
rs657555     18           37.77          31.62             3.31E-04   1.31   PTPN2        1.14
* Odds ratios (ORs) for the European population were obtained from the study reported by Stahl et al (4) and refer to the same alleles as those in the Korean population. P values are given in E notation (e.g., 2.87E-03 = 2.87 × 10^-3). RA = rheumatoid arthritis; GWAS = genome-wide association study; SNP = single-nucleotide polymorphism; MAF = minor allele frequency.
To quantify more precisely the amount of true
signal in our data, we next partitioned SNP markers
based on the linkage disequilibrium blocks from the
HapMap phase II database (19). We then compared the
observed number of linkage disequilibrium blocks with
associated SNPs to their expectation, as obtained from
permutation of the affection status (Supplementary Figure S4; available online at http://www.biorep.org/supplementary/freudenberg2010/index.html). This analysis showed an excess of 14 blocks (of 42) with at least
1 associated SNP, when calling SNPs associated at a
level of P < 10^-4. This excess increased to 46 linkage
disequilibrium blocks (of 316) for the threshold value of
P < 0.001. For even less stringent thresholds (P < 0.01),
200 associated linkage disequilibrium blocks were observed above the expected number (data not shown).
This indicates that additional true association signals
exist at more modest levels of statistical significance in
our dataset.
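The block-counting comparison described in this paragraph can be sketched as follows. The data layout (a mapping from a linkage disequilibrium block identifier to the P values of its SNPs, once for the observed data and once per permutation) and the function names are assumptions made for illustration; they are not taken from the authors' software.

```python
def count_associated_blocks(block_pvalues, threshold):
    """Number of blocks that contain at least 1 SNP with P below the threshold."""
    return sum(1 for pvals in block_pvalues.values()
               if any(p < threshold for p in pvals))

def excess_blocks(observed, permuted, threshold):
    """Observed number of associated blocks minus the mean number seen
    across permutations of the case/control labels."""
    n_observed = count_associated_blocks(observed, threshold)
    n_expected = sum(count_associated_blocks(perm, threshold)
                     for perm in permuted) / len(permuted)
    return n_observed - n_expected

# Toy example with 3 blocks and 2 permutations
observed = {"b1": [1e-5, 0.2], "b2": [0.5], "b3": [5e-5]}
permuted = [
    {"b1": [0.3, 0.2], "b2": [0.5], "b3": [0.9]},
    {"b1": [5e-5, 0.2], "b2": [0.5], "b3": [0.9]},
]
excess = excess_blocks(observed, permuted, threshold=1e-4)
```

Counting each block once, regardless of how many of its SNPs pass the threshold, keeps correlated SNPs within a block from being counted as separate signals.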
Findings of the replication study. To gain further
insight into the loci that cause the excess of RA association signals at the GWAS stage, we picked 96 SNPs for
genotyping in an independent sample of 718 RA cases
and 719 controls. These SNPs were primarily chosen
from the set of 42 loci that harbor at least 1 SNP with
significance at P < 10^-4 (Table 1). Based on an OR of
1.3, a risk allele frequency of 10%, and a disease
prevalence of 1%, we estimated that this replication
sample provided a statistical power of 67% to attain
significance of P < 0.05 for a true-positive SNP from the
GWAS stage. Thus, based on the above estimate of 14
truly associated non-MHC loci among the 42 loci with
P < 10^-4, one may expect that around 9 of these loci
would fall below P < 0.05 in our replication sample.
Because we did not attempt replication for PADI4 and
were unable to attain high-quality genotypes for all
SNPs chosen for replication, rather fewer than 9 loci
might be expected to fall under the threshold P < 0.05 in
the replication stage.
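The expectation of around 9 replicating loci follows from multiplying the estimated number of true loci by the replication power. The short sketch below makes that arithmetic explicit and adds a binomial standard deviation, which assumes that the loci replicate independently and with equal power; that assumption and the function names are illustrative and are not part of the original analysis.

```python
import math

def expected_replications(n_true_loci, power):
    """Expected number of true loci reaching the replication threshold."""
    return n_true_loci * power

def binomial_sd(n_true_loci, power):
    """Standard deviation of that count under independent, equal-power loci."""
    return math.sqrt(n_true_loci * power * (1.0 - power))

expected = expected_replications(14, 0.67)   # 14 loci at 67% power
spread = binomial_sd(14, 0.67)
```

With 14 true loci and 67% power the expectation is 9.38, and the standard deviation under the independence assumption is roughly 1.8 loci.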
Furthermore, we complemented the replication
stage analysis with SNPs that had attained association at
P < 5 × 10^-3 in the Korean GWAS and were found at
loci with prior evidence of association with autoimmune
disease in Europeans (Table 2). Six of these particular
SNPs had shown at least a weak association (P < 5 ×
10^-2) with RA in Europeans, and 4 of these had the same
direction of association as seen in this study of Koreans.
Using the same disease parameters as above and assuming that the attempted replications are true-positive
associations, we would expect a successful replication for
4 of these loci at a threshold of P < 0.05.
High-quality replication genotypes could be obtained for 79 SNPs covering 46 different loci. The most significant replication signal was found at the BLK locus (P = 7 × 10^-4, OR 0.77 for rs1600249). In total, nominally significant (P < 0.05) replication signals were found for 11 different loci, including 2 markers at the BLK locus. Among these, the directions of the association were consistent with the GWAS findings for 10 loci, being inconsistent only for rs10421853 at the TSHZ3 locus. Accordingly, the respective 10 loci showed a stronger signal in the combined analysis (Table 3). However, as mentioned above, none of these associations reached genome-wide significance in the combined analysis of the GWAS and replication data.
Table 3. Markers most strongly associated with RA in the Korean population, based on the results from the replication stage*
MAF from the GWAS, %
SNP          Nearby gene(s)   Cases   Controls   P          OR
rs1600249    BLK              26.88   31.85      2.29E-03   0.79
rs2736340    BLK              22.72   27.25      3.51E-03   1.27
rs2009094    AFF3             48.19   42.59      1.72E-03   1.25
rs12831974   TRHDE            48.38   41.2       5.98E-05   1.34
rs7024727    CCL21            1.31    2.77       3.73E-03   0.47
rs657555     PTPN2            37.77   31.62      3.31E-04   1.31
rs2062583    ARHGEF3          6.12    10.44      1.15E-05   0.56
rs7537965    GPR137B          26.03   20.5       2.68E-04   0.73
rs4867947    LCP2/C5orf58     25.88   32.31      8.39E-05   0.73
rs4547623    GGA1/LGALS2      15.92   21.58      5.13E-05   0.69
rs4936059    FLI1/ETS1        33.65   40.35      1.10E-04   0.75
MAF from the replication study, %
SNP          Cases   Controls   P          OR
rs1600249    27.44   33.24      7.15E-04   0.76
rs2736340    24.41   29.76      1.24E-03   1.31
rs2009094    47.56   42.24      4.20E-03   1.24
rs12831974   46.3    41.93      1.83E-02   1.19
rs7024727    1.81    3.13       2.28E-02   0.57
rs657555     35.94   31.92      2.28E-02   1.20
rs2062583    6.55    8.79       2.41E-02   0.73
rs7537965    25.21   21.91      3.69E-02   0.83
rs4867947    27.48   31.02      3.71E-02   0.84
rs4547623    18.19   21.16      4.78E-02   0.83
rs4936059    34.89   38.42      4.94E-02   0.86
Combined MAFs, %
SNP          Cases   Controls   P          OR (95% CI)
rs1600249    27.14   32.53      5.18E-06   0.77 (0.69–0.86)
rs2736340    23.52   28.47      1.22E-05   1.29 (1.15–1.45)
rs2009094    47.89   42.42      2.14E-05   1.25 (1.13–1.38)
rs12831974   47.4    41.56      5.69E-06   1.27 (1.14–1.4)
rs7024727    1.55    2.95       2.49E-04   0.52 (0.36–0.74)
rs657555     36.91   31.77      2.93E-05   1.26 (1.13–1.4)
rs2062583    6.32    9.63       2.16E-06   0.63 (0.52–0.77)
rs7537965    25.64   21.19      4.73E-05   0.78 (0.69–0.88)
rs4867947    26.64   31.68      1.88E-05   0.78 (0.7–0.88)
rs4547623    16.99   21.38      1.75E-05   0.75 (0.66–0.86)
rs4936059    34.23   39.41      3.38E-05   0.80 (0.72–0.89)
* Shown are the results from the initial genome-wide association study (GWAS) stage, the replication stage, and the combined analysis. P values are given in E notation (e.g., 2.29E-03 = 2.29 × 10^-3). RA = rheumatoid arthritis; SNP = single-nucleotide polymorphism; MAF = minor allele frequency; OR = odds ratio; 95% CI = 95% confidence interval.
Although P values obtained at the replication stage
were individually rather weak, their overall distribution
showed a clear skew toward smaller values (Supplementary
Figure S5; available online at http://www.biorep.org/supplementary/freudenberg2010/index.html). Based on
the skew of this distribution, we estimate a false discovery rate of ~25% for the significance threshold of P <
0.05 (17). Thus, one may expect that about 8 of the 11
gene loci with SNP associations of P < 0.05 constitute
genuine RA associations. This estimate of 8 true-positive loci is only slightly below the expectation, as
derived above from the power analysis for this threshold. However, our analysis of the replication data
tended to be conservative, in the sense that we performed a 2-sided test and the test for significance did not
consider the specific alleles that were found to be
associated at the GWAS stage. From loci showing the
strongest association signals at the GWAS stage (Table
1), the most promising replication signals were obtained
for ARHGEF3, LCP2, GPR137B, TRHDE, and GGA1
(Table 3). Among the European immune loci studied
(Table 2), replication signals in addition to BLK were
also found at AFF3, CCL21, and PTPN2 (Table 3).
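The false discovery rate estimate quoted above was obtained with the Q-value program (17). The sketch below illustrates the general idea behind such an estimate: the excess of small P values is measured against the proportion of null tests inferred from the large P values. It is a simplified stand-in that uses a single fixed tuning value (lam = 0.5) rather than the program's own procedure, and the function names are hypothetical.

```python
def estimate_pi0(pvalues, lam=0.5):
    """Estimated proportion of true null tests, from the share of P values above lam."""
    m = len(pvalues)
    n_above = sum(1 for p in pvalues if p > lam)
    return min(1.0, n_above / (m * (1.0 - lam)))

def fdr_at_threshold(pvalues, threshold, lam=0.5):
    """Estimated false discovery rate when calling all P <= threshold significant."""
    m = len(pvalues)
    n_significant = sum(1 for p in pvalues if p <= threshold)
    if n_significant == 0:
        return 0.0
    expected_false = estimate_pi0(pvalues, lam) * m * threshold
    return min(1.0, expected_false / n_significant)

def expected_true_positives(n_significant, fdr):
    """Number of significant results expected to be genuine."""
    return n_significant * (1.0 - fdr)
```

Applied to the figures in the text, 11 loci at an estimated false discovery rate of 25% give 11 × 0.75 = 8.25, which is the "about 8" genuine associations stated above.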
Systematic analysis of candidate loci implicated
by GWAS in Europeans. Clearly, the above findings of 3
European RA loci (BLK, AFF3, and CCL21) and 1 type
1 diabetes mellitus locus (PTPN2) among the 10 loci
with positive replication signals indicate a certain overlap
between European and Asian RA loci. Therefore,
we wanted to formally analyze the overlap of associated
loci in our Korean population with RA loci previously
reported in Europeans. To this end, we used the list of
loci that had shown an association with RA in a recent
meta-analysis (4). In total, we retrieved 6,726 non-MHC
SNPs with associations of P < 10^-2 from this meta-analysis of European GWAS for RA that were also
genotyped in our study. The association signals of these
SNPs displayed a clear deviation from the expected
(Figure 2B).
We next investigated this overlap using a computational framework that we had recently proposed for
category-based analysis of GWAS data (18). In short,
this method takes a set of candidate loci as input and
scores the enrichment of association signals at these loci
in comparison to the remaining genome. To this end, the
method calculates a normalized enrichment score that
quantifies the excess of association signals at the loci
from a candidate category. We excluded the MHC
region and defined candidate loci based on the presence
of associated SNPs in Europeans, varying the threshold
for SNPs to be designated as being associated. We
further varied the threshold for calling associated SNPs
in the Korean GWAS dataset. This showed a significant enrichment of European RA loci among Korean RA loci when defining European RA loci based on SNPs with association values of P < 10^-5 in the earlier meta-analysis and when calling Korean RA loci based on SNPs with association values of P < 10^-2 in the GWAS dataset (Figure 3). Thus, although we did not find any established European RA loci among the top hits of our Korean GWAS (Table 1), this analysis showed them to be clearly enriched among loci with weaker association signals.
In a final step, we conducted a genetic risk score analysis to evaluate whether the observed overlap of RA risk loci between Europeans and Koreans extended to an overlap of RA risk alleles. As proposed by the International Schizophrenia Consortium (22) and implemented in the program package Plink (13), we calculated a disease risk score for each subject from the number of present risk alleles. Risk alleles were weighted by the logOR of the allele, as estimated by the European RA meta-analysis. We again excluded SNPs from the MHC region and successively restricted the set of SNPs based on their maximum P value in the meta-analysis. For each set of SNPs, we performed a logistic regression analysis of RA affected/unaffected status on risk score. We then calculated Nagelkerke’s R^2 as the fraction of variance explained by the risk score in the regression model. This showed that a maximum of 2.5% of the RA risk in Koreans was explained by risk scores when the set of SNPs was restricted to those with P values smaller than 10^-2 in the European meta-analysis (Supplementary Figure S6; available online at http://www.biorep.org/supplementary/freudenberg2010/index.html). Although the explanatory power of this risk score variable was rather small, its inclusion in the regression model was highly significant (P = 1.1 × 10^-8). Thus, alleles that are associated with RA risk in Europeans also show an overall stronger-than-expected association with RA in Koreans.
Figure 3. Surface plot of the normalized enrichment score for rheumatoid arthritis (RA)–associated loci in the European population among loci with single-nucleotide polymorphism (SNP) associations in data from the Korean genome-wide association study (GWAS). Candidate loci were defined based on the presence of SNPs with variable evidence for RA association in Europeans (from P < 10^-1 to P < 10^-6). These loci were then tested for enrichment of RA association signals in the Korean GWAS, where the threshold for calling SNP associations was also varied (from P < 10^-1 to P < 10^-4). The colors designate the magnitude of the enrichment of candidate loci among associated loci for the respective threshold parameters, with red representing a high score and blue representing a low score. For the range of threshold parameters with the greatest enrichment scores, the enrichment actually observed for European RA loci was greater than in any of 1,000 permutations of the affected/unaffected status.
DISCUSSION
We performed a GWAS and replication study in
the Korean RA population and compared the results to
the accumulating evidence for multiple genetic susceptibility loci in Europeans. Overall, the data demonstrated a complex picture, with both shared and
population-specific disease susceptibility. Because our
GWAS was of modest sample size, statistical power was
limited in the discovery phase. The expected presence of
associations in the HLA–DRB1 and PADI4 regions
demonstrated that our case–control sample was informative with regard to RA-associated loci. Accordingly,
one could expect the presence of true signal below the
formal genome-wide significance threshold. This notion
was supported by our estimate of an excess of 14
true-positive associations among the 42 associated loci
with SNPs having a significance of P < 10^-4. The
respective list of putative RA loci was further narrowed
down by the results from the replication stage, where we
found a strong skew toward smaller P values.
We estimated that our study had 50% power to
detect loci with a risk allele frequency of 40% and an OR
of 1.5 for the P value threshold of P < 5 × 10^-8 and 95%
power to detect such loci for a threshold of P < 10^-4.
Thus, it appears unlikely that many such loci exist
beyond HLA–DRB1 and PADI4. In contrast, our study
had less than 1% power to detect risk alleles with an
allele frequency of 10% and an OR of 1.3 for the
significance threshold of P < 5 × 10^-8, 5% power to
detect such loci for a threshold of P < 1 × 10^-4, and
60% power to detect such loci for a threshold of P < 5 ×
10^-2. Notably, it is exactly in this threshold range, for which
the study was most powerful, that we see the strongest
overlap with RA loci identified in a much larger meta-analysis of European RA loci (Figure 3). This formally
confirms the impression obtained from the presence of
several weaker association signals in Koreans that were
found for European RA loci (Table 2).
It is therefore likely that the extent of overlapping
risk factors between the 2 populations is greater than
that suggested by the list of the very top associations
from the GWAS stage (Table 1). However, our present
study mainly examined the overlap between loci. It will
be interesting in future studies to perform a more
detailed analysis of whether the same or different susceptibility mutations underlie these loci that are shared
across populations.
Because a role of mutations for autoimmune
susceptibility in Europeans has already been established
for BLK, AFF3, and CCL21, the associations of these
genes with RA in Koreans are the most likely to be true
positives. Conversely, it was also interesting to examine
which of the Korean RA loci show subthreshold associations in Europeans, since this would, in turn, increase
the confidence in the association findings we obtained in
the Korean sample. Therefore, we looked up the results
for the associated markers at PTPN2, FLI1, ARHGEF3,
LCP2, GPR137B, TRHDE, and GGA1 in a recent meta-analysis of RA (4). This showed fairly strong evidence
for PTPN2 (P = 7.4 × 10^-5 for rs657555) and weaker
evidence for FLI1 (P = 0.003 for rs4936059) in this large
European RA meta-analysis.
FLI1 has been implicated in the risk of murine
lupus due to regulatory polymorphisms acting in T cells
(23), and it shares similar regulatory regions with humans (24). Interestingly, markers from the neighboring
ETS1 gene were recently associated with systemic lupus
erythematosus (SLE) in Chinese (25). These associations of ETS1 with SLE are only 130 kb away from the
association of FLI1 with RA we observed in the present
study. Because linkage disequilibrium between markers
in FLI1 and ETS1 is weak (Supplementary Figure S7;
available online at http://www.biorep.org/supplementary/freudenberg2010/index.html), we would consider these
to represent independent signals for RA and SLE
susceptibility in Asians. Indeed, none of the ETS1
markers associated with SLE in Chinese showed any
association with RA in our study of Koreans.
Another gene with a possible role for RA in both
European and Asian populations is CCR6 (4,6). Our
GWAS data supported an association of the SNP
rs3093024 with RA in Koreans (P = 0.004, OR 1.23).
However, we also saw differences in the allele frequencies; the A allele attained a frequency of 45.9% in RA
cases and 40.8% in controls in our study, whereas it
attained a frequency of 52% in RA cases and 46% in
controls in the Japanese population (6). Thus, rs3093024
seems to be a SNP with a fairly large allele frequency
difference between the Japanese and Korean populations. Interestingly, this SNP was reported to be in
strong linkage disequilibrium with a presumably functional insertion/deletion polymorphism in Japanese (6).
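The relationship between the allele frequencies and the OR quoted above can be checked with a few lines of arithmetic. The sketch below computes an unadjusted allelic odds ratio from case and control allele frequencies. The helper name is hypothetical, and the Japanese frequencies are the rounded values cited from (6), so the second figure is only approximate.

```python
def allelic_odds_ratio(case_freq, control_freq):
    """Odds ratio for carrying the allele, from case and control allele frequencies."""
    return (case_freq / (1 - case_freq)) / (control_freq / (1 - control_freq))

korean_or = allelic_odds_ratio(0.459, 0.408)   # A allele frequencies from this study
japanese_or = allelic_odds_ratio(0.52, 0.46)   # rounded frequencies cited from (6)

print(f"Korean OR: {korean_or:.2f}")
print(f"Japanese OR (approximate): {japanese_or:.2f}")
```

The Korean frequencies give an odds ratio of about 1.23, in line with the OR reported for our GWAS data, which suggests that the effect size is similar in the 2 populations even though the allele itself is less frequent in Koreans.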
Among the remaining candidate genes shown in
Table 3, LCP2 is of particular interest, since it encodes
SLP-76, a critical adaptor protein for receptor signaling
in T cells and several other hematopoietic cell types
(26). The associated SNP, rs4867947, is located ~50 kb
downstream of LCP2, and therefore, much work remains before the functionally relevant locus in this
region is definitively identified.
In summary, we have presented support for associations with 10 different novel putative RA genes in
the Korean population. Despite the fact that none of
these new associations reaches generally accepted levels
of genome-wide significance, we estimate that a large
proportion of these associations are likely to be true
positives. We further showed that the overlap between
non-MHC loci that are associated with RA is significantly larger than expected by chance and, thus, at least
a subset of RA loci are shared between European and
Asian populations. We therefore believe that the list of
associations provided herein is likely to be helpful for
further fine-mapping studies and future meta-analyses
of RA in Asians as well as across populations.
AUTHOR CONTRIBUTIONS
All authors were involved in drafting the article or revising it
critically for important intellectual content, and all authors approved
the final version to be published. Drs. Gregersen and Bae had full
access to all of the data in the study and take responsibility for the
integrity of the data and the accuracy of the data analysis.
Study conception and design. Freudenberg, H.-S. Lee, A. T. Lee,
Gregersen, Bae.
Acquisition of data. H.-S. Lee, Han, Shin, Kang, Sung, Shim, Choi,
A. T. Lee, Gregersen, Bae.
Analysis and interpretation of data. Freudenberg, H.-S. Lee, Shin,
A. T. Lee, Gregersen, Bae.
REFERENCES
1. Plenge RM, Seielstad M, Padyukov L, Lee AT, Remmers EF, Ding B, et al. TRAF1-C5 as a risk locus for rheumatoid arthritis: a genomewide study. N Engl J Med 2007;357:1199–209.
2. Remmers EF, Plenge RM, Lee AT, Graham RR, Hom G, Behrens TW, et al. STAT4 and the risk of rheumatoid arthritis and systemic lupus erythematosus. N Engl J Med 2007;357:977–86.
3. Gregersen PK, Amos CI, Lee AT, Lu Y, Remmers EF, Kastner DL, et al. REL, encoding a member of the NF-κB family of transcription factors, is a newly defined risk locus for rheumatoid arthritis. Nat Genet 2009;41:820–3.
4. Stahl EA, Raychaudhuri S, Remmers EF, Xie G, Eyre S, Thomson BP, et al. Genome-wide association study meta-analysis identifies seven new rheumatoid arthritis risk loci. Nat Genet 2010;42:508–14.
5. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. Nature 2007;447:661–78.
6. Kochi Y, Okada Y, Suzuki A, Ikari K, Terao C, Takahashi A, et al. A regulatory variant in CCR6 is associated with rheumatoid arthritis susceptibility. Nat Genet 2010;42:515–9.
7. Suzuki A, Yamada R, Chang X, Tokuhiro S, Sawada T, Suzuki M, et al. Functional haplotypes of PADI4, encoding citrullinating enzyme peptidylarginine deiminase 4, are associated with rheumatoid arthritis. Nat Genet 2003;34:395–402.
8. Begovich AB, Carlton VE, Honigberg LA, Schrodi SJ, Chokkalingam AP, Alexander HC, et al. A missense single-nucleotide polymorphism in a gene encoding a protein tyrosine phosphatase (PTPN22) is associated with rheumatoid arthritis. Am J Hum Genet 2004;75:330–7.
9. Lee HS, Korman BD, Le JM, Kastner DL, Remmers EF, Gregersen PK, et al. Genetic risk factors for rheumatoid arthritis differ in Caucasian and Korean populations. Arthritis Rheum 2009;60:364–71.
10. Lee HS, Lee KW, Song GG, Kim HA, Kim SY, Bae SC. Increased susceptibility to rheumatoid arthritis in Koreans heterozygous for HLA–DRB1*0405 and *0901. Arthritis Rheum 2004;50:3468–75.
11. Lee HS, Remmers EF, Le JM, Kastner DL, Bae SC, Gregersen PK. Association of STAT4 with rheumatoid arthritis in the Korean population. Mol Med 2007;13:455–60.
12. Arnett FC, Edworthy SM, Bloch DA, McShane DJ, Fries JF, Cooper NS, et al. The American Rheumatism Association 1987 revised criteria for the classification of rheumatoid arthritis. Arthritis Rheum 1988;31:315–24.
13. Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MA, Bender D, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007;81:559–75.
14. Barrett JC, Fry B, Maller J, Daly MJ. Haploview: analysis and visualization of LD and haplotype maps. Bioinformatics 2005;21:263–5.
15. Price AL, Patterson NJ, Plenge RM, Weinblatt ME, Shadick NA, Reich D. Principal components analysis corrects for stratification in genome-wide association studies. Nat Genet 2006;38:904–9.
16. Purcell S, Cherny SS, Sham PC. Genetic Power Calculator: design of linkage and association genetic mapping studies of complex traits. Bioinformatics 2003;19:149–50.
17. Storey JD, Tibshirani R. Statistical significance for genomewide studies. Proc Natl Acad Sci U S A 2003;100:9440–5.
18. Freudenberg J, Lee AT, Siminovitch KA, Amos CI, Ballard D, Li W, et al. Locus category based analysis of a large genome-wide association study of rheumatoid arthritis. Hum Mol Genet 2010;19:3863–72.
19. Frazer KA, Ballinger DG, Cox DR, Hinds DA, Stuve LL, Gibbs RA, et al. A second generation human haplotype map of over 3.1 million SNPs. Nature 2007;449:851–61.
20. Barrett JC, Clayton DG, Concannon P, Akolkar B, Cooper JD, Erlich HA, et al. Genome-wide association study and meta-analysis find that over 40 loci affect risk of type 1 diabetes. Nat Genet 2009;41:703–7.
21. Vossenaar ER, Zendman AJ, van Venrooij WJ, Pruijn GJ. PAD, a growing family of citrullinating enzymes: genes, features and involvement in disease. Bioessays 2003;25:1106–18.
22. Purcell SM, Wray NR, Stone JL, Visscher PM, O’Donovan MC, Sullivan PF, et al. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 2009;460:748–52.
23. Nowling TK, Fulton JD, Chike-Harris K, Gilkeson GS. Ets factors and a newly identified polymorphism regulate Fli1 promoter activity in lymphocytes. Mol Immunol 2008;45:1–12.
24. Svenson JL, Chike-Harris K, Amria MY, Nowling TK. The mouse and human Fli1 genes are similarly regulated by Ets factors in T cells. Genes Immun 2010;11:161–72.
25. Yang W, Shen N, Ye DQ, Liu Q, Zhang Y, Qian XX, et al. Genome-wide association study in Asian populations identifies variants in ETS1 and WDFY4 associated with systemic lupus erythematosus. PLoS Genet 2010;6:e1000841.
26. Koretzky GA, Abtahian F, Silverman MA. SLP76 and SLP65: complex regulation of signalling in lymphocytes and beyond. Nat Rev Immunol 2006;6:67–78.