RESEARCH ARTICLE
Neuropsychiatric Genetics
Association Between TCF4 and Schizophrenia Does Not Exert its Effect by Common Nonsynonymous Variation or by Influencing cis-Acting Regulation of mRNA Expression in Adult Human Brain
Hywel J. Williams, Valentina Moskvina, Rhodri L. Smith, Sarah Dwyer, Giancarlo Russo, Michael J. Owen, and Michael C. O’Donovan*
MRC Centre for Neuropsychiatric Genetics and Genomics, Department of Psychological Medicine and Neurology, School of Medicine, Cardiff University, Cardiff, UK
Received 21 March 2011; Accepted 6 July 2011
Large collaborative genome-wide association studies of schizophrenia have identified genes and genomic regions that are associated with the disorder at highly stringent levels of statistical significance. Among these, transcription factor 4 (TCF4) is one of the best supported, although the associated SNP (rs9960767) is located within intron 3 and has no obvious function. Seeking the mechanism at TCF4 responsible for the association, we examined TCF4 for coding variants, and for cis-regulated variation in TCF4 gene expression correlated with the associated SNP, using an assay to detect differential allelic expression. Using data from the 1000 genomes project, we were unable to identify any nonsynonymous coding variants at the locus. Allele-specific expression analysis using human post-mortem brain samples revealed no evidence for cis-regulated mRNA expression related to genotype at the schizophrenia-associated SNP. We conclude that association between schizophrenia and TCF4 is not mediated by a relatively common nonsynonymous variant, or by a variant that alters mRNA expression as measured in adult human brain. It remains possible that the risk allele at this locus exerts effects on expression exclusively in a developmental context, in cell types or brain regions not adequately represented in our analysis, or through post-transcriptional effects, for example in the abundance of the protein or its sub-cellular distribution.
© 2011 Wiley-Liss, Inc.
How to Cite this Article: Williams HJ, Moskvina V, Smith RL, Dwyer S, Russo G, Owen MJ, O’Donovan MC. 2011. Association Between TCF4 and Schizophrenia Does Not Exert its Effect by Common Nonsynonymous Variation or by Influencing cis-Acting Regulation of mRNA Expression in Adult Human Brain. Am J Med Genet Part B 156:781–784.
Key words: allelic expression; TCF4; GWAS; eQTL; mRNA
Grant sponsor: Medical Research Council (UK); Grant sponsor: The Wellcome Trust; Grant sponsor: NIMH (USA) CONTE: 2 P50; Grant number: MH066392-05A1.
*Correspondence to: Michael C. O’Donovan, MRC Centre for Neuropsychiatric Genetics and Genomics, Henry Wellcome Building, Department of Psychological Medicine and Neurology, School of Medicine, Cardiff University, Cardiff CF14 4XN, UK. E-mail: odonovanmc@Cardiff.ac.uk
Published online 2 August 2011 in Wiley Online Library (wileyonlinelibrary.com). DOI 10.1002/ajmg.b.31219
INTRODUCTION
Genome-wide association studies (GWAS) of schizophrenia have identified a small number of loci at which variants are associated with the disorder at a level considered genome-wide significant (GWS) [Dudbridge and Gusnanto, 2008]. To date, the associated markers implicated in the disorder at this level of support map within, or in the vicinity of, zinc-finger binding protein 804A (ZNF804A) [Williams et al., 2010], transcription factor 4 (TCF4) [Stefansson et al., 2009], neurogranin (NRGN) [Stefansson et al., 2009] and an extended region spanning the MHC locus on chromosome 6 [Purcell et al., 2009; Shi et al., 2009; Stefansson et al., 2009].
Demonstration of strong evidence for association is a crucial step in implicating susceptibility genes, but genetic association findings point to genomic locations that harbour susceptibility variants rather than specific genes per se, it being possible that any observed association is the result of linkage disequilibrium (LD) to an as yet unknown functional element close to the gene of interest. In order to apply the genetic findings for developing an understanding of pathophysiology, it is necessary to determine the identity of the gene, or other element, whose function is influenced by the true functional variant. Presently, in no instances in schizophrenia, and indeed in very few instances in complex diseases as a whole, has the functional variant responsible for the GWS association been identified. There are several possible explanations for this. Among these is the requirement to undertake comprehensive variant discovery across often large regions of the genome in sufficient numbers of individuals to be confident of capturing the susceptibility variant at least once. Even if the variant is captured, thereafter, on statistical grounds alone, it will often be difficult to identify which among many correlated markers best captures what are often weak (in terms of effect sizes) association signals. Another approach is to test those markers showing strong evidence for association for some form of functional impact on the putative gene responsible for the signal, but a comprehensive evaluation of functionality requires innumerable functional assays, some under dynamic conditions (e.g., exposure to hormones).
Given the complexities of assigning function to variants, the most common approaches are (a) to test genes in associated regions for nonsynonymous variants that are associated with disease and then design functional assays based upon the type of gene and the location of the variant within that gene, and (b) to measure mRNA from the candidate gene in extracts from a relevant tissue and then seek evidence for correlation between expression levels and an associated variant. The premise of the latter approach is that if the associated marker tags a functional variant that regulates gene expression, a so-called cis-eQTL, the associated variant should be associated with expression of the candidate gene. In the present study, seeking to clarify the mechanism underpinning one of the most convincing schizophrenia associations to a common allele, that at the TCF4 locus, we apply both of these tractable approaches.
TCF4 was identified as a susceptibility locus for schizophrenia by the GWAS and follow-up studies of the SGENE consortium [Stefansson et al., 2009], who observed strong evidence for association to a variant (rs9960767; P = 4.1 × 10⁻⁹) located within intron 3 of TCF4 (GeneID 6925). The signal most likely points to TCF4 per se given that there is no LD between this variant and other markers that extend beyond the boundaries of this gene, although LD to an unknown intragenic functional element cannot be excluded. TCF4 is also a highly plausible candidate for schizophrenia, it being a transcription factor whose functions involve roles in the development of the nervous system [Blake et al., 2010]. Moreover, point mutations and deletions of the gene are known causes of Pitt–Hopkins Syndrome, a neurodevelopmental disorder characterized by CNS phenotypes such as severe mental retardation, microcephaly, and epilepsy [Blake et al., 2010].
METHODS
Seeking Non-Synonymous Variants
Seeking non-synonymous mutations that might be responsible for association at TCF4, we downloaded data from the 1000 genomes project (http://www.1000genomes.org/page.php) which, at the time, contained data representing 112 haplotypes of European ancestry. Assuming complete mutation discovery, this dataset has power of 1.0 to detect a variant with a frequency of 0.05 (and power of 0.68 to detect an allele with a frequency of 0.01), and therefore makes in-house mutation scanning redundant for variants that are relatively common in the population.
Looking for cis-Acting Effects on Expression
To screen TCF4 for evidence for cis-eQTLs, we applied a method that allows estimation of the relative expression of each copy of the gene within individual subjects [Bray et al., 2003]. The principle of the approach is that where individuals are heterozygous for an unknown cis-eQTL, the copies of the gene will be unequally expressed, whereas in the absence of such a variant, the transcripts from each chromosome will be equally expressed. This method is particularly sensitive for detecting cis-eQTLs as the "within subject" comparison controls for trans-acting factors (e.g., drug exposure, disease status) and most sources of noise (total amount of RNA, brain pH, agonal state, other causes of RNA degradation) that have to be controlled for in studies of total mRNA.
To measure relative expression levels within subjects, we identified samples heterozygous for a SNP located within an exon of TCF4. In those heterozygotes, we then assayed the relative expression levels of mRNA containing each of the exonic alleles using the SNaPshot system (Applied Biosystems, Foster City, CA) using methods we have published before [Bray et al., 2003, 2005].
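The power figures quoted above for the 1000 genomes data can be checked with a short calculation. The following is a minimal sketch, assuming (as the text does) complete mutation discovery and, in addition, that the 112 haplotypes are sampled independently; the function name is illustrative and is not part of the original analysis. Under these assumptions, the probability that an allele of population frequency f is seen at least once among n haplotypes is 1 − (1 − f)^n.

```python
def detection_power(allele_freq: float, n_haplotypes: int) -> float:
    """Probability of observing an allele at least once among n haplotypes.

    Assumes independent sampling of haplotypes and perfect variant calling.
    """
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele_freq must lie in [0, 1]")
    if n_haplotypes < 0:
        raise ValueError("n_haplotypes must be non-negative")
    # P(at least one copy) = 1 - P(no copies in any of the n haplotypes)
    return 1.0 - (1.0 - allele_freq) ** n_haplotypes


if __name__ == "__main__":
    # 112 European-ancestry haplotypes, as stated in the text
    for freq in (0.05, 0.01):
        print(f"frequency {freq:.2f}: power {detection_power(freq, 112):.2f}")
```

Rounded to two decimal places, this gives 1.00 for a frequency of 0.05 and 0.68 for a frequency of 0.01, in agreement with the values stated in the text.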
To identify the presence of cis-eQTLs, we normalized the relative expression of alleles observed in the cDNA by that observed when the same assay was applied to genomic DNA from heterozygotes. This controls for technical biases that can lead to preferential representation of one allele even when both alleles are present at a 1:1 ratio (the allelic ratio expected in people who are heterozygous and do not have a copy number variant at the site). Analysis of heterozygous samples was performed as two separate experiments. In each experiment, cDNA from two separate reverse transcription reactions was assayed for each heterozygous individual alongside the corresponding genomic DNA sample. For comparisons of allelic expression differences stratified by genotype at rs9960767, it is not necessary to normalize relative allele measurement for cDNA by that from genomic DNA, as any tendency to over-estimate expression of one allele applies regardless of genotype at rs9960767 (although we note that, as expected, similar results were obtained if we did normalize). Allelic expression analysis was based on the exonic SNP rs8766, located within the last coding exon of TCF4 transcript NM_001083962.1.
Statistics
Differences in allelic expression were tested by comparing genomic ratios with cDNA ratios from the same heterozygous samples. Group comparisons were analysed by paired t-test; all tests are two-tailed. To test whether the clinical status of the individual from whom the sample was obtained, or the brain region from which it was extracted, influenced the results, a univariate analysis of variance was performed in which the allelic ratio was the dependent variable, sample type (gDNA or cDNA) was a fixed factor, and diagnosis and brain region were entered as covariates. All statistical analyses were performed using SPSS v16.
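The normalization and paired comparison described above can be illustrated with a minimal sketch. It assumes that each measurement has already been reduced to a G:A allelic ratio and that the correction factor is the mean genomic DNA ratio across heterozygotes; the text does not specify whether a pooled or per-sample factor was used, so this choice is an assumption. The function names and the numbers are hypothetical illustrations, not study data, and the original analysis was run in SPSS rather than in code of this kind.

```python
from math import sqrt
from statistics import mean, stdev


def normalize(cdna_ratios, gdna_ratios):
    """Divide each cDNA allelic ratio by the mean gDNA ratio.

    The mean gDNA ratio stands in for the assay's reading of a true 1:1 ratio
    (assumption: a pooled correction factor rather than a per-sample one).
    """
    correction = mean(gdna_ratios)
    return [ratio / correction for ratio in cdna_ratios]


def paired_t(xs, ys):
    """Return the paired t statistic and degrees of freedom for matched samples."""
    diffs = [x - y for x, y in zip(xs, ys)]
    n = len(diffs)
    return mean(diffs) / (stdev(diffs) / sqrt(n)), n - 1


# Made-up G:A ratios for four heterozygous individuals (illustrative only)
gdna = [1.10, 1.08, 1.12, 1.10]
cdna = [1.14, 1.10, 1.16, 1.13]

normalized = normalize(cdna, gdna)
t_stat, df = paired_t(cdna, gdna)
print([round(r, 3) for r in normalized], round(t_stat, 2), df)
```

A normalized ratio close to 1 indicates equal expression of the two alleles. The t statistic would then be referred to a two-tailed t distribution with the stated degrees of freedom.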
Brain Samples
mRNA derived from post-mortem brains (frontal, cortical, parietal or temporal cortex) of 148 unrelated anonymous individuals was available for analysis. Samples had been obtained from three reputable tissue sources (The MRC London Neurodegenerative Diseases Brain Bank, London, UK; The Stanley Medical Research Institute Brain Bank, Chevy Chase, MD; The Karolinska Institute, Stockholm, Sweden). Of these, 78 individuals had received no psychiatric or neurological diagnosis at the time of death, 22 had a diagnosis of Alzheimer’s disease, 18 a diagnosis of schizophrenia, 15 a diagnosis of bipolar disorder and 15 a diagnosis of major depression. The nature of the performed assay (all measures of relative expression are within individual) means that the disease status is not expected to confound our analysis. Genomic DNA was extracted by phenol–chloroform procedures. Total RNA was extracted using the RNAwiz™ isolation reagent (Ambion, Warrington, UK) and then treated with DNase (Ambion). Reverse transcription was performed using random decamers (Ambion) and SuperScript III (Invitrogen, Paisley, UK).
RESULTS
Non-Synonymous Variants
We found no non-synonymous variants in the 1000 genomes project data at TCF4.
Expression Analysis
In the total sample of successfully genotyped individuals (n = 138), the genotype frequency distribution for SNP rs8766 was AA (0.49), AG (0.36), and GG (0.13), 50 individuals being heterozygous and therefore informative for allelic expression analysis. The assay showed good reproducibility, with average coefficients of variation (SD/mean) of 0.02. Samples for which assay quality was poor (coefficients of variation >0.2) were removed prior to analysis. No large deviations (>20%) from equal expression were observed, implying that if cis-acting variants exist that substantially affect the expression of TCF4 in the tissues studied, they are too rare for a single heterozygote to be observed in 50 informative individuals.
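The quality filter and the threshold for a large deviation described above can be sketched as follows. This is an illustration only: it assumes that the coefficient of variation is computed across replicate normalized ratios for one individual, and that a deviation of more than 20% means one allele being expressed at more than 1.2 times the level of the other, in either direction. The names and example values are hypothetical.

```python
from statistics import mean, stdev

CV_MAX = 0.2          # samples with a higher coefficient of variation are dropped
FOLD_THRESHOLD = 1.2  # one allele expressed >20% more than the other


def coefficient_of_variation(values):
    """SD divided by mean, as defined in the text."""
    return stdev(values) / mean(values)


def screen_sample(replicate_ratios):
    """Classify one individual's replicate allelic ratios (assumed normalized)."""
    if coefficient_of_variation(replicate_ratios) > CV_MAX:
        return "excluded"
    avg = mean(replicate_ratios)
    # Check both directions: G over A, or A over G
    if avg > FOLD_THRESHOLD or avg < 1.0 / FOLD_THRESHOLD:
        return "large deviation"
    return "balanced"


# Made-up replicate ratios (illustrative only)
print(screen_sample([1.02, 1.04, 1.03, 1.03]))
print(screen_sample([1.30, 1.28, 1.32, 1.30]))
print(screen_sample([0.60, 1.40, 1.00, 1.00]))
```

The first call falls within the balanced range, the second exceeds the 1.2-fold threshold, and the third is excluded because its replicates disagree too strongly.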
Although we were unable to detect large deviations in expression, our data were consistent with those of Buonocore and colleagues [Buonocore et al., 2010], in that overall we observed a modest over-expression (3%) of the G allele of rs8766, although this was not specific to a particular brain region or diagnostic group (univariate analysis of variance: P = 0.72 and 0.86, respectively).
To determine whether the GWS variant rs9960767 at TCF4 is a cis-eQTL, or is in strong LD with such a variant, we stratified the allelic expression results by genotype at that site. For this analysis, 48 individuals were successfully genotyped for SNP rs9960767; 41 of these were homozygous (all ancestral AA genotype) and 7 were heterozygous (AC genotype). The expectation is that if rs9960767 is a proxy for (or is itself) an eQTL, subjects who are heterozygous at this SNP will show greater unequal expression of alleles than homozygotes, since heterozygotes will tend to carry two functionally distinct eQTL alleles whereas homozygotes will carry two functionally equivalent alleles. However, this pattern was not observed: heterozygotes and homozygotes showed similar levels of relative allelic expression, whether defined directionally (i.e., the amount of allele G over allele A; t-test P = 0.10; Fig. 1) or in terms of the magnitude of the deviation from equal relative expression of one allele to another (test of equal variance P = 0.94). The latter is the more appropriate analysis where there is weak or no LD between a putative cis-eQTL and the marker used to estimate relative expression, because the low and high expression alleles at the eQTL will not be in phase with a specific allele at the assayed locus, as is the case in the sample analyzed here (LD between rs8766 and rs9960767: D′ = 0.20, r² = 0.005).
FIG. 1. Expression of SNP alleles at rs8766 in samples homozygous (n = 41) and heterozygous (n = 7) for SNP rs9960767, defined by direction of effect. Allele G is plotted over allele A.
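For readers less familiar with the two LD measures quoted above, the following minimal sketch shows how D′ and r² are conventionally computed from two-locus haplotype and allele frequencies. The frequencies used are invented for illustration and are not estimates for rs8766 and rs9960767; the function name is likewise hypothetical. The example shows how a moderate D′ can coexist with a very small r² when allele frequencies at the two sites differ, which is why phase between a marker allele and an eQTL allele cannot be assumed.

```python
def ld_measures(p_a: float, p_b: float, p_ab: float):
    """Return (|D'|, r^2) for two biallelic loci.

    p_a  : frequency of allele A at locus 1
    p_b  : frequency of allele B at locus 2
    p_ab : frequency of the A-B haplotype
    Both loci must be polymorphic (frequencies strictly between 0 and 1).
    """
    d = p_ab - p_a * p_b  # raw disequilibrium coefficient
    if d > 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d != 0 else 0.0
    r_squared = d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d_prime, r_squared


# Invented frequencies: a common allele at one site, a rarer allele at the other
d_prime, r_squared = ld_measures(p_a=0.30, p_b=0.06, p_ab=0.03)
print(f"D' = {d_prime:.2f}, r^2 = {r_squared:.3f}")
```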
DISCUSSION
The aim of the present study was to determine whether the genome-wide significant finding at the TCF4 locus could be related to functional variation in the gene itself, this being a requirement for concluding that the association is the result of altered function of this gene per se rather than of a co-localized unknown functional element.
We first looked in silico for nonsynonymous variants at TCF4 that might explain the association at this locus using data from the 1000 genomes project, but none was found. Power considerations suggest that association at this locus is not due to a common coding variant that is present in the European population at a frequency of >5%, although in drawing this conclusion we should note the caveat that there are still technical difficulties in detecting small coding insertion/deletion polymorphisms, and it is therefore possible that such variants were missed by the 1000 genomes project.
We next sought to determine whether the associated SNP (rs9960767) might be associated by virtue of being itself a cis-eQTL for TCF4, or in LD with such an eQTL. Within the TCF4 transcript we chose the SNP rs8766 from dbSNP (http://www.ncbi.nlm.nih.gov/projects/SNP) with which to measure the relative allelic expression levels, this having been used to detect cis-acting influences on expression in a previous study of this locus, but in a sample too small to examine correlation between that phenomenon and the allele associated with schizophrenia [Buonocore et al., 2010]. We then stratified the data with respect to genotype at rs9960767, the variant showing strong evidence for association with schizophrenia. SNP rs8766 tags each of the two reference transcripts that are described in Entrez Gene (http://www.ncbi.nlm.nih.gov/gene).
These differ in their use of an alternative in-frame 3′ splice site in exon 17, resulting in a 12 base-pair truncation in the shorter isoform (long isoform a, NP_001077431; short isoform b, NP_003190).
The allelic expression data did not suggest the existence of common cis-eQTL alleles at this locus with moderately large effects on expression, defined here, as in our earlier work and that of others, as one allele being expressed 20% more than another. This may not be surprising given that hemizygosity for TCF4 results in Pitt–Hopkins syndrome [Blake et al., 2010], and strong evolutionary constraints on variants that have marked effects on expression are therefore expected. More importantly for the present study, heterozygous carriers of the schizophrenia risk allele at TCF4 did not show evidence for more unequal allelic expression at TCF4 than those who were homozygous at this locus, which strongly suggests that the schizophrenia-associated allele is not an eQTL for TCF4 or in strong LD with such an eQTL.
Much of what is known about TCF4 relates to its function in the immune system [Murre, 2005]. Understanding of its role in brain is still at an early stage, but during brain development bHLH proteins, of which TCF4 is one, modulate critical events in neuronal and glial progenitor cells, controlling the transition from proliferation to differentiation [Ross et al., 2003]. Given the likely developmental importance of TCF4, it may be that the associated variant tags an eQTL, but that the effect is only manifest at a particular developmental stage. We are unable to test that hypothesis, which would require large numbers of foetal brain samples at different developmental stages. It should also be noted that our study is unable to address the role of quantitative post-transcriptional effects, for example in the abundance of protein or its sub-cellular localization.
Finally, in addition to the two reference transcripts, a large number (n = 38) of potential transcripts are described in Aceview (http://www.humangenes.org/) based upon extensive alternative splicing and promoter use. These remain to be validated, but if they are genuine TCF4 transcripts, perturbation in splicing or regulation of a minor transcript is a possible mechanism through which variation at this locus might impact on disease.
In summary, the results of this study do not support the hypothesis that the genome-wide significant SNP in the vicinity of TCF4 exerts its effect either through cis-acting regulation of the TCF4 transcript or via a common non-synonymous mutation. The functional mechanism responsible for that association therefore remains to be uncovered.
ACKNOWLEDGMENTS
We are indebted to all individuals who have participated in, or helped with, our research. The research was supported by the Medical Research Council (UK), The Wellcome Trust, and NIMH (USA) CONTE: 2 P50 MH066392-05A1.
REFERENCES
Blake DJ, Forrest M, Chapman RM, Tinsley CL, O’Donovan MC, Owen MJ. 2010. TCF4, schizophrenia, and Pitt–Hopkins syndrome. Schizophr Bull 36(3):443–447.
Bray NJ, Buckland PR, Williams NM, Williams HJ, Norton N, Owen MJ, O’Donovan MC. 2003. A haplotype implicated in schizophrenia susceptibility is associated with reduced COMT expression in human brain. Am J Hum Genet 73(1):152–161.
Bray NJ, Preece A, Williams NM, Moskvina V, Buckland PR, Owen MJ, O’Donovan MC. 2005. Haplotypes at the dystrobrevin binding protein 1 (DTNBP1) gene locus mediate risk for schizophrenia through reduced DTNBP1 expression. Hum Mol Genet 14(14):1947–1954.
Buonocore F, Hill MJ, Campbell CD, Oladimeji PB, Jeffries AR, Troakes C, Hortobagyi T, Williams BP, Cooper JD, Bray NJ. 2010. Effects of cis-regulatory variation differ across regions of the adult human brain. Hum Mol Genet 19(22):4490–4496.
Dudbridge F, Gusnanto A. 2008. Estimation of significance thresholds for genomewide association scans. Genet Epidemiol 32(3):227–234.
Murre C. 2005. Helix-loop-helix proteins and lymphocyte development. Nat Immunol 6(11):1079–1086.
Purcell SM, Wray NR, Stone JL, Visscher PM, O’Donovan MC, Sullivan PF, Sklar P, Purcell Leader SM, Stone JL, Sullivan PF, et al. 2009. Common polygenic variation contributes to risk of schizophrenia and bipolar disorder. Nature 460:748–752.
Ross SE, Greenberg ME, Stiles CD. 2003. Basic helix-loop-helix factors in cortical development. Neuron 39(1):13–25.
Shi J, Levinson DF, Duan J, Sanders AR, Zheng Y, Pe’er I, Dudbridge F, Holmans PA, Whittemore AS, Mowry BJ, et al. 2009. Common variants on chromosome 6p22.1 are associated with schizophrenia. Nature 460:753–757.
Stefansson H, Ophoff RA, Steinberg S, Andreassen OA, Cichon S, Rujescu D, Werge T, Pietilainen OP, Mors O, Mortensen PB, et al. 2009. Common variants conferring risk of schizophrenia. Nature 460:744–747.
Williams HJ, Norton N, Dwyer S, Moskvina V, Nikolov I, Carroll L, Georgieva L, Williams NM, Morris DW, Quinn EM, et al. 2010. Fine mapping of ZNF804A and genome-wide significant evidence for its involvement in schizophrenia and bipolar disorder. Mol Psychiatry 16(4):429–441. DOI: 10.1038/mp.2010.36 (Epub ahead of print).