doi: 10.1111/ppe.12418

Commentary

Genetic Association Family-Based Studies and Preeclampsia

Claire Infante-Rivard

Department of Epidemiology, Biostatistics and Occupational Health, Faculty of Medicine, McGill University, Montreal, QC, Canada

Preeclampsia has been termed a disease of theories; the latter have evolved over the years but the condition remains of uncertain aetiology. Preeclampsia is now considered a syndrome with different phenotypes (mild, severe, early onset, late onset, maternal, placental, recurrent).1 Overall prevalence figures range from <1% to 9%. Together, these observations (disease heterogeneity, complex and diverse mechanisms, and relatively low frequency) pose a challenge for the conduct of robust genetic and epidemiological studies. Despite these challenges, the weight of the consequences justifies additional studies: for the foetus these could be, for example, preterm delivery, foetal growth restriction, respiratory problems, with possible premature cardiovascular disease later in life. Moreover, it is believed that there are future cardiovascular and renal consequences for women with a history of preeclampsia.2,3

Preeclampsia aggregates in families. One relevant question is how much of the familial aggregation can be attributed to genetic causes. Heritability is a parameter measured to answer this question. It has been reported to be approximately 55%, with 30–35% due to the mother and 20% to the foetus.4 Heritability is computed from phenotypic data and can vary according to studies; nevertheless, here, it provides a strong enough indication to pursue a genetic line of investigation for preeclampsia.

In this issue of the journal, Bauer et al.5 explore the role of candidate genes and their polymorphisms (single-nucleotide polymorphisms, SNP) on the development of preeclampsia. The study involved a considerable amount of work and expertise. Nevertheless, one may argue that the conclusions seem quite limited.
In particular, one has to consider whether using a case and control duo design to study genetic associations is worth the additional effort. Other more traditional study aspects may also have contributed to the apparent paucity of findings.

Correspondence: Claire Infante-Rivard, Department of Epidemiology, Biostatistics and Occupational Health, Faculty of Medicine, McGill University, Montreal, Canada. Email: firstname.lastname@example.org
© 2017 John Wiley & Sons Ltd. Paediatric and Perinatal Epidemiology, 2017

Among the many possible mechanistic pathways for preeclampsia, two were selected: the nitric oxide and the heme oxygenase pathways. Known polymorphic genes involved in these pathways were genotyped. However, most of the selected variants were not among those frequently studied since 1990, and those that were (e.g. NOS3) had not been found to be associated with preeclampsia.6 The mechanistically plausible thrombophilia (in particular the F5 and F2 genes) and angiogenic pathways would have seemed more promising.

The main contribution of the Bauer et al. study5 rests on the use of a family-based design consisting of dyads defined as case duos (mothers with preeclampsia and their offspring) as well as control duos (unaffected mothers and their offspring). Whereas pedigree studies measure linkage (transmission within families), both case–control and family-based association studies evaluate the association between a phenotypic condition and some SNP (transmission across families). On the other hand, family-based studies usually involve more subjects to ascertain and more measures to take than a case–control study, which necessarily affects feasibility and cost. Therefore, their advantages need to be considered.

Among the family-based association studies, the case-parent trio has the most advantages.
In reference to the above distinction between linkage and association, trio studies can test the composite null hypothesis of linkage and association. However, their main advantage is with respect to internal validity, as they are robust against the principal source of confounding in genetic association studies, namely population structure. Population structure can be understood as an ethnic/genetic background that causes both the distribution of the exposure (the studied SNP) and the outcome. In the case–control study, the expected counts under the null are computed assuming the distribution of alleles is the same in cases and controls, and therefore, the expected counts in the cases are derived from the common distribution. In case-parent trios, by contrast, the expected counts are derived from the parents’ genotypes based on Mendel’s laws. The parents’ non-transmitted alleles serve as controls for the affected offspring. Therefore, there can be no confounding from differences in ethnic background between the affected offspring and his/her controls because these are the non-transmitted alleles from the parents. Assuming no measurement error or selection bias, the risk estimates in trio studies are causal because the alleles transmitted to the offspring (i.e. the exposure) are assumed perfectly randomised according to Mendel’s laws. On the other hand, a comparison of unrelated cases and controls is always susceptible to bias from population structure.

Unfortunately, the population structure bias can also affect the case-mother/control-mother duos as used in Bauer et al.5 It has been shown that case duos will not maintain nominal type I error for the genotype relative risk estimates if there is population stratification.7 Adding different types of controls to help estimate the mating type frequencies (defined below), as done in this study, does not necessarily help, as differences in these frequencies arise between (here) mothers of cases and mothers of controls.
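The contrast drawn above between case–control and case-parent trio comparisons can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the analysis of Bauer et al.: the two subpopulations, their allele frequencies and disease risks, and the function name `simulate_stratification` are all hypothetical, and the variant is given no effect on disease. The transmitted versus non-transmitted comparison is computed in the style of a transmission disequilibrium test, which is swapped in here for simplicity in place of the log-linear model.

```python
import random


def simulate_stratification(n_families=20000, seed=1):
    """Simulate trios from two subpopulations in which the variant has NO effect.

    Hypothetical settings: each subpopulation has its own variant allele
    frequency and its own disease risk, so ancestry confounds the comparison
    of unrelated cases and controls.
    """
    rng = random.Random(seed)
    # (variant allele frequency, disease risk); risk ignores genotype
    subpopulations = [(0.1, 0.02), (0.5, 0.08)]
    case_alleles = [0, 0]      # [variant alleles, all alleles] among cases
    control_alleles = [0, 0]   # same tally among unaffected offspring
    transmitted = 0            # variant passed by heterozygous parents of cases
    untransmitted = 0          # variant not passed by those parents

    for freq, risk in subpopulations:
        for _ in range(n_families):
            child_variants = 0
            het_parent_passed = []
            for _parent in range(2):
                a = rng.random() < freq
                b = rng.random() < freq
                passed = a if rng.random() < 0.5 else b  # Mendelian transmission
                child_variants += int(passed)
                if a != b:  # only heterozygous parents are informative
                    het_parent_passed.append(passed)
            affected = rng.random() < risk
            tally = case_alleles if affected else control_alleles
            tally[0] += child_variants
            tally[1] += 2
            if affected:
                for passed in het_parent_passed:
                    if passed:
                        transmitted += 1
                    else:
                        untransmitted += 1

    p_case = case_alleles[0] / case_alleles[1]
    p_control = control_alleles[0] / control_alleles[1]
    odds_ratio = (p_case / (1 - p_case)) / (p_control / (1 - p_control))
    informative = transmitted + untransmitted
    return {
        "allelic_odds_ratio": odds_ratio,
        "transmitted_fraction": transmitted / informative,
        "tdt_chi_square": (transmitted - untransmitted) ** 2 / informative,
    }


if __name__ == "__main__":
    for name, value in simulate_stratification().items():
        print(f"{name}: {value:.3f}")
```

Under these assumed values the allelic odds ratio comparing unrelated cases and controls is expected to be near 1.7 even though genotype has no effect on risk, whereas the fraction of variant alleles transmitted by heterozygous parents of cases should stay close to one half.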
Therefore, case and control duos are not as robust to population stratification as case-parent trios. It may be difficult to obtain DNA samples from fathers, but the birth period is possibly the easiest time to do so and the additional effort may be worthwhile. This is particularly relevant given that the trio design requires fewer samples than the case and control duo design and would appear to have some important advantages. Alternatively, there are analytical methods to detect population structure in association studies and to control for it in the analysis (e.g. with principal components analysis); unfortunately, not all control methods can be applied with candidate gene studies because neutral SNP have not been genotyped. Despite a solid quality control assessment, this limitation was applicable to the Bauer et al. study; however, other analyses were not indicative of a population structure bias.

With case-parent trios, the log-linear model (Poisson regression) is the prevalent model of analysis. Genotype relative risks for maternal and offspring genotypes are estimated using the paternal genotype and alternative control genotypes, if included in hybrid designs,8 to estimate the mating type frequencies. These frequencies are the possible combinations of the mother, father and child genotypes. This additional information (mating type frequency, allele frequency) in the log-linear model probably gives it more power and greater efficiency than, say, logistic regression.7 The Bauer et al. study used this analysis model with its advantages, although case duos are generally not as powerful as case trios. However, including both mother and child genotypes in the model, as opposed to a separate analysis of these genotypes, can address confounding of the foetal genotype by the maternal genotype.
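The genotype combinations that the log-linear model stratifies on can be enumerated directly. The sketch below assumes the usual trio formulation in which a mating type is the pair of parental genotypes (coded 0, 1 or 2 copies of the variant allele) and the child genotype follows Mendel's laws within it; the function names and the relative risk parameters `r1` and `r2` are hypothetical and serve illustration only, not a reconstruction of the model fitted by Bauer et al.

```python
from fractions import Fraction
from itertools import product


def transmit(genotype):
    """Probability that a parent with `genotype` variant copies passes on the variant."""
    return Fraction(genotype, 2)


def trio_cells():
    """Map each possible (mother, father, child) genotype cell to its Mendelian
    probability of the child genotype given the parental pair."""
    cells = {}
    for mother, father in product(range(3), repeat=2):
        pm, pf = transmit(mother), transmit(father)
        child_probs = {
            0: (1 - pm) * (1 - pf),
            1: pm * (1 - pf) + (1 - pm) * pf,
            2: pm * pf,
        }
        for child, prob in child_probs.items():
            if prob > 0:
                cells[(mother, father, child)] = prob
    return cells


def case_child_distribution(mother, father, r1, r2):
    """Expected child-genotype distribution among affected offspring within one
    mating type, for assumed relative risks r1 (one copy) and r2 (two copies)."""
    risk = {0: 1, 1: r1, 2: r2}
    cells = trio_cells()
    weights = {
        child: cells[(mother, father, child)] * risk[child]
        for child in range(3)
        if (mother, father, child) in cells
    }
    total = sum(weights.values())
    return {child: weight / total for child, weight in weights.items()}


if __name__ == "__main__":
    cells = trio_cells()
    unordered = {tuple(sorted((m, f))) for m, f, _ in cells}
    print(len(cells), "cells;", len(unordered), "unordered mating types")
    print(case_child_distribution(1, 1, r1=2, r2=4))
```

Conditioning on the mating type is what removes the dependence on population allele frequencies: within each parental pair, only the Mendelian probabilities and the relative risks determine which child genotypes are expected among cases.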
Covariate main effects can be estimated using the log-linear model in duo designs, whereas the case-parent trio does not readily lend itself to the inclusion of additional non-genetic covariates. Nevertheless, this is not as straightforward as in logistic regression. The present paper did not analyse non-genetic covariates together with genetic factors in the log-linear model.

The Mendelian inheritance ratio assumption is the basis of many genetic analyses. Departures from it can be adjusted for with a transmission ratio distortion offset or the inclusion of control trios in the log-linear model.9,10 Such adjustment is exceedingly uncommon in genetic association studies, possibly because there is scepticism about the existence of these distortions (but, of course, they will not be observed if not investigated). In this study, control duos were mainly included to increase power and obtain better parameter estimation.

The family-based association studies provide an opportunity to study genetic aspects such as parent-of-origin effects and maternal–foetal genotype interactions.7 This is particularly relevant with preeclampsia. Interaction as mismatch between maternal and foetal genotypes can have a negative impact (e.g. Rh incompatibility). A preliminary analysis of interaction was done and results were negative.5 Despite a large study, power probably remained an issue.

Study replication, rarely done in non-genetic epidemiological studies, was carried out here. Despite similar allele frequencies in the original and replication samples, results remain unconfirmed.5 Whether this was because of a different analysis, sample size, or lack of transportability remains to be understood.

Finally, the study5 is not without a potential for selection bias (at entry and on follow-up in the original cohort). Conditioning on initial and sustained participation could have produced spurious associations.
This is a distinct issue from missing genotypes among sample subjects, which are reasonably assumed missing at random conditional on disease status and observed genotypes and imputed using an EM algorithm in the software used for the analysis.

In summary, the design, the analysis, and the quality control features are very strong in the Bauer et al. study. Therefore, the largely null results may suggest that the SNP in the selected pathways do not in effect play an important role in preeclampsia. However, a future confirmatory study could benefit, including for internal validity, from collecting paternal DNA (ideally from fathers of case and control offspring) to use in a related and feasible design such as trios.

About the author

Claire Infante-Rivard is Professor of Epidemiology at McGill University in Montréal, Canada. Her work focuses on environmental and genetic factors as they relate to childhood diseases, in particular childhood leukemia and other cancers, and adverse pregnancy outcomes. She is board-certified in community medicine. Dr. Infante-Rivard serves on the editorial board of Paediatric and Perinatal Epidemiology.

References

1 Umesawa M, Kobashi G. Epidemiology of hypertensive disorders in pregnancy: prevalence, risk factors, predictors and prognosis. Hypertension Research 2017; 40:213–220.
2 Sones JL, Davisson RL. Preeclampsia, of mice and women. Physiological Genomics 2016; 48:565–572.
3 Grandi SM, Vallee-Pouliot K, Reynier P, Eberg M, Platt RW, Arel R, et al. Hypertensive disorders in pregnancy and the risk of subsequent cardiovascular disease. Paediatric and Perinatal Epidemiology 2017; 31:412–421.
4 Cnattingius S, Reilly M, Pawitan Y, Lichtenstein P. Maternal and fetal genetic factors account for most of familial aggregation of preeclampsia: A population-based Swedish cohort study.
American Journal of Medical Genetics 2004; 130A:365–371.
5 Bauer AE, Avery CL, Shi M, Weinberg CR, Olshan AF, Harmon QE, et al. A family-based study of carbon monoxide and nitric oxide signaling genes and preeclampsia. Epub ahead of print.
6 Staines-Urias E, Paez MC, Doyle P, Dudbridge F, Serrano NC, Ioannidis JPA, et al. Genetic association studies in pre-eclampsia: systematic meta-analyses and field synopsis. International Journal of Epidemiology 2012; 41:1764–1775.
7 Ainsworth HF, Unwin J, Jamison DL, Cordell HJ. Investigation of maternal effects, maternal-fetal interactions and parent-of-origin effects (imprinting), using mothers and their offspring. Genetic Epidemiology 2011; 35:19–45.
8 Infante-Rivard C, Mirea L, Bull SB. Combining case-control and case-trio data from the same population in genetic association analyses: overview of approaches and illustration with a candidate gene study. American Journal of Epidemiology 2009; 170:657–664.
9 Huang LO, Labbe A, Infante-Rivard C. Transmission ratio distortion: review of concept and implications for genetic association studies. Human Genetics 2013; 132:245–263.
10 Huang LO, Infante-Rivard C, Labbe A. Analysis of case-parent trios using a log-linear model with adjustment for transmission ratio distortion. Frontiers in Genetics 2016; 7: Article 155. doi:10.3389/fgene.2016.00155.