Uses of microarrays and related methodologies in animal breeding

Bruce Walsh, firstname.lastname@example.org
University of Arizona (Depts. of Ecology & Evolutionary Biology, Molecular & Cellular Biology, Plant Sciences, Animal Sciences, and Epidemiology & Biostatistics)

The basic idea behind gene expression arrays
• With a complete (or partial) genome sequence in hand, one can array sequences from genes of interest on a small chip, glass slide, or membrane
• mRNA is extracted from cells of interest and hybridized to the array
• Genes showing different levels of mRNA can be detected

Types of microarrays
• Synthetic oligonucleotide arrays
  – Chemically synthesize oligonucleotide sequences directly on the slide/chip/membrane (e.g., using photolithography)
  – Affymetrix, Agilent
• Spotted cDNA arrays
  – PCR products from clones of genes of interest are spotted on a glass slide using a robot
  – Extracted cellular mRNA is reverse-transcribed into cDNAs for hybridization

[Figure: two-color hybridization workflow]
• Extract mRNA from cell type 1 and from cell type 2
• Label one sample with red fluorescent dye (Cy5) and the other with green fluorescent dye (Cy3)
• Hybridize an equal mix of mRNA from cell types 1 and 2 to the array
• The color of each spot corresponds to the relative concentrations of mRNAs for that gene in the two cell types
• Spot legend: mRNAs from these genes more abundant in cell type 1; mRNAs from these genes more abundant in cell type 2; mRNAs from these genes of roughly equal abundance in both cell types

Analysis of microarray data
• Image processing and normalization
• Detecting significant changes in expression
• Clustering and classification
  – Clustering: detecting groups of co-expressed genes
  – Classification: finding those genes at which changes in mRNA expression level predict phenotype

Significance testing: GLM
$Y_{klij} = u + A_k + R_{kl} + T_i + G_j + TG_{ij} + e_{klij}$
  – $Y_{klij}$: expression measurement for the spotting of gene j under treatment i on replicate l of array k
  – $u$: overall mean
  – $A_k$: array k
  – $R_{kl}$: replicate l in array k
  – $T_i$: treatment i
  – $G_j$: gene j
  – $TG_{ij}$: interaction between gene j and treatment i
  – $e_{klij}$: residual

Problem of very many tests (genes) vs. few actual data vectors
• Expectation: a large number of the GxT interactions will be significant
  – Controlling the experiment-wide p value is overly conservative (further, tests may be strongly correlated)
• Generating a reduced set of genes for future consideration (data mining)
  – FDR (false discovery rate)
  – PFP (proportion of false positives)
  – Empirical Bayes approaches

Which loci control array-detected changes in mRNA expression?
• Cis-acting factors
  – Control regions immediately adjacent to the gene
• Trans-acting factors
  – Diffusible factors unlinked (or loosely linked) to the gene of interest
• Global (master) regulators
  – Trans-acting factors that influence a large number of genes

David Threadgill’s (UNC) mouse experiment
• Recombinant inbred lines from a cross of DBA/2J and C57BL
• The level of mRNA expression (measured by array analysis) is treated as a quantitative trait, and QTL analysis is performed for each gene in the array

[Figure: distribution of >12,000 gene interactions. Axes: genomic location of genes on array vs. genomic location of mRNA-level modifiers. Labeled features: cis-modifiers, trans-modifiers, master modifiers.]

Candidate loci: differences in gene expression between lines
• Correlate differences in levels of expression with trait levels
• Map factors underlying changes in expression
  – These are (very) often trans-acting factors
• Difference between structural alleles and regulatory alleles
  – Different structural alleles may go undetected in an array analysis

Expanded selection opportunities offered by microarrays
• GxE
  – Candidate genes may be suggested by examining levels of mRNA expression over different major environments
  – With candidates in hand, potential for selection of genes showing reduced variance in expression over critical environments
• Breaking (or at least reducing) potentially deleterious genetic correlations
  – Look for variation in genes that have little (if any) trans-acting effects on other genes

Towards the future
• Selection decisions using information on gene networks / pathways
• Microarrays are one tool for reconstructing gene networks
• Additional tools for examining protein-protein interactions
  – Two-hybrid screens
  – FRET & FRAP
  – 2D protein gels

Analysis and exploitation of gene and metabolic networks
• Graph theory
• Most estimation and statistical issues unresolved
• Major (current) analytic tool: Kacser-Burns sensitivity analysis

Gene networks are graphs

Kacser-Burns sensitivity analysis (a.k.a. metabolic control analysis)
“All models are wrong, some models are useful” (Box)
“No theory should fit all the facts because some of the facts are wrong” (N. Bohr)

[Figure: pathway with enzymes e1 through e4 producing product F]
• Flux = production rate of a particular product, here F
• How best to increase the flux through this pathway?
• Perhaps we increase the concentration of e1
• However, it may be more efficient to increase the concentration of e4
• The flux control coefficient, introduced by Kacser and Burns, provides a quantitative solution to this problem

Flux control coefficients, C
The control coefficient for the flux at step i in a pathway associated with enzyme j is
$C_i^j = \frac{\partial F_i}{\partial E_j} \, \frac{E_j}{F_i} = \frac{\partial \ln F_i}{\partial \ln E_j}$
Roughly speaking, the control coefficient is the percentage change in flux divided by the percentage change in enzyme activity.
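A minimal numerical sketch of the control coefficient definition above, in Python. The saturating flux curve `flux`, its parameters `f_max` and `k_half`, and the helper `control_coefficient` are illustrative assumptions and do not come from the slides. The coefficient is estimated as a central-difference approximation to ∂ln F / ∂ln E, i.e. the percentage change in flux per percentage change in enzyme activity.

```python
import math


def flux(enzyme_activity, f_max=1.0, k_half=1.0):
    """Hypothetical saturating flux curve F(E) = f_max * E / (k_half + E).

    This functional form is an assumption chosen for illustration only.
    """
    return f_max * enzyme_activity / (k_half + enzyme_activity)


def control_coefficient(flux_fn, enzyme_activity, rel_step=1e-6):
    """Central-difference estimate of d ln F / d ln E at the given activity."""
    hi = enzyme_activity * (1.0 + rel_step)
    lo = enzyme_activity * (1.0 - rel_step)
    d_ln_flux = math.log(flux_fn(hi)) - math.log(flux_fn(lo))
    d_ln_activity = math.log(hi) - math.log(lo)
    return d_ln_flux / d_ln_activity


if __name__ == "__main__":
    # For this assumed curve the exact value is k_half / (k_half + E).
    for activity in (0.01, 1.0, 100.0):
        c = control_coefficient(flux, activity)
        print(f"E = {activity:g}: C is approximately {c:.3f}")
```

Under this assumed curve the estimate can be checked against the closed form k_half / (k_half + E); a real pathway would need measured or modeled fluxes in place of `flux`.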
[Figure: flux (vertical axis) plotted against enzyme activity (horizontal axis)]
• Why many mutations are recessive: a 50% reduction in activity (the heterozygote) results in only a very small change in the flux
• When the activity of E is large, C is close to zero
• When the activity of E is near zero, C is close to 1

Kacser-Burns flux summation theorem
$\sum_j C_i^j = 1$
(the sum runs over all enzymes j affecting flux i)
• Coefficients are not intrinsic properties of an enzyme, but rather a (local) system property
• While most values of C for proteins are positive, negative regulators (repressors) give negative values, allowing for C values > 1
• If a control coefficient is greatly increased in value, this decreases the values of other control coefficients
• Truly “rate-limiting” steps in pathways are rare

Small-Kacser theorem
The factor f by which flux is increased by an r-fold increase in the activity of E is
$f = \frac{1}{1 - \frac{r-1}{r} C}$
where C is the flux control coefficient of E. Hence, the limiting increase in f (as r becomes very large) is
$f = \frac{1}{1 - C}$

Using estimated control coefficients as selection aids
• Loci with larger C values should respond faster to selection
• Such loci are obvious targets for screens of natural variation (candidate loci)
• Selection with reduced correlations
  – Tallis or Kempthorne-Nordskog restricted selection index
  – Select on loci with large C for the flux of interest and the smallest C for other fluxes not of concern
  – Positive selection on C for the flux of interest, selection to reduce flux changes in other pathways

[Figure: branched pathway with enzymes e1 through e4. F is the flux we wish to increase; H is the flux we wish to remain unchanged.]
• The initial approach might be to try either e3 or e4, rather than e1 or e2
• A more correct approach, however, is to pick the step(s) that maximize $C_F$ while minimizing $C_H$

Index selection on pathways
• The elements of selection include both phenotype and C, and (possibly) markers as well
• Problems:
  – C is a local estimate, changing as the pathway evolves
  – Still have all the standard concerns with a selection index (e.g., stability of the inverse of the genetic covariance matrix)
  – These are important caveats to consider even under the rosy scenario where all C’s are known

What to call it?
MAS = Marker Assisted Selection
CAS = Control Coefficient Assisted Selection
CASH $ = Control Activity Selection Helper

Summary
• Microarray analysis = data mining
• Potential (immediate) usage:
  – Suggesting candidate loci
  – More efficient use of GxE
  – Reducing/breaking deleterious correlations
• Cis (easy) vs. trans (hard) control of expression levels
• Future = analysis of pathways
  – Index selection (and all its problems)

Farewell from the “desert” U of A Campus