Basic Experimental Design
Larry V. Hedges
Northwestern University
Prepared for the IES Summer Research Training Institute, July 8, 2008

What is Experimental Design?
Experimental design includes both
• Strategies for organizing data collection
• Data analysis procedures matched to those data collection strategies
Classical treatments of design stress analysis procedures based on the analysis of variance (ANOVA)
Other analysis procedures, such as those based on hierarchical linear models or analysis of aggregates (e.g., class or school means), are also appropriate

Why Do We Need Experimental Design?
Because of variability
We wouldn't need a science of experimental design if
• all units (students, teachers, and schools) were identical and
• all units responded identically to treatments
We need experimental design to control variability so that treatment effects can be identified

A Little History
The idea of controlling variability through design has a long history
In 1747 Sir James Lind's studies of scurvy
Their cases were as similar as I could have them. They all in general had putrid gums, spots and lassitude, with weakness of their knees. They lay together on one place … and had one diet common to all (Lind, 1753, p. 149)
Lind then assigned six different treatments to groups of patients

A Little History
The idea of random assignment was not obvious and took time to catch on
In 1648 van Helmont carried out one randomization in a trial of bloodletting for fevers
In 1904 Karl Pearson suggested matching and alternation in typhoid trials
Amberson, et al. (1931) carried out a trial with one randomization
In 1937 Sir Bradford Hill advocated alternation of patients in trials rather than randomization
Diehl, et al.
(1938) carried out a trial that is sometimes referred to as randomized, but it actually used alternation

A Little History
The first modern randomized clinical trial in medicine is usually considered to be the trial of streptomycin for treating tuberculosis
It was conducted by the British Medical Research Council in 1946 and reported in 1948

A Little History
Experiments have been used longer in the behavioral sciences (e.g., psychophysics: Peirce and Jastrow, 1885)
Experiments conducted in laboratory settings were widely used in educational psychology (e.g., McCall, 1923)
• Thorndike (early 1900s)
• Lindquist (1953)
• Gage field experiments on teaching (1978–1984)

A Little History
Studies in Crop Variation I–VI (1921–1929)
In 1919 a statistician named Fisher was hired at Rothamsted agricultural station
They had a lot of observational data on crop yields and hoped a statistician could analyze it to find effects of various treatments
All he had to do was sort out the effects of confounding variables

Studies in Crop Variation I (1921)
Fisher does regression analyses (lots of them) to study (and get rid of) the effects of confounders
• soil fertility gradients
• drainage
• effects of rainfall
• effects of temperature and weather, etc.
Fisher does qualitative work to sort out anomalies
Conclusion: The effects of confounders are typically larger than those of the systematic effects we want to study

Studies in Crop Variation II (1923)
Fisher invents
• Basic principles of experimental design
• Control of variation by randomization
• Analysis of variance

Studies in Crop Variation IV and VI
Studies in Crop Variation IV (1927): Fisher invents analysis of covariance to combine statistical control and control by randomization
Studies in Crop Variation VI (1929): Fisher refines the theory of experimental design, introducing most other key concepts known today

Our Hero in 1929

Principles of Experimental Design
Experimental design controls background variability so that systematic effects of treatments can be observed
Three basic principles
1. Control by matching
2. Control by randomization
3. Control by statistical adjustment
Their importance is in that order

Control by Matching
Known sources of variation may be eliminated by matching
• Eliminating genetic variation: compare animals from the same litter of mice
• Eliminating district or school effects: compare students within districts or schools
However, matching is limited
• matching is only possible on observable characteristics
• perfect matching is not always possible
• matching inherently limits generalizability by removing (possibly desired) variation

Control by Matching
Matching ensures that groups compared are alike on specific known and observable characteristics (in principle, everything we have thought of)
Wouldn't it be great if there were a method of making groups alike on not only everything we have thought of, but everything we didn't think of too?
There is such a method

Control by Randomization
Matching controls for the effects of variation due to specific observable characteristics
Randomization controls for the effects of all (observable or non-observable, known or unknown) characteristics
Randomization makes groups equivalent (on average) on all variables (known and unknown, observable or not)
Randomization also gives us a way to assess whether differences after treatment are larger than would be expected due to chance

Control by Randomization
Random assignment is not assignment with no particular rule. It is a purposeful process
Assignment is made at random. This does not mean that the experimenter writes down the names of the varieties in any order that occurs to him, but that he carries out a physical experimental process of randomization, using means which shall ensure that each variety will have an equal chance of being tested on any particular plot of ground (Fisher, 1935, p. 51)

Control by Randomization
Random assignment of schools or classrooms is not assignment with no particular rule. It is a purposeful process
Assignment of schools to treatments is made at random.
This does not mean that the experimenter assigns schools to treatments in any order that occurs to her, but that she carries out a physical experimental process of randomization, using means which shall ensure that each treatment will have an equal chance of being tested in any particular school (Hedges, 2007)

Control by Statistical Adjustment
Control by statistical adjustment is a form of pseudomatching
It uses statistical relations to simulate matching
Statistical control is important for increasing precision but should not be relied upon to control biases that may exist prior to assignment
Statistical control is the weakest of the three experimental design principles because its validity depends on knowing a statistical model for responses

Using Principles of Experimental Design
You have to know a lot (be smart) to use matching and statistical control effectively
You do not have to be smart to use randomization effectively
But where all are possible, randomization is not as efficient (requires larger sample sizes for the same power) as matching or statistical control

Basic Ideas of Design: Independent Variables (Factors)
The values of independent variables are called levels
Some independent variables can be manipulated, others can't
Treatments are independent variables that can be manipulated
Blocks and covariates are independent variables that cannot be manipulated
These concepts are simple, but are often confused
Remember: You can randomly assign treatment levels but not blocks

Basic Ideas of Design (Crossing)
Relations between independent variables
Factors (treatments or blocks) are crossed if every level of one factor occurs with every level of another factor
Example
The Tennessee class size experiment assigned students to one of three class size conditions.
All three treatment conditions occurred within each of the participating schools
Thus treatment was crossed with schools

Basic Ideas of Design (Nesting)
Factor B is nested in factor A if every level of factor B occurs within only one level of factor A
Example
The Tennessee class size experiment actually assigned classrooms to one of three class size conditions. Each classroom occurred in only one treatment condition
Thus classrooms were nested within treatments (but treatment was crossed with schools)

Where Do These Terms Come From? (Nesting)
An agricultural experiment where blocks are literally blocks or plots of land

Blocks
1    T1
2    T2
…    …
n    T1

Here each block is literally nested within a treatment condition

Where Do These Terms Come From? (Crossing)
An agricultural experiment
Blocks were literally blocks of land, and plots of land within blocks were assigned different treatments

Blocks
1    T1   T2
2    T2   T1
…    …    …
n    T1   T2

Here treatment literally crosses the blocks

Where Do These Terms Come From? (Crossing)
The experiment is often depicted like this. What is wrong with this as a field layout?

              Blocks
              1    2    …    n
Treatment 1   …
Treatment 2   …

Consider possible sources of bias

Think About These Designs
• A study assigns a reading treatment (or control) to children in 20 schools. Each child is classified into one of three groups with different risk of reading failure.
• A study assigns T or C to 20 teachers. The teachers are in five schools, and each teacher teaches 4 science classes
• Two schools in each district are picked to participate. Each school has two grade 4 teachers. One of them is assigned to T, the other to C.
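
The definitions of crossing and nesting given above can be checked mechanically from a list of which factor levels occur together. The following is a minimal sketch, not part of the original slides: the function names `is_crossed` and `is_nested`, the treatment labels, and the tiny data sets are all hypothetical illustrations.

```python
from itertools import product


def is_crossed(pairs):
    """True if every observed level of A occurs with every observed level of B."""
    observed = set(pairs)
    levels_a = {a for a, _ in observed}
    levels_b = {b for _, b in observed}
    return observed == set(product(levels_a, levels_b))


def is_nested(pairs):
    """True if B is nested in A: each level of B occurs within only one level of A."""
    parent = {}
    for a, b in pairs:
        # setdefault records the first A level seen for this B level
        if parent.setdefault(b, a) != a:
            return False
    return True


# Hypothetical miniature: three treatment conditions, each present in both schools.
treatment_by_school = [
    (t, s) for t in ("T1", "T2", "T3") for s in ("school1", "school2")
]

# Hypothetical classrooms, each appearing under exactly one treatment condition.
treatment_by_classroom = [
    ("T1", "class1"), ("T2", "class2"), ("T3", "class3"), ("T1", "class4"),
]

print(is_crossed(treatment_by_school))     # treatment crossed with schools
print(is_nested(treatment_by_classroom))   # classrooms nested within treatments
```

Applying the same two checks to each scenario in "Think About These Designs" (listing the treatment, teacher, school, or risk-group level for every unit) is one way to work out which factors are crossed and which are nested.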
Three Basic Designs
The completely randomized design: Treatments are assigned to individuals
The randomized block design: Treatments are assigned to individuals within blocks (this is sometimes called the matched design, because individuals are matched within blocks)
The hierarchical design: Treatments are assigned to blocks; the same treatment is assigned to all individuals in the block

The Completely Randomized Design
Individuals are randomly assigned to one of two treatments

Treatment          Control
Individual 1       Individual 1
Individual 2       Individual 2
…                  …
Individual n_T     Individual n_C

The Randomized Block Design

              Block 1               …    Block m
Treatment 1   Individual 1          …    Individual 1
              …                          …
              Individual n_1        …    Individual n_m
Treatment 2   Individual n_1 + 1    …    Individual n_m + 1
              …                          …
              Individual 2n_1       …    Individual 2n_m

The Hierarchical Design

Treatment                                Control
Block 1           …   Block m            Block m+1             …   Block 2m
Individual 1      …   Individual 1       Individual 1          …   Individual 1
Individual 2      …   Individual 2       Individual 2          …   Individual 2
…                     …                  …                         …
Individual n_1    …   Individual n_m     Individual n_(m+1)    …   Individual n_(2m)

Randomization Procedures
Randomization has to be done as an explicit process devised by the experimenter
• Haphazard is not the same as random
• Unknown assignment is not the same as random
• "Essentially random" is technically meaningless
• Alternation is not random, even if you alternate from a random start
This is why R.A. Fisher was so explicit about randomization processes

Randomization Procedures
R.A. Fisher on how to randomize an experiment with small sample size and 5 treatments
A satisfactory method is to use a pack of cards numbered from 1 to 100, and to arrange them in random order by repeated shuffling. The varieties [treatments] are numbered from 1 to 5, and any card such as the number 33, for example is deemed to correspond to variety [treatment] number 3, because on dividing by 5 this number is found as the remainder.
(Fisher, 1935, p. 51)

Randomization Procedures
You may want to use a table of random numbers, but be sure to pick an arbitrary start point!
Beware random number generators: they typically depend on seed values, so be sure to vary the seed value (if they do not do it automatically)
Otherwise you can reliably generate the same sequence of random numbers every time
It is no different than starting in the same place in a table of random numbers

Randomization Procedures
Completely Randomized Design (2 treatments, 2n individuals)
• Make a list of all individuals
• For each individual, pick a random number from 1 to 2 (odd or even)
• Assign the individual to treatment 1 if even, 2 if odd
• When one treatment is assigned n individuals, stop assigning more individuals to that treatment

Randomization Procedures
Completely Randomized Design (pn individuals, p treatments)
• Make a list of all individuals
• For each individual, pick a random number from 1 to p
• One way to do this is to get a random number of any size and divide by p; the remainder R is between 0 and (p – 1), so add 1 to the remainder to get R + 1
• Assign the individual to treatment R + 1
• Stop assigning individuals to any treatment after it gets n individuals

Randomization Procedures
Randomized Block Design with 2 Treatments (m blocks per treatment, 2n individuals per block)
• Make a list of all individuals in the first block
• For each individual, pick a random number from 1 to 2 (odd or even)
• Assign the individual to treatment 1 if even, 2 if odd
• Stop assigning a treatment after it is assigned n individuals in the block
• Repeat the same process with every block

Randomization Procedures
Randomized Block Design with p Treatments (m blocks per treatment, pn individuals per block)
• Make a list of all individuals in the first block
• For each individual, pick a random number from 1 to p
• Assign the individual to the treatment corresponding to the number
• Stop assigning a treatment after it is assigned n individuals in the block
• Repeat the same process with every block

Randomization Procedures
Hierarchical Design with 2 Treatments (m blocks per treatment, n individuals per block)
• Make a list of all blocks
• For each block, pick a random number from 1 to 2
• Assign the block to treatment 1 if even, treatment 2 if odd
• Stop assigning a treatment after it is assigned m blocks
• Every individual in a block is assigned to the same treatment

Randomization Procedures
Hierarchical Design with p Treatments (m blocks per treatment, n individuals per block)
• Make a list of all blocks
• For each block, pick a random number from 1 to p
• Assign the block to the treatment corresponding to the number
• Stop assigning a treatment after it is assigned m blocks
• Every individual in a block is assigned to the same treatment

Sampling Models

Sampling Models in Educational Research
Sampling models are often ignored in educational research
But sampling is where the randomness comes from in social research
Sampling therefore has profound consequences for statistical analysis and research designs

Sampling Models in Educational Research
Simple random samples are rare in field research
Educational populations are hierarchically nested:
• Students in classrooms in schools
• Schools in districts in states
We usually exploit the population structure to sample students by first sampling schools
Even then, most samples are not probability samples, but they are intended to be representative (of some population)

Sampling Models in Educational Research
Survey research calls this strategy multistage (multilevel) clustered sampling
We often sample clusters (schools) first, then individuals within clusters (students within schools)
This is a two-stage (two-level) cluster sample
We might sample schools, then classrooms, then students
This is a three-stage (three-level) cluster sample

Precision of Estimates Depends on the Sampling Model
Suppose the total population variance is σ_T² and ICC is ρ
Consider two samples of size N = mn
A simple random sample or stratified sample
The variance of the mean
is σ_T²/(mn)
A clustered sample of n students from each of m schools
The variance of the mean is [σ_T²/(mn)][1 + (n – 1)ρ]
The inflation factor [1 + (n – 1)ρ] is called the design effect

Precision of Estimates Depends on the Sampling Model
Suppose the population variance is σ_T²
School level ICC is ρ_S, class level ICC is ρ_C
Consider two samples of size N = mpn
A simple random sample or stratified sample
The variance of the mean is σ_T²/(mpn)
A clustered sample of n students from p classes in m schools
The variance is [σ_T²/(mpn)][1 + (pn – 1)ρ_S + (n – 1)ρ_C]
The three level design effect is [1 + (pn – 1)ρ_S + (n – 1)ρ_C]

Precision of Estimates Depends on the Sampling Model
Treatment effects in experiments and quasi-experiments are mean differences
Therefore precision of treatment effects and statistical power will depend on the sampling model

Sampling Models in Educational Research
The fact that the population is structured does not mean the sample must be a clustered sample
Whether it is a clustered sample depends on:
• How the sample is drawn (e.g., are schools sampled first, then individuals randomly within schools)
• What the inferential population is (e.g., is the inference to these schools studied or a larger population of schools)

Sampling Models in Educational Research
A necessary condition for a clustered sample is that it is drawn in stages using population subdivisions
• schools then students within schools
• schools then classrooms then students
However, if all subdivisions in a population are present in the sample, the sample is not clustered, but stratified
Stratification has different implications than clustering
Whether there is stratification or clustering depends on the definition of the population to which we draw inferences (the inferential population)

Sampling Models in Educational Research
The clustered/stratified distinction matters because it influences the precision of statistics estimated from the sample
If all population
subdivisions are included in every sample, there is no sampling (or exhaustive sampling) of subdivisions
• therefore differences between subdivisions add no uncertainty to estimates
If only some population subdivisions are included in the sample, it matters which ones you happen to sample
• thus differences between subdivisions add to uncertainty

Inferential Population and Inference Models
The inferential population or inference model has implications for analysis and therefore for the design of experiments
Do we make inferences to the schools in this sample or to a larger population of schools?
Inferences to the schools or classes in the sample are called conditional inferences
Inferences to a larger population of schools or classes are called unconditional inferences

Inferential Population and Inference Models
Note that the inferences (what we are estimating) are different in conditional versus unconditional inference models
• In a conditional inference, we are estimating the mean (or treatment effect) in the observed schools
• In unconditional inference we are estimating the mean (or treatment effect) in the population of schools from which the observed schools are sampled
We are still estimating a mean (or a treatment effect) but they are different parameters with different uncertainties

Fixed and Random Effects
When the levels of a factor (e.g., particular blocks included) in a study are sampled and the inference model is unconditional, that factor is called random and its effects are called random effects
When the levels of a factor (e.g., particular blocks included) in a study constitute the entire inference population and the inference model is conditional, that factor is called fixed and its effects are called fixed effects

Applications to Experimental Design
We will look in detail at the two most widely used experimental designs in education
• Randomized blocks designs
• Hierarchical designs

Experimental Designs
For each design we will
look at
• Structural model for data (and what it means)
• Two inference models
  – What does 'treatment effect' mean in principle
  – What is the estimate of treatment effect
  – How do we deal with context effects
• Two statistical analysis procedures
  – How do we estimate and test treatment effects
  – How do we estimate and test context effects
  – What is the sensitivity of the tests

The Randomized Block Design
The population (the sampling frame)
We wish to compare two treatments
• We assign treatments within schools
• Many schools with 2n students in each
• Assign n students to each treatment in each school

The Randomized Block Design
The experiment
Compare two treatments in an experiment
• We assign treatments within schools
• With m schools with 2n students in each
• Assign n students to each treatment in each school

The Randomized Block Design
Diagram of the design (a grid of Treatment rows 1, 2 by School columns 1, 2, …, m; School 1 is the first column)

              Schools
Treatment     1    2    …    m
1             …
2             …

The Conceptual Model
The statistical model for the observation on the kth person in the jth school in the ith treatment is
Y_ijk = μ + α_i + β_j + αβ_ij + ε_ijk
where
• μ is the grand mean,
• α_i is the average effect of being in treatment i,
• β_j is the average effect of being in school j,
• αβ_ij is the difference between the average effect of treatment i and the effect of that treatment in school j,
• ε_ijk is a residual

Effect of Context
$Y_{ijk} = \mu + \alpha_i + \beta_j + \alpha\beta_{ij} + \varepsilon_{ijk}$
Context Effect: the treatment by schools interaction term αβ_ij

Two-level Randomized Block Design With No Covariates (HLM Notation)
Level 1 (individual level)
Y_ijk = β_0j + β_1j T_ijk + ε_ijk,   ε ~ N(0, σ_W²)
Level 2 (school level)
β_0j = π_00 + ξ_0j,   ξ_0j ~ N(0, σ_S²)
β_1j = π_10 + ξ_1j,   ξ_1j ~ N(0, σ_TxS²)
If we code the treatment T_ijk = ½ or –½, then the parameters are identical to those in standard ANOVA

Effects and Estimates
The population mean of treatment 1 in school j is α_1 + αβ_1j
The population mean of
treatment 2 in school j is α_2 + αβ_2j
The estimate of the mean of treatment 1 in school j is α_1 + αβ_1j + ε_1j●
The estimate of the mean of treatment 2 in school j is α_2 + αβ_2j + ε_2j●

Effects and Estimates
The comparative treatment effect in any given school j is (α_1 – α_2) + (αβ_1j – αβ_2j)
The estimate of comparative treatment effect in school j is (α_1 – α_2) + (αβ_1j – αβ_2j) + (ε_1j● – ε_2j●)
The mean treatment effect in the experiment is (α_1 – α_2) + (αβ_1● – αβ_2●)
The estimate of the mean treatment effect in the experiment is (α_1 – α_2) + (αβ_1● – αβ_2●) + (ε_1●● – ε_2●●)

Inference Models
Two different kinds of inferences about effects
Unconditional Inference (Schools Random): Inference to the whole universe of schools (requires a representative sample of schools)
Conditional Inference (Schools Fixed): Inference to the schools in the experiment (no sampling requirement on schools)

Statistical Analysis Procedures
Two kinds of statistical analysis procedures
Mixed Effects Procedures (Schools Random): Treat schools in the experiment as a sample from a population of schools (only strictly correct if schools are a sample)
Fixed Effects Procedures (Schools Fixed): Treat schools in the experiment as a population

Unconditional Inference (Schools Random)
The estimate of the mean treatment effect in the experiment is (α_1 – α_2) + (αβ_1● – αβ_2●) + (ε_1●● – ε_2●●)
The average treatment effect we want to estimate is (α_1 – α_2)
The term (ε_1●● – ε_2●●) depends on the students in the schools in the sample
The term (αβ_1● – αβ_2●) depends on the schools in the sample
Both (ε_1●● – ε_2●●) and (αβ_1● – αβ_2●) are random and average to 0 across students and schools, respectively

Conditional Inference (Schools Fixed)
The estimate of the mean treatment effect in the experiment is still (α_1 – α_2) + (αβ_1● – αβ_2●) + (ε_1●● – ε_2●●)
Now the
average treatment effect we want to estimate is (α_1 + αβ_1●) – (α_2 + αβ_2●) = (α_1 – α_2) + (αβ_1● – αβ_2●)
The term (ε_1●● – ε_2●●) depends on the students in the schools in the sample
The term (αβ_1● – αβ_2●) depends on the schools in the sample, but the treatment effect in the sample of schools is the effect we want to estimate

Expected Mean Squares Randomized Block Design (Two Levels, Schools Random)

Source           df           E{MS}
Treatment (T)    1            σ_W² + nσ_TxS² + nmΣα_i²
Schools (S)      m – 1        σ_W² + 2nσ_S²
T x S            m – 1        σ_W² + nσ_TxS²
Within Cells     2m(n – 1)    σ_W²

Mixed Effects Procedures (Schools Random)
The test for treatment effects has H0: (α_1 – α_2) = 0
Estimated mean treatment effect in the experiment is (α_1 – α_2) + (αβ_1● – αβ_2●) + (ε_1●● – ε_2●●)
The variance of the estimated treatment effect is 2[σ_W² + nσ_TxS²]/(mn) = 2[1 + (nω_S – 1)ρ]σ²/(mn)
Here ω_S = σ_TxS²/σ_S² and ρ = σ_S²/(σ_S² + σ_W²) = σ_S²/σ²

Mixed Effects Procedures
The test for treatment effects: F_T = MS_T/MS_TxS with (m – 1) df
The test for context effects (treatment by schools interaction) is F_TxS = MS_TxS/MS_WS with 2m(n – 1) df
Power is determined by the operational effect size
$\frac{\alpha_1 - \alpha_2}{\sigma}\sqrt{\frac{n}{1 + (n\omega_S - 1)\rho}}$
where ω_S = σ_TxS²/σ_S² and ρ = σ_S²/(σ_S² + σ_W²) = σ_S²/σ²

Expected Mean Squares Randomized Block Design (Two Levels, Schools Fixed)

Source           df           E{MS}
Treatment (T)    1            σ_W² + nmΣα_i²
Schools (S)      m – 1        σ_W² + 2nΣβ_j²/(m – 1)
S x T            m – 1        σ_W² + nΣΣαβ_ij²/(m – 1)
Within Cells     2m(n – 1)    σ_W²

Fixed Effects Procedures
The test for treatment effects has H0: (α_1 – α_2) + (αβ_1● – αβ_2●) = 0
Estimated mean treatment effect in the experiment is (α_1 – α_2) + (αβ_1● – αβ_2●) + (ε_1●● – ε_2●●)
The variance of the estimated treatment effect is 2σ_W²/(mn)

Fixed Effects Procedures
The test for treatment effects: F_T = MS_T/MS_WS with 2m(n – 1) df (the within-cells df from the table above)
The test for context effects (treatment by schools
interaction) is F_C = MS_TxS/MS_WS with 2m(n – 1) df
Power is determined by the operational effect size
$\frac{(\alpha_1 - \alpha_2) + (\alpha\beta_{1\bullet} - \alpha\beta_{2\bullet})}{\sigma}\sqrt{n}$
with 2m(n – 1) df

Comparing Fixed and Mixed Effects Statistical Procedures (Randomized Block Design)

                           Fixed                               Mixed
Inference Model            Conditional                         Unconditional
Estimand                   (α_1 – α_2) + (αβ_1● – αβ_2●)       (α_1 – α_2)
Contaminating Factors      (ε_1●● – ε_2●●)                     (αβ_1● – αβ_2●) + (ε_1●● – ε_2●●)
Operational Effect Size    $\frac{(\alpha_1 - \alpha_2) + (\alpha\beta_{1\bullet} - \alpha\beta_{2\bullet})}{\sigma}\sqrt{n}$    $\frac{\alpha_1 - \alpha_2}{\sigma}\sqrt{\frac{n}{1 + (n\omega_S - 1)\rho}}$
df                         2m(n – 1)                           (m – 1)
Power                      higher                              lower

Comparing Fixed and Mixed Effects Procedures (Randomized Block Design)
Conditional and unconditional inference models
• estimate different treatment effects
• have different contaminating factors that add uncertainty
Mixed procedures are good for unconditional inference
The fixed procedures are good for conditional inference
The fixed procedures have higher power

The Hierarchical Design
The universe (the sampling frame)
We wish to compare two treatments
• We assign treatments to whole schools
• Many schools with n students in each
• Assign all students in each school to the same treatment

The Hierarchical Design
The experiment
We wish to compare two treatments
• We assign treatments to whole schools
• With 2m schools with n students in each
• Assign all students in each school to the same treatment

The Hierarchical Design
Diagram of the experiment (a grid of Treatment rows 1, 2 by School columns 1, 2, …, 2m)

              Schools
Treatment     1    2    …    m    m+1    m+2    …    2m
1
2

Treatment 1 schools: 1, 2, …, m
Treatment 2 schools: m+1, m+2, …, 2m

The Conceptual Model
The statistical model for the observation on the kth person in the jth school in the ith treatment is
Y_ijk = μ + α_i + β_j + αβ_ij + ε_jk(i) = μ + α_i + β_j(i) + ε_jk(i)
μ is the grand mean, α_i
is the average effect of being in treatment i, β_j is the average effect of being in school j, αβ_ij is the difference between the average effect of treatment i and the effect of that treatment in school j, ε_jk(i) is a residual
Or β_j(i) = β_j + αβ_ij is a term for the combined effect of schools within treatments

The Conceptual Model
Y_ijk = μ + α_i + β_j + αβ_ij + ε_jk(i) = μ + α_i + β_j(i) + ε_jk(i)
Context Effects
(the symbols are defined as on the previous slide)

Two-level Hierarchical Design With No Covariates (HLM Notation)
Level 1 (individual level)
Y_ijk = β_0j + ε_ijk,   ε ~ N(0, σ_W²)
Level 2 (school level)
β_0j = π_00 + π_01 T_j + ξ_0j,   ξ ~ N(0, σ_S²)
If we code the treatment T_j = ½ or –½, then π_00 = μ, π_01 = α_1, ξ_0j = β_j(i)
The intraclass correlation is ρ = σ_S²/(σ_S² + σ_W²) = σ_S²/σ²

Effects and Estimates
The comparative treatment effect in any given school j is still (α_1 – α_2) + (αβ_1j – αβ_2j)
But we cannot estimate the treatment effect in a single school because each school gets only one treatment
The mean treatment effect in the experiment is (α_1 – α_2) + (β_●(1) – β_●(2)) = (α_1 – α_2) + (β_1● – β_2●) + (αβ_1● – αβ_2●)
The estimate of the mean treatment effect in the experiment is (α_1 – α_2) + (β_●(1) – β_●(2)) + (ε_1●● – ε_2●●)

Inference Models
Two different kinds of inferences about effects (as in the randomized block design)
Unconditional Inference (schools random)
Inference to the whole universe of schools (requires a representative sample of schools)
Conditional Inference (schools fixed)
Inference to the schools in the experiment (no sampling requirement on schools)

Unconditional Inference (Schools Random)
The average treatment effect we want to estimate is (α_1 – α_2)
The term (ε_1●● – ε_2●●) depends on the students in the schools in the sample
The term (β_●(1) – β_●(2)) depends on the schools in the sample
Both (ε_1●● – ε_2●●) and (β_●(1) – β_●(2)) are random and average to 0 across students and schools, respectively

Conditional Inference (Schools Fixed)
The average treatment effect we want to (can) estimate is (α_1 + β_●(1)) – (α_2 + β_●(2)) = (α_1 – α_2) + (β_●(1) – β_●(2)) = (α_1 – α_2) + (β_1● – β_2●) + (αβ_1● – αβ_2●)
The term (β_●(1) – β_●(2)) depends on the schools in the sample, but we want to estimate the effect of treatment in the schools in the sample
Note that this treatment effect is not quite the same as in the randomized block design, where we estimate (α_1 – α_2) + (αβ_1● – αβ_2●)

Statistical Analysis Procedures
Two kinds of statistical analysis procedures (as in the randomized block design)
Mixed Effects Procedures: Treat schools in the experiment as a sample from a universe
Fixed Effects Procedures: Treat schools in the experiment as a universe

Expected Mean Squares Hierarchical Design (Two Levels, Schools Random)

Source            df           E{MS}
Treatment (T)     1            σ_W² + nσ_S² + nmΣα_i²
Schools (S)       2(m – 1)     σ_W² + nσ_S²
Within Schools    2m(n – 1)    σ_W²

Mixed Effects Procedures (Schools Random)
The test for treatment effects has H0: (α_1 – α_2) = 0
Estimated mean treatment effect in the experiment is (α_1 – α_2) + (β_●(1) – β_●(2)) + (ε_1●● – ε_2●●)
The variance of the estimated treatment effect is 2[σ_W² + nσ_S²]/(mn) = 2[1 + (n – 1)ρ]σ²/(mn)
where ρ = σ_S²/(σ_S² + σ_W²) = σ_S²/σ²

Mixed Effects Procedures (Schools Random)
The test for treatment effects: F_T = MS_T/MS_BS with 2(m – 1) df (the schools df from the table above)
There is no omnibus test for context effects
Power is determined by
the operational effect size
$\frac{\alpha_1 - \alpha_2}{\sigma}\sqrt{\frac{n}{1 + (n - 1)\rho}}$
where ρ = σ_S²/(σ_S² + σ_W²) = σ_S²/σ²

Expected Mean Squares Hierarchical Design (Two Levels, Schools Fixed)

Source            df           E{MS}
Treatment (T)     1            σ_W² + nmΣ(α_i + β_●(i))²
Schools (S)       2(m – 1)     σ_W² + nΣΣβ_j(i)²/[2(m – 1)]
Within Schools    2m(n – 1)    σ_W²

Fixed Effects Procedures (Schools Fixed)
The test for treatment effects has H0: (α_1 – α_2) + (β_●(1) – β_●(2)) = 0
Note that the school effects are confounded with treatment effects
Estimated mean treatment effect in the experiment is (α_1 – α_2) + (β_●(1) – β_●(2)) + (ε_1●● – ε_2●●)
The variance of the estimated treatment effect is 2σ_W²/(mn)

Fixed Effects Procedures (Schools Fixed)
The test for treatment effects: F_T = MS_T/MS_WS with 2m(n – 1) df (the within-schools df from the table above)
There is no omnibus test for context effects, because each school gets only one treatment
Power is determined by the operational effect size
$\frac{(\alpha_1 - \alpha_2) + (\beta_{\bullet(1)} - \beta_{\bullet(2)})}{\sigma}\sqrt{n}$
and 2m(n – 1) df

Comparing Fixed and Mixed Effects Procedures (Hierarchical Design)

                           Fixed                               Mixed
Inference Model            Conditional                         Unconditional
Estimand                   (α_1 – α_2) + (β_●(1) – β_●(2))     (α_1 – α_2)
Contaminating Factors      (ε_1●● – ε_2●●)                     (β_●(1) – β_●(2)) + (ε_1●● – ε_2●●)
Effect Size                $\frac{(\alpha_1 - \alpha_2) + (\beta_{\bullet(1)} - \beta_{\bullet(2)})}{\sigma}\sqrt{n}$    $\frac{\alpha_1 - \alpha_2}{\sigma}\sqrt{\frac{n}{1 + (n - 1)\rho}}$
df                         2m(n – 1)                           2(m – 1)
Power                      higher                              lower

Comparing Fixed and Mixed Effects Statistical Procedures (Hierarchical Design)
Conditional and unconditional inference models
• estimate different treatment effects
• have different contaminating factors that add uncertainty
Mixed procedures are good for unconditional inference
The fixed procedures are not generally recommended
The fixed procedures have higher power

Comparing Hierarchical Designs to Randomized Block Designs
Randomized block designs usually have higher power, but assignment of different
treatments within schools or classes may be
• practically difficult
• politically infeasible
• theoretically impossible
It may be methodologically unwise because of potential for
• contamination or diffusion of treatments
• compensatory rivalry or demoralization

Applications to Experimental Design
We will address the two most widely used experimental designs in education, each with two and three levels
• Randomized block designs with 2 levels
• Randomized block designs with 3 levels
• Hierarchical designs with 2 levels
• Hierarchical designs with 3 levels
We also examine the effect of covariates
Hereafter, we generally take schools to be random

Complications
Which matchings do we have to take into account in design (e.g., schools, districts, regions, states, regions of the country, country)?
Ignore some, control for effects of others as fixed blocking factors
Justify this as part of the population definition
For example, we define the inference population as these five districts within these two states
But doing so obviously constrains generalizability

Precision of the Estimated Treatment Effect
Precision is the standard error of the estimated treatment effect
Precision in simple (simple random sample) designs depends on:
• Standard deviation in the population $\sigma$
• Total sample size $N$
The precision is
$$SE = \frac{2\sigma}{\sqrt{N}}$$

Precision of the Estimated Treatment Effect
Precision in complex (clustered sample) designs depends on:
• The (total) standard deviation $\sigma_T$
• Sample size at each level of sampling (e.g., m clusters, n individuals per cluster)
• Intraclass correlation structure
It is a little harder to compute than in simple designs, but important because it helps you see what matters in design

Intraclass Correlations in Two-level Designs
In two-level designs the intraclass correlation structure is determined by a single intraclass correlation
This intraclass correlation is the proportion of the total variance that is between schools (clusters)
$$\rho = \frac{\sigma_S^2}{\sigma_S^2 + \sigma_W^2} = \frac{\sigma_S^2}{\sigma_T^2}$$

Precision in Two-level Hierarchical Design With No Covariates
The standard error of the treatment effect is
$$SE = \sigma_T\sqrt{\frac{2}{m}}\sqrt{\frac{1 + (n - 1)\rho}{n}}$$
SE decreases as m (number of schools) increases
SE decreases as n increases, but only up to a point
SE increases as $\rho$ increases

Statistical Power
Power in simple (simple random sample) designs depends on:
• Significance level
• Effect size
• Sample size
Look power up in a table for sample size and effect size

Fragment of Cohen's Table 2.3.5
 n \ d | 0.10 | 0.20 | … | 0.80 | 1.00 | 1.20 | 1.40
 8     | 05   | 07   | … | 31   | 46   | 60   | 73
 9     | 06   | 07   | … | 35   | 51   | 65   | 79
 10    | 06   | 07   | … | 39   | 56   | 71   | 84
 11    | 06   | 07   | … | 43   | 63   | 76   | 87

Computing Statistical Power
Power in complex (clustered sample) designs depends on:
• Significance level
• Effect size $\delta$
• Sample size at each level of sampling (e.g., m clusters, n individuals per cluster)
• Intraclass correlation structure
This makes it seem a lot harder to compute

Computing Statistical Power
Computing statistical power in complex designs is only a little harder than computing it for simple designs
Compute operational effect size (incorporates sample design information) $\Delta_T$
Look power up in a table for operational sample size and operational effect size
This is the same table that you use for simple designs

Power in Two-level Hierarchical Design With No Covariates
Basic Idea: Operational Effect Size = (Effect Size) x (Design Effect)
$\Delta_T = \delta \times (\text{Design Effect})$
For the two-level hierarchical design with no covariates
$$\Delta_T = \delta\sqrt{\frac{n}{1 + (n - 1)\rho}}$$
Operational sample size is number of schools (clusters)

Power in Two-level Hierarchical Design With No Covariates
As m (number of schools) increases, power increases
As effect size increases, power increases
Other influences occur through the design effect
$$\sqrt{\frac{n}{1 + (n - 1)\rho}} = \sqrt{\frac{1}{\frac{1}{n} + \left(1 - \frac{1}{n}\right)\rho}}$$
As $\rho$ increases the design effect (and power)
decreases
No matter how large n gets the maximum design effect is $1/\sqrt{\rho}$
Thus power only increases up to some limit as n increases

Two-level Hierarchical Design With Covariates (HLM Notation)
Level 1 (individual level)
$Y_{ijk} = \beta_{0j} + \beta_{1j}X_{ijk} + \varepsilon_{ijk}$, with $\varepsilon \sim N(0, \sigma_{AW}^2)$
Level 2 (school level)
$\beta_{0j} = \pi_{00} + \pi_{01}T_j + \pi_{02}W_j + \xi_{0j}$, with $\xi \sim N(0, \sigma_{AS}^2)$
$\beta_{1j} = \pi_{10}$
Note that the covariate effect $\beta_{1j} = \pi_{10}$ is a fixed effect
If we code the treatment $T_j = \tfrac{1}{2}$ or $-\tfrac{1}{2}$, then the parameters are identical to those in standard ANCOVA

Precision in Two-level Hierarchical Design With Covariates
The standard error of the treatment effect
$$SE = \sigma_T\sqrt{\frac{2}{m}}\sqrt{\frac{1 + (n - 1)\rho - \left[R_W^2 + (nR_S^2 - R_W^2)\rho\right]}{n}}$$
SE decreases as m increases
SE decreases as n increases, but only up to a point
SE increases as $\rho$ increases
SE decreases as $R_W^2$ and $R_S^2$ increase

Power in Two-level Hierarchical Design With Covariates
Basic Idea: Operational Effect Size = (Effect Size) x (Design Effect)
$\Delta_T = \delta \times (\text{Design Effect})$
For the two-level hierarchical design with covariates
$$\Delta_T^A = \delta\sqrt{\frac{n}{1 + (n - 1)\rho - \left[R_W^2 + (nR_S^2 - R_W^2)\rho\right]}}$$
The covariates increase the design effect

Power in Two-level Hierarchical Design With Covariates
As m and effect size increase, power increases
Other influences occur through the design effect
$$\sqrt{\frac{n}{1 + (n - 1)\rho - \left[R_W^2 + (nR_S^2 - R_W^2)\rho\right]}}$$
As $\rho$ increases the design effect (and power) decreases
Now the maximum design effect as n gets large is
$$\frac{1}{\sqrt{(1 - R_S^2)\rho}}$$
As the covariate-outcome correlations $R_W^2$ and $R_S^2$ increase the design effect (and power) increases

Three-level Hierarchical Design
Here there are three factors
• Treatment
• Schools (clusters) nested in treatments
• Classes (subclusters) nested in schools
Suppose there are
• m schools (clusters) per treatment
• p classes (subclusters) per school (cluster)
• n
students (individuals) per class (subcluster)

Three-level Hierarchical Design With No Covariates
The statistical model for the observation on the lth person in the kth class in the jth school in the ith treatment is
$$Y_{ijkl} = \mu + \alpha_i + \beta_{j(i)} + \gamma_{k(ij)} + \varepsilon_{ijkl}$$
where
$\mu$ is the grand mean,
$\alpha_i$ is the average effect of being in treatment i,
$\beta_{j(i)}$ is the average effect of being in school j, in treatment i,
$\gamma_{k(ij)}$ is the average effect of being in class k in treatment i, in school j,
$\varepsilon_{ijkl}$ is a residual

Three-level Hierarchical Design With No Covariates (HLM Notation)
Level 1 (individual level)
$Y_{ijkl} = \beta_{0jk} + \varepsilon_{ijkl}$, with $\varepsilon \sim N(0, \sigma_W^2)$
Level 2 (classroom level)
$\beta_{0jk} = \gamma_{0j} + \eta_{0jk}$, with $\eta \sim N(0, \sigma_C^2)$
Level 3 (school level)
$\gamma_{0j} = \pi_{00} + \pi_{01}T_j + \xi_{0j}$, with $\xi \sim N(0, \sigma_S^2)$
If we code the treatment $T_j = \tfrac{1}{2}$ or $-\tfrac{1}{2}$, then $\pi_{00} = \mu$, $\pi_{01} = \alpha_1$, $\xi_{0j} = \beta_{j(i)}$ (the school effect), $\eta_{0jk} = \gamma_{k(ij)}$ (the class effect)

Three-level Hierarchical Design Intraclass Correlations
In three-level designs there are two levels of clustering and two intraclass correlations
At the school (cluster) level
$$\rho_S = \frac{\sigma_S^2}{\sigma_S^2 + \sigma_C^2 + \sigma_W^2} = \frac{\sigma_S^2}{\sigma_T^2}$$
At the classroom (subcluster) level
$$\rho_C = \frac{\sigma_C^2}{\sigma_S^2 + \sigma_C^2 + \sigma_W^2} = \frac{\sigma_C^2}{\sigma_T^2}$$

Precision in Three-level Hierarchical Design With No Covariates
The standard error of the treatment effect
$$SE = \sigma_T\sqrt{\frac{2}{m}}\sqrt{\frac{1 + (pn - 1)\rho_S + (n - 1)\rho_C}{pn}}$$
SE decreases as m increases
SE decreases as p and n increase, but only up to a point
SE increases as $\rho_S$ and $\rho_C$ increase

Power in Three-level Hierarchical Design With No Covariates
Basic Idea: Operational Effect Size = (Effect Size) x (Design Effect)
$\Delta_T = \delta \times (\text{Design Effect})$
For the three-level hierarchical design with no covariates
$$\Delta_T = \delta\sqrt{\frac{pn}{1 + (pn - 1)\rho_S + (n - 1)\rho_C}}$$
The operational sample size is the number of schools

Power in Three-level Hierarchical Design With No Covariates
As m and the effect size increase, power increases
Other
influences occur through the design effect
$$\sqrt{\frac{pn}{1 + (pn - 1)\rho_S + (n - 1)\rho_C}}$$
As $\rho_S$ or $\rho_C$ increases the design effect decreases
No matter how large n gets the maximum design effect is
$$\frac{1}{\sqrt{\rho_S + \frac{1}{p}\rho_C}}$$
Thus power only increases up to some limit as n increases

Three-level Hierarchical Design With Covariates (HLM Notation)
Level 1 (individual level)
$Y_{ijkl} = \beta_{0jk} + \beta_{1jk}X_{ijkl} + \varepsilon_{ijkl}$, with $\varepsilon \sim N(0, \sigma_{AW}^2)$
Level 2 (classroom level)
$\beta_{0jk} = \gamma_{00j} + \gamma_{01j}Z_{jk} + \eta_{0jk}$, with $\eta \sim N(0, \sigma_{AC}^2)$
$\beta_{1jk} = \gamma_{10j}$
Level 3 (school level)
$\gamma_{00j} = \pi_{00} + \pi_{01}T_j + \pi_{02}W_j + \xi_{0j}$, with $\xi \sim N(0, \sigma_{AS}^2)$
$\gamma_{01j} = \pi_{01}$
$\gamma_{10j} = \pi_{10}$
The covariate effects $\beta_{1jk} = \gamma_{10j} = \pi_{10}$ and $\gamma_{01j} = \pi_{01}$ are fixed

Precision in Three-level Hierarchical Design With Covariates
$$SE = \sigma_T\sqrt{\frac{2}{m}} \times \sqrt{\frac{1 + (pn - 1)\rho_S + (n - 1)\rho_C - \left[R_W^2 + (pnR_S^2 - R_W^2)\rho_S + (nR_C^2 - R_W^2)\rho_C\right]}{pn}}$$
SE decreases as m increases
SE decreases as p and n increase, but only up to a point
SE increases as $\rho_S$ and $\rho_C$ increase
SE decreases as $R_W^2$, $R_C^2$, and $R_S^2$ increase

Power in Three-level Hierarchical Design With Covariates
Basic Idea: Operational Effect Size = (Effect Size) x (Design Effect)
$\Delta_T = \delta \times (\text{Design Effect})$
For the three-level hierarchical design with covariates
$$\Delta_T^A = \delta\sqrt{\frac{pn}{1 + (pn - 1)\rho_S + (n - 1)\rho_C - \left[R_W^2 + (pnR_S^2 - R_W^2)\rho_S + (nR_C^2 - R_W^2)\rho_C\right]}}$$
The operational sample size is the number of schools

Power in Three-level Hierarchical Design With Covariates
As m and the effect size increase, power increases
Other influences occur through the design effect
$$\sqrt{\frac{pn}{1 + (pn - 1)\rho_S + (n - 1)\rho_C - \left[R_W^2 + (pnR_S^2 - R_W^2)\rho_S + (nR_C^2 - R_W^2)\rho_C\right]}}$$
As $\rho_S$ or $\rho_C$ increases the design effect decreases
No matter how large n gets the maximum design effect is
$$\frac{1}{\sqrt{(1 - R_S^2)\rho_S + \frac{1}{p}(1 - R_C^2)\rho_C}}$$
Thus power
only increases up to some limit as n increases

Randomized Block Designs

Two-level Randomized Block Design With No Covariates (HLM Notation)
Level 1 (individual level)
$Y_{ijk} = \beta_{0j} + \beta_{1j}T_{ijk} + \varepsilon_{ijk}$, with $\varepsilon \sim N(0, \sigma_W^2)$
Level 2 (school level)
$\beta_{0j} = \pi_{00} + \xi_{0j}$, with $\xi_{0j} \sim N(0, \sigma_S^2)$
$\beta_{1j} = \pi_{10} + \xi_{1j}$, with $\xi_{1j} \sim N(0, \sigma_{T\times S}^2)$
If we code the treatment $T_{ijk} = \tfrac{1}{2}$ or $-\tfrac{1}{2}$, then the parameters are identical to those in standard ANOVA

Randomized Block Designs
In randomized block designs, as in hierarchical designs, the intraclass correlation has an impact on precision and power
However, in randomized block designs there is also a parameter reflecting the degree of heterogeneity of treatment effects across schools
We define this heterogeneity parameter $\omega_S$ in terms of the amount of heterogeneity of treatment effects relative to the heterogeneity of school means
Thus $\omega_S = \sigma_{T\times S}^2/\sigma_S^2$

Precision in Two-level Randomized Block Design With No Covariates
The standard error of the treatment effect
$$SE = \sigma_T\sqrt{\frac{2}{m}}\sqrt{\frac{1 + (n\omega_S - 1)\rho}{n}}$$
SE decreases as m (number of schools) increases
SE decreases as n increases, but only up to a point
SE increases as $\rho$ increases
SE increases as $\omega_S = \sigma_{T\times S}^2/\sigma_S^2$ increases

Power in Two-level Randomized Block Design With No Covariates
Basic Idea: Operational Effect Size = (Effect Size) x (Design Effect)
$\Delta_T = \delta \times (\text{Design Effect})$
For the two-level randomized block design with no covariates
$$\Delta_T = \delta\sqrt{\frac{n/2}{1 + (n\omega_S - 1)\rho}}$$
Operational sample size is number of schools (clusters)

Precision in Two-level Randomized Block Design With Covariates
The standard error of the treatment effect
$$SE = \sigma_T\sqrt{\frac{2}{m}}\sqrt{\frac{1 + (n\omega_S - 1)\rho - \left[R_W^2 + (n\omega_S R_S^2 - R_W^2)\rho\right]}{n}}$$
SE decreases as m increases
SE decreases as n increases, but only up to a point
SE increases as $\rho$ increases
SE increases
as $\omega_S = \sigma_{T\times S}^2/\sigma_S^2$ increases
SE (generally) decreases as $R_W^2$ and $R_S^2$ increase

Power in Two-level Randomized Block Design With Covariates
Basic Idea: Operational Effect Size = (Effect Size) x (Design Effect)
$\Delta_T = \delta \times (\text{Design Effect})$
For the two-level randomized block design with covariates
$$\Delta_T^A = \delta\sqrt{\frac{n/2}{1 + (n\omega_S - 1)\rho - \left[R_W^2 + (n\omega_S R_S^2 - R_W^2)\rho\right]}}$$
The covariates increase the design effect

Three-level Randomized Block Designs

Three-level Randomized Block Design With No Covariates
Here there are three factors
• Treatment
• Schools (clusters) nested in treatments
• Classes (subclusters) nested in schools
Suppose there are
• m schools (clusters) per treatment
• 2p classes (subclusters) per school (cluster)
• n students (individuals) per class (subcluster)

Three-level Randomized Block Design With No Covariates
The statistical model for the observation on the lth person in the kth class in the ith treatment in the jth school is
$$Y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + \alpha\beta_{ij} + \varepsilon_{ijkl}$$
where
$\mu$ is the grand mean,
$\alpha_i$ is the average effect of being in treatment i,
$\beta_j$ is the average effect of being in school j,
$\gamma_k$ is the effect of being in the kth class,
$\alpha\beta_{ij}$ is the difference between the average effect of treatment i and the effect of that treatment in school j,
$\varepsilon_{ijkl}$ is a residual

Three-level Randomized Block Design With No Covariates (HLM Notation)
Level 1 (individual level)
$Y_{ijkl} = \beta_{0jk} + \varepsilon_{ijkl}$, with $\varepsilon \sim N(0, \sigma_W^2)$
Level 2 (classroom level)
$\beta_{0jk} = \gamma_{00j} + \gamma_{01j}T_j + \eta_{0jk}$, with $\eta \sim N(0, \sigma_C^2)$
Level 3 (school level)
$\gamma_{00j} = \pi_{00} + \xi_{0j}$, with $\xi_{0j} \sim N(0, \sigma_S^2)$
$\gamma_{01j} = \pi_{10} + \xi_{1j}$, with $\xi_{1j} \sim N(0, \sigma_{T\times S}^2)$
If we code the treatment $T_j = \tfrac{1}{2}$ or $-\tfrac{1}{2}$, then $\pi_{00} = \mu$, $\pi_{10} = \alpha_1$, $\xi_{0j} = \beta_j$, $\xi_{1j} = \alpha\beta_{ij}$, $\eta_{0jk} = \gamma_k$

Three-level Randomized Block Design Intraclass Correlations
In three-level designs there are two levels of clustering and two intraclass correlations
At the school (cluster) level
$$\rho_S = \frac{\sigma_S^2}{\sigma_S^2 + \sigma_C^2 + \sigma_W^2} = \frac{\sigma_S^2}{\sigma_T^2}$$
At the classroom (subcluster) level
$$\rho_C = \frac{\sigma_C^2}{\sigma_S^2 + \sigma_C^2 + \sigma_W^2} = \frac{\sigma_C^2}{\sigma_T^2}$$

Three-level Randomized Block Design Heterogeneity Parameters
In three-level designs, as in two-level randomized block designs, there is also a parameter reflecting the degree of heterogeneity of treatment effects across schools
We define this parameter $\omega_S$ in terms of the amount of heterogeneity of treatment effects relative to the heterogeneity of school means (just like in two-level designs)
Thus $\omega_S = \sigma_{T\times S}^2/\sigma_S^2$

Precision in Three-level Randomized Block Design With No Covariates
The standard error of the treatment effect
$$SE = \sigma_T\sqrt{\frac{2}{m}}\sqrt{\frac{1 + (pn\omega_S - 1)\rho_S + (n - 1)\rho_C}{pn}}$$
SE decreases as m increases
SE decreases as p and n increase, but only up to a point
SE increases as $\omega_S$ increases
SE increases as $\rho_S$ and $\rho_C$ increase

Power in Three-level Randomized Block Design With No Covariates
Basic Idea: Operational Effect Size = (Effect Size) x (Design Effect)
$\Delta_T = \delta \times (\text{Design Effect})$
For the three-level randomized block design with no covariates
$$\Delta_T = \delta\sqrt{\frac{pn/2}{1 + (pn\omega_S - 1)\rho_S + (n - 1)\rho_C}}$$
The operational sample size is the number of schools

Power in Three-level Randomized Block Design With No Covariates
As m and the effect size increase, power increases
Other influences occur through the design effect
$$\sqrt{\frac{pn/2}{1 + (pn\omega_S - 1)\rho_S + (n - 1)\rho_C}}$$
As $\rho_S$ or $\rho_C$ increases the design effect decreases
No matter how large n gets the maximum design effect is
$$\frac{1}{\sqrt{2\left(\omega_S\rho_S + \frac{1}{p}\rho_C\right)}}$$
Thus power only increases up to some limit as n increases

Precision in Three-level Randomized Block Design With Covariates
$$SE = \sigma_T\sqrt{\frac{2}{m}} \times \sqrt{\frac{1 + (pn\omega_S - 1)\rho_S + (n - 1)\rho_C - \left[R_W^2 + (pn\omega_S R_S^2 - R_W^2)\rho_S + (nR_C^2 - R_W^2)\rho_C\right]}{pn}}$$
SE decreases as m increases
SE decreases as p and n
increase, but only up to a point
SE increases as $\rho_S$, $\rho_C$, and $\omega_S$ increase
SE decreases as $R_W^2$, $R_C^2$, and $R_S^2$ increase

Power in Three-level Randomized Block Design With Covariates
Basic Idea: Operational Effect Size = (Effect Size) x (Design Effect)
$\Delta_T = \delta \times (\text{Design Effect})$
For the three-level randomized block design with covariates
$$\Delta_T^A = \delta\sqrt{\frac{pn/2}{1 + (pn\omega_S - 1)\rho_S + (n - 1)\rho_C - \left[R_W^2 + (pn\omega_S R_S^2 - R_W^2)\rho_S + (nR_C^2 - R_W^2)\rho_C\right]}}$$
The operational sample size is the number of schools

Power in Three-level Randomized Block Design With Covariates
As m and the effect size increase, power increases
Other influences occur through the design effect
$$\sqrt{\frac{pn/2}{1 + (pn\omega_S - 1)\rho_S + (n - 1)\rho_C - \left[R_W^2 + (pn\omega_S R_S^2 - R_W^2)\rho_S + (nR_C^2 - R_W^2)\rho_C\right]}}$$
As $\rho_S$ or $\rho_C$ increases the design effect decreases
No matter how large n gets the maximum design effect is
$$\frac{1}{\sqrt{2\left[(1 - R_S^2)\omega_S\rho_S + \frac{1}{p}(1 - R_C^2)\rho_C\right]}}$$
Thus power only increases up to some limit as n increases

What Unit Should Be Randomized? (Schools, Classrooms, or Students)
Experiments cannot estimate the causal effect on any individual
Experiments estimate average causal effects on the units that have been randomized
• If you randomize schools, the (average) causal effects are effects on schools
• If you randomize classes, the (average) causal effects are on classes
• If you randomize individuals, the (average) causal effects estimated are on individuals

What Unit Should Be Randomized? (Schools, Classrooms, or Students)
Theoretical Considerations
Decide what level you care about, then randomize at that level
Randomization at lower levels may impact generalizability of the causal inference (and it is generally a lot more trouble)
Suppose you randomize classrooms, should you also randomly assign students to classes?
It depends: Are you interested in the average causal effect of treatment on naturally occurring classes or on randomly assembled ones?

What Unit Should Be Randomized? (Schools, Classrooms, or Students)
Relative power/precision of treatment effect
Assign Schools (Hierarchical Design)
$$\sqrt{\frac{1 + (pn - 1)\rho_S + (n - 1)\rho_C}{pn}}$$
Assign Classrooms (Randomized Block)
$$\sqrt{\frac{1 + (pn\omega_S - 1)\rho_S + (n - 1)\rho_C}{pn}}$$
Assign Students (Randomized Block)
$$\sqrt{\frac{1 + (pn\omega_S - 1)\rho_S + (n\omega_C - 1)\rho_C}{pn}}$$

What Unit Should Be Randomized? (Schools, Classrooms, or Students)
Precision of estimates or statistical power dictate assigning the lowest level possible
But the individual (or even classroom) level will not always be feasible or even theoretically desirable

Questions and Answers About Design

Questions and Answers About Design
1. Is it ok to match my schools (or classes) before I randomize to decrease variation?
2. I assigned treatments to schools and am not using classes in the analysis. Do I have to take them into account in the design?
3. I am assigning schools, and using every class in the school. Do I have to include classes as a nested factor?
4. My schools all come from two districts, but I am randomly assigning the schools. Do I have to take district into account some way?

Questions and Answers About Design
1. I didn't really sample the schools in my experiment (who does?). Do I still have to treat schools as random effects?
2. I didn't really sample my schools, so what population can I generalize to anyway?
3. I am using a randomized block design with fixed effects. Do you really mean I can't say anything about effects in schools that are not in the sample?

Questions and Answers About Design
1. We randomly assigned, but our assignment was corrupted by treatment switchers. What do we do?
2. We randomly assigned, but our assignment was corrupted by attrition.
What do we do?
3. We randomly assigned but got a big imbalance on characteristics we care about (gender, race, language, SES). What do we do?
4. We randomly assigned but when we looked at the pretest scores, we see that we got a big imbalance (a "bad randomization"). What do we do?

Questions and Answers About Design
1. We care about treatment effects, but we really want to know about mechanism. How do we find out if implementation impacts treatment effects?
2. We want to know where (under what conditions) the treatment works. Can we analyze the relation between conditions and treatment effect to find this out?
3. We have a randomized block design and find heterogeneous treatment effects. What can we say about the main effect of treatment in the presence of interactions?

Questions and Answers About Design
1. I prefer to use regression and I know that regression and ANOVA are equivalent. Why do I need all this ANOVA stuff to design and analyze experiments?
2. Don't robust standard errors in regression solve all these problems?
3. I have heard of using "school fixed effects" to analyze a randomized block design. Is this a good alternative to ANOVA or HLM?
4. Can I use school fixed effects in a hierarchical design?

Questions and Answers About Design
1. We want to use covariates to improve precision, but we find that they act somewhat differently in different groups (have different slopes). What do we do?
2. We get somewhat different variances in different groups. Should we use robust standard errors?
3. We get somewhat different answers with different analyses. What do we do?

Thank You!
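
Supplementary sketch: the precision and power formulas given earlier for the hierarchical designs can be tried out numerically. The Python below is a minimal illustration, not part of the original slides. The function names are hypothetical, the example inputs are made up, and the power function uses a normal approximation as a stand-in for the table lookup the slides describe (an assumption that will be slightly optimistic when the number of schools is small).

```python
import math
from statistics import NormalDist


def design_effect_two_level(n, rho, r2_w=0.0, r2_s=0.0):
    """Two-level hierarchical design effect:
    sqrt(n / (1 + (n-1)*rho - [R_W^2 + (n*R_S^2 - R_W^2)*rho])).
    With r2_w = r2_s = 0 this is the no-covariate case."""
    denom = 1 + (n - 1) * rho - (r2_w + (n * r2_s - r2_w) * rho)
    return math.sqrt(n / denom)


def design_effect_three_level(p, n, rho_s, rho_c):
    """Three-level hierarchical design effect without covariates:
    sqrt(pn / (1 + (pn-1)*rho_S + (n-1)*rho_C))."""
    denom = 1 + (p * n - 1) * rho_s + (n - 1) * rho_c
    return math.sqrt(p * n / denom)


def standard_error(sigma_t, m, design_effect):
    """SE = sigma_T * sqrt(2/m) / (design effect), m schools per treatment."""
    return sigma_t * math.sqrt(2 / m) / design_effect


def approx_power(delta, m, design_effect, alpha=0.05):
    """Approximate two-sided power (ASSUMPTION: normal approximation,
    not the Cohen table lookup). The operational effect size is
    delta * design_effect and the operational sample size is m."""
    operational_effect = delta * design_effect
    noncentrality = operational_effect * math.sqrt(m / 2)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(noncentrality - z_crit)


if __name__ == "__main__":
    # Made-up planning values: 20 schools per arm, 25 students each.
    de = design_effect_two_level(n=25, rho=0.20)
    print("design effect:", round(de, 3))
    print("ceiling 1/sqrt(rho):", round(1 / math.sqrt(0.20), 3))
    print("SE (sigma_T = 1):", round(standard_error(1.0, 20, de), 3))
    print("approx power, delta = 0.30:", round(approx_power(0.30, 20, de), 3))
```

Writing the standard error as a quotient by the design effect makes it visible that adding students per school can only move the design effect toward its ceiling, while adding schools shrinks the standard error without limit.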
