Basic Experimental Design - Institute for Policy Research

Basic Experimental Design
Larry V. Hedges
Northwestern University
Prepared for the IES Summer Research
Training Institute July 8, 2008
What is Experimental Design?
Experimental design includes both
• Strategies for organizing data collection
• Data analysis procedures matched to those data
collection strategies
Classical treatments of design stress analysis procedures
based on the analysis of variance (ANOVA)
Other analysis procedures, such as those based on
hierarchical linear models or analysis of aggregates
(e.g., class or school means) are also appropriate
Why Do We Need Experimental Design?
Because of variability
We wouldn’t need a science of experimental design if
• All units (students, teachers, & schools) were identical
and
• All units responded identically to treatments
We need experimental design to control variability so that
treatment effects can be identified
A Little History
The idea of controlling variability through design has a long
history
In 1747 Sir James Lind’s studies of scurvy
Their cases were as similar as I could have them. They all in
general had putrid gums, spots and lassitude, with weakness of
their knees. They lay together in one place … and had one diet
common to all (Lind, 1753, p. 149)
Lind then assigned six different treatments to groups of
patients
A Little History
The idea of random assignment was not obvious and took
time to catch on
In 1648 van Helmont carried out one randomization in a
trial of bloodletting for fevers
In 1904 Karl Pearson suggested matching and alternation
in typhoid trials
Amberson, et al. (1931) carried out a trial with one
randomization
In 1937 Sir Bradford Hill advocated alternation of patients
in trials rather than randomization
Diehl, et al. (1938) carried out a trial that is sometimes
referred to as randomized, but it actually used alternation
A Little History
The first modern randomized clinical trial in
medicine is usually considered to be the
trial of streptomycin for treating
tuberculosis
It was conducted by the British Medical
Research Council in 1946 and reported in
1948
A Little History
Experiments have been used longer in the behavioral
sciences (e.g., psychophysics: Peirce and Jastrow,
1885)
Experiments conducted in laboratory settings were widely
used in educational psychology (e.g., McCall, 1923)
Thorndike (early 1900s)
Lindquist (1953)
Gage field experiments on teaching (1978 – 1984)
A Little History
Studies in crop variation I – VI (1921 – 1929)
In 1919 a statistician named Fisher was hired at
Rothamsted agricultural station
They had a lot of observational data on crop yields
and hoped a statistician could analyze it to find
effects of various treatments
All he had to do was sort out the effects of
confounding variables
Studies in Crop Variation I (1921)
Fisher does regression analyses—lots of them—to study
(and get rid of) the effects of confounders
• soil fertility gradients
• drainage
• effects of rainfall
• effects of temperature and weather, etc.
Fisher does qualitative work to sort out anomalies
Conclusion
The effects of confounders are typically larger than those
of the systematic effects we want to study
Studies in Crop Variation II (1923)
Fisher invents
• Basic principles of experimental design
• Control of variation by randomization
• Analysis of variance
Studies in Crop Variation IV and VI
Studies in Crop variation IV (1927)
Fisher invents analysis of covariance to combine
statistical control and control by randomization
Studies in crop variation VI (1929)
Fisher refines the theory of experimental design,
introducing most other key concepts known
today
Our Hero in 1929
Principles of Experimental Design
Experimental design controls background variability so that
systematic effects of treatments can be observed
Three basic principles
1. Control by matching
2. Control by randomization
3. Control by statistical adjustment
Their importance is in that order
Control by Matching
Known sources of variation may be eliminated by matching
Eliminating genetic variation
Compare animals from the same litter of mice
Eliminating district or school effects
Compare students within districts or schools
However matching is limited
• matching is only possible on observable characteristics
• perfect matching is not always possible
• matching inherently limits generalizability by removing (possibly
desired) variation
Control by Matching
Matching ensures that groups compared are alike
on specific known and observable
characteristics (in principle, everything we have
thought of)
Wouldn’t it be great if there were a method of
making groups alike on not only everything we
have thought of, but everything we didn’t think of
too?
There is such a method
Control by Randomization
Matching controls for the effects of variation due to specific
observable characteristics
Randomization controls for the effects of all (observable or
non-observable, known or unknown) characteristics
Randomization makes groups equivalent (on average) on
all variables (known and unknown, observable or not)
Randomization also gives us a way to assess whether
differences after treatment are larger than would be
expected due to chance.
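The last point can be illustrated with a randomization (permutation) test. This is a minimal sketch, not a procedure taken from the slides: the function name, the number of re-randomizations, and the toy outcome values are all assumptions made for illustration.

```python
import random
from statistics import mean

def randomization_test(treated, control, n_perm=2000, rng=None):
    """Approximate two-sided randomization p-value for a difference in means."""
    rng = rng or random.Random()
    observed = abs(mean(treated) - mean(control))
    pooled = list(treated) + list(control)
    n_t = len(treated)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # re-randomize which units count as "treated"
        diff = abs(mean(pooled[:n_t]) - mean(pooled[n_t:]))
        if diff >= observed:
            hits += 1
    # add-one correction keeps the estimated p-value strictly above zero
    return (hits + 1) / (n_perm + 1)

# Hypothetical outcomes; the fixed seed only makes the example repeatable
p_value = randomization_test([10, 11, 12, 13, 14, 15], [0, 1, 2, 3, 4, 5],
                             rng=random.Random(1))
```

The p-value is the share of re-randomizations that produce a difference at least as large as the one observed, which is the sense in which the random assignment itself supplies the reference distribution.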
Control by Randomization
Random assignment is not assignment with no
particular rule. It is a purposeful process
Assignment is made at random. This does not
mean that the experimenter writes down the
names of the varieties in any order that occurs to
him, but that he carries out a physical
experimental process of randomization, using
means which shall ensure that each variety will
have an equal chance of being tested on any
particular plot of ground (Fisher, 1935, p. 51)
Control by Randomization
Random assignment of schools or classrooms is not
assignment with no particular rule. It is a
purposeful process
Assignment of schools to treatments is made at
random. This does not mean that the
experimenter assigns schools to treatments in any
order that occurs to her, but that she carries out a
physical experimental process of randomization,
using means which shall ensure that each
treatment will have an equal chance of being
tested in any particular school (Hedges, 2007)
Control by Statistical Adjustment
Control by statistical adjustment is a form of pseudo-matching
It uses statistical relations to simulate matching
Statistical control is important for increasing precision but
should not be relied upon to control biases that may exist
prior to assignment
Statistical control is the weakest of the three experimental
design principles because its validity depends on
knowing a statistical model for responses
Using Principles of Experimental Design
You have to know a lot (be smart) to use matching
and statistical control effectively
You do not have to be smart to use randomization
effectively
But
Where all are possible, randomization is not as
efficient (requires larger sample sizes for the
same power) as matching or statistical control
Basic Ideas of Design:
Independent Variables (Factors)
The values of independent variables are called levels
Some independent variables can be manipulated, others
can’t
Treatments are independent variables that can be
manipulated
Blocks and covariates are independent variables that
cannot be manipulated
These concepts are simple, but are often confused
Remember:
You can randomly assign treatment levels but not blocks
Basic Ideas of Design (Crossing)
Relations between independent variables
Factors (treatments or blocks) are crossed if every level of
one factor occurs with every level of another factor
Example
The Tennessee class size experiment assigned students to
one of three class size conditions. All three treatment
conditions occurred within each of the participating
schools
Thus treatment was crossed with schools
Basic Ideas of Design (Nesting)
Factor B is nested in factor A if every level of factor B
occurs within only one level of factor A
Example
The Tennessee class size experiment actually assigned
classrooms to one of three class size conditions. Each
classroom occurred in only one treatment condition
Thus classrooms were nested within treatments
(But treatment was crossed with schools)
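The two definitions can be checked mechanically on a list of observed factor-level pairs. The sketch below is only illustrative: the function names and the toy labels (T1 to T3, S1, S2, c1 to c6) are hypothetical and are not data from the Tennessee experiment.

```python
from itertools import product

def is_crossed(pairs):
    """True if every level of the first factor occurs with every level of the second."""
    observed = set(pairs)
    levels_a = {a for a, _ in observed}
    levels_b = {b for _, b in observed}
    return observed == set(product(levels_a, levels_b))

def is_nested(pairs):
    """True if each level of the second factor occurs within only one level of the first."""
    parents = {}
    for a, b in pairs:
        parents.setdefault(b, set()).add(a)
    return all(len(levels) == 1 for levels in parents.values())

# Hypothetical (treatment, school) and (treatment, classroom) pairs
treatment_by_school = [("T1", "S1"), ("T2", "S1"), ("T3", "S1"),
                       ("T1", "S2"), ("T2", "S2"), ("T3", "S2")]
treatment_by_classroom = [("T1", "c1"), ("T2", "c2"), ("T3", "c3"),
                          ("T1", "c4"), ("T2", "c5"), ("T3", "c6")]
```

With these toy pairs, treatment is crossed with schools, while classrooms are nested within treatments.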
Where Do These Terms Come From?
(Nesting)
An agricultural experiment where blocks are literally blocks
or plots of land
Blocks       1    2    …    n
Treatment    T1   T2   …    T1
Here each block is literally nested within a treatment
condition
Where Do These Terms Come From?
(Crossing)
An agricultural experiment
Blocks    1         2         …    n
Plots     T1  T2    T2  T1    …    T1  T2
Blocks were literally blocks of land and plots
of land within blocks were assigned
different treatments
Here treatment literally crosses the blocks
Where Do These Terms Come From?
(Crossing)
The experiment is often depicted like this.
What is wrong with this as a field layout?
               Blocks
               1    2    …    n
Treatment 1    …    …    …    …
Treatment 2    …    …    …    …
Consider possible sources of bias
Think About These Designs
A study assigns a reading treatment (or control) to children
in 20 schools. Each child is classified into one of three
groups with different risk of reading failure.
A study assigns T or C to 20 teachers. The teachers are in
five schools, and each teacher teaches 4 science
classes
Two schools in each district are picked to participate. Each
school has two grade 4 teachers. One of them is
assigned to T, the other to C.
Three Basic Designs
The completely randomized design
Treatments are assigned to individuals
The randomized block design
Treatments are assigned to individuals within blocks
(This is sometimes called the matched design, because
individuals are matched within blocks)
The hierarchical design
Treatments are assigned to blocks, the same treatment
is assigned to all individuals in the block
The Completely Randomized Design
Individuals are randomly assigned to one of two treatments
Treatment        Control
Individual 1     Individual 1
Individual 2     Individual 2
…                …
Individual nT    Individual nC
The Randomized Block Design
              Block 1              …    Block m
Treatment 1   Individual 1         …    Individual 1
              …                         …
              Individual n1             Individual nm
Treatment 2   Individual n1 + 1    …    Individual nm + 1
              …                         …
              Individual 2n1            Individual 2nm
The Hierarchical Design
Treatment                               Control
Block 1          …    Block m           Block m+1          …    Block 2m
Individual 1          Individual 1      Individual 1            Individual 1
Individual 2          Individual 2      Individual 2            Individual 2
…                     …                 …                       …
Individual n1         Individual nm     Individual nm+1         Individual n2m
Randomization Procedures
Randomization has to be done as an explicit process
devised by the experimenter
• Haphazard is not the same as random
• Unknown assignment is not the same as random
• “Essentially random” is technically meaningless
• Alternation is not random, even if you alternate from a
random start
This is why R.A. Fisher was so explicit about randomization
processes
Randomization Procedures
R.A. Fisher on how to randomize an experiment with small
sample size and 5 treatments
A satisfactory method is to use a pack of cards
numbered from 1 to 100, and to arrange them in random
order by repeated shuffling. The varieties [treatments]
are numbered from 1 to 5, and any card such as the
number 33, for example is deemed to correspond to
variety [treatment] number 3, because on dividing by 5
this number is found as the remainder. (Fisher, 1935,
p.51)
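Fisher's card procedure translates directly into code. The following is a sketch under stated assumptions: the function names are hypothetical, and mapping a remainder of 0 to treatment 5 is an assumption made here so that every card corresponds to one of the five treatments (the quoted passage only gives the example of card 33).

```python
import random

def card_to_treatment(card, n_treatments=5):
    """Map a card number to a treatment via the remainder after division."""
    remainder = card % n_treatments
    # Assumption: a remainder of 0 is read as the last treatment
    return remainder if remainder != 0 else n_treatments

def fisher_card_assignment(n_plots, n_treatments=5, n_cards=100, rng=None):
    """Shuffle a numbered pack and read off one treatment per plot."""
    if n_plots > n_cards:
        raise ValueError("not enough cards for the number of plots")
    rng = rng or random.Random()
    cards = list(range(1, n_cards + 1))
    rng.shuffle(cards)  # stands in for repeated physical shuffling
    return [card_to_treatment(card, n_treatments) for card in cards[:n_plots]]

layout = fisher_card_assignment(n_plots=10)
```

Because 100 is a multiple of 5, each treatment is represented by exactly 20 cards, so each has the same chance of landing on any given plot.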
Randomization Procedures
You may want to use a table of random numbers, but be
sure to pick an arbitrary start point!
Beware random number generators: they typically depend
on seed values, so be sure to vary the seed value (if the
generator does not do so automatically)
Otherwise you can reliably generate the same sequence of
random numbers every time
It is no different than starting in the same place in a table of
random numbers
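The seed issue is easy to demonstrate with Python's standard `random` module. This is a small illustration rather than a recommendation of any particular generator; the helper name `draw` and the seed value 2008 are arbitrary choices made for the example.

```python
import random

def draw(seed, k=5):
    """Return k pseudo-random treatment labels (1 or 2) from a generator started at seed."""
    rng = random.Random(seed)
    return [rng.randint(1, 2) for _ in range(k)]

first_run = draw(seed=2008)
second_run = draw(seed=2008)   # same seed: the "random" sequence repeats exactly

# With no seed argument, Python seeds the generator from operating-system
# entropy (or the clock), so separate runs start in different places.
unseeded = random.Random()
fresh_labels = [unseeded.randint(1, 2) for _ in range(5)]
```

Reusing a seed is the software analogue of always starting at the same place in a printed table of random numbers.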
Randomization Procedures
Completely Randomized Design
(2 treatments, 2n individuals)
Make a list of all individuals
For each individual, pick a random number from 1 to 2 (odd
or even)
Assign the individual to treatment 1 if even, 2 if odd
When one treatment is assigned n individuals, stop
assigning more individuals to that treatment
Randomization Procedures
Completely Randomized Design
(pn individuals, p treatments)
Make a list of all individuals
For each individual, pick a random number from 1 to p
One way to do this is to get a random number of any
size, divide by p, the remainder R is between 0 and (p –
1), so add 1 to the remainder to get R + 1
Assign the individual to treatment R + 1
Stop assigning individuals to any treatment after it gets n
individuals
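The completely randomized procedure, for two or for p treatments, might be sketched as follows. The function name and roster are hypothetical. Two choices here are assumptions rather than part of the slides: "stop assigning" is read as "draw again when the chosen treatment is already full", and `randint(1, p)` is used in place of the divide-and-take-the-remainder step.

```python
import random

def completely_randomized(individuals, p, n, rng=None):
    """Assign p*n individuals to treatments 1..p with exactly n per treatment."""
    if len(individuals) != p * n:
        raise ValueError("expected exactly p * n individuals")
    rng = rng or random.Random()          # unseeded: a fresh seed on every run
    counts = {t: 0 for t in range(1, p + 1)}
    assignment = {}
    for person in individuals:
        t = rng.randint(1, p)             # random number from 1 to p
        while counts[t] >= n:             # treatment already full: draw again
            t = rng.randint(1, p)
        counts[t] += 1
        assignment[person] = t
    return assignment

roster = [f"student_{i}" for i in range(12)]   # hypothetical list of individuals
# The fixed seed is used only so that this example is repeatable
groups = completely_randomized(roster, p=3, n=4, rng=random.Random(2008))
```

Setting `p=2` gives the two-treatment case described earlier.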
Randomization Procedures
Randomized Block Design with 2 Treatments
(m blocks, 2n individuals per block)
Make a list of all individuals in the first block
For each individual, pick a random number from 1 to 2 (odd
or even)
Assign the individual to treatment 1 if even, 2 if odd
Stop assigning a treatment once it has been assigned n individuals in
the block
Repeat the same process with every block
Randomization Procedures
Randomized Block Design with p Treatments
(m blocks, pn individuals per block)
Make a list of all individuals in the first block
For each individual, pick a random number from 1 to p
Assign the individual to the treatment corresponding to that number
Stop assigning a treatment once it has been assigned n individuals in
the block
Repeat the same process with every block
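For the randomized block design the same within-list procedure is simply repeated inside every block. A minimal sketch, assuming blocks are given as a dictionary from a block name to its list of pn members; all names and the school rosters are hypothetical, and full treatments are handled by drawing again.

```python
import random

def randomize_block(members, p, n, rng):
    """Assign the p*n members of one block to treatments 1..p, n per treatment."""
    counts = {t: 0 for t in range(1, p + 1)}
    assignment = {}
    for person in members:
        t = rng.randint(1, p)
        while counts[t] >= n:             # treatment full in this block: draw again
            t = rng.randint(1, p)
        counts[t] += 1
        assignment[person] = t
    return assignment

def randomized_block_design(blocks, p, n, rng=None):
    """Repeat the within-block randomization independently in every block."""
    rng = rng or random.Random()
    for name, members in blocks.items():
        if len(members) != p * n:
            raise ValueError(f"block {name!r} must have p * n members")
    return {name: randomize_block(members, p, n, rng)
            for name, members in blocks.items()}

# Hypothetical data: 4 schools (blocks) with 6 students each
schools = {f"school_{j}": [f"s{j}_{k}" for k in range(6)] for j in range(4)}
plan = randomized_block_design(schools, p=2, n=3, rng=random.Random(7))
```

Every treatment then appears n times in every block, which is what makes treatment crossed with blocks.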
Randomization Procedures
Hierarchical Design with 2 Treatments
(m blocks per treatment, n individuals per block)
Make a list of all blocks
For each block, pick a random number from 1 to 2
Assign the block to treatment 1 if even, treatment 2 if odd
Stop assigning a treatment after it is assigned m blocks
Every individual in a block is assigned to the same
treatment
Randomization Procedures
Hierarchical Design with p Treatments
(m blocks per treatment, n individuals per block)
Make a list of all blocks
For each block, pick a random number from 1 to p
Assign the block to treatment corresponding to the number
Stop assigning a treatment after it is assigned m blocks
Every individual in a block is assigned to the same
treatment
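In the hierarchical design the unit that is randomized is the block, and individuals inherit their block's treatment. The sketch below assumes pm blocks in total (m per treatment); the function name and the school rosters are hypothetical, and a full treatment is handled by drawing again.

```python
import random

def hierarchical_design(blocks, p, m, rng=None):
    """Assign whole blocks to treatments 1..p, m blocks per treatment."""
    if len(blocks) != p * m:
        raise ValueError("expected exactly p * m blocks")
    rng = rng or random.Random()
    counts = {t: 0 for t in range(1, p + 1)}
    block_treatment = {}
    for name in blocks:
        t = rng.randint(1, p)
        while counts[t] >= m:             # treatment already has m blocks
            t = rng.randint(1, p)
        counts[t] += 1
        block_treatment[name] = t
    # Every individual in a block receives the block's treatment
    individual_treatment = {person: block_treatment[name]
                            for name, members in blocks.items()
                            for person in members}
    return block_treatment, individual_treatment

# Hypothetical data: 6 schools with 3 students each
schools = {f"school_{j}": [f"s{j}_{k}" for k in range(3)] for j in range(6)}
by_school, by_student = hierarchical_design(schools, p=2, m=3,
                                            rng=random.Random(11))
```

Here blocks are nested within treatments: no school contains students from more than one condition.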
Sampling Models
Sampling Models in Educational Research
Sampling models are often ignored in educational
research
But
Sampling is where the randomness comes from in
social research
Sampling therefore has profound consequences
for statistical analysis and research designs
Sampling Models in Educational Research
Simple random samples are rare in field research
Educational populations are hierarchically nested:
• Students in classrooms in schools
• Schools in districts in states
We usually exploit the population structure to sample
students by first sampling schools
Even then, most samples are not probability samples, but
they are intended to be representative (of some
population)
Sampling Models in Educational Research
Survey research calls this strategy multistage (multilevel)
clustered sampling
We often sample clusters (schools) first then individuals
within clusters (students within schools)
This is a two-stage (two-level) cluster sample
We might sample schools, then classrooms, then students
This is a three-stage (three-level) cluster sample
Precision of Estimates
Depends on the Sampling Model
Suppose the total population variance is σT2 and ICC is ρ
Consider two samples of size N = mn
A simple random sample or stratified sample
The variance of the mean is σT2/mn
A clustered sample of n students from each of m schools
The variance of the mean is (σT2/mn)[1 + (n – 1)ρ]
The inflation factor [1 + (n – 1)ρ] is called the design effect
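A short numerical sketch of the two-level design effect follows. The function names and the illustrative values (20 students per school, ICC of 0.2) are assumptions chosen for the example.

```python
def design_effect(n, rho):
    """Two-level design effect 1 + (n - 1) * rho for clusters of size n."""
    return 1 + (n - 1) * rho

def variance_of_mean(sigma2_T, m, n, rho=0.0, clustered=False):
    """Variance of the sample mean for N = m * n observations."""
    srs = sigma2_T / (m * n)              # simple random sample
    return srs * design_effect(n, rho) if clustered else srs

# Hypothetical values: 50 schools, 20 students each, ICC = 0.2
v_srs = variance_of_mean(1.0, m=50, n=20)
v_cluster = variance_of_mean(1.0, m=50, n=20, rho=0.2, clustered=True)
inflation = v_cluster / v_srs             # equals the design effect, about 4.8
```

With these values the clustered sample needs nearly five times as many students as a simple random sample to reach the same precision.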
Precision of Estimates
Depends on the Sampling Model
Suppose the population variance is σT2
School level ICC is ρS, class level ICC is ρC
Consider two samples of size N = mpn
A simple random sample or stratified sample
The variance of the mean is σT2/mpn
A clustered sample of n students from p classes in m
schools
The variance is (σT2/mpn)[1 + (pn – 1)ρS + (n – 1)ρC]
The three level design effect is [1 + (pn – 1)ρS + (n – 1)ρC]
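The three-level formula can be sketched the same way. Function names and the numerical values are illustrative assumptions.

```python
def design_effect_3level(n, p, rho_S, rho_C):
    """Three-level design effect 1 + (p*n - 1)*rho_S + (n - 1)*rho_C."""
    return 1 + (p * n - 1) * rho_S + (n - 1) * rho_C

def variance_of_mean_3level(sigma2_T, m, p, n, rho_S, rho_C):
    """Variance of the mean for n students in each of p classes in each of m schools."""
    return sigma2_T / (m * p * n) * design_effect_3level(n, p, rho_S, rho_C)

# Hypothetical values: 40 schools, 2 classes per school, 25 students per class
deff3 = design_effect_3level(n=25, p=2, rho_S=0.15, rho_C=0.05)
v3 = variance_of_mean_3level(1.0, m=40, p=2, n=25, rho_S=0.15, rho_C=0.05)
```

With one class per school (p = 1) and no class-level ICC, the expression reduces to the two-level design effect 1 + (n - 1)ρS.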
Precision of Estimates
Depends on the Sampling Model
Treatment effects in experiments and quasi-experiments are mean differences
Therefore precision of treatment effects
and statistical power will depend on the
sampling model
Sampling Models in Educational Research
The fact that the population is structured does not mean
the sample must be a clustered sample
Whether it is a clustered sample depends on:
• How the sample is drawn (e.g., are schools sampled first
then individuals randomly within schools)
• What the inferential population is (e.g., is the inference to
these schools studied or a larger population of schools)
Sampling Models in Educational Research
A necessary condition for a clustered sample is that it is
drawn in stages using population subdivisions
• schools then students within schools
• schools then classrooms then students
However, if all subdivisions in a population are present in
the sample, the sample is not clustered, but stratified
Stratification has different implications than clustering
Whether there is stratification or clustering depends on the
definition of the population to which we draw inferences
(the inferential population)
Sampling Models in Educational Research
The clustered/stratified distinction matters because it
influences the precision of statistics estimated from the
sample
If all population subdivisions are included in every
sample, there is no sampling (or exhaustive sampling) of
subdivisions
• therefore differences between subdivisions add no
uncertainty to estimates
If only some population subdivisions are included in the
sample, it matters which ones you happen to sample
• thus differences between subdivisions add to uncertainty
Inferential Population and Inference Models
The inferential population or inference model has
implications for analysis and therefore for the design of
experiments
Do we make inferences to the schools in this sample or to
a larger population of schools?
Inferences to the schools or classes in the sample are
called conditional inferences
Inferences to a larger population of schools or classes are
called unconditional inferences
Inferential Population and Inference Models
Note that the inferences (what we are estimating) are
different in conditional versus unconditional inference
models
• In a conditional inference, we are estimating the mean
(or treatment effect) in the observed schools
• In unconditional inference we are estimating the mean
(or treatment effect) in the population of schools from
which the observed schools are sampled
We are still estimating a mean (or a treatment effect) but
they are different parameters with different uncertainties
Fixed and Random Effects
When the levels of a factor (e.g., particular blocks
included) in a study are sampled and the
inference model is unconditional, that factor is
called random and its effects are called random
effects
When the levels of a factor (e.g., particular blocks
included) in a study constitute the entire
inference population and the inference model is
conditional, that factor is called fixed and its
effects are called fixed effects
Applications to Experimental Design
We will look in detail at the two most widely
used experimental designs in education
• Randomized blocks designs
• Hierarchical designs
Experimental Designs
For each design we will look at
• Structural Model for data (and what it means)
• Two inference models
– What does ‘treatment effect’ mean in principle
– What is the estimate of treatment effect
– How do we deal with context effects
• Two statistical analysis procedures
– How do we estimate and test treatment effects
– How do we estimate and test context effects
– What is the sensitivity of the tests
The Randomized Block Design
The population (the sampling frame)
We wish to compare two treatments
• We assign treatments within schools
• Many schools with 2n students in each
• Assign n students to each treatment in each
school
The Randomized Block Design
The experiment
Compare two treatments in an experiment
• We assign treatments within schools
• With m schools with 2n students in each
• Assign n students to each treatment in each
school
The Randomized Block Design
Diagram of the design
             Schools
Treatment    1    2    …    m
    1        …    …    …    …
    2        …    …    …    …
The Randomized Block Design
[Diagram repeated with School 1 highlighted]
The Conceptual Model
The statistical model for the observation on the kth person
in the jth school in the ith treatment is
Yijk = μ + αi + βj + αβij + εijk
where
μ is the grand mean,
αi is the average effect of being in treatment i,
βj is the average effect of being in school j,
αβij is the difference between the average effect of
treatment i and the effect of that treatment in school j,
εijk is a residual
Effect of Context
Yijk = μ + αi + βj + αβij + εijk
Context Effect
Two-level Randomized Block Design
With No Covariates (HLM Notation)
Level 1 (individual level)
Yijk = β0j + β1jTijk + εijk
ε ~ N(0, σW2)
Level 2 (school Level)
β0j = π00 + ξ0j
ξ0j ~ N(0, σS2)
β1j = π10 + ξ1j
ξ1j ~ N(0, σTxS2)
If we code the treatment Tijk = ½ or –½, then the
parameters are identical to those in standard ANOVA
Effects and Estimates
The population mean of treatment 1 in school j is
α1 + αβ1j
The population mean of treatment 2 in school j is
α2 + αβ2j
The estimate of the mean of treatment 1 in school j is
α1 + αβ1j + ε1j●
The estimate of the mean of treatment 2 in school j is
α2 + αβ2j + ε2j●
Effects and Estimates
The comparative treatment effect in any given school j is
(α1 – α2) + (αβ1j – αβ2j)
The estimate of comparative treatment effect in school j is
(α1 – α2) + (αβ1j – αβ2j) + (ε1j● – ε2j●)
The mean treatment effect in the experiment is
(α1 – α2) + (αβ1● – αβ2●)
The estimate of the mean treatment effect in the experiment is
(α1 – α2) + (αβ 1● – αβ2●) + (ε1●● – ε2●●)
Inference Models
Two different kinds of inferences about effects
Unconditional Inference (Schools Random)
Inference to the whole universe of schools
(requires a representative sample of schools)
Conditional Inference (Schools Fixed)
Inference to the schools in the experiment
(no sampling requirement on schools)
Statistical Analysis Procedures
Two kinds of statistical analysis procedures
Mixed Effects Procedures (Schools Random)
Treat schools in the experiment as a sample
from a population of schools
(only strictly correct if schools are a sample)
Fixed Effects Procedures (Schools Fixed)
Treat schools in the experiment as a population
Unconditional Inference
(Schools Random)
The estimate of the mean treatment effect in the experiment is
(α1 – α2) + (αβ 1● – αβ2●) + (ε1●● – ε2●●)
The average treatment effect we want to estimate is
(α1 – α2)
The term (ε1●● – ε2●●) depends on the students in the schools in the
sample
The term (αβ1● – αβ2●) depends on the schools in sample
Both (ε1●● – ε2●●) and (αβ1● – αβ2●) are random and average to 0
across students and schools, respectively
Conditional Inference
(Schools Fixed)
The estimate of the mean treatment effect in the
experiment is still
(α1 – α2) + (αβ 1● – αβ2●) + (ε1●● – ε2●●)
Now the average treatment effect we want to estimate is
(α1 + αβ1●) – (α2 + αβ2●) = (α1 – α2) + (αβ1● – αβ2●)
The term (ε1●● – ε2●●) depends on the students in the
schools in the sample
The term (αβ1● – αβ2●) depends on the schools in sample,
but the treatment effect in the sample of schools is the
effect we want to estimate
Expected Mean Squares
Randomized Block Design
(Two Levels, Schools Random)
Source           df            E{MS}
Treatment (T)    1             σW2 + nσTxS2 + nmΣαi2
Schools (S)      m – 1         σW2 + 2nσS2
TxS              m – 1         σW2 + nσTxS2
Within Cells     2m(n – 1)     σW2
Mixed Effects Procedures
(Schools Random)
The test for treatment effects has
H0: (α1 – α2) = 0
Estimated mean treatment effect in the experiment is
(α1 – α2) + (αβ1● – αβ2●) + (ε1●● – ε2●●)
The variance of the estimated treatment effect is
2[σW2 + nσTxS2] /mn = 2[1 + (nωS – 1)ρ]σ2/mn
Here ωS = σTxS2/σS2 and ρ = σS2/(σS2 + σW2) = σS2/σ2
Mixed Effects Procedures
The test for treatment effects:
FT = MST/MSTxS with (m – 1) df
The test for context effects (treatment by schools
interaction) is
FTxS = MSTxS/MSWS with 2m(n – 1) df
Power is determined by the operational effect size
[(α1 – α2)/σ] × √{n/[1 + (nωS – 1)ρ]}
where ωS = σTxS2/σS2 and ρ = σS2/(σS2 + σW2) = σS2/σ2
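The two expressions for the variance of the estimated treatment effect, and the operational effect size, can be checked numerically. This is a sketch with hypothetical variance components; the function names are not from the slides, and `delta` stands for the standardized effect (α1 – α2)/σ with σ2 = σS2 + σW2.

```python
from math import sqrt

def var_effect_rb_mixed(sigma2_W, sigma2_S, sigma2_TxS, m, n):
    """Variance of the treatment effect estimate, computed both ways."""
    direct = 2 * (sigma2_W + n * sigma2_TxS) / (m * n)
    sigma2 = sigma2_S + sigma2_W          # total variance
    rho = sigma2_S / sigma2               # intraclass correlation
    omega_S = sigma2_TxS / sigma2_S       # treatment-by-school variance ratio
    via_rho = 2 * (1 + (n * omega_S - 1) * rho) * sigma2 / (m * n)
    return direct, via_rho

def operational_effect_size_rb(delta, n, rho, omega_S):
    """delta * sqrt(n / (1 + (n * omega_S - 1) * rho))."""
    return delta * sqrt(n / (1 + (n * omega_S - 1) * rho))

# Hypothetical variance components: 20 schools, 30 students per cell
direct, via_rho = var_effect_rb_mixed(0.8, 0.2, 0.05, m=20, n=30)
es = operational_effect_size_rb(delta=0.2, n=30, rho=0.2, omega_S=0.25)
```

The two variance forms agree because σW2 + nσTxS2 = σ2[1 + (nωS – 1)ρ] once ρ and ωS are substituted.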
Expected Mean Squares
Randomized Block Design
(Two Levels, Schools Fixed)
Source           df            E{MS}
Treatment (T)    1             σW2 + nmΣαi2
Schools (S)      m – 1         σW2 + 2nΣβj2/(m – 1)
TxS              m – 1         σW2 + nΣΣαβij2/(m – 1)
Within Cells     2m(n – 1)     σW2
Fixed Effects Procedures
The test for treatment effects has
H0: (α1 – α2) + (αβ1● – αβ2●) = 0
Estimated mean treatment effect in the experiment is
(α1 – α2) + (αβ1● – αβ2●) + (ε1●● – ε2●●)
The variance of the estimated treatment effect is
2σW2/mn
Fixed Effects Procedures
The test for treatment effects:
FT = MST/MSWS with 2m(n – 1) df
The test for context effects (treatment by schools interaction) is
FC = MSTxS/MSWS with 2m(n – 1) df
Power is determined by the operational effect size
{[(α1 – α2) + (αβ1● – αβ2●)]/σ} × √n
with 2m(n – 1) df
Comparing Fixed and Mixed Effects
Statistical Procedures
(Randomized Block Design)
                          Fixed                                     Mixed
Inference Model           Conditional                               Unconditional
Estimand                  (α1 – α2) + (αβ1● – αβ2●)                 (α1 – α2)
Contaminating Factors     (ε1●● – ε2●●)                             (αβ1● – αβ2●) + (ε1●● – ε2●●)
Operational Effect Size   {[(α1 – α2) + (αβ1● – αβ2●)]/σ} × √n      [(α1 – α2)/σ] × √{n/[1 + (nωS – 1)ρ]}
df                        2m(n – 1)                                 (m – 1)
Power                     higher                                    lower
Comparing Fixed and Mixed Effects Procedures
(Randomized Block Design)
Conditional and unconditional inference models
• estimate different treatment effects
• have different contaminating factors that add uncertainty
Mixed procedures are good for unconditional inference
The fixed procedures are good for conditional inference
The fixed procedures have higher power
The Hierarchical Design
The universe (the sampling frame)
We wish to compare two treatments
• We assign treatments to whole schools
• Many schools with n students in each
• Assign all students in each school to the
same treatment
The Hierarchical Design
The experiment
We wish to compare two treatments
• We assign treatments to whole schools
• 2m schools with n students in each
• Assign all students in each school to the
same treatment
The Hierarchical Design
Diagram of the experiment
Schools      1    2    …    m    m+1    m+2    …    2m
Treatment    1    1    …    1     2      2     …     2
The Hierarchical Design
Treatment 1 schools: 1, 2, …, m
Treatment 2 schools: m+1, m+2, …, 2m
The Conceptual Model
The statistical model for the observation on the kth person in the jth
school in the ith treatment is
Yijk = μ + αi + βj + αβij + εjk(i) = μ + αi + βj(i) + εjk(i)
μ is the grand mean,
αi is the average effect of being in treatment i,
βj is the average effect of being in school j,
αβij is the difference between the average effect of treatment i and the
effect of that treatment in school j,
εjk(i) is a residual
Or βj(i) = βj + αβij is a term for the combined effect of schools within
treatments
The Conceptual Model
The statistical model for the observation on the kth person in the jth
school in the ith treatment is
Yijk = μ + αi + βj + αβij + εjk(i) = μ + αi + βj(i) + εjk(i)
Context Effects
Two-level Hierarchical Design
With No Covariates (HLM Notation)
Level 1 (individual level)
Yijk = β0j + εijk
ε ~ N(0, σW2)
Level 2 (school Level)
β0j = π00 + π01Tj + ξ0j
ξ ~ N(0, σS2)
If we code the treatment Tj = ½ or –½, then
π00 = μ, π01 = α1, ξ0j = βj(i)
The intraclass correlation is ρ = σS2/(σS2 + σW2) = σS2/σ2
Effects and Estimates
The comparative treatment effect in any given school j is still
(α1 – α2) + (αβ1j – αβ2j)
But we cannot estimate the treatment effect in a single school because
each school gets only one treatment
The mean treatment effect in the experiment is
(α1 – α2) + (β●(1) – β●(2))
= (α1 – α2) +(β1● – β2● )+ (αβ1● – αβ2●)
The estimate of the mean treatment effect in the experiment is
(α1 – α2) + (β● (1) – β● (2)) + (ε1●● – ε2●●)
Inference Models
Two different kinds of inferences about effects
(as in the randomized block design)
Unconditional Inference (schools random)
Inference to the whole universe of schools
(requires a representative sample of schools)
Conditional Inference (schools fixed)
Inference to the schools in the experiment
(no sampling requirement on schools)
Unconditional Inference
(Schools Random)
The average treatment effect we want to estimate is
(α1 – α2)
The term (ε1●● – ε2●●) depends on the students in the
schools in the sample
The term (β●(1) – β●(2)) depends on the schools in sample
Both (ε1●● – ε2●●) and (β●(1) – β●(2)) are random and
average to 0 across students and schools, respectively
Conditional Inference
(Schools Fixed)
The average treatment effect we want to (can) estimate is
(α1 + β●(1)) – (α2 + β●(2)) = (α1 – α2) + (β●(1) – β●(2))
= (α1 – α2) + (β1● – β2● )+ (αβ1● – αβ2●)
The term (β●(1) – β●(2)) depends on the schools in sample,
but we want to estimate the effect of treatment in the
schools in the sample
Note that this treatment effect is not quite the same as in
the randomized block design, where we estimate
(α1 – α2) + (αβ1● – αβ2●)
Statistical Analysis Procedures
Two kinds of statistical analysis procedures
(as in the randomized block design)
Mixed Effects Procedures
Treat schools in the experiment as a sample
from a universe
Fixed Effects Procedures
Treat schools in the experiment as a universe
Expected Mean Squares
Hierarchical Design
(Two Levels, Schools Random)
Source            df            E{MS}
Treatment (T)     1             σW2 + nσS2 + nmΣαi2
Schools (S)       2(m – 1)      σW2 + nσS2
Within Schools    2m(n – 1)     σW2
Mixed Effects Procedures
(Schools Random)
The test for treatment effects has
H0: (α1 – α2) = 0
Estimated mean treatment effect in the experiment is
(α1 – α2) + (β●(1) – β●(2)) + (ε1●● – ε2●●)
The variance of the estimated treatment effect is
2[σW2 + nσS2] /mn = 2[1 + (n – 1)ρ]σ2/mn
where ρ = σS2/(σS2 + σW2) = σS2/σ2
Mixed Effects Procedures
(Schools Random)
The test for treatment effects:
FT = MST/MSBS with 2(m – 1) df
There is no omnibus test for context effects
Power is determined by the operational effect size
[(α1 – α2)/σ] × √{n/[1 + (n – 1)ρ]}
where ρ = σS2/(σS2 + σW2) = σS2/σ2
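For the hierarchical design the same kind of numerical check applies. The sketch below uses hypothetical variance components and function names; `delta` stands for (α1 – α2)/σ with σ2 = σS2 + σW2.

```python
from math import sqrt

def var_effect_hier_mixed(sigma2_W, sigma2_S, m, n):
    """Variance of the treatment effect estimate (m schools per treatment), both forms."""
    direct = 2 * (sigma2_W + n * sigma2_S) / (m * n)
    sigma2 = sigma2_S + sigma2_W          # total variance
    rho = sigma2_S / sigma2               # intraclass correlation
    via_rho = 2 * (1 + (n - 1) * rho) * sigma2 / (m * n)
    return direct, via_rho

def operational_effect_size_hier(delta, n, rho):
    """delta * sqrt(n / (1 + (n - 1) * rho))."""
    return delta * sqrt(n / (1 + (n - 1) * rho))

# Hypothetical values: 20 schools per treatment, 30 students per school
direct, via_rho = var_effect_hier_mixed(0.8, 0.2, m=20, n=30)
es = operational_effect_size_hier(delta=0.2, n=30, rho=0.2)
```

When ρ = 0 the operational effect size is δ√n; any positive ICC shrinks it, which is why clustering costs power.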
Expected Mean Squares
Hierarchical Design
(Two Levels, Schools Fixed)
Source            df            E{MS}
Treatment (T)     1             σW2 + nmΣ(αi + β●(i))2
Schools (S)       2(m – 1)      σW2 + nΣΣβj(i)2/2(m – 1)
Within Schools    2m(n – 1)     σW2
Fixed Effects Procedures
(Schools Fixed)
The test for treatment effects has
H0: (α1 – α2) + (β●(1) – β●(2)) = 0
Note that the school effects are confounded with treatment effects
Estimated mean treatment effect in the experiment is
(α1 – α2) + (β●(1) – β●(2)) + (ε1●● – ε2●●)
The variance of the estimated treatment effect is
2σW2/mn
Fixed Effects Procedures
(Schools Fixed)
The test for treatment effects:
FT = MST/MSWS with 2m(n – 1) df
There is no omnibus test for context effects,
because each school gets only one treatment
Power is determined by the operational effect size
{[(α1 – α2) + (β●(1) – β●(2))]/σ} × √n
and 2m(n – 1) df
Comparing Fixed and Mixed Effects Procedures
(Hierarchical Design)
                          Fixed                                     Mixed
Inference Model           Conditional                               Unconditional
Estimand                  (α1 – α2) + (β●(1) – β●(2))               (α1 – α2)
Contaminating Factors     (ε1●● – ε2●●)                             (β●(1) – β●(2)) + (ε1●● – ε2●●)
Effect Size               {[(α1 – α2) + (β●(1) – β●(2))]/σ} × √n    [(α1 – α2)/σ] × √{n/[1 + (n – 1)ρ]}
df                        2m(n – 1)                                 2(m – 1)
Power                     higher                                    lower
Comparing Fixed and Mixed Effects
Statistical Procedures (Hierarchical Design)
Conditional and unconditional inference models
• estimate different treatment effects
• have different contaminating factors that add uncertainty
Mixed procedures are good for unconditional inference
The fixed procedures are not generally recommended
The fixed procedures have higher power
Comparing Hierarchical Designs to
Randomized Block Designs
Randomized block designs usually have higher power, but
assignment of different treatments within schools or
classes may be
• practically difficult
• politically infeasible
• theoretically impossible
It may be methodologically unwise because of potential for
• Contamination or diffusion of treatments
• compensatory rivalry or demoralization
Applications to Experimental Design
We will address the two most widely used experimental
designs in education
• Randomized block designs with 2 levels
• Randomized block designs with 3 levels
• Hierarchical designs with 2 levels
• Hierarchical designs with 3 levels
We also examine the effect of covariates
Hereafter, we generally take schools to be random
Complications
Which matchings do we have to take into account in design
(e.g., schools, districts, regions, states, regions of the
country, country)?
Ignore some, control for effects of others as fixed blocking
factors
Justify this as part of the population definition
For example, we define the inference population as these
five districts within these two states
But, doing so obviously constrains generalizability
Precision of the Estimated Treatment Effect
Precision is the standard error of the estimated treatment
effect
Precision in simple (simple random sample) designs
depends on:
• Standard deviation in the population σ
• Total sample size N
The precision is
$$SE = \frac{2\sigma}{\sqrt{N}}$$
Precision of the Estimated Treatment Effect
Precision in complex (clustered sample) designs depends
on:
• The (total) standard deviation σT
• Sample size at each level of sampling
(e.g., m clusters, n individuals per cluster)
• Intraclass correlation structure
It is a little harder to compute than in simple designs, but
important because it helps you see what matters in
design
Intraclass Correlations in
Two-level Designs
In two-level designs the intraclass correlation structure is
determined by a single intraclass correlation
This intraclass correlation is the proportion of the total
variance that is between schools (clusters)
$$\rho = \frac{\sigma_S^2}{\sigma_S^2 + \sigma_W^2} = \frac{\sigma_S^2}{\sigma_T^2}$$
Precision in Two-level Hierarchical Design
With No Covariates
The standard error of the treatment effect is
$$SE = \sigma_T\sqrt{\left(\frac{2}{m}\right)\left(\frac{1 + (n-1)\rho}{n}\right)}$$
SE decreases as m (number of schools) increases
SE decreases as n increases, but only up to a point
SE increases as $\rho$ increases
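The behavior described above can be seen numerically. The following is a minimal Python sketch of the two-level hierarchical standard error, $SE = \sigma_T\sqrt{(2/m)(1 + (n-1)\rho)/n}$; the function name and the input values are illustrative assumptions.

```python
from math import sqrt


def se_two_level_hierarchical(sigma_t, m, n, rho):
    """SE of the treatment effect with m schools per treatment and n students per school."""
    return sigma_t * sqrt((2.0 / m) * (1.0 + (n - 1) * rho) / n)


# Illustrative values only: sigma_T = 1, rho = 0.2.
base = se_two_level_hierarchical(1.0, m=20, n=25, rho=0.2)
more_students = se_two_level_hierarchical(1.0, m=20, n=250, rho=0.2)  # ten times the students
more_schools = se_two_level_hierarchical(1.0, m=40, n=25, rho=0.2)    # twice the schools

print(base, more_students, more_schools)
```

In this example, doubling the number of schools reduces the SE more than a tenfold increase in students per school.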
Statistical Power
Power in simple (simple random sample) designs depends
on:
• Significance level
• Effect size
• Sample size
Look power up in a table for sample size and effect size
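Fragment of Cohen’s Table 2.3.5

| n | d = 0.10 | 0.20 | … | 0.80 | 1.00 | 1.20 | 1.40 |
|---|---|---|---|---|---|---|---|
| 8 | 05 | 07 | … | 31 | 46 | 60 | 73 |
| 9 | 06 | 07 | … | 35 | 51 | 65 | 79 |
| 10 | 06 | 07 | … | 39 | 56 | 71 | 84 |
| 11 | 06 | 07 | … | 43 | 63 | 76 | 87 |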
Computing Statistical Power
Power in complex (clustered sample) designs depends on:
• Significance level
• Effect size δ
• Sample size at each level of sampling
(e.g., m clusters, n individuals per cluster)
• Intraclass correlation structure
This makes it seem a lot harder to compute
Computing Statistical Power
Computing statistical power in complex designs is only a
little harder than computing it for simple designs
Compute operational effect size (incorporates sample
design information) $\Delta^T$
Look power up in a table for operational sample size and
operational effect size
This is the same table that you use for simple designs
Power in Two-level Hierarchical Design
With No Covariates
Basic Idea:
Operational Effect Size = (Effect Size) x (Design Effect)
$\Delta^T = \delta \times (\text{Design Effect})$
For the two-level hierarchical design with no covariates
$$\Delta^T = \delta\sqrt{\frac{n}{1 + (n-1)\rho}}$$
Operational sample size is number of schools (clusters)
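A minimal Python sketch of this recipe follows. The operational effect size is the formula from the slide, $\Delta^T = \delta\sqrt{n/(1 + (n-1)\rho)}$. The power step is an assumption made for illustration: instead of a table lookup it uses a normal approximation to a two-sided, two-group test with m schools per group, so it will differ somewhat from tabled values when m is small. Function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def operational_effect_size(delta, n, rho):
    """Effect size times the design effect for the two-level hierarchical design."""
    return delta * sqrt(n / (1.0 + (n - 1) * rho))


def approx_power(delta_op, m, alpha=0.05):
    """Normal approximation to two-sided power with m units per group.

    Assumption: this replaces the table lookup and ignores the t distribution's
    heavier tails, so it tends to be optimistic for small m.
    """
    z = NormalDist()
    noncentrality = delta_op * sqrt(m / 2.0)
    return z.cdf(noncentrality - z.inv_cdf(1.0 - alpha / 2.0))


# Illustrative values only.
delta_op = operational_effect_size(delta=0.25, n=25, rho=0.2)
power_20 = approx_power(delta_op, m=20)
power_40 = approx_power(delta_op, m=40)
print(delta_op, power_20, power_40)
```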
Power in Two-level Hierarchical Design
With No Covariates
As m (number of schools) increases, power increases
As effect size increases, power increases
Other influences occur through the design effect
$$\sqrt{\frac{n}{1 + (n-1)\rho}} = \sqrt{\frac{1}{\frac{1}{n} + \left(1 - \frac{1}{n}\right)\rho}}$$
As $\rho$ increases the design effect (and power) decreases
No matter how large n gets the maximum design effect is
$$\frac{1}{\sqrt{\rho}}$$
Thus power only increases up to some limit as n increases
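The ceiling can be illustrated numerically: the design effect $\sqrt{n/(1 + (n-1)\rho)}$ rises with n but never exceeds $1/\sqrt{\rho}$. A minimal Python sketch, with a hypothetical function name and an illustrative $\rho$:

```python
from math import sqrt


def design_effect(n, rho):
    """Design effect for the two-level hierarchical design with no covariates."""
    return sqrt(n / (1.0 + (n - 1) * rho))


rho = 0.2                   # illustrative intraclass correlation
ceiling = 1.0 / sqrt(rho)   # limit of the design effect as n grows without bound
values = [design_effect(n, rho) for n in (5, 20, 100, 1000)]

for n, v in zip((5, 20, 100, 1000), values):
    print(n, round(v, 4), "ceiling:", round(ceiling, 4))
```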
Two-level Hierarchical Design
With Covariates (HLM Notation)
Level 1 (individual level)
$Y_{ijk} = \beta_{0j} + \beta_{1j}X_{ijk} + \varepsilon_{ijk}$
$\varepsilon \sim N(0, \sigma_{AW}^2)$
Level 2 (school level)
$\beta_{0j} = \pi_{00} + \pi_{01}T_j + \pi_{02}W_j + \xi_{0j}$
$\beta_{1j} = \pi_{10}$
$\xi \sim N(0, \sigma_{AS}^2)$
Note that the covariate effect $\beta_{1j} = \pi_{10}$ is a fixed effect
If we code the treatment $T_j = \tfrac{1}{2}$ or $-\tfrac{1}{2}$, then the parameters
are identical to those in standard ANCOVA
Precision in Two-level Hierarchical Design
With Covariates
The standard error of the treatment effect
$$SE = \sigma_T\sqrt{\left(\frac{2}{m}\right)\left(\frac{1 + (n-1)\rho - \left[R_W^2 + (nR_S^2 - R_W^2)\rho\right]}{n}\right)}$$
SE decreases as m increases
SE decreases as n increases, but only up to a point
SE increases as $\rho$ increases
SE decreases as $R_W^2$ and $R_S^2$ increase
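A minimal Python sketch of the covariate-adjusted standard error, $SE = \sigma_T\sqrt{(2/m)\,(1 + (n-1)\rho - [R_W^2 + (nR_S^2 - R_W^2)\rho])/n}$, shows how much the covariates help. The function name and inputs are illustrative assumptions.

```python
from math import sqrt


def se_two_level_with_covariates(sigma_t, m, n, rho, r2_w, r2_s):
    """SE of the treatment effect with covariates at the student (R_W^2) and school (R_S^2) levels."""
    numerator = 1.0 + (n - 1) * rho - (r2_w + (n * r2_s - r2_w) * rho)
    return sigma_t * sqrt((2.0 / m) * numerator / n)


# Illustrative values only.
no_covariates = se_two_level_with_covariates(1.0, 20, 25, 0.2, r2_w=0.0, r2_s=0.0)
school_covariate = se_two_level_with_covariates(1.0, 20, 25, 0.2, r2_w=0.0, r2_s=0.5)
both_levels = se_two_level_with_covariates(1.0, 20, 25, 0.2, r2_w=0.5, r2_s=0.5)
print(no_covariates, school_covariate, both_levels)
```

When $R_W^2 = R_S^2 = R^2$, the numerator factors as $(1 - R^2)(1 + (n-1)\rho)$, so the SE shrinks by the familiar factor $\sqrt{1 - R^2}$.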
Power in Two-level Hierarchical Design
With Covariates
Basic Idea:
Operational Effect Size = (Effect Size) x (Design Effect)
$\Delta^T = \delta \times (\text{Design Effect})$
For the two-level hierarchical design with covariates
$$\Delta_A^T = \delta\sqrt{\frac{n}{1 + (n-1)\rho - \left[R_W^2 + (nR_S^2 - R_W^2)\rho\right]}}$$
The covariates increase the design effect
Power in Two-level Hierarchical Design
With Covariates
As m and effect size increase, power increases
Other influences occur through the design effect
$$\sqrt{\frac{n}{1 + (n-1)\rho - \left[R_W^2 + (nR_S^2 - R_W^2)\rho\right]}}$$
As $\rho$ increases the design effect (and power) decreases
Now the maximum design effect as n gets large is
$$\frac{1}{\sqrt{(1 - R_S^2)\rho}}$$
As the covariate-outcome correlations $R_W^2$ and $R_S^2$
increase the design effect (and power) increases
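A minimal Python sketch of the design effect with covariates, $\sqrt{n/(1 + (n-1)\rho - [R_W^2 + (nR_S^2 - R_W^2)\rho])}$, and its large-n limit $1/\sqrt{(1 - R_S^2)\rho}$. Names and values are illustrative assumptions.

```python
from math import sqrt


def design_effect_with_covariates(n, rho, r2_w, r2_s):
    """Design effect for the two-level hierarchical design with covariates."""
    denominator = 1.0 + (n - 1) * rho - (r2_w + (n * r2_s - r2_w) * rho)
    return sqrt(n / denominator)


rho, r2_w, r2_s = 0.2, 0.3, 0.5                 # illustrative values
ceiling = 1.0 / sqrt((1.0 - r2_s) * rho)        # limit as n grows without bound

unadjusted = design_effect_with_covariates(20, rho, 0.0, 0.0)
adjusted = design_effect_with_covariates(20, rho, r2_w, r2_s)
very_large_n = design_effect_with_covariates(10**6, rho, r2_w, r2_s)
print(unadjusted, adjusted, very_large_n, ceiling)
```

Only the school-level $R_S^2$ appears in the limit: with many students per school, the student-level covariate no longer matters.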
Three-level Hierarchical Design
Here there are three factors
• Treatment
• Schools (clusters) nested in treatments
• Classes (subclusters) nested in schools
Suppose there are
• m schools (clusters) per treatment
• p classes (subclusters) per school (cluster)
• n students (individuals) per class (subcluster)
Three-level Hierarchical Design
With No Covariates
The statistical model for the observation on the lth person in
the kth class in the jth school in the ith treatment is
$Y_{ijkl} = \mu + \alpha_i + \beta_{j(i)} + \gamma_{k(ij)} + \varepsilon_{ijkl}$
where
$\mu$ is the grand mean,
$\alpha_i$ is the average effect of being in treatment i,
$\beta_{j(i)}$ is the average effect of being in school j, in treatment i,
$\gamma_{k(ij)}$ is the average effect of being in class k in treatment i, in
school j,
$\varepsilon_{ijkl}$ is a residual
Three-level Hierarchical Design
With No Covariates (HLM Notation)
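Level 1 (individual level)
$Y_{ijkl} = \beta_{0jk} + \varepsilon_{ijkl}$
$\varepsilon \sim N(0, \sigma_W^2)$
Level 2 (classroom level)
$\beta_{0jk} = \gamma_{0j} + \eta_{0jk}$
$\eta \sim N(0, \sigma_C^2)$
Level 3 (school level)
$\gamma_{0j} = \pi_{00} + \pi_{01}T_j + \xi_{0j}$
$\xi \sim N(0, \sigma_S^2)$
If we code the treatment $T_j = \tfrac{1}{2}$ or $-\tfrac{1}{2}$, then
$\pi_{00} = \mu$, $\pi_{01} = \alpha_1$, $\xi_{0j} = \beta_{j(i)}$, $\eta_{0jk} = \gamma_{k(ij)}$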
Three-level Hierarchical Design
Intraclass Correlations
In three-level designs there are two levels of clustering and
two intraclass correlations
At the school (cluster) level
$$\rho_S = \frac{\sigma_S^2}{\sigma_S^2 + \sigma_C^2 + \sigma_W^2} = \frac{\sigma_S^2}{\sigma_T^2}$$
At the classroom (subcluster) level
$$\rho_C = \frac{\sigma_C^2}{\sigma_S^2 + \sigma_C^2 + \sigma_W^2} = \frac{\sigma_C^2}{\sigma_T^2}$$
Precision in Three-level Hierarchical Design
With No Covariates
The standard error of the treatment effect
$$SE = \sigma_T\sqrt{\left(\frac{2}{m}\right)\left(\frac{1 + (pn-1)\rho_S + (n-1)\rho_C}{pn}\right)}$$
SE decreases as m increases
SE decreases as p and n increase, but only up to a point
SE increases as $\rho_S$ and $\rho_C$ increase
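A minimal Python sketch of the three-level calculation: it derives $\rho_S$ and $\rho_C$ from variance components and then evaluates $SE = \sigma_T\sqrt{(2/m)(1 + (pn-1)\rho_S + (n-1)\rho_C)/(pn)}$. Function names and variance components are illustrative assumptions.

```python
from math import sqrt


def three_level_iccs(sigma_s2, sigma_c2, sigma_w2):
    """School-level and classroom-level intraclass correlations."""
    total = sigma_s2 + sigma_c2 + sigma_w2
    return sigma_s2 / total, sigma_c2 / total


def se_three_level_hierarchical(sigma_t, m, p, n, rho_s, rho_c):
    """SE with m schools per treatment, p classes per school, n students per class."""
    factor = (1.0 + (p * n - 1) * rho_s + (n - 1) * rho_c) / (p * n)
    return sigma_t * sqrt((2.0 / m) * factor)


# Illustrative variance components only.
rho_s, rho_c = three_level_iccs(sigma_s2=0.15, sigma_c2=0.10, sigma_w2=0.75)
se = se_three_level_hierarchical(1.0, m=20, p=4, n=25, rho_s=rho_s, rho_c=rho_c)
print(rho_s, rho_c, se)
```

With one class per school (p = 1) and $\rho_C = 0$, the expression reduces to the two-level hierarchical formula.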
Power in Three-level Hierarchical Design
With No Covariates
Basic Idea:
Operational Effect Size = (Effect Size) x (Design Effect)
$\Delta^T = \delta \times (\text{Design Effect})$
For the three-level hierarchical design with no covariates
$$\Delta^T = \delta\sqrt{\frac{pn}{1 + (pn-1)\rho_S + (n-1)\rho_C}}$$
The operational sample size is the number of schools
Power in Three-level Hierarchical Design
With No Covariates
As m and the effect size increase, power increases
Other influences occur through the design effect
$$\sqrt{\frac{pn}{1 + (pn-1)\rho_S + (n-1)\rho_C}}$$
As $\rho_S$ or $\rho_C$ increases the design effect decreases
No matter how large n gets the maximum design effect is
$$\frac{1}{\sqrt{\rho_S + \frac{1}{p}\rho_C}}$$
Thus power only increases up to some limit as n increases
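The three-level ceiling can be checked the same way. A minimal Python sketch, assuming the design effect $\sqrt{pn/(1 + (pn-1)\rho_S + (n-1)\rho_C)}$ and its large-n limit $1/\sqrt{\rho_S + \rho_C/p}$; names and values are illustrative.

```python
from math import sqrt


def design_effect_three_level(p, n, rho_s, rho_c):
    """Design effect for the three-level hierarchical design with no covariates."""
    return sqrt(p * n / (1.0 + (p * n - 1) * rho_s + (n - 1) * rho_c))


p, rho_s, rho_c = 4, 0.15, 0.10              # illustrative values
ceiling = 1.0 / sqrt(rho_s + rho_c / p)      # limit as n grows with p held fixed

values = [design_effect_three_level(p, n, rho_s, rho_c) for n in (5, 25, 100, 1000)]
print(values, ceiling)
```

Adding classes per school (larger p) raises the ceiling, but the school-level $\rho_S$ still bounds it.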
Three-level Hierarchical Design
With Covariates (HLM Notation)
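Level 1 (individual level)
$Y_{ijkl} = \beta_{0jk} + \beta_{1jk}X_{ijkl} + \varepsilon_{ijkl}$
$\varepsilon \sim N(0, \sigma_{AW}^2)$
Level 2 (classroom level)
$\beta_{0jk} = \gamma_{00j} + \gamma_{01j}Z_{jk} + \eta_{0jk}$
$\beta_{1jk} = \gamma_{10j}$
$\eta \sim N(0, \sigma_{AC}^2)$
Level 3 (school level)
$\gamma_{00j} = \pi_{00} + \pi_{01}T_j + \pi_{02}W_j + \xi_{0j}$
$\gamma_{01j} = \pi_{01}$
$\gamma_{10j} = \pi_{10}$
$\xi \sim N(0, \sigma_{AS}^2)$
The covariate effects $\beta_{1jk} = \gamma_{10j} = \pi_{10}$ and $\gamma_{01j} = \pi_{01}$ are fixed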
Precision in Three-level Hierarchical Design
With Covariates
The standard error of the treatment effect
$$SE = \sigma_T\sqrt{\frac{2}{m}} \times \sqrt{\frac{1 + (pn-1)\rho_S + (n-1)\rho_C - \left[R_W^2 + (pnR_S^2 - R_W^2)\rho_S + (nR_C^2 - R_W^2)\rho_C\right]}{pn}}$$
SE decreases as m increases
SE decreases as p and n increase, but only up to a point
SE increases as $\rho_S$ and $\rho_C$ increase
SE decreases as $R_W^2$, $R_C^2$, and $R_S^2$ increase
Power in Three-level Hierarchical Design
With Covariates
Basic Idea:
Operational Effect Size = (Effect Size) x (Design Effect)
$\Delta^T = \delta \times (\text{Design Effect})$
For the three-level hierarchical design with covariates
$$\Delta_A^T = \delta\sqrt{\frac{pn}{1 + (pn-1)\rho_S + (n-1)\rho_C - \left[R_W^2 + (pnR_S^2 - R_W^2)\rho_S + (nR_C^2 - R_W^2)\rho_C\right]}}$$
The operational sample size is the number of schools
Power in Three-level Hierarchical Design
With Covariates
As m and the effect size increase, power increases
Other influences occur through the design effect
$$\sqrt{\frac{pn}{1 + (pn-1)\rho_S + (n-1)\rho_C - \left[R_W^2 + (pnR_S^2 - R_W^2)\rho_S + (nR_C^2 - R_W^2)\rho_C\right]}}$$
As $\rho_S$ or $\rho_C$ increases the design effect decreases
No matter how large n gets the maximum design effect is
$$\frac{1}{\sqrt{(1 - R_S^2)\rho_S + \frac{1}{p}(1 - R_C^2)\rho_C}}$$
Thus power only increases up to some limit as n increases
Randomized Block Designs
Two-level Randomized Block Design
With No Covariates (HLM Notation)
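Level 1 (individual level)
$Y_{ijk} = \beta_{0j} + \beta_{1j}T_{ijk} + \varepsilon_{ijk}$
$\varepsilon \sim N(0, \sigma_W^2)$
Level 2 (school level)
$\beta_{0j} = \pi_{00} + \xi_{0j}$
$\beta_{1j} = \pi_{10} + \xi_{1j}$
$\xi_{0j} \sim N(0, \sigma_S^2)$
$\xi_{1j} \sim N(0, \sigma_{TxS}^2)$
If we code the treatment $T_{ijk} = \tfrac{1}{2}$ or $-\tfrac{1}{2}$, then the
parameters are identical to those in standard ANOVA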
Randomized Block Designs
In randomized block designs, as in hierarchical designs,
the intraclass correlation has an impact on precision and
power
However, in randomized block designs there is
also a parameter reflecting the degree of heterogeneity
of treatment effects across schools
We define this heterogeneity parameter $\omega_S$ in terms of the
amount of heterogeneity of treatment effects relative to
the heterogeneity of school means
Thus
$\omega_S = \sigma_{TxS}^2/\sigma_S^2$
Precision in Two-level Randomized Block Design
With No Covariates
The standard error of the treatment effect
$$SE = \sigma_T\sqrt{\left(\frac{2}{m}\right)\left(\frac{1 + (n\omega_S - 1)\rho}{n}\right)}$$
SE decreases as m (number of schools) increases
SE decreases as n increases, but only up to a point
SE increases as $\rho$ increases
SE increases as $\omega_S = \sigma_{TxS}^2/\sigma_S^2$ increases
Power in Two-level Randomized Block Design
With No Covariates
Basic Idea:
Operational Effect Size = (Effect Size) x (Design Effect)
$\Delta^T = \delta \times (\text{Design Effect})$
For the two-level randomized block design with no covariates
$$\Delta^T = \delta\sqrt{\frac{n/2}{1 + (n\omega_S - 1)\rho}}$$
Operational sample size is number of schools (clusters)
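A minimal Python sketch of the two-level randomized block quantities, using the slide formulas $SE = \sigma_T\sqrt{(2/m)(1 + (n\omega_S - 1)\rho)/n}$ and $\Delta^T = \delta\sqrt{(n/2)/(1 + (n\omega_S - 1)\rho)}$. Function names and inputs are illustrative assumptions.

```python
from math import sqrt


def se_two_level_block(sigma_t, m, n, rho, omega_s):
    """SE of the treatment effect; omega_s = sigma_TxS^2 / sigma_S^2."""
    return sigma_t * sqrt((2.0 / m) * (1.0 + (n * omega_s - 1) * rho) / n)


def operational_effect_size_block(delta, n, rho, omega_s):
    """Effect size times the randomized block design effect."""
    return delta * sqrt((n / 2.0) / (1.0 + (n * omega_s - 1) * rho))


# Illustrative values only: compare homogeneous and heterogeneous treatment effects.
se_low_heterogeneity = se_two_level_block(1.0, m=20, n=25, rho=0.2, omega_s=0.1)
se_high_heterogeneity = se_two_level_block(1.0, m=20, n=25, rho=0.2, omega_s=1.0)
delta_op = operational_effect_size_block(delta=0.25, n=25, rho=0.2, omega_s=0.1)
print(se_low_heterogeneity, se_high_heterogeneity, delta_op)
```

Setting $\omega_S = 1$ makes the SE expression coincide with the two-level hierarchical one, which is one way to see that blocking pays off when treatment effects vary less across schools than school means do.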
Precision in Two-level Randomized Block Design
With Covariates
The standard error of the treatment effect
$$SE = \sigma_T\sqrt{\left(\frac{2}{m}\right)\left(\frac{1 + (n\omega_S - 1)\rho - \left[R_W^2 + (n\omega_S R_S^2 - R_W^2)\rho\right]}{n}\right)}$$
SE decreases as m increases
SE decreases as n increases, but only up to a point
SE increases as $\rho$ increases
SE increases as $\omega_S = \sigma_{TxS}^2/\sigma_S^2$ increases
SE (generally) decreases as $R_W^2$ and $R_S^2$ increase
Power in Two-level Randomized Block Design
With Covariates
Basic Idea:
Operational Effect Size = (Effect Size) x (Design Effect)
$\Delta^T = \delta \times (\text{Design Effect})$
For the two-level randomized block design with covariates
$$\Delta_A^T = \delta\sqrt{\frac{n/2}{1 + (n\omega_S - 1)\rho - \left[R_W^2 + (n\omega_S R_S^2 - R_W^2)\rho\right]}}$$
The covariates increase the design effect
Three-level Randomized Block Designs
Three-level Randomized Block Design
With No Covariates
Here there are three factors
• Treatment
• Schools (clusters) crossed with treatments
• Classes (subclusters) nested in treatments within schools
Suppose there are
• m schools (clusters) per treatment
• 2p classes (subclusters) per school (cluster)
• n students (individuals) per class (subcluster)
Three-level Randomized Block Design
With No Covariates
The statistical model for the observation on the lth person in
the kth class in the ith treatment in the jth school is
$Y_{ijkl} = \mu + \alpha_i + \beta_j + \gamma_k + \alpha\beta_{ij} + \varepsilon_{ijkl}$
where
$\mu$ is the grand mean,
$\alpha_i$ is the average effect of being in treatment i,
$\beta_j$ is the average effect of being in school j,
$\gamma_k$ is the effect of being in the kth class,
$\alpha\beta_{ij}$ is the difference between the average effect of
treatment i and the effect of that treatment in school j,
$\varepsilon_{ijkl}$ is a residual
Three-level Randomized Block Design
With No Covariates (HLM Notation)
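Level 1 (individual level)
$Y_{ijkl} = \beta_{0jk} + \varepsilon_{ijkl}$
$\varepsilon \sim N(0, \sigma_W^2)$
Level 2 (classroom level)
$\beta_{0jk} = \gamma_{00j} + \gamma_{01j}T_j + \eta_{0jk}$
$\eta \sim N(0, \sigma_C^2)$
Level 3 (school level)
$\gamma_{00j} = \pi_{00} + \xi_{0j}$
$\gamma_{01j} = \pi_{10} + \xi_{1j}$
$\xi_{0j} \sim N(0, \sigma_S^2)$
$\xi_{1j} \sim N(0, \sigma_{TxS}^2)$
If we code the treatment $T_j = \tfrac{1}{2}$ or $-\tfrac{1}{2}$, then
$\pi_{00} = \mu$, $\pi_{10} = \alpha_1$, $\xi_{0j} = \beta_j$, $\xi_{1j} = \alpha\beta_{ij}$, $\eta_{0jk} = \gamma_k$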
Three-level Randomized Block Design
Intraclass Correlations
In three-level designs there are two levels of clustering and
two intraclass correlations
At the school (cluster) level
$$\rho_S = \frac{\sigma_S^2}{\sigma_S^2 + \sigma_C^2 + \sigma_W^2} = \frac{\sigma_S^2}{\sigma_T^2}$$
At the classroom (subcluster) level
$$\rho_C = \frac{\sigma_C^2}{\sigma_S^2 + \sigma_C^2 + \sigma_W^2} = \frac{\sigma_C^2}{\sigma_T^2}$$
Three-level Randomized Block Design
Heterogeneity Parameters
In three-level designs, as in two-level randomized block
designs, there is also a parameter reflecting the degree
of heterogeneity of treatment effects across schools
We define this parameter $\omega_S$ in terms of the amount of
heterogeneity of treatment effects relative to the
heterogeneity of school means (just like in two-level
designs)
Thus
$\omega_S = \sigma_{TxS}^2/\sigma_S^2$
Precision in Three-level Randomized Block Design
With No Covariates
The standard error of the treatment effect
$$SE = \sigma_T\sqrt{\left(\frac{2}{m}\right)\left(\frac{1 + (pn\omega_S - 1)\rho_S + (n-1)\rho_C}{pn}\right)}$$
SE decreases as m increases
SE decreases as p and n increase, but only up to a point
SE increases as $\omega_S$ increases
SE increases as $\rho_S$ and $\rho_C$ increase
Power in Three-level Randomized Block Design
With No Covariates
Basic Idea:
Operational Effect Size = (Effect Size) x (Design Effect)
$\Delta^T = \delta \times (\text{Design Effect})$
For the three-level randomized block design with no covariates
$$\Delta^T = \delta\sqrt{\frac{pn/2}{1 + (pn\omega_S - 1)\rho_S + (n-1)\rho_C}}$$
The operational sample size is the number of schools
Power in Three-level Randomized Block Design
With No Covariates
As m and the effect size increase, power increases
Other influences occur through the design effect
$$\sqrt{\frac{pn/2}{1 + (pn\omega_S - 1)\rho_S + (n-1)\rho_C}}$$
As $\rho_S$ or $\rho_C$ increases the design effect decreases
No matter how large n gets the maximum design effect is
$$\frac{1}{\sqrt{2\left(\omega_S\rho_S + \frac{1}{p}\rho_C\right)}}$$
Thus power only increases up to some limit as n increases
Precision in Three-level Randomized Block Design
With Covariates
The standard error of the treatment effect
$$SE = \sigma_T\sqrt{\frac{2}{m}} \times \sqrt{\frac{1 + (pn\omega_S - 1)\rho_S + (n-1)\rho_C - \left[R_W^2 + (pn\omega_S R_S^2 - R_W^2)\rho_S + (nR_C^2 - R_W^2)\rho_C\right]}{pn}}$$
SE decreases as m increases
SE decreases as p and n increase, but only up to a point
SE increases as $\rho_S$, $\rho_C$, and $\omega_S$ increase
SE decreases as $R_W^2$, $R_C^2$, and $R_S^2$ increase
Power in Three-level Randomized Block Design
With Covariates
Basic Idea:
Operational Effect Size = (Effect Size) x (Design Effect)
$\Delta^T = \delta \times (\text{Design Effect})$
For the three-level randomized block design with covariates
$$\Delta_A^T = \delta\sqrt{\frac{pn/2}{1 + (pn\omega_S - 1)\rho_S + (n-1)\rho_C - \left[R_W^2 + (pn\omega_S R_S^2 - R_W^2)\rho_S + (nR_C^2 - R_W^2)\rho_C\right]}}$$
The operational sample size is the number of schools
Power in Three-level Randomized Block Design
With Covariates
As m and the effect size increase, power increases
Other influences occur through the design effect
$$\sqrt{\frac{pn/2}{1 + (pn\omega_S - 1)\rho_S + (n-1)\rho_C - \left[R_W^2 + (pn\omega_S R_S^2 - R_W^2)\rho_S + (nR_C^2 - R_W^2)\rho_C\right]}}$$
As $\rho_S$ or $\rho_C$ increases the design effect decreases
No matter how large n gets the maximum design effect is
$$\frac{1}{\sqrt{2\left[(1 - R_S^2)\omega_S\rho_S + \frac{1}{p}(1 - R_C^2)\rho_C\right]}}$$
Thus power only increases up to some limit as n increases
What Unit Should Be Randomized?
(Schools, Classrooms, or Students)
Experiments cannot estimate the causal effect on any
individual
Experiments estimate average causal effects on the units
that have been randomized
• If you randomize schools the (average) causal effects
are effects on schools
• If you randomize classes, the (average) causal effects
are on classes
• If you randomize individuals, the (average) causal effects
estimated are on individuals
What Unit Should Be Randomized?
(Schools, Classrooms, or Students)
Theoretical Considerations
Decide what level you care about, then randomize at that
level
Randomization at lower levels may impact generalizability
of the causal inference (and it is generally a lot more
trouble)
Suppose you randomize classrooms, should you also
randomly assign students to classes?
It depends: Are you interested in the average causal effect
of treatment on naturally occurring classes or on
randomly assembled ones?
What Unit Should Be Randomized?
(Schools, Classrooms, or Students)
Relative power/precision of treatment effect
Assign Schools
(Hierarchical Design)
$$\frac{1 + (pn-1)\rho_S + (n-1)\rho_C}{pn}$$
Assign Classrooms
(Randomized Block)
$$\frac{1 + (pn\omega_S - 1)\rho_S + (n-1)\rho_C}{pn}$$
Assign Students
(Randomized Block)
$$\frac{1 + (pn\omega_S - 1)\rho_S + (n\omega_C - 1)\rho_C}{pn}$$
What Unit Should Be Randomized?
(Schools, Classrooms, or Students)
Precision of estimates or statistical power
dictate assigning the lowest level possible
But the individual (or even classroom) level
will not always be feasible or even
theoretically desirable
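A minimal Python sketch comparing the three variance factors for assigning schools, classrooms, or students. The function names and all input values are illustrative assumptions; in particular, $\omega_C$ is taken here to be the classroom-level analogue of $\omega_S$, which the slides use but do not define.

```python
def factor_assign_schools(p, n, rho_s, rho_c):
    """Variance factor when schools are assigned (hierarchical design)."""
    return (1.0 + (p * n - 1) * rho_s + (n - 1) * rho_c) / (p * n)


def factor_assign_classrooms(p, n, rho_s, rho_c, omega_s):
    """Variance factor when classrooms are assigned within schools (randomized block)."""
    return (1.0 + (p * n * omega_s - 1) * rho_s + (n - 1) * rho_c) / (p * n)


def factor_assign_students(p, n, rho_s, rho_c, omega_s, omega_c):
    """Variance factor when students are assigned within classrooms (randomized block)."""
    return (1.0 + (p * n * omega_s - 1) * rho_s + (n * omega_c - 1) * rho_c) / (p * n)


# Illustrative values only.
p, n, rho_s, rho_c = 4, 25, 0.15, 0.10
omega_s, omega_c = 0.2, 0.2

schools = factor_assign_schools(p, n, rho_s, rho_c)
classrooms = factor_assign_classrooms(p, n, rho_s, rho_c, omega_s)
students = factor_assign_students(p, n, rho_s, rho_c, omega_s, omega_c)
print(schools, classrooms, students)
```

With heterogeneity parameters below 1, as in this example, the factor shrinks as the unit of assignment moves to a lower level, which is the sense in which precision favors assigning at the lowest feasible level.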
Questions and Answers About Design
Questions and Answers About Design
1. Is it ok to match my schools (or classes) before I
randomize to decrease variation?
2. I assigned treatments to schools and am not using
classes in the analysis. Do I have to take them into
account in the design?
3. I am assigning schools, and using every class in the
school. Do I have to include classes as a nested
factor?
4. My schools all come from two districts, but I am
randomly assigning the schools. Do I have to take
district into account some way?
Questions and Answers About Design
1. I didn’t really sample the schools in my
experiment (who does?). Do I still have to
treat schools as random effects?
2. I didn’t really sample my schools, so what
population can I generalize to anyway?
3. I am using a randomized block design with fixed
effects. Do you really mean I can’t say
anything about effects in schools that are not
in the sample?
Questions and Answers About Design
1. We randomly assigned, but our assignment was
corrupted by treatment switchers. What do we do?
2. We randomly assigned, but our assignment was
corrupted by attrition. What do we do?
3. We randomly assigned but got a big imbalance on
characteristics we care about (gender, race, language,
SES). What do we do?
4. We randomly assigned but when we looked at the
pretest scores, we see that we got a big imbalance (a
“bad randomization”). What do we do?
Questions and Answers About Design
1. We care about treatment effects, but we really want to
know about mechanism. How do we find out if
implementation impacts treatment effects?
2. We want to know where (under what conditions) the
treatment works. Can we analyze the relation between
conditions and treatment effect to find this out?
3. We have a randomized block design and find
heterogeneous treatment effects. What can we say
about the main effect of treatment in the presence of
interactions?
Questions and Answers About Design
1. I prefer to use regression and I know that regression
and ANOVA are equivalent. Why do I need all this
ANOVA stuff to design and analyze experiments?
2. Don’t robust standard errors in regression solve all
these problems?
3. I have heard of using “school fixed effects” to analyze
a randomized block design. Is that a good alternative
to ANOVA or HLM?
4. Can I use school fixed effects in a hierarchical design?
Questions and Answers About Design
1. We want to use covariates to improve
precision, but we find that they act somewhat
differently in different groups (have different
slopes). What do we do?
2. We get somewhat different variances in
different groups. Should we use robust
standard errors?
3. We get somewhat different answers with
different analyses. What do we do?
Thank You !