close

Вход

Забыли?

вход по аккаунту

?

Cosmic microwave background analysis of CAPMAP and future experiments

код для вставкиСкачать
THE UNIVERSITY OF CHICAGO
COSMIC MICROWAVE BACKGROUND ANALYSIS FOR CAPMAP AND
FUTURE EXPERIMENTS
A DISSERTATION SUBMITTED TO
THE FACULTY OF THE DIVISION OF THE PHYSICAL SCIENCES
IN CANDIDACY FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
DEPARTMENT OF PHYSICS
BY
KENDRICK M. SMITH
CHICAGO, ILLINOIS
AUGUST 2007
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
UMI Number: 3273077
INFORMATION TO USERS
The quality of this reproduction is dependent upon the quality of the copy
submitted. Broken or indistinct print, colored or poor quality illustrations and
photographs, print bleed-through, substandard margins, and improper
alignment can adversely affect reproduction.
In the unlikely event that the author did not send a complete manuscript
and there are missing pages, these will be noted. Also, if unauthorized
copyright material had to be removed, a note will indicate the deletion.
®
UMI
UMI Microform 3273077
Copyright 2007 by ProQuest Information and Learning Company.
All rights reserved. This microform edition is protected against
unauthorized copying under Title 17, United States Code.
ProQuest Information and Learning Company
300 North Zeeb Road
P.O. Box 1346
Ann Arbor, Ml 48106-1346
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Copyright © 2007 by Kendrick M. Smith
All rights reserved
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
ABSTR ACT
A major frontier for cosmology in the coming decade will be making precision mea­
surements of the cosmic microwave background (CMB) polarization, complement­
ing existing measurements of the CMB tem perature anisotropies. The E-mode, or
gradient-like component, of CMB polarization will break param eter degeneracies from
CMB tem perature alone and improve constraints on reionization history and initial
conditions in the standard cosmological model. The B-mode, or curl-like component
will permit strong constraints on growth of structure from CMB lensing, and probe
new physics by measuring the gravity wave content of the early universe.
In the first half of this thesis, we describe design and implementation of the analysis
pipeline for the 2005 observing season of CAPMAP, an experiment to measure CMB
polarization on small angular scales using coherent polarimeters and the Lucent 7
meter telescope in Crawford Hill, New Jersey. Although the results of the analysis
are not completely finalized, we present partial results obtained from the data, and
full results for a full-sesaon simulation, in order to illustrate the measurement th at
will be obtained.
The CAPMAP analysis pipeline uses a likelihood formalism which is computation­
ally expensive, but results in measurement uncertainties which are provably optimal.
An optimal analysis will be computationally infeasible for upcoming generations of
CMB polarization experiments, in which the problem size will be larger by several
orders of magnitude. Therefore, fast approximate methods have been proposed. In
the second half of this thesis, we show th at in their originally proposed form, these
methods fail to preserve the E-B decomposition, and this failure ultimately acts as a
limiting source of noise when measuring B-modes, and propose modifications which
solve this problem.
iii
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
ACKNOW LEDGM ENTS
First and foremost, I would like to thank my advisors Wayne Hu and Bruce Winstein
for their constant ideas and advice throughout the last five years. Wayne’s ability
to answer detailed questions “cold” from any subfield of cosmology served as both a
source of inspiration and humility. Bruce’s great enthusiasm for physics was always
highly contagious and led to frequent Friday afternoon conversations about oddball
problems and thought experiments. Very few scientists would have the excitement
and motivation to start over as a “graduate student” in a new area, after reaching
prominence as one of the leading figures in one field of physics.
This thesis would not have been possible without the hard work and additional
perspective of my collaborators. The CAPMAP experiment was designed, built and
operated by Denis Barkats, Colin Bischoff, Phil Farese, Todd Gaier, Josh Gundersen,
M att Hedman, Lewis Hyatt, Glen Nixon, Dorothea Samtleben, Suzanne Staggs, Eu­
genia Stefanescu, Keith Vanderlinde, and Bruce Winstein. The CAPMAP analysis
effort was distributed among many members of the collaboration; but I am especially
indebted to Bruce and Lewis, who I worked most closely with over the course of the
analysis.
Thanks to my collaborators throughout my time at Chicago: Olivier Dore, Wayne
Hu, Dragan Huterer, Manoj Kaplinghat, Mike Nolta, Oliver Zahn, and Matias Zaldarriaga. This research could not have happened without their contributions and
feedback. Getting an unexpected email from Matias complimenting one of my papers
was a great encouragement to a budding cosmologist.
Thanks to all of my officemates and friends at Chicago who helped keep me
sane over the years: Prahlad, Sunny, David, Andrei, Monica, Vikram, Ivo, Takemi,
Zhaoming, Robert, Colin, Keith, Cora, as well as everyone else I’ve forgetten to
iv
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
mention.
I would like to thank my family for all of their support and encouragment. Finally,
I would like to thank Anjali for reasons too numerous to mention, from helping me
realize th at I always secretly wanted to be a physicist in the first place, to staying
up late making figures the night before the thesis deadline. She is the center of my
universe.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
TABLE OF C O N T E N T S
ABSTRACT
.................................................................................................................
iii
A C K N O W L E D G M E N T S ...........................................................................................
iv
LIST OF F I G U R E S ....................................................................................................
ix
LIST OF T A B L E S ........................................................................................................
xiii
1 IN T R O D U C T IO N .................................................................................................
1.1 The cosmic microwave background .........................................................
1.2 CMB p o larizatio n ..........................................................................................
1.3 Scope of thesis .............................................................................................
1
1
6
12
2
ANALYSIS OF 2005 DATA FROM C A P M A P .................................................
2.1 In tro d u c tio n ...................................................................................................
2.1.1 Focal plane and scan s tr a t e g y ........................................................
2.1.2 P olarim eters........................................................................................
2.2 Pipeline o v erv iew ..........................................................................................
2.2.1 P o in tin g ...............................................................................................
2.2.2 Gain m o d e l ........................................................................................
2.2.3 Noise m o d e l........................................................................................
2.3 S im u latio n .......................................................................................................
2.4 Cycle finding ................................................................................................
2.5 D ata se le ctio n ................................................................................................
2.6 Map m a k in g ...................................................................................................
2.7 Likelihood a n a ly s is ......................................................................................
2.7.1 Bandpower estimates: maximum likelihood.................................
2.7.2 Bandpower uncertainties: conditional likelihoods.......................
2.7.3 Bandpower uncertainties: marginalized likelihoods....................
2.8 Combining maps ..........................................................................................
2.9 Null m a p s .......................................................................................................
2.10 Null test s u i t e ................................................................................................
2.11 Final power sp ectru m ...................................................................................
15
15
15
18
21
23
25
25
29
30
31
33
37
39
42
47
51
55
57
65
3
PURE PSEU D O -Q ESTIMATORS .................................................................
66
3.1 In tro d u c tio n ...................................................................................................
66
3.2 Notation and C o n v e n tio n s.........................................................................
68
3.3 Pseudo-C^ e stim a to rs...................................................................................
70
3.4 Constructing pure pseudo-C^ estimators ................................................
72
3.4.1 Constructing pure pseudo-Q estimators 1: continuum limit .
73
3.4.2 Constructing pure pseudo-Q> estimators 2: finite pixelized maps 75
vi
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
vii
3.5
3.6
3.7
3.8
4
An example with homogeneous n o i s e ......................................................
An example with inhomogeneous n o i s e ...................................................
Detectability of gravity wave B-modes ...................................................
C onclusion.......................................................................................................
80
87
91
96
98
A GENERAL SOLUTION TO THE E-B MIXING P R O B L E M ..........
4.1 In tro d u ctio n ...................................................................................................
98
4.2 Preliminaries ................................................................................................ 100
4.2.1 Spin two n o ta tio n ............................................................................... 100
4.2.2 Small-angle approxim ation.............................................................. 101
4.2.3 The ambiguous m o d e s ..................................................................... 102
4.3 Avoiding leakage by measuring x b ......................................................... 106
4.3.1 A very blue spectrum: the need for a p o d iz a tio n ....................... 107
4.4 Relation to the pure pseudo-C /s ............................................................ 109
4.5 Choosing the optimal weight fu n c tio n ...................................................... 113
4.5.1 D eriv atio n ........................................................................................... 116
4.5.2 A variational form of the a n s a tz ..................................................... 117
4.6 Some analytic solutions................................................................................ 119
4.6.1 Case 1: £2 power sp e c tru m .............................................................. 119
4.6.2 Case 2: Z4 power s p e c tru m .............................................................. 121
4.7 Numerical so lu tio n s...................................................................................... 121
4.8 A realistic e x a m p le ...................................................................................... 126
4.9 D iscu ssio n ...................................................................................................... 134
A SPIN-1 SPHERICAL HARMONIC TRANSFORM S.......................................
138
B FINITE DIFFERENCING IN AN IRREGULAR SPHERICAL PIXELIZAT I O N .................................................................................................................. 139
C CORRELATION FUNCTIONS FOR FIELDS OF ARBITRARY SPIN . .
141
D COMPUTING TRANSFER M A T R IC E S .................................................... 147
D .l Transfer matrices for tem perature p s e u d o -C /s ...................................... 147
D.2 The transfer m atrix K ^ f u r e ...................................................................... 149
D.3 Efficient calculation of the transfer matrix in te g r a ls ............................ 157
E
COMPUTING NOISE B IA S ...........................................................................
F
FISHER MATRIX EVALUATION WITH AZIMUTHAL SYMMETRY . .
G RELATION TO FK P
159
..........................................................................................
H NUMERICAL OPTIMIZATION OF WEIGHT F U N C T IO N S .............
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
160
163
166
REFERENCES
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
LIST OF FIG U R ES
1.1
1.2
1.3
2.1
2.2
2.3
2.4
2.5
2.6
2.7
2.8
2.9
Measurements of the CMB tem perature power spectrum C j T from the
WMAP satellite and several ground-based experiments (from Hinshaw
et al. (2006)).....................................................................................................
Full-sky observations of the CMB tem perature anisotropy A T (n) from
the WMAP satellite, combined over several frequencies using the In­
ternal Linear Combination (ILC) method (from Hinshaw et al. (2006)).
Predicted TT, EE, and BB power spectra for the WMAP3 fiducial cos­
mology modified to include a gravity wave component with T / S = 0.2.
The BB spectrum contains contributions from gravity waves (the larger
contribution for t < 100), and gravitational lensing of the primary Emode signal (larger for I > 100)...................................................................
CAPMAP focal plane.....................................................................................
CAPMAP scan strategy. In one 21 second scan cycle, each polarimeter
scans a circle near NCP; full azimuthal coverage is obtained every 24
hours from the E arth’s rotation....................................................................
CAPMAP analysis pipeline .......................................................................
Raw CAPMAP timestream data, downsampled to 1 Hz and plotted
over an arbitrarily chosen 100 sec interval, showing the sinusoidal at­
mospheric signal on top of white noise fluctuations..................................
Conditional likelihoods for EE, BB and EB bandpowers, with remain­
ing bandpowers fixed to maximum likelihood values, for simulated Wband (black/solid) and Q-band (blue/dotted) full-season d ata..............
Conditional likelihood for lowest-^ EE and BB bandpowers (on the
x-axis and y-axis respectively), with remaining bandpowers fixed to
maximum likelihood values, for simulated Q-band full-season data. . .
Markov chain likelihood samples from the full-season simulated dataset
in Q-band. Each sample is a twelve-component vector, corresponding
to the twelve bandpowers shown in Fig. 2.5, but only the lowest-^ EE
bandpower (x-axis) and lowest-^ BB bandpower are shown here. The
equilibrium distribution of the chain is the marginalized likelihood, but
each sample is correlated to the previous sample......................................
Marginalized likelihoods for EE, BB, and EB bandpowers, obtained by
histogramming Markov chain samples, for simulated W -band (black/solid)
and Q-band (blue/dotted) full-season d ata................................................
Day/night null test for W -band CAPMAP data. The y 2 value (Eq. (2.53))
to zero for the EE, BB, and EB likelihoods is 0.01, 0.79 and 0.25 re­
spectively, with one degree of freedom in each y2.....................................
ix
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
2
4
10
16
17
21
26
44
45
49
52
58
X
2.10 Final power spectrum measurement in from the full-season simulated
dataset, with marginalized uncertainties computed by Markov chain
sampling. The three sets of error bars represent W -band data alone
(left), Q-band data alone (middle), and both frequencies combined
(right).................................................................................................................
2.11 One-band likelihood in a multiple of the fiducial EE spectrum, for the
full-season simulated dataset with both frequencies combined. Zero
power is excluded at 13a................................................................................
3.1 The pure B-mode (3.34), for (£,m) = (10,1) (left) and (50,1) (right),
using apodization length r* = 5°. Only the real part is shown. In the
left panel, the second and third terms, which are concentrated in the
apodization region, dominate the main term .............................................
3.2 Transfer functions K ^ , (left) and K ^ ure (right), shown for varying
b', with the band b fixed at £rmn = 80, £max = 120. The apodization
length used was r* = 5°. The matrix entry K ^ , represents the mean
response of the BB estimator in band b to BB power in band £/; K ^ ,
represents the response of the BB estimator to EE power. As N s^ e —>
oo,
approaches zero...........................................................................
3.3 Contribution to ((■j^B ^Pure) fr0m the fiducial E-mode power spectrum,
for zero beam (left) and FWHM 25’ (right). This can be interpreted as
the estimated B-mode power which is contributed by E-modes due to
pixelization artifacts. A gravity wave B-mode spectrum with T / S =
0.01 is shown for scale. The apodization length used was r* = 5°. . .
3.4 RMS bandpower errors for pure pseudo-6^ estimators (blue/solid), un­
modified psuedo-Q estimators (green/dashed), and optimal Fisher er­
rors (red/dotted), shown for varying apodization length r*. In the left
panel, the band 10 < £ < 40 is shown; in this band, the performance
of unmodified p s e u d o -e s tim a to rs is very poor and the dashed curve
is not visible. In the right panel, the band 40 < £ < 80 is shown. The
noise level used was 20 //K-arcmin...............................................................
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
63
64
83
84
85
86
XI
3.5
RMS bandpower errors for pure p s e u d o -e s tim a to r s (blue/left), un­
modified pseudo-C/ estimators (green/middle), and optimal Fisher er­
rors (red/right). The survey region is a spherical cap of radius 13°
with uniform white noise. The top two panels are for a fiducial model
with T / S = 0.2 and noise level 20 /iK-arcmin; the bottom two panels
are for a fiducial model with T / S = 0.05 and noise level 10 /iK-arcmin.
In each pair of panels, bandpowers are shown for E-modes (left) and
B-modes (right). In the E-mode panels, multipoles t < 250 have been
replotted on a log scale for visibility. In the B-mode panels, the lensing
component of the power spectrum has been shown separately (dashed);
detecting the gravity wave signal requires measuring power in excess
of this level........................................................................................................
3.6 RMS bandpower errors for pure pseudo-G/ estimators (blue/left), un­
modified pseudo-C/ estimators (green/middle), and optimal Fisher er­
rors (red/right). The survey region is a spherical cap of radius 13°
with inhomogeneous noise given by (3.35). The top two panels are for
a fiducial model with T / S = 0.2 and noise level 20 /iK-arcmin; the
bottom two panels are for a fiducial model with T / S = 0.05 and noise
level 10 /iK-arcmin. In each pair of panels, bandpowers are shown for
E-modes (left) and B-modes (right).............................................................
3.7 Minimum T /S detectable at la, for a 13° spherical cap survey with
uniform noise, assuming a 25’ beam and Gaussian lensing contaminant
(left), and zero beam and no lensing contamination (right). The solid
line is the optimal Fisher value, the dotted line is the Monte Carlo
value using unmodified pseudo-G^ estimators (§3.3), and the dashed
lines are Monte Carlo values using pure pseudo-G^ estimators. For
the parameters in the right panel, finite-pixelization artifacts impose
a “floor” on the value of T /S which can be detected, but the value
can be made arbitrarily small by increasing the resolution. For the
parameters in the left panel, we find th at increasing the resolution
beyond N s^ e = 256 results in no significant improvement......................
88
92
93
4.1 Illustration of the leakage due to finite sky coverage in the ordinary
pseudo-C/ technique. A circular region 10 degrees in radius was used
to estimate the pseudo-G/’s. The leakage curve shows the R-modes
obtained even in the absence of any cosmological R-modes just from the
leakage of E. For comparison we also show the expected cosmological
signal for T / S = 0.1 and the noise power spectrum for an experiment
—
4.2
1/2
with Wp
= 5 .7 5 /iK-arcmin and an 8 arcminute beam ............................105
Optimal window functions W0pt(r), given by Eqs. (4.31), (4.33) for a
signal+noise power spectrum proportional to /2 (left panel) or /4 (right
panel).................................................................................................................... 120
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
xii
4.3
4.4
4.5
4.6
4.7
4.8
4.9
5-m ode weight functions for varying noise levels, for t = 20 and a 10°
spherical cap with homogeneous noise. As descrbed in the text, each
weight function consists of spin-0, spin-1, and spin-2 pieces, which we
show in the left, center, and right columns. The top three rows show
the 5-m ode weight functions produced by our optimization procedure
for noise levels 100, 20, and 5 //K-arcmin (from top to bottom). In the
bottommost row we show the analytic solution given by Eq. (4.33). .
Power spectrum errors for the 10° homogeneous mock survey described
in §4.7, using pseudo-Q power spectrum estimation with counterterms
(left/blue) or optimal estimators (right/red). For the 5 - mode power
spectrum, we compute power spectrum errors in a fiducial model with
T / S = 0 but also show a spectrum with T / S = 0.05 for comparison. .
Survey region, noise distribution, and point source mask for our fiducial
realistic experiment. The side length of the bounding square is 25°.
The map is based on preliminary simulations of the upcoming balloon
experiment EBEX............................................................................................
Weight functions for the fiducial realistic experiment. Each of the
three rows corresponds to a different bandpower, with multipole ranges
(^mim Anax) given from top to bottom by (30,70), (190,230) and (510,550)
W ithin each row from left to right, the 5-m ode weight function, the
spin-0 piece of the 5 - mode weight function, and the spin-1 and spin-2
pieces of the 5-m ode weight function are shown......................................
A zoomed view of the (4nin^max) = (190, 230) 5-m ode weight func­
tion near point sources. From left to right, the panels show the noise
map, the scalar piece of the weight function, and the two higher-spin
counterterms.....................................................................................................
Contribution of aliased 5-modes to the pseudo power spectrum C f B
for the fiducial realistic experiment with point source mask, with signal
and noise power spectra shown for comparison. Using counterterms
from our numerical optimization procedure, the level of 5 —►5 mixing
is well below the noise floor, in contrast to ordinary p s e u d o Comparison of three sets of pseudo-Q power spectrum errors for the
fiducial realistic experiment. Weight functions were optimized inde­
pendently in each of the three cases. Left/blue: P s e u d o - w i t h coun­
terterms, without the point source mask. Center/magenta: Pseudo-C^
with counterterms, with the mask. Right/red: Ordinary pseudo-C^,
with the mask. For 5-m ode power spectrum errors, the three cases
are so similar th at only one set of error bars is shown.............................
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
124
125
127
129
129
131
132
LIST OF TABLES
2.1
2.2
2.3
2.4
3.1
4.1
List of CAPMAP d ata channels, illustrated for W -band polarimeter
OB and Q-band polarimeter OQ....................................................................
Correlation m atrix between bandpowers, computed from 2D condi­
tional likelihoods with the remaining bandpowers held fixed to maxi­
mum likelihood values.....................................................................................
Null test suite for real CAPMAP data. The overall x 2 of the test suite
is 70.08 with 84 degrees of freedom, indicating th at the null test passes.
Null test suite for real CAPMAP data, with ground-synchronous mode
removal disabled in the map making stage, showing clear failure of the
test suite in Q-band.........................................................................................
20
46
60
62
Pixel weight functions used to construct E-mode estimators for the
mock surveys in this section..........................................................................
90
Forecasts for the lcr upper limit on ( T /S ) and the fractional error
on the amplitude Ajens of the lensing 5-mode, for various levels of
approximation to the fiducial realistic experiment......................................
135
xiii
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
CHAPTER 1
IN T R O D U C T IO N
1.1
T he cosm ic microwave background
Throughout the last decade, the cosmic microwave background (CMB) has been at
the forefront of observational cosmology. On the theoretical side, the statistics of the
CMB (and in particular, the dependence on cosmological parameters) can be pre­
dicted to great accuracy because the underlying physics is mostly linear perturbation
theory of a photon-baryon fluid. At the level of current measurements of the CMB,
it is not necessary to understand astrophysical processes such as galaxy formation,
reionization, baryon cooling, and star formation in detail. The robust theoretical pre­
dictions have been accompanied by an astounding increase in experimental sensitivity.
In Fig. 1.1, we show current measurements of the CMB tem perature anisotropy power
spectrum (which we will define precisely below, Eq. (1.2)) from the WMAP satellite
and several ground-based experiments. The CMB has been measured at the percent
level and has provided precision constraints on several cosmological parameters. On
the other hand, it was only 15 years ago th at the first detection of a nonzero signal
was made by the COBE experiment (Smoot et al., 1992).
Since this thesis is primarly concerned with data analysis techniques, it would
take us too far afield to describe the physics which gives rise to the CMB in detail,
but we will make a few comments to help orient the reader. Many types of cosmolog­
ical observations have confirmed a “standard cosmological model” in which the early
universe consists of a hot dense plasma which expands and cools, consisting of radi­
ation, free electrons and protons, and dark m atter particles whose only interactions
are gravitational. Initially, the plasma is nearly uniform, but contains small (~ 10“ ^)
1
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
2
90°
6000
2°
A n g u la r s c a le
0.5°
0.2°
WMAP
A cb ar
B o o m e ra n g
CBI
1(1+1 )C t / 2 t t
[/xK2]
5000
4000
3000
2000
1000
10
100
500
1000
Multipole m om ent I
1500
Figure 1.1: Measurements of the CMB tem perature power spectrum C j T from the
WMAP satellite and several ground-based experiments (from Hinshaw et al. (2006)).
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
3
inhomogeneities which propagate as sound waves.
When the tem perature of the
plasma becomes sufficiently low (< 3000 Kelvin), free electrons and protons combine
to form neutral hydrogen. This event is called “recombination” . After recombination,
the universe is nearly transparent. The inhomogeneities no longer behave as sound
waves, but slowly grow gravitationally to ultimately become the large-scale structure
in the universe observed today.
Photons which decouple from the plasma at recombination freestream through
the transparent universe to form the CMB which is observed today. The CMB is
a nearly uniform 2.7 Kelvin blackbody, with small (~ 10- 5 ) hot and cold spots in
temperature, which roughly correspond to overdense and underdense regions of the
plasma along the “surface of last scattering” corresponding to the epoch of recom­
bination along each line of sight. Let A T (n) denote the CMB tem perature, relative
to the average tem perature T q = 2.7 Kelvin, in direction ii. fn Fig. 1.2, we show
full-sky measurements from WMAP of the tem perature anisotropy AT. Note th at
the scale of the plot is 200 microkelvin, several orders of magnitude below Tq. (A
dipole anisotropy due to the E arth’s motion relative to the rest frame of the CMB
has also been subtracted.)
In the standard cosmological model, the initial fluctuations are Gaussian to an
excellent approximation (meaning th at any Appoint correlation function is determined
by the two-point correlation function using Wick’s theorem), ft is also true to an
excellent approximation th a t the observed CMB is linear in the initial fluctuations
(since the physics is described by linear perturbation theory), so (AT) can be treated
as a Gaussian field. Therefore, the statistics of the CMB are completely determined
by the two-point correlation function.
To describe the two-point correlations, it is convenient to decompose A T into
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
-200
T(nK)
+200
Figure 1.2: Full-sky observations of the CMB tem perature anisotropy A T (n) from
the WMAP satellite, combined over several frequencies using the Internal Linear
Combination (ILC) method (from Hinshaw et al. (2006)).
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
5
spherical harmonics Y^m(n):
A T ( n )
=
^ 2 aJmYem{n )
(1.1)
£m
This operation (T —►ajm ) is the analog of the Fourier transform for a field defined on
the sphere. The most general two-point correlation allowed by rotational invariance
is of the form:
(a£ma£'m') = ^ t
^(1'^rnm'
(1-2)
where this equation defines the power spectrum C j T . For purposes of constraining
cosmological parameters, CMB experiments can be viewed as measuring the power
spectrum (Fig. 1.1); this contains all information in the underlying tem perature map
(Fig. 1.2).
The CMB power spectrum contains a wealth of cosmological information; a brief
(and incomplete) list of some cosmological observables follows:
1. The acoustic peaks in the power spectrum correspond to wavelengths in the
plasma which undergo an integer number of oscillations before recombination,
and have a characteristic angular scale
la = 7T—
(1.3)
where D * is the angular diameter distance to recombination today, and s* is
the total distance th at sound can travel before recombination.
2. The relative peak heights and the angular scale of the damping tail are deter­
mined by the ratios of radiation, baryons, and dark m atter before recombina­
tion.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
6
3. Through secondary gravitational effects such as lensing, and the decay of po­
tentials due to dark energy (the ISW effect), the CMB is sensitive to the physics
of the late universe.
4. The Sunyaev-Zeldovich effect, or Compton scattering of CMB photons from hot
electrons in galaxy clusters, produces small-scale anisotropy in the CMB which
is tied to growth of structure at all redshifts.
1.2
CM B polarization
In addition to the tem perature anisotropies previously defined, the CMB radiation is
also predicted to be linearly polarized at a small but detectable level. Polarization is
generated by Thomson scattering of the local tem perature quadrupole from free elec­
trons during recombination and reionization. Because the quadrupole anisotropy is
supressed (by the same mechanism, Thomson scattering) at times when free electrons
have high density, polarization is an even smaller signal than the CMB temperature.
To describe the polarization, consider a line of sight direction n and choose coor­
dinates so th at n points in the z direction. The incoming CMB radiation will produce
electric field components E x , Ey perpendicular to the line of sight. In this notation,
the CMB tem perature considered previously can be written as an expectation value
T(n) oc ( ^ ( n ) 2 + E y {n )2)
(1.4)
and linear polarization can be described by Stokes parameters Q, U defined by:
Q(n)
oc (Ex (n)2 - E y (n)2)
(1.5)
U{ n)
oc (2Ex (n)Ey (n))
(1.6)
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
7
(There is a third Stokes parameter V describing circular polarization, which we have
omitted since the CMB is not expected to be circularly polarized.)
There is a multipole decomposition for CMB polarization, analogous to the tem­
perature decomposition in Eq. (1.1). The polarization case is more complicated be­
cause the Stokes parameters are not scalar quantities, but depend on the choice of
local frame x, y along the line of sight n. There are two formalisms commonly used for
describing linear polarization covariantly, which we will use interchangeably through­
out the thesis.
The first formalism (Kamionkowski et al., 1997b) is to represent the linear polar­
ization as a traceless symmetric tensor ll a^ on the sphere:
( 1. 7)
This is just the intensity tensor of the incoming radiation, with the trace projected
out. (The traceless part represents linear polarization and the trace represents tem­
perature.)
To define the multipole decomposition, we first define operators
( 1 .8 )
which take scalar fields to symmetric traceless tensors. We then define tensor spherical
harmonics Y ^ and Y ^ by:
y /{l ~
1)1(1
+ 1)(< + 2)£°i'V'!m
(.£m)ab
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
( 1. 9 )
8
(The prefactors normalize so th at |Y^m|2 and | | 2 have integral 1 over the sphere.)
It can be shown th at the tensor spherical harmonics form a complete orthonormal
basis, so th at we can expand:
iW n ) =
a fm Y lm
+ Y
Im
a lm YPm
(L1°)
tm
To motivate this decomposition into E and B components, note th at this is simply
the tensor analog of the gradient/curl decomposition of a vector field. The “gradient­
like” operator £ab is second-order because two derivatives are needed for the result
to transform as a tensor.
The second formalism for describing linear polarization covariantly (Zaldarriaga
& Seljak, 1997) is to use the machinery of spin-s fields. A spin s field (—oo < s < oo)
is a function (s f ) whose value at x depends on a choice of orthonormal basis vectors
{©1 ) 6 2 } at x - Under the right-handed rotation
e?2
(s f )
=
(cos0)ei + (sin 0 )e 2
=
—( s i n ^ ) e i + ( c o s ^ ) e 2
(1-11)
must transform as (s f )' = e~'is^(s f ) . In (Newman & Penrose, 1966; Goldberg
et al., 1967), spin-s harmonics sY^m(n) are defined. These form an orthonormal basis
for spin-s fields on the sphere, generalizing the ordinary spherical harmonics for s = 0.
To relate this to CMB polarization, note th at the quantity ( Q ± i U ) is a spin-(±2)
field, so we can expand:
Q(n) ± iU(n) = Y ( ~ a %n ^ i a ? m ) ( ± 2 Yt m ( n ) )
£m
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
(U12)
9
It can be shown th a t this version of the E-B decomposition is equivalent to the tensor
version defined in Eq. (1.10).
The E-B decomposition of linear polarization is closely related to parity invariance.
Under the parity reversing operation n —> (—n), the E-mode component transforms
as:
afm
( - iA L
( i - 13 )
The same transformation law is obeyed by the tem perature multipoles a^m defined
in Eq. (1.1). On the other hand, the B-mode component transforms with an extra
minus sign under parity:
“fm -
( - 1)m “?rn
(1-14)
In the tensor formalism (Eq. (1.10)), one can also think of this as follows: a polariza­
tion field IIab(n) is an E-mode if it can be written as S^cj) for some scalar potential
0(n). A B-mode can be written B ^ ip for a pseuodscalar field
n).
Power spectra C f E , C EB are defined for polarization (in analogy with Eq. (1.2))
by:
(afm a£'m')
=
('alm af'm ')
=
=
C f E ^ll^m m '
0
(L15)
This is the most general two-point correlation which is invariant under rotations and
parity (a nonzero EB power spectrum would transform as C EB —►(—C E B ) un­
der parity reversal and is therefore disallowed). In Fig. 1.3, we show the EE and
BB power spectra th at are predicted for the WMAP fiducial cosmology, with one
modification: we have included a gravity wave component in the initial conditions,
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
10
1000
Temperature
100
w
10
1
■S' o .i
E -m o d e polarization
0.01
0.001
: B -m o d e polarization
0.0001
10
100
Multipole 1
1000
Figure 1.3: Predicted TT, EE, and BB power spectra for the WMAP3 fiducial cosmol­
ogy modified to include a gravity wave component with T / S = 0.2. The BB spectrum
contains contributions from gravity waves (the larger contribution for I < 100), and
gravitational lensing of the primary E-mode signal (larger for t > 100).
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
11
with amplitude T / S = 0.2, for reasons th at will be apparent shortly. (We param­
eterize the amplitude of the gravity wave component by the tensor-to-scalar ratio
T / S = (|h £ |2 + |h£ |2)/7^k|2) in the early universe.)
It is seen th at the E-mode power spectrum is 2-3 orders of magnitude larger than
the B-mode spectrum. This can be understood intuitively from parity considerations.
The CMB polarization consists entirely of E-modes provided th at two assumptions
are satisfied:
1. The processes giving rise to the CMB anisotropy can be described in linear
perturbation theory,
2. The initial fluctuations can be represented by scalar quantities.
B-modes can be generated when one of these assumptions breaks down. For the first
assumption, there are several nonlinear phenomena which make small contributions;
on the angular scales of Fig. 1.3, the most im portant is gravitational lensing, i.e. small
deflections of the E-mode signal due to gravitational potentials between the observer
and the surface of last scattering (Zaldarriaga & Seljak, 1998) This generates the
broad B-mode signal centered at i ~ 1000.
The second assumption breaks down if the early universe contains gravity waves
(or “tensor modes” ), which are a prediction of some early universe models. In this
case, the initial conditions include fluctuations dg^u in the spacetime metric which
are not scalar quantities and generate both E and B modes in CMB polarization
(Seljak & Zaldarriaga, 1997; Kamionkowski et al., 1997a). In Fig. 1.3, the B-mode
signal at i ~ 100 is generated during recombination by the gravity wave background.
(The second bump at i < 10 is also generated by gravity waves, but in the late
universe after reionization has occured.) CMB polarization will ultimately have better
sensivity to gravity waves than any other flavor of cosmological data, because the
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
12
gravity wave contribution is “separated out” into its own observable, namely the BB
spectrum, where the only contaminating signals are from second-order effects such as
gravitational lensing.
1.3
Scope o f thesis
On the observational front, we are still in the early stages of measuring CMB po­
larization. Detection of a nonzero signal has been achieved by several experiments
(Kovac et al., 2002; Readhead et ah, 2004; Barkats et al., 2005; Montroy et ah, 2006).
The strongest detection to date excludes zero at the lOcr level (Sievers et ah, 0900).
Looking further ahead, the next decade should bring precision measurements of polar­
ization power spectra, just as the previous decade has brought precision measurements
of the tem perature spectrum. Many challenges, both instrumental and algorithmic,
must be met along the way. This thesis is concerened with the algorithmic compo­
nent: analysis methods which are used to obtain measurements of EE and BB power
spectra, starting from raw data written out by real-world experiments.
In the first half (§2) of the thesis, we describe the design and implementation
of the analysis pipeline for the 2005 observing season of CAPMAP. CAPMAP is
an experiment targeting the small-scale E-mode component of the polarization, by
surveying a ~ 8 deg2 region near the north celestial pole (NCP), with ~ 3 arcmin
angular resolution. This high resolution was achieved through use of the Lucent 7meter Cassegrain telescope at Crawford Hill, New Jersey. The CAPMAP detector
array consists of 16 correlation polarimeters with a total integrated sensitivity of
~ 250 nK, independently in W -band (90 GHz) and Q-band (40 GHz) frequency
channels.
The CAPMAP analysis pipeline is based on the likelihood formalism, which has
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
13
the advantage of giving optimal uncertainties on the final power spectra (and on
intermediate quantities, such as pixel-space maps).
The disadvantage of the full
likelihood treatem ent is th at the computational cost rises steeply with larger datasets
(for example, many dense matrix operations are needed with cost O(N^)).
The
CAPMAP dataset is small enough th at a likelihood-based pipeline is still relatively
inexpensive, but in upcoming generations of CMB polarization experiments, this will
no longer be the case, and fast approximate Monte Carlo based pipelines will be a
practical necessity.
Monte Carlo pipelines for CMB polarization suffer from a fundamental problem:
existing fast methods used to estimate E-mode and B-mode power spectra do not
preserve the E-B decomposition. Therefore, the presence of the much larger E-mode
signal can act as a source of noise when estimating B-modes. It was first shown by
Challinor & Chon (2005) th at this effect can become the limiting source of uncer­
tainty when measuring the BB power spectrum, when the instrumental sensitivity
becomes sufficiently small. On the other hand, BB power spectrum measurements
are a primary scientific target for future polarization experiments: the gravity wave
B-mode will open a new window on the initial conditions of the universe, and the
lensing B-mode will probe the growth of gravitational potentials on the largest ob­
servable scales. Therefore, this “E —> B mixing problem” poses a serious practical
concern for future experiments.
The second half of the thesis is concerned with solving this problem. We first
(§3) modfiy the existing fast power spectrum estimator construction so th at the E-B
decomposition is preserved. Using the modified estimators, the larger E-mode signal
does not contaminate estimates of the B-mode power spectrum; instrumental noise
and systematics are the only contaminants.
This construction solves the E —> B mixing problem in principle, but in prac­
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
14
tice one must confront a practical issue. The construction calls for a (scalar) weight
function W (x), and vector/tensor counterterms Wa(x), Wic(x) which combine to de­
termine the statistical weight of each pixel in the experiment, and must satisfy dif­
ferential equations which are difficult to interpret in an irregular pixelization on the
sphere. For the noise distribution of a realistic experiment, which may include irreg­
ular boundaries from Galactic or point source contamination, and ragged features in
the noise, it is not clear how to choose these weight functions in practice. (W ithout
our modification, a similar practical problem exists in Monte Carlo CMB analysis
pipelines, but is somewhat more tame, since only the scalar piece W (x ) must be
chosen.)
In §4, we propose an algorithmic approach for choosing weight functions based
on the noise distribution. Combined with the modified estimators from §4.4, this
provides a complete solution to the E —> B mixing problem which can be applied
to realistic surveys maps with complex features. We demonstrate the method for a
simulated balloon experiment which includes small-scale fluctuations in the noise and
point source masking, and show th at strong E-B separation is possible. We conclude
th at E —►B mixing in the analysis need not be a limiting factor for future surveys.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
CHAPTER 2
ANA LY SIS OF 2005 DATA FRO M C A P M A P
2.1
Introduction
CAPMAP is a collaboration between Princeton University, the University of Chicago,
the University of Miami, and JPL (Jet Propulsion Laboratory) to measure the polar­
ization of the CMB using the 7m telescope in Crawford Hill, New Jersey.
The 2005 CAPMAP d ata consists of 1800 hours of data taken between November
2004 and May 2005. Two independent analyses of this dataset were carried out at
the University of Chicago and Princeton University. In this chapter, we will describe
the design and implementation of the Chicago pipeline. At the time of this writing,
results from the analysis have not yet been finalized. Therefore, we will illustrate the
pipeline by presenting results from a simulated full-season dataset. However, for some
intermediate results (such as the null test suite in §2 . 1 0 ), the analysis is sufficiently
finalized th at results from the real data can be presented.
2 .1 .1
F ocal p l a n e a n d s c a n s tr a te g y
The CAPMAP focal plane is shown in Fig. 2.1. Twelve W -band (90 GHz) polarimeters and four Q-band (40 GHz) polarimeters in four dewars are positioned at fixed
locations relative to the focal plane center; the entire focal plane is then swept across
the sky in a regular pattern. Due to the 7m telescope size, the angular resolution is
excellent and the beam shapes are well-controlled: to a good approximation, each polarimeter’s beam can be treated as Gaussian, with angular size $FWHM = 3.3 arcmin
in W-band, and
#FW H M
= 6.48 arcmin in Q-band.
The sky region surveyed by CAPMAP is a small patch of sky near the north
15
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
16
2Q
2D
2B
2C
OD
( ^ ) 3b ( .055 deg
.45 deg
OC
OQ
3C
3Q .108 deg
,8 deg
3D
OB
1C
1D
1B
1Q
Figure 2.1: CAPMAP focal plane.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
17
Figure 2.2: CAPMAP scan strategy. In one 21 second scan cycle, each polarimeter
scans a circle near NCP; full azimuthal coverage is obtained every 24 hours from the
E arth’s rotation.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
18
celestial pole (NCP). The scan strategy is a “ringscan” : the focal plane center under­
goes uniform circular motion (with scan period
21
seconds) along a fixed circle, with
radius 0.725 deg, approximately centered at NCP. The orientation of the focal plane
is fixed in telescope coordinates; thus relative to the telescope, each polarimeter only
“sees” a thin circular trajectory in each cycle. However, due to the E arth ’s rotation,
the coverage of each polarimeter in celestial coordinates sweeps out full azimuthal
coverage every 24 hours. (In telescope coordinates, one can imagine the sky slowly
rotating around NCP.) In celestial coordinates, the coverage of each polarimeter is
therefore an annular region centered at NCP. This is illustrated in Fig. 2.2. The total
sky area surveyed by CAPMAP is 7.4 deg 2 in W -band and 9.2 deg 2 in Q-band.
2 .1 .2
P o la rim e te rs
As the CAPMAP focal plane scans the sky, data from each of the 16 CAPMAP
polarimeters is archived at a sampling rate of 104 Hz. Each polarimeter takes noisy
measurements of the quantity
U = (2Ex (n)Ey (n)}
(2.1)
where n denotes the instantaneous sky direction pointed to by the polarimeter, E x , Ey
denote the components of the incoming electric field due to the CMB in the x, y
directions relative to the polarimeter. In the polarimeter frame, only the Stokes U
parameter is measured; but full Q /U coverage “on the sky” is obtained every
6
hours
due to the E arth ’s rotation.
Rather than writing a single 104 GHz data stream from each polarimeter, several
streams are written which contain more detailed information. First, each polarimete r’s frequency range is split into subranges. The W -band polarimeters are sensitive
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
19
to CMB polarization at frequency
uq
a 92 GHz, with bandwidth A u ~ 15 GHz. This
frequency range is split into low, medium, and high subranges with A u ~ 5 GHz.
In Q-band, the polarimeters are sensitive to
uq
~ 40 GHz with bandwidth A u « 10
GHz; this is split into low and high subranges with A u « 5 GHz.
Second, each polarimeter takes total power measurements in addition to polariza­
tion. There are two total power channels in each polarimeter, which measure (E%)
and (Ey) in the detector frame. In principle this includes contributions from the CMB
tem perature anisotropy, but the CAPMAP total power channels are only intended
to monitor the sky tem perature and do not have sufficient sensitivity to measure the
CMB tem perature power spectrum.
Finally, in each polarimeter, we generate “quadrature” versions of each of the po­
larization and total power channels previously described. The quadrature channels
are constructed in such a way th at they have zero response to sky polarization or tem­
perature, but share many of the noise properties of the ordinary (or “demodulated”
channels). To explain this in a little more detail, we remark th at the demodulated
channels are obtained by removing a 4 kHz square wave modulation th at is applied
early in the hardware in order to minimize ( 1 //) drifts. The quadrature channels are
obtained by putting the input and output 4 kHz square waves 180° out of phase, so
th at the combination is equivalent to modulating the sky signal with an
8
kHz square
wave, more than rapid enough to destroy any signal from the sky.
Thus the total number of 104 Hz data channels, including demodulated and
quadrature data, is ten for each W -band polarimeter and eight in Q-band.
The
channels are listed in Tab. 2.1, where we also introduce notation used to identify each
channel: a 2-character polarimeter name (e.g. 0Q or OB, see Fig. 2.1), followed by
a 2-character channel name (S0/S1/S2 for polarization channels or D0/D1 for total
power). We denote quadrature channels with a subscript “q” .
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
20
0BS0
0BS1
0BS2
0BD0
0BD1
0BS0q
0BSlq
0BS2q
OBDOq
OBDlq
0QS0
0QS1
0QD0
0QD1
OQSOq
OQSlq
OQDOq
OQDlq
W -band polarimeter OB
Demodulated polarization, 87 ± 2.5 GHz
Demodulated polarization, 92 ± 2.5 GHz
Demodulated polarization, 97 ± 2.5 GHz
Demodulated total power, (E^) component
Demodulated total power, (E^) component
Quadrature polarization, 87 ± 2.5 GHz
Quadrature polarization, 92 ± 2.5 GHz
Quadrature polarization, 97 ± 2.5 GHz
Quadrature total power, (E^) component
Quadrature total power, (Ey) component
Q-band polarimeter 0Q
Demodulated polarization, 37.5 ± 2.5 GHz
Demodulated polarization, 42.5 ± 2.5 GHz
Demodulated total power, (E^) component
Demodulated total power, ('Ey) component
Quadrature polarization, 37.5 ± 2.5 GHz
Quadrature polarization, 42.5 ± 2.5 GHz
Quadrature total power, (E^) component
Quadrature total power, (Ey) component
Table 2.1: List of CAPMAP data channels, illustrated for W -band polarimeter OB
and Q-band polarimeter 0Q.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
21
| Simulation |
I
|Cycle Finding|
1
|Data Selection!
1
| Map Making |
________________ /
|Map Combining!
I
X
I Null Maps |
• Likelihood A nalysis!
/
X
Null Test
Suite
Final Power
Spectrum
Figure 2.3: CAPMAP analysis pipeline
2.2
P ip elin e overview
The CAPMAP analysis pipeline is a maximum likelihood pipeline which obtains
optimal power spectrum uncertainties (Bond et al., 1998) given models for pointing
and instrumental noise. The pipeline stages are shown in Fig. 2.3 and are briefly
described as follows:
1. Simulation (§2.3): in this stage, we simulate the raw CAPMAP timestreams
in the same file format as the real data. The simulations are used strictly for
testing the analysis pipeline.
2.
Cycle finding (§2.4): in this stage, we partition the timestream into ringscan
“cycles” based on pointing, and discard data whose pointing is anomalous.
3. D ata selection (§2.5): we discard data which fails cuts for weather effects,
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
22
glitches, sun contamination, and anomalous gains.
4. Map making (§2.6): all timestream data which survives the cuts in steps 2-3 is
processed into a pixel-space map with associated noise covariance matrix.
5. Likelihood analysis (§2.7): from the map and noise covariance computed in step
4, power spectrum uncertainties are computed using a likelihood function. The
likelihood function can either be maximized to give a point estimate for the
power spectrum, or explored with Markov Chain methods to give marginalized
errors on bandpowers.
6
. Combining maps (§2.8): this is a general-purpose program for combining two
maps, made from disjoint subsets of data (for example, all W -band data and
all Q-band data), which optimally weights the two datasets.
7. Null maps (§2.9): similarly, two maps made from disjoint subsets of data can
be subtracted to obtain a null map. If the two input maps are consistent, then
the null map should be “pure noise” , i.e. contain zero signal.
8.
Null test suite (§2.10): we create a suite of null maps based on dividing the
d ata in different ways, and perform a likelihood analysis on each null map, to
test for consistency with zero power.
9. Final results (§2.11): we conclude by presenting final results obtained from the
simulated dataset and comment briefly on their implications.
Before diving into the details, it will be helpful to describe a few general features
of the CAPMAP analysis which will be used throughout the pipeline: the models for
pointing, gains, and instrumental noise.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
23
2 .2 .1
P o in tin g
In each 104 Hz sample i, the CAPMAP encoder records instantaneous values of the
elevation Elj and azimuth Az?; of the focal plane center. Due to systematic errors
in telescope pointing, from effects such as telescope sag and atmospheric refraction,
the true pointing will deviate slightly from the encoded values. There is a pointing
model for the CAPMAP telescope which includes many such effects, and the free
parameters in the model can be fit from observations of known point sources outside
the near-NCP survey region. We will not describe the pointing model in detail here,
but simply remark th at for observations near NCP, it is an accurate approximation
to condense the model into a pair of offsets A (El), A(Az) such that:
Elj (true)
=
Elj (encoder) —A(E1)
Azj(true)
=
Azj (encoder) —A(Az)
(2.2)
The pointing offsets in Eq. (2.2) can be applied at a very low level in the analysis
pipeline (e.g. when the samples get read from disk) and so the full complexity of the
pointing model can largely be ignored once the offsets have been computed. Whenever
we write “Elj” or “Azj” in the remainder of this chapter, it is assumed th at the
pointing offsets in Eq. (2.2) have been applied. The actual values of the offsets are
small, ~ 0.9 arcmin. Note th at Eq. (2.2) implies th at the many-parameter pointing
model is degenerate near NCP (the same is true in any small patch of sky), which is
one reason why it is necessary to fit the model using point source observations outside
the survey region. A second reason is th at the survey region has deliberately been
chosen to be free of point sources which would contaminate the CMB observations.
The encoder also writes a timestamp for each 104 Hz sample (in Universal Coor-
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
24
dinated Time, UTC). Prom the sample timestamp, azimuth, and elevation, and the
positions of the polarimeters in the focal plane, the location of the polarimeter in
celestial coordinates (right ascension and declination) and its polarization angle can
be computed. In the analysis pipeline, we make the approximation th at the polarime­
ter measures a linear combination (determined by the polarization angle) of Stokes
parameters in whichever Healpix (Gorski et ah, 2005) pixel contains the polarimeter
location. (This approximation breaks down if the pixels are comparable to or larger
than the beam size; in §2.7 we will use a pixel window function to correct for the
effects of this approximation.) We typically use Healpix resolution N ns[^e = 1024,
corresponding to 3.5 arcmin pixels.
We introduce the pointing matrix Pip, where i indexes a timestream sample and
p indexes a sky pixel and choice of Stokes parameter, defined as follows:
Pip = contribution to sample i from pixel value p
(2-3)
By the discussion in the preceding paragraph, Pip is a sparse matrix with two nonzero
entries in each row, corresponding to the two Stokes parameters in the pixel containing
the instantaneous polarimeter location. As we will see in §2.6, it is convenient to
introduce the pointing matrix because the map making procedure becomes more
transparent in m atrix notation, e.g. the relation between the timestream data di and
the sky map sp can be compactly written:
di = PipSp + noise.
(2-4)
As the focal plane moves through its ringscan orbit, we define the ring angle p as
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
25
follows:
f
f™
\ Az i ~ —
Az oj \
p def+
= tan - i ^costE
li)—
( 2 .5 )
where Azq and E1q denote the azimuth and elevation of the ring center . The scan
direction is such th at p is increasing throughout the cycle.
2 .2 .2
G a in m o d e l
The raw d ata from the CAPMAP polarimeters is encoded as voltages, and each chan­
nel has an overall gain which relates the measured volatages to the sky polarization
in microkelvin.
There is a CAPMAP gain model which gives the gain of each channel as a func­
tion of time; in the pipeline, we write the gains to an auxiliary file as a preprocessing
step. We will not describe the gain model in detail, but just make a few brief com­
ments. At various points in the observing season, calibration data was taken, either
from observations of the known point source Tau A, or a nutating conductive plate
positioned in front of the secondary mirror. (Roughly consistent gain measurements
are obtained from both calibration sources.)
Empirically, we find three distinct gain periods in the data; some of the channels
experience abrupt gain changes from one period to the next. Throughout each gain
period, the time dependence of each channel’s gain is well-approximated by a fit of
the form (Go + G\t). We fit the calibration measurements independently in each of
the three gain periods.
2 .2 .3
N o ise m odel
First, we describe the CAPMAP noise model, which will be an im portant ingredient
throughout the pipeline. We find that, to an excellent approximation, the instrumen-
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
26
Temperature
(mK)
20
.00
60
40
-5
-1 0
Figure 2.4: Raw CAPMAP timestream data, downsampled to 1 Hz and plotted over
an arbitrarily chosen 1 0 0 sec interval, showing the sinusoidal atmospheric signal on
top of white noise fluctuations.
tal noise is uncorrelated between channels. Therefore, it suffices to consider a single
channel in isolation.
In a single channel, we denote the timestream noise by nj, where the index
i = 1 , 2 , . . . , N toci runs over timestream samples.
We will assume th at the noise
is Gaussian, so th at the noise model is completely specified by the assumed noise
covariance N^j = Cov(rq, rij). The simplest possible noise model would be to assume
no correlations between samples i ^ j :
(A fo )ij
=
(2 .6 )
where we have allowed a time-dependent variance of.
The first deviation from the noise model in Eq. (2.6) th at is evident in the data is a
~ 1 pK sinusoidal atmospheric signal, shown in Fig. 2.4. (Although the atmosphere is
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
27
expected to be unpolarized, the small level of tem perature leakage in the radiometers
combines with the large sky tem perature deviations over the ringscan, to create a
significant signal.) We can incorporate this in the noise model by adding a term
which assigns a huge variance to any “ring mode” , such as the cos(p) mode shown in
Fig. 2.4, which is to be marginalized.
5 -/V cycles
{NR)ij — ° f ^ij + e
R-iaRja
(2-7)
a= 1
where e is a small regulating constant. In Eq. (2.7), the index a runs over independent
ring modes which are to be marginalized, and R l0 denotes the value of mode a at
time i. In each ring cycle, we marginalize the lowest five Fourier modes in the ring
angle p: an additive constant (representing the DC level which is arbitrary in each
radiometer), sin(p), cos(p), sin(2p), and cos(2p). These five modes are independently
marginalized in each cycle; therefore the total number of marginalized ring modes is
(57VCyCjes) as indicated in Eq. (2.7).
Intuitively, marginalizing ring modes in each cycle amounts to a form of high-pass
filtering, which eliminates any slowly varying mode in the data. (We will see how this
works in more detail in §2.6.) Therefore, in addition to removing the atmospheric
contamination, marginalizing ring modes in the noise model does “double duty” by
filtering out ( 1 / / ) noise.
Let us briefly mention one subtlety here: the assumption th at the noise is uncor­
related in different channels breaks down for the lowest Fourier modes in each cycle.
This is because some contributions, such as the sky temperature, are common to all
channels. However, because these modes are marginalized, their values are completely
filtered out, and we can treat the noise as uncorrelated between values (since this is
true for all modes which are retained).
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
28
There is still one more ingredient in the CAPMAP noise model which we will add
to Eq. (2.7). We marginalize any timestream mode which is ground-synchronous, i.e.
which has the same dependence on ring angle p in all cycles in a contiguous run. (This
ground-synchronous structure was an im portant contaminant in the 2003 CAPMAP
dataset (Barkats et al., 2005), and much smaller in the 2005 dataset analyzed here,
but can still be detected at many sigma using the null test suite to be described in
§2 . 10.)
To marginalize ground-synchronous modes, we divide the range of values for the
ringscan angle p into
= 108 “ring bins” between p = 0 and p = 2ir. Each ring bin
then corresponds to one mode which will be marginalized. More precisely, we take
the noise covariance m atrix to be:
5 ^ c y c le s
Nij = ai^ij
+
6
^ rb
RiaRja
1
a= 1
+
e
1 ^
SibSjb
(2 .8 )
6=1
The last term corresponds to ground-synchronous mode removal; the index b ranges
over ring bins, and the m atrix element S $ is equal to one if timestream sample i falls
in ring bin b, and zero otherwise.
Eq. (2.8) is the final noise model th at we will use when analyzing CAPMAP. The
per-sample variance af, which is time-dependent and different in each channel, is not
computed from a model, but rather obtained from the data, by assuming th at cr| is
constant in each ringscan cycle, and fitting its value independently for each cycle and
polarization channel.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
29
2.3
Sim ulation
The CAPMAP simulation pipeline produces binary data files in the same format as
the raw data. The telescope pointing and timestamps in the simulations can either
have simulated ideal values (with specified sampling rate, scan radius,center, and
speed)
or“tem plated” from the real data, in order to include realistic deviations
from ideality such as wobbly orbits and dropped samples. The simulated signal is the
sum of several contributions which we now describe in detail.
First, there is a CMB signal which is simulated by first randomly generating a
Gaussian CMB in harmonic space:
afm — Gaussian, mean=0, varianc e = C ^ E
afm = Gaussian, mean=0, varianc e = C f B
(2.9)
then taking the spherical transform to obtain the CMB in pixel space:
(Q ± iU)(x) =
T ia?m)(± 2 U m (x))
(2.10)
lm
We simulate the CMB at very high pixel resolution (Healpix
= 4096, corre­
sponding to 0.9 arcmin pixels) so th at no correction for the pixel size is necessary, but
we do include convolution with the Gaussian beam, by using beam convolved power
spectra when simulating the a^TO:s in Eq. (2.9).
convolved = i ^ l )unconvolved x exP ^
8~ln~2
^
(Recall from §2.1.1 th at we treat each CAPMAP beam as Gaussian with width
#FWHM = 3.3 arcmin in W -band and #f\VH M = 6.48 arcmin in Q-band.)
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
30
When we include the CMB component in the simulated timestreams, we simply
use the value of the CMB at the center of whichever pixel contains the detector during
the timestamp of the sample. (That is, no interpolation is performed.) However,
we do perform parallel translation between the pixel center and detector location, to
avoid artifacts when the detector passes close to NCP. We checked th at this simulation
procedure has converged: if we double the resolution and rerun the end-to-end analysis
pipeline with identical RNG seeds, the effect on the final power spectrum is at most
1 0 - 3 cr.
In addition to the CMB component, the simulations include white noise, obtained
by adding an independent Gaussian random value to each sample. The noise level is
radiometer-dependent (and based on fits to data) but assumed constant in time.
2.4
C ycle finding
The simulation pipeline described in the previous subsection is not needed in principle
to analyze the data, but is invaluable in practice for code testing. Beginning with
this section, we describe the analysis pipeline which starts from the raw time-ordered
data, and ultimately produces measurements of polarization power spectra, as well
as several consistency checks. The first step is simple: we divide the timestreams to
be analyzed into a sequence of ring cycles.
We define cycles to start and end when the ring angle p (Eq. (2.5)) crosses zero.
Thus all cycles are kept “in phase” . The cycle-finding program contains pointing cuts:
a cycle is discarded if it contains one or more bad samples. A sample is considered
“bad” if any of the following are true:
1. The encoder UTC or az/el shows an “obvious” glitch (e.g. there are a few onesample events where UTC=0 was encoded). This removes a negligible fraction
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
31
of the data.
2. The distance between the encoder pointing and ring center differs from the
intended ring radius by more than 0.06 deg.
3. The time difference to the previous sample exceeds 0.02 sec, indicating dropped
buffers.
4. The value of p is less than or equal to the value from the previous sample. This
test catches extended periods where the telescope was stopped.
5. The value of p increased, relative to the value from the previous sample, by
more than
0 .8 6
deg.
Additionally, a cycle is cut if the final population count n{ for any ring bin i is outside
the range 0 .6 ( 2 1 8 8 /^ ^ ) < nl < 2.0(2188/A/ring), where N r-Ulg is the number of ring
bins.
The pointing cuts remove about 4% of the data, most of which consists of stopped
telescope motion.
2.5
D ata selection
In the next stage of the pipeline, each cycle found in the preceding stage is inspected,
and cuts for anomalous behavior are imposed. It is outside the scope of this thesis to
describe the data selection in complete detail, so we simply outline the procedure:
1.
Anomalous total power: we fit each CAPMAP total power channel to a linear
combination of the lowest five Fourier modes in the ring angle p, i.e., the fit is
of the form
Aq + A \ cos(p) + A 2 sin(p) + A 3 cos(2p) + A 4 sin( 2 p)
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
(2.12)
32
We sum the x 2 °f this fit over the nine most stable total power channels to
obtain a “grand x 2” value, and discard cycles whose grand x 2 is anomalously
high. The grand x 2 is a monitor for execess variance in the total power channels
as a monitor for excess variance in the total power channels which would indicate
bad weather or other sources of flaring up.
2
. Anomalous polarization: analogously, we fit Eq. (2 . 1 2 ) to the data in each
polarization channel, and sum over channels to obtain a polarization “grand
X2” .
In this case, we cannot cut too tightly without introducing bias, since the
cut is based on the same polarization channels used for the power spectrum
analysis, but simulations show th at a 3cr cut is safe.
3. Glitches: we exclude a cycle if the maximum excursion (maximized over samples
and channels) from the five parameter fit in Eq. (2.12) is anomalously high,
indicating a “glitch” from instrumental sources or isolated events such as birds
or airplanes.
4. Sun: we identified regions of data in which sidelobes from the Q-band beam
pass close enough to the Sun th at significant excursions are seen. The sun cuts
have 24-hour period, are only in Q-band, and are dewar-dependent.
5. Anomalous gains: several regions were cut by hand in which the gain of a
specific polarimeter was found to have an anomalous value.
The output is a set of “surviving” cycles together with a 44-bit mask in each cycle
indicating which radiometers are considered valid during the cycle. (In principle, the
pipeline allows the cuts to be independent in each channel, although in practice the
channel dependence is small, arising only through the sun and anomalous-gain cuts.)
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
33
2.6
M ap m aking
Each CAPMAP timestream can be decomposed into signal + noise as follows:
di — PipSp + 7ii
(2.13)
Here, didenotes the timestream values, represented as a vector oflength N tocp The
CMB signal in pixel space is represented as a length-(2Apjx)vector
sp , where the
index p denotes a combination of a pixel and choice of Stokes parameter. The pointing
matrix Pip was defined in §2 .2 . 1 : it encodes the contribution to timestream sample i
due to the pixel value sp .
The noise rij is modeled as Gaussian, with noise covariance Njj = Cov(rq, rij)
given by Eq. (2.8):
N = N 0 + e~ 1R T R + e - 1S T S
(2.14)
where Nq is diagonal, and the matrices R, S contain ring modes and ground-synchronous
modes to be marginalized.
The objective of map-making is to invert Eq. (2.14) statistically, arriving at an
estimator f p for the CMB signal, with associated noise covariance matrix
Npq =f Cov(sp ,Sq)
(2.15)
It is well-known (Tegmark, 1997) th at there is an optimal choice of estimator 'sp,
defined by the pair of equations:
=
V
p* N v dJ
=
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
(2-16)
34
In the map making stage of the pipeline, we compute s'p and Npq, starting from
the timestream data dt, the pointing data Pipi and the noise model in Eq. (2.14).
If dense m atrix algebra with fVrto(j-by-jVtocj matrices were computationally feasible,
then map making would be essentially trivial by direct use of Eqs. (2.16). The rest
of this section is devoted to presenting an algorithm for computing s' and N which is
mathematically equivalent, but whose computational cost is linear in the timestream
size N tod.
The (^(A^qj ) algorithm proceeds in three steps.
First, consider the following
matrix and vector quantities:
P N q 1P T
pn
^
s
1,
sn
0\
st
PN^d
SN ^d
P N q 1R t
R N q 1R t
SN ^
r
T
RN ^d
Note th at the first two are identical to the inverse noise N ~
(2.17)
1
and inverse noise
weighted map N - 1 s' defined in Eqs. (2.16), except th at the full noise covariance
N has been replaced by the diagonal version N q. (The motivation for considering the
remaining 7 quantities will be explained shortly.)
Because N q ^ is diagonal, and the matrices P, R, S are sparse, it is easy to see th at
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
35
each of the quantities in Eq. (2.17) can be computed in time 0 ( N t0ij). In implemen­
tation, we read each timestream sample from disk and accumulate its contribution
before reading the next sample; the contibutions from each sample are independent.
In the second step, we
iV” 1
define an inverse noise covariance
= lim(A^o + e“ 1JRTi ?)“ 1
= N q 1 - P N q 1R t {RN q 1R t ) - 1R N q 1P t
(2.18)
in which ring modes have been marginalized but not ground-synchronous modes. Now
consider the matrix and vector quantities:
P N ~ l P T
=
P
P
=
P N q 1S 7
N
^ S 7
SN ^S7
N
^ P
7
- P N q 1R T ( R N q 1R T ) ~ 1R N q 1P T
-
P
N q 1R
7
( R N q 1R
7 ) ~ l
R N q 1S
T
= S N q K s 7 - S N q 1R 7 ( R N q 1R T ) - 1R N q 1S T
P N ~ l d
=
P
N
^ d - P
N
Q
1R T ( R N ^ 1R 7 ) - 1R N Q 1d
SN ^d
= S N ^ d - S N Q 1R 7 (RN Q 1R 7 ) - 1R N Q 1d
(2.19)
On the right-hand side, we have written each quantity as the “diagonal” version
accumulated in the first step, minus a “ring mode correction” which can be expressed
in terms of diagonal quantities accumulated in the first step (Eq. (2.17)). It is easy
to see th a t the ring mode correction is independent for each cycle, and for the m atrix
quantities listed, amounts to a rank-5 update of a larger matrix. In implementation,
we apply the ring mode correction at the end of each cycle, to update the diagonal
quantities in Eq. (2.17) to the quantities in Eq. (2.19) which fully incorporate ring
mode removal.
Finally, for the third step, we consider the full CAPMAP noise model with ring
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
36
mode and ground-synchronous mode removal:
N~l
=
(N R + e~1S T S r
=
N r 1 - N ^ S t {£ + S N ^ S ^ S N j i 1
1
(2.20)
We have kept a small nonzero regulating constant e in the definition of 7W 1, rather
than taking the limit e —> 0 as in Eq. (2.18), for technical reasons: the operator
( S N ^ 1S t ) appearing on the right-hand side is not invertible, due to the presence of
a few timestream modes (such as a mode which is constant over the entire timestream)
which are removed by both ring and ground-synchronous mode removal.
Now consider the quantities:
PN~^Pt
= P N ]i1P T - P N - 1S T (e + S N - 1S T ) - 1S N ~ 1P T
(2.21)
P N ~ l d = P N ~ l d - P N ^ S T (e + S N ^ 1S T ) ~ 1S N ~ 1d
which incorporate both ring mode and ground-synchronous mode removal, and are
the output (Eq. (2.16)) of the map making pipeline stage. We have w ritten each
quantity as the “ring mode only” version accumulated in the second step, minus a
“ground-synchronous mode correction” which can also be computed from quantities
accumulated in the second step (Eq. (2.19)). In implementation, we apply the uni­
versal structure correction at the end of each contiguous series of raw data files, to
update the ring mode removed quantities in Eq. (2.19).
In summary, we have arrived at a three-stage algorithm for computing the opti­
mal map and associated noise covariance matrix in Eq. (2.16) with ring and groundsynchronous mode removal. We iterate over 104 Hz samples, accumulating the (in­
dependent) contribution of each sample to the “diagonal” quantities in Eq. (2.17),
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
37
which do not include mode removal. Whenever the end of a ringscan cycle is reached,
we use Eqs. (2.19) to update the diagonal quantities to their “ring mode corrected”
versions which incorporate ring mode marginalization but not ground-synchronous
marginalization. Whenever the end of a contiguous series of data files is reached,
we use Eqs. (2.21) to update the ring mode corrected quantities to their groundsynchronous corrected versions which incorporate the full CAPMAP noise model.
The final output is the optimal map and noise covariance m atrix given in Eq. (2.16).
This three-stage algorithm can be implemented efficiently; we find a running time of
25 CPU-min per 100 hours of raw data which is processed.
2.7
Likelihood analysis
Given a map and associated noise covariance (N , m ), which could either come di­
rectly from the map-making program (§2 .6 ) or from subsequent map combining and
differencing operations (§2.8, 2.9), we estimate the underlying power spectra using a
likelihood analysis. At an intuitive level, the likelihood analysis amounts to “looking
for excess covariance” in the data (beyond the noise covariance N ), by exploring the
posterior likelihood
C [ C f E , C f B , C f B \m]
(2.22)
for the polarization power spectra, given the observed map m. It can be proved (Bond
et al., 1998) th a t the likelihood analysis we will describe is an optimal procedure
for estimating power spectra, in the sense th at statistical lower limits on the power
spectrum uncertainties (the Cramer-Rao inequality) are saturated.
First, we need an expression for the signal covariance due to an assumed set of
power spectra C E E , C BB, C E B . This is computed in Appendix C, Eqs. (C.14), (C.15),
where the covariance between any pair of pixels x , y is computed in the “two-point”
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
38
frame in which the Stokes basis at each of the points x , y is aligned with the great
circle direction between the two points. Looping over each pair of pixels x, y, and
rotating back to the “global” Q /U basis, results in a (2 Np[x )-by- (2 N p[x) signal covari­
ance m atrix S which depends on the power spectra. (We mention one optimization:
in implementation, we sample the correlation functions with arcminute spacing in
6Xy = cos- 1 (x • y), and evaluate at the desired separation using cubic interpolation.
This allows us to compute signal covariance matrices “on the fly” in a few CPUseconds whenever needed throughout the analysis pipeline.)
Next, let us first write the likelihood £[m|5'] of obtaining the noisy map m, given
noise covariance N and signal covariance S. This likelihood is simply a multivariate
Gaussian with zero mean and covariance given by (S + N):
£[m\S] = det _1/ 2(5 + N ) exp
+ N ) ~ lrm ^
(2.23)
Using Bayes’ theorem, this can be inverted to give the posterior likelihood for a given
signal covariance S, given the noisy map m:
£[S'|m] oc det _ 1 / 2 (S' + N ) exp ^ - - r n ^ ( S +
P[S]
(2.24)
We have introduced a prior P[S] and used the symbol oc to indicate th at the quantity
on the right-hand side should be normalized to have integral 1 .
The likelihood function in Eq. (2.24) can be used to estimate power spectra and
assign power spectrum uncertainties, as we will now see.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
39
2 . 7 .1
B a n d p o w e r e s t i m a t e s : m a x i m u m lik e lih o o d
The first application of the likelihood function C[S\m] (Eq. (2.24)) is simple: byfinding the power spectra which maximize the likelihood, we can obtain estimates
for the power spectra given the noisy data m.
Since the small CAPMAP survey
area limits its spectral resolution to A I ~ 300, we parameterize the power spectra
by a small number of bandpowers A^; our default choice is to take 4 bandpowers for
each of { E E , B B , E B } , corresponding to the angular ranges (^mjn, £m&x) — (2, 500),
(500,800), (800,1100), and (1100,1400). Thus the signal covariance matrix S can be
written:
S = Y ,& b S b
b
(2.25)
where S is the signal covariance due to unit power (i.e. A^ = 1) in the single EE, BB,
or EB band indexed by b. By default, we use flat bandpowers: “unit power” means
th at £(£ + l)Cf/(2ir) is equal to 1 throughout the band. In EE, we occasionally use
bandpowers which are multiples of fiducial: “unit power” means th at
has the
predicted values (see Fig. 1.3) throughout the band. This allows us to easily check
consistency with the theoretical prediction: the measurement and theory agree if Aj, =
1 (within uncertainties) for EE bandpowers, and A^ = 0 for EB /BB bandpowers.
Substituting Eq. (2.25) into Eq. (2.24), the posterior likelihood for the bandpowers
A 5 can be written
logC[A\m] = ( —~Tr log(S + N) - ^ d T (5 + fV)- 1 ^
(2.26)
where we have assumed a uniform prior P[A^] in the bandpowers, and dropped an
overall constant on the right-hand side.
In this formalism, estimating the power
spectrum is equivalent to maximizing the multivariate function log £[A|m] in the
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
40
variables A 5 . Several approaches to this maximization problem have been proposed
in the literature; as we now explain, we use a new approach.
Our basic strategy is to use the BFGS maximization algorithm (Press et al., 1992),
which requires two ingredients: first, an efficient procedure for computing the gradient
5(log£)/3A ^; and second, a procedure for maximizing log £ when restricted to an
arbitrary line in parameter space, i.e. finding the value of t which maximizes the
quantity
log£(f) = log£[A^0) + t A ^ }
(2.27)
where the vectors A ^°\ A ^ are arbitrary.
For the first ingredient, computing the gradient of log £ , straightforward differen­
tiation of Eq. (2.26) gives:
31og£[A|m]
dAb
ilY ( (5 + JV)-1^ ) + \ ™ T (S + N ) - l S b(S + N T ' m
(S + N)
1
+ mmT
(2.28)
where in = (S + A )_ 1 m.
In the second line, we have written the gradient in the form Tr(MS),); once the
matrix M has been computed, this can be computed in a few CPU-seconds using
the cubic interpolation trick from the beginning of §2.7, without needing to store the
matrices S &in dense form. (The linear operation M —> TV(M S 5 ) is the transpose of
the linear operation A& —>
^b^b- and any efficient algorithm for the latter can be
converted formally to an efficient algorithm for the former.) The computational cost
of computing the gradient in Eq. (2.28) is therefore dominated by the single dense
m atrix operation (S + N) —> (S + iV)- 1 .
The second ingredient needed for the BFGS algorithm, solving the single-variable
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
41
maximization problem in Eq. (2.27), initially appears to require many evaluations of
the likelihood function, but we introduce a trick. We compute the dense matrices
SS
=
(2.29)
b
S0 + N
=
( ^ A < 0 )s ( j + J V
(2.30)
and then compute relative eigenvalues Xt and eigenvectors vt , defined by
{SS)vi
= Xi (S0 + N ) v i
(2.31)
with normalization
v f ( S 0 + N)vj = 5ij
(2.32)
We also compute the components /i? = rnTvl of the map m in the eigenbasis. A short
calculation now shows th at the ID likelihood in Eq. (2.27), obtained by restriction to
a line in parameter space, is given by (up to an irrelevant overall constant):
logc(t) =
^iog(i + tXi) + Y + t x l j
(2-33)
This shows th at a single dense m atrix operation (computing relative eigenvalues and
eigenvectors) is sufficient to write the likelihood at all points in the ID line defined
in Eq. (2.27) in a simple closed form. Once this has been done, the likelihood can
be maximized along the line using brute force evaluation, at a negligble computa­
tional cost (because dense m atrix algebra is 0(N^-]yi) whereas the cost of evaluating
Eq. (2.33) is <9(7Vpix)).
Combining these procedures for computing the gradient (Eq. (2.28)) and solving
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
42
the ID maximization problem (Eq. (2.27)), we can use the BFGS algorithm to con­
verge to the maximum likelihood bandpowers. In comparison with other approaches
such as Newton-Raphson, this method has several advantages. First, it is fast: only
two dense m atrix operations (inversion and solving an eigenproblem) are needed in
each iteration. If we needed the matrix of second derivatives,
<92 log£[A |ra]
8 A bd A v
(2.34)
- d T (S + N ^ S b i S + iV)_ 1 5 6/(5 + N ^ d
then
m atrix operations would be required. Second, because each iteration is
based on solving the ID maximization problem (Eq. (2.27)) the value of the likelihood
is guaranteed to increase in each iteration, and the region of parameter space where
(S + N) is not positive definite (so th at the likelihood is undefined) is never entered.
This is because in the “fast” form Eq. (2.33)), the positive definite region is given
by (—1/A) max < t < ( —1/A)min, so we can easily avoid leaving it. In contrast, in
other iterative methods, it is possible to “bounce” outside the positive definite region,
which complicates the implementation.
2 .7 .2
B a n d p o w e r u n c e r t a i n t i e s : c o n d itio n a l lik e lih o o d s
The procedure described in the previous subsection produces maximum likelihood
estimates for the bandpower values A b. However, this procedure is incomplete without
some means of estimating bandpower uncertainties as well. This requires exploring
the likelihood function away from maximum.
The simplest method for estimating the uncertainty on bandpower A& would be
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
43
to consider the conditional likelihood
\ogC[Ab\ m , A b^ b]
(2.35)
on A b. with the other bandpowers A y^.b fixed to their maximum likelihood values.
If the bandpowers were independent, this would preserve all information in the joint
likelihood, but in reality there are correlations (to be discussed in detail below) which
make this method approximate.
Because the conditional likelihood log C[Ab\m, A b/^Lb] is obtained by restricting
the joint likelihood to a ID line in parameter space, it can be computed efficiently
using the the eigenvalue trick from §2.7.1. As described there, after a single £>(Npix)
dense m atrix operation, the conditional likelihood can be w ritten in a simple form
(Eq. (2.33)) in which the cost of subsequent evaluations is 0 ( N pix ). This allows us
to compute the entire ID likelihood quickly, as illustrated in Fig. 2.5, where we show
conditional likelihoods log C[Ab\m, A b>^b\ for each band b, with the other bands b' ^ b
mixed to maximum likelihood values.
Because the bandpowers are not independent, this set of conditional likelihoods
does not contain all the information in the joint likelihood logC[Ab\m\. We can get a
quick look at band-band correlations by plotting the two-band conditional likelihood
on bands i»i, 6 2 , with the remaining bands b' 7 ^
, 62 fixed to maximum likelihood
values:
1°S
j A ^ IWl) A y ^ bl^2i
(2.36)
We sample this 2D likelihood efficiently by using the eigenvalue trick from §2.7.1 to
compute Eq. (2.36) on a collection of ID lines which pass through the maximum
likelihood point. This is illustrated in Fig. 2.6, where we show a contour plot of the
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
44
r~n
M I | I I I I | I I I I | I I I ]_
LiII
0
2
4
6
8
E E / fi d [2 , 5 0 0 ]
0
E E / f i d [5 0 0 ,8 0 0 ]
j
i ] i i i | i m
i.
B B /fla t [5 00 ,80 0 ]
J.
B B / f l a t [ 8 0 0 ,1 1 0 0 ]
B B / f l a t [ 1 1 0 0 ,1 4 0 0 ]
J. Vi-li II h I i
jJJl
-5
10
E B / f l a t [ 2 ,5 0 0 ]
E E /fid [110 0,1 40 0 ]
| i i i | i i i | i r
-.A i i fr-i-i
0 10 2 0 30 40
B B / f l a t [ 2 ,5 0 0 ]
E E / f i d [ 8 0 0 ,1 1 0 0 ]
2 0 -1 0 0
10 20
E B /fla t [500,800]
20 40 - 5 0
0
50
E B / f l a t [ 8 0 0 ,1 1 0 0 ] E B / f l a t [ 1 1 0 0 ,1 4 0 0 ]
Figure 2.5: Conditional likelihoods for EE, BB and EB bandpowers, with remaining
bandpowers fixed to maximum likelihood values, for simulated W -band (black/solid)
and Q-band (blue/dotted) full-season data.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
45
20
15
10
5
0
■5
0
10
20
30
40
50
Figure 2.6: Conditional likelihood for lowest-^ EE and BB bandpowers (on the x-axis
and y-axis respectively), with remaining bandpowers hxed to maximum likelihood
values, for simulated Q-band full-season data.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
46
EE bandpowers
1
-0 . 0 1 0 . 0 0 -0 . 0 0
1
- 0 .0 1
-0.09 - 0 . 0 1
1
0 .0 0
-0.09
-0.06
1
- 0 . 0 0 - 0 . 0 1 -0.06
0
.
0
2
-0.09
0 .0 1
0 .0 0
0 .0 2
0.04 0.03 - 0 . 0 2
- 0 .0 1 0 . 0 0 -0.06 0 . 0 0
- 0 .0 1 0 . 0 1
0 .0 1
0.05
-0 . 2 2 -0 . 0 0 -0 . 0 1 0 . 0 1
- 0 .0 1 -0.17 - 0 . 0 0 - 0 . 0 1
- 0 .0 1 0 . 0 2 - 0 . 1 0 0.03
- 0 . 0 0 - 0 . 0 0 - 0 . 0 1 -0.26
BB bandpowers
-0.09 0 . 0 2 - 0 .0 1 - 0 .0 1
0 .0 1
- 0 . 0 1 0.04 0 . 0 0
0 .0 1
0.03 -0.06 0 . 0 1
-0 . 0 2 0 . 0 0
0 .0 0
0.05
-0.04 - 0 .0 1 0 . 0 1
1
-0.04
1
-0.17 - 0 .0 1
1
-0.17
- 0 . 0 1 -0.17
1
0 .0 1
- 0 . 0 1 -0.17
- 0 . 1 2 - 0 . 1 0 0.04 - 0 .0 1
-0.09 -0.27 0.07 0 . 0 0
0.04 -0.09 0 . 0 0
0 .0 1
0 .0 1
-0.03 0.05 -0.32
EB bandpowers
-0 . 2 2 -0 . 0 1 -0 . 0 1 -0 . 0 0
- 0 . 0 0 -0.17 0 . 0 2 - 0 . 0 0
- 0 .0 1 - 0 . 0 0 - 0 . 1 0 - 0 .0 1
0 .0 1
- 0 . 0 1 0.03 -0.26
-0 . 1 2 - 0 . 1 0 0 . 0 1
0 .0 1
- 0 . 1 0 -0.27 0.04 -0.03
0.04 0.07 -0.09 0.05
-0.32
-0 . 0 1 0 . 0 0
0 .0 0
1
-0 . 0 2 0 . 0 0
0 .0 0
-0 . 0 2
1
-0.15 0.03
1
0 .0 0
-0.15
-0 . 1 0
1
0 .0 0
0.03 - 0 . 1 0
Table 2.2: Correlation m atrix between bandpowers, computed from 2D conditional
likelihoods with the remaining bandpowers held fixed to maximum likelihood values.
2D conditional likelihood in Eq. (2.36) for the EE, BB bandpowers in the lowest t bin,
(^mim fmax) = (2, 500). About 10 radial lines are needed to obtain smooth contours.
This procedure can be repeated for each pair of bandpowers (&i, 6 2 ). The results
are summarized in Tab. 2.2, where we show the covariance m atrix between all pairs
of bandpowers (rather than plotting each full 2D likelihood).
Note th at the eigenvalue trick, which allows us to evaluate the likelihood “all
at once” along any ID line in parameter space, allows us to reduce the effective
number of dimensions by one (relative to, say, grid sampling) when computing a
conditional likelihood. For the ID conditional likelihood (Eq. 2.35), a single evalution
is required; for the 2D likelihood (Eq. 2.36), a one-parameter family of evaluations
is required (corresponding to the polar angle around the point of joint maximum
likelihood). However, we do not use this method to explore the likelihood function
when the number of parameters is larger than two; in this case, there is a more
efficient procedure: Markov chain sampling.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
47
2 .7 .3
B a n d p o w e r u n c e r t a i n t i e s : m a r g in a liz e d lik e lih o o d s
The techniques for computing ID or 2D conditional likelihoods described in the pre­
ceding subsection are useful for a “quick look” at power spectrum uncertainties, but
fail to capture all information contained in the joint likelihood log £[A|m]. For ex­
ample, for the “bottom line” result of the experiment, we would like to report uncer­
tainties on each bandpower A;, with the remaining bandpowers
marginalized
(rather than fixed to their maximum likelihood values). If the number of bands is not
small (say > 5), it is essentially impossible to compute marginalized uncertainties by
direct sampling of the likelihood function (as we did in §2.7.2 for conditional uncertanties) without making approximations. A huge number of likelihood evaluations
would be required, in order to obtain sufficient sampling to perform the multivariate
integral. However, we can introduce another technique which is ideally suited for this
problem: Markov Chain Monte Carlo (MCMC).
The basic idea of the Markov chain method is: rather than sampling the like­
lihood function at a fixed set of points, we generate a sequence of random sam­
ples A ^ \ A ^ , A ^ , . . . which are drawn from the multivariate likelihood (or PDF)
£[A|m].
Note th at each sample is a vector A^(i) , where the index b ranges over
bandpowers. In the Markov chain approach, it is easy to generate marginalized un­
certainties: if we simply discard bandpowers b' ^ b in each sample, then the resulting
sequence of numbers will be randomly sampled from the marginalized likelihood for
bandpower b. Similarly, if we discard bandpowers b' ^
6i , 62,
then the resulting
sequence of pairs (A ^ , A^2) will be randomly sampled from the joint likelihood on
bandpowers b\ , 62 with the remaining bandpowers b1
b\,
62
marginalized. (This is
useful for, e.g. computing marginalized correlation coefficients.)
In the Markov chain method, each vector A^(i) is generated randomly starting
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
48
from the previous vector
( i — 1)
. The transition probabilities are constructed in such
a way (to be described in detail below) th at the equilibrium distribution, after many
iterations, is the likelihood function £[A&|m]. However, consecutive samples are cor­
related: the conditional distribution P [ A ^ |A ^ ^], for a fixed value of A ^
will
not be equal to the equilibrium distribution. Therefore, if we have N consecutive en­
tries in the chain, they cannot be treated as statistically independent. (At a heuristic
level, the effective number of independent samples is N / ( , where ( is the “correla­
tion length” of the chain.) In practice, this simply means th at if we are using the
Markov chain to compute a particular quantity, such as the marginalized 95% confi­
dence region on a bandpower, or the marginalized correlation coefficient between two
bandpowers, we run several chains until they converge to the same answer, rather
than running a single chain and assuming 1 / y / N statistics.
How do we generate each random vector A^
from the previous vector A^
,
such th a t the equilibrium distribution after many iterations is £[A^|m]? There are
many ideas for doing this in the Markov Chain literature; we use the best-known
method, Metropolis-Hastings sampling, which we now explain in detail.
We start by randomly generating a “proposal” vector A^r using a proposal density
(or PDF) which is a multivariate Gaussian centered at A ^
P[A(t; - 1}
A f ] = det "V2(W)exp ( - j f A f 1 - A ^ J M ^ A f - A p 1’) )
(2.37)
The choice of covariance matrix M in the proposal density will be discussed below.
We then either accept the proposal (meaning th at we take the next sample in
the chain to be A ^ = Ajj*r), or reject (meaning th at we start over by generating
a new proposal vector to be accepted or rejected). This is decided randomly, with
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
49
o
o
'H
£
o
o
LTI
m
m
-4
4
6
8
10
EE [2,500]
14
12
(microK)
16
18
Figure 2.7: Markov chain likelihood samples from the full-season simulated dataset
in Q-band. Each sample is a twelve-component vector, corresponding to the twelve
bandpowers shown in Fig. 2.5, but only the lowest-^ EE bandpower (x-axis) and
lowest-^ BB bandpower are shown here. The equilibrium distribution of the chain is
the marginalized likelihood, but each sample is correlated to the previous sample.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
50
acceptance probability given by:
,
W
=
£ [ A f H \ I P [ A f - A i r 11] ,
..V , V .
L ,' 6
\ P 1 4 ~ " ^ A?]
J
(2
38)
Note th at, with the Gaussian proposal density defined in Eq. (2.37), the second factor
on the right-hand side of Eq. (2.38) cancels, but we have written Paccept in a more
general form which does not require P[a[* ^ —> Ajj*r] and P[A^r —> A ^
to be
equal.
This two step procedure, generating proposal samples (Eq. (2.37)) which are either
accepted or rejected (Eq. (2.38)), is known as the Metropolis-Hastings algorithm.
It can be shown th at this scheme leads to the equilibrium distribution £[A^|m] as
required, for any choice of covariance matrix M in the proposal density (Eq. (2.37))
However, efficiency considerations dictate the choice of M: if M is too small, then
consecutive samples are highly correlated and the correlation length of the chain is
large. If M is too large, then the acceptance probability (Eq. (2.38)) will be small on
average and many likelihood evaluations will be required to advance in each iteration.
We make the following choice of M:
L_ ( o A82aA~
=- = =
Mr}7 }/ =
bb
v'A'hand \ d A b d A y
„...
log C[A\m]
6
(2.39)
1 ' V A = ML
where the notation (-) a =ML means th at the second derivatives are evaluated
valueof A whichmaximizes the likelihood. Thus, before running the
at the
Markov chain,
we precompute maximum likelihood bandpowers and evaluate the m atrix of second
derivatives (Eq. (2.34)).
The computational cost of Metropolis-Hastings scheme is determined by the like­
lihood evaluations needed to compute the acceptance probability in Eq. (2.38). The
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
51
most efficient way to evaluate log C is to write
log£[A|m ]
— - - l o g det (S + N) — ^-mT (S + N ) 1m
=
( 1o§(Lm) + \ ( L ~ l m )i(L ~ l m ) i jS
-
(2-40)
where L is the lower-triangular Cholesky factor of the total signal + noise:
AT + ^ A t Sj = LLr
b
(2.41)
Using Eq. (2.40), the only 0 (A rpix) operation is Cholesky factorization (note th at
L ~ ^ m can be computed with cost 0 ( N ^ X)), which is significantly faster (by a con­
stant factor) than m atrix inversion or solving an eigenproblem, although the
0
(N^-x)
asymptotic complexity is the same.
It is interesting to note that, using likelihood evaluations as in §2.7.2, it is easy
to compute conditional likelihoods but difficult to compute marginalized likelihoods.
Using Markov chains, the opposite is true: it is easy to marginalize parameters (by
simply discarding their values from the chain) but difficult to impose constraints or
make the likelihood conditional (without running a new chain).
Marginalized likelihoods from the Metropolis-Hastings chain are shown in Fig. 2.8.
2.8
Com bining maps
It is convenient to have a general-purpose program for combining two maps (N\, m \)
and (N 2 , m 2 ), produced by the map-making procedure in the preceding section. (Note
th at each “map” consists of a length-(2 7Vpjx) vector m and a (2 iVpix)- by- (2 fVp;x) noise
covariance m atrix N.)
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
52
i j i M i i i i | i i i j" i i i j i
LJ
0 20 4 0 60 80
E E / f l a t [ 2 ,5 0 0 ]
TT
_LX
L l!
E E / f l a t [ 5 0 0 ,8 0 0 ]
50 100
E E / f l a t [ 8 0 0 ,1 1 0 0 ]
0 50 100 150
E E / f l a t [ 1 1 0 0 ,1 4 0 0 ]
X
0 10 2 0 3 0 4 0
B B / f l a t [ 2 ,5 0 0 ]
B B / f l a t [5 0 0 ,8 0 0 ]
H T I I | I I I I ) I II I | I I I I j I I I I H
-10
0
10
20
E B / f l a t [ 2 ,5 0 0 ]
E B /fla t [500 ,80 0 ]
B B /fla t [800 ,11 0 0 ]
[_l
|
I I |-j
I I I |
I I I |
I |
|
M
E B / f l a t [ 8 0 0 ,1 1 0 0 ]
B B /fla t [1 1 00 ,14 0 0 ]
U_
|
I
|
|
|
|
|
I
|
|
|
|
E B / f l a t [ 1 1 0 0 ,1 4 0 0 ]
Figure 2.8: Marginalized likelihoods for EE, BB, and EB bandpowers, obtained by
histogramming Markov chain samples, for simulated W-band (black/solid) and Qband (blue/dotted) full-season data.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
53
One application of such a program is to distribute the computational cost of map
making across several CPU ’s, without needing to parallelize the linear algebra in
the map-making program (§2.6). We can simply divide the data into subsets, run a
separate instance of the serial map making program on each subset, then combine the
maps. The resulting map will be equivalent to the result of running the map making
program on the entire dataset.
A second application is to combine W -band and Q-band data, which have different
beam sizes. We do this by making two maps which correspond to all the desired data
at the two beam sizes, then combining the maps. (This is a m atter of convention:
the map making program in §2 . 6 only
emits maps with a single beam size in order
to keep the code as simple as possible; we introduce the extra complexity of distinct
beam sizes in the map combining program.)
We now describe our algorithm for combining maps. First, consider the case of
two maps ( N i , m i ) and (Ar2 , m 2 ) with the same beam size. Then the combined map
(N , m) is simply given by:
N~l
N ~lm
= N {l + N ^1
(2.42)
= N ^ lm\ + N ^ lm2
(2-43)
If the input maps cover a different set of pixels, the inverse noise covariance matrices
N F 1 are first extended (with zeroes) to the union of the pixelizations. In this case,
map combining is just a computational shortcut: the combined map (AT, m) is equal
to the map obtained by running the map-making program (§2 .6 ) on the union of the
timestream d ata used to obtain the maps (Ni, m \) and (N 2 , m 2 ).
If the beam sizes are different, then combining maps is more difficult. We emit a
“map” (TV, m) containing two distinct types of pixels corresponding to the two beam
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
54
sizes.
/
0
Ni
N
N<1 J
V °
(
\
m
(2.44)
Thus the pixel count roughly doubles when Q-band and W -band are combined.
In order to perform a likelihood analysis (§2.7) on a “doubled” map (such as
Eq. (2.44)) in which the Q-band and W -band pixels are kept separate in the noise covariance, all th at is needed is a prescription for computing signal covariance matrices.
The correct prescription is:
( 2 . 45 )
where S \ , S 2 denote signal covariance matrices with each respective beam size, and
S x denotes the signal covariance matrix computed with beam size
( 2 . 46 )
Eq. (2.45) includes correlations between the Q-band and W -band pixels which enforce
the constraint th at the CMB seen in W -band and Q-band is the same, except for the
beam size. If the off-diagonal blocks were zero (as in the noise covariance, Eq. (2.44)),
the W-band and Q-band d ata would be incorrectly treated as measuring independent
patches of sky.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
55
2.9
N ull maps
Given two maps ( N i , m \ ) and (N 2 , m 2 ), the map combining procedure in the last
section provides an optimal way for determining the power spectrum, given both
measurements.
It is also useful to have a procedure for determining whether the
two measurements are consistent with each other, i.e. whether they measure the
same underlying CMB. To determine consistency, we first compute a null map by
differencing the two maps, so th at the response to any shared CMB signal is zero.
Thus the null map should be “pure noise” if the two input maps are consistent. We
then subject the null map to the power spectrum analysis described in §2.7, to check
consistency with zero signal power. In this section, we describe our procedure for
obtaining null maps.
If the input maps ( N i , m i ) and (A 2 , m 2 ) have the same beam size, then differ­
encing is straightforward: the null map (N , m) is given by.
If the
N
= N \ + A^2
m
= m i —m 2
input maps cover a different set of pixels, the noisecovariance
(2.47)
matrices iVj
and maps mjmust first be restricted to the intersection ofthe twopixelizations.
In
contrast to the case of combining maps (§2 .8 ), it is not possible to apply inverse noise
weighting to the maps m i, m 2 , to optimize the signal-to-noise in the output map.
Straightforward differencing as in Eq. (2.47) is the only way to subtract the maps
which ensures zero response to an arbitrary CMB signal.
Now consider the case when the input beam sizes are different, say beam 1 is
larger than beam 2. We degrade map 2 to the beam size 1 (the details of the beam-
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
56
degrading operation will be described shortly), then difference using Eqs. (2.47) and
give the output map the larger beam size. We do not keep separate pixels for the two
beam sizes as in Eq. (2.44). For a null map, this does not lose information, since one
is limited by the angular resolution of the map with the worst beam size, just as the
straightforward subtraction of maps in Eq. (2.47) means th at one is limited by the
noise level of the map with the worst sensitivity.
The beam-degrading operation in real space is given in detail as follows. We
first define a degrading operator A whose m atrix element between pixels p,p' with
separation Qppt is given by
exp(—
Fpp.
'V
< 2 '4 8 )
where the coefficient a is defined by
a
41n2 ^ ( a FWHM)2
(a FWHM)2 J
The denominator in Eq. (2.48) simply normalizes so that
^2’49^
A pp/ = 1. The degrad­
ing operation on a map (N, m) is then given by
N -» A N A t
(2.50)
m —»• A m
(2.51)
When degrading the beam size, we must throw away all pixels within a few beam
widths of the survey boundary.
One mightwonder whether itis safe to perform the beam-degrading operation in
pixel space, given th at the pixel-pixel separations are roughly beam size. This can be
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
57
directly tested by asking whether the signal covariance matrices
Si
,
(A S 2A T )
(2.52)
corresponding to beam-degrading the signal covariance in harmonic and pixel space respecively, are distinguishable. This will depend on the signal-to-noise. For CAPMAP
sensitivity and beam size pixels, direct comparison of likelihood functions shows th at
the two signal covariances in Eq. (2.52) are only “distinguishable” by a fraction of a
sigma.
2.10
N ull te st suite
A typical application of the null map procedure described in the preceding section
would be to verify that, if we combine all W -band data taken during daytime, the
resulting measurement is consistent with all W -band data taken during nighttime.
We run the map-making program in §2.6 on disjoint “day” and “night” subsets of
the W-band d ata (imposing the data selection described in §2.5), obtaining maps
(-^day> m day) and (^night; m night)- This pair of maps is then taken as input to the
map differencing procedure (§2.9), obtaining a day-night null map (Nnun, mnuu).
We would like to check th at the null map is consistent with zero signal. We do
this using the likelihood analysis of §2.7: we fit EE, BB, and EB bandpowers to the
null map, obtaining the result shown in Fig. 2.9. Each fit is performed in a single
large £ band, rather than using a multiband as we do for the overall power spectrum
analysis (Fig. 2.5), because the effective sensitivity of the null map is significantly
worse than the total sensitivity of the experiment. (In units of integration time, the
loss of sensitivity is a factor ~ 4: the day/night submaps lose a factor ~ 2 relative to
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
58
1000
500
EE ( / x K 2)
BB
(/x K2)
-1 0
0
EB (/ x K 2)
10
Figure 2.9: Day/night null test for W -band CAPMAP data. The y 2 value (Eq. (2.53))
to zero for the EE, BB, and EB likelihoods is 0.01, 0.79 and 0.25 respectively, with
one degree of freedom in each y 2.
their sum, and differencing loses another factor ~
2
.)
To quantify the consistency with zero signal, we compute the probablity to exceed
zero (P T E q) for each of the three likelihood curves in Fig. 2.9. We obtain P T E q =
0.53, 0.19, 0.31 for EE, BB, and BB respectively. For combining many null tests, it
is convenient to convert each P T E q value to a y 2 with one degree of freedom:
y 2 = 2 ( e r r 1( l - 2 - P T E 0))
(2.53)
This conversion is trivial for a single likelihood curve: Eq. (2.53) has been constructed
so th at the RHS will be distributed as a y 2 with one degree of freedom in the absence
of any signal (assuming th at P T E q is uniform distributed). However, for N likelihood
curves, it is convenient to sum the y 2 for each curve, obtaining an overall y 2 with
N degrees of freedom which can be used to decide whether the overall null test suite
passes or fails. For the day/night null test in Fig. 2.9, we obtain an overall y 2 = 1.05
with 3 degrees of freedom, indicating th at the null test passes. Note th at the “neutral”
value of P T E q is 0.5, with values close to 0 or 1 indicating failure of the null test;
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
59
whereas the “neutral” value of y 2 is zero, with large values indicating failure.
The day/night null test we have described in detail is part of a larger null test
suite, where each null test is based on dividing the data into disjoint subsets. The
test suite is shown in Tab. 2.3; the individual tests are described as follows (in order
of appearance in the table):
1.
Period-period tests: We difference maps made from data from each of the three
gain periods described in §2 .2 .2 .
2. Per-dewar tests: We difference maps made from data in different dewars (Fig. 2.1).
3. Per-frequency tests: We difference maps made from data in different subfre­
quencies (Tab. 2.1).
4. Grand y 2: The “grand y 2” was introduced in §2.5 as a statistic for cutting
anomalous cycles during periods of bad weather or instrumental flareups. In this
null test, we divide channels into subsets based on whether the time-averaged y 2
of the channel is higher or lower than the median value. (This roughly monitors
the level of ( 1 / / ) noise in the channel.)
5. Universal structure y 2: Analogously, we divide channels into subsets based on
whether evidence for ground-synchronous structure is higher or lower than the
median.
6
. I —»■Q mixing: Each polarization channel has a characteristic level of I —>Q
mixing, or pickup of anisotropies in the sky temperature. In this null test, we
divide channels into subsets based on whether this level is higher or lower than
the median.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
60
W: period 1 vs 2
W: period 1 vs 3
W: period 2 vs 3
W: dewar 0 vs dewars 1+2+3
W: dewar 1 vs dewars 0+2+3
W: dewar 2 vs dewars 0+1+3
W: dewar 3 vs dewars 0+1+2
W: SO vs S1+S2
W: SI vs S0+S2
W: S2 vs S0+S1
W: good vs bad grand x 2
W: good vs bad US x 2
W: good vs bad I—>Q
W: day/night
W: high/low IF temp
W: period-period subtotal
W: per-dewar subtotal
W: S0 /S 1 /S 2 subtotal
W subtotal
Q period 1 vs 2
Q period 1 vs 3
Q period 2 vs 3
Q dewar 0 vs dewars 1+2+3
Q dewar 1 vs dewars 0+2+3
Q dewar 2 vs dewars 0+1+3
Q dewar 3 vs dewars 0+1+2
Q SO vs Si
Q good vs bad grand x 2
Q good vs bad US x 2
Q good vs bad I—>Q
Q day/night
Q high/low IF temp
Q period-period subtotal
Q per-dewar subtotal
Q subtotal
Grand total
0 .0 0 / 1
0.47/1
2.06/1
0 .0 2 / 1
0.34/1
2.99/1
1.03/1
0.65/1
0.30/1
0.17/1
0 .1 1 / 1
0.06/1
0.76/1
0 .0 1 / 1
1.25/1
2.54/3
4.38/4
1.13/3
10.23/15
2.18/1
0.33/1
2.70/1
0.50/1
0.19/1
0.18/1
2.91/1
0.32/1
0 .0 0 / 1
4.62/1
0 .0 0 / 1
0.18/1
0.87/1
5.21/3
3.78/4
14.99/13
25.22/28
0.99/1
0 .0 1 / 1
0.53/1
0 .1 2 / 1
0.98/1
0.03/1
0.09/1
0.51/1
3.77/1
0.33/1
0 .1 0 / 1
1.23/1
0.81/1
0.79/1
1.45/1
1.52/3
1.22/4
4.61/3
11.73/15
0 .0 1 / 1
0.55/1
0 .2 1 / 1
0.53/1
1 .2 0 / 1
0.38/1
0.81/1
0 .0 0 / 1
0.09/1
0.24/1
0.15/1
0.55/1
4.43/1
0.77/3
2.91/4
9.14/13
20.87/28
4.18/1
0.16/1
0.90/1
0.04/1
1.84/1
0.04/1
1.17/1
0.33/1
0.46/1
1.34/1
0 .0 0 / 1
0 .0 1 / 1
0.08/1
0.25/1
0 .0 0 / 1
5.25/3
3.08/4
2.12/3
10.80/15
0.78/1
0.40/1
0.34/1
1.24/1
3.70/1
0 .0 1 / 1
2 .0 2 / 1
0 .1 0 / 1
0 .2 2 / 1
2.30/1
1.38/1
0 .1 0 / 1
0.60/1
1.53/3
6.96/4
13.19/13
23.99/28
5.17/3
0.65/3
3.49/3
0.18/3
3.16/3
3.06/3
2.29/3
1.48/3
4.53/3
1.85/3
0.21/3
1.29/3
1.66/3
1.05/3
2.70/3
9.31/9
8 .6 8 / 1 2
7.86/9
32.76/45
2.97/3
1.28/3
3.25/3
2.26/3
5.09/3
0.56/3
5.74/3
0.42/3
0.31/3
7.16/3
1.52/3
0.83/3
5.91/3
7.51/9
13.65/12
37.31/39
70.08/84
Table 2.3: Null test suite for real CAPMAP data. The overall x 2 of the test suite is
70.08 with 84 degrees of freedom, indicating th at the null test passes.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
61
7. IF temperature: There was some evidence th at amplifier gains might depend on
the tem perature of the intermediate frequency (IF) component of the hardware,
so we include a null test based on dividing the timestream into subsets where
the IF tem perature is higher or lower than the median value.
The result of the null test suite can be summarized as a single chi-square value:
X2 = 70.08 with 84 degrees of freedom, indicating th at the testsuite passes. However,
there is an im portant caveat here. Although the chi-square values with one degree
of freedom computed in Eq. (2.53) are distributed as true x 2 variables, the same is
not true when summing over null tests: our “chi-square” values with many degrees of
freedom are not truly x 2 distributed, because of correlations between null tests. Such
a chi-square value with N degrees of freedom will still have mean N, but its variance
will be larger than (2N).
In principle, the true variance, including the effect of
correlations between null tests, could be determined by Monte Carlo simulations of the
entire null test suite. We have not done this here, since it would be computationally
expensive, and our overall chi-square value is only lcr away from the mean, even when
true x 2 statistics are assumed.
It is interesting to note th at the null test suite only passes in Q-band if groundsynchronous modes are marginalized in the map making stage (§2.6). In Tab. 2.4,
we show the result of the test suite if ground-synchronous mode removal is disabled
in the map maker. The total x 2 in Q-band increases from 37.31 to 145.19, with 39
degrees of freedom, indicating clear failure of the test suite and evidence for groundsynchronous contamination. In W-band, the total x 2 of the testsuite also gets worse,
but by a fraction of a standard deviation, so no conclusion can be drawn.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
62
W: period 1 vs 2
W: period 1 vs 3
W: period 2 vs 3
W: dewar 0 vs dewars 1+2+3
W: dewar 1 vs dewars 0+2+3
W: dewar 2 vs dewars 0+1+3
W: dewar 3 vs dewars 0+1+2
W: SO vs S1+S2
W: SI vs S0+S2
W: S2 vs S0+S1
W: good vs bad grand x 2
W: good vs bad US x 2
W: good vs bad I—>Q
W: day/night
W: high/low IF temp
W: period-period subtotal
W: per-dewar subtotal
W: S0/S1/S2 subtotal
W subtotal
Q period 1 vs 2
Q period 1 vs 3
Q period 2 vs 3
Q dewar 0 vs dewars 1+2+3
Q dewar 1 vs dewars 0+2+3
Q dewar 2 vs dewars 0+1+3
Q dewar 3 vs dewars 0+1+2
Q SO vs SI
Q good vs bad grand x 2
Q good vs bad US x 2
Q good vs bad I—>Q
Q day/night
Q high/low IF temp
Q period-period subtotal
Q per-dewar subtotal
Q subtotal
Grand total
0.38/1
0.04/1
1.27/1
0.33/1
0.82/1
2.97/1
1.81/1
0.37/1
1 .0 1 / 1
0.03/1
0 .0 0 / 1
0 .0 0 / 1
0.91/1
0.19/1
1.84/1
1.70/3
5.93/4
1.41/3
11.99/15
1.09/1
0 .8 6 / 1
1.87/1
5.57/1
1 .1 1 / 1
0.19/1
4.39/1
1.75/1
9.25/1
14.26/1
3.87/1
0.15/1
1.53/1
3.82/3
11.26/4
45.90/13
57.89/28
0.89/1
0.07/1
0.40/1
0 .0 1 / 1
1.38/1
1.56/1
0.03/1
0.55/1
4.40/1
1.27/1
0.09/1
1.74/1
1.08/1
1.17/1
0.63/1
1.35/3
2.98/4
6.22/3
15.26/15
0.32/1
0.08/1
0.08/1
8.43/1
3.80/1
1.57/1
2.52/1
3.25/1
21.92/1
0 .0 1 / 1
9.22/1
0.03/1
6.05/1
0.48/3
16.33/4
57.29/13
72.55/28
4.50/1
0.49/1
1 .0 2 / 1
0.26/1
2 .1 0 / 1
0.23/1
0 .2 1 / 1
0.46/1
0.03/1
1.97/1
0.28/1
0.03/1
0 .1 2 / 1
0 .1 0 / 1
0.06/1
6.02/3
2.80/4
2.46/3
11.88/15
0.32/1
0.24/1
1 .2 1 / 1
18.49/1
3.47/1
0.93/1
3.67/1
2.25/1
2.33/1
5.07/1
3.03/1
0.52/1
0.48/1
1.78/3
26.55/4
42.01/13
53.89/28
5.78/3
0.60/3
2.70/3
0.59/3
4.29/3
4.76/3
2.06/3
1.38/3
5.44/3
3.27/3
0.38/3
1.77/3
2.12/3
1.46/3
2.53/3
9.07/9
11.71/12
10.09/9
39.13/45
1.74/3
1.19/3
3.16/3
32.48/3
8.39/3
2.68/3
10.58/3
7.25/3
33.51/3
19.34/3
16.12/3
0.69/3
8.06/3
6.08/9
54.13/12
145.19/39
184.32/84
Table 2.4: Null test suite for real CAPMAP data, with ground-synchronous mode
removal disabled in the map making stage, showing clear failure of the test suite in
Q-band.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
63
M
b
~+
-5 0
M
c
CM
s.
o
-+ -5 0
w
S'
3
50
c
N
o
-+ -5 0
0
500
1000
Multipole 1
Figure 2.10: Final power spectrum measurement in from the full-season simulated
dataset, with marginalized uncertainties computed by Markov chain sampling. The
three sets of error bars represent W -band data alone (left), Q-band data alone (mid­
dle), and both frequencies combined (right).
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
64
o
0.5
1
1.5
E E /fid u cia l (sin gle band)
2
Figure 2.11: One-band likelihood in a multiple of the fiducial EE spectrum, for the
full-season simulated dataset with both frequencies combined. Zero power is excluded
at 13o\
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
65
2.11
Final power spectrum
We conclude this chapter by presenting the final power spectrum measurements for
the full-season simulation, as a proxy for the final results of the analysis, to appear
shortly.
In Fig. 2.10, we show the measurement of the EE, BB, and EB power
spectrum in flat bandpowers, for W -band data alone, Q-band alone, and the two
frequency channels combined.
To assess the overall statistical significance, we perform a one-band fit for EE in the
“shape” of the fiducial model, with BB and EB set to zero. The resulting likelihood
curve is shown in Fig. 2.11; the lcr range for this measurement is (1.02 ± 0.16). This
suggests a 6 <r detection, but the actual significance of excluding zero power is much
higher (13cr). This is because the likelihood curve becomes very non-Gaussian at low
values due to the presence of a small number of modes with high signal-to-noise.
In conclusion, we have developed an optimal, likelihood-based analysis pipeline
for the CAPMAP dataset which incorporates a complex noise model without ap­
proximations and includes a variety of statistical techniques such as Markov chains.
Results from a null test suite (§2.10) have been finalized and indicate consistency of
the dataset when it is divided in many different ways. The final power spectrum result
is not yet available at the time of this writing, but the pipeline has been demonstrated
on realistic simulations, and final results are expected this fall.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
CH APTER 3
P U R E P S E U D O -Q ESTIM A TO RS
3.1
Introduction
The CAPMAP pipeline described in the previous chapter performs a likelihood-based
analysis and obtains power spectrum uncertainties which are provably optimal (Bond
et ah, 1998). We have seen th at many dense m atrix operations, with computational
cost <3(lVpix), are needed. In CAPMAP, the m atrix size is typically ~ 5000, so such
operations are easily affordable. However, they will soon become a bottleneck for
upcoming experiments which will target B-modes with increasing sky coverage and
resolution. For future generations of polarization experiments, likelihood analysis will
probably be infeasible.
An alternative method, which has become the “industry standard” for CMB tem ­
perature experiments with Npix > 105, is using p s e u d o -q u a d r a tic estimators (Wandelt et ah, 2001). In the polarization context, pseudo-C^ estimators mix E into B,
in the following sense: for a noiseless CMB realization containing only E-modes, the
estimated B-mode bandpowers will be nonzero. In an ensemble of such realizations,
the estimated B-mode bandpowers will be zero in the mean (this is because pseudoC# estimators are unbiased by construction), but because they are nonzero in each
realization, E-mode signal power does contribute to the variance of the B-mode es­
timators. This extra variance can be thought of as a source of noise which is due to
the estimators alone. Recently, Challinor & Chon (2005) showed th at this “estimator
noise” can dominate the sample variance from lensing B-modes, so th at it becomes
the dominant contaminant if the instrumental noise is sufficiently small. For surveys
with f sky ~ 0.01, they show th at it limits the value of T / S which can be detected to
66
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
67
~ 0 . 05 .
The purpose of this chapter is to construct pure p s e u d o -e s tim a to rs ; these are
modified versions of polarization pseudo-Q estimators, which do not mix E into B
in the sense defined above. Using these estimators, the estimated B-mode power
will be zero for any noiseless CMB realization which contains only E-modes. Strictly
speaking, this is only true in the continuum limit; in a finite pixelization, the B-mode
estimators will acquire small nonzero values from pixelization artifacts. We show th at
these pixelization effects can be made arbitrarily small by increasing the resolution.
The basic idea of our construction was inspired by the pure B-mode formalism
of (Lewis et ah, 2002; Bunn et al., 2003). Because pure B-modes exist in any finite
patch of sky, a BB power spectrum estimator will receive no contribution from EE
signal power if it is built entirely out of these modes. Our pure p s e u d o -e s tim a to r s
have the property th at the observed polarization map always appears contracted
with a pure B-mode which is constructed from heuristically chosen weight functions.
The main technical difficulty is ensuring th at the fast algorithm for calculating the
transfer m atrix (Hansen & Gorski, 2003, Appendix E) still goes through after this
modification.
Throughout this chapter, we use a fiducial ACDM cosmology which is consistent
with WMAP (Spergel et ah, 2003), with parameters {
Qm h?
r > |A i?|2,
n s, w, rrij/ } = { 0.024, 0.14, 0.73, 0.17, 2.57 x 10- 9 , 1, -1, 0 }. For pixelized all-sky
maps, we use the Healpix coordinate system (Gorski et al., 2005) exclusively. We
use a phenomonological definition of T / S , defining it to be the ratio of temperature
multipoles Cf™™'/Cf™l*r .
This chapter is organized as follows. In §3.3, we briefly review pseudo-Q esti­
mators, before constructing pure p s e u d o -e s tim a to r s in §3.4. In §3.5, we consider
a spherical cap shaped mock survey with uniform white noise, using it to illustrate
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
68
general features of our estimators, and show th at our modification significantly im­
proves the performance of the estimators for noise levels < 20 /iK-arcmin. As a first
step toward more realistic instrumental noise, in §3.6 we consider a survey with inhomogeneous, but not spatially correlated, noise and show th at the same conclusions
apply. In §3.7, we study a range of noise levels and show th at our estimators are
roughly 80% of optimal, defined by the degradation in the value of (T/ S ) which can
be detected, for all noise levels considered. We conclude in §3.8.
3.2
N otation and C onventions
We represent CMB polarization as a symmetric traceless tensor
as in Kamionkowski
et al. (1997b), but we have changed some conventions to agree with those of CMBFAST and Healpix. The all-sky metric gab and antisymmetric tensor ea& are given
by
\
1
0
sin#
-s in #
0
0
9ab
(3.1)
eab
0
sin 2 0
)
We define basis polarization fields
1
0
0
-s in 2#
Qab
u ab
0
sin#
sin#
0
7
\
(3.2)
o
Given two points x, x', we define X a to be the vector at x which points away from x'
along the great circle arc which connects the two, vector Ya —
traceless tensors
and symmetric
= ( X aXi, —YaYi))/2, Ua&= ( X aYi)+YaXi))/2. The quantities X f,
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
69
Y r, Q ', U 1 are
defined in the same way with x , x ; interchanged. Note th at we always
use lower case to distinguish the basis qab, uai)} which is globally defined (except at
poles), from the “two-point” basis Q ab, U ^ , which is defined relative to a pair of
points x , x.'.
E
and B modes are defined as follows. First, we define operators
£ ab
~
"b 2 ^ a^
&ab ~
(3-3)
2 eac^ C^ b + 2 ebc^C^ a
which take scalar fields to symmetric traceless tensors. E and B harmonics are defined
by
j ( e S T ) e ( e + i ) ( t + 2)£abYlm
y (fm)os
Y (£m)ab
=
^
( 3
' 4 >
_ 1 ) ( ( ( + 1)(< +
We will also need some analagous definitions for the spin - 1 case. The “two-point”
basis X a, Ya has already been defined; a global basis x a, Ha is defined by:
xa
=
1 0
Va
=
0 sin 6>
•
(3.5)
Spin- 1 harmonics, which we label G and C for “gradient” and “curl” , are defined by
i ) VaYem
5i ° m>a
Y (lm )a
=
~ ^ / e ( i + i ) ta b ^ bYem
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
( 3 ' 6 )
*3'7^
70
3.3
Pseudo-CV estim ators
Pseudo-Q> quadratic estimators for polarization power spectra have been previously
constructed by Hansen & Gorski (2003), who used the p s e u d o -fo rm a lis m (Wandelt
et al., 2001). In this section, we briefly recall the pseudo-Gf construction, largely for
the sake of establishing notation.
One first chooses a pixel-dependent weight function W (x) which is zero outside the
survey region. The choice is made heuristically, but the performance of the estimators
is improved (on angular scales with signal-to-noise ratio < 1) by choosing W (x) to be
smaller where the noise is larger, in order to downweight noisy regions. Frequently,
the weight function is also apodized near the survey boundary, in order to reduce
harmonic ringing. One then defines pseudo E and B multipoles by (the tildes signify
“pseudo” )
E lm =
^
2n
‘‘‘ (x)H /(x)y(^ ) a4 (x)
(3.8)
X
B tm =
^ 2 n « ‘ (x )iv (x )y (f ^ ( x ) .
and pseudo power spectra by
a
EE
(3.9)
m=—l
Cf B
=
m——l
It can be shown th at the expectation values of the pseudo power spectra are given by
+
\ A et'
w )
NiBB
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
(3.10)
71
where K^,, are £max-by-£max transfer matrices and N B E , N BB are £max~by-1 vec­
tors which represent additive noise bias. There is an efficient algorithm, which will
be discussed in Appendix D, for exactly computing the transfer matrices from the
weight function. The noise bias can be computed exactly in cases where the noise is
uncorrelated between pixels (Appendix E), or by Monte Carlo in the general case.
Unbiased power spectrum estimators C E E , C BB can be obtained from the pseudo
power spectra C E E , C BB by simply subtracting the noise bias and applying the
inverse of the (2£imax)-by-(2£max) “grand unified transfer m atrix” :
(
def
V
a
BB
/
-1
K+
~
\ K ££'
KW
' Cf,E - 1V ,F ^
KW
+
rBB
\ CT/
(3.11)
N bb
The preceding construction has assumed th at the power spectrum is estimated at
every multipole £. For reasons of sky coverage or signal-to-noise, it is often necessary
to bin multipoles into bandpowers with A£ > 1. In this case, for each band b, one
defines pseudo bandpowers
c F = E
EE
C fB = E
p« s t
PMCf B .
(3.12)
where the m atrix P defines the £ weighting within each bandpower estimator. A
commonly-used choice is (Hivon et al., 2001):
1
1 (1 + 1 )
^ Jb+l)Jb)
Pbl
lo w
-r
11
Jb) < p Jb+1 )
How - 1 < how
lo w
0
(3.13)
otherwise.
One also introduces a m atrix P , which defines an interpolation scheme by which the
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
72
signal power spectra depend on bandpowers A^:
c?B =E ^
z^EE \ ' p a E E
b
Cl
- 2^
BB
(3-14)
A commonly-used choice, corresponding to piecewise flat power spectra, is:
2?r
P tb =
<
:r p(b) < p
11 h a w - 1
(3,15)
otherwise.
0
The binned analogs
n(b+ 1)
< h ow
of the transfer matrices and noise bias vectors (Eq.
(3.10)) are defined by:
EE\
(Ci
Kw
(c?B)
Kw
Kw
Kw
AE E
n ee
(3.16)
+
a bb
and are related to the unbinned versions by: K
k n bb
/
= PyK^Pyy.
= P yN(. The
binned analogs of the unbiased estimators (Eq. (3.11)) are defined by:
P>E E
V
KW
C EE _ N EE
KW
PBB
rB B
V
K bb'
K bV
\ u b'
N, BB
(3.17)
b'
Strictly speaking, these are unbiased estimators of the bandpowers A^, when the
CMB power spectra are of the precise form (3.14).
3.4
C onstructing pure pseudo-C^ estim ators
We now construct a modified version of the B-mode power spectrum estimator which
is “pure” , in the sense th at the estimator is identically zero in any noiseless realization
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
73
of the CMB which contains only E modes. It is convenient to explain the construction
first in the continuum limit (§3.4.1), where sums over pixels can be replaced by
integrals, and then address implementational issues associated with finite pixelized
maps (§3.4.2).
3-4-1
C o n s t r u c t i n g p u r e p s e u d o - C i e s t i m a t o r s 1: c o n t i n u u m l i m i t
The underlying reason why the unmodified p s e u d o - e s tim a to r is not “pure” , in
the sense defined above, can be understood as follows. The pseudo B multipoles are
defined by
Blm
j d2x - f g 2 \ l ‘ \ x ) W ( x ) Y ^ )al>(x)
(3.18)
Since multiplication in position space mixes E and B modes, the product W ( X)Y^m^afj(x )
is a mixture of E and B, even though ^ / ^ ^ ( z ) is a B-mode. Therefore, Bgm receives
a nonzero contribution from E-mode power in IP*1.
The basic idea of our construction is now easy to explain. Writing out the defini­
tion of
from Eq. (3.4), the definition of B(m above can be rewritten
Bfm
=
—,
x
y / ( e - i ) e ( e + i)(£ + 2 )
(3.19)
d2x ^ 2 n ab(x)W (x )B ab( Y Z x ) )
f
Suppose th at this definition is modified by simply moving W (x) inside the operator
Bab-
B Z ‘
lm
=‘
1
-J{1 - ! ) < ( £ + ! ) ( < +
/
2)
x
d2x ^ g 2 H ab(x)Bab(W(x)Y;m (x))
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
(3.20)
74
In addition, suppose that the weight function W (x ) and its gradient vanish on the
boundary of the region.
Then B ^ e will be identically zero in a noiseless CMB
realization which contains only E modes. One way to see this is from the perspective of
the pure B-mode formalism (Lewis et al., 2002; Bunn et al., 2003); since ( W (x)Y£m (x))
satisfies both Dirichlet and Neumann boundary conditions, Bab{W{x)Y^rn{x)) is a
pure B-mode. Alternately, a short, self-contained proof is given as follows. A noiseless
CMB realization which contains only E-modes can be written Hab = £ab(t), for some
scalar function 4>{x). Putting this into the definition of B ^ e and integrating by
parts twice, one gets
\ J { t —
+
!)(£ +
2)
~ p U re
(3.21)
lm
= J d2x ^ f g £ abcf)(x)
Bab(W(x)Y;m (x))
= J d2Xy/g (_ V aV 6 + l g abV 2)f{x)
x
= J d2x yjg eacVcVi,(V“V'' - L oi,V 2)0 M
X
W (x )Y ;m (x)
- (j) dcr ta (_ V aV 6 + \ g abV 2)(t>{x)
x V b(W{x)Ytm {x))
- j> da t a (Va + \ v aV 2)(j>{x) W ( x ) Y lm (x)
£
= 0.
Here, §da denotes integration around the survey boundary and ta denotes the unit
tangent vector. The boundary terms are zero because W (x) and V aIT (x) are assumed
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
75
to vanish on the boundary.
To avoid confusion, we will denote the “pure” pseudo B multipole defined in Eq.
(3.20) by B g ^ e, and denote the “mixed” multipole defined by Eq. (3.18) by
for the rest of this chapter. This one change, replacing
by B ^ e, is the only
modification we propose to the pseudo-C^ formalism of §3.3. In particular, we leave
the definitions of E^m and C f E unchanged.
By analogy with the definition of
tipole
, it is possible to define a “pure E” mul-
, but we expect th at incorporating this would worsen the performance
of the EE power spectrum estimator. This is because ambiguous modes, which are
useful for measuring E-mode power, would be discarded. Note th at we have named
our estimators “pure pseudo-Cf” even though we are really using a “mixed” estimator
for E-modes and a “pure” estimator for B-modes.
3 -4 - 2
C o n s t r u c t i n g p u r e p s e u d o - C t e s t i m a t o r s 2: f i n i t e p i x e l i z e d
m aps
Before moving on to construct pseudo power spectra and transfer matrices for the
modified estimators, we explain how the definition of B p^fie (Eq. (3.20)) is to be
interpreted when n a^(x) is a finite pixelized map, instead of an idealized continuous
field. The technical obstacle is making sense out of the object Bab{WY^m) which
contains covariant derivatives, and appears on the right hand side of Eq. (3.20). We
first expand Bab(WYim) using the product rule
B M M
=
( B a b h ) h + h ( B ubf 2 )
+2Totai( V 7 l ) ( V ‘i/2 ),
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
(3.22)
76
where we have introduced the tensor
Tabcd = ~^eac9bd d" ~^ead9bc d" ~^bc9ad d" ~^f-bd9ac-
(3.23)
(Note th at contracting a spin-2 object with T results in a 45° rotation, e.g. TahC(iqcd =
~ u abi Tab c d ~ ~^ab-)
Performing the derivatives which act on Yj>m, and replacing the integral by a sum
over pixels, this results in the following form for B ^ e. (We do not include the pixel
area as an overall prefactor, since the normalization of B ^ e will eventually drop
out when we define unbiased power spectrum estimators in Eq. (3.30). However,
straightforward conversion of the integral to a sum does implicitly assume an equalarea pixelization.)
=
E
2n
° ^ W
X)y (?m)a(,W
(3-24)
X
+N't £
2T°bcdn ab(x)Wc(x)Y<’,* )(i(x)
X
+Nt £
2 T " n „ t (x)H U x)>7m(x)
X
where we have defined
W a = V aW
w ab =
(3.25)
(VaVj - (1/2)9o6V2)IV
N't = 2 / V ( « - l ) ( < + 2)
Nf =
+
+
In the form (3.24), the only covariant derivatives are those which act on the weight
function in the definitions (3.25) of W a, W ^ . If W (x) is of known analytical form (e.g.,
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
77
Gaussian or cosine apodization), then W a, W ab can simply be computed analytically.
More generally, assuming th at W{x) is slowly varying compared to the pixel scale
(which will be the case for any sensible weighting), W a and
can be computed
from W (x ) by finite differencing. This is nontrival to implement for an irregular
spherical pixelization such as Healpix; we present one finite differencing scheme in
detail in Appendix B.
In either case, once a prescription for computing Wa, Wafj from W has been
specified, we use Eq. (3.24) as the definition of B ^ e in a finite pixelization. Note
th a t the first term on the right-hand side is the unmodified pseudo multipole B J ^ xed;
the second and third terms can be thought of as counterterms which cancel the E-B
mixing and involve spin - 1 and spin - 2 weight functions Wa, Waiy The relative weights
of the three terms are ^-dependent, in a way which downweights the counterterms
at high t. This is in qualitative agreement with the general picture of E-B mixing
arising from survey boundaries (Bunn, 2002; Lewis et al., 2002) in which the mixing
is a less significant contaminant at high I.
The definition (3.24) suggests an efficient algorithm for evaluating the pseudo
multipoles B ^ e. The first term on the right-hand side is computed, for all (£,
m) simultaneously, using a fast spin - 2 spherical harmonic transform of the weighted
polarization field W ( x ) l i ^ x ) . The second and third terms are analagously computed
using a spin - 1 transform of the vector field
2T abcdn abw c =
( n Qw Y - a u w x )xd
+ ( n Qiv x + n u w Y )yd
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
(3.26)
78
and a spin - 0 (ordinary spherical harmonic) transform of the scalar function
2T abcdn abw cd = n QW u - i i u W q .
(3 .27)
(In Eqs. (3.26) and (3.27), we use subscripts X , Y to denote components of a vector
field in the “global” { x a,y aj basis, and subscripts Q,U to denote components of
a spin-2 field in the {qa5 , u ai} basis.) The prefactors Nj,, i\fy are applied after the
transforms. This algorithm computes every B ^ e from the map IIa 5 (x), with the
same asymptotic complexity 0{fimax) as in the unmodified pseudo-6 ^ formalism of
§3.3, but the overall constant is 2-3 times worse, since one spin-0, one spin-1, and
one spin- 2 harmonic transform are needed, rather than a single spin - 2 transform.
Since fast spin - 1 harmonic transforms are not implemented in the Healpix library, we
implemented them as part of this thesis; the details are presented in Appendix A.
We have now constructed “pure” pseudo multipoles B ^ e for finite pixelized
maps; weconclude this section by defining pure pseudo power spectra and unbiased
estimators of the power spectrum. This is done in complete analogy with §3.3. The
pseudo power spectrum (jBB,pure .g ^egnef|
d ? B,PUre - S T l
' t
SZ mB Z e
(3.28)
m = —£
The expectation values of ( jBB'Pure an(j the unmodified pseudo spectrum C f E are
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
79
given by
(C fE)
^BB,pure^
+
Kif
K\It'
-pure T^+pure
K]
Kw
N
+
( C EE ^
U£'
r BB
\
f E
(3.29)
BB,pure
\ Nt
An essential ingredient in the method is a fast algorithm for computing the new
transfer matrices K ^ f ure. In Appendix D, we present such an algorithm and show
th at the computational cost is 0( £max), which is the same as the cost of evaluating
the estimators once. (Noise bias is discussed in Appendix E.) The m atrix K ^ f ure
measures the contribution to (CBB,pure) by E-mode power, and is therefore expected
to be zero. Strictly speaking, this is only true in the continuum limit; in a finite
pixelization, K ^ f ure will acquire a small nonzero value from pixelization artifacts.
This will be studied quantitatively in §3.5.
Finally, we define pure pseudo-Cy estimators (jEE^Pure gBB,pure ^ sjmpiy sub­
tracting noise bias and applying the inverse of the (2 £max)-by- ( 2 (.max) transfer matrix:
( piEE,pnire \
u£
^BB,pure
KW
K
—pure
K
C
X
K,i f
+pure
f E
-
\
-l
(3.30)
N E E
q B B _ j^BB,pure
Note that, even though we have not changed the definition of the pseudo spectrum
C ¥ E , the definition of the unbiased estimator C ¥ E has been modified in our formal­
ism. This is because the m atrix inversion on the right-hand side of Eq. (3.30) mixes all
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
80
rows of the (2£max)-by-(2£max) matrix. In practice, we have found th at the change in
C f E is miniscule, but we do introduce the notation (jEE’Pure
distinguish between
the two versions.
3.5
A n exam ple w ith hom ogeneous noise
In the preceding section, pure p s e u d o -e s tim a to r s (jEE’P'u re; (jBB,pure^ wj1^cj1
not mix E and B modes have been defined. In this section, these estimators will
be studied in detail for a specific mock survey, namely a spherical “cap” of radius
r = 13°, with uniform white noise, and a Gaussian beam with 25’ FWHM. Parameters
similar to these were originally proposed for the QUIET experiment, which provided
the original motivation for the pure pseudo-Q construction. Power spectra will be
estimated in “flat” (Cg oc !/£(£ + 1 )) bands with A l = 40. An exception is the lowest
band, where we have restricted the £ range to 10 < I < 40; we have found th at trying
to estimate power at £ <
10,
which is above the survey scale, gives spurious results.
We will assess the performance of the estimators by comparing their Monte Carlo
covariance with the Fisher matrix,
FW = 1 t t ( S 6 (S 0 + N J -'S y fS o + N ) - 1)
(3.31)
Here, N is the (2Npix )-by-(2Npix ) noise covariance matrix, Sq is the signal covariance
m atrix in the fiducial model, and
is the signal covariance m atrix associated to
each flat bandpower in EE or BB. The Cramer-Rao inequality asserts th at the Fisher
m atrix is a lower bound on the covariance of any unbiased power spectrum estimator;
conversely, minimum-variance unbiased quadratic estimators (Tegmark, 1997) have
a covariance m atrix which is equal to the Fisher matrix, but require dense matrix
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
81
computations which are prohibitively expensive for the survey sizes considered in this
chapter.
The infeasibility of dense matrix algebra also means th at the Fisher matrix (3.31)
cannot be computed in a straightforward fashion. Our method for making the cal­
culation affordable is to exploit azimuthal symmetry of the survey region and noise.
Taking a Fourier transform in the azimuthal coordinate p, the matrices S, N will
be block diagonal in the azimuthal wavenumber m (S is still dense in the coordi­
nate 0), which makes the m atrix operations in Eq. (3.31) affordable. Details of the
method are presented in Appendix F. We note th at this method also could be used to
make optimal power spectrum estimation affordable, but it applies only for surveys
in which both the sky coverage and noise are azimuthally symmetric. Such surveys
therefore permit benchmark comparisons between pseudo-Q estimators and optimal,
likelihood-based methods, for survey sizes which are large enough th at comparison
would normally be infeasible.
To use the pseudo-6 ^ method, one must heuristically choose a weight function
W(x). We will make a modest effort to optimize this by hand, deferring a systematic
framework to §4. The only possibility considered for W (x ) will be cosine apodization,
1
for 9 < r —r*
for r —r* < 9 < r
\
0
(3.32)
for 0 > r
The parameter r* is an apodization length which will be chosen shortly. Note th at
this choice of W (x ) satisfies the requirement of §3.4: both W and its gradient vanish
at the survey boundary. We obtain the spin-1 and spin - 2 weights W a and
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
(Eq.
82
(3.25)) by analytic calculation, rather than finite differencing:
r —0
r*
r —0
r*
(3.33)
Qab
for r —r* < 9 < r, and 0 otherwise.
We note th at the “pure” multipole B ^ e is defined (Eq. 3.24) as the overlap
integral between 11°^(x) and the pure B-mode
(3.34)
+N ,T abaiWcd(x)Yl m (x).
It is illuminating to consider the behavior of this mode for varying £ and r*. In Figure
3.1, the mode is shown for (£, m) = (10,1) and (£, m) = (50,1), taking r* = 7°. At
£ = 10, the second and third terms in (3.34) dominate the first, and the mode is
concentrated in the apodization region (r —r*) < 9 < r, which is undesirable from a
signal-to-noise perspective. At £ = 50, the contribution of these terms is comparable
to the main term, and the statistical weight is distributed throughout the survey
region.
This illustrates a general point: the “pure” estimators defined in §3.4 require
apodization near the survey boundary. In the tophat limit r* —> 0, the derivatives
of W (x ) increase in magnitude; the third term in the mode (3.34) dominates the
others, and is concentrated at the boundary. In this limit, B ^ e formally receives
no contributions from E-modes, but achieves this using counterterms which are line
integrals around the boundary, to cancel the E-B mixing. (Compare Eq. (4) of Lewis
(2003).) Naturally, for noisy data, this ruins the performance of the power spectrum
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
83
10
5
0
-s
-1 0
IffJP ;
Figure 3.1: The pure B-mode (3.34), for (£,m) = (10,1) (left) and (50,1) (right),
using apodization length r* = 5°. Only the real part is shown. In the left panel, the
second and third terms, which are concentrated in the apodization region, dominate
the main term.
estimator. In a finite apodization, the counterterms in (3.34) are effectively “smeared”
over a nonzero area near the boundary, and the resulting power spectrum estimators
are sensible in the presence of noise. By comparison, unmodified pseudo-C^ estimators
behave reasonably even for a tophat weight function; apodization serves the milder
purpose of reducing Fourier ringing.
Another general feature of our method is th at more apodization is optimal at
low i. At i — 10, Figure 3.1 shows th at the counterterms in (3.34) are dominant,
if r* = 5°. Since increasing r* decreases the magnitude of the counterterms, this
suggests th at the optimal apodization length at £ = 10 is greater than r* = 5 ° . In
contrast, at £ = 50, Figure 3.1 suggests th at r* ~ 5° is roughly optimal.
Next we show the behavior of the transfer matrices K K ^ ure. In Figure 3.2,
the m atrix entries are shown for varying b', with the band b fixed at £min = 80,
tmax = 120. These m atrix entries can be interpreted as the contribution to (CE E )
and (c EB,pure) from E-mode and B-mode power in the band b'. The small nonzero
value of K by Ure represents contamination of the B-mode estimators by E-mode power
in a finite pixelization, and goes to zero in the continuum limit N s
—> oo.
The transfer m atrix formalism can also be used to calculate the total contribution
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
84
“I 1-1--1—I-1-1-1--1-1—I-1-1-1-]--1--1—I-1”
bb'
0.1
0.01
0.1
K; r
0.01
■bb'
0.001
0.001
0.0001
0.0001
K: r re N„,h = 1 2 8
10 - '
(N’
0
50
100
150
M u ltip o le 1
200
50
=512)
100
150
200
M u ltip o le 1
Figure 3.2: Transfer functions K ^ , (left) and K j^fure (right), shown for varying b',
with the band b fixed at £rnm = 80, £rnax = 120. The apodization length used was
r* = 5 ° . The m atrix entry K ^ , represents the mean response of the BB estimator in
band b to BB power in band b'\ K ^ , represents the response of the BB estimator to
EE power. As N sl(ie
oo, K ^ f ure approaches zero.
to {('jBB’Pure) from all E-mode power in the fiducial model, including power at i >
200. One can think of this as the pseudo power spectrum of E-mode power which is
aliased to B by the pixelization. The result is shown in Figure 3.3; for comparison,
a gravity wave B-mode spectrum with T / S = 0.01 is also shown. As expected, the
aliased power goes to zero in the continuum limit. The level of aliased power, relative
to the T / S = 0.01 gravity wave signal, suggests th at detecting such a small B-mode
signal requires a massive amount of overpixelization. We will quantify this better in
§3.7.
We now take up the issue of choosing the apodization length r* in Eq. (3.32).
In Figure 3.4, the Monte Carlo RMS scatter of the unbiased estimator (jB B ^Pure js
shown for varying r*, in two bands b: the lowest band 10 < £ < 40, and the second
lowest band 40 < £ < 80. For both bands, the estimator performance degrades
sharply when r* is chosen smaller than the optimal value, but the optimal value is
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
85
0.1
0.1
= 128
0.01
0.01
s id e
0.001
s id e
0.001
/w
=1024
0.0001
0.0001
0
50
100
150
M u ltip o le 1
s id e
0
50
10 0
150
M u ltip o le 1
200
Figure 3.3: Contribution to ( C ^ ^ ’pMre) fro: n the fiducial E-mode power spectrum, for
zero beam (left) and FWHM 25’ (right). T his can be interpreted as the estimated Bmode power which is contributed by E-mo< les due to pixelization artifacts. A gravity
wave B-mode spectrum with T / S = 0 . 0 1 i:? shown for scale. The apodization length
used was r* = 5°.
r* ~
8°
for the first band and r* ~ 3° fo r the second. This is consistent with the
qualitative discussion after Eq. (3.34). For the higher bands, repeating this analysis
shows th at r* ~ 3° is roughly optimal. C'ur solution to this problem is to use one
weight function W (x), with r* = 8 ° for the lowest £ band, and a different W (x), with
r* = 3° for the higher bands.
We briefly describe the implementatioi lal details associated with using different
weight functions in different I bands. We ei nphasize th at these comments apply to all
types of pseudo-Q> estimators, not just th e pure estimators studied in this chapter,
To evaluate the estimators, one first compi ites a separate set of B(m for each weight
function. Then, in each band b, the pseui lo power spectrum Q, is computed using
the Bgm corresponding to the appropriate weight function. Generally speaking, one
could use any number N wt < N^an(i of weij £ht functions, at the expense of increasing
the computational cost, since N wt sets o: ? spherical harmonic transforms must be
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
86
0.03
0.03
cT 0.02
0.02
T
w
+
0.01
0.01
0
5
10
r. (d e g )
0
5
10
r, (d e g )
Figure 3.4: RMS bandpower errors for pure p s e u d o - e s tim a to r s (blue/solid),
unmodified psuedo-Cf estimators (green/dashed), and optimal Fisher errors
(red/dotted), shown for varying apodization length r*. In the left panel, the band
10 < £ < 40 is shown; in this band, the performance of unmodified p s e u d o - e s t i ­
mators is very poor and the dashed curve is not visible. In the right panel, the band
40 < I < 80 is shown. The noise level used was 20 //K-arcmin.
computed. (The increase in the cost is smaller than a factor N wt , since the transforms
are 0 { ^ ax). For the specific example in this section, using a different weight function
in the lowest band, with i max = 40, has a very small impact on the running time.)
When computing the transfer m atrix element
one uses the weight function
associated to band b.
We have now specified the weight functions which will be used for the mock
survey in this section. To make a fair comparison between the pure and unmodified
p s e u d o -e s tim a to rs , we optimized the apodization length r* independently for the
unmodified versions, but found th at a tophat weight function (r* =
0)
was optimal in
all bands. In the lowest t band (10 < £ < 40), we found th at unmodified pseudo-Cf
estimator performance was extremely poor (Figure 3.4). If E-modes are artificially
removed from the signal, then it improves dramatically, suggesting th at this is due to
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
87
severe E-B mixing in the lowest band.
In Figure 3.5, we have compared (Monte Carlo) RMS errors of pure pseudoestimators, RMS errors of the unmodified p s e u d o -e s tim a to rs , and the Fisher bound
(3.31). This was done for two choices of signal and noise level: first, a T / S = 0.2
gravity wave signal and noise level 20 /iK-arcmin, and second, a T / S — 0.05 gravity
wave signal and noise level 10 pK-arcmin. For E-modes, the performance of the pure
and unmodified estimators is the same, as stated at the end of §3.4, and close to
optimal except in the lowest band. For B-modes, the relative performance depends
on the noise level: for the first choice, the pure estimators perform slightly better than
the unmodified versions, and slightly worse than the Fisher bound, in all bands. The
exception is the lowest band, in which the performance of the unmodified estimator
is very poor. For the second choice of noise level, the performance of the unmodified
estimators has degraded significantly, but the performance of the pure estimators
remains about the same relative to optimal.
3.6
A n exam ple w ith inhom ogeneous noise
In the previous section, we studied a mock survey with homogeneous white noise. As
a first step toward more realistic noise models, in this section we study noise which
is inhomogeneous, but not correlated between pixels. We will use the same region as
in §3.5, namely a spherical cap of radius r = 13°, but the following special form for
the noise covariance:
<Q(x)Q(x')) = <f/(x)£/(x'))
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
(3.35)
88
50
0.05
W 40
*
0.025
0
0
200
400
600
800
0
150
200
100
150
M u ltip o le 1
200
100
0.02
o
30
T
co
+
0.01
20
0
0
200
400
M u ltip o le 1
600
800
0
50
Figure 3.5: RMS bandpower errors for pure p s e u d o -e s tim a to rs (blue/left), unmod­
ified pseudo-Q estimators (green/middle), and optimal Fisher errors (red/right). The
survey region is a spherical cap of radius 13° with uniform white noise. The top two
panels are for a fiducial model with T / S = 0.2 and noise level 20 pK-arcmin; the
bottom two panels are for a fiducial model with T / S = 0.05 and noise level 10 pKarcmin. In each pair of panels, bandpowers are shown for E-modes (left) and B-modes
(right). In the E-mode panels, multipoles £ < 250 have been replotted on a log scale
for visibility. In the B-mode panels, the lensing component of the power spectrum has
been shown separately (dashed); detecting the gravity wave signal requires measuring
power in excess of this level.
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
89
Here, r) is a constant with units /iK-arcmin which we will use to quote the noise level.
The normalization r / ( l —cosr) is included so th at the total integrated sensitivity
will be the same as a homogeneous survey whose noise level is 77. As in §3.5, we will
study two sensitivity levels:
77
= 20 /iK-arcmin and rj = 10 /iK-arcmin. Because the
noise covariance (3.35) is azimuthally symmetric, optimal power spectrum errors can
be calculated using the method of Appendix F, and used to benchmark pseudo-C/
power spectrum estimation.
To motivate the form (3.35), consider a single detector which makes constantvelocity scans through the center of the survey region, at a variety of angles. If the
distribution of angles is uniform, the scans are rapid, and the timestream noise is
white, then the resulting noise in the pixel domain will be given by (3.35).
We must first choose pixel weight functions which will be used to construct pseudoCf estimators for the noise (3.35). Considering E-mode estimators first, we have
found th a t a tophat weighting is significantly suboptimal at high £, in contrast to the
homogeneous case. (A similar result for tem perature estimators appears in (Hansen
et al., 2002), where it is shown th at Gaussian apodization improves a tophat weight
function in a mock survey with inhomogeneous noise, but not in the homogeneous
case.) However, we were able to construct E-mode estimators which were 95% of
optimal, for all £ > 80, by using weight functions of the form
“
5 E (S JT ?
( 3 '3 6 )
The form of W t {0) was motivated by the observation that inverse noise weighting,
W{6) = l/sin (0 ), should be optimal in the noise-dominated limit; we included a
regulator e to smooth the singularity at 9 = 0. The values of e used are shown in
Table 3.1. We were unable to achieve the same level of EE estimator performance
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
90
77 = 20 //K-arcmin
£ range
weight function
£ < 440
tophat weighting
400 < £ < 560 W e{d), e = 0.070
560 < £ <680 W e(9), e = 0.025
£ > 680
W e(B), e = 0.010
?7 = 10 /iK-arcmin
£ < 600
tophat weighting
600 < I < 720 W t (Q), e = 0.070
720 < £ <800 Wc(0), e = 0.017
£ > 800
We(fl), e = 0.008
Table 3.1: Pixel weight functions used to construct E-mode estimators for the mock
surveys in this section.
using cosine apodization (3.32) in place of the apodization (3.36), or using fewer than
four different weight functions.
This example highlights a practical issue for pseudo-Q power spectrum estima­
tion: for a given survey, how does one choose pixel weight functions which optimize
performance of the estimators? This issue is separate from the E-B separation prob­
lem which is the focus of this chapter; we have seen here th at it is im portant even for
E-mode power spectrum estimation alone. In §4, will investigate this question sys­
tematically and devise algorithms for constructing optimized pixel weight functions.
Turning now to weight functions for B-mode estimators, we have found th at cosine
apodization (3.32) results in p s e u d o - e s tim a to r s whose performance (relative to
optimal) is similar to the homogeneous case. However, best results were obtained
using apodization length r* = r = 13° in all bands, in contrast to the homogeneous
case where we used r* = 8° for 10 < £ < 40 and r* = 3° for £ > 40.
We have now completely specified weight functions; the performance of the pseudoCg estimators is shown in Figure 3.6. The results are similar to the homogeneous case.
At sensitivity rj = 20 fiK-arcmin, the pure B-mode estimators perform slightly better
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
91
than the unmodified estimators, and slightly worse then optimal. At increased sensi­
tivity r) = 10 /iK-arcmin, the performance of the pure B-mode estimators remains the
same relative to optimal, but the unmodified estimators have significantly degraded.
Our conclusion in this section is th at inhomogeneous noise, at least for the specific
form (3.35) studied here, does not pose a problem in principle for power spectrum
estimation using pure pseudo-Q estimators, although the practical issue of choosing
suitable pixel weight functions remains.
3.7
D etectab ility o f gravity wave B -m odes
In this section, we will study the smallest gravity wave B-mode signal which can be
detected using our pure pseudo-C/ estimators, as a function of the noise level. We
use the survey parameters of §3.5: a 13° spherical cap with uniform noise, and define
p s e u d o -e s tim a to r s using the I bands and pixel weight functions defined in th at
section.
In the left panel of Figure 3.7, we have considered a “wide-beam” survey with
a 25’ Gaussian beam. The lensing B-mode signal is treated as an extra source of
Gaussian noise. The minimum (T/ S) which is detectable at l a is given by
c r / s ) 1<r=
(3.37)
where C ^ denotes the bandpower covariance of the estimators (including noise con­
tribution) in a fiducial model with T / S = 0, and (SE)^ denotes the estimator mean
contributed by a tensor B-mode signal with T / S = 1. We note th at (SE)^ can be
computed exactly using the transfer m atrix formalism of Appendix D, but C ^ must
be computed by Monte Carlo.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
92
50
0.05
40
3i 30
K
0.025
C\i
+
0
0
200
400
600
800
50
0
50
0
50
100
150
200
100
150
200
0.02
W 40
a.
0.01
+
o
0
200
400
M u ltip o le 1
600
800
M u ltip o le 1
Figure 3.6: RMS bandpower errors for pure pseudo-C/ estimators (blue/left), un­
modified pseudo-C/ estimators (green/middle), and optimal Fisher errors (red/right).
The survey region is a spherical cap of radius 13° with inhomogeneous noise given by
(3.35). The top two panels are for a fiducial model with T / S = 0.2 and noise level
20 /iK-arc:min; the bottom two panels are for a fiducial model with T / S = 0.05 and
noise level 10 /iK-arcmin. In each pair of panels, bandpowers are shown for E-modes
(left) and B-modes (right).
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
93
0.1
1
0.01
0.001
0.01
0.0001
N„ h = 2 5 6
N*h = 1 0 2 4
10-®
0.001
0.1
1
N o ise (/u K -a r c m in )
10
0.1
1
10
N o ise ( /iK - a r c m in )
Figure 3.7: Minimum T /S detectable at la, for a 13° spherical cap survey with
uniform noise, assuming a 25’ beam and Gaussian lensing contaminant (left), and
zero beam and no lensing contamination (right). The solid line is the optimal Fisher
value, the dotted line is the Monte Carlo value using unmodified pseudo-Q estimators
(§3.3), and the dashed lines are Monte Carlo values using pure p s e u d o -e s tim a to rs .
For the parameters in the right panel, finite-pixelization artifacts impose a “floor”
on the value of T /S which can be detected, but the value can be made arbitrarily
small by increasing the resolution. For the parameters in the left panel, we find th at
increasing the resolution beyond N s^ e = 256 results in no significant improvement.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
94
In Figure 3.7 (left panel), we show the minimum detectable T / S for both pure and
unmodified pseudo-C/ estimators, with the optimal Fisher value (computed using the
method of Appendix F) shown for comparison. In all three cases, there is a floor to
the gravity wave signal which can be detected, arising from lensing contamination,
even in the limit of zero noise. W ith unmodified pseudo-C/ estimators, we find the
best possible l a detection is T / S = 0.042, which is reached at a total sensitivity of
~ 10 /iK-arcmin. This agrees well with the results of Challinor & Chon (2005). W ith
pure psuedo-C'/ estimators, the best possible lcr detection is T / S = 0.00170, which
is reached at total sensitivity ~ 2 /iK-arcmin. For comparison, the optimal value is
T / S = 0.00133, or 78%, so the performance of the estimators is close to optimal.
These (T/ S) values are specific to the survey region considered here, and depend on
the geometry in a way which involves boundary E-B mixing and does not scale as
—
1/2
fsky
as m°de-counting arguments would suggest (Amarie et al., 2005).
In principle, these values of (T/ S ) are not necessarily ultimate limits, since we have
arrived at them by treating the lensing B-modes as a Gaussian contaminant which
can not be separated from the gravity wave signal. In actuality, lensing B-modes are
non-Gaussian, and “delensing” algorithms have been proposed (Hu & Okamoto, 2002;
Okamoto & Hu, 2003; Hirata & Seljak, 2003) which exploit this to separate the gravity
wave and lensing components, reducing the level of contamination. The amount of
lensing contamination which can be removed in this way depends on the noise level
and beam size.
In this thesis, we make no attem pt to model residual delensing
errors realistically. Instead, we note th at the result of delensing will lie between two
extremes: no separation of the gravity wave and lensing B-modes (which has just been
considered), and complete separation. In the right panel of Figure 3.7, we consider
the latter extreme; we completely remove the lensing component of the B-mode power
spectrum, and show {T/ S ) \ a as a function of noise level. We have also used zero beam
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
95
size, to crudely reflect the fact th at delensing algorithms require measuring modes at
high t to separate the gravity wave and lensing signals at low I. Under these extreme
assumptions, there is no limit in our framework to the detectability of gravity wave
B-modes as the signal-to-noise improves, since we do not consider other sources of
contamination such as astrophysical foregrounds.
Using unmodified pseudo-C^ estimators, the smallest gravity wave signal which
can be detected is the same as in the left panel, T / S — 0.042; removing the lensing
component of the B-modes results in no improvement. Rephrasing, the extra BB
estimator covariance which is contributed by E-modes dominates the contribution
from lensing B-modes. Using pure pseudo-Q estimators, there is a floor to (T / S ) \ a
in any fixed pixelization, but the floor improves without limit as the resolution is
increased (Nsi^e = 256, 512,1024 are shown.) This is consistent with the discussion
in §3.5; in a finite pixelization there is some contamination of the B-mode estimators
by E-modes, but the contamination goes to zero in the continuum limit. For any fixed
noise level, one can choose a sufficiently high resolution so th at the contamination is
negligible. If this is done, then the value of ( T / S ) \ a obtained is approximately 80%
of the optimal Fisher value, for all noise levels considered in the right panel of Figure
3.7.
The results of this section illustrate a qualitative difference between our pure
pseudo-C^ estimators and the unmodified versions. As the signal-to-noise improves
in a fixed survey region, a floor is revealed to the gravity wave signal which can
be detected using unmodified pseudo-C^ estimators. This is not the case for pure
pseudo-C/s, although a second smaller floor is eventually revealed when the signalto-noise becomes good enough th at the sensitivity is limited by contamination from
lensing. Going beyond this will require separating the lensing signal using delensing
algorithms. We have crudely studied this regime by assuming perfect delensing and
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
96
zero beam size; under these assumptions, we find th at the estimators alone impose
no limit to the value of {T/S) which can be detected.
3.8
Conclusion
In this chapter, we have defined pure pseudo-Q estimators, which have the property
th at the estimated B-mode power receives no contribution from E-modes, even on a
cut sky. As a consequence, E-mode signal power does not contribute to the variance
of the B-mode estimators. The pixel weight functions for these estimators have spin - 1
and spin-2 components (3.25), which can be obtained from the spin-0 component by
covariant differentiation if its form is analytical, or finite differencing (Appendix B)
in general. In contrast to the usual pseudo-C/ formalism, the spin-0 weight function
must have a finite apodization near the boundary, and obey Dirichlet and Neumann
boundary conditions. We give an algorithm (Appendix D) for computing the pseudoCf transfer m atrix which is used for debiasing, and show th at its computational cost
is
0
(£^ia;r), which is the same as the cost of evaluating the estimators.
We have studied these estimators in detail for mock surveys on a 13° spherical cap,
using both homogeneous noise, and inhomogeneous noise of a specific form (3.35). In
both cases, we found th at for B-mode power spectrum estimation, pure pseudo-C/
estimators performed slightly better than the unmodified versions at noise level
20
//K-arcmin, and much better at noise level 10 /iK-arcmin. In the homogeneous case,
we considered a wide range of noise levels, and showed th at the pure estimators are
~ 80% of optimal, defined by the degradation in { T / S ) \ a which can be detected. The
80% figure is obtained in two limiting cases: assuming no separation of the gravity
wave and lensing signals, and perfect separation. In the latter case, there is no limit,
imposed by the estimators alone, to the value of {T/ S) which can be detected. In
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
97
constrast, using unmodified p s e u d o -e s tim a to rs , we have found th at the smallest
signal which can be detected in a patch with this geometry is { T / S ) \ a = 0.042, in
agreement with Challinor & Chon (2005).
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
CHAPTER 4
A G EN ER A L SO LUTIO N TO TH E E-B M IX IN G
PR O BLEM
4.1
Introduction
In a finite patch of sky, the E - B decomposition framework needs modification. It
is no longer true th at any polarization field can be uniquely decomposed into pure
E and pure 5 - modes.
A new set of modes, so-called “ambiguous” , needs to be
introduced (Bunn et al., 2003). These new modes receive contributions from both
E and 5-modes. In practice because the E signal is expected to be so much larger
than the B one, the ambiguous modes will be dominated by E signal. Thus all the
information about the cosmological 5-modes is contained in the pure 5-modes, which
are orthogonal to both pure E and ambiguous modes. The leakage of E signal into
5 can be thought of as an additional source of noise on top of th at introduced by the
detector, but a source of noise th at can be eliminated by a suitable choice of variables.
Analyzing the d ata of the next generation of experiments is potentially quite chal­
lenging. Measuring the power spectra using a fully optimal likelihood based analysis
requires 0 ( N ^ X) operations and is probably not feasible for this next generation of
experiments. The alternative technique, widely used for CMB tem perature experi­
ments, is the pseudo-Q> quadratic method (Wandelt et al., 1998). Unfortunately in
a finite patch of the sky the pseudo-Cys do not isolate the pure 5-modes before con­
structing the quadratic estimators. As a result the quadratic estimator is contamined
by the leakage from the ambiguous modes. This contamination can be removed on
average by taking suitable linear combinations of the pseudo-Cy s. However the leaked
power still contributes to the variance of the estimators, significantly degrading their
98
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
99
ability to measure the amplitude of the cosmological B-modes.
In §3, we introduced an improved technique: pure pseudo-C^ estimators. This
technique preserves the simplicity of the pseudo-C /s but at the same time ensures
th at no E —>B mixing occurs.
In this chapter we will look at the pure pseudo-Q from a new perspective. Our
different approach will allow us to understand various features of the pure pseudo-CV’s
and also give guidance as to how to choose the weight function th at the technique
requires. The main result, a general ansatz for optimizing the pseudo-Q weight func­
tion given arbitrary signal and noise covariance, is presented in §4.5. Using this ansatz
to generate weight functions in an autom ated way, we study several mock surveys,
culminating in a realistic simulation of a fiducial experiment, with characteristics
based on the upcoming balloon based EBEX (Oxley et al., 2004). Our simulations
have inhomogeneous noise, complicated boundaries and point source masking. We
will argue th a t the pseudo-Q> estimators give near-optimal power spectrum errors
on all angular scales. This resolves a long-standing practical issue in the pseudomethod: the lack of an algorithm for choosing weight functions in complex situa­
tions. Previous studies have had to rely on optimizing weight functions by hand,
using expensive Monte Carlo simulations to compute the estimator variance for each
candidate weight function considered. At the same time, we preserve the strong E - B
separation of the pure pseudo-Q method, achieving excellent B-mode errors even
for the low noise levels anticipated for the upcoming generation of CMB polarization
experiments (Oxley et al., 2004; Taylor et al., 2004; Yoon et al., 2006).
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
100
4.2
4-2.1
Prelim inaries
S pin two notation
First we will introduce our notation. We will follow closely (Bunn et al., 2003). The
(linear) polarization of the CMB is described in terms of the Stokes parameters Q
and U. The definition of Q and U depends on the coordinate system chosen. In this
subsection we review definitions th at are valid for the full sky, so we will use spherical
coordinates to define Q and U.
We will follow the notation of (Zaldarriaga & Seljak, 1997). The Stokes parameters
can be combined to form a spin-2 combination (Q + iU) and a spin-(—2) combina­
tion (Q — iU). In the full sky these combinations can be decomposed using spin-2
harmonics,
Q + iU = y ] &2 ,lm 2 Yl m
Q — iU =
i
Im
a—2,lm (—2
(4-1)
Im
It is natural to introduce a scalar (E) and a pseudoscalar (B) field to describe po­
larization. The expansion coefficients of these two fields in (ordinary spin-0) spherical
harmonics are
aE,lm = ~ { a2,lm T a —2,lm)/‘l
i
aB,lm ~ i( a2,lm ~ a —2
(4-2)
These two functions completely characterize any polarization field on the sphere
(Zaldarriaga & Seljak, 1997).
They are im portant physically because cosmologi­
cal density perturbations cannot create 5-type polarization while gravity waves can
(Kamionkowski et al., 1997b; Zaldarriaga & Seljak, 1997).
The spin-2 harmonics in equation (4.1) can be related to the usual spin-0 spherical
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
101
harmonics by means of two first-order differential operators, the spin-raising ( 6 ) and
spin-lowering ( 6 ) operators (Zaldarriaga & Seljak, 1997). When applied to the
spin-weighted spherical harmonics, these operators yield the following identities:
& sYim
=
[(I ~ s ) (l + s +
& sYlm =
1)]1//2 s + l ^ l m
~[(l + s)(l ~ s + 1) ] 1//2 s-lYlm-
(4-3)
We can show that:
XE
=
[6 6 ( Q + iU) + 6 6 ( Q - i U ) \/2
=
- ^ [ ( ( + 2 ) ! / ( i - 2 ) ! ] 1/ V l m i ' l m
Im
Xb
=
i[$ 5 (Q + iU) - 6 6 (Q - i U) \/2
=
£ [ ( ( + 2)!/(( - 2
(
4
.
4
)
Im
Thus we can take linear combinations of second derivatives of the Stokes parameters
and obtain variables th a t depend only on £ or on S .
We pause to note th at the reason why E and B are the focus of attention instead
of XEi XB is partly a m atter of convention. Perhaps more importantly, E and B have
the same power spectrum on small scales as the Stokes parameters, while th at of %
differs by a factor of ~ Z4. As a result while white noise in Q and U translates into
white noise in E and B , it becomes colored noise for y.
4-2.2
Small-angle approxim ation
If one works over a small patch of sky, one can use the small-angle (flat-sky) ap­
proximation. When working in this limit, it is more natural to measure the Stokes
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
102
parameters with respect to a Cartesian coordinate system (x, y) instead of the usual
polar coordinate axis. In the flat-sky approximation polarization is expanded in terms
of Fourier modes,
C ix i'
\
d2l
E(l)
U( x ) J ■ / (EY-
cos 2(p
(
—sm 2(f)
•
+ m
^sin 2(f)j
cos 2(f)
o»lx
(4.5)
)\
where x = (x, y) and 1 = /(cos (b, sin (p). The differential operators reduce to simply
4-2.3
$
=
~ ( d x + idy),
(4.6)
&
=
~ { d x - i d y ).
(4.7)
The am biguous m odes
On a manifold without boundary, any polarization field can be uniquely separated
into an E part and a B part. But if there is a boundary (i.e., if only some subset Q
of the sky has been observed), this decomposition is not unique. In this section we
summarize the results of (Bunn et al., 2003).
Polarization fields living on 0 form a normed vector space with the inner product
f (QQ1+ UU')dVL,
JQ.
(4.8)
and we say th at two fields (Q+iU) and ( Q' +i Ur) are orthogonal if their inner product
vanishes. We refer to a polarization field as
• E if it has vanishing
x b ->
B if it has vanishing XEi
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
103
• pure 5 if it is orthogonal to all 5-fields, and
• pure B if it is orthogonal to all 5-fields.
On the complete sky, every polarization field can be uniquely represented as a
linear combination of an E field and a B field, and all E fields are perpendicular to
all B fields. In other words, the space of all polarization fields is the direct sum of
two orthogonal subspaces: the space of all E fields and the space of all B fields. In
this case, there is no distinction between an E field and a “pure 5 ” field.
But if only some subset of the sky has been observed,so th at flis a manifold with
boundary, then this decomposition is not unique. Oneway to see this is to note th at
there are modes th at satisfy both the 5-m ode and 5-m ode conditions simultaneously.
When we split a polarization field into an 5 part and a 5 part, these “ambiguous”
modes can go into either component.
In order to make the 5 / 5 decomposition
unique, we must first project out the ambiguous modes.
The existence of ambiguous modes has im portant implications at the time of
measuring the 5-m ode power spectrum. Here we concentrate on “simple” quadratic
methods th at make no attem pt to filter those ambiguous modes at the level of the
map (such as Wandelt et al. (1998)). A full likelihood analysis would automatically
deal with the ambiguous modes but it will probably be computationally prohibitive.
It might be feasible for low resolution maps, becoming the method of choice for
constraining the gravitational wave amplitude (Efstathiou, 2006).
It is easiest to understand the issue to focus on the pseudo-Q> method, but the
point applies generally. In the pseudo-C/ method one just measures the power spectra
of 5 and 5 as one would do in the full sky and simply masks out the unobserved
regions. After this procedure both C f and C f spectra are dominated by 5-modes.
An illustrative example is shown in figure 4.1. We considered a circular region 10° in
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
104
radius. The 5 and B power spectra were measured by simply masking the unobserved
region but otherwise proceeding as if one had a full sky measurement. The calculation
was performed in the flat sky approximation which is adequate for this size region.
The input polarization field had 5-m odes only. In the figure we show the measured
B mode power spectrum spectra, which is produced by leakage of the 5-modes. For
comparison we show the expected 5-m ode power spectrum if T / S =
0 .1
together
with the 5-modes produced by gravitational lensing. Also for reference we plot the
—
1/2
detector noise power spectrum for wp '
= 5.75 /iK-arcmin and an
8
arcminute
beam. This level of noise is within the range of expected values for the upcoming
EBEX balloon experiment (Oxley et ah, 2004).
The pseudo-6 / method then takes linear combinations of the measured C /’s to
eliminate the 5-m ode contribution in the hopes of revealing the cosmological 5 modes. The point is th at this procedure makes the 5-m ode power spectra estimator
independent of 5 in the mean, but the variance of the estimator is still dominated
by the leaked power.
Perhaps an extreme example is helpful to clarify the problem further. Imagine
th a t cosmological parameters were known perfectly so th at there is no uncertainty
in the 5-m ode spectra. Then the leaked 5 power is perfectly predicted and can be
subtracted from the measured signal to leave an unbiased C f estimator. This is
analogous to what one does with detector noise. Unfortunately just as with noise the
variance of the estimator is increased by the leaked power. The pseudo-Q- technique
does even worse than this because it does not assume any knowledge of the C f so it
uses only linear combinations th at are on average independent of 5 for any choice of
C f . (In reality for a small patch this is not possible and some assumptions such as
constancy of the 5 power spectrum in bands must be used. These assumptions are
expected to be reasonably well satisfied.)
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
105
10*
Lensing + GW
Leakage
N oise
10- *
io-»
Lensing
10
,GW
100
1000
Figure 4.1: Illustration of the leakage due to finite sky coverage in the ordinary
p s e u d o -te c h n iq u e . A circular region 10 degrees in radius was used to estimate the
pseudo-Q ’s. The leakage curve shows the 5-m odes obtained even in the absence of
any cosmological 5-m odes just from the leakage of E. For comparison we also show
the expected cosmological signal for T / S = 0.1 and the noise power spectrum for an
—
experiment with wp
1/2
= 5 .7 5 /iK-arcmin and an
8
arcminute beam.
Figure 4.1 illustrates the severity of the problem for experiments with noise levels
comparable to those of our fiducial experiment (5.75 pK-arcmin). The leaked power is
larger than the detector noise all the way to I ~ 400, so using the pseudo-C/> technique
is clearly wasteful. Note th at even on scales where the lensing signal dominates there
would be significant degradation.
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
106
4.3
A voiding leakage by m easuring
xb
When trying to measure the 5-m ode power spectrum in a finite patch there is a simple
way to avoid unwanted leakage from 5-modes: estimate the power spectrum starting
from the
xb
map.
In the next section we will show th at this is mathematically
equivalent to the pure pseudo-C*£ technique. It is important to realize th a t in practice
no information is being lost by doing this. By construction
xb
has no contribution
from either pure 5-modes or ambiguous modes. Although in principle there is some
information on the amplitude of 5-modes in the ambiguous modes, in practice th at
signal is completely swamped by the contribution of the 5-m odes so there is no real
loss of information by working with XBPerhaps an analogy helps at this point. One could consider a scalar field, like the
CMB tem perature, and wonder whether there is any loss of information if one were
to compute the Laplacian of the map and extract the power spectrum from that. In
the full sky there would be no loss of information but in a small patch one would
lose. In the full sky there is a one to one relation between the Laplacian map and the
underlying tem perature map (except for the monopole and dipole which are lost when
one takes the Laplacian, but they do not carry cosmological information); the only
difference is th at both signal and noise power spectra will be multiplied by £^(£ + l ) 2
which does not alter how well one can constrain each multipole. In a finite patch
the situation is different because there are non-zero tem perature fields on the finite
patch th at can have zero Laplacian. Those modes are filtered out by the “Laplaciantechnique” with the resulting loss of information. In our case the analogue of the
modes th at are being filtered out are the ambiguous modes which we want to remove
anyway. Thus for constraining the 5-m ode amplitude, starting with the XB field is
sufficient.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
107
Of course the fact th a t the XB maP contains for practical purposes all the cos­
mological information does not mean th at if we extract its power spectrum in a
non-optimal way we will still conserve all the information in the data. If in prac­
tice (as we will do in later sections) we use a pseudo-Cf technique to extract power
spectra, it may be th a t allowing some leakage is still advantageous. As long as the
leaked power is small compared to the detector noise it is most certainly harmless.
We will see th at it is indeed the case th at small amounts of leakage are preferred by
our technique. We will postpone discussion of this point to later.
4-3.1
A very blue spectrum : the need f o r apodization
The additional derivatives needed to construct
\B
make its power spectrum very
blue. We will show in this section th at this induces high levels of aliasing unless
appropriate apodizations are used. Thus apodization will be an integral part of any
technique th a t starts with xbTo understand the issue it is best to write down the formulas in the flat sky
approximation. We assume th at the Stokes parameters have been measured in a
certain region of the sky and th at XB was calculated by finite differencing. Then
the x b fidd is multiplied by a window function W (in real space) and the Fourier
components of
(W x b )
are measured and squared to estimate the power spectrum.
The weight W is zero outside the observed patch. If derivatives are taken using nearest
neighbors, one also loses an additional pixel around the edge of the map. Note that
this loss is directly related to the counting of ambiguous modes as described in Bunn
et al. (2003).
The resulting power estimates are the convolution of the XB power spectrum and
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
108
the Fourier transform of the window,
0“
=
J
f
J il l
___
^ 2 IW
- O il2 C f B -
(4.9)
The XB power spectrum is now so blue th at unless the window function decreases
fast with I, all estimates of power will be dominated by the small scale modes.
One way of seeing why this is a problem is to consider normalizing the window
such th at ||W (0 )|p = 1. The estimator in equation (4.9) estimates the power at 1
but has aliasing from other modes. Even if one considered the idealized situation in
which the higher I power spectra were known so th at the aliasing could be calculated
and subtracted, the aliased power would still contribute to the noise of the estimator.
If the aliased power dominates the integral in Eq. (4.9), the additional variance is
large compared to the signal one is after with the resulting loss of information. Thus
one would want to avoid having such aliasing if possible.
The high I behavior of ||W(1 ) | | 2 is directly related to the number of normal deriva­
tives of W th a t vanish at the boundary of the patch. In fact
lim ||JT(1) | | 2 oc r 2(n+2),
I—>00
(4.10)
where n is the number of derivatives of the window th at vanish at the boundary with
n — 0 meaning th at the window itself is continuous. Thus requiring th at equation
(4.9) converges requires using a window th at is smoother at the boundary. For ex­
ample if we compare measuring the power spectra of the Stokes parameters without
apodization to measuring the power spectrum of XBi compensating the additional
/4
factor in the XB spectrum requires having both the window and its first derivative
vanish.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
109
Reducing the aliasing requires considering weights th at decrease continuously to
zero as one gets to the border; on the other hand, if too much apodization is used,
then one is effectively using a smaller patch of sky thus decreasing the signal to noise.
We will use an improved version of this argument to find an optimal choice of W in
section 4.5. Here we just want to mention th at when dealing with XBi because of its
very blue nature, we are forced to confront the issue of apodization straight on.
At this point the reader may wonder if using XB is such a good idea as one
has to take derivatives of the data, deal with extremely blue spectra, etc. In the
next section we will show th at our technique is equivalent to the “pure” pseudotechnique. Thus all the intuition we gain focusing on XB whl be directly translated to
th at other method. The pure pseudo-Cf formalism also clarifies the issue of statistical
weight; we will see that, even though the apodized window may suggest th at the edges
of the survey are downweighted, the statistical weight is in fact roughly uniformly
distributed for the optimal choice of apodization.
4.4
R elation to th e pure p seu d o-C /s
We have now seen th at the problem of estimating the B-mode power spectrum (with­
out contamination from B-modes) can be reduced to estimating the power spectrum
of a scalar field, by passing to the curl X B ■ In principle, any method for power spec­
trum estimation can be applied to XBi to obtain B-mode estimators which do not
suffer from E —> B mixing. We will consider a special case which will be the focus
of the rest of the chapter: estimating the power spectrum of XB using pseudo-Cyestimators.
Let us recall how pseudo-Q power spectrum estimators are defined for the scalar
field XB- First, one computes multipoles of the weighted field (or “pseudo multi­
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
110
poles”):
^ =
Ta- 1w‘+ d « + 2) / ' A a (,)iy w ^
w
(4'n)
where W (x) is a heuristically chosen weight function. The issue of choosing IT(x) will
be studied in detail in subsequent sections. In (4.11), we have included the prefactor
1/ y/(£ — !)£{£ + l ) ( t + 2) so th a t the normalization will match the B-mode power
spectrum C f ^ .
We briefly describe the construction of unbiased bandpower estimators from the
pseudo multipoles (4.11); for more details see (Wandelt et al., 1998; Hivon et al.,
2001). First we sum the multipoles in bins, obtaining pseudo bandpowers
^ dgf y 1
1(1 + 1)
~y*~x
0a 1^21+1
2 tr
^
Im Im
£&b
m=—£
,4 1 2 x
V4-14)
where the index a runs over £ bands. Then we define unbiased estimators by
c a = K ^ iC c
where N a is the (additive) noise bias of each
between pseudo bandpowers
- AW
(4.13)
Ca , and K aa/ is the transfer matrix
(4.12) and signal bandpowers A a/. (Moreprecisely, N
and K are defined by (Ca ) = K aai A ai + N a .)
The preceding construction defines a power spectrum estimator for polarization
which eliminates E —> B mixing in the strongest possible sense: the estimated Bmode power acquires no contribution from B-modes. In §3, “pure pseudo-Q estima­
tors” for polarization, which also eliminate B —►B mixing in this strong sense, were
defined. We now prove th a t the two estimators are mathematically equivalent.
Substituting the definition (4.4) of
xb
int° the definition of the multipole (4.11)
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
111
and integrating by parts, one obtains:
(4.14)
+ c'c-
where we have defined the spin - 1 and spin - 2 quantities
Wi( x) = 6 W ( x )
W 2(x) = 6 d W ( x ) .
(4.15)
In the form (4.14), it is seen th at a*m agrees with the pure multipole defined in §3,
after changing from tensor notation to the spin-s notation used here. This shows that
the pseudo multipoles in the two constructions are equal; since the rest of the pseudoC( construction (Eqs. (4.12), (4.13)) is also the same in both cases, this completes
the proof th a t the \ B estimator and the pure pseudo-Q estimator are equal.
Since the estimators are mathematically equivalent, it is a m atter of preference
which to use, and we will switch between the two formalisms in the rest of this chapter
depending on which is more convenient. Using the XB estimator has the advantage
th at pure B-mode power spectrum estimation is equivalent to power spectrum esti­
mation of a scalar field with a very blue spectrum. As we will see shortly, this will
allow us to formulate a unified procedure for optimizing the weight function W (x),
which applies to both pure and ordinary p s e u d o -e s tim a to rs .
In the pure pseudo-C^ formalism, the terms have been organized (Eq. (4.14)) so
th at the derivatives act only the weight function, rather than the spherical harmonic or
the noisy polarization map. Provided th at the weight function varies slowly compared
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
112
to the pixel scale, this can make it easier to take derivatives numerically; one only
has to compute W i ( x ) , W 2 (x).
The first term in Eq. (4.14) is the usual pseudo-
Ci estimator for 5-modes; the remaining two terms are interpreted as higher-spin
counterterms which cancel the E —> B mixing. We will see th at this form of the
estimator is convenient when optimizing the weight function numerically given a
noise map (§4.7).
We conclude this section by discussing a general property of the estimator: the
weight function must be apodized so th at both W (x ) and its derivative vanish at the
boundary. It is illuminating to see how this requirement arises from both ways of
defining the estimator.
For the pure pseudo- 6 ^ estimator, it is seen (from the mode in brackets in Eq. (4.14),
which multiplies the Stokes parameters in a pixel) th at the statistical weight of a pixel
is given by a combination of W (x) and its first two derivatives. Therefore, if either
W( x) or its derivative have nonzero boundary values, the statistical weight will con­
tain a delta function on the boundary, which leads to a divergent estimator. (One
might try to cure this divergence by defining W \ , W 2 with the delta function terms
omitted from the derivatives; the estimator will then be finite, but since the rela­
tions (4.15) do not hold strictly, the estimator will mix E —> B .) This reasoning also
shows th a t even though W( x) is apodized near the boundary, the statistical weight
of pixels near the boundary need not be small, since the counterterms W\ and W 2
will be largest there. In fact, for the optimized weight functions th at we will consider
in subsequent sections, we have found th at the statistical weight is roughly uniformly
distributed.
For the XB estimator, the divergence arises in a different way.
Although the
estimator appears not to involve derivatives of W(x) , the field \ B h-as a noise power
spectrum on small scales which grows as t
If either W (x) or its derivative is nonzero
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
113
on the boundary, then jjkF(/) 112 decays as 1/T4 or slower for large I (Eq. (4.10)) which
causes the integral (4.9) to diverge. From this perspective, the need to apodize is
understood as a consequence of small-scale noise fluctuations in \ B with a very blue
power spectrum.
4.5
C hoosing th e optim al weight function
In pseudo-C'^ power spectrum estimation, an im portant practical issue is choosing
the pixel weight function W (x ) to minimize the variance of the estimator. Although
no general procedure has been proposed, several rules of thum b for optimizing W (x )
under different limiting conditions have appeared. In the noise-dominated limit, in­
verse noise weighting {W(x) = l /a^(x)) is optimal; in the signal-dominated limit,
uniform weighting is best possible (Challinor & Chon, 2005). If the power spectrum
is being estimated on scales smaller than fluctuations in the noise level, the FKP
ansatz (Feldman et al., 1994) has become the industry standard for galaxy surveys.
The preceding rules of thumb implicitly assume th at the signal + noise power
spectrum is white on small scales. However, we have seen th at when choosing the
weight function for pure 5-m ode estimation, one includes an extra factor t 4 in the
power spectrum. This qualitatively affects the optimization and invalidates the rules
of thumb. For example, W (x ) and its first derivative must vanish on survey bound­
aries, in contrast to the white noise case.
In this section, we will describe a general ansatz for the optimal weight function,
which makes sense for an arbitrary A'pix-by-Apjx matrix C representing the total
covariance (signal plus noise).
In particular, it can be applied to a noise power
spectrum which grows as any power of £, and so provides a uniform framework for
both pure and ordinary pseudo-Cy power spectrum estimation. We will see th at our
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
114
ansatz reproduces the above rules of thumb in the appropriate limits, and correctly
apodizes W( x ) if the power spectrum is blue on small scales.
First we introduce some notation.
The Apix-by-Apjx signal covariance in the
bandpower we are estimating will be denoted C a , where a is an index ranging over
£ bands, and we normalize it so th at
=
1.
If M\ and M 2 are two Apix-by-jVpjx matrices, then [M\ * Mq\ denotes the Arp;xby-Apix m atrix obtained by elementwise multiplication:
[Mi * M2]ij = (A/i
-
(4.16)
The multiplication is always performed in a basis where each row or column cor­
responds to one pixel; otherwise the operation defined by (4.16) would be basisdependent.
Now we can state the ansatz for optimizing the pseudo-6 ^ weight function which
will be the main result of this chapter: the optimal weight function W opt (thought
of as a length-Npix vector) is given by
W op, = [C“ . C ] - 1 l Vec
(4.17)
where l Vec deontes the length-A^pjx vector consisting of all l ’s.
It may seem strange th at the ansatz (4.17) includes the * operation (4.16) which
is specific to the pixel-space basis. Loosely speaking, in p s e u d o - p o w e r spectrum
estimation, we are constrained to use a filter (multiplying by the mask) which is
diagonal in pixel space, and so the pixel basis is given special significance. Before
giving the derivation of the ansatz, we note some general properties. W ithout using
the * operation, it seems to be impossible to achieve all of the following at once:
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
115
1.
If the noise is uncorrelated between pixels, and the noise dominates, then Wopt
corresponds to inverse noise weighting. This follows from Eq. (4.17) upon noting
th at C is diagonal in the pixel basis, say C jj = crfSij, so th at (C a *C ) = C and
(Wopt)t = <T_2- Note th at in this case, p s e u d o - p o w e r spectrum estimation
is optimal, in the sense th at the Cramer-Rao inequality is saturated.
2. If the signal + noise power spectrum grows as a power law I^ on small scales,
and A > 0, then Wopt is apodized near the boundary. The number of derivatives
which vanish at the boundary is given by (A/2 —1 ), in accord with the discussion
at the end of §4.3.1. As the wavenumber £q where one is estimating the power
spectrum increases, the apodization length decreases. These statements will be
proved in §4.5.2.
3. Eq. (4.17) reduces to the FKP prescription in the regime where FKP applies.
This is shown in Appendix G.
The matrix form of the ansatz (4.17) may suggest th at A/pjx-by-Apjx m atrix operations
are needed to compute the optimal weight function, which would be problematic
since such operations are computationally prohibitive in situations where pseudoestimators are used. (If they were feasible, then one could do a maximum likelihood
analysis instead.) However, in Appendix H, we will see th at a conjugate gradient
approach may be used to compute the RHS of Eq. (4.17) even in surveys where the
pixel count is large.
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
116
4-5.1
D e riva tio n
Now we give a heuristic derivation of the ansatz (4.17). The idea is to find the weight
function W such th at the pseudo-Q estimator (defined in Eq. (4.12))
Ca [d] =
diWiC^Wjdj
(4.18)
ij
is as close as possible to the optimal estimator
O a [ d] d=
X
^ C
- ^ C
- 1) ^ - •
(4 -1 9 )
ij
In Eqs. (4.18), (4.19) we have denoted the data vector (a length-A^pjx vector whose
covariance m atrix is C) by d.
How should “close” be defined? We first note th at both estimators can be written
as Monte Carlo averages
C o Ml =
^
§>c
Oa{d]
=
)
(4-20>
where the notation (-)$a means th at an expectation value is taken over an auxiliary
field &a with covariance C a . From the expressions (4.20), a natural definition of
“close” is th at the expectation value
e ={(E
- £ *“c.7dJ I )
ij
/
'
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
<4-21)
117
be minimized. Here, the expectation value is taken over both the auxiliary field <&a
with covariance C Q and the data vector d with covariance C . We minimize E by
setting its derivative with respect to the weight function to zero:
i
ij
d ,$ Q
13
(no sum on k).
(4.22)
i.e., W opt = ( C * C a ) - 1l vec. This completes the derivation of Eq. (4.17).
4 -5 .2
A variation al f o r m o f the a n sa tz
In Eq. (4.17), we have written the ansatz for the optimal weight function W opt as a
m atrix equation. There is an equivalent formulation as a variational principle which
connects to the discussion in §4.3.1 and will also be directly useful in later sections:
W opt is the weight function which minimizes the expectation value of the pseudo
bandpower Ca
JjWiWj C y C g
(4.23)
subject to the normalization condition
^
Wi = const.,
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
(4.24)
118
This is shown by differentiating Eq. (4.23) with respect to IT), obtaining
y~] Cj j C f j W j =
j
1
(no sum on i)
(4.25)
i.e., Wopt = (C * C ^ - i l v e c . (Strictly speaking, the RHS in Eq. (4.25) should be
equal to a Lagrange multiplier A, but the overall normalization of the weight function
is arbitrary.)
In the previous subsection we showed th at W opt is the weight function which
makes the pseudo-Cy estimator approximate the optimal estimator as closely as pos­
sible. The variational principle gives another interpretation of W opt which matches
the discussion from §4.3.1: W opt is the weight function which minimizes the total
aliasing from both signal outside the bandpower and noise. Even if known perfectly
on average, say because the power spectrum on other scales was given, this aliasing
will contribute to the noise in the estimator thus degrading the signal to noise.
The variational principle also gives a simple proof of property # 2 at the end of
§4.5, th a t (A/ 2 —1 ) derivatives of W opt vanish at the boundary if the power spectrum
is a power law
on small scales. From Eq. (4.10), if n is the number of derivatives
which vanish, then the contribution to the expectation value (Ca ) from small scales
(large I) is given by
(4.26)
Therefore, n > (A/2 — 1) is a necessary condition for the integral to converge, and
so W opt must satisfy this condition, since it is chosen to minimize the expectation
value.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
119
4.6
Som e analytic solutions
To build intuition for the ansatz (4.17), we give some analytic solutions in the case
where the survey region is a circle of radius R and the covariance C is given by a powerlaw power spectrum C , where A = 2 or 4. We assume uniform noise throughout the
survey region, so th at the covariance is completely described by a power spectrum,
but statistical isotropy is broken by the survey boundary.
More realistic surveys
and power spectra will be considered in subsequent sections. We use the flat sky
approximation throughout.
4-6.1
Case 1: Z2 p o w e r spectrum
Let us first consider the case where the power spectrum is proportional to i2 , so th at
the operator C is equal to (—V 2). Computing the operator (C * C a ) involves some
subtleties, so we show the details explicitly. In the position space basis, the matrix
elements of C and C a are given by
(C )^
=
- V 2S2( x - y )
{Ca)xy
=
J q{Zq\x ~ y\) ■
(4.27)
where £q denotes the wavenumber where we are estimating the power spectrum. From
the definition (4.16), the m atrix elements of (C * C a ) are
( C * C a )xy =
=
[ - V 262( x - y j \ j 0(e0\ x - y \ )
—V 2S2(x — y ) +
£ q S2 ( x
— y)
(4.28)
(4.29)
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
120
0.8
^
0.8
10R-"
0.6
0.4
0.2
0.2
0.2
0.6
0.4
0.8
1
0.2
r /R
0.4
0.6
0.8
r/R
Figure 4.2: Optimal window functions WoPt(r), given by Eqs. (4.31), (4.33) for a
signal+noise power spectrum proportional to 12 (left panel) or / 4 (right panel).
where we have used Jo(0) = 1, J' ( 0) = 0, Jq (0) = —1/2. This calculation shows that
the operator (C * C a ) is equal to (—V 2 + ^q).
Combining Eqs. (4.17), (4.28), the optimal weight function satisfies the differential
equation
(—V 2 + £Q)Wopt(x) = const.
(4.30)
with Dirichlet boundary conditions (as proved in §4.5.2). The solution to (4.30) is
given by
( 4 ' 3 1 )
The solutions to (4.31) are shown in the left panel of Fig. 4.2. For the A = 2 power
spectrum, the weight functions are apodized near the boundaries with an apodization
length which decreases with increasing £q. This is in contrast to the white noise case
where the weight function would be uniform, independently of £q, if the noise is
homogeneous throughout the survey.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
121
4 -6 .2
Case 2: /4 p o w e r spectrum
We next consider the case of a signal + noise power spectrum which is proportional
to C. This case corresponds to pure B-mode estimation in the noise-dominated limit.
A calculation similar to (4.28) shows th at the differential equation for the weight
function is
( - 'V 4 + 4 ^ V 2 - 4 ) W = const.
(4.32)
with Dirichlet + Neumann boundary conditions. The solution to (4.32) is
w
=
* + W - r ) h ( e + R ) - e - I Q( £+r) h( £- R)
£+I0( £ - R) h ( £+R) - £ - I 0C + R ) h C - R )
opU ;
{
}
where
£ ± d= ( 2 ± V 3 )1/2£q
(4.34)
This solution is shownin the right panel of Fig. 4.2. Relative to the A = 2 case (left
panel), the apodization length is larger, and the apodization issuch th at both W( x)
and its derivative vanish on the boundary.
4.7
N um erical solutions
Let us summarize our results so far. We have shown that E —» B mixing in pseudoCi power spectrum estimation can be eliminated by taking the curl of the map, then
estimating the power spectrum of the resulting scalar field. Because the curl operation
results in a noise power spectrum which grows as Cg oc £4, the weight function must be
smoothly apodized near boundaries in order to control contamination from small-scale
power. This technique is mathematically equivalent to the pure pseudo-C^ formalism
from §3, in which “counterterms” involving spin-1 and spin-2 weights are added to the
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
122
ordinary pseudo-C'^ estimator to cancel E —> 5 mixing. We have proposed a general
ansatz (4.17) for optimizing pseudo-C/; weight functions in the presence of arbitrary
signal and noise covariance, and constructed analytic solutions in special cases.
For practical application, one needs a method for solving the ansatz numerically,
giving a specification of the noise. The most general noise model we will consider
is uncorrelated between pixels, isotropic in each pixel, but with an arbitrary pixeldependent amplitude:
For this noise model, we have implemented a “black-box” procedure which starts from
the noise RMS a(x) in each pixel and outputs 5-m ode and 5-m ode weight functions
in each ^ band. The procedure uses the variational principle from §4.5.2, finding
weight functions which minimize the quantity (Ca )■ In this and the next section, we
will describe qualitative features of the solutions and study estimator performance,
deferring implementational details of the method to Appendix H.
We work in the pure pseudo-Q formalism, with one minor modification. As we
have described it in §4.4, the spin-1 and spin-2 weights are always given in terms of
the spin- 0 piece by:
W i(x ) = S W ( x )
W 2(x) = c> S W (x ),
(4.36)
In our numerical optimization procedure, we do not impose Eqs. (4.36) as constraints;
each 5-m ode “weight function” actually consists of three independent pieces: a spin0 (scalar) function Wq(x), a spin-1 (vector) field W\(x), and a spin-2 (tensor) field
W 2 {x). Note th a t for EE power spectrum estimation, we do not include higher-spin
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
123
counterterms, and so each .E-mode weight function is simply a scalar W(x).
In order to differentiate between different flavors of estimators, we will use the
following terminology in the rest of this chapter. If both counterterms are absent
(i.e., W\ = W 2 = 0), we will refer to the E-mode estimator as “ordinary pseudo-Cy’.
We will reserve the term “pure pseudo-Cy’ for the situation where the relations (4.36)
hold exactly, e.g. because W (x ) is of known analytical form and W \ , W 2 are obtained
by simply evaluating derivatives in each pixel. In this case the E-mode estimator
will have zero E —> E leakage (except perhaps from aliasing artifacts in a finite
pixelization). Finally, in a case where the relations (4.36) hold approximately, we
will refer to the E-mode estimator as “p s e u d o - w i t h counterterms” . This will
be the situation for the weight functions produced by our numerical optimization
procedure; since the relations (4.36) are not imposed as constraints, they do not hold
strictly, and so there will be some nonzero mixing in the estimator. However, we
will see shortly th at the relations will hold approximately, and the level of E —> E
mixing for optimized weight functions is small in practice. This is because reducing
the mixing helps minimize the quantity (Ca ), so our optimization procedure prefers
weight functions which satisfy Eqs. (4.36) to a good approximation. In fact, this is the
main reason why we have found it convenient to work in the counterterm formalism,
rather than passing to the curl x b (which we showed was equivalent in §4.4): the
variational optimization procedure sidesteps implementational issues associated with
taking derivatives in an irregular spherical pixelization.
In Fig. 4.3, we show the result of our optimization procedure for (. = 20, a spherical
cap shaped survey with radius 10°, and three choices of noise level: 100, 20, and 5
//K-arcmin. For the largest noise level, both higher-spin counterterms are small and
the weighting is roughly uniform. In this regime, our optimization procedure reduces
to ordinary p s e u d o - p o w e r spectrum estimation with tophat weighting. For the
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Figure 4.3: 5-m ode weight functions for varying noise levels, for I = 2 0 and a 1 0 °
spherical cap with homogeneous noise. As descrbed in the text, each weight function
consists of spin-0 , spin- 1 , and spin - 2 pieces, which we show in the left, center, and
right columns. The top three rows show the 5-m ode weight functions produced by
our optimization procedure for noise levels 100, 20, and 5 /iK-arcmin (from top to
bottom). In the bottommost row we show the analytic solution given by Eq. (4.33).
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
125
50
: E - m o d e p o w e r s p e c tr u m
"t 30
0
1000
500
2000
1500
0.015
0.1
"—
“
B - m o d e pow er s p e c tr u m
T /S = 0 (le n sin g BB)
T /S = 0 .0 5
0.08
0.01
0.06
0.04
0.005
0.02
0
50
100
150
200
200
4 00
600
6 00
1000
Figure 4.4: Power spectrum errors for the 10° homogeneous mock survey described
in §4.7, using pseudo-Q power spectrum estimation with counterterms (left/blue) or
optimal estimators (right/red). For the B -mode power spectrum, we compute power
spectrum errors in a fiducial model with T / S = 0 but also show a spectrum with
T / S = 0.05 for comparison.
smallest noise level, the weight function is apodized, both counterterms are present,
and all components of the weight function have nearly converged to the analytic
solution (4.33), which represents the limit E E
B B . Here our procedure reduces
to pure pseudo-Q with weighting th at can be understood by solving the differential
equation (4.32). Our optimization procedure therefore smoothly interpolates between
ordinary and pure p s e u d o - a s the noise level is improved, “turning on” the higherspin counterterms as they are needed to reduce E —»• B mixing below the noise floor.
(We will illustrate this in a different way in the next section.)
Next we consider power spectrum errors from our estimators for a spherical cap
shaped mock survey with radius 10°, homogeneous noise level 5.75 pK-arcmin and
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
126
Gaussian beam
$FWHM
=
8
arcmin. These values have been chosen to roughly model
the fiducial realistic survey which will be treated in more detail in the next section.
We have used an azimuthally symmetric mock survey so th at optimal, or max­
imum likelihood, power spectrum estimation will be computationally feasible, even
though the pixel count is large enough th at this would normally be impossible. The
computational speedup is obtained because both signal and noise covariance m atri­
ces are diagonal in m; for details see Appendix F. This allows us to compare our
estimators to optimal in a baseline survey which approximates our fiducial realistic
experiment (§4.8) within the constraint of azimuthal symmetry.
In Fig. 4.4, we show power spectrum errors for the mock survey using both pseudoC'i with weight functions produced by our optimization procedure, and optimal power
spectrum estimators. It is seen th at our method is slightly suboptimal at very low
I but rapidly becomes optimal. For example, the B-mode bandpower RMS is 72%
optimal in the lowest band,
88%
optimal in the second-lowest, and 91% optimal in
the third lowest. We emphasize th at this level of performance for 5-m odes is much
better than could be obtained at this noise level using ordinary pseudo-Q) estimators
(Challinor & Chon, 2005), and th at we have obtained it using a completely autom ated
procedure for generating weight functions from the noise distribution.
4.8
A realistic exam ple
In the preceding section we considered a homogeneous, azimuthally symmetric mock
survey whose noise distribution was much simpler than a real CMB experiment. We
conclude by considering a more realistic example. We will use noise maps made from
preliminary simulations of the EBEX experiment (courtesy of Sam Leach). The noise
map is shown in Fig. 4.5. The simulation used the scan strategy proposed by EBEX
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
127
12.04 ^K-arcmin 6.02
5.25
4.72
4.32
4.01
3.76
3.55
Figure 4.5: Survey region, noise distribution, and point source mask for our fiducial
realistic experiment. The side length of the bounding square is 25°. The map is based
on preliminary simulations of the upcoming balloon experiment EBEX.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
128
and a realistic focal plane configuration to estimate the number of hits for every
1
arcminute square pixel after a 14 day long duration balloon flight. The details of the
scan, focal plane configuration, number of detectors and ultim ate sensitivity are not
of particular interest here and are probably subject to change before the flight takes
place. However the simulations give a good illustration of the level of complexity of
the noise maps and sky coverage for upcoming ground and balloon based experiments.
Thus they serve as a nice test bed for our methods.
Since the map is not azimuthally symmetric, computing optimal power spectrum
errors is computationally prohibitive and we will not be able to compare our method
to optimal. Instead we will compare p s e u d o - p o w e r spectrum errors for the fiducial
realistic experiment to those obtained for the azimuthally symmetric mock survey
in the preceding section, which roughly approximates the sky coverage and noise
distribution.
In Fig. 4.5, we have generated randomly-positioned point sources with average
number density 0.04 deg- 2 , and masked each point source to radius 17 arcmin, cor­
responding to the 5a level of the beam (assumed Gaussian with
% W H M
— 8
arcmin).
W ith these parameters, 1% of the survey area is excluded by the point source mask.
The area th at will need to be cut out to sufficiently mask point sources to measure
5-m ode polarization is at this point rather uncertain. We have based our choice of
1% on point source simulations conducted at SISSA for the Planck satellite reference
sky (courtesy of Carlo Baccigalupi). Our choice should thus be seen as an illustrative
example to help us understand how our method would deal with the resulting holes.
We will analyze the map with and without the mask, to isolate the impact of point
sources when measuring 5-modes.
For analyzing a very inhomogeneous map such as this, with noise fluctuations on
large and small scales, an autom ated procedure for optimizing weight functions is a
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Figure 4.6: Weight functions for the fiducial realistic experiment. Each of the three
rows corresponds to a different bandpower, with multipole ranges (frnin , ^max) given
from top to bottom by (30,70), (190,230) and (510,550). W ithin each row from left to
right, the E-mode weight function, the spin-0 piece of the E-mode weight function,
and the spin-1 and spin-2 pieces of the E-mode weight function are shown.
Figure 4.7: A zoomed view of the (Anim4nax) = (190,230) E-mode weight function
near point sources. From left to right, the panels show the noise map, the scalar piece
of the weight function, and the two higher-spin counterterms.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
130
practical necessity. Using the optimization procedure described in §4.7, we generated
weight functions for both E-mode and E-mode power spectrum estimation; some
representative weight functions are shown in Fig. 4.6.
Considering E-mode weight functions first, our method generates weight functions
which are more inhomogeneous than those given by the FK P ansatz (Appendix G)
but similar in performance. There is a slight improvement at low £; e.g. for the lowest
.E-mode bandpower, the the bandpower RMS given by our estimator is ~ 10% better
than FKP. Note th at in §4.7, we did not discuss E-mode weight functions, since they
are very simple for the homogeneous mock survey.
Turning now to E-mode weight functions, one qualitative feature is th at their
statistical weight is concentrated near the center of the survey (compared to the Emode weight functions). This is for signal-to-noise reasons, since the E-mode signal
is weak and the center of the survey is least noisy.
Another feature of the E-mode weight functions is th at the higher-spin countert­
erms are relatively smooth across the map, even though the scalar piece of the weight
function has structure on small scales matching similar structure in the noise. Instead,
our numerical optimization procedure prefers to associate counterterms with “large
scale” features such as survey boundaries or point sources, but not with small-scale
features in the noise. From Fig. 4.6, one sees th at the counterterms are sourced mainly
by boundaries at low £ and by point sources at intermediate to high £. A zoomed-in
view of the E-mode weight function near point sources is shown in Fig. 4.7. The
behavior of the weight function near the “internal” boundaries of point sources is
similar to the behavior near external survey boundaries seen in Fig. 4.3: the scalar
piece of the weight function is apodized, and the counterterms are large near the
boundary.
One consequence of the counterterms being smooth, even though the scalar piece
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
131
E-*B c o n ta m in a tio n ( o rd in a r y p se u d o -C ,)
E-*B c o n ta m in a tio n (p s e u d o -C j w ith c o u n te r te r m s )
n o ise p o w er s p e c tr u m
le n s in g BB p o w e r s p e c tr u m
___ ____ —
^ i
/M
■“N
0.01
+
0.001
0.0001
200
400
600
800
1000
Figure 4.8: Contribution of aliased .E-modes to the pseudo power spectrum C f B for
the fiducial realistic experiment with point source mask, with signal and noise power
spectra shown for comparison. Using counterterms from our numerical optimization
procedure, the level of E —> B mixing is well below the noise floor, in contrast to
ordinary pseudo-Q.
of the E-mode weight functions have small-scale structure, is th at the relations (4.36)
are not strictly satisfied. Therefore, the estimator is not completely pure; there is
some nonzero level of E —►B mixing. This is quantified in Fig. 4.8, where we show
the mean contribution of aliased E -modes to the pseudo power spectrum Cg. In
the pseudo-C^ construction, this contribution is always removed in the mean by the
debiasing step, but acts as a source of extra variance. It is seen th at our optimization
procedure produces counterterms which reduce E —> B mixing well below the level
of the noise, even though strictly speaking, the mixing is nonzero, i.e. the estimator
is not pure. (In Fig. 4.8, some “stair-step” features can be seen which arise because
we do not generate a separate set of weight functions in every band, e.g. we use the
same weight function between £ = 510 and £ = 830 to save CPU time.)
Next we study the power spectrum errors which are obtained using these weight
functions. In Fig. 4.9, three sets of power spectrum errors are compared. Considering
the leftmost set of error bars first (i.e. pseudo-C^ with counterterms and without the
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
132
50
; E - m o d e p o w e r s p e c tr u m
40
ft- 30
0
1000
500
0.03
0.15
- B - m o d e p o w er s p e c tr u m
T /S = 0
T /S = 0 .0 5
01
2000
1500
0.02
S.
u„
0.01
0.05
+
0
50
100
150
200
200
400
600
800
1000
Figure 4.9: Comparison of three sets of pseudo-Q power spectrum errors for the
fiducial realistic experiment. Weight functions were optimized independently in each
of the three cases. Left/blue: P s e u d o - w i t h counterterms, without the point source
mask. Center/magenta: Pseudo-Q with counterterms, with the mask. Right/red:
Ordinary pseudo-C^, with the mask. For F'-mode power spectrum errors, the three
cases are so similar th at only one set of error bars is shown.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
133
point source mask), we would like to know, how do these errors compare to optimal?
As previously remarked, we are unable to make this comparison directly since the
fiducial experiment lacks azimuthal symmetry. However, the pseudo-C^ errors for
the fiducial realistic experiment can be compared to the p s e u d o - e r r o r s which were
obtained previously (Fig. 4.4) for an azimuthally symmetric mock survey with roughly
the same sky coverage and noise level. We find th at the two are roughly equal; the
RMS bandpower errors for the fiducial realistic experiment for both EE and BB are
~ 10% worse th an the mock survey at low I and ~ 10% better at high £. It seems
plausible th at this difference is simply due to inhomogeneity in the noise, and th at
our method remains near-optimal for the fiducial realistic experiment, although we
cannot say this with absolute certainty since direct computation of the optimal errors
is not feasible.
Now let us compare the errors with and without the point source mask, for pseudoC(t estimators with counterterms (i.e., the left and middle error bars in Fig. 4.9). We
find that the point source mask degrades the errors on BB by ~ 20% for £ < 250
but the effect becomes small for large £, or for EE on all angular scales. Even though
a tiny fraction of sky area is masked, we find th at the impact on B-mode errors is
not negligible. This is to be expected since the point source mask does decrease the
number of pure B-modes in the survey region. However, even taking a conservative
estimate for the number density of point sources, we have found th at this effect is not
large.
Finally, we compare pseudo-C^ with counterterms to ordinary p s e u d o - ( i . e . , the
middle and right error bars in Fig. 4.9). It is seen th at at this noise level, including
counterterms in the estimator is essential for E - B separation. If ordinary pseudoestimators are used instead, the RMS B-mode errors are worse by a factor ~2-3 at
intermediate to high £, and a factor ~ 10 at low £. If this comparison is repeated
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
134
without the point source mask, we find th at the two become equal at high t but
remain drastically different at low I. This suggests th at the suboptimality of ordinary
pseudo-C^ is mainly due to E —>B mixing from survey boundaries at low i and point
sources at high t.
4.9
D iscussion
We have given a general treatm ent of E —> B mixing and argued th at when estimating
B-mode power, no information is lost by passing to the curl X B ■ The analogue
for a scalar field, passing to the Laplacian, would lose the largest-scale modes on
a cut sky, but for a spin - 2 field it is the ambiguous modes which are lost in this
way. This presents a general approach to constructing estimators which are pure, i.e.
which filter out contributions from ambiguous modes. We considered the estimator
defined by applying p s e u d o - e s t i m a t i o n to the scalar field x jj, and proved th at
it is mathematically equivalent to the pure pseudo-6 ^ estimator from §4.4.
The
equivalence sheds light on properties of the estimator, e.g. the requirement of a
smoothly apodized weight function can be understood as arising from colored noise
in
xb-
Perhaps most importantly, it allows us to consider ordinary and pure pseudo-
Ci estimators in a unified way when studying the problem of optimizing the weight
function.
Our central result is a general ansatz for optimizing the pseudo- 6 ^ weight function.
The ansatz can be expressed either in m atrix form (Eq. (4.17)), or as a variational
principle: the optimized weight function minimizes the total expectation value (Ca )
of the estimator, subject to the constraint ^2x W (x ) = 1. Although our emphasis
has been on next-generation ground-based polarization experiments, and we have
only considered noise which is uncorrelated between pixels (Eq. (4.35)), the ansatz is
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
135
Mode-counting estimate (Eq. (4.37))
Homogeneous circular mock survey,
optimal estimators (Fig. 4.4)
Homogeneous circular mock survey,
p s e u d o - w i t h counterterms (Fig. 4.4)
“Realistic” noise map, no point source mask,
p s e u d o - w i t h counterterms (Fig. 4.9)
“Realistic” noise map, point source mask applied,
pseudo-C^ with counterterms (Fig. 4.9)
Ordinary pseudo-
( T /S ) lff
0.0032
^(-^lens)
0.095
0.0039
0.095
0.0043
0.096
0.0045
0.082
0.0056
0.0536
0.085
0.258
Table 4.1: Forecasts for the l a upper limit on (T /S ) and the fractional error on the
amplitude Aiens of the lensing 5-mode, for various levels of approximation to the
fiducial realistic experiment.
much more general: in principle it applies to arbitrary signal and noise covariance.
In particular, it should be useful for the tem perature power spectrum, but we have
not pursued this here.
We have shown our the ansatz can be solved numerically for an arbitrary spatial
noise distribution, and studied the performance of our estimators for several mock
surveys, culminating in a full simulation of a realistic experiment with characteristics
based on the upcoming balloon borne EBEX: complicated boundaries, inhomogeneous
noise and a randomly generated point source mask (Fig. 4.5). This is summarized in
Tab. 4.1, where we have compressed the 5 - mode power spectrum errors obtained for
each survey into two numbers: the l a upper limit on (T/ S ) and the fractional error
<r(yliens) on the overall amplitude of the lensing B - mode. In computing cr(Tiens),
we have used the WMAP3 best-fit model (Spergel et al., 2006) with <jg = 0.74; in
a different fiducial model it would scale roughly as cr(Ajens) oc a ^ 1. (We have also
treated the power spectrum covariance as Gaussian; including non-Gaussianity in the
lensing 5-m ode is expected to give a ~ 10% correction to cr(A|ens) at noise levels of
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
136
our fiducial experiment (Smith et al., 2006) but the effect can become large for lower
noise.)
As a baseline, we have shown in the first row of the table the naive “modecounting” estimate, given by assigning uncorrelated errors to each bandpower as
follows:
1
/
C BB
Cov(At , Mb) = - f sky £ ( 2 C + 1) \^C J
\ 2
+ N(J
(4-37)
where Afy is the noise power spectrum. In the last row, we have shown for comparison
the result of using ordinary p s e u d o - e s t i m a t o r s without counterterms.
Each row of the table isolates one effect which might be worrying for E - B separa­
tion: mixing from the survey boundary, suboptimality of pseudo-6 ^, inhomogeneous
noise with small-scale features, and the point source mask. Our conclusion is th at
each of these has a small impact using our method, although it is worth noting th at
when combined, these effects do result in a value of ( T / S ) i a which differs from the
baseline mode-counting estimate by 75%. Considering all of these in combination,
we have given a complete treatm ent of “geometric” effects which arise from the mask
and inhomogeneous noise coupling E and B in a realistic survey.
It should be mentioned th at there are other, non-geometric effects not considered
here which will have an impact in real CMB polarization experiments, such as ( 1 //)
noise, removal of ground-synchronous modes, systematic errors, and foreground con­
tamination. In particular, the power spectrum errors we have presented in §4.8 should
not be regarded as a “bottom line” forecast for EBEX, as these effects have not been
included. However, geometric mixing of E and B has been a practical concern for
next generation experiments; we have given a general solution to the problem and
demonstrated the method for noise distributions with the complexity of a real ex­
periment. Moreover, we have shown how to generate optimized weight functions in
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
137
a completely autom ated way, thus removing an outstanding practical obstacle in the
pseudo-Q method.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
A P P E N D IX A
SP IN -1 SPH ER IC A L H A R M O N IC T R A N SFO R M S
As discussed at the end of §3.4.2, our algorithm for fast evaluation of B^m requires an
implementation of fast spherical harmonic transforms for vector fields. This will also
be an ingredient in the algorithm, to be presented in Appendix D, for computing the
transfer matrices K ^ f ure. In this appendix, we write down the recursion and initial
conditions which are needed:
( l ^ + l ,m ) ^ + l ,m =
(i V
(Ye%
~
(co s0 )Y ^ —
(A-l)
h .j
= tabiYgf
=
{- ^ - J
(si/
- 1
e y l* [(cos 9)xa + iy«\
Here, (,A<m) = ( 1 /Q V (< 2 - 1) ( < 2 - m 2 )/(4<2 - 1).
An alternate approach to implementing fast spin - 1 transforms appears in (Lewis,
2005).
138
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
A P P E N D IX B
F IN IT E D IF F E R E N C IN G IN A N IR R E G U L A R
SPH ER IC A L PIXELIZATIO N
Our definition of B£m (3.24) requires a prescription for computing the spin-1 and
spin - 2 weights
W a , W ab
(3.25) from
W ( x ) .
If
W ( x )
is of known analytical form, then
a trivial “prescription” consists of carrying out the covariant derivatives by hand;
otherwise, a finite differencing scheme must be used. Since this is nontrivial in an
irregular spherical pixelization such as Healpix, we present the details of one such
scheme here.
At each pixel center x, we define a m atrix Ma^(x) by
(B.l)
Throughout this appendix, the notation ^ x/ denotes a sum over pixel neighbors x.'
to the pixel x, and (x; —x)-1- denotes projection of the 3-vector (x; —x) into the plane
perpendicular to the unit vector x. After the projection, (x; —x)-1- is a tangent vector
at x, and
M
is a rank-2 symmetric tensor, as the index notation suggests.
Now, if W (x) is any pixelized map, we define its “finite difference” gradient by
V f DW'(x) d=Sf M -^ x ) £ ( / ( x ' ) - /(x ))(x ' -
For purposes of this paper, we need the following extension.
x ) ±b
(B.2)
If ua(x) is a finite
pixelized (tangent) vector field, we define its “finite difference” covariant derivative
139
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
140
by
^ ( P x/ ^ x w(x ') - w(x ))b(x ' - x )
_Lc
(B.3)
where Vx/ ^ Kv(x') denotes the tangent vector at x obtained by parallel translating
v(x') along the great circle arc connecting x and x /. This parallel translation is needed
to covariantly difference tangent vectors at distinct points x, x ;. If one were to simply
difference their components in the 9, ip coordinate system instead, then the resulting
finite differencing scheme would suffer from coordinate artifacts near the poles.
Using this finite differencing scheme, one takes the spin - 1 weight W a in Eq. (3.25)
to be
and the spin - 2 weight Wa^ to be the traceless symmetric part of
We have tested this prescription for the case of cosine apodization,
by comparing the pure p s e u d o - e s t i m a t o r s which result from finite differencing
and analytic differentiation of the cosine weight function. For a fixed random CMB
realization, we find th a t the difference between the two versions is negligible.
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
A P P E N D IX C
CO RRELATIO N F U N C T IO N S FO R FIELDS OF
A R B IT R A R Y SPIN
For the transfer m atrix calculations in Appendix D, we need expressions for correla­
tion functions in terms of power spectra, for all combinations of spin-0 , spin- 1 , and
spin-2fields.
Wepresent these results in a form which generalizes to higher spins as
well. Aspin s field
( —00
< s <
00 )
is a function (s f ) whose value at x depends on a
choice of orthonormal basis vectors {ei, £ 2 } at x. Under the right-handed rotation
=
e '2 =
(cos 0 )ei + (sin 0 )e 2
(C.l)
—(sin 0 )ei + (cos#)e 2
(sf) must transform as ( s f ) ' = e~is^(sf ) . There is a spin-raising operator
6
and a
spin-lowering operator 6 which transform a spin s field into fields of spin (s + 1 ) and
(s —1) respectively. In the frame { e i , e 2 } = {xa,Ha}, these operators take the form
&(sf)
=
- s i n s (0 )
&(sf)
=
- s i n - s (0) ^
+ i c s c ( d ) - ^ j sin“ s (0 )(s/ )
- i c s c ( 9 ) - ^ j sins (0)(s/ )
An orthonormal basis for spin s fields is given by the spin harmonics
&: Penrose, 1966; Goldberg et al., 1967), which are given for s >
0
(Newman
by
,Ytm = \ l w T § . (6)S{Ylm)
141
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
(c'2)
142
These are nonzero only for £ > |s|.
Correlation functions between fields of spin s, s' can be expressed in terms of the
following function, which generalizes the Legendre polynomial P^(x • x 7) in the case
s = s' = 0 .
P f V x ')
#
^
(S^ m (x ){ e1,e2}={X,y})*
m = —£
X (slY£ m ( x ' ) { e i,e2}={X',F'})
(C-3)
We emphasize th at the spin harmonics on the right-hand side are evaluated in the
“two-point” frame { e i , e 2 } = { X , Y } , { e^ e^ } = {X 7, Y7}, not the frame { e i , e 2 } =
{ %a,Ua}• W ith this choice of frame, the right-hand side is rotationally invariant and
therefore depends only on the separation (x • x 7), as implied on the left-hand side.
It can be shown (Ng & Liu, 1999, Eq. (3.17)) th at P^s is related to the reduced
Wigner D-function d{OU , by PVf s’(cos 9) =
( — 1 )sd{,(9).
o
Using standard results on
Wigner D-functions (Varshalovich et al., 1988), one obtains the spin-label symmetries,
orthogonality relation, and product rule:
Pess\ z ) = p f s (z) =
f\zp g {z)p g {.z)
( - 1 )1+sp Y
=
s\
=
( - l ) Sl + * i + S 2 + 4 ^ ( 2 ^ 3 + 1)
£3
l\
£2
\ -4
-4
z)
(C.4)
2^
(C.5)
/
Pg4(z)Pt*4(z)
-
£3
X
\
£\
i2
£3
-si
s2
si -I- S2
p si+s2,s[+s^z)
si + s'2 J
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
)
(C6)
143
Likewise, the following recursion and initial conditions may be used to evaluate Pj/S .
Using the spin-label symmetries (C.4), it suffices to give initial conditions for the
recusion assuming s > s' \:
P lV m W
— (2 ^ + 1 ) z
In the recursion (C.7), we have defined
= y/
— s 2 ) ( ^ 2 — s'2)/£.
The recursion (C.7) with accompanying initial condition (C.8 ) is the main result
of this appendix. We have used the language of spin-s fields, which permits the result
to be stated in a uniform way for all spins. Since the transfer m atrix calculations
of Appendix D use the (equivalent) tensor language in spins < 2, the rest of the
appendix is devoted to translating between the two. As a byproduct, we will obtain
expressions for the correlation functions between all combinations of fields with spins
s, s' < 2. In the case where both spins are nonzero, it will be convenient to use the
linear combinations
(C.9)
instead of P f s' , Pg'~S ■
The basic observation which relates the tensor and spin-s formalisms is th a t the
frame-dependent vectors
m=(ei+ie2)/2
m = (ei —ie2)/2
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
(C.10)
144
have spins 1 and -1 respectively. (This notation follows Okamoto & Hu (2003) but
our normalization differs by a factor l/y/2.) In terms of these, the vector and tensor
harmonics Y ^ , Y ^ can be written
Y (£m)a
'E
Y d(£m)bc
m )b r
=
( - l Yt m ) ™ a
~ {lY l m ) m a
=
- ( - 2 Yern)m bm c - { 2 Yern)m bm c-
(C.ll)
These are parity-even combinations of {ma,m a} and {mjmc, m^nic}.
(n q\ of P f s :
The following sums can be evaluated using (C.ll) and the definition (C.3)
(o. 1 2 )
47T
m
~
m
E
4^r
,,/x
E YL ^ m ) b ' A ^ )
m
2^+1
=
— £ r Pe2(*)Q,bH
m
Y YU mi)a^
) * M' Y((tm)bld^
L w A * ' )' ~=
m
rE
, ,/X _
E^fmlfecW^fmlfc'c'^) _
m
^
1 (Q1
e 2( z ) X aQ'b,d + R}2(z)YaU ^ )
2 ^ + 1
4?r {Qf^QbcQ'b'd + RT (z)QbCUy^)
To evaluate analagous sums containing Y ^ or Y ^ , note th at the 90° vector rotation
Va —►—eaf)V^ sends Y ^ —>Y ^ , X —> Y , Y —> (—X); and the 45° tensor rotation
n a6 -*■ ( - eac n cfe ~ e6cn ca)/2 sends Y ^ -»■ Y ^ , Q -> {/, £/ -»■ (-£?). For example,
the following sum will be used in Appendix D:
E Yi Z ) a ( A y {f m )bcA ' ) =
m
~ R f W a Q 'v A
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
(C.13)
145
It is obtained from the fifth equation in (C.12) by applying a 45° rotation to the
indices be, and no rotation to the index a.
As an application, using sums of the form (C. 1 2 ), one can write down all corre­
lation functions between Gaussian fields with spins < 2 which arise from parity-even
power spectra:
<r T > =
(x x > =
(c .i4 )
47T
+
I
(TX ) =
I
I
(TQ)
=
I
<w> = EI 2^ t
(c ! e Q?
W+cfBR f M)
<x = EI t ~(cF«12(z)+
<u ir> = E ^ f F ^ f w + c F o f w )
t
(Y U > = E ^ 47T
tF ^ w + c p ^ w )
I
Here, correlation functions on the left-hand side are defined in the “two-point” basis
from §3.2; e.g. (XQ) denotes the correlation between the X a component of a spin-1
field and the Q'a^ component of a spin - 2 field, at points x, x 7 with separation z =
(x-x7).
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
146
Correlation functions for parity-odd power spectra are given by:
<T y > =
(c .i5 )
i
=
<X Y >
4?r
E
e
<xf/> = E ^ ( c?B«f2W- c f E RP( z))
t
= E ^ ( C f £«<2w - cfBflJ2W)
I
= E
^
47T
I
c f BM ? w - - R ? w )
For spins 0 and 2, these results have already appeared in the literature (Zaldarriaga,
1998; Tegmark & de Oliveira-Costa, 2001), with the following explicit formulas for
P®2, Q22, P 22 used in place of the recursion relation (C.7):
p 0 2 (2) =
*
- (lE + ^ )
[(<-l)<(<+l)(<’+ 2)]1/2
q22(2)
— !)£{£ + 1 )(^ + 2 )
™
R l
^
(£ +
~
~ 4
2) P
j_ , (z) - ( £ -
( £ - l ) £ ( £ + l ) ( £
+
1) z P j{ z )
2 )(l - z 2)
(C'16)
(See also (Challinor & Lewis, 2005) for some spin - 1 equations.) The advantage of the
present treatm ent is th a t it generalizes straightforwardly to all spins, including spins
higher than 2 , and allows use of a single uniform recursion relation (C.7) rather than
many explicit formulas of type (C.16). Our approach is similar to Ng & Liu (1999),
who compute correlation functions covariantly for spins
0
and 2 .
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
A P P E N D IX D
C O M P U T IN G T R A N S F E R M ATRIC ES
There is still one ingredient missing from our formalism: an algorithm for efficient
calculation of the transfer matrices K ^ f ure, which are needed to debias the power
spectrum estimators (Eq. (3.10)). We have found it more convenient to compute
transfer matrices using correlation functions than in harmonic space. Since this ap­
proach has not been used previously in the literature, we first illustrate the method
for the simpler case of tem perature pseudo-C^ estimators (§D.l), before treating the
case of pure polarization estimators (§D.2). In §D.3, we address implementational
issues and show th at the computational cost of computing the transfer matrices is
o (cL x )-
D .l
Transfer m atrices for tem perature p seu d o-C /s
In the temperature-only case, p s e u d o - e s t i m a t o r s are constructed in a way which
is completely analagous to the polarization case (Eqs. (3.8)-(3.11)). We briefly sum­
marize the necessary definitions; details can be found in Hansen et al. (2002).
Pseudo multipoles and power spectra are defined by
Ttm = ^T(x)W (x)yta (x)*
(D.l)
X
5 7
7
=
a
(d .2 )
m = —l
Assuming no noise bias, the (£max) -by- (£max) transfer matrix Kfji is defined by
(C7) =
Ka c y
V
147
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
(D.3)
148
The purpose of this section is to compute K ^ .
We first establish a key result:
evaluation of a sum of the form
S = ^W(x)F(x-x')W(x'),
(D.4)
XX'
where F ( z ) is a polynomial in
2
of degree < (-max- Expanding F in Legendre poly­
nomials, F(z) = Yhi<l TTVCLCC
FgP^z), we evaluate the sum as follows:
S
=
J2
W(x)W(x!)FiPe(x • x')
xx'
(D.5)
Aqr
= E 2lTTH/(x)H,(x')F*Y«>"(x)*y<™
(x')
x x 'm
= E
£<£„
m
It will be useful to rewrite this as an integral involving F(z) instead of a sum involving
F f
S =
87 t 2
J
d z ( W W (z)F(z),
(D.6)
where
(W W {z)i l f 1
J2
\Wtm\2Pt (z)
(D.7)
£ < lmax
m
can be thought of as an estimator of the correlation function of W(x), pretending
th at W{x) is a Gaussian field on the full sky which is zero (by coincidence!) outside
the survey region.
Armed with the key result (D.6), the transfer m atrix is calculated as follows. By
definition, the m atrix element Kggi is the expectation value of C j T (Eq. (D.2)), given
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
149
the signal covariance (T (x)T (x/)) = (2£f + 1 ) /( 47 t)P^/(x • x'). Therefore,
Ku
=
11+
=
27r £
W ^ ) W ^ ' ) Yl m Y Y Yl m ( ^ Y
P^ (x -x ')
xx'm
d2
W
-
( D -8 )
We have now arrived at our desired expression for Kpji. We defer the issue of efficient
evaluation of Eq. (D.8 ) to §D.3. For now, we note th at Eq. (D.8 ) is equivalent to the
form commonly seen in the literature (e.g. Hivon et al., 2001, Eq. A31),
2
£ £! £"
47T
(D.9)
0
l"m"
0
0
by virtue of the definition (D.7) of ( w w (z) and the identity
/
J
£ £! £"
(D.10)
dz P i(z )P ti(z)Ptn(z) = 2
0
D .2
0
0
T he transfer m atrix K ^ f ure
Computing the transfer m atrix K ^ f ure defined in Eq. (3.29) is completely analagous
to the temperature-only case, but the bookkeeping is considerably more complicated
owing to the presence of spin-0, spin-1, and spin-2 weights W (x), W a(x), W}JC{x).
We will need to use the results in Appendix C, in which it is shown how to compute
correlation functions between fields of these spins from all parity-even power spectra.
Throughout this section, we will use the abbreviated notations
Ylm =f Ytrn (x )
Ylm =f Ylm (x )
z =f (x • x')
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
150
in equations which contain a pair of points x, x 7.
We first consider an analog of the “key result” Eq. (D.4), in which F is replaced
by a 3-by-3 m atrix which contracts all combinations of spin-0, spin-1, and spin-2
weights:
W(x)
S = X J ( W (x ) ^ a (x ) W bc(x) ) F ( x ,x /)
(D .ll)
W a' (x')
^ W h'c\ x ' ) )
We make the assumption th at F is built from parity-even power spectra which are
zero for multipoles larger than £max■ (This is the analog of the assumption, from the
preceding section, th at F ( z ) is a polynomial of degree < (-max-) To be precise, let
F f w , F f G, F f E , F GE, F e e , F GC, F g b , F b b be arbitrary power spectra. Note
th at we denote the spin-0 “field” by W , and consider only parity-even combinations
of fields. Then F is assumed to be of the form,
(
jpWWy* v 1
p£
F (x ,x ')
E
£“
E£max
m
ipWGy* y/G
I
£m (£m)a’
jpGGyG* ylG
£
(£m)a (£m)a1
-pGEyE* y/G
I
(£m)bc (£m)a'
£m £m
77'WGyG* \rl
£
(£m)a £m
pW EV E* \rl
£
(£m)bc £m
/
jpWWy* y /E
\
£
£m (£m)b'd
zpGEyG* y /E
£
(£m)a (£m)Vd
rpEEyE* y /E
£
(£m)bc (fmlbV /
0
0
0
n
jpCCyC * y/C
£
(£m)a (£m)a'
TpCByB* y/C
£
(£m)bc (£m)a'
xpCByC* y/B
£
(£m)a (£rn)b'd
zpBByB * y/B
£
(£m)bc (£m)b'c?
n
\
(D.12)
)
Note the bi-tensor structure of F: each m atrix entry carries a set of tensor indices
which contract with the weight functions in (D .ll) so that S is a scalar.
The significance of the sum (D .ll), with F-m atrix of the form (D.12), will appear
shortly. For now, we forge ahead and evaluate S, using the method of the preceding
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
151
section. First, plugging the form (D.12) for F into the definition (D .ll) of S, we get:
( pW W
=
£ (
tm
W?m < .
pWG pW E \
pWG
pGG
pGE
pW E
pGE
pE E
/
C
V
( pCC
WSn
7>BB
V
\
J
\
pCB \
t?BB
W(m
w ,B *
(D-13)
J
This expression for S is the analog of Eq. (D.5) from the preceding section. Our
goal is to get an expression analagous to Eq. (D.6 ), in which the functions
f W X {z), ... }. appear directly instead of the power spectra
(z),
. .. }.
We first define correlation functions which are constructed from the weight functions
in the same way th at one constructs correlation functions from parity-even power
R eproduced with permission of the copyright owner. Further reproduction prohibited without permission.
152
spectra (Eq. (C.14)):
-WW
h
(D.14)
Y.W 'tm W tm Pt)
lm
»XX
lm
cw x
- s
B
r t f f 1)
lm
cY Y
^
1+ C
k'&g J1)
lm
lm
CQQ
i E ( « « f +wfcwL*P)
lm
(XQ
^
Z ( w £ w fmQ}2 + w ' S w ' M 2)
lm
cUU
YU
C
t E<<>*wfmRf +w?ZwLQ?)
47T
it E f^ m
+ WeZ Wh n Q f )
lm
Eq. (D.13) can then be rewritten as an integral containing the functions { f W W (z),
f w x (z),...y.
s
=
8x2
f 1 dz t w w ( z ) f w w (z) + 2Cw x ( z ) f w x (z) + 2t;WQ( z ) f WQ(z)
+CX X (z)SX X (z) + ( Y Y ( z ) f Y Y (z) + i Y Y ( z ) f Y Y (z) + 2CX % ) / X % )
+ 2CY U ( z ) f Y U (z) + < « « ( * ) / « « M + ( V U { z ) f V U (z)
(D.15)
To show this, it is easiest to work backwards, substituting Eqs. (D.14) into Eq. (D.15),
and ending up with Eq. (D.13). One writes Q fs\ R f s> in terms of Pj?s> (Eq. (C.9)),
and uses the orthogonality relation for PgS (Eq. (C.5)).
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
153
We have now arrived at our desired “key result” : an expression for S in terms
of the functions { f W W (z), f W X (z), .
.
We now proceed to calculate the transfer
matrix element K ^ f ure. (The case of K ^ f ure will be treated shortly.) This matrix
element is the expectation value (C
i d e / , r \ T i d ! e (x'))
'u J w _
{Ude(x)Ude
= ^
BB,pure\
(.R
given the signal covariance
f ( z ) Q d e Q ,d'e'
+
Q f { z ) U d e U ,d' e').
(D.16)
A short calculation, using only the definition of c ^ B,pure (Eqs. (3.24), (3.28)), shows
th at
(
Kt r e =
E ( W (x)
W a (x)
W 6c(x) ]F ^/(X ,X /)
W (x)
W
xx'
y
^
(D.17)
a\-x!)
w 6V(x ') j
where the F-m atrix is given by
F « /(x ,x ')
=
A liiy
[ R f ( z ) Q ^ Q ld'‘ '
+
Q f ( z ) U d ‘ U " t <! ]
\rlB
(£m)d'e'
B*
(£m)de
Y;
£
E
N'eTdj Y (^ )f
N ' F M , ' Y $ n)r
m=—£
\
Y £YdebcY (m
x
J
\
N £Td!e'b'dY£m
We will evaluate the right-hand side of Eq. (D.17) using the key result (D.15), but
first we simplify the F-m atrix using identities from Appendix C. We show the details
for the ( 1 ,2 ) m atrix entry,
/
^
r>(T2)
at/ 2£ -)-1 r p22f„\f\der\id!e' , r\22(„\riderr/d'e'~\ \ ' -i/-B* rp
{£m)f
U' ~ £n ( 2 e + l ) L-R V
Q
U
J 2 ^ Y{£m)deTd'e'a'J Y(£m)f
m=—£
(D.18)
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
154
To do the sum over m, we use identity (C.13), obtaining
It is then straightforward to do the index contractions and obtain:
(D.20)
This procedure can be used to simplify the remaining entries of F ^ /; when the dust
has settled, one finds
(D.21)
We would like to apply the key result (D.15) with F-m atrix given by this form, but
first there is an annoying technicality: our derivation of Eq. (D.15) assumed th at the
F-m atrix was built from parity-even power spectra, in the sense th at Eq. (D.12) is
satisfied. We claim th at this is so for the F-m atrix in (D.21), with £max = £ + £'■
To make this statem ent intuitively plausible, note th at the F-m atrix in (D.21) is a
rotationally invariant, parity-even object which is constructed by multiplying objects
of spins £ and t ' . A formal proof can be given by writing Q fs' , R f s' in terms of P^s'
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
155
(Eq. (C.9)), and using the product rule for P^s' (Eq. (C.6 )). This can be done one
m atrix entry at a time; again we supply the details only for the ( 1 ,2 ) m atrix entry:
r( U )
•W
( * '* ')
=
Nf - j 0 j r ( Q ? Q ? + R ? R ? ) K >
=
N ' i % ^ ( pt 2pf
E
\i-e'\<e»<e+£'
+plr
(D.22)
l p ^ ) x 'u'
.(2£f + 1)(2£" + 1)
(47r)^
I
X
£'
£
plO y>
a'
, 2 £’
E
+ 1
£
£’
£
m
x Yp'm"
(x /)
(1 21
We have shown th a t F ^ ,’ ' is of the form which appears on the RHS of Eq. (D. 1 2 ),
with £max = £ + £'.
W ith this final technicality out of the way, we can use the key result (D.15), with
F-m atrix given by (D.21), to evaluate the right-hand side of Eq. (D.17). This gives
the transfer m atrix in the form:
K +Pure
= 27T
J
^d z
^
( M z ) Q p ( z ) + B £ (z)R ft (z ))
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
(D.23)
156
where the functions Ap(z), Bp(z) are defined by
A e = ( w w Q f - 2 N e, CW X Q}2 + 2N£CWQP P
+ n £2(cx
y y rV)
x q\1 + C
(D.24)
- 2N£N £, ( XQ P$1
+ NfcQQPe°
B£ = Cw w R f - 2 N j , ( w x R e12 + N £2(( x x R e11 + ( y y Q}1)
(D.25)
—2N£N £( y u Pp1 + N 2( u u Pp®
To compute AT^fwre, one modifies the signal covariance (D.16) by exchanging Q22 <->•
R 22; this modification carries through to the end of the calculation and shows:
K e ? Ure
=2nJ l
dz ( ~ 4 / 1)
+ B e(z ) Q } h z ))
(D.26)
Eqs. (D.23) and (D.26) are the main results of this appendix, and show how to
compute the transfer matrices K ^ f ure for any choice of weight functions.
We emphasize th a t even though the derivation is lengthy, the final result is simple.
Each term in the integrals (D.23), (D.26) is a product of three functions: a generalized
Legendre polynomial of degree
a generalized Legendre polynomial of degree £, and a
correlation function { ( W W (z), ( w x (z), . .. } which depends only on the pixel weight
functions (D.14). The final form of the transfer matrices is simpler in position space
than in harmonic space, where the three-way multiplication would be replaced by a
sum involving 3j symbols, similar to Eq. (D.9), but with many terms.
Finally, we discuss the case in which multipoles are binned into bandpowers.
In this case, the bandpower transfer matrix K ^
is related to
by: K ^ , =
PbpK^jjPp/y, where the matrices P and P define the binning (§3.3). For each band-
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
157
power, we define “binned” versions of the functions Ag(z), Bg(z), Q“
g?(z), R p ( z ) :
A b(z)
=
Y, P b t M * )
B b(z)
=
I
Y
(D.27)
P btB dz)
I
Qv(*) = Yff (~
1 7 ^ ) pevQe(*)
v
/
M*) = Yt,
V
/
Pt>vR?(z)
In terms of these, the bandpower transfer matrices are given by:
D .3
K W
UTe
=
2*
f
K W
* ‘
=
^
j \ z
i z
(
( M z ) Q b '( z )
+
B b( z ) R u { z ) )
A
+
B b( z ) Q y ( z ) )
b( z ) R y ( z )
(D.28)
Efficient calculation o f th e transfer m atrix integrals
We have now shown th a t the transfer matrices K ^ , can be represented in integral
form (D.28). In this section, we give an algorithm for evaluating the integrals to
machine precision, whose running time is 0 ( 1 ^ ^ ) . Here, £max is the largest value of
(£ + £') for which a transfer matrix K ^ , must be computed. This is the same cost,
within a constant factor, of evaluating the estimators once (§3.4.2).
The first observation is that, when computing correlation functions { C ^ ^ (-2)?
^ 1 7 I(Z),
j. usjrig Eq (D.14), it is only necessary to sum over multipoles £ < £max-
This is because, as argued at the end of the preceding section, the F-m atrix can be
w ritten in the form (D.12), and by Eq. (D.13), only multipoles of the weight functions
with £ < £max contribute to S.
The second observation is th at the integrands in (D.28) are polynomials of degree
< 2£max. This can be seen term-by-term after plugging in the definitions of Ag{z),
Bg(z). For some terms, such as ( w w (z)Q‘p ( z ) Q ^‘ ( z ) , all three factors are poly­
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
158
nomials; for others, such as £
(z)Q^2(z)Q‘p ( z ) , the last is a polynomial and the
first two are polynomials times V l — z 2. There are no terms with an odd number of
V l
— z 2 factors.
Because of this second observation, the integrals can be done exactly using Gauss-
Legendre quadrature (Press et al., 1992, §4.5) with {(-max + 1) points. Our algorithm
for evaluating the integrals is therefore given as follows. First, we compute spherical
harmonic transforms {Wgm , W7^ . W^m , W ^ , W ^ } of the weight functions. Second,
we compute the correlation functions { ( W W {z), ( W X (z), ... } at each of the {(max +
1) quadrature points, using Eq. (D.14). Third, we compute the 4A^an^ values { A ^ z ) ,
Bb{z), Qb{z): Rb(z)} at each quadrature point z, using (D.27). Fourth, we loop over
bands b, £/, computing K ^ f ure by doing the integrals in (D.28) by Gauss-Legendre
quadrature.
Let us consider the running time of each of these stages. The first stage is 0{fimax),
using fast spherical harmonic transforms. The second stage can be done in time
0{£max) by first summing over m in Eq. (D.14)) at fixed I and then summing over I at
fixed z. The third stage is 0{Nband(max)> evaluating the generalized Legendre poly­
nomials by recursion (C.7). The fourth stage is 0 { N 2and( max) since Gauss-Legendre
quadrature is 0 { ( max) once everything has been precomputed at the quadrature
points. Putting this together, and noting th at Nband < (m a x , the computational cost
of computing the transfer matrices K ^ f ure is 0{(„iax).
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
A P P E N D IX E
C O M P U T IN G NO ISE BIA S
In addition to the transfer matrix K^jfure, the estimators presented in this paper
also require computing noise bias terms N E E , f^BB^)ure (Eq. (3.29)). For real ex­
periments, which include such complications as 1 / / noise, noise bias for all types of
pseudo-Q estimators must be computed by Monte Carlo. Indeed, a practical ad­
vantage of the pseudo-Cy framework is th at unbiased estimators can be constructed
given only Monte Carlo simulations of the noise; no other representation of the noise
covariance is required. However, for theoretical studies, it is convenient to have an
exact formula for the noise bias in simple cases. In this appendix, we consider noise
which is uncorrelated between pixels, and isotropic in each pixel, but not necessarily
homogeneous:
(Q(x)Q(x)) = {U(x)U(x))
=
c^x)2^ /
(E.l)
Here, we represent the noise by its per-pixel RMS tem perature cr(x) (i.e., units fiK
rather than /iK-arcmin).
W ith noise covariance given by (E.l), the noise bias defined in (3.29) is given by
fiEE
=
(E.2)
x
pjBB,pure = ^
X
W (x ) 2 + i v 'V a (x)W a (x) + 2N$Wbc( x ) W bc(x)
47T
This is derived starting from the definitions of E^m , B i m (Eqs. (3.8), (3.24)) using
identity (C. 1 2 ) from Appendix C.
159
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
A P P E N D IX F
FISH ER M A T R IX EVALUATIO N W IT H A ZIM U TH A L
SY M M E T R Y
In this appendix, we present the details of our method for fast exact evaluation of
the Fisher m atrix (3.31), in the case of inhomogeneous, but azimuthally symmetric,
noise. A similar method, in the context of CMB tem perature, appeared in (Oh et al.,
1999). Since we only consider uncorrelated noise in this paper, the noise covariance
can be written
<Q(x), Q(x')) = (£/(x), f/(x')> = i((S)2<S<2>(x - x'),
(F.l)
where rj(9) is arbitrary. If we change variables from {Q, U} to i U = (Q ± iU), and
Fourier transform in the azimuthal coordinate </?, by defining
~
/‘2 7r
n±(0)=/
dv [Q±iU}(e,<P)ei’nf
J0
(f.2)
then the noise covariance (F .l) is still diagonal (in both m and 9):
(iC(tf)n+(s')) = (n-*(»)n-,(«')> =
- «')W
(f .3)
The point of this change of variables is th at the signal covariance is also diagonal in
m (but still dense in 9). Using the results of (Zaldarriaga & Seljak, 1997), one can
160
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
161
show th at the signal covariance is given by:
n + * (0 )
n™*(0 )
\
n m'
+ ,(0')
KJ
n m!
_ ,(0')
v '
xEe '
- 2 ,- i X t n (»•»')
^
^rnm!
/
^
(F.4)
where we have defined
SS ‘
x f j e , s')
=f
(cfE± c f B)(sYtm(e, mAimtf, o »
(F.5)
Eqs. (F.3) and (F.4) can be used to efficiently calculate the Fisher matrix
FbV = iTV S^So + N r 'S y fS o + N ) - 1)
(F. 6 )
For concrete calculation, the continuous coordinate 6 is replaced with a set of N
equally spaced values 6\, O2 , ■■•,
with spacing A 6. The (2iV)-by-(27V) noise and
signal covariances of the variables { II^ ( 0 i), • • •,
11^
( 0 ^ ) , n ^ ( 0 i), • ■•, I I ^ ( 0 jv)} are
then given by
N (m )
4 -7T
( V(9i)2 X
sin(0j) lJ
A6
0
\
'
c(m )
ij
^ E
i
0
^
(F.7)
V(6i)2 s .
sin(flj) ll )
- 2 ,- 2 * ,+m(S„S,)
2 - 2 X i m (ei ,ej )
- 2 ,2 * ^ ,« ,) N
2,2x em (oi ,oj )
R e p r o d u c e d with permission of the copyright owner. Further reproduction prohibited without permission.
162
The Fisher m atrix (F. 6 ) is given by summing over m and tracing over 9:
-i
Fw
=
E
5
771—
=
,&(s£m)(sjjm )+ N ) - 1 S<,m)(skm )+ N ) - 1)
(F. 8 )
T Y lm a x
i l r f s f (S<0) +
+ N ) - 1)
'W'Tnax
+ J 2 T r(s[m)(s[,m) + N ) _1s [7 l)(S^m) + N ) _ 1)
m=1
where “Tr” is a (<
2 N)-hy-(2N) trace. Evaluating the spin harmonics by recursion
in t, the computational cost of each Fisher m atrix element is 0 ( N Q m max), versus
O(Npix) for brute force calculation without exploiting azimuthal symmetry. (For the
mock surveys studied in this paper, with Healpix resolution N slc[e = 256, one has
N q ~ TO and Npix ~ 104.)
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
A P P E N D IX G
RELATION TO FK P
In this appendix we prove the assertion, stated without proof at the end of §4.5, that
our ansatz for the optimal weight function reduces to the FK P (Feldman et al., 1994)
approximation under appropriate conditions.
The FKP approximation is used to
incorporate the effect of inhomogeous noise when estimating the amplitude of modes
th a t are much shorter than the size of the survey and also of the typical variation
scale of the noise. FK P propose a weight
w f K P oc
t-i
,
<x? + S!-1C V
(G .l)
-5
where Iq is the wavenumber where the power spectrum is being estimated, Q, is the
solid angle of each pixel, and af the noise variance in pixel i. Working in the flat sky
approximation, we will now show th at under the FKP assumptions, W PKP satisfies
Eq. (4.25).
Assuming th a t we are estimating power in a narrow band a near Iq we have,
o
ij
CT
where lo =
Iq ( cos
_
2r
— °i ij "f"
=
/
f
d2l
( G
' 2 )
[ d(pior -ih-(xi-x-j)
// ' 2tt
ip\Q, sin</?i0). Plugging this into Eq. (4.25) we get,
E c v c ^ w f KP
=
J 2 ^ Stj + I ^ L c ie- ^ - ^
(/
2 tt
/a
+ f1-1(7,,
do
163
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
(G.3)
164
The first term, coming from the noise contribution to the covariance, simply gives:
2
- 5—
-----■
o f + n - 'C io
(G.4)
To make progress on the second term we need to use some of the FKP assumptions.
First we willuse th a t we are interested in a situation in which the noise varies slowly
and write,
^ o - i n ~ / ( xi) = / ( x *) + V /(x*)(xj - Xj) + • ■• ,
aj + “ c lo
~2
(G.5)
so th at the second term in (G.3) becomes
£
/
(S n sr
( /(x ,) + V /(x ,)(Xj -
X i) + • • •)
(G.6)
3
For the first of these terms we get:
3
/21
+ O ( l//0i2)
(G.7)
h
where R is the size scale of the patch. Now let us check what happens with the
additional terms in Eq. (G.6 ) describing anisotropy of the noise.
The additional
contribution is proportional to:
V x / • V loq o,
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
(G.8 )
165
which is suppressed by 1/I qR n with respect to the leading term where Rjy gives the
scale of variation of the noise.
Putting all of this together, we see that
E C ijC?j W ' f K P = 1 + 0 ( \ / l aR) + O ( l / l 0R N )
j
(G.9)
i.e., for diagonal inhomogenous noise when considering modes much shorter than both
the scale of variation of the noise and the patch, our ansatz (4.17) for the pseudo-Q
weight function reduces to FKP.
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
A P P E N D IX H
N U M E R IC A L O PTIM IZA TIO N OF W E IG H T
F U N C T IO N S
In this appendix we supply the details of our procedure, described briefly in §4.7,
for numerically optimizing p s e u d o - w e i g h t functions given the noise RMS cr(x) in
each pixel. We first treat the more difficult case of optimizing the B-mode weight
function; as we will then see, optimizing the B-mode weight function can be obtained
as a special case. As described in §4.7, the B-mode “weight function” really consists of
three functions W (x), W\{x), W 2 (x) with spins 0, 1 , 2 respectively. These are varied
independently in order to minimize the total expectation value (Ca ) of the pseudoQ estimator (with contributions from signal and noise), subject to the normalization
constraint
E
w w
= 1-
(H 1 )
X
To solve this minimization problem, we first rewrite the expectation value (Ca ) to
be minimized in a form which makes the dependence on the weight functions easier
to understand. Considering first the noise contribution to (Ca ), a short calculation
shows th at
|W l(x )p
,
\Wi(x)\{C„)n„ise = ^ T , w e A * ) { w { x f + 4 J f
lx
,
v- l)(« + 2 )
(t - m i + l ) ( l + 1)
(H-2)
where W( denotes the t weighting within the bandpower under consideration. (Through­
out this paper we have taken this to be
= £(£ + 1)/(27t) inside the band and zero
outside the band, as in Eq. (4.12).)
The signal contribution to (Ca ) is more difficult to compute, but follows from the
166
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
167
results of Appendix D. In order to write it down, we decompose the pieces of the
weight function in spherical harmonics:
nw
£m
Y ,W (x )Y fJ x )
(H.3)
nG
£m
(J
fh S a h h iE L M '
a£m
X
nE
£rn
a£m
V
^
-
i
—
2
+ ~ 2Ye*m ^
1 2Y‘ m ( x ) ~ ~ 2Y‘m (x )
(The spin-2 piece is the E /B decomposition from Eq. (4.2), whereas the spin-1 piece
is the gradient/curl decomposition of a vector field.)
The signal contribution to (Ca } can then be written in the following form:
( Ge
r WW
(Ca) sig
=E
£m
nW*
£m
„G*
a£m
E*
£m
rG£WG
/~<WE
U£
+
0C* nB*
£m
£m
r cc
U£
U£
r CB
r BB
r'W G r W E \
r GG r GE
s-iGE
U£
r CB
s iE E
£
/
afm
J
( a„W
£m
G
a£m
nE
\ im
\
(H.4)
"?m )
Here, the power spectra which appear depend on the bandpower weighting Wg and
signal power spectra. For the latter, it is convenient to introduce the notation C E =
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
168
(c F ^ ',sig ± c f ^ ' 3*g). The power spectra in Eq. (H.4) are then given by:
c,w w
-L
21" + 1
dz y
21 "
a WG
=
L
Cp d22(z )d22(z ) ~ ^ v *^2 ,—2
167T
Vi"
d z T . (.
+
Ifcr y V(<" - ! ) « " + 2)
+ &£/ <
A - 2 ( Z) A . - 2 ( Z)
2t" + 1 \
-L
dzT ,(
^
m nn \
J
W'i»
t
V ( < " - W " + ! ) « ’" + 2)
X^ 0 2 (Z) —C,^ ^ 2 2 (-2:)^ 0 2 (2:) +
C\GG
-
VI"
X
r <CC
= L
d,2_2(z)dQ2(z )
2£" + l \
4W,
'I"
16tt y ( ^ - l ) ( £ " + 2 )
dzY.
{
m nn \
/
(^)
2 Vty>//
1'
X4 l ( 2:) _
n WE
2
(tyt1 Xk4-2( j i (^) - y. ^i._i:) '4, - 2 (
< f e W 2<" + 1i
i,- 1:-)
4W>
16 tt y ( £ " - l ) ( f ' + 2)
m nn \
£ 'l "
x |^C'^’rfil(^)^22(2:)^ ll( 2:) + Cp rfi!_ i(^)^2 ,-2 (2:) ^ l - l ( 2:)
y > /2 ^ '+ l\
a GE
2 Wff,
167r J ( £ " - l ) ( £ " + 2) x/£"(£" + l)
Zfjf\
x \C p di 2{z)d 22 (z) ~ C p d [ _ 2 (z)d 2 - 2 (z ) ^01 (z )
a
, y-^/2£" + l \
CB
- S .
dzY [
/~iEE
■
/
16tt J (£" - \)(£» + 2 ) y /£"(£" + 1 )
m nn \
X
\Cp d i2(z)d22(z) + C£l d{ _2(z)d2_2(z) ^oi(z)
1\
we„
16tt J (£"^L)f?"(f?" + !)(£" + 2 )
' 2 £" +
dz y
Vi"
x
c-BB
2Won
2 2 (^)^ 2 2 (z)
/
_ Cp d 2 - 2 ( z ) d 2 -2 ( z ) ^ 0 0 (z)
2£" + 1
-L
^ e
(
VI" v
16tt
y (^" - 1)£"(£" + l ) { £ " + 2)
x C ^ d 22 (z).d^{z) + Cp d 2 _ 2 (z )d 2 , - 2 (z ) ^ 0 0 (z )
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
(H.5)
169
where d? ,(z) is the reduced Wigner rotation m atrix element (Varshalovich et ah,
1988); note th at dg0(z) is just the Legendre polynomial Pg(z).
We have written
the power spectra as position-space integrals; if convenient these expressions can be
converted to harmonic space using the identity
/
1
/
//
d z 4 , ± A z)4,±2(z) 4 - s , ± 2 T A z) =
2
1 £
£'
s'
=F2
£"
±2
X
=F s' y
(H.6)
Armed with the expressions (H.2) and (H.4) for the noise and signal contribu­
tions to {Ca ), a procedure for minimizing (Ca ) — (Ca )sig + (Ca) noise subject to the
constraint (H .l) is not difficult to come by. Representing the three components of
the 5-m ode weight function by a single length-(5iVpjx) vector w, both (C'a )sig and
(Ca) noise are quadratic in
w, so we can write them compactly as
'
rp
rp
(Ca ) = (Cot)sig d" (Ca)noise = ^ Qs'W T w Qnw
(H.7)
where Qs and Qn are (5Arpix)-by-(5Arpjx) matrices. In this notation, the constraint (H.l)
can be written vT w =
1,
where v is a vector with Apjx entries equal to
1(corre­
sponding to the spin - 0 piece of the weight function) and (4Apix) entries equal to
0
(corresponding to the higher-spin pieces). The solution to this minimization problem
is simply
_
(Qs + Q n ^ V
W°P t ~ vT(Qs + Q n ) ~ lv
/TJ oN
(H 8)
Since dense inversion of a (57Vpix)-by-(5A^pjx) m atrix is computationally infeasible,
our approach is to compute (Qs + Q n)- 1 w using conjugate gradient inversion (Press
et al., 1992).
In order to use conjugate gradient inversion, two ingredients are needed. First,
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
170
one needs an efficient way to multiply a vector by ( Q s + Q n ) , even though this matrix
will be too large to store in dense form. This is easily obtained from Eqs. (H.4)
and (H.2 ); note th at multiplying by Q s requires transforming from pixel to harmonic
space and back. The second ingredient is a preconditioner, or approximate inverse
of (Qs + Q n ) , used to speed convergence; the better the preconditioner approximates
(Qs
+ Q n)- 1 , the more rapidly the conjugate gradient search will converge.
We use the following simple preconditioner, obtained by keeping only the diagonal
of (Qs + Q n ) , then inverting:
\ a0 + a \x )
a f + a 2(x)
0 2
+ a 2(x)
(H',9)
where we have defined
i
°\
4
v
7
= e ( ^
W
i
7
=
£
i
v
(
^
v
W
g + c?c)
£ + q b b ).
7
The performance of this preconditioner depends mainly on the noise level. For noise
levels > 20 /iK-arcmin, only a few CG iterations are needed; for the noise levels of our
fiducial experiment (~ 5.75 /iK-arcmin), around 100 iterations are needed; and for
noise levels < 1 /iK-arcmin, the CG search does not converge. A better preconditioner
may accelerate weight function optimization, but we have not attem pted to find one
in this paper.
In summary, we have now arrived at a method for numerically optimizing the
5-m ode weight function, given the bandpower (specified by the t weighting W#) and
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
171
noise RMS a(x). One precomputes the power spectra in Eqs. (H.5), then computes
(Qs
+
Q n)~lv
by conjugate gradient inversion, using the preconditioner (H.9), to
obtain the optimized weight function given by (H.8 ).
Finally, let us discuss the completely analagous but simpler problem of optimizing
the E-mode weight function. In this case, the weight function is just a scalar W(x).
The noiseand signal contributions to (Ca ) are given (in analogy to Eqs.
(H.2), (H.4)
above) by
(Ca)noise =
£ E
Wt <T2 ( x ) W ( x f
(H .ll)
£x
(C a)sit
(H-12)
=
£m
where
c ww
drf r1 dz ^2 (^± i) w><4w\c+42(z)dZ(z) + c^dl_2{z)4[_2(z)
£'£"
(H.13)
Interpreting these as an iVpjx-by-A/pix m atrix equation
(Ca) = (Ca)noise T (Ca)sig = w Qn'W T w QgW
(H.14)
the optimized E-mode weight function is given by
(Q s + Q n ) ~ l V
W^
= v T {Qs + Qn)- ^ -
<H' 15)
where v is a length-Epjx vector consisting of all l ’s. Eq. (H.15) is evaluated using conjugate gradient inversion with preconditioner given by W (x ) —> W ( x )/ (ctq
' 02 + cr(x)2).
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
R EFE R E N C E S
Amarie, M., Hirata, C., & Seljak, U. 2005, Phys. Rev., D72, 123006
Barkats, D., et al. 2005, Astrophys. J., 619, L127
Bond, J. R., Jaffe, A. H., & Knox, L. 1998, Phys. Rev., D57, 2117
Bunn, E. F. 2002, Phys. Rev., D65, 043003
Bunn, E. F., Zaldarriaga, M., Tegmark, M., k Oliveira-Costa, A. d. 2003, Phys. Rev.,
D67, 023501
Challinor, A., k Chon, G. 2005, Mon. Not. Roy. Astron. Soc., 360, 509
Challinor, A., k Lewis, A. 2005, Phys. Rev., D71, 103010
Efstathiou, G. 2006, Mon. Not. Roy. Astron. Soc., 370, 343
Feldman, H. A., Kaiser, N., &; Peacock, J. A. 1994, Astrophys. J., 426, 23
Goldberg, J. N., et al. 1967, J. Math. Phys.,
8
, 2155
Gorski, K. M., et al. 2005, Astrophys. J., 622, 759
Hansen, F. K., & Gorski, K. M. 2003, Mon. Not. Roy. Astron. Soc., 343, 559
Hansen, F. K., Gorski, K. M., k Hivon, E. 2002, Mon. Not. Roy. Astron. Soc., 336,
1304
Hinshaw, G., et al. 2006
Hirata, C. M., k Seljak, U. 2003, Phys. Rev., D 6 8 , 083002
Hivon, E., et al. 2001
Hu, W., k Okamoto, T. 2002, Astrophys. J., 574, 566
Kamionkowski, M., Kosowsky, A., k Stebbins, A. 1997a, Phys. Rev. Lett., 78, 2058
— . 1997b, Phys. Rev., D55, 7368
Kovac, J., et al. 2002, Nature, 420, 772
Lewis, A. 2003, Phys. Rev., D 6 8 , 083509
— . 2005, Phys. Rev., D71, 083008
Lewis, A., Challinor, A., & Turok, N. 2002, Phys. Rev., D65, 023505
172
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
173
Montroy, T. E., et al. 2006, Astrophys. J., 647, 813
Newman, E., & Penrose, R. 1966, J. Math. Phys., 7, 863
Ng, K.-W., & Liu, G.-C. 1999, Int. J. Mod. Phys., D 8 , 61
Oh, S. P., Spergel, D. N., &; Hinshaw, G. 1999, Astrophys. J., 510, 551
Okamoto, T., & Hu, W. 2003, Phys. Rev., D67, 083002
Oxley, P., et al. 2004, Proc. SPIE Int. Soc. Opt. Eng., 5543, 320
Press, W. H., Flannery, B. P., Teukolsky, S. A., & Vetterling, W. T. 1992, Numerical
Recipes: The Art of Scientific Computing, 2nd edn. (Cambridge (UK) and New
York: Cambridge University Press)
Readhead, A. C. S., et al. 2004
Seljak, U., & Zaldarriaga, M. 1997, Phys. Rev. Lett., 78, 2054
Sievers, J. L., et al. 0900
Smith, K. M., Hu, W., & Kaplinghat, M. 2006, Phys. Rev., D74, 123002
Smoot, G. F., et al. 1992, Astrophys. J., 396, LI
Spergel, D. N., et al. 2003, Astrophys. J. Suppl., 148, 175
— . 2006
Taylor, A. C., et al. 2004
Tegmark, M. 1997, Phys. Rev., D55, 5895
Tegmark, M., & de Oliveira-Costa, A. 2001, Phys. Rev., D64, 063001
Varshalovich, D. A., Moskalev, A. N., & Khersonskii, V. K. 1988, Quantum Theory
of Angular Momentum (World Scientific)
Wandelt, B. D., Hivon, E., & Gorski, K. M. 1998
— . 2001, Phys. Rev., D64, 083003
Yoon, K. W., et al. 2006
Zaldarriaga, M. 1998, Astrophys. J., 503, 1
Zaldarriaga, M., & Seljak, U. 1997, Phys. Rev., D55, 1830
— . 1998, Phys. Rev., D58, 023003
Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.
Документ
Категория
Без категории
Просмотров
0
Размер файла
3 802 Кб
Теги
sdewsdweddes
1/--страниц
Пожаловаться на содержимое документа