THE UNIVERSITY OF CHICAGO COSMIC MICROWAVE BACKGROUND ANALYSIS FOR CAPMAP AND FUTURE EXPERIMENTS A DISSERTATION SUBMITTED TO THE FACULTY OF THE DIVISION OF THE PHYSICAL SCIENCES IN CANDIDACY FOR THE DEGREE OF DOCTOR OF PHILOSOPHY DEPARTMENT OF PHYSICS BY KENDRICK M. SMITH CHICAGO, ILLINOIS AUGUST 2007 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. UMI Number: 3273077 INFORMATION TO USERS The quality of this reproduction is dependent upon the quality of the copy submitted. Broken or indistinct print, colored or poor quality illustrations and photographs, print bleed-through, substandard margins, and improper alignment can adversely affect reproduction. In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if unauthorized copyright material had to be removed, a note will indicate the deletion. ® UMI UMI Microform 3273077 Copyright 2007 by ProQuest Information and Learning Company. All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code. ProQuest Information and Learning Company 300 North Zeeb Road P.O. Box 1346 Ann Arbor, Ml 48106-1346 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Copyright © 2007 by Kendrick M. Smith All rights reserved Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. ABSTR ACT A major frontier for cosmology in the coming decade will be making precision mea surements of the cosmic microwave background (CMB) polarization, complement ing existing measurements of the CMB tem perature anisotropies. The E-mode, or gradient-like component, of CMB polarization will break param eter degeneracies from CMB tem perature alone and improve constraints on reionization history and initial conditions in the standard cosmological model. The B-mode, or curl-like component will permit strong constraints on growth of structure from CMB lensing, and probe new physics by measuring the gravity wave content of the early universe. In the first half of this thesis, we describe design and implementation of the analysis pipeline for the 2005 observing season of CAPMAP, an experiment to measure CMB polarization on small angular scales using coherent polarimeters and the Lucent 7 meter telescope in Crawford Hill, New Jersey. Although the results of the analysis are not completely finalized, we present partial results obtained from the data, and full results for a full-sesaon simulation, in order to illustrate the measurement th at will be obtained. The CAPMAP analysis pipeline uses a likelihood formalism which is computation ally expensive, but results in measurement uncertainties which are provably optimal. An optimal analysis will be computationally infeasible for upcoming generations of CMB polarization experiments, in which the problem size will be larger by several orders of magnitude. Therefore, fast approximate methods have been proposed. In the second half of this thesis, we show th at in their originally proposed form, these methods fail to preserve the E-B decomposition, and this failure ultimately acts as a limiting source of noise when measuring B-modes, and propose modifications which solve this problem. iii Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. ACKNOW LEDGM ENTS First and foremost, I would like to thank my advisors Wayne Hu and Bruce Winstein for their constant ideas and advice throughout the last five years. Wayne’s ability to answer detailed questions “cold” from any subfield of cosmology served as both a source of inspiration and humility. Bruce’s great enthusiasm for physics was always highly contagious and led to frequent Friday afternoon conversations about oddball problems and thought experiments. Very few scientists would have the excitement and motivation to start over as a “graduate student” in a new area, after reaching prominence as one of the leading figures in one field of physics. This thesis would not have been possible without the hard work and additional perspective of my collaborators. The CAPMAP experiment was designed, built and operated by Denis Barkats, Colin Bischoff, Phil Farese, Todd Gaier, Josh Gundersen, M att Hedman, Lewis Hyatt, Glen Nixon, Dorothea Samtleben, Suzanne Staggs, Eu genia Stefanescu, Keith Vanderlinde, and Bruce Winstein. The CAPMAP analysis effort was distributed among many members of the collaboration; but I am especially indebted to Bruce and Lewis, who I worked most closely with over the course of the analysis. Thanks to my collaborators throughout my time at Chicago: Olivier Dore, Wayne Hu, Dragan Huterer, Manoj Kaplinghat, Mike Nolta, Oliver Zahn, and Matias Zaldarriaga. This research could not have happened without their contributions and feedback. Getting an unexpected email from Matias complimenting one of my papers was a great encouragement to a budding cosmologist. Thanks to all of my officemates and friends at Chicago who helped keep me sane over the years: Prahlad, Sunny, David, Andrei, Monica, Vikram, Ivo, Takemi, Zhaoming, Robert, Colin, Keith, Cora, as well as everyone else I’ve forgetten to iv Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. mention. I would like to thank my family for all of their support and encouragment. Finally, I would like to thank Anjali for reasons too numerous to mention, from helping me realize th at I always secretly wanted to be a physicist in the first place, to staying up late making figures the night before the thesis deadline. She is the center of my universe. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. TABLE OF C O N T E N T S ABSTRACT ................................................................................................................. iii A C K N O W L E D G M E N T S ........................................................................................... iv LIST OF F I G U R E S .................................................................................................... ix LIST OF T A B L E S ........................................................................................................ xiii 1 IN T R O D U C T IO N ................................................................................................. 1.1 The cosmic microwave background ......................................................... 1.2 CMB p o larizatio n .......................................................................................... 1.3 Scope of thesis ............................................................................................. 1 1 6 12 2 ANALYSIS OF 2005 DATA FROM C A P M A P ................................................. 2.1 In tro d u c tio n ................................................................................................... 2.1.1 Focal plane and scan s tr a t e g y ........................................................ 2.1.2 P olarim eters........................................................................................ 2.2 Pipeline o v erv iew .......................................................................................... 2.2.1 P o in tin g ............................................................................................... 2.2.2 Gain m o d e l ........................................................................................ 2.2.3 Noise m o d e l........................................................................................ 2.3 S im u latio n ....................................................................................................... 2.4 Cycle finding ................................................................................................ 2.5 D ata se le ctio n ................................................................................................ 2.6 Map m a k in g ................................................................................................... 2.7 Likelihood a n a ly s is ...................................................................................... 2.7.1 Bandpower estimates: maximum likelihood................................. 2.7.2 Bandpower uncertainties: conditional likelihoods....................... 2.7.3 Bandpower uncertainties: marginalized likelihoods.................... 2.8 Combining maps .......................................................................................... 2.9 Null m a p s ....................................................................................................... 2.10 Null test s u i t e ................................................................................................ 2.11 Final power sp ectru m ................................................................................... 15 15 15 18 21 23 25 25 29 30 31 33 37 39 42 47 51 55 57 65 3 PURE PSEU D O -Q ESTIMATORS ................................................................. 66 3.1 In tro d u c tio n ................................................................................................... 66 3.2 Notation and C o n v e n tio n s......................................................................... 68 3.3 Pseudo-C^ e stim a to rs................................................................................... 70 3.4 Constructing pure pseudo-C^ estimators ................................................ 72 3.4.1 Constructing pure pseudo-Q estimators 1: continuum limit . 73 3.4.2 Constructing pure pseudo-Q> estimators 2: finite pixelized maps 75 vi Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. vii 3.5 3.6 3.7 3.8 4 An example with homogeneous n o i s e ...................................................... An example with inhomogeneous n o i s e ................................................... Detectability of gravity wave B-modes ................................................... C onclusion....................................................................................................... 80 87 91 96 98 A GENERAL SOLUTION TO THE E-B MIXING P R O B L E M .......... 4.1 In tro d u ctio n ................................................................................................... 98 4.2 Preliminaries ................................................................................................ 100 4.2.1 Spin two n o ta tio n ............................................................................... 100 4.2.2 Small-angle approxim ation.............................................................. 101 4.2.3 The ambiguous m o d e s ..................................................................... 102 4.3 Avoiding leakage by measuring x b ......................................................... 106 4.3.1 A very blue spectrum: the need for a p o d iz a tio n ....................... 107 4.4 Relation to the pure pseudo-C /s ............................................................ 109 4.5 Choosing the optimal weight fu n c tio n ...................................................... 113 4.5.1 D eriv atio n ........................................................................................... 116 4.5.2 A variational form of the a n s a tz ..................................................... 117 4.6 Some analytic solutions................................................................................ 119 4.6.1 Case 1: £2 power sp e c tru m .............................................................. 119 4.6.2 Case 2: Z4 power s p e c tru m .............................................................. 121 4.7 Numerical so lu tio n s...................................................................................... 121 4.8 A realistic e x a m p le ...................................................................................... 126 4.9 D iscu ssio n ...................................................................................................... 134 A SPIN-1 SPHERICAL HARMONIC TRANSFORM S....................................... 138 B FINITE DIFFERENCING IN AN IRREGULAR SPHERICAL PIXELIZAT I O N .................................................................................................................. 139 C CORRELATION FUNCTIONS FOR FIELDS OF ARBITRARY SPIN . . 141 D COMPUTING TRANSFER M A T R IC E S .................................................... 147 D .l Transfer matrices for tem perature p s e u d o -C /s ...................................... 147 D.2 The transfer m atrix K ^ f u r e ...................................................................... 149 D.3 Efficient calculation of the transfer matrix in te g r a ls ............................ 157 E COMPUTING NOISE B IA S ........................................................................... F FISHER MATRIX EVALUATION WITH AZIMUTHAL SYMMETRY . . G RELATION TO FK P 159 .......................................................................................... H NUMERICAL OPTIMIZATION OF WEIGHT F U N C T IO N S ............. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 160 163 166 REFERENCES Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. LIST OF FIG U R ES 1.1 1.2 1.3 2.1 2.2 2.3 2.4 2.5 2.6 2.7 2.8 2.9 Measurements of the CMB tem perature power spectrum C j T from the WMAP satellite and several ground-based experiments (from Hinshaw et al. (2006))..................................................................................................... Full-sky observations of the CMB tem perature anisotropy A T (n) from the WMAP satellite, combined over several frequencies using the In ternal Linear Combination (ILC) method (from Hinshaw et al. (2006)). Predicted TT, EE, and BB power spectra for the WMAP3 fiducial cos mology modified to include a gravity wave component with T / S = 0.2. The BB spectrum contains contributions from gravity waves (the larger contribution for t < 100), and gravitational lensing of the primary Emode signal (larger for I > 100)................................................................... CAPMAP focal plane..................................................................................... CAPMAP scan strategy. In one 21 second scan cycle, each polarimeter scans a circle near NCP; full azimuthal coverage is obtained every 24 hours from the E arth’s rotation.................................................................... CAPMAP analysis pipeline ....................................................................... Raw CAPMAP timestream data, downsampled to 1 Hz and plotted over an arbitrarily chosen 100 sec interval, showing the sinusoidal at mospheric signal on top of white noise fluctuations.................................. Conditional likelihoods for EE, BB and EB bandpowers, with remain ing bandpowers fixed to maximum likelihood values, for simulated Wband (black/solid) and Q-band (blue/dotted) full-season d ata.............. Conditional likelihood for lowest-^ EE and BB bandpowers (on the x-axis and y-axis respectively), with remaining bandpowers fixed to maximum likelihood values, for simulated Q-band full-season data. . . Markov chain likelihood samples from the full-season simulated dataset in Q-band. Each sample is a twelve-component vector, corresponding to the twelve bandpowers shown in Fig. 2.5, but only the lowest-^ EE bandpower (x-axis) and lowest-^ BB bandpower are shown here. The equilibrium distribution of the chain is the marginalized likelihood, but each sample is correlated to the previous sample...................................... Marginalized likelihoods for EE, BB, and EB bandpowers, obtained by histogramming Markov chain samples, for simulated W -band (black/solid) and Q-band (blue/dotted) full-season d ata................................................ Day/night null test for W -band CAPMAP data. The y 2 value (Eq. (2.53)) to zero for the EE, BB, and EB likelihoods is 0.01, 0.79 and 0.25 re spectively, with one degree of freedom in each y2..................................... ix Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 2 4 10 16 17 21 26 44 45 49 52 58 X 2.10 Final power spectrum measurement in from the full-season simulated dataset, with marginalized uncertainties computed by Markov chain sampling. The three sets of error bars represent W -band data alone (left), Q-band data alone (middle), and both frequencies combined (right)................................................................................................................. 2.11 One-band likelihood in a multiple of the fiducial EE spectrum, for the full-season simulated dataset with both frequencies combined. Zero power is excluded at 13a................................................................................ 3.1 The pure B-mode (3.34), for (£,m) = (10,1) (left) and (50,1) (right), using apodization length r* = 5°. Only the real part is shown. In the left panel, the second and third terms, which are concentrated in the apodization region, dominate the main term ............................................. 3.2 Transfer functions K ^ , (left) and K ^ ure (right), shown for varying b', with the band b fixed at £rmn = 80, £max = 120. The apodization length used was r* = 5°. The matrix entry K ^ , represents the mean response of the BB estimator in band b to BB power in band £/; K ^ , represents the response of the BB estimator to EE power. As N s^ e —> oo, approaches zero........................................................................... 3.3 Contribution to ((■j^B ^Pure) fr0m the fiducial E-mode power spectrum, for zero beam (left) and FWHM 25’ (right). This can be interpreted as the estimated B-mode power which is contributed by E-modes due to pixelization artifacts. A gravity wave B-mode spectrum with T / S = 0.01 is shown for scale. The apodization length used was r* = 5°. . . 3.4 RMS bandpower errors for pure pseudo-6^ estimators (blue/solid), un modified psuedo-Q estimators (green/dashed), and optimal Fisher er rors (red/dotted), shown for varying apodization length r*. In the left panel, the band 10 < £ < 40 is shown; in this band, the performance of unmodified p s e u d o -e s tim a to rs is very poor and the dashed curve is not visible. In the right panel, the band 40 < £ < 80 is shown. The noise level used was 20 //K-arcmin............................................................... Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 63 64 83 84 85 86 XI 3.5 RMS bandpower errors for pure p s e u d o -e s tim a to r s (blue/left), un modified pseudo-C/ estimators (green/middle), and optimal Fisher er rors (red/right). The survey region is a spherical cap of radius 13° with uniform white noise. The top two panels are for a fiducial model with T / S = 0.2 and noise level 20 /iK-arcmin; the bottom two panels are for a fiducial model with T / S = 0.05 and noise level 10 /iK-arcmin. In each pair of panels, bandpowers are shown for E-modes (left) and B-modes (right). In the E-mode panels, multipoles t < 250 have been replotted on a log scale for visibility. In the B-mode panels, the lensing component of the power spectrum has been shown separately (dashed); detecting the gravity wave signal requires measuring power in excess of this level........................................................................................................ 3.6 RMS bandpower errors for pure pseudo-G/ estimators (blue/left), un modified pseudo-C/ estimators (green/middle), and optimal Fisher er rors (red/right). The survey region is a spherical cap of radius 13° with inhomogeneous noise given by (3.35). The top two panels are for a fiducial model with T / S = 0.2 and noise level 20 /iK-arcmin; the bottom two panels are for a fiducial model with T / S = 0.05 and noise level 10 /iK-arcmin. In each pair of panels, bandpowers are shown for E-modes (left) and B-modes (right)............................................................. 3.7 Minimum T /S detectable at la, for a 13° spherical cap survey with uniform noise, assuming a 25’ beam and Gaussian lensing contaminant (left), and zero beam and no lensing contamination (right). The solid line is the optimal Fisher value, the dotted line is the Monte Carlo value using unmodified pseudo-G^ estimators (§3.3), and the dashed lines are Monte Carlo values using pure pseudo-G^ estimators. For the parameters in the right panel, finite-pixelization artifacts impose a “floor” on the value of T /S which can be detected, but the value can be made arbitrarily small by increasing the resolution. For the parameters in the left panel, we find th at increasing the resolution beyond N s^ e = 256 results in no significant improvement...................... 88 92 93 4.1 Illustration of the leakage due to finite sky coverage in the ordinary pseudo-C/ technique. A circular region 10 degrees in radius was used to estimate the pseudo-G/’s. The leakage curve shows the R-modes obtained even in the absence of any cosmological R-modes just from the leakage of E. For comparison we also show the expected cosmological signal for T / S = 0.1 and the noise power spectrum for an experiment — 4.2 1/2 with Wp = 5 .7 5 /iK-arcmin and an 8 arcminute beam ............................105 Optimal window functions W0pt(r), given by Eqs. (4.31), (4.33) for a signal+noise power spectrum proportional to /2 (left panel) or /4 (right panel).................................................................................................................... 120 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. xii 4.3 4.4 4.5 4.6 4.7 4.8 4.9 5-m ode weight functions for varying noise levels, for t = 20 and a 10° spherical cap with homogeneous noise. As descrbed in the text, each weight function consists of spin-0, spin-1, and spin-2 pieces, which we show in the left, center, and right columns. The top three rows show the 5-m ode weight functions produced by our optimization procedure for noise levels 100, 20, and 5 //K-arcmin (from top to bottom). In the bottommost row we show the analytic solution given by Eq. (4.33). . Power spectrum errors for the 10° homogeneous mock survey described in §4.7, using pseudo-Q power spectrum estimation with counterterms (left/blue) or optimal estimators (right/red). For the 5 - mode power spectrum, we compute power spectrum errors in a fiducial model with T / S = 0 but also show a spectrum with T / S = 0.05 for comparison. . Survey region, noise distribution, and point source mask for our fiducial realistic experiment. The side length of the bounding square is 25°. The map is based on preliminary simulations of the upcoming balloon experiment EBEX............................................................................................ Weight functions for the fiducial realistic experiment. Each of the three rows corresponds to a different bandpower, with multipole ranges (^mim Anax) given from top to bottom by (30,70), (190,230) and (510,550) W ithin each row from left to right, the 5-m ode weight function, the spin-0 piece of the 5 - mode weight function, and the spin-1 and spin-2 pieces of the 5-m ode weight function are shown...................................... A zoomed view of the (4nin^max) = (190, 230) 5-m ode weight func tion near point sources. From left to right, the panels show the noise map, the scalar piece of the weight function, and the two higher-spin counterterms..................................................................................................... Contribution of aliased 5-modes to the pseudo power spectrum C f B for the fiducial realistic experiment with point source mask, with signal and noise power spectra shown for comparison. Using counterterms from our numerical optimization procedure, the level of 5 —►5 mixing is well below the noise floor, in contrast to ordinary p s e u d o Comparison of three sets of pseudo-Q power spectrum errors for the fiducial realistic experiment. Weight functions were optimized inde pendently in each of the three cases. Left/blue: P s e u d o - w i t h coun terterms, without the point source mask. Center/magenta: Pseudo-C^ with counterterms, with the mask. Right/red: Ordinary pseudo-C^, with the mask. For 5-m ode power spectrum errors, the three cases are so similar th at only one set of error bars is shown............................. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 124 125 127 129 129 131 132 LIST OF TABLES 2.1 2.2 2.3 2.4 3.1 4.1 List of CAPMAP d ata channels, illustrated for W -band polarimeter OB and Q-band polarimeter OQ.................................................................... Correlation m atrix between bandpowers, computed from 2D condi tional likelihoods with the remaining bandpowers held fixed to maxi mum likelihood values..................................................................................... Null test suite for real CAPMAP data. The overall x 2 of the test suite is 70.08 with 84 degrees of freedom, indicating th at the null test passes. Null test suite for real CAPMAP data, with ground-synchronous mode removal disabled in the map making stage, showing clear failure of the test suite in Q-band......................................................................................... 20 46 60 62 Pixel weight functions used to construct E-mode estimators for the mock surveys in this section.......................................................................... 90 Forecasts for the lcr upper limit on ( T /S ) and the fractional error on the amplitude Ajens of the lensing 5-mode, for various levels of approximation to the fiducial realistic experiment...................................... 135 xiii Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. CHAPTER 1 IN T R O D U C T IO N 1.1 T he cosm ic microwave background Throughout the last decade, the cosmic microwave background (CMB) has been at the forefront of observational cosmology. On the theoretical side, the statistics of the CMB (and in particular, the dependence on cosmological parameters) can be pre dicted to great accuracy because the underlying physics is mostly linear perturbation theory of a photon-baryon fluid. At the level of current measurements of the CMB, it is not necessary to understand astrophysical processes such as galaxy formation, reionization, baryon cooling, and star formation in detail. The robust theoretical pre dictions have been accompanied by an astounding increase in experimental sensitivity. In Fig. 1.1, we show current measurements of the CMB tem perature anisotropy power spectrum (which we will define precisely below, Eq. (1.2)) from the WMAP satellite and several ground-based experiments. The CMB has been measured at the percent level and has provided precision constraints on several cosmological parameters. On the other hand, it was only 15 years ago th at the first detection of a nonzero signal was made by the COBE experiment (Smoot et al., 1992). Since this thesis is primarly concerned with data analysis techniques, it would take us too far afield to describe the physics which gives rise to the CMB in detail, but we will make a few comments to help orient the reader. Many types of cosmolog ical observations have confirmed a “standard cosmological model” in which the early universe consists of a hot dense plasma which expands and cools, consisting of radi ation, free electrons and protons, and dark m atter particles whose only interactions are gravitational. Initially, the plasma is nearly uniform, but contains small (~ 10“ ^) 1 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 2 90° 6000 2° A n g u la r s c a le 0.5° 0.2° WMAP A cb ar B o o m e ra n g CBI 1(1+1 )C t / 2 t t [/xK2] 5000 4000 3000 2000 1000 10 100 500 1000 Multipole m om ent I 1500 Figure 1.1: Measurements of the CMB tem perature power spectrum C j T from the WMAP satellite and several ground-based experiments (from Hinshaw et al. (2006)). Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 3 inhomogeneities which propagate as sound waves. When the tem perature of the plasma becomes sufficiently low (< 3000 Kelvin), free electrons and protons combine to form neutral hydrogen. This event is called “recombination” . After recombination, the universe is nearly transparent. The inhomogeneities no longer behave as sound waves, but slowly grow gravitationally to ultimately become the large-scale structure in the universe observed today. Photons which decouple from the plasma at recombination freestream through the transparent universe to form the CMB which is observed today. The CMB is a nearly uniform 2.7 Kelvin blackbody, with small (~ 10- 5 ) hot and cold spots in temperature, which roughly correspond to overdense and underdense regions of the plasma along the “surface of last scattering” corresponding to the epoch of recom bination along each line of sight. Let A T (n) denote the CMB tem perature, relative to the average tem perature T q = 2.7 Kelvin, in direction ii. fn Fig. 1.2, we show full-sky measurements from WMAP of the tem perature anisotropy AT. Note th at the scale of the plot is 200 microkelvin, several orders of magnitude below Tq. (A dipole anisotropy due to the E arth’s motion relative to the rest frame of the CMB has also been subtracted.) In the standard cosmological model, the initial fluctuations are Gaussian to an excellent approximation (meaning th at any Appoint correlation function is determined by the two-point correlation function using Wick’s theorem), ft is also true to an excellent approximation th a t the observed CMB is linear in the initial fluctuations (since the physics is described by linear perturbation theory), so (AT) can be treated as a Gaussian field. Therefore, the statistics of the CMB are completely determined by the two-point correlation function. To describe the two-point correlations, it is convenient to decompose A T into Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. -200 T(nK) +200 Figure 1.2: Full-sky observations of the CMB tem perature anisotropy A T (n) from the WMAP satellite, combined over several frequencies using the Internal Linear Combination (ILC) method (from Hinshaw et al. (2006)). Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 5 spherical harmonics Y^m(n): A T ( n ) = ^ 2 aJmYem{n ) (1.1) £m This operation (T —►ajm ) is the analog of the Fourier transform for a field defined on the sphere. The most general two-point correlation allowed by rotational invariance is of the form: (a£ma£'m') = ^ t ^(1'^rnm' (1-2) where this equation defines the power spectrum C j T . For purposes of constraining cosmological parameters, CMB experiments can be viewed as measuring the power spectrum (Fig. 1.1); this contains all information in the underlying tem perature map (Fig. 1.2). The CMB power spectrum contains a wealth of cosmological information; a brief (and incomplete) list of some cosmological observables follows: 1. The acoustic peaks in the power spectrum correspond to wavelengths in the plasma which undergo an integer number of oscillations before recombination, and have a characteristic angular scale la = 7T— (1.3) where D * is the angular diameter distance to recombination today, and s* is the total distance th at sound can travel before recombination. 2. The relative peak heights and the angular scale of the damping tail are deter mined by the ratios of radiation, baryons, and dark m atter before recombina tion. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 6 3. Through secondary gravitational effects such as lensing, and the decay of po tentials due to dark energy (the ISW effect), the CMB is sensitive to the physics of the late universe. 4. The Sunyaev-Zeldovich effect, or Compton scattering of CMB photons from hot electrons in galaxy clusters, produces small-scale anisotropy in the CMB which is tied to growth of structure at all redshifts. 1.2 CM B polarization In addition to the tem perature anisotropies previously defined, the CMB radiation is also predicted to be linearly polarized at a small but detectable level. Polarization is generated by Thomson scattering of the local tem perature quadrupole from free elec trons during recombination and reionization. Because the quadrupole anisotropy is supressed (by the same mechanism, Thomson scattering) at times when free electrons have high density, polarization is an even smaller signal than the CMB temperature. To describe the polarization, consider a line of sight direction n and choose coor dinates so th at n points in the z direction. The incoming CMB radiation will produce electric field components E x , Ey perpendicular to the line of sight. In this notation, the CMB tem perature considered previously can be written as an expectation value T(n) oc ( ^ ( n ) 2 + E y {n )2) (1.4) and linear polarization can be described by Stokes parameters Q, U defined by: Q(n) oc (Ex (n)2 - E y (n)2) (1.5) U{ n) oc (2Ex (n)Ey (n)) (1.6) Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 7 (There is a third Stokes parameter V describing circular polarization, which we have omitted since the CMB is not expected to be circularly polarized.) There is a multipole decomposition for CMB polarization, analogous to the tem perature decomposition in Eq. (1.1). The polarization case is more complicated be cause the Stokes parameters are not scalar quantities, but depend on the choice of local frame x, y along the line of sight n. There are two formalisms commonly used for describing linear polarization covariantly, which we will use interchangeably through out the thesis. The first formalism (Kamionkowski et al., 1997b) is to represent the linear polar ization as a traceless symmetric tensor ll a^ on the sphere: ( 1. 7) This is just the intensity tensor of the incoming radiation, with the trace projected out. (The traceless part represents linear polarization and the trace represents tem perature.) To define the multipole decomposition, we first define operators ( 1 .8 ) which take scalar fields to symmetric traceless tensors. We then define tensor spherical harmonics Y ^ and Y ^ by: y /{l ~ 1)1(1 + 1)(< + 2)£°i'V'!m (.£m)ab Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. ( 1. 9 ) 8 (The prefactors normalize so th at |Y^m|2 and | | 2 have integral 1 over the sphere.) It can be shown th at the tensor spherical harmonics form a complete orthonormal basis, so th at we can expand: iW n ) = a fm Y lm + Y Im a lm YPm (L1°) tm To motivate this decomposition into E and B components, note th at this is simply the tensor analog of the gradient/curl decomposition of a vector field. The “gradient like” operator £ab is second-order because two derivatives are needed for the result to transform as a tensor. The second formalism for describing linear polarization covariantly (Zaldarriaga & Seljak, 1997) is to use the machinery of spin-s fields. A spin s field (—oo < s < oo) is a function (s f ) whose value at x depends on a choice of orthonormal basis vectors {©1 ) 6 2 } at x - Under the right-handed rotation e?2 (s f ) = (cos0)ei + (sin 0 )e 2 = —( s i n ^ ) e i + ( c o s ^ ) e 2 (1-11) must transform as (s f )' = e~'is^(s f ) . In (Newman & Penrose, 1966; Goldberg et al., 1967), spin-s harmonics sY^m(n) are defined. These form an orthonormal basis for spin-s fields on the sphere, generalizing the ordinary spherical harmonics for s = 0. To relate this to CMB polarization, note th at the quantity ( Q ± i U ) is a spin-(±2) field, so we can expand: Q(n) ± iU(n) = Y ( ~ a %n ^ i a ? m ) ( ± 2 Yt m ( n ) ) £m Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (U12) 9 It can be shown th a t this version of the E-B decomposition is equivalent to the tensor version defined in Eq. (1.10). The E-B decomposition of linear polarization is closely related to parity invariance. Under the parity reversing operation n —> (—n), the E-mode component transforms as: afm ( - iA L ( i - 13 ) The same transformation law is obeyed by the tem perature multipoles a^m defined in Eq. (1.1). On the other hand, the B-mode component transforms with an extra minus sign under parity: “fm - ( - 1)m “?rn (1-14) In the tensor formalism (Eq. (1.10)), one can also think of this as follows: a polariza tion field IIab(n) is an E-mode if it can be written as S^cj) for some scalar potential 0(n). A B-mode can be written B ^ ip for a pseuodscalar field n). Power spectra C f E , C EB are defined for polarization (in analogy with Eq. (1.2)) by: (afm a£'m') = ('alm af'm ') = = C f E ^ll^m m ' 0 (L15) This is the most general two-point correlation which is invariant under rotations and parity (a nonzero EB power spectrum would transform as C EB —►(—C E B ) un der parity reversal and is therefore disallowed). In Fig. 1.3, we show the EE and BB power spectra th at are predicted for the WMAP fiducial cosmology, with one modification: we have included a gravity wave component in the initial conditions, Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 10 1000 Temperature 100 w 10 1 ■S' o .i E -m o d e polarization 0.01 0.001 : B -m o d e polarization 0.0001 10 100 Multipole 1 1000 Figure 1.3: Predicted TT, EE, and BB power spectra for the WMAP3 fiducial cosmol ogy modified to include a gravity wave component with T / S = 0.2. The BB spectrum contains contributions from gravity waves (the larger contribution for I < 100), and gravitational lensing of the primary E-mode signal (larger for t > 100). Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 11 with amplitude T / S = 0.2, for reasons th at will be apparent shortly. (We param eterize the amplitude of the gravity wave component by the tensor-to-scalar ratio T / S = (|h £ |2 + |h£ |2)/7^k|2) in the early universe.) It is seen th at the E-mode power spectrum is 2-3 orders of magnitude larger than the B-mode spectrum. This can be understood intuitively from parity considerations. The CMB polarization consists entirely of E-modes provided th at two assumptions are satisfied: 1. The processes giving rise to the CMB anisotropy can be described in linear perturbation theory, 2. The initial fluctuations can be represented by scalar quantities. B-modes can be generated when one of these assumptions breaks down. For the first assumption, there are several nonlinear phenomena which make small contributions; on the angular scales of Fig. 1.3, the most im portant is gravitational lensing, i.e. small deflections of the E-mode signal due to gravitational potentials between the observer and the surface of last scattering (Zaldarriaga & Seljak, 1998) This generates the broad B-mode signal centered at i ~ 1000. The second assumption breaks down if the early universe contains gravity waves (or “tensor modes” ), which are a prediction of some early universe models. In this case, the initial conditions include fluctuations dg^u in the spacetime metric which are not scalar quantities and generate both E and B modes in CMB polarization (Seljak & Zaldarriaga, 1997; Kamionkowski et al., 1997a). In Fig. 1.3, the B-mode signal at i ~ 100 is generated during recombination by the gravity wave background. (The second bump at i < 10 is also generated by gravity waves, but in the late universe after reionization has occured.) CMB polarization will ultimately have better sensivity to gravity waves than any other flavor of cosmological data, because the Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 12 gravity wave contribution is “separated out” into its own observable, namely the BB spectrum, where the only contaminating signals are from second-order effects such as gravitational lensing. 1.3 Scope o f thesis On the observational front, we are still in the early stages of measuring CMB po larization. Detection of a nonzero signal has been achieved by several experiments (Kovac et al., 2002; Readhead et ah, 2004; Barkats et al., 2005; Montroy et ah, 2006). The strongest detection to date excludes zero at the lOcr level (Sievers et ah, 0900). Looking further ahead, the next decade should bring precision measurements of polar ization power spectra, just as the previous decade has brought precision measurements of the tem perature spectrum. Many challenges, both instrumental and algorithmic, must be met along the way. This thesis is concerened with the algorithmic compo nent: analysis methods which are used to obtain measurements of EE and BB power spectra, starting from raw data written out by real-world experiments. In the first half (§2) of the thesis, we describe the design and implementation of the analysis pipeline for the 2005 observing season of CAPMAP. CAPMAP is an experiment targeting the small-scale E-mode component of the polarization, by surveying a ~ 8 deg2 region near the north celestial pole (NCP), with ~ 3 arcmin angular resolution. This high resolution was achieved through use of the Lucent 7meter Cassegrain telescope at Crawford Hill, New Jersey. The CAPMAP detector array consists of 16 correlation polarimeters with a total integrated sensitivity of ~ 250 nK, independently in W -band (90 GHz) and Q-band (40 GHz) frequency channels. The CAPMAP analysis pipeline is based on the likelihood formalism, which has Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 13 the advantage of giving optimal uncertainties on the final power spectra (and on intermediate quantities, such as pixel-space maps). The disadvantage of the full likelihood treatem ent is th at the computational cost rises steeply with larger datasets (for example, many dense matrix operations are needed with cost O(N^)). The CAPMAP dataset is small enough th at a likelihood-based pipeline is still relatively inexpensive, but in upcoming generations of CMB polarization experiments, this will no longer be the case, and fast approximate Monte Carlo based pipelines will be a practical necessity. Monte Carlo pipelines for CMB polarization suffer from a fundamental problem: existing fast methods used to estimate E-mode and B-mode power spectra do not preserve the E-B decomposition. Therefore, the presence of the much larger E-mode signal can act as a source of noise when estimating B-modes. It was first shown by Challinor & Chon (2005) th at this effect can become the limiting source of uncer tainty when measuring the BB power spectrum, when the instrumental sensitivity becomes sufficiently small. On the other hand, BB power spectrum measurements are a primary scientific target for future polarization experiments: the gravity wave B-mode will open a new window on the initial conditions of the universe, and the lensing B-mode will probe the growth of gravitational potentials on the largest ob servable scales. Therefore, this “E —> B mixing problem” poses a serious practical concern for future experiments. The second half of the thesis is concerned with solving this problem. We first (§3) modfiy the existing fast power spectrum estimator construction so th at the E-B decomposition is preserved. Using the modified estimators, the larger E-mode signal does not contaminate estimates of the B-mode power spectrum; instrumental noise and systematics are the only contaminants. This construction solves the E —> B mixing problem in principle, but in prac Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 14 tice one must confront a practical issue. The construction calls for a (scalar) weight function W (x), and vector/tensor counterterms Wa(x), Wic(x) which combine to de termine the statistical weight of each pixel in the experiment, and must satisfy dif ferential equations which are difficult to interpret in an irregular pixelization on the sphere. For the noise distribution of a realistic experiment, which may include irreg ular boundaries from Galactic or point source contamination, and ragged features in the noise, it is not clear how to choose these weight functions in practice. (W ithout our modification, a similar practical problem exists in Monte Carlo CMB analysis pipelines, but is somewhat more tame, since only the scalar piece W (x ) must be chosen.) In §4, we propose an algorithmic approach for choosing weight functions based on the noise distribution. Combined with the modified estimators from §4.4, this provides a complete solution to the E —> B mixing problem which can be applied to realistic surveys maps with complex features. We demonstrate the method for a simulated balloon experiment which includes small-scale fluctuations in the noise and point source masking, and show th at strong E-B separation is possible. We conclude th at E —►B mixing in the analysis need not be a limiting factor for future surveys. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. CHAPTER 2 ANA LY SIS OF 2005 DATA FRO M C A P M A P 2.1 Introduction CAPMAP is a collaboration between Princeton University, the University of Chicago, the University of Miami, and JPL (Jet Propulsion Laboratory) to measure the polar ization of the CMB using the 7m telescope in Crawford Hill, New Jersey. The 2005 CAPMAP d ata consists of 1800 hours of data taken between November 2004 and May 2005. Two independent analyses of this dataset were carried out at the University of Chicago and Princeton University. In this chapter, we will describe the design and implementation of the Chicago pipeline. At the time of this writing, results from the analysis have not yet been finalized. Therefore, we will illustrate the pipeline by presenting results from a simulated full-season dataset. However, for some intermediate results (such as the null test suite in §2 . 1 0 ), the analysis is sufficiently finalized th at results from the real data can be presented. 2 .1 .1 F ocal p l a n e a n d s c a n s tr a te g y The CAPMAP focal plane is shown in Fig. 2.1. Twelve W -band (90 GHz) polarimeters and four Q-band (40 GHz) polarimeters in four dewars are positioned at fixed locations relative to the focal plane center; the entire focal plane is then swept across the sky in a regular pattern. Due to the 7m telescope size, the angular resolution is excellent and the beam shapes are well-controlled: to a good approximation, each polarimeter’s beam can be treated as Gaussian, with angular size $FWHM = 3.3 arcmin in W-band, and #FW H M = 6.48 arcmin in Q-band. The sky region surveyed by CAPMAP is a small patch of sky near the north 15 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 16 2Q 2D 2B 2C OD ( ^ ) 3b ( .055 deg .45 deg OC OQ 3C 3Q .108 deg ,8 deg 3D OB 1C 1D 1B 1Q Figure 2.1: CAPMAP focal plane. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 17 Figure 2.2: CAPMAP scan strategy. In one 21 second scan cycle, each polarimeter scans a circle near NCP; full azimuthal coverage is obtained every 24 hours from the E arth’s rotation. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 18 celestial pole (NCP). The scan strategy is a “ringscan” : the focal plane center under goes uniform circular motion (with scan period 21 seconds) along a fixed circle, with radius 0.725 deg, approximately centered at NCP. The orientation of the focal plane is fixed in telescope coordinates; thus relative to the telescope, each polarimeter only “sees” a thin circular trajectory in each cycle. However, due to the E arth ’s rotation, the coverage of each polarimeter in celestial coordinates sweeps out full azimuthal coverage every 24 hours. (In telescope coordinates, one can imagine the sky slowly rotating around NCP.) In celestial coordinates, the coverage of each polarimeter is therefore an annular region centered at NCP. This is illustrated in Fig. 2.2. The total sky area surveyed by CAPMAP is 7.4 deg 2 in W -band and 9.2 deg 2 in Q-band. 2 .1 .2 P o la rim e te rs As the CAPMAP focal plane scans the sky, data from each of the 16 CAPMAP polarimeters is archived at a sampling rate of 104 Hz. Each polarimeter takes noisy measurements of the quantity U = (2Ex (n)Ey (n)} (2.1) where n denotes the instantaneous sky direction pointed to by the polarimeter, E x , Ey denote the components of the incoming electric field due to the CMB in the x, y directions relative to the polarimeter. In the polarimeter frame, only the Stokes U parameter is measured; but full Q /U coverage “on the sky” is obtained every 6 hours due to the E arth ’s rotation. Rather than writing a single 104 GHz data stream from each polarimeter, several streams are written which contain more detailed information. First, each polarimete r’s frequency range is split into subranges. The W -band polarimeters are sensitive Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 19 to CMB polarization at frequency uq a 92 GHz, with bandwidth A u ~ 15 GHz. This frequency range is split into low, medium, and high subranges with A u ~ 5 GHz. In Q-band, the polarimeters are sensitive to uq ~ 40 GHz with bandwidth A u « 10 GHz; this is split into low and high subranges with A u « 5 GHz. Second, each polarimeter takes total power measurements in addition to polariza tion. There are two total power channels in each polarimeter, which measure (E%) and (Ey) in the detector frame. In principle this includes contributions from the CMB tem perature anisotropy, but the CAPMAP total power channels are only intended to monitor the sky tem perature and do not have sufficient sensitivity to measure the CMB tem perature power spectrum. Finally, in each polarimeter, we generate “quadrature” versions of each of the po larization and total power channels previously described. The quadrature channels are constructed in such a way th at they have zero response to sky polarization or tem perature, but share many of the noise properties of the ordinary (or “demodulated” channels). To explain this in a little more detail, we remark th at the demodulated channels are obtained by removing a 4 kHz square wave modulation th at is applied early in the hardware in order to minimize ( 1 //) drifts. The quadrature channels are obtained by putting the input and output 4 kHz square waves 180° out of phase, so th at the combination is equivalent to modulating the sky signal with an 8 kHz square wave, more than rapid enough to destroy any signal from the sky. Thus the total number of 104 Hz data channels, including demodulated and quadrature data, is ten for each W -band polarimeter and eight in Q-band. The channels are listed in Tab. 2.1, where we also introduce notation used to identify each channel: a 2-character polarimeter name (e.g. 0Q or OB, see Fig. 2.1), followed by a 2-character channel name (S0/S1/S2 for polarization channels or D0/D1 for total power). We denote quadrature channels with a subscript “q” . Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 20 0BS0 0BS1 0BS2 0BD0 0BD1 0BS0q 0BSlq 0BS2q OBDOq OBDlq 0QS0 0QS1 0QD0 0QD1 OQSOq OQSlq OQDOq OQDlq W -band polarimeter OB Demodulated polarization, 87 ± 2.5 GHz Demodulated polarization, 92 ± 2.5 GHz Demodulated polarization, 97 ± 2.5 GHz Demodulated total power, (E^) component Demodulated total power, (E^) component Quadrature polarization, 87 ± 2.5 GHz Quadrature polarization, 92 ± 2.5 GHz Quadrature polarization, 97 ± 2.5 GHz Quadrature total power, (E^) component Quadrature total power, (Ey) component Q-band polarimeter 0Q Demodulated polarization, 37.5 ± 2.5 GHz Demodulated polarization, 42.5 ± 2.5 GHz Demodulated total power, (E^) component Demodulated total power, ('Ey) component Quadrature polarization, 37.5 ± 2.5 GHz Quadrature polarization, 42.5 ± 2.5 GHz Quadrature total power, (E^) component Quadrature total power, (Ey) component Table 2.1: List of CAPMAP data channels, illustrated for W -band polarimeter OB and Q-band polarimeter 0Q. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 21 | Simulation | I |Cycle Finding| 1 |Data Selection! 1 | Map Making | ________________ / |Map Combining! I X I Null Maps | • Likelihood A nalysis! / X Null Test Suite Final Power Spectrum Figure 2.3: CAPMAP analysis pipeline 2.2 P ip elin e overview The CAPMAP analysis pipeline is a maximum likelihood pipeline which obtains optimal power spectrum uncertainties (Bond et al., 1998) given models for pointing and instrumental noise. The pipeline stages are shown in Fig. 2.3 and are briefly described as follows: 1. Simulation (§2.3): in this stage, we simulate the raw CAPMAP timestreams in the same file format as the real data. The simulations are used strictly for testing the analysis pipeline. 2. Cycle finding (§2.4): in this stage, we partition the timestream into ringscan “cycles” based on pointing, and discard data whose pointing is anomalous. 3. D ata selection (§2.5): we discard data which fails cuts for weather effects, Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 22 glitches, sun contamination, and anomalous gains. 4. Map making (§2.6): all timestream data which survives the cuts in steps 2-3 is processed into a pixel-space map with associated noise covariance matrix. 5. Likelihood analysis (§2.7): from the map and noise covariance computed in step 4, power spectrum uncertainties are computed using a likelihood function. The likelihood function can either be maximized to give a point estimate for the power spectrum, or explored with Markov Chain methods to give marginalized errors on bandpowers. 6 . Combining maps (§2.8): this is a general-purpose program for combining two maps, made from disjoint subsets of data (for example, all W -band data and all Q-band data), which optimally weights the two datasets. 7. Null maps (§2.9): similarly, two maps made from disjoint subsets of data can be subtracted to obtain a null map. If the two input maps are consistent, then the null map should be “pure noise” , i.e. contain zero signal. 8. Null test suite (§2.10): we create a suite of null maps based on dividing the d ata in different ways, and perform a likelihood analysis on each null map, to test for consistency with zero power. 9. Final results (§2.11): we conclude by presenting final results obtained from the simulated dataset and comment briefly on their implications. Before diving into the details, it will be helpful to describe a few general features of the CAPMAP analysis which will be used throughout the pipeline: the models for pointing, gains, and instrumental noise. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 23 2 .2 .1 P o in tin g In each 104 Hz sample i, the CAPMAP encoder records instantaneous values of the elevation Elj and azimuth Az?; of the focal plane center. Due to systematic errors in telescope pointing, from effects such as telescope sag and atmospheric refraction, the true pointing will deviate slightly from the encoded values. There is a pointing model for the CAPMAP telescope which includes many such effects, and the free parameters in the model can be fit from observations of known point sources outside the near-NCP survey region. We will not describe the pointing model in detail here, but simply remark th at for observations near NCP, it is an accurate approximation to condense the model into a pair of offsets A (El), A(Az) such that: Elj (true) = Elj (encoder) —A(E1) Azj(true) = Azj (encoder) —A(Az) (2.2) The pointing offsets in Eq. (2.2) can be applied at a very low level in the analysis pipeline (e.g. when the samples get read from disk) and so the full complexity of the pointing model can largely be ignored once the offsets have been computed. Whenever we write “Elj” or “Azj” in the remainder of this chapter, it is assumed th at the pointing offsets in Eq. (2.2) have been applied. The actual values of the offsets are small, ~ 0.9 arcmin. Note th at Eq. (2.2) implies th at the many-parameter pointing model is degenerate near NCP (the same is true in any small patch of sky), which is one reason why it is necessary to fit the model using point source observations outside the survey region. A second reason is th at the survey region has deliberately been chosen to be free of point sources which would contaminate the CMB observations. The encoder also writes a timestamp for each 104 Hz sample (in Universal Coor- Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 24 dinated Time, UTC). Prom the sample timestamp, azimuth, and elevation, and the positions of the polarimeters in the focal plane, the location of the polarimeter in celestial coordinates (right ascension and declination) and its polarization angle can be computed. In the analysis pipeline, we make the approximation th at the polarime ter measures a linear combination (determined by the polarization angle) of Stokes parameters in whichever Healpix (Gorski et ah, 2005) pixel contains the polarimeter location. (This approximation breaks down if the pixels are comparable to or larger than the beam size; in §2.7 we will use a pixel window function to correct for the effects of this approximation.) We typically use Healpix resolution N ns[^e = 1024, corresponding to 3.5 arcmin pixels. We introduce the pointing matrix Pip, where i indexes a timestream sample and p indexes a sky pixel and choice of Stokes parameter, defined as follows: Pip = contribution to sample i from pixel value p (2-3) By the discussion in the preceding paragraph, Pip is a sparse matrix with two nonzero entries in each row, corresponding to the two Stokes parameters in the pixel containing the instantaneous polarimeter location. As we will see in §2.6, it is convenient to introduce the pointing matrix because the map making procedure becomes more transparent in m atrix notation, e.g. the relation between the timestream data di and the sky map sp can be compactly written: di = PipSp + noise. (2-4) As the focal plane moves through its ringscan orbit, we define the ring angle p as Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 25 follows: f f™ \ Az i ~ — Az oj \ p def+ = tan - i ^costE li)— ( 2 .5 ) where Azq and E1q denote the azimuth and elevation of the ring center . The scan direction is such th at p is increasing throughout the cycle. 2 .2 .2 G a in m o d e l The raw d ata from the CAPMAP polarimeters is encoded as voltages, and each chan nel has an overall gain which relates the measured volatages to the sky polarization in microkelvin. There is a CAPMAP gain model which gives the gain of each channel as a func tion of time; in the pipeline, we write the gains to an auxiliary file as a preprocessing step. We will not describe the gain model in detail, but just make a few brief com ments. At various points in the observing season, calibration data was taken, either from observations of the known point source Tau A, or a nutating conductive plate positioned in front of the secondary mirror. (Roughly consistent gain measurements are obtained from both calibration sources.) Empirically, we find three distinct gain periods in the data; some of the channels experience abrupt gain changes from one period to the next. Throughout each gain period, the time dependence of each channel’s gain is well-approximated by a fit of the form (Go + G\t). We fit the calibration measurements independently in each of the three gain periods. 2 .2 .3 N o ise m odel First, we describe the CAPMAP noise model, which will be an im portant ingredient throughout the pipeline. We find that, to an excellent approximation, the instrumen- Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 26 Temperature (mK) 20 .00 60 40 -5 -1 0 Figure 2.4: Raw CAPMAP timestream data, downsampled to 1 Hz and plotted over an arbitrarily chosen 1 0 0 sec interval, showing the sinusoidal atmospheric signal on top of white noise fluctuations. tal noise is uncorrelated between channels. Therefore, it suffices to consider a single channel in isolation. In a single channel, we denote the timestream noise by nj, where the index i = 1 , 2 , . . . , N toci runs over timestream samples. We will assume th at the noise is Gaussian, so th at the noise model is completely specified by the assumed noise covariance N^j = Cov(rq, rij). The simplest possible noise model would be to assume no correlations between samples i ^ j : (A fo )ij = (2 .6 ) where we have allowed a time-dependent variance of. The first deviation from the noise model in Eq. (2.6) th at is evident in the data is a ~ 1 pK sinusoidal atmospheric signal, shown in Fig. 2.4. (Although the atmosphere is Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 27 expected to be unpolarized, the small level of tem perature leakage in the radiometers combines with the large sky tem perature deviations over the ringscan, to create a significant signal.) We can incorporate this in the noise model by adding a term which assigns a huge variance to any “ring mode” , such as the cos(p) mode shown in Fig. 2.4, which is to be marginalized. 5 -/V cycles {NR)ij — ° f ^ij + e R-iaRja (2-7) a= 1 where e is a small regulating constant. In Eq. (2.7), the index a runs over independent ring modes which are to be marginalized, and R l0 denotes the value of mode a at time i. In each ring cycle, we marginalize the lowest five Fourier modes in the ring angle p: an additive constant (representing the DC level which is arbitrary in each radiometer), sin(p), cos(p), sin(2p), and cos(2p). These five modes are independently marginalized in each cycle; therefore the total number of marginalized ring modes is (57VCyCjes) as indicated in Eq. (2.7). Intuitively, marginalizing ring modes in each cycle amounts to a form of high-pass filtering, which eliminates any slowly varying mode in the data. (We will see how this works in more detail in §2.6.) Therefore, in addition to removing the atmospheric contamination, marginalizing ring modes in the noise model does “double duty” by filtering out ( 1 / / ) noise. Let us briefly mention one subtlety here: the assumption th at the noise is uncor related in different channels breaks down for the lowest Fourier modes in each cycle. This is because some contributions, such as the sky temperature, are common to all channels. However, because these modes are marginalized, their values are completely filtered out, and we can treat the noise as uncorrelated between values (since this is true for all modes which are retained). Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 28 There is still one more ingredient in the CAPMAP noise model which we will add to Eq. (2.7). We marginalize any timestream mode which is ground-synchronous, i.e. which has the same dependence on ring angle p in all cycles in a contiguous run. (This ground-synchronous structure was an im portant contaminant in the 2003 CAPMAP dataset (Barkats et al., 2005), and much smaller in the 2005 dataset analyzed here, but can still be detected at many sigma using the null test suite to be described in §2 . 10.) To marginalize ground-synchronous modes, we divide the range of values for the ringscan angle p into = 108 “ring bins” between p = 0 and p = 2ir. Each ring bin then corresponds to one mode which will be marginalized. More precisely, we take the noise covariance m atrix to be: 5 ^ c y c le s Nij = ai^ij + 6 ^ rb RiaRja 1 a= 1 + e 1 ^ SibSjb (2 .8 ) 6=1 The last term corresponds to ground-synchronous mode removal; the index b ranges over ring bins, and the m atrix element S $ is equal to one if timestream sample i falls in ring bin b, and zero otherwise. Eq. (2.8) is the final noise model th at we will use when analyzing CAPMAP. The per-sample variance af, which is time-dependent and different in each channel, is not computed from a model, but rather obtained from the data, by assuming th at cr| is constant in each ringscan cycle, and fitting its value independently for each cycle and polarization channel. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 29 2.3 Sim ulation The CAPMAP simulation pipeline produces binary data files in the same format as the raw data. The telescope pointing and timestamps in the simulations can either have simulated ideal values (with specified sampling rate, scan radius,center, and speed) or“tem plated” from the real data, in order to include realistic deviations from ideality such as wobbly orbits and dropped samples. The simulated signal is the sum of several contributions which we now describe in detail. First, there is a CMB signal which is simulated by first randomly generating a Gaussian CMB in harmonic space: afm — Gaussian, mean=0, varianc e = C ^ E afm = Gaussian, mean=0, varianc e = C f B (2.9) then taking the spherical transform to obtain the CMB in pixel space: (Q ± iU)(x) = T ia?m)(± 2 U m (x)) (2.10) lm We simulate the CMB at very high pixel resolution (Healpix = 4096, corre sponding to 0.9 arcmin pixels) so th at no correction for the pixel size is necessary, but we do include convolution with the Gaussian beam, by using beam convolved power spectra when simulating the a^TO:s in Eq. (2.9). convolved = i ^ l )unconvolved x exP ^ 8~ln~2 ^ (Recall from §2.1.1 th at we treat each CAPMAP beam as Gaussian with width #FWHM = 3.3 arcmin in W -band and #f\VH M = 6.48 arcmin in Q-band.) Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 30 When we include the CMB component in the simulated timestreams, we simply use the value of the CMB at the center of whichever pixel contains the detector during the timestamp of the sample. (That is, no interpolation is performed.) However, we do perform parallel translation between the pixel center and detector location, to avoid artifacts when the detector passes close to NCP. We checked th at this simulation procedure has converged: if we double the resolution and rerun the end-to-end analysis pipeline with identical RNG seeds, the effect on the final power spectrum is at most 1 0 - 3 cr. In addition to the CMB component, the simulations include white noise, obtained by adding an independent Gaussian random value to each sample. The noise level is radiometer-dependent (and based on fits to data) but assumed constant in time. 2.4 C ycle finding The simulation pipeline described in the previous subsection is not needed in principle to analyze the data, but is invaluable in practice for code testing. Beginning with this section, we describe the analysis pipeline which starts from the raw time-ordered data, and ultimately produces measurements of polarization power spectra, as well as several consistency checks. The first step is simple: we divide the timestreams to be analyzed into a sequence of ring cycles. We define cycles to start and end when the ring angle p (Eq. (2.5)) crosses zero. Thus all cycles are kept “in phase” . The cycle-finding program contains pointing cuts: a cycle is discarded if it contains one or more bad samples. A sample is considered “bad” if any of the following are true: 1. The encoder UTC or az/el shows an “obvious” glitch (e.g. there are a few onesample events where UTC=0 was encoded). This removes a negligible fraction Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 31 of the data. 2. The distance between the encoder pointing and ring center differs from the intended ring radius by more than 0.06 deg. 3. The time difference to the previous sample exceeds 0.02 sec, indicating dropped buffers. 4. The value of p is less than or equal to the value from the previous sample. This test catches extended periods where the telescope was stopped. 5. The value of p increased, relative to the value from the previous sample, by more than 0 .8 6 deg. Additionally, a cycle is cut if the final population count n{ for any ring bin i is outside the range 0 .6 ( 2 1 8 8 /^ ^ ) < nl < 2.0(2188/A/ring), where N r-Ulg is the number of ring bins. The pointing cuts remove about 4% of the data, most of which consists of stopped telescope motion. 2.5 D ata selection In the next stage of the pipeline, each cycle found in the preceding stage is inspected, and cuts for anomalous behavior are imposed. It is outside the scope of this thesis to describe the data selection in complete detail, so we simply outline the procedure: 1. Anomalous total power: we fit each CAPMAP total power channel to a linear combination of the lowest five Fourier modes in the ring angle p, i.e., the fit is of the form Aq + A \ cos(p) + A 2 sin(p) + A 3 cos(2p) + A 4 sin( 2 p) Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (2.12) 32 We sum the x 2 °f this fit over the nine most stable total power channels to obtain a “grand x 2” value, and discard cycles whose grand x 2 is anomalously high. The grand x 2 is a monitor for execess variance in the total power channels as a monitor for excess variance in the total power channels which would indicate bad weather or other sources of flaring up. 2 . Anomalous polarization: analogously, we fit Eq. (2 . 1 2 ) to the data in each polarization channel, and sum over channels to obtain a polarization “grand X2” . In this case, we cannot cut too tightly without introducing bias, since the cut is based on the same polarization channels used for the power spectrum analysis, but simulations show th at a 3cr cut is safe. 3. Glitches: we exclude a cycle if the maximum excursion (maximized over samples and channels) from the five parameter fit in Eq. (2.12) is anomalously high, indicating a “glitch” from instrumental sources or isolated events such as birds or airplanes. 4. Sun: we identified regions of data in which sidelobes from the Q-band beam pass close enough to the Sun th at significant excursions are seen. The sun cuts have 24-hour period, are only in Q-band, and are dewar-dependent. 5. Anomalous gains: several regions were cut by hand in which the gain of a specific polarimeter was found to have an anomalous value. The output is a set of “surviving” cycles together with a 44-bit mask in each cycle indicating which radiometers are considered valid during the cycle. (In principle, the pipeline allows the cuts to be independent in each channel, although in practice the channel dependence is small, arising only through the sun and anomalous-gain cuts.) Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 33 2.6 M ap m aking Each CAPMAP timestream can be decomposed into signal + noise as follows: di — PipSp + 7ii (2.13) Here, didenotes the timestream values, represented as a vector oflength N tocp The CMB signal in pixel space is represented as a length-(2Apjx)vector sp , where the index p denotes a combination of a pixel and choice of Stokes parameter. The pointing matrix Pip was defined in §2 .2 . 1 : it encodes the contribution to timestream sample i due to the pixel value sp . The noise rij is modeled as Gaussian, with noise covariance Njj = Cov(rq, rij) given by Eq. (2.8): N = N 0 + e~ 1R T R + e - 1S T S (2.14) where Nq is diagonal, and the matrices R, S contain ring modes and ground-synchronous modes to be marginalized. The objective of map-making is to invert Eq. (2.14) statistically, arriving at an estimator f p for the CMB signal, with associated noise covariance matrix Npq =f Cov(sp ,Sq) (2.15) It is well-known (Tegmark, 1997) th at there is an optimal choice of estimator 'sp, defined by the pair of equations: = V p* N v dJ = Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (2-16) 34 In the map making stage of the pipeline, we compute s'p and Npq, starting from the timestream data dt, the pointing data Pipi and the noise model in Eq. (2.14). If dense m atrix algebra with fVrto(j-by-jVtocj matrices were computationally feasible, then map making would be essentially trivial by direct use of Eqs. (2.16). The rest of this section is devoted to presenting an algorithm for computing s' and N which is mathematically equivalent, but whose computational cost is linear in the timestream size N tod. The (^(A^qj ) algorithm proceeds in three steps. First, consider the following matrix and vector quantities: P N q 1P T pn ^ s 1, sn 0\ st PN^d SN ^d P N q 1R t R N q 1R t SN ^ r T RN ^d Note th at the first two are identical to the inverse noise N ~ (2.17) 1 and inverse noise weighted map N - 1 s' defined in Eqs. (2.16), except th at the full noise covariance N has been replaced by the diagonal version N q. (The motivation for considering the remaining 7 quantities will be explained shortly.) Because N q ^ is diagonal, and the matrices P, R, S are sparse, it is easy to see th at Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 35 each of the quantities in Eq. (2.17) can be computed in time 0 ( N t0ij). In implemen tation, we read each timestream sample from disk and accumulate its contribution before reading the next sample; the contibutions from each sample are independent. In the second step, we iV” 1 define an inverse noise covariance = lim(A^o + e“ 1JRTi ?)“ 1 = N q 1 - P N q 1R t {RN q 1R t ) - 1R N q 1P t (2.18) in which ring modes have been marginalized but not ground-synchronous modes. Now consider the matrix and vector quantities: P N ~ l P T = P P = P N q 1S 7 N ^ S 7 SN ^S7 N ^ P 7 - P N q 1R T ( R N q 1R T ) ~ 1R N q 1P T - P N q 1R 7 ( R N q 1R 7 ) ~ l R N q 1S T = S N q K s 7 - S N q 1R 7 ( R N q 1R T ) - 1R N q 1S T P N ~ l d = P N ^ d - P N Q 1R T ( R N ^ 1R 7 ) - 1R N Q 1d SN ^d = S N ^ d - S N Q 1R 7 (RN Q 1R 7 ) - 1R N Q 1d (2.19) On the right-hand side, we have written each quantity as the “diagonal” version accumulated in the first step, minus a “ring mode correction” which can be expressed in terms of diagonal quantities accumulated in the first step (Eq. (2.17)). It is easy to see th a t the ring mode correction is independent for each cycle, and for the m atrix quantities listed, amounts to a rank-5 update of a larger matrix. In implementation, we apply the ring mode correction at the end of each cycle, to update the diagonal quantities in Eq. (2.17) to the quantities in Eq. (2.19) which fully incorporate ring mode removal. Finally, for the third step, we consider the full CAPMAP noise model with ring Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 36 mode and ground-synchronous mode removal: N~l = (N R + e~1S T S r = N r 1 - N ^ S t {£ + S N ^ S ^ S N j i 1 1 (2.20) We have kept a small nonzero regulating constant e in the definition of 7W 1, rather than taking the limit e —> 0 as in Eq. (2.18), for technical reasons: the operator ( S N ^ 1S t ) appearing on the right-hand side is not invertible, due to the presence of a few timestream modes (such as a mode which is constant over the entire timestream) which are removed by both ring and ground-synchronous mode removal. Now consider the quantities: PN~^Pt = P N ]i1P T - P N - 1S T (e + S N - 1S T ) - 1S N ~ 1P T (2.21) P N ~ l d = P N ~ l d - P N ^ S T (e + S N ^ 1S T ) ~ 1S N ~ 1d which incorporate both ring mode and ground-synchronous mode removal, and are the output (Eq. (2.16)) of the map making pipeline stage. We have w ritten each quantity as the “ring mode only” version accumulated in the second step, minus a “ground-synchronous mode correction” which can also be computed from quantities accumulated in the second step (Eq. (2.19)). In implementation, we apply the uni versal structure correction at the end of each contiguous series of raw data files, to update the ring mode removed quantities in Eq. (2.19). In summary, we have arrived at a three-stage algorithm for computing the opti mal map and associated noise covariance matrix in Eq. (2.16) with ring and groundsynchronous mode removal. We iterate over 104 Hz samples, accumulating the (in dependent) contribution of each sample to the “diagonal” quantities in Eq. (2.17), Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 37 which do not include mode removal. Whenever the end of a ringscan cycle is reached, we use Eqs. (2.19) to update the diagonal quantities to their “ring mode corrected” versions which incorporate ring mode marginalization but not ground-synchronous marginalization. Whenever the end of a contiguous series of data files is reached, we use Eqs. (2.21) to update the ring mode corrected quantities to their groundsynchronous corrected versions which incorporate the full CAPMAP noise model. The final output is the optimal map and noise covariance m atrix given in Eq. (2.16). This three-stage algorithm can be implemented efficiently; we find a running time of 25 CPU-min per 100 hours of raw data which is processed. 2.7 Likelihood analysis Given a map and associated noise covariance (N , m ), which could either come di rectly from the map-making program (§2 .6 ) or from subsequent map combining and differencing operations (§2.8, 2.9), we estimate the underlying power spectra using a likelihood analysis. At an intuitive level, the likelihood analysis amounts to “looking for excess covariance” in the data (beyond the noise covariance N ), by exploring the posterior likelihood C [ C f E , C f B , C f B \m] (2.22) for the polarization power spectra, given the observed map m. It can be proved (Bond et al., 1998) th a t the likelihood analysis we will describe is an optimal procedure for estimating power spectra, in the sense th at statistical lower limits on the power spectrum uncertainties (the Cramer-Rao inequality) are saturated. First, we need an expression for the signal covariance due to an assumed set of power spectra C E E , C BB, C E B . This is computed in Appendix C, Eqs. (C.14), (C.15), where the covariance between any pair of pixels x , y is computed in the “two-point” Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 38 frame in which the Stokes basis at each of the points x , y is aligned with the great circle direction between the two points. Looping over each pair of pixels x, y, and rotating back to the “global” Q /U basis, results in a (2 Np[x )-by- (2 N p[x) signal covari ance m atrix S which depends on the power spectra. (We mention one optimization: in implementation, we sample the correlation functions with arcminute spacing in 6Xy = cos- 1 (x • y), and evaluate at the desired separation using cubic interpolation. This allows us to compute signal covariance matrices “on the fly” in a few CPUseconds whenever needed throughout the analysis pipeline.) Next, let us first write the likelihood £[m|5'] of obtaining the noisy map m, given noise covariance N and signal covariance S. This likelihood is simply a multivariate Gaussian with zero mean and covariance given by (S + N): £[m\S] = det _1/ 2(5 + N ) exp + N ) ~ lrm ^ (2.23) Using Bayes’ theorem, this can be inverted to give the posterior likelihood for a given signal covariance S, given the noisy map m: £[S'|m] oc det _ 1 / 2 (S' + N ) exp ^ - - r n ^ ( S + P[S] (2.24) We have introduced a prior P[S] and used the symbol oc to indicate th at the quantity on the right-hand side should be normalized to have integral 1 . The likelihood function in Eq. (2.24) can be used to estimate power spectra and assign power spectrum uncertainties, as we will now see. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 39 2 . 7 .1 B a n d p o w e r e s t i m a t e s : m a x i m u m lik e lih o o d The first application of the likelihood function C[S\m] (Eq. (2.24)) is simple: byfinding the power spectra which maximize the likelihood, we can obtain estimates for the power spectra given the noisy data m. Since the small CAPMAP survey area limits its spectral resolution to A I ~ 300, we parameterize the power spectra by a small number of bandpowers A^; our default choice is to take 4 bandpowers for each of { E E , B B , E B } , corresponding to the angular ranges (^mjn, £m&x) — (2, 500), (500,800), (800,1100), and (1100,1400). Thus the signal covariance matrix S can be written: S = Y ,& b S b b (2.25) where S is the signal covariance due to unit power (i.e. A^ = 1) in the single EE, BB, or EB band indexed by b. By default, we use flat bandpowers: “unit power” means th at £(£ + l)Cf/(2ir) is equal to 1 throughout the band. In EE, we occasionally use bandpowers which are multiples of fiducial: “unit power” means th at has the predicted values (see Fig. 1.3) throughout the band. This allows us to easily check consistency with the theoretical prediction: the measurement and theory agree if Aj, = 1 (within uncertainties) for EE bandpowers, and A^ = 0 for EB /BB bandpowers. Substituting Eq. (2.25) into Eq. (2.24), the posterior likelihood for the bandpowers A 5 can be written logC[A\m] = ( —~Tr log(S + N) - ^ d T (5 + fV)- 1 ^ (2.26) where we have assumed a uniform prior P[A^] in the bandpowers, and dropped an overall constant on the right-hand side. In this formalism, estimating the power spectrum is equivalent to maximizing the multivariate function log £[A|m] in the Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 40 variables A 5 . Several approaches to this maximization problem have been proposed in the literature; as we now explain, we use a new approach. Our basic strategy is to use the BFGS maximization algorithm (Press et al., 1992), which requires two ingredients: first, an efficient procedure for computing the gradient 5(log£)/3A ^; and second, a procedure for maximizing log £ when restricted to an arbitrary line in parameter space, i.e. finding the value of t which maximizes the quantity log£(f) = log£[A^0) + t A ^ } (2.27) where the vectors A ^°\ A ^ are arbitrary. For the first ingredient, computing the gradient of log £ , straightforward differen tiation of Eq. (2.26) gives: 31og£[A|m] dAb ilY ( (5 + JV)-1^ ) + \ ™ T (S + N ) - l S b(S + N T ' m (S + N) 1 + mmT (2.28) where in = (S + A )_ 1 m. In the second line, we have written the gradient in the form Tr(MS),); once the matrix M has been computed, this can be computed in a few CPU-seconds using the cubic interpolation trick from the beginning of §2.7, without needing to store the matrices S &in dense form. (The linear operation M —> TV(M S 5 ) is the transpose of the linear operation A& —> ^b^b- and any efficient algorithm for the latter can be converted formally to an efficient algorithm for the former.) The computational cost of computing the gradient in Eq. (2.28) is therefore dominated by the single dense m atrix operation (S + N) —> (S + iV)- 1 . The second ingredient needed for the BFGS algorithm, solving the single-variable Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 41 maximization problem in Eq. (2.27), initially appears to require many evaluations of the likelihood function, but we introduce a trick. We compute the dense matrices SS = (2.29) b S0 + N = ( ^ A < 0 )s ( j + J V (2.30) and then compute relative eigenvalues Xt and eigenvectors vt , defined by {SS)vi = Xi (S0 + N ) v i (2.31) with normalization v f ( S 0 + N)vj = 5ij (2.32) We also compute the components /i? = rnTvl of the map m in the eigenbasis. A short calculation now shows th at the ID likelihood in Eq. (2.27), obtained by restriction to a line in parameter space, is given by (up to an irrelevant overall constant): logc(t) = ^iog(i + tXi) + Y + t x l j (2-33) This shows th at a single dense m atrix operation (computing relative eigenvalues and eigenvectors) is sufficient to write the likelihood at all points in the ID line defined in Eq. (2.27) in a simple closed form. Once this has been done, the likelihood can be maximized along the line using brute force evaluation, at a negligble computa tional cost (because dense m atrix algebra is 0(N^-]yi) whereas the cost of evaluating Eq. (2.33) is <9(7Vpix)). Combining these procedures for computing the gradient (Eq. (2.28)) and solving Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 42 the ID maximization problem (Eq. (2.27)), we can use the BFGS algorithm to con verge to the maximum likelihood bandpowers. In comparison with other approaches such as Newton-Raphson, this method has several advantages. First, it is fast: only two dense m atrix operations (inversion and solving an eigenproblem) are needed in each iteration. If we needed the matrix of second derivatives, <92 log£[A |ra] 8 A bd A v (2.34) - d T (S + N ^ S b i S + iV)_ 1 5 6/(5 + N ^ d then m atrix operations would be required. Second, because each iteration is based on solving the ID maximization problem (Eq. (2.27)) the value of the likelihood is guaranteed to increase in each iteration, and the region of parameter space where (S + N) is not positive definite (so th at the likelihood is undefined) is never entered. This is because in the “fast” form Eq. (2.33)), the positive definite region is given by (—1/A) max < t < ( —1/A)min, so we can easily avoid leaving it. In contrast, in other iterative methods, it is possible to “bounce” outside the positive definite region, which complicates the implementation. 2 .7 .2 B a n d p o w e r u n c e r t a i n t i e s : c o n d itio n a l lik e lih o o d s The procedure described in the previous subsection produces maximum likelihood estimates for the bandpower values A b. However, this procedure is incomplete without some means of estimating bandpower uncertainties as well. This requires exploring the likelihood function away from maximum. The simplest method for estimating the uncertainty on bandpower A& would be Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 43 to consider the conditional likelihood \ogC[Ab\ m , A b^ b] (2.35) on A b. with the other bandpowers A y^.b fixed to their maximum likelihood values. If the bandpowers were independent, this would preserve all information in the joint likelihood, but in reality there are correlations (to be discussed in detail below) which make this method approximate. Because the conditional likelihood log C[Ab\m, A b/^Lb] is obtained by restricting the joint likelihood to a ID line in parameter space, it can be computed efficiently using the the eigenvalue trick from §2.7.1. As described there, after a single £>(Npix) dense m atrix operation, the conditional likelihood can be w ritten in a simple form (Eq. (2.33)) in which the cost of subsequent evaluations is 0 ( N pix ). This allows us to compute the entire ID likelihood quickly, as illustrated in Fig. 2.5, where we show conditional likelihoods log C[Ab\m, A b>^b\ for each band b, with the other bands b' ^ b mixed to maximum likelihood values. Because the bandpowers are not independent, this set of conditional likelihoods does not contain all the information in the joint likelihood logC[Ab\m\. We can get a quick look at band-band correlations by plotting the two-band conditional likelihood on bands i»i, 6 2 , with the remaining bands b' 7 ^ , 62 fixed to maximum likelihood values: 1°S j A ^ IWl) A y ^ bl^2i (2.36) We sample this 2D likelihood efficiently by using the eigenvalue trick from §2.7.1 to compute Eq. (2.36) on a collection of ID lines which pass through the maximum likelihood point. This is illustrated in Fig. 2.6, where we show a contour plot of the Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 44 r~n M I | I I I I | I I I I | I I I ]_ LiII 0 2 4 6 8 E E / fi d [2 , 5 0 0 ] 0 E E / f i d [5 0 0 ,8 0 0 ] j i ] i i i | i m i. B B /fla t [5 00 ,80 0 ] J. B B / f l a t [ 8 0 0 ,1 1 0 0 ] B B / f l a t [ 1 1 0 0 ,1 4 0 0 ] J. Vi-li II h I i jJJl -5 10 E B / f l a t [ 2 ,5 0 0 ] E E /fid [110 0,1 40 0 ] | i i i | i i i | i r -.A i i fr-i-i 0 10 2 0 30 40 B B / f l a t [ 2 ,5 0 0 ] E E / f i d [ 8 0 0 ,1 1 0 0 ] 2 0 -1 0 0 10 20 E B /fla t [500,800] 20 40 - 5 0 0 50 E B / f l a t [ 8 0 0 ,1 1 0 0 ] E B / f l a t [ 1 1 0 0 ,1 4 0 0 ] Figure 2.5: Conditional likelihoods for EE, BB and EB bandpowers, with remaining bandpowers fixed to maximum likelihood values, for simulated W -band (black/solid) and Q-band (blue/dotted) full-season data. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 45 20 15 10 5 0 ■5 0 10 20 30 40 50 Figure 2.6: Conditional likelihood for lowest-^ EE and BB bandpowers (on the x-axis and y-axis respectively), with remaining bandpowers hxed to maximum likelihood values, for simulated Q-band full-season data. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 46 EE bandpowers 1 -0 . 0 1 0 . 0 0 -0 . 0 0 1 - 0 .0 1 -0.09 - 0 . 0 1 1 0 .0 0 -0.09 -0.06 1 - 0 . 0 0 - 0 . 0 1 -0.06 0 . 0 2 -0.09 0 .0 1 0 .0 0 0 .0 2 0.04 0.03 - 0 . 0 2 - 0 .0 1 0 . 0 0 -0.06 0 . 0 0 - 0 .0 1 0 . 0 1 0 .0 1 0.05 -0 . 2 2 -0 . 0 0 -0 . 0 1 0 . 0 1 - 0 .0 1 -0.17 - 0 . 0 0 - 0 . 0 1 - 0 .0 1 0 . 0 2 - 0 . 1 0 0.03 - 0 . 0 0 - 0 . 0 0 - 0 . 0 1 -0.26 BB bandpowers -0.09 0 . 0 2 - 0 .0 1 - 0 .0 1 0 .0 1 - 0 . 0 1 0.04 0 . 0 0 0 .0 1 0.03 -0.06 0 . 0 1 -0 . 0 2 0 . 0 0 0 .0 0 0.05 -0.04 - 0 .0 1 0 . 0 1 1 -0.04 1 -0.17 - 0 .0 1 1 -0.17 - 0 . 0 1 -0.17 1 0 .0 1 - 0 . 0 1 -0.17 - 0 . 1 2 - 0 . 1 0 0.04 - 0 .0 1 -0.09 -0.27 0.07 0 . 0 0 0.04 -0.09 0 . 0 0 0 .0 1 0 .0 1 -0.03 0.05 -0.32 EB bandpowers -0 . 2 2 -0 . 0 1 -0 . 0 1 -0 . 0 0 - 0 . 0 0 -0.17 0 . 0 2 - 0 . 0 0 - 0 .0 1 - 0 . 0 0 - 0 . 1 0 - 0 .0 1 0 .0 1 - 0 . 0 1 0.03 -0.26 -0 . 1 2 - 0 . 1 0 0 . 0 1 0 .0 1 - 0 . 1 0 -0.27 0.04 -0.03 0.04 0.07 -0.09 0.05 -0.32 -0 . 0 1 0 . 0 0 0 .0 0 1 -0 . 0 2 0 . 0 0 0 .0 0 -0 . 0 2 1 -0.15 0.03 1 0 .0 0 -0.15 -0 . 1 0 1 0 .0 0 0.03 - 0 . 1 0 Table 2.2: Correlation m atrix between bandpowers, computed from 2D conditional likelihoods with the remaining bandpowers held fixed to maximum likelihood values. 2D conditional likelihood in Eq. (2.36) for the EE, BB bandpowers in the lowest t bin, (^mim fmax) = (2, 500). About 10 radial lines are needed to obtain smooth contours. This procedure can be repeated for each pair of bandpowers (&i, 6 2 ). The results are summarized in Tab. 2.2, where we show the covariance m atrix between all pairs of bandpowers (rather than plotting each full 2D likelihood). Note th at the eigenvalue trick, which allows us to evaluate the likelihood “all at once” along any ID line in parameter space, allows us to reduce the effective number of dimensions by one (relative to, say, grid sampling) when computing a conditional likelihood. For the ID conditional likelihood (Eq. 2.35), a single evalution is required; for the 2D likelihood (Eq. 2.36), a one-parameter family of evaluations is required (corresponding to the polar angle around the point of joint maximum likelihood). However, we do not use this method to explore the likelihood function when the number of parameters is larger than two; in this case, there is a more efficient procedure: Markov chain sampling. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 47 2 .7 .3 B a n d p o w e r u n c e r t a i n t i e s : m a r g in a liz e d lik e lih o o d s The techniques for computing ID or 2D conditional likelihoods described in the pre ceding subsection are useful for a “quick look” at power spectrum uncertainties, but fail to capture all information contained in the joint likelihood log £[A|m]. For ex ample, for the “bottom line” result of the experiment, we would like to report uncer tainties on each bandpower A;, with the remaining bandpowers marginalized (rather than fixed to their maximum likelihood values). If the number of bands is not small (say > 5), it is essentially impossible to compute marginalized uncertainties by direct sampling of the likelihood function (as we did in §2.7.2 for conditional uncertanties) without making approximations. A huge number of likelihood evaluations would be required, in order to obtain sufficient sampling to perform the multivariate integral. However, we can introduce another technique which is ideally suited for this problem: Markov Chain Monte Carlo (MCMC). The basic idea of the Markov chain method is: rather than sampling the like lihood function at a fixed set of points, we generate a sequence of random sam ples A ^ \ A ^ , A ^ , . . . which are drawn from the multivariate likelihood (or PDF) £[A|m]. Note th at each sample is a vector A^(i) , where the index b ranges over bandpowers. In the Markov chain approach, it is easy to generate marginalized un certainties: if we simply discard bandpowers b' ^ b in each sample, then the resulting sequence of numbers will be randomly sampled from the marginalized likelihood for bandpower b. Similarly, if we discard bandpowers b' ^ 6i , 62, then the resulting sequence of pairs (A ^ , A^2) will be randomly sampled from the joint likelihood on bandpowers b\ , 62 with the remaining bandpowers b1 b\, 62 marginalized. (This is useful for, e.g. computing marginalized correlation coefficients.) In the Markov chain method, each vector A^(i) is generated randomly starting Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 48 from the previous vector ( i — 1) . The transition probabilities are constructed in such a way (to be described in detail below) th at the equilibrium distribution, after many iterations, is the likelihood function £[A&|m]. However, consecutive samples are cor related: the conditional distribution P [ A ^ |A ^ ^], for a fixed value of A ^ will not be equal to the equilibrium distribution. Therefore, if we have N consecutive en tries in the chain, they cannot be treated as statistically independent. (At a heuristic level, the effective number of independent samples is N / ( , where ( is the “correla tion length” of the chain.) In practice, this simply means th at if we are using the Markov chain to compute a particular quantity, such as the marginalized 95% confi dence region on a bandpower, or the marginalized correlation coefficient between two bandpowers, we run several chains until they converge to the same answer, rather than running a single chain and assuming 1 / y / N statistics. How do we generate each random vector A^ from the previous vector A^ , such th a t the equilibrium distribution after many iterations is £[A^|m]? There are many ideas for doing this in the Markov Chain literature; we use the best-known method, Metropolis-Hastings sampling, which we now explain in detail. We start by randomly generating a “proposal” vector A^r using a proposal density (or PDF) which is a multivariate Gaussian centered at A ^ P[A(t; - 1} A f ] = det "V2(W)exp ( - j f A f 1 - A ^ J M ^ A f - A p 1’) ) (2.37) The choice of covariance matrix M in the proposal density will be discussed below. We then either accept the proposal (meaning th at we take the next sample in the chain to be A ^ = Ajj*r), or reject (meaning th at we start over by generating a new proposal vector to be accepted or rejected). This is decided randomly, with Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 49 o o 'H £ o o LTI m m -4 4 6 8 10 EE [2,500] 14 12 (microK) 16 18 Figure 2.7: Markov chain likelihood samples from the full-season simulated dataset in Q-band. Each sample is a twelve-component vector, corresponding to the twelve bandpowers shown in Fig. 2.5, but only the lowest-^ EE bandpower (x-axis) and lowest-^ BB bandpower are shown here. The equilibrium distribution of the chain is the marginalized likelihood, but each sample is correlated to the previous sample. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 50 acceptance probability given by: , W = £ [ A f H \ I P [ A f - A i r 11] , ..V , V . L ,' 6 \ P 1 4 ~ " ^ A?] J (2 38) Note th at, with the Gaussian proposal density defined in Eq. (2.37), the second factor on the right-hand side of Eq. (2.38) cancels, but we have written Paccept in a more general form which does not require P[a[* ^ —> Ajj*r] and P[A^r —> A ^ to be equal. This two step procedure, generating proposal samples (Eq. (2.37)) which are either accepted or rejected (Eq. (2.38)), is known as the Metropolis-Hastings algorithm. It can be shown th at this scheme leads to the equilibrium distribution £[A^|m] as required, for any choice of covariance matrix M in the proposal density (Eq. (2.37)) However, efficiency considerations dictate the choice of M: if M is too small, then consecutive samples are highly correlated and the correlation length of the chain is large. If M is too large, then the acceptance probability (Eq. (2.38)) will be small on average and many likelihood evaluations will be required to advance in each iteration. We make the following choice of M: L_ ( o A82aA~ =- = = Mr}7 }/ = bb v'A'hand \ d A b d A y „... log C[A\m] 6 (2.39) 1 ' V A = ML where the notation (-) a =ML means th at the second derivatives are evaluated valueof A whichmaximizes the likelihood. Thus, before running the at the Markov chain, we precompute maximum likelihood bandpowers and evaluate the m atrix of second derivatives (Eq. (2.34)). The computational cost of Metropolis-Hastings scheme is determined by the like lihood evaluations needed to compute the acceptance probability in Eq. (2.38). The Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 51 most efficient way to evaluate log C is to write log£[A|m ] — - - l o g det (S + N) — ^-mT (S + N ) 1m = ( 1o§(Lm) + \ ( L ~ l m )i(L ~ l m ) i jS - (2-40) where L is the lower-triangular Cholesky factor of the total signal + noise: AT + ^ A t Sj = LLr b (2.41) Using Eq. (2.40), the only 0 (A rpix) operation is Cholesky factorization (note th at L ~ ^ m can be computed with cost 0 ( N ^ X)), which is significantly faster (by a con stant factor) than m atrix inversion or solving an eigenproblem, although the 0 (N^-x) asymptotic complexity is the same. It is interesting to note that, using likelihood evaluations as in §2.7.2, it is easy to compute conditional likelihoods but difficult to compute marginalized likelihoods. Using Markov chains, the opposite is true: it is easy to marginalize parameters (by simply discarding their values from the chain) but difficult to impose constraints or make the likelihood conditional (without running a new chain). Marginalized likelihoods from the Metropolis-Hastings chain are shown in Fig. 2.8. 2.8 Com bining maps It is convenient to have a general-purpose program for combining two maps (N\, m \) and (N 2 , m 2 ), produced by the map-making procedure in the preceding section. (Note th at each “map” consists of a length-(2 7Vpjx) vector m and a (2 iVpix)- by- (2 fVp;x) noise covariance m atrix N.) Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 52 i j i M i i i i | i i i j" i i i j i LJ 0 20 4 0 60 80 E E / f l a t [ 2 ,5 0 0 ] TT _LX L l! E E / f l a t [ 5 0 0 ,8 0 0 ] 50 100 E E / f l a t [ 8 0 0 ,1 1 0 0 ] 0 50 100 150 E E / f l a t [ 1 1 0 0 ,1 4 0 0 ] X 0 10 2 0 3 0 4 0 B B / f l a t [ 2 ,5 0 0 ] B B / f l a t [5 0 0 ,8 0 0 ] H T I I | I I I I ) I II I | I I I I j I I I I H -10 0 10 20 E B / f l a t [ 2 ,5 0 0 ] E B /fla t [500 ,80 0 ] B B /fla t [800 ,11 0 0 ] [_l | I I |-j I I I | I I I | I | | M E B / f l a t [ 8 0 0 ,1 1 0 0 ] B B /fla t [1 1 00 ,14 0 0 ] U_ | I | | | | | I | | | | E B / f l a t [ 1 1 0 0 ,1 4 0 0 ] Figure 2.8: Marginalized likelihoods for EE, BB, and EB bandpowers, obtained by histogramming Markov chain samples, for simulated W-band (black/solid) and Qband (blue/dotted) full-season data. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 53 One application of such a program is to distribute the computational cost of map making across several CPU ’s, without needing to parallelize the linear algebra in the map-making program (§2.6). We can simply divide the data into subsets, run a separate instance of the serial map making program on each subset, then combine the maps. The resulting map will be equivalent to the result of running the map making program on the entire dataset. A second application is to combine W -band and Q-band data, which have different beam sizes. We do this by making two maps which correspond to all the desired data at the two beam sizes, then combining the maps. (This is a m atter of convention: the map making program in §2 . 6 only emits maps with a single beam size in order to keep the code as simple as possible; we introduce the extra complexity of distinct beam sizes in the map combining program.) We now describe our algorithm for combining maps. First, consider the case of two maps ( N i , m i ) and (Ar2 , m 2 ) with the same beam size. Then the combined map (N , m) is simply given by: N~l N ~lm = N {l + N ^1 (2.42) = N ^ lm\ + N ^ lm2 (2-43) If the input maps cover a different set of pixels, the inverse noise covariance matrices N F 1 are first extended (with zeroes) to the union of the pixelizations. In this case, map combining is just a computational shortcut: the combined map (AT, m) is equal to the map obtained by running the map-making program (§2 .6 ) on the union of the timestream d ata used to obtain the maps (Ni, m \) and (N 2 , m 2 ). If the beam sizes are different, then combining maps is more difficult. We emit a “map” (TV, m) containing two distinct types of pixels corresponding to the two beam Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 54 sizes. / 0 Ni N N<1 J V ° ( \ m (2.44) Thus the pixel count roughly doubles when Q-band and W -band are combined. In order to perform a likelihood analysis (§2.7) on a “doubled” map (such as Eq. (2.44)) in which the Q-band and W -band pixels are kept separate in the noise covariance, all th at is needed is a prescription for computing signal covariance matrices. The correct prescription is: ( 2 . 45 ) where S \ , S 2 denote signal covariance matrices with each respective beam size, and S x denotes the signal covariance matrix computed with beam size ( 2 . 46 ) Eq. (2.45) includes correlations between the Q-band and W -band pixels which enforce the constraint th at the CMB seen in W -band and Q-band is the same, except for the beam size. If the off-diagonal blocks were zero (as in the noise covariance, Eq. (2.44)), the W-band and Q-band d ata would be incorrectly treated as measuring independent patches of sky. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 55 2.9 N ull maps Given two maps ( N i , m \ ) and (N 2 , m 2 ), the map combining procedure in the last section provides an optimal way for determining the power spectrum, given both measurements. It is also useful to have a procedure for determining whether the two measurements are consistent with each other, i.e. whether they measure the same underlying CMB. To determine consistency, we first compute a null map by differencing the two maps, so th at the response to any shared CMB signal is zero. Thus the null map should be “pure noise” if the two input maps are consistent. We then subject the null map to the power spectrum analysis described in §2.7, to check consistency with zero signal power. In this section, we describe our procedure for obtaining null maps. If the input maps ( N i , m i ) and (A 2 , m 2 ) have the same beam size, then differ encing is straightforward: the null map (N , m) is given by. If the N = N \ + A^2 m = m i —m 2 input maps cover a different set of pixels, the noisecovariance (2.47) matrices iVj and maps mjmust first be restricted to the intersection ofthe twopixelizations. In contrast to the case of combining maps (§2 .8 ), it is not possible to apply inverse noise weighting to the maps m i, m 2 , to optimize the signal-to-noise in the output map. Straightforward differencing as in Eq. (2.47) is the only way to subtract the maps which ensures zero response to an arbitrary CMB signal. Now consider the case when the input beam sizes are different, say beam 1 is larger than beam 2. We degrade map 2 to the beam size 1 (the details of the beam- Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 56 degrading operation will be described shortly), then difference using Eqs. (2.47) and give the output map the larger beam size. We do not keep separate pixels for the two beam sizes as in Eq. (2.44). For a null map, this does not lose information, since one is limited by the angular resolution of the map with the worst beam size, just as the straightforward subtraction of maps in Eq. (2.47) means th at one is limited by the noise level of the map with the worst sensitivity. The beam-degrading operation in real space is given in detail as follows. We first define a degrading operator A whose m atrix element between pixels p,p' with separation Qppt is given by exp(— Fpp. 'V < 2 '4 8 ) where the coefficient a is defined by a 41n2 ^ ( a FWHM)2 (a FWHM)2 J The denominator in Eq. (2.48) simply normalizes so that ^2’49^ A pp/ = 1. The degrad ing operation on a map (N, m) is then given by N -» A N A t (2.50) m —»• A m (2.51) When degrading the beam size, we must throw away all pixels within a few beam widths of the survey boundary. One mightwonder whether itis safe to perform the beam-degrading operation in pixel space, given th at the pixel-pixel separations are roughly beam size. This can be Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 57 directly tested by asking whether the signal covariance matrices Si , (A S 2A T ) (2.52) corresponding to beam-degrading the signal covariance in harmonic and pixel space respecively, are distinguishable. This will depend on the signal-to-noise. For CAPMAP sensitivity and beam size pixels, direct comparison of likelihood functions shows th at the two signal covariances in Eq. (2.52) are only “distinguishable” by a fraction of a sigma. 2.10 N ull te st suite A typical application of the null map procedure described in the preceding section would be to verify that, if we combine all W -band data taken during daytime, the resulting measurement is consistent with all W -band data taken during nighttime. We run the map-making program in §2.6 on disjoint “day” and “night” subsets of the W-band d ata (imposing the data selection described in §2.5), obtaining maps (-^day> m day) and (^night; m night)- This pair of maps is then taken as input to the map differencing procedure (§2.9), obtaining a day-night null map (Nnun, mnuu). We would like to check th at the null map is consistent with zero signal. We do this using the likelihood analysis of §2.7: we fit EE, BB, and EB bandpowers to the null map, obtaining the result shown in Fig. 2.9. Each fit is performed in a single large £ band, rather than using a multiband as we do for the overall power spectrum analysis (Fig. 2.5), because the effective sensitivity of the null map is significantly worse than the total sensitivity of the experiment. (In units of integration time, the loss of sensitivity is a factor ~ 4: the day/night submaps lose a factor ~ 2 relative to Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 58 1000 500 EE ( / x K 2) BB (/x K2) -1 0 0 EB (/ x K 2) 10 Figure 2.9: Day/night null test for W -band CAPMAP data. The y 2 value (Eq. (2.53)) to zero for the EE, BB, and EB likelihoods is 0.01, 0.79 and 0.25 respectively, with one degree of freedom in each y 2. their sum, and differencing loses another factor ~ 2 .) To quantify the consistency with zero signal, we compute the probablity to exceed zero (P T E q) for each of the three likelihood curves in Fig. 2.9. We obtain P T E q = 0.53, 0.19, 0.31 for EE, BB, and BB respectively. For combining many null tests, it is convenient to convert each P T E q value to a y 2 with one degree of freedom: y 2 = 2 ( e r r 1( l - 2 - P T E 0)) (2.53) This conversion is trivial for a single likelihood curve: Eq. (2.53) has been constructed so th at the RHS will be distributed as a y 2 with one degree of freedom in the absence of any signal (assuming th at P T E q is uniform distributed). However, for N likelihood curves, it is convenient to sum the y 2 for each curve, obtaining an overall y 2 with N degrees of freedom which can be used to decide whether the overall null test suite passes or fails. For the day/night null test in Fig. 2.9, we obtain an overall y 2 = 1.05 with 3 degrees of freedom, indicating th at the null test passes. Note th at the “neutral” value of P T E q is 0.5, with values close to 0 or 1 indicating failure of the null test; Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 59 whereas the “neutral” value of y 2 is zero, with large values indicating failure. The day/night null test we have described in detail is part of a larger null test suite, where each null test is based on dividing the data into disjoint subsets. The test suite is shown in Tab. 2.3; the individual tests are described as follows (in order of appearance in the table): 1. Period-period tests: We difference maps made from data from each of the three gain periods described in §2 .2 .2 . 2. Per-dewar tests: We difference maps made from data in different dewars (Fig. 2.1). 3. Per-frequency tests: We difference maps made from data in different subfre quencies (Tab. 2.1). 4. Grand y 2: The “grand y 2” was introduced in §2.5 as a statistic for cutting anomalous cycles during periods of bad weather or instrumental flareups. In this null test, we divide channels into subsets based on whether the time-averaged y 2 of the channel is higher or lower than the median value. (This roughly monitors the level of ( 1 / / ) noise in the channel.) 5. Universal structure y 2: Analogously, we divide channels into subsets based on whether evidence for ground-synchronous structure is higher or lower than the median. 6 . I —»■Q mixing: Each polarization channel has a characteristic level of I —>Q mixing, or pickup of anisotropies in the sky temperature. In this null test, we divide channels into subsets based on whether this level is higher or lower than the median. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 60 W: period 1 vs 2 W: period 1 vs 3 W: period 2 vs 3 W: dewar 0 vs dewars 1+2+3 W: dewar 1 vs dewars 0+2+3 W: dewar 2 vs dewars 0+1+3 W: dewar 3 vs dewars 0+1+2 W: SO vs S1+S2 W: SI vs S0+S2 W: S2 vs S0+S1 W: good vs bad grand x 2 W: good vs bad US x 2 W: good vs bad I—>Q W: day/night W: high/low IF temp W: period-period subtotal W: per-dewar subtotal W: S0 /S 1 /S 2 subtotal W subtotal Q period 1 vs 2 Q period 1 vs 3 Q period 2 vs 3 Q dewar 0 vs dewars 1+2+3 Q dewar 1 vs dewars 0+2+3 Q dewar 2 vs dewars 0+1+3 Q dewar 3 vs dewars 0+1+2 Q SO vs Si Q good vs bad grand x 2 Q good vs bad US x 2 Q good vs bad I—>Q Q day/night Q high/low IF temp Q period-period subtotal Q per-dewar subtotal Q subtotal Grand total 0 .0 0 / 1 0.47/1 2.06/1 0 .0 2 / 1 0.34/1 2.99/1 1.03/1 0.65/1 0.30/1 0.17/1 0 .1 1 / 1 0.06/1 0.76/1 0 .0 1 / 1 1.25/1 2.54/3 4.38/4 1.13/3 10.23/15 2.18/1 0.33/1 2.70/1 0.50/1 0.19/1 0.18/1 2.91/1 0.32/1 0 .0 0 / 1 4.62/1 0 .0 0 / 1 0.18/1 0.87/1 5.21/3 3.78/4 14.99/13 25.22/28 0.99/1 0 .0 1 / 1 0.53/1 0 .1 2 / 1 0.98/1 0.03/1 0.09/1 0.51/1 3.77/1 0.33/1 0 .1 0 / 1 1.23/1 0.81/1 0.79/1 1.45/1 1.52/3 1.22/4 4.61/3 11.73/15 0 .0 1 / 1 0.55/1 0 .2 1 / 1 0.53/1 1 .2 0 / 1 0.38/1 0.81/1 0 .0 0 / 1 0.09/1 0.24/1 0.15/1 0.55/1 4.43/1 0.77/3 2.91/4 9.14/13 20.87/28 4.18/1 0.16/1 0.90/1 0.04/1 1.84/1 0.04/1 1.17/1 0.33/1 0.46/1 1.34/1 0 .0 0 / 1 0 .0 1 / 1 0.08/1 0.25/1 0 .0 0 / 1 5.25/3 3.08/4 2.12/3 10.80/15 0.78/1 0.40/1 0.34/1 1.24/1 3.70/1 0 .0 1 / 1 2 .0 2 / 1 0 .1 0 / 1 0 .2 2 / 1 2.30/1 1.38/1 0 .1 0 / 1 0.60/1 1.53/3 6.96/4 13.19/13 23.99/28 5.17/3 0.65/3 3.49/3 0.18/3 3.16/3 3.06/3 2.29/3 1.48/3 4.53/3 1.85/3 0.21/3 1.29/3 1.66/3 1.05/3 2.70/3 9.31/9 8 .6 8 / 1 2 7.86/9 32.76/45 2.97/3 1.28/3 3.25/3 2.26/3 5.09/3 0.56/3 5.74/3 0.42/3 0.31/3 7.16/3 1.52/3 0.83/3 5.91/3 7.51/9 13.65/12 37.31/39 70.08/84 Table 2.3: Null test suite for real CAPMAP data. The overall x 2 of the test suite is 70.08 with 84 degrees of freedom, indicating th at the null test passes. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 61 7. IF temperature: There was some evidence th at amplifier gains might depend on the tem perature of the intermediate frequency (IF) component of the hardware, so we include a null test based on dividing the timestream into subsets where the IF tem perature is higher or lower than the median value. The result of the null test suite can be summarized as a single chi-square value: X2 = 70.08 with 84 degrees of freedom, indicating th at the testsuite passes. However, there is an im portant caveat here. Although the chi-square values with one degree of freedom computed in Eq. (2.53) are distributed as true x 2 variables, the same is not true when summing over null tests: our “chi-square” values with many degrees of freedom are not truly x 2 distributed, because of correlations between null tests. Such a chi-square value with N degrees of freedom will still have mean N, but its variance will be larger than (2N). In principle, the true variance, including the effect of correlations between null tests, could be determined by Monte Carlo simulations of the entire null test suite. We have not done this here, since it would be computationally expensive, and our overall chi-square value is only lcr away from the mean, even when true x 2 statistics are assumed. It is interesting to note th at the null test suite only passes in Q-band if groundsynchronous modes are marginalized in the map making stage (§2.6). In Tab. 2.4, we show the result of the test suite if ground-synchronous mode removal is disabled in the map maker. The total x 2 in Q-band increases from 37.31 to 145.19, with 39 degrees of freedom, indicating clear failure of the test suite and evidence for groundsynchronous contamination. In W-band, the total x 2 of the testsuite also gets worse, but by a fraction of a standard deviation, so no conclusion can be drawn. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 62 W: period 1 vs 2 W: period 1 vs 3 W: period 2 vs 3 W: dewar 0 vs dewars 1+2+3 W: dewar 1 vs dewars 0+2+3 W: dewar 2 vs dewars 0+1+3 W: dewar 3 vs dewars 0+1+2 W: SO vs S1+S2 W: SI vs S0+S2 W: S2 vs S0+S1 W: good vs bad grand x 2 W: good vs bad US x 2 W: good vs bad I—>Q W: day/night W: high/low IF temp W: period-period subtotal W: per-dewar subtotal W: S0/S1/S2 subtotal W subtotal Q period 1 vs 2 Q period 1 vs 3 Q period 2 vs 3 Q dewar 0 vs dewars 1+2+3 Q dewar 1 vs dewars 0+2+3 Q dewar 2 vs dewars 0+1+3 Q dewar 3 vs dewars 0+1+2 Q SO vs SI Q good vs bad grand x 2 Q good vs bad US x 2 Q good vs bad I—>Q Q day/night Q high/low IF temp Q period-period subtotal Q per-dewar subtotal Q subtotal Grand total 0.38/1 0.04/1 1.27/1 0.33/1 0.82/1 2.97/1 1.81/1 0.37/1 1 .0 1 / 1 0.03/1 0 .0 0 / 1 0 .0 0 / 1 0.91/1 0.19/1 1.84/1 1.70/3 5.93/4 1.41/3 11.99/15 1.09/1 0 .8 6 / 1 1.87/1 5.57/1 1 .1 1 / 1 0.19/1 4.39/1 1.75/1 9.25/1 14.26/1 3.87/1 0.15/1 1.53/1 3.82/3 11.26/4 45.90/13 57.89/28 0.89/1 0.07/1 0.40/1 0 .0 1 / 1 1.38/1 1.56/1 0.03/1 0.55/1 4.40/1 1.27/1 0.09/1 1.74/1 1.08/1 1.17/1 0.63/1 1.35/3 2.98/4 6.22/3 15.26/15 0.32/1 0.08/1 0.08/1 8.43/1 3.80/1 1.57/1 2.52/1 3.25/1 21.92/1 0 .0 1 / 1 9.22/1 0.03/1 6.05/1 0.48/3 16.33/4 57.29/13 72.55/28 4.50/1 0.49/1 1 .0 2 / 1 0.26/1 2 .1 0 / 1 0.23/1 0 .2 1 / 1 0.46/1 0.03/1 1.97/1 0.28/1 0.03/1 0 .1 2 / 1 0 .1 0 / 1 0.06/1 6.02/3 2.80/4 2.46/3 11.88/15 0.32/1 0.24/1 1 .2 1 / 1 18.49/1 3.47/1 0.93/1 3.67/1 2.25/1 2.33/1 5.07/1 3.03/1 0.52/1 0.48/1 1.78/3 26.55/4 42.01/13 53.89/28 5.78/3 0.60/3 2.70/3 0.59/3 4.29/3 4.76/3 2.06/3 1.38/3 5.44/3 3.27/3 0.38/3 1.77/3 2.12/3 1.46/3 2.53/3 9.07/9 11.71/12 10.09/9 39.13/45 1.74/3 1.19/3 3.16/3 32.48/3 8.39/3 2.68/3 10.58/3 7.25/3 33.51/3 19.34/3 16.12/3 0.69/3 8.06/3 6.08/9 54.13/12 145.19/39 184.32/84 Table 2.4: Null test suite for real CAPMAP data, with ground-synchronous mode removal disabled in the map making stage, showing clear failure of the test suite in Q-band. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 63 M b ~+ -5 0 M c CM s. o -+ -5 0 w S' 3 50 c N o -+ -5 0 0 500 1000 Multipole 1 Figure 2.10: Final power spectrum measurement in from the full-season simulated dataset, with marginalized uncertainties computed by Markov chain sampling. The three sets of error bars represent W -band data alone (left), Q-band data alone (mid dle), and both frequencies combined (right). Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 64 o 0.5 1 1.5 E E /fid u cia l (sin gle band) 2 Figure 2.11: One-band likelihood in a multiple of the fiducial EE spectrum, for the full-season simulated dataset with both frequencies combined. Zero power is excluded at 13o\ Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 65 2.11 Final power spectrum We conclude this chapter by presenting the final power spectrum measurements for the full-season simulation, as a proxy for the final results of the analysis, to appear shortly. In Fig. 2.10, we show the measurement of the EE, BB, and EB power spectrum in flat bandpowers, for W -band data alone, Q-band alone, and the two frequency channels combined. To assess the overall statistical significance, we perform a one-band fit for EE in the “shape” of the fiducial model, with BB and EB set to zero. The resulting likelihood curve is shown in Fig. 2.11; the lcr range for this measurement is (1.02 ± 0.16). This suggests a 6 <r detection, but the actual significance of excluding zero power is much higher (13cr). This is because the likelihood curve becomes very non-Gaussian at low values due to the presence of a small number of modes with high signal-to-noise. In conclusion, we have developed an optimal, likelihood-based analysis pipeline for the CAPMAP dataset which incorporates a complex noise model without ap proximations and includes a variety of statistical techniques such as Markov chains. Results from a null test suite (§2.10) have been finalized and indicate consistency of the dataset when it is divided in many different ways. The final power spectrum result is not yet available at the time of this writing, but the pipeline has been demonstrated on realistic simulations, and final results are expected this fall. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. CH APTER 3 P U R E P S E U D O -Q ESTIM A TO RS 3.1 Introduction The CAPMAP pipeline described in the previous chapter performs a likelihood-based analysis and obtains power spectrum uncertainties which are provably optimal (Bond et ah, 1998). We have seen th at many dense m atrix operations, with computational cost <3(lVpix), are needed. In CAPMAP, the m atrix size is typically ~ 5000, so such operations are easily affordable. However, they will soon become a bottleneck for upcoming experiments which will target B-modes with increasing sky coverage and resolution. For future generations of polarization experiments, likelihood analysis will probably be infeasible. An alternative method, which has become the “industry standard” for CMB tem perature experiments with Npix > 105, is using p s e u d o -q u a d r a tic estimators (Wandelt et ah, 2001). In the polarization context, pseudo-C^ estimators mix E into B, in the following sense: for a noiseless CMB realization containing only E-modes, the estimated B-mode bandpowers will be nonzero. In an ensemble of such realizations, the estimated B-mode bandpowers will be zero in the mean (this is because pseudoC# estimators are unbiased by construction), but because they are nonzero in each realization, E-mode signal power does contribute to the variance of the B-mode es timators. This extra variance can be thought of as a source of noise which is due to the estimators alone. Recently, Challinor & Chon (2005) showed th at this “estimator noise” can dominate the sample variance from lensing B-modes, so th at it becomes the dominant contaminant if the instrumental noise is sufficiently small. For surveys with f sky ~ 0.01, they show th at it limits the value of T / S which can be detected to 66 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 67 ~ 0 . 05 . The purpose of this chapter is to construct pure p s e u d o -e s tim a to rs ; these are modified versions of polarization pseudo-Q estimators, which do not mix E into B in the sense defined above. Using these estimators, the estimated B-mode power will be zero for any noiseless CMB realization which contains only E-modes. Strictly speaking, this is only true in the continuum limit; in a finite pixelization, the B-mode estimators will acquire small nonzero values from pixelization artifacts. We show th at these pixelization effects can be made arbitrarily small by increasing the resolution. The basic idea of our construction was inspired by the pure B-mode formalism of (Lewis et ah, 2002; Bunn et al., 2003). Because pure B-modes exist in any finite patch of sky, a BB power spectrum estimator will receive no contribution from EE signal power if it is built entirely out of these modes. Our pure p s e u d o -e s tim a to r s have the property th at the observed polarization map always appears contracted with a pure B-mode which is constructed from heuristically chosen weight functions. The main technical difficulty is ensuring th at the fast algorithm for calculating the transfer m atrix (Hansen & Gorski, 2003, Appendix E) still goes through after this modification. Throughout this chapter, we use a fiducial ACDM cosmology which is consistent with WMAP (Spergel et ah, 2003), with parameters { Qm h? r > |A i?|2, n s, w, rrij/ } = { 0.024, 0.14, 0.73, 0.17, 2.57 x 10- 9 , 1, -1, 0 }. For pixelized all-sky maps, we use the Healpix coordinate system (Gorski et al., 2005) exclusively. We use a phenomonological definition of T / S , defining it to be the ratio of temperature multipoles Cf™™'/Cf™l*r . This chapter is organized as follows. In §3.3, we briefly review pseudo-Q esti mators, before constructing pure p s e u d o -e s tim a to r s in §3.4. In §3.5, we consider a spherical cap shaped mock survey with uniform white noise, using it to illustrate Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 68 general features of our estimators, and show th at our modification significantly im proves the performance of the estimators for noise levels < 20 /iK-arcmin. As a first step toward more realistic instrumental noise, in §3.6 we consider a survey with inhomogeneous, but not spatially correlated, noise and show th at the same conclusions apply. In §3.7, we study a range of noise levels and show th at our estimators are roughly 80% of optimal, defined by the degradation in the value of (T/ S ) which can be detected, for all noise levels considered. We conclude in §3.8. 3.2 N otation and C onventions We represent CMB polarization as a symmetric traceless tensor as in Kamionkowski et al. (1997b), but we have changed some conventions to agree with those of CMBFAST and Healpix. The all-sky metric gab and antisymmetric tensor ea& are given by \ 1 0 sin# -s in # 0 0 9ab (3.1) eab 0 sin 2 0 ) We define basis polarization fields 1 0 0 -s in 2# Qab u ab 0 sin# sin# 0 7 \ (3.2) o Given two points x, x', we define X a to be the vector at x which points away from x' along the great circle arc which connects the two, vector Ya — traceless tensors and symmetric = ( X aXi, —YaYi))/2, Ua&= ( X aYi)+YaXi))/2. The quantities X f, Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 69 Y r, Q ', U 1 are defined in the same way with x , x ; interchanged. Note th at we always use lower case to distinguish the basis qab, uai)} which is globally defined (except at poles), from the “two-point” basis Q ab, U ^ , which is defined relative to a pair of points x , x.'. E and B modes are defined as follows. First, we define operators £ ab ~ "b 2 ^ a^ &ab ~ (3-3) 2 eac^ C^ b + 2 ebc^C^ a which take scalar fields to symmetric traceless tensors. E and B harmonics are defined by j ( e S T ) e ( e + i ) ( t + 2)£abYlm y (fm)os Y (£m)ab = ^ ( 3 ' 4 > _ 1 ) ( ( ( + 1)(< + We will also need some analagous definitions for the spin - 1 case. The “two-point” basis X a, Ya has already been defined; a global basis x a, Ha is defined by: xa = 1 0 Va = 0 sin 6> • (3.5) Spin- 1 harmonics, which we label G and C for “gradient” and “curl” , are defined by i ) VaYem 5i ° m>a Y (lm )a = ~ ^ / e ( i + i ) ta b ^ bYem Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. ( 3 ' 6 ) *3'7^ 70 3.3 Pseudo-CV estim ators Pseudo-Q> quadratic estimators for polarization power spectra have been previously constructed by Hansen & Gorski (2003), who used the p s e u d o -fo rm a lis m (Wandelt et al., 2001). In this section, we briefly recall the pseudo-Gf construction, largely for the sake of establishing notation. One first chooses a pixel-dependent weight function W (x) which is zero outside the survey region. The choice is made heuristically, but the performance of the estimators is improved (on angular scales with signal-to-noise ratio < 1) by choosing W (x) to be smaller where the noise is larger, in order to downweight noisy regions. Frequently, the weight function is also apodized near the survey boundary, in order to reduce harmonic ringing. One then defines pseudo E and B multipoles by (the tildes signify “pseudo” ) E lm = ^ 2n ‘‘‘ (x)H /(x)y(^ ) a4 (x) (3.8) X B tm = ^ 2 n « ‘ (x )iv (x )y (f ^ ( x ) . and pseudo power spectra by a EE (3.9) m=—l Cf B = m——l It can be shown th at the expectation values of the pseudo power spectra are given by + \ A et' w ) NiBB Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (3.10) 71 where K^,, are £max-by-£max transfer matrices and N B E , N BB are £max~by-1 vec tors which represent additive noise bias. There is an efficient algorithm, which will be discussed in Appendix D, for exactly computing the transfer matrices from the weight function. The noise bias can be computed exactly in cases where the noise is uncorrelated between pixels (Appendix E), or by Monte Carlo in the general case. Unbiased power spectrum estimators C E E , C BB can be obtained from the pseudo power spectra C E E , C BB by simply subtracting the noise bias and applying the inverse of the (2£imax)-by-(2£max) “grand unified transfer m atrix” : ( def V a BB / -1 K+ ~ \ K ££' KW ' Cf,E - 1V ,F ^ KW + rBB \ CT/ (3.11) N bb The preceding construction has assumed th at the power spectrum is estimated at every multipole £. For reasons of sky coverage or signal-to-noise, it is often necessary to bin multipoles into bandpowers with A£ > 1. In this case, for each band b, one defines pseudo bandpowers c F = E EE C fB = E p« s t PMCf B . (3.12) where the m atrix P defines the £ weighting within each bandpower estimator. A commonly-used choice is (Hivon et al., 2001): 1 1 (1 + 1 ) ^ Jb+l)Jb) Pbl lo w -r 11 Jb) < p Jb+1 ) How - 1 < how lo w 0 (3.13) otherwise. One also introduces a m atrix P , which defines an interpolation scheme by which the Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 72 signal power spectra depend on bandpowers A^: c?B =E ^ z^EE \ ' p a E E b Cl - 2^ BB (3-14) A commonly-used choice, corresponding to piecewise flat power spectra, is: 2?r P tb = < :r p(b) < p 11 h a w - 1 (3,15) otherwise. 0 The binned analogs n(b+ 1) < h ow of the transfer matrices and noise bias vectors (Eq. (3.10)) are defined by: EE\ (Ci Kw (c?B) Kw Kw Kw AE E n ee (3.16) + a bb and are related to the unbinned versions by: K k n bb / = PyK^Pyy. = P yN(. The binned analogs of the unbiased estimators (Eq. (3.11)) are defined by: P>E E V KW C EE _ N EE KW PBB rB B V K bb' K bV \ u b' N, BB (3.17) b' Strictly speaking, these are unbiased estimators of the bandpowers A^, when the CMB power spectra are of the precise form (3.14). 3.4 C onstructing pure pseudo-C^ estim ators We now construct a modified version of the B-mode power spectrum estimator which is “pure” , in the sense th at the estimator is identically zero in any noiseless realization Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 73 of the CMB which contains only E modes. It is convenient to explain the construction first in the continuum limit (§3.4.1), where sums over pixels can be replaced by integrals, and then address implementational issues associated with finite pixelized maps (§3.4.2). 3-4-1 C o n s t r u c t i n g p u r e p s e u d o - C i e s t i m a t o r s 1: c o n t i n u u m l i m i t The underlying reason why the unmodified p s e u d o - e s tim a to r is not “pure” , in the sense defined above, can be understood as follows. The pseudo B multipoles are defined by Blm j d2x - f g 2 \ l ‘ \ x ) W ( x ) Y ^ )al>(x) (3.18) Since multiplication in position space mixes E and B modes, the product W ( X)Y^m^afj(x ) is a mixture of E and B, even though ^ / ^ ^ ( z ) is a B-mode. Therefore, Bgm receives a nonzero contribution from E-mode power in IP*1. The basic idea of our construction is now easy to explain. Writing out the defini tion of from Eq. (3.4), the definition of B(m above can be rewritten Bfm = —, x y / ( e - i ) e ( e + i)(£ + 2 ) (3.19) d2x ^ 2 n ab(x)W (x )B ab( Y Z x ) ) f Suppose th at this definition is modified by simply moving W (x) inside the operator Bab- B Z ‘ lm =‘ 1 -J{1 - ! ) < ( £ + ! ) ( < + / 2) x d2x ^ g 2 H ab(x)Bab(W(x)Y;m (x)) Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (3.20) 74 In addition, suppose that the weight function W (x ) and its gradient vanish on the boundary of the region. Then B ^ e will be identically zero in a noiseless CMB realization which contains only E modes. One way to see this is from the perspective of the pure B-mode formalism (Lewis et al., 2002; Bunn et al., 2003); since ( W (x)Y£m (x)) satisfies both Dirichlet and Neumann boundary conditions, Bab{W{x)Y^rn{x)) is a pure B-mode. Alternately, a short, self-contained proof is given as follows. A noiseless CMB realization which contains only E-modes can be written Hab = £ab(t), for some scalar function 4>{x). Putting this into the definition of B ^ e and integrating by parts twice, one gets \ J { t — + !)(£ + 2) ~ p U re (3.21) lm = J d2x ^ f g £ abcf)(x) Bab(W(x)Y;m (x)) = J d2Xy/g (_ V aV 6 + l g abV 2)f{x) x = J d2x yjg eacVcVi,(V“V'' - L oi,V 2)0 M X W (x )Y ;m (x) - (j) dcr ta (_ V aV 6 + \ g abV 2)(t>{x) x V b(W{x)Ytm {x)) - j> da t a (Va + \ v aV 2)(j>{x) W ( x ) Y lm (x) £ = 0. Here, §da denotes integration around the survey boundary and ta denotes the unit tangent vector. The boundary terms are zero because W (x) and V aIT (x) are assumed Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 75 to vanish on the boundary. To avoid confusion, we will denote the “pure” pseudo B multipole defined in Eq. (3.20) by B g ^ e, and denote the “mixed” multipole defined by Eq. (3.18) by for the rest of this chapter. This one change, replacing by B ^ e, is the only modification we propose to the pseudo-C^ formalism of §3.3. In particular, we leave the definitions of E^m and C f E unchanged. By analogy with the definition of tipole , it is possible to define a “pure E” mul- , but we expect th at incorporating this would worsen the performance of the EE power spectrum estimator. This is because ambiguous modes, which are useful for measuring E-mode power, would be discarded. Note th at we have named our estimators “pure pseudo-Cf” even though we are really using a “mixed” estimator for E-modes and a “pure” estimator for B-modes. 3 -4 - 2 C o n s t r u c t i n g p u r e p s e u d o - C t e s t i m a t o r s 2: f i n i t e p i x e l i z e d m aps Before moving on to construct pseudo power spectra and transfer matrices for the modified estimators, we explain how the definition of B p^fie (Eq. (3.20)) is to be interpreted when n a^(x) is a finite pixelized map, instead of an idealized continuous field. The technical obstacle is making sense out of the object Bab{WY^m) which contains covariant derivatives, and appears on the right hand side of Eq. (3.20). We first expand Bab(WYim) using the product rule B M M = ( B a b h ) h + h ( B ubf 2 ) +2Totai( V 7 l ) ( V ‘i/2 ), Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (3.22) 76 where we have introduced the tensor Tabcd = ~^eac9bd d" ~^ead9bc d" ~^bc9ad d" ~^f-bd9ac- (3.23) (Note th at contracting a spin-2 object with T results in a 45° rotation, e.g. TahC(iqcd = ~ u abi Tab c d ~ ~^ab-) Performing the derivatives which act on Yj>m, and replacing the integral by a sum over pixels, this results in the following form for B ^ e. (We do not include the pixel area as an overall prefactor, since the normalization of B ^ e will eventually drop out when we define unbiased power spectrum estimators in Eq. (3.30). However, straightforward conversion of the integral to a sum does implicitly assume an equalarea pixelization.) = E 2n ° ^ W X)y (?m)a(,W (3-24) X +N't £ 2T°bcdn ab(x)Wc(x)Y<’,* )(i(x) X +Nt £ 2 T " n „ t (x)H U x)>7m(x) X where we have defined W a = V aW w ab = (3.25) (VaVj - (1/2)9o6V2)IV N't = 2 / V ( « - l ) ( < + 2) Nf = + + In the form (3.24), the only covariant derivatives are those which act on the weight function in the definitions (3.25) of W a, W ^ . If W (x) is of known analytical form (e.g., Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 77 Gaussian or cosine apodization), then W a, W ab can simply be computed analytically. More generally, assuming th at W{x) is slowly varying compared to the pixel scale (which will be the case for any sensible weighting), W a and can be computed from W (x ) by finite differencing. This is nontrival to implement for an irregular spherical pixelization such as Healpix; we present one finite differencing scheme in detail in Appendix B. In either case, once a prescription for computing Wa, Wafj from W has been specified, we use Eq. (3.24) as the definition of B ^ e in a finite pixelization. Note th a t the first term on the right-hand side is the unmodified pseudo multipole B J ^ xed; the second and third terms can be thought of as counterterms which cancel the E-B mixing and involve spin - 1 and spin - 2 weight functions Wa, Waiy The relative weights of the three terms are ^-dependent, in a way which downweights the counterterms at high t. This is in qualitative agreement with the general picture of E-B mixing arising from survey boundaries (Bunn, 2002; Lewis et al., 2002) in which the mixing is a less significant contaminant at high I. The definition (3.24) suggests an efficient algorithm for evaluating the pseudo multipoles B ^ e. The first term on the right-hand side is computed, for all (£, m) simultaneously, using a fast spin - 2 spherical harmonic transform of the weighted polarization field W ( x ) l i ^ x ) . The second and third terms are analagously computed using a spin - 1 transform of the vector field 2T abcdn abw c = ( n Qw Y - a u w x )xd + ( n Qiv x + n u w Y )yd R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. (3.26) 78 and a spin - 0 (ordinary spherical harmonic) transform of the scalar function 2T abcdn abw cd = n QW u - i i u W q . (3 .27) (In Eqs. (3.26) and (3.27), we use subscripts X , Y to denote components of a vector field in the “global” { x a,y aj basis, and subscripts Q,U to denote components of a spin-2 field in the {qa5 , u ai} basis.) The prefactors Nj,, i\fy are applied after the transforms. This algorithm computes every B ^ e from the map IIa 5 (x), with the same asymptotic complexity 0{fimax) as in the unmodified pseudo-6 ^ formalism of §3.3, but the overall constant is 2-3 times worse, since one spin-0, one spin-1, and one spin- 2 harmonic transform are needed, rather than a single spin - 2 transform. Since fast spin - 1 harmonic transforms are not implemented in the Healpix library, we implemented them as part of this thesis; the details are presented in Appendix A. We have now constructed “pure” pseudo multipoles B ^ e for finite pixelized maps; weconclude this section by defining pure pseudo power spectra and unbiased estimators of the power spectrum. This is done in complete analogy with §3.3. The pseudo power spectrum (jBB,pure .g ^egnef| d ? B,PUre - S T l ' t SZ mB Z e (3.28) m = —£ The expectation values of ( jBB'Pure an(j the unmodified pseudo spectrum C f E are R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 79 given by (C fE) ^BB,pure^ + Kif K\It' -pure T^+pure K] Kw N + ( C EE ^ U£' r BB \ f E (3.29) BB,pure \ Nt An essential ingredient in the method is a fast algorithm for computing the new transfer matrices K ^ f ure. In Appendix D, we present such an algorithm and show th at the computational cost is 0( £max), which is the same as the cost of evaluating the estimators once. (Noise bias is discussed in Appendix E.) The m atrix K ^ f ure measures the contribution to (CBB,pure) by E-mode power, and is therefore expected to be zero. Strictly speaking, this is only true in the continuum limit; in a finite pixelization, K ^ f ure will acquire a small nonzero value from pixelization artifacts. This will be studied quantitatively in §3.5. Finally, we define pure pseudo-Cy estimators (jEE^Pure gBB,pure ^ sjmpiy sub tracting noise bias and applying the inverse of the (2 £max)-by- ( 2 (.max) transfer matrix: ( piEE,pnire \ u£ ^BB,pure KW K —pure K C X K,i f +pure f E - \ -l (3.30) N E E q B B _ j^BB,pure Note that, even though we have not changed the definition of the pseudo spectrum C ¥ E , the definition of the unbiased estimator C ¥ E has been modified in our formal ism. This is because the m atrix inversion on the right-hand side of Eq. (3.30) mixes all R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 80 rows of the (2£max)-by-(2£max) matrix. In practice, we have found th at the change in C f E is miniscule, but we do introduce the notation (jEE’Pure distinguish between the two versions. 3.5 A n exam ple w ith hom ogeneous noise In the preceding section, pure p s e u d o -e s tim a to r s (jEE’P'u re; (jBB,pure^ wj1^cj1 not mix E and B modes have been defined. In this section, these estimators will be studied in detail for a specific mock survey, namely a spherical “cap” of radius r = 13°, with uniform white noise, and a Gaussian beam with 25’ FWHM. Parameters similar to these were originally proposed for the QUIET experiment, which provided the original motivation for the pure pseudo-Q construction. Power spectra will be estimated in “flat” (Cg oc !/£(£ + 1 )) bands with A l = 40. An exception is the lowest band, where we have restricted the £ range to 10 < I < 40; we have found th at trying to estimate power at £ < 10, which is above the survey scale, gives spurious results. We will assess the performance of the estimators by comparing their Monte Carlo covariance with the Fisher matrix, FW = 1 t t ( S 6 (S 0 + N J -'S y fS o + N ) - 1) (3.31) Here, N is the (2Npix )-by-(2Npix ) noise covariance matrix, Sq is the signal covariance m atrix in the fiducial model, and is the signal covariance m atrix associated to each flat bandpower in EE or BB. The Cramer-Rao inequality asserts th at the Fisher m atrix is a lower bound on the covariance of any unbiased power spectrum estimator; conversely, minimum-variance unbiased quadratic estimators (Tegmark, 1997) have a covariance m atrix which is equal to the Fisher matrix, but require dense matrix R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 81 computations which are prohibitively expensive for the survey sizes considered in this chapter. The infeasibility of dense matrix algebra also means th at the Fisher matrix (3.31) cannot be computed in a straightforward fashion. Our method for making the cal culation affordable is to exploit azimuthal symmetry of the survey region and noise. Taking a Fourier transform in the azimuthal coordinate p, the matrices S, N will be block diagonal in the azimuthal wavenumber m (S is still dense in the coordi nate 0), which makes the m atrix operations in Eq. (3.31) affordable. Details of the method are presented in Appendix F. We note th at this method also could be used to make optimal power spectrum estimation affordable, but it applies only for surveys in which both the sky coverage and noise are azimuthally symmetric. Such surveys therefore permit benchmark comparisons between pseudo-Q estimators and optimal, likelihood-based methods, for survey sizes which are large enough th at comparison would normally be infeasible. To use the pseudo-6 ^ method, one must heuristically choose a weight function W(x). We will make a modest effort to optimize this by hand, deferring a systematic framework to §4. The only possibility considered for W (x ) will be cosine apodization, 1 for 9 < r —r* for r —r* < 9 < r \ 0 (3.32) for 0 > r The parameter r* is an apodization length which will be chosen shortly. Note th at this choice of W (x ) satisfies the requirement of §3.4: both W and its gradient vanish at the survey boundary. We obtain the spin-1 and spin - 2 weights W a and Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (Eq. 82 (3.25)) by analytic calculation, rather than finite differencing: r —0 r* r —0 r* (3.33) Qab for r —r* < 9 < r, and 0 otherwise. We note th at the “pure” multipole B ^ e is defined (Eq. 3.24) as the overlap integral between 11°^(x) and the pure B-mode (3.34) +N ,T abaiWcd(x)Yl m (x). It is illuminating to consider the behavior of this mode for varying £ and r*. In Figure 3.1, the mode is shown for (£, m) = (10,1) and (£, m) = (50,1), taking r* = 7°. At £ = 10, the second and third terms in (3.34) dominate the first, and the mode is concentrated in the apodization region (r —r*) < 9 < r, which is undesirable from a signal-to-noise perspective. At £ = 50, the contribution of these terms is comparable to the main term, and the statistical weight is distributed throughout the survey region. This illustrates a general point: the “pure” estimators defined in §3.4 require apodization near the survey boundary. In the tophat limit r* —> 0, the derivatives of W (x ) increase in magnitude; the third term in the mode (3.34) dominates the others, and is concentrated at the boundary. In this limit, B ^ e formally receives no contributions from E-modes, but achieves this using counterterms which are line integrals around the boundary, to cancel the E-B mixing. (Compare Eq. (4) of Lewis (2003).) Naturally, for noisy data, this ruins the performance of the power spectrum R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 83 10 5 0 -s -1 0 IffJP ; Figure 3.1: The pure B-mode (3.34), for (£,m) = (10,1) (left) and (50,1) (right), using apodization length r* = 5°. Only the real part is shown. In the left panel, the second and third terms, which are concentrated in the apodization region, dominate the main term. estimator. In a finite apodization, the counterterms in (3.34) are effectively “smeared” over a nonzero area near the boundary, and the resulting power spectrum estimators are sensible in the presence of noise. By comparison, unmodified pseudo-C^ estimators behave reasonably even for a tophat weight function; apodization serves the milder purpose of reducing Fourier ringing. Another general feature of our method is th at more apodization is optimal at low i. At i — 10, Figure 3.1 shows th at the counterterms in (3.34) are dominant, if r* = 5°. Since increasing r* decreases the magnitude of the counterterms, this suggests th at the optimal apodization length at £ = 10 is greater than r* = 5 ° . In contrast, at £ = 50, Figure 3.1 suggests th at r* ~ 5° is roughly optimal. Next we show the behavior of the transfer matrices K K ^ ure. In Figure 3.2, the m atrix entries are shown for varying b', with the band b fixed at £min = 80, tmax = 120. These m atrix entries can be interpreted as the contribution to (CE E ) and (c EB,pure) from E-mode and B-mode power in the band b'. The small nonzero value of K by Ure represents contamination of the B-mode estimators by E-mode power in a finite pixelization, and goes to zero in the continuum limit N s —> oo. The transfer m atrix formalism can also be used to calculate the total contribution Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 84 “I 1-1--1—I-1-1-1--1-1—I-1-1-1-]--1--1—I-1” bb' 0.1 0.01 0.1 K; r 0.01 ■bb' 0.001 0.001 0.0001 0.0001 K: r re N„,h = 1 2 8 10 - ' (N’ 0 50 100 150 M u ltip o le 1 200 50 =512) 100 150 200 M u ltip o le 1 Figure 3.2: Transfer functions K ^ , (left) and K j^fure (right), shown for varying b', with the band b fixed at £rnm = 80, £rnax = 120. The apodization length used was r* = 5 ° . The m atrix entry K ^ , represents the mean response of the BB estimator in band b to BB power in band b'\ K ^ , represents the response of the BB estimator to EE power. As N sl(ie oo, K ^ f ure approaches zero. to {('jBB’Pure) from all E-mode power in the fiducial model, including power at i > 200. One can think of this as the pseudo power spectrum of E-mode power which is aliased to B by the pixelization. The result is shown in Figure 3.3; for comparison, a gravity wave B-mode spectrum with T / S = 0.01 is also shown. As expected, the aliased power goes to zero in the continuum limit. The level of aliased power, relative to the T / S = 0.01 gravity wave signal, suggests th at detecting such a small B-mode signal requires a massive amount of overpixelization. We will quantify this better in §3.7. We now take up the issue of choosing the apodization length r* in Eq. (3.32). In Figure 3.4, the Monte Carlo RMS scatter of the unbiased estimator (jB B ^Pure js shown for varying r*, in two bands b: the lowest band 10 < £ < 40, and the second lowest band 40 < £ < 80. For both bands, the estimator performance degrades sharply when r* is chosen smaller than the optimal value, but the optimal value is Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 85 0.1 0.1 = 128 0.01 0.01 s id e 0.001 s id e 0.001 /w =1024 0.0001 0.0001 0 50 100 150 M u ltip o le 1 s id e 0 50 10 0 150 M u ltip o le 1 200 Figure 3.3: Contribution to ( C ^ ^ ’pMre) fro: n the fiducial E-mode power spectrum, for zero beam (left) and FWHM 25’ (right). T his can be interpreted as the estimated Bmode power which is contributed by E-mo< les due to pixelization artifacts. A gravity wave B-mode spectrum with T / S = 0 . 0 1 i:? shown for scale. The apodization length used was r* = 5°. r* ~ 8° for the first band and r* ~ 3° fo r the second. This is consistent with the qualitative discussion after Eq. (3.34). For the higher bands, repeating this analysis shows th at r* ~ 3° is roughly optimal. C'ur solution to this problem is to use one weight function W (x), with r* = 8 ° for the lowest £ band, and a different W (x), with r* = 3° for the higher bands. We briefly describe the implementatioi lal details associated with using different weight functions in different I bands. We ei nphasize th at these comments apply to all types of pseudo-Q> estimators, not just th e pure estimators studied in this chapter, To evaluate the estimators, one first compi ites a separate set of B(m for each weight function. Then, in each band b, the pseui lo power spectrum Q, is computed using the Bgm corresponding to the appropriate weight function. Generally speaking, one could use any number N wt < N^an(i of weij £ht functions, at the expense of increasing the computational cost, since N wt sets o: ? spherical harmonic transforms must be Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 86 0.03 0.03 cT 0.02 0.02 T w + 0.01 0.01 0 5 10 r. (d e g ) 0 5 10 r, (d e g ) Figure 3.4: RMS bandpower errors for pure p s e u d o - e s tim a to r s (blue/solid), unmodified psuedo-Cf estimators (green/dashed), and optimal Fisher errors (red/dotted), shown for varying apodization length r*. In the left panel, the band 10 < £ < 40 is shown; in this band, the performance of unmodified p s e u d o - e s t i mators is very poor and the dashed curve is not visible. In the right panel, the band 40 < I < 80 is shown. The noise level used was 20 //K-arcmin. computed. (The increase in the cost is smaller than a factor N wt , since the transforms are 0 { ^ ax). For the specific example in this section, using a different weight function in the lowest band, with i max = 40, has a very small impact on the running time.) When computing the transfer m atrix element one uses the weight function associated to band b. We have now specified the weight functions which will be used for the mock survey in this section. To make a fair comparison between the pure and unmodified p s e u d o -e s tim a to rs , we optimized the apodization length r* independently for the unmodified versions, but found th at a tophat weight function (r* = 0) was optimal in all bands. In the lowest t band (10 < £ < 40), we found th at unmodified pseudo-Cf estimator performance was extremely poor (Figure 3.4). If E-modes are artificially removed from the signal, then it improves dramatically, suggesting th at this is due to R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 87 severe E-B mixing in the lowest band. In Figure 3.5, we have compared (Monte Carlo) RMS errors of pure pseudoestimators, RMS errors of the unmodified p s e u d o -e s tim a to rs , and the Fisher bound (3.31). This was done for two choices of signal and noise level: first, a T / S = 0.2 gravity wave signal and noise level 20 /iK-arcmin, and second, a T / S — 0.05 gravity wave signal and noise level 10 pK-arcmin. For E-modes, the performance of the pure and unmodified estimators is the same, as stated at the end of §3.4, and close to optimal except in the lowest band. For B-modes, the relative performance depends on the noise level: for the first choice, the pure estimators perform slightly better than the unmodified versions, and slightly worse than the Fisher bound, in all bands. The exception is the lowest band, in which the performance of the unmodified estimator is very poor. For the second choice of noise level, the performance of the unmodified estimators has degraded significantly, but the performance of the pure estimators remains about the same relative to optimal. 3.6 A n exam ple w ith inhom ogeneous noise In the previous section, we studied a mock survey with homogeneous white noise. As a first step toward more realistic noise models, in this section we study noise which is inhomogeneous, but not correlated between pixels. We will use the same region as in §3.5, namely a spherical cap of radius r = 13°, but the following special form for the noise covariance: <Q(x)Q(x')) = <f/(x)£/(x')) Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (3.35) 88 50 0.05 W 40 * 0.025 0 0 200 400 600 800 0 150 200 100 150 M u ltip o le 1 200 100 0.02 o 30 T co + 0.01 20 0 0 200 400 M u ltip o le 1 600 800 0 50 Figure 3.5: RMS bandpower errors for pure p s e u d o -e s tim a to rs (blue/left), unmod ified pseudo-Q estimators (green/middle), and optimal Fisher errors (red/right). The survey region is a spherical cap of radius 13° with uniform white noise. The top two panels are for a fiducial model with T / S = 0.2 and noise level 20 pK-arcmin; the bottom two panels are for a fiducial model with T / S = 0.05 and noise level 10 pKarcmin. In each pair of panels, bandpowers are shown for E-modes (left) and B-modes (right). In the E-mode panels, multipoles £ < 250 have been replotted on a log scale for visibility. In the B-mode panels, the lensing component of the power spectrum has been shown separately (dashed); detecting the gravity wave signal requires measuring power in excess of this level. R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 89 Here, r) is a constant with units /iK-arcmin which we will use to quote the noise level. The normalization r / ( l —cosr) is included so th at the total integrated sensitivity will be the same as a homogeneous survey whose noise level is 77. As in §3.5, we will study two sensitivity levels: 77 = 20 /iK-arcmin and rj = 10 /iK-arcmin. Because the noise covariance (3.35) is azimuthally symmetric, optimal power spectrum errors can be calculated using the method of Appendix F, and used to benchmark pseudo-C/ power spectrum estimation. To motivate the form (3.35), consider a single detector which makes constantvelocity scans through the center of the survey region, at a variety of angles. If the distribution of angles is uniform, the scans are rapid, and the timestream noise is white, then the resulting noise in the pixel domain will be given by (3.35). We must first choose pixel weight functions which will be used to construct pseudoCf estimators for the noise (3.35). Considering E-mode estimators first, we have found th a t a tophat weighting is significantly suboptimal at high £, in contrast to the homogeneous case. (A similar result for tem perature estimators appears in (Hansen et al., 2002), where it is shown th at Gaussian apodization improves a tophat weight function in a mock survey with inhomogeneous noise, but not in the homogeneous case.) However, we were able to construct E-mode estimators which were 95% of optimal, for all £ > 80, by using weight functions of the form “ 5 E (S JT ? ( 3 '3 6 ) The form of W t {0) was motivated by the observation that inverse noise weighting, W{6) = l/sin (0 ), should be optimal in the noise-dominated limit; we included a regulator e to smooth the singularity at 9 = 0. The values of e used are shown in Table 3.1. We were unable to achieve the same level of EE estimator performance R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 90 77 = 20 //K-arcmin £ range weight function £ < 440 tophat weighting 400 < £ < 560 W e{d), e = 0.070 560 < £ <680 W e(9), e = 0.025 £ > 680 W e(B), e = 0.010 ?7 = 10 /iK-arcmin £ < 600 tophat weighting 600 < I < 720 W t (Q), e = 0.070 720 < £ <800 Wc(0), e = 0.017 £ > 800 We(fl), e = 0.008 Table 3.1: Pixel weight functions used to construct E-mode estimators for the mock surveys in this section. using cosine apodization (3.32) in place of the apodization (3.36), or using fewer than four different weight functions. This example highlights a practical issue for pseudo-Q power spectrum estima tion: for a given survey, how does one choose pixel weight functions which optimize performance of the estimators? This issue is separate from the E-B separation prob lem which is the focus of this chapter; we have seen here th at it is im portant even for E-mode power spectrum estimation alone. In §4, will investigate this question sys tematically and devise algorithms for constructing optimized pixel weight functions. Turning now to weight functions for B-mode estimators, we have found th at cosine apodization (3.32) results in p s e u d o - e s tim a to r s whose performance (relative to optimal) is similar to the homogeneous case. However, best results were obtained using apodization length r* = r = 13° in all bands, in contrast to the homogeneous case where we used r* = 8° for 10 < £ < 40 and r* = 3° for £ > 40. We have now completely specified weight functions; the performance of the pseudoCg estimators is shown in Figure 3.6. The results are similar to the homogeneous case. At sensitivity rj = 20 fiK-arcmin, the pure B-mode estimators perform slightly better Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 91 than the unmodified estimators, and slightly worse then optimal. At increased sensi tivity r) = 10 /iK-arcmin, the performance of the pure B-mode estimators remains the same relative to optimal, but the unmodified estimators have significantly degraded. Our conclusion in this section is th at inhomogeneous noise, at least for the specific form (3.35) studied here, does not pose a problem in principle for power spectrum estimation using pure pseudo-Q estimators, although the practical issue of choosing suitable pixel weight functions remains. 3.7 D etectab ility o f gravity wave B -m odes In this section, we will study the smallest gravity wave B-mode signal which can be detected using our pure pseudo-C/ estimators, as a function of the noise level. We use the survey parameters of §3.5: a 13° spherical cap with uniform noise, and define p s e u d o -e s tim a to r s using the I bands and pixel weight functions defined in th at section. In the left panel of Figure 3.7, we have considered a “wide-beam” survey with a 25’ Gaussian beam. The lensing B-mode signal is treated as an extra source of Gaussian noise. The minimum (T/ S) which is detectable at l a is given by c r / s ) 1<r= (3.37) where C ^ denotes the bandpower covariance of the estimators (including noise con tribution) in a fiducial model with T / S = 0, and (SE)^ denotes the estimator mean contributed by a tensor B-mode signal with T / S = 1. We note th at (SE)^ can be computed exactly using the transfer m atrix formalism of Appendix D, but C ^ must be computed by Monte Carlo. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 92 50 0.05 40 3i 30 K 0.025 C\i + 0 0 200 400 600 800 50 0 50 0 50 100 150 200 100 150 200 0.02 W 40 a. 0.01 + o 0 200 400 M u ltip o le 1 600 800 M u ltip o le 1 Figure 3.6: RMS bandpower errors for pure pseudo-C/ estimators (blue/left), un modified pseudo-C/ estimators (green/middle), and optimal Fisher errors (red/right). The survey region is a spherical cap of radius 13° with inhomogeneous noise given by (3.35). The top two panels are for a fiducial model with T / S = 0.2 and noise level 20 /iK-arc:min; the bottom two panels are for a fiducial model with T / S = 0.05 and noise level 10 /iK-arcmin. In each pair of panels, bandpowers are shown for E-modes (left) and B-modes (right). Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 93 0.1 1 0.01 0.001 0.01 0.0001 N„ h = 2 5 6 N*h = 1 0 2 4 10-® 0.001 0.1 1 N o ise (/u K -a r c m in ) 10 0.1 1 10 N o ise ( /iK - a r c m in ) Figure 3.7: Minimum T /S detectable at la, for a 13° spherical cap survey with uniform noise, assuming a 25’ beam and Gaussian lensing contaminant (left), and zero beam and no lensing contamination (right). The solid line is the optimal Fisher value, the dotted line is the Monte Carlo value using unmodified pseudo-Q estimators (§3.3), and the dashed lines are Monte Carlo values using pure p s e u d o -e s tim a to rs . For the parameters in the right panel, finite-pixelization artifacts impose a “floor” on the value of T /S which can be detected, but the value can be made arbitrarily small by increasing the resolution. For the parameters in the left panel, we find th at increasing the resolution beyond N s^ e = 256 results in no significant improvement. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 94 In Figure 3.7 (left panel), we show the minimum detectable T / S for both pure and unmodified pseudo-C/ estimators, with the optimal Fisher value (computed using the method of Appendix F) shown for comparison. In all three cases, there is a floor to the gravity wave signal which can be detected, arising from lensing contamination, even in the limit of zero noise. W ith unmodified pseudo-C/ estimators, we find the best possible l a detection is T / S = 0.042, which is reached at a total sensitivity of ~ 10 /iK-arcmin. This agrees well with the results of Challinor & Chon (2005). W ith pure psuedo-C'/ estimators, the best possible lcr detection is T / S = 0.00170, which is reached at total sensitivity ~ 2 /iK-arcmin. For comparison, the optimal value is T / S = 0.00133, or 78%, so the performance of the estimators is close to optimal. These (T/ S) values are specific to the survey region considered here, and depend on the geometry in a way which involves boundary E-B mixing and does not scale as — 1/2 fsky as m°de-counting arguments would suggest (Amarie et al., 2005). In principle, these values of (T/ S ) are not necessarily ultimate limits, since we have arrived at them by treating the lensing B-modes as a Gaussian contaminant which can not be separated from the gravity wave signal. In actuality, lensing B-modes are non-Gaussian, and “delensing” algorithms have been proposed (Hu & Okamoto, 2002; Okamoto & Hu, 2003; Hirata & Seljak, 2003) which exploit this to separate the gravity wave and lensing components, reducing the level of contamination. The amount of lensing contamination which can be removed in this way depends on the noise level and beam size. In this thesis, we make no attem pt to model residual delensing errors realistically. Instead, we note th at the result of delensing will lie between two extremes: no separation of the gravity wave and lensing B-modes (which has just been considered), and complete separation. In the right panel of Figure 3.7, we consider the latter extreme; we completely remove the lensing component of the B-mode power spectrum, and show {T/ S ) \ a as a function of noise level. We have also used zero beam R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 95 size, to crudely reflect the fact th at delensing algorithms require measuring modes at high t to separate the gravity wave and lensing signals at low I. Under these extreme assumptions, there is no limit in our framework to the detectability of gravity wave B-modes as the signal-to-noise improves, since we do not consider other sources of contamination such as astrophysical foregrounds. Using unmodified pseudo-C^ estimators, the smallest gravity wave signal which can be detected is the same as in the left panel, T / S — 0.042; removing the lensing component of the B-modes results in no improvement. Rephrasing, the extra BB estimator covariance which is contributed by E-modes dominates the contribution from lensing B-modes. Using pure pseudo-Q estimators, there is a floor to (T / S ) \ a in any fixed pixelization, but the floor improves without limit as the resolution is increased (Nsi^e = 256, 512,1024 are shown.) This is consistent with the discussion in §3.5; in a finite pixelization there is some contamination of the B-mode estimators by E-modes, but the contamination goes to zero in the continuum limit. For any fixed noise level, one can choose a sufficiently high resolution so th at the contamination is negligible. If this is done, then the value of ( T / S ) \ a obtained is approximately 80% of the optimal Fisher value, for all noise levels considered in the right panel of Figure 3.7. The results of this section illustrate a qualitative difference between our pure pseudo-C^ estimators and the unmodified versions. As the signal-to-noise improves in a fixed survey region, a floor is revealed to the gravity wave signal which can be detected using unmodified pseudo-C^ estimators. This is not the case for pure pseudo-C/s, although a second smaller floor is eventually revealed when the signalto-noise becomes good enough th at the sensitivity is limited by contamination from lensing. Going beyond this will require separating the lensing signal using delensing algorithms. We have crudely studied this regime by assuming perfect delensing and R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 96 zero beam size; under these assumptions, we find th at the estimators alone impose no limit to the value of {T/S) which can be detected. 3.8 Conclusion In this chapter, we have defined pure pseudo-Q estimators, which have the property th at the estimated B-mode power receives no contribution from E-modes, even on a cut sky. As a consequence, E-mode signal power does not contribute to the variance of the B-mode estimators. The pixel weight functions for these estimators have spin - 1 and spin-2 components (3.25), which can be obtained from the spin-0 component by covariant differentiation if its form is analytical, or finite differencing (Appendix B) in general. In contrast to the usual pseudo-C/ formalism, the spin-0 weight function must have a finite apodization near the boundary, and obey Dirichlet and Neumann boundary conditions. We give an algorithm (Appendix D) for computing the pseudoCf transfer m atrix which is used for debiasing, and show th at its computational cost is 0 (£^ia;r), which is the same as the cost of evaluating the estimators. We have studied these estimators in detail for mock surveys on a 13° spherical cap, using both homogeneous noise, and inhomogeneous noise of a specific form (3.35). In both cases, we found th at for B-mode power spectrum estimation, pure pseudo-C/ estimators performed slightly better than the unmodified versions at noise level 20 //K-arcmin, and much better at noise level 10 /iK-arcmin. In the homogeneous case, we considered a wide range of noise levels, and showed th at the pure estimators are ~ 80% of optimal, defined by the degradation in { T / S ) \ a which can be detected. The 80% figure is obtained in two limiting cases: assuming no separation of the gravity wave and lensing signals, and perfect separation. In the latter case, there is no limit, imposed by the estimators alone, to the value of {T/ S) which can be detected. In Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 97 constrast, using unmodified p s e u d o -e s tim a to rs , we have found th at the smallest signal which can be detected in a patch with this geometry is { T / S ) \ a = 0.042, in agreement with Challinor & Chon (2005). Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. CHAPTER 4 A G EN ER A L SO LUTIO N TO TH E E-B M IX IN G PR O BLEM 4.1 Introduction In a finite patch of sky, the E - B decomposition framework needs modification. It is no longer true th at any polarization field can be uniquely decomposed into pure E and pure 5 - modes. A new set of modes, so-called “ambiguous” , needs to be introduced (Bunn et al., 2003). These new modes receive contributions from both E and 5-modes. In practice because the E signal is expected to be so much larger than the B one, the ambiguous modes will be dominated by E signal. Thus all the information about the cosmological 5-modes is contained in the pure 5-modes, which are orthogonal to both pure E and ambiguous modes. The leakage of E signal into 5 can be thought of as an additional source of noise on top of th at introduced by the detector, but a source of noise th at can be eliminated by a suitable choice of variables. Analyzing the d ata of the next generation of experiments is potentially quite chal lenging. Measuring the power spectra using a fully optimal likelihood based analysis requires 0 ( N ^ X) operations and is probably not feasible for this next generation of experiments. The alternative technique, widely used for CMB tem perature experi ments, is the pseudo-Q> quadratic method (Wandelt et al., 1998). Unfortunately in a finite patch of the sky the pseudo-Cys do not isolate the pure 5-modes before con structing the quadratic estimators. As a result the quadratic estimator is contamined by the leakage from the ambiguous modes. This contamination can be removed on average by taking suitable linear combinations of the pseudo-Cy s. However the leaked power still contributes to the variance of the estimators, significantly degrading their 98 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 99 ability to measure the amplitude of the cosmological B-modes. In §3, we introduced an improved technique: pure pseudo-C^ estimators. This technique preserves the simplicity of the pseudo-C /s but at the same time ensures th at no E —>B mixing occurs. In this chapter we will look at the pure pseudo-Q from a new perspective. Our different approach will allow us to understand various features of the pure pseudo-CV’s and also give guidance as to how to choose the weight function th at the technique requires. The main result, a general ansatz for optimizing the pseudo-Q weight func tion given arbitrary signal and noise covariance, is presented in §4.5. Using this ansatz to generate weight functions in an autom ated way, we study several mock surveys, culminating in a realistic simulation of a fiducial experiment, with characteristics based on the upcoming balloon based EBEX (Oxley et al., 2004). Our simulations have inhomogeneous noise, complicated boundaries and point source masking. We will argue th a t the pseudo-Q> estimators give near-optimal power spectrum errors on all angular scales. This resolves a long-standing practical issue in the pseudomethod: the lack of an algorithm for choosing weight functions in complex situa tions. Previous studies have had to rely on optimizing weight functions by hand, using expensive Monte Carlo simulations to compute the estimator variance for each candidate weight function considered. At the same time, we preserve the strong E - B separation of the pure pseudo-Q method, achieving excellent B-mode errors even for the low noise levels anticipated for the upcoming generation of CMB polarization experiments (Oxley et al., 2004; Taylor et al., 2004; Yoon et al., 2006). R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 100 4.2 4-2.1 Prelim inaries S pin two notation First we will introduce our notation. We will follow closely (Bunn et al., 2003). The (linear) polarization of the CMB is described in terms of the Stokes parameters Q and U. The definition of Q and U depends on the coordinate system chosen. In this subsection we review definitions th at are valid for the full sky, so we will use spherical coordinates to define Q and U. We will follow the notation of (Zaldarriaga & Seljak, 1997). The Stokes parameters can be combined to form a spin-2 combination (Q + iU) and a spin-(—2) combina tion (Q — iU). In the full sky these combinations can be decomposed using spin-2 harmonics, Q + iU = y ] &2 ,lm 2 Yl m Q — iU = i Im a—2,lm (—2 (4-1) Im It is natural to introduce a scalar (E) and a pseudoscalar (B) field to describe po larization. The expansion coefficients of these two fields in (ordinary spin-0) spherical harmonics are aE,lm = ~ { a2,lm T a —2,lm)/‘l i aB,lm ~ i( a2,lm ~ a —2 (4-2) These two functions completely characterize any polarization field on the sphere (Zaldarriaga & Seljak, 1997). They are im portant physically because cosmologi cal density perturbations cannot create 5-type polarization while gravity waves can (Kamionkowski et al., 1997b; Zaldarriaga & Seljak, 1997). The spin-2 harmonics in equation (4.1) can be related to the usual spin-0 spherical Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 101 harmonics by means of two first-order differential operators, the spin-raising ( 6 ) and spin-lowering ( 6 ) operators (Zaldarriaga & Seljak, 1997). When applied to the spin-weighted spherical harmonics, these operators yield the following identities: & sYim = [(I ~ s ) (l + s + & sYlm = 1)]1//2 s + l ^ l m ~[(l + s)(l ~ s + 1) ] 1//2 s-lYlm- (4-3) We can show that: XE = [6 6 ( Q + iU) + 6 6 ( Q - i U ) \/2 = - ^ [ ( ( + 2 ) ! / ( i - 2 ) ! ] 1/ V l m i ' l m Im Xb = i[$ 5 (Q + iU) - 6 6 (Q - i U) \/2 = £ [ ( ( + 2)!/(( - 2 ( 4 . 4 ) Im Thus we can take linear combinations of second derivatives of the Stokes parameters and obtain variables th a t depend only on £ or on S . We pause to note th at the reason why E and B are the focus of attention instead of XEi XB is partly a m atter of convention. Perhaps more importantly, E and B have the same power spectrum on small scales as the Stokes parameters, while th at of % differs by a factor of ~ Z4. As a result while white noise in Q and U translates into white noise in E and B , it becomes colored noise for y. 4-2.2 Small-angle approxim ation If one works over a small patch of sky, one can use the small-angle (flat-sky) ap proximation. When working in this limit, it is more natural to measure the Stokes Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 102 parameters with respect to a Cartesian coordinate system (x, y) instead of the usual polar coordinate axis. In the flat-sky approximation polarization is expanded in terms of Fourier modes, C ix i' \ d2l E(l) U( x ) J ■ / (EY- cos 2(p ( —sm 2(f) • + m ^sin 2(f)j cos 2(f) o»lx (4.5) )\ where x = (x, y) and 1 = /(cos (b, sin (p). The differential operators reduce to simply 4-2.3 $ = ~ ( d x + idy), (4.6) & = ~ { d x - i d y ). (4.7) The am biguous m odes On a manifold without boundary, any polarization field can be uniquely separated into an E part and a B part. But if there is a boundary (i.e., if only some subset Q of the sky has been observed), this decomposition is not unique. In this section we summarize the results of (Bunn et al., 2003). Polarization fields living on 0 form a normed vector space with the inner product f (QQ1+ UU')dVL, JQ. (4.8) and we say th at two fields (Q+iU) and ( Q' +i Ur) are orthogonal if their inner product vanishes. We refer to a polarization field as • E if it has vanishing x b -> B if it has vanishing XEi Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 103 • pure 5 if it is orthogonal to all 5-fields, and • pure B if it is orthogonal to all 5-fields. On the complete sky, every polarization field can be uniquely represented as a linear combination of an E field and a B field, and all E fields are perpendicular to all B fields. In other words, the space of all polarization fields is the direct sum of two orthogonal subspaces: the space of all E fields and the space of all B fields. In this case, there is no distinction between an E field and a “pure 5 ” field. But if only some subset of the sky has been observed,so th at flis a manifold with boundary, then this decomposition is not unique. Oneway to see this is to note th at there are modes th at satisfy both the 5-m ode and 5-m ode conditions simultaneously. When we split a polarization field into an 5 part and a 5 part, these “ambiguous” modes can go into either component. In order to make the 5 / 5 decomposition unique, we must first project out the ambiguous modes. The existence of ambiguous modes has im portant implications at the time of measuring the 5-m ode power spectrum. Here we concentrate on “simple” quadratic methods th at make no attem pt to filter those ambiguous modes at the level of the map (such as Wandelt et al. (1998)). A full likelihood analysis would automatically deal with the ambiguous modes but it will probably be computationally prohibitive. It might be feasible for low resolution maps, becoming the method of choice for constraining the gravitational wave amplitude (Efstathiou, 2006). It is easiest to understand the issue to focus on the pseudo-Q> method, but the point applies generally. In the pseudo-C/ method one just measures the power spectra of 5 and 5 as one would do in the full sky and simply masks out the unobserved regions. After this procedure both C f and C f spectra are dominated by 5-modes. An illustrative example is shown in figure 4.1. We considered a circular region 10° in Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 104 radius. The 5 and B power spectra were measured by simply masking the unobserved region but otherwise proceeding as if one had a full sky measurement. The calculation was performed in the flat sky approximation which is adequate for this size region. The input polarization field had 5-m odes only. In the figure we show the measured B mode power spectrum spectra, which is produced by leakage of the 5-modes. For comparison we show the expected 5-m ode power spectrum if T / S = 0 .1 together with the 5-modes produced by gravitational lensing. Also for reference we plot the — 1/2 detector noise power spectrum for wp ' = 5.75 /iK-arcmin and an 8 arcminute beam. This level of noise is within the range of expected values for the upcoming EBEX balloon experiment (Oxley et ah, 2004). The pseudo-6 / method then takes linear combinations of the measured C /’s to eliminate the 5-m ode contribution in the hopes of revealing the cosmological 5 modes. The point is th at this procedure makes the 5-m ode power spectra estimator independent of 5 in the mean, but the variance of the estimator is still dominated by the leaked power. Perhaps an extreme example is helpful to clarify the problem further. Imagine th a t cosmological parameters were known perfectly so th at there is no uncertainty in the 5-m ode spectra. Then the leaked 5 power is perfectly predicted and can be subtracted from the measured signal to leave an unbiased C f estimator. This is analogous to what one does with detector noise. Unfortunately just as with noise the variance of the estimator is increased by the leaked power. The pseudo-Q- technique does even worse than this because it does not assume any knowledge of the C f so it uses only linear combinations th at are on average independent of 5 for any choice of C f . (In reality for a small patch this is not possible and some assumptions such as constancy of the 5 power spectrum in bands must be used. These assumptions are expected to be reasonably well satisfied.) Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 105 10* Lensing + GW Leakage N oise 10- * io-» Lensing 10 ,GW 100 1000 Figure 4.1: Illustration of the leakage due to finite sky coverage in the ordinary p s e u d o -te c h n iq u e . A circular region 10 degrees in radius was used to estimate the pseudo-Q ’s. The leakage curve shows the 5-m odes obtained even in the absence of any cosmological 5-m odes just from the leakage of E. For comparison we also show the expected cosmological signal for T / S = 0.1 and the noise power spectrum for an — experiment with wp 1/2 = 5 .7 5 /iK-arcmin and an 8 arcminute beam. Figure 4.1 illustrates the severity of the problem for experiments with noise levels comparable to those of our fiducial experiment (5.75 pK-arcmin). The leaked power is larger than the detector noise all the way to I ~ 400, so using the pseudo-C/> technique is clearly wasteful. Note th at even on scales where the lensing signal dominates there would be significant degradation. R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 106 4.3 A voiding leakage by m easuring xb When trying to measure the 5-m ode power spectrum in a finite patch there is a simple way to avoid unwanted leakage from 5-modes: estimate the power spectrum starting from the xb map. In the next section we will show th at this is mathematically equivalent to the pure pseudo-C*£ technique. It is important to realize th a t in practice no information is being lost by doing this. By construction xb has no contribution from either pure 5-modes or ambiguous modes. Although in principle there is some information on the amplitude of 5-modes in the ambiguous modes, in practice th at signal is completely swamped by the contribution of the 5-m odes so there is no real loss of information by working with XBPerhaps an analogy helps at this point. One could consider a scalar field, like the CMB tem perature, and wonder whether there is any loss of information if one were to compute the Laplacian of the map and extract the power spectrum from that. In the full sky there would be no loss of information but in a small patch one would lose. In the full sky there is a one to one relation between the Laplacian map and the underlying tem perature map (except for the monopole and dipole which are lost when one takes the Laplacian, but they do not carry cosmological information); the only difference is th at both signal and noise power spectra will be multiplied by £^(£ + l ) 2 which does not alter how well one can constrain each multipole. In a finite patch the situation is different because there are non-zero tem perature fields on the finite patch th at can have zero Laplacian. Those modes are filtered out by the “Laplaciantechnique” with the resulting loss of information. In our case the analogue of the modes th at are being filtered out are the ambiguous modes which we want to remove anyway. Thus for constraining the 5-m ode amplitude, starting with the XB field is sufficient. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 107 Of course the fact th a t the XB maP contains for practical purposes all the cos mological information does not mean th at if we extract its power spectrum in a non-optimal way we will still conserve all the information in the data. If in prac tice (as we will do in later sections) we use a pseudo-Cf technique to extract power spectra, it may be th a t allowing some leakage is still advantageous. As long as the leaked power is small compared to the detector noise it is most certainly harmless. We will see th at it is indeed the case th at small amounts of leakage are preferred by our technique. We will postpone discussion of this point to later. 4-3.1 A very blue spectrum : the need f o r apodization The additional derivatives needed to construct \B make its power spectrum very blue. We will show in this section th at this induces high levels of aliasing unless appropriate apodizations are used. Thus apodization will be an integral part of any technique th a t starts with xbTo understand the issue it is best to write down the formulas in the flat sky approximation. We assume th at the Stokes parameters have been measured in a certain region of the sky and th at XB was calculated by finite differencing. Then the x b fidd is multiplied by a window function W (in real space) and the Fourier components of (W x b ) are measured and squared to estimate the power spectrum. The weight W is zero outside the observed patch. If derivatives are taken using nearest neighbors, one also loses an additional pixel around the edge of the map. Note that this loss is directly related to the counting of ambiguous modes as described in Bunn et al. (2003). The resulting power estimates are the convolution of the XB power spectrum and Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 108 the Fourier transform of the window, 0“ = J f J il l ___ ^ 2 IW - O il2 C f B - (4.9) The XB power spectrum is now so blue th at unless the window function decreases fast with I, all estimates of power will be dominated by the small scale modes. One way of seeing why this is a problem is to consider normalizing the window such th at ||W (0 )|p = 1. The estimator in equation (4.9) estimates the power at 1 but has aliasing from other modes. Even if one considered the idealized situation in which the higher I power spectra were known so th at the aliasing could be calculated and subtracted, the aliased power would still contribute to the noise of the estimator. If the aliased power dominates the integral in Eq. (4.9), the additional variance is large compared to the signal one is after with the resulting loss of information. Thus one would want to avoid having such aliasing if possible. The high I behavior of ||W(1 ) | | 2 is directly related to the number of normal deriva tives of W th a t vanish at the boundary of the patch. In fact lim ||JT(1) | | 2 oc r 2(n+2), I—>00 (4.10) where n is the number of derivatives of the window th at vanish at the boundary with n — 0 meaning th at the window itself is continuous. Thus requiring th at equation (4.9) converges requires using a window th at is smoother at the boundary. For ex ample if we compare measuring the power spectra of the Stokes parameters without apodization to measuring the power spectrum of XBi compensating the additional /4 factor in the XB spectrum requires having both the window and its first derivative vanish. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 109 Reducing the aliasing requires considering weights th at decrease continuously to zero as one gets to the border; on the other hand, if too much apodization is used, then one is effectively using a smaller patch of sky thus decreasing the signal to noise. We will use an improved version of this argument to find an optimal choice of W in section 4.5. Here we just want to mention th at when dealing with XBi because of its very blue nature, we are forced to confront the issue of apodization straight on. At this point the reader may wonder if using XB is such a good idea as one has to take derivatives of the data, deal with extremely blue spectra, etc. In the next section we will show th at our technique is equivalent to the “pure” pseudotechnique. Thus all the intuition we gain focusing on XB whl be directly translated to th at other method. The pure pseudo-Cf formalism also clarifies the issue of statistical weight; we will see that, even though the apodized window may suggest th at the edges of the survey are downweighted, the statistical weight is in fact roughly uniformly distributed for the optimal choice of apodization. 4.4 R elation to th e pure p seu d o-C /s We have now seen th at the problem of estimating the B-mode power spectrum (with out contamination from B-modes) can be reduced to estimating the power spectrum of a scalar field, by passing to the curl X B ■ In principle, any method for power spec trum estimation can be applied to XBi to obtain B-mode estimators which do not suffer from E —> B mixing. We will consider a special case which will be the focus of the rest of the chapter: estimating the power spectrum of XB using pseudo-Cyestimators. Let us recall how pseudo-Q power spectrum estimators are defined for the scalar field XB- First, one computes multipoles of the weighted field (or “pseudo multi Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 110 poles”): ^ = Ta- 1w‘+ d « + 2) / ' A a (,)iy w ^ w (4'n) where W (x) is a heuristically chosen weight function. The issue of choosing IT(x) will be studied in detail in subsequent sections. In (4.11), we have included the prefactor 1/ y/(£ — !)£{£ + l ) ( t + 2) so th a t the normalization will match the B-mode power spectrum C f ^ . We briefly describe the construction of unbiased bandpower estimators from the pseudo multipoles (4.11); for more details see (Wandelt et al., 1998; Hivon et al., 2001). First we sum the multipoles in bins, obtaining pseudo bandpowers ^ dgf y 1 1(1 + 1) ~y*~x 0a 1^21+1 2 tr ^ Im Im £&b m=—£ ,4 1 2 x V4-14) where the index a runs over £ bands. Then we define unbiased estimators by c a = K ^ iC c where N a is the (additive) noise bias of each between pseudo bandpowers - AW (4.13) Ca , and K aa/ is the transfer matrix (4.12) and signal bandpowers A a/. (Moreprecisely, N and K are defined by (Ca ) = K aai A ai + N a .) The preceding construction defines a power spectrum estimator for polarization which eliminates E —> B mixing in the strongest possible sense: the estimated Bmode power acquires no contribution from B-modes. In §3, “pure pseudo-Q estima tors” for polarization, which also eliminate B —►B mixing in this strong sense, were defined. We now prove th a t the two estimators are mathematically equivalent. Substituting the definition (4.4) of xb int° the definition of the multipole (4.11) Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 111 and integrating by parts, one obtains: (4.14) + c'c- where we have defined the spin - 1 and spin - 2 quantities Wi( x) = 6 W ( x ) W 2(x) = 6 d W ( x ) . (4.15) In the form (4.14), it is seen th at a*m agrees with the pure multipole defined in §3, after changing from tensor notation to the spin-s notation used here. This shows that the pseudo multipoles in the two constructions are equal; since the rest of the pseudoC( construction (Eqs. (4.12), (4.13)) is also the same in both cases, this completes the proof th a t the \ B estimator and the pure pseudo-Q estimator are equal. Since the estimators are mathematically equivalent, it is a m atter of preference which to use, and we will switch between the two formalisms in the rest of this chapter depending on which is more convenient. Using the XB estimator has the advantage th at pure B-mode power spectrum estimation is equivalent to power spectrum esti mation of a scalar field with a very blue spectrum. As we will see shortly, this will allow us to formulate a unified procedure for optimizing the weight function W (x), which applies to both pure and ordinary p s e u d o -e s tim a to rs . In the pure pseudo-C^ formalism, the terms have been organized (Eq. (4.14)) so th at the derivatives act only the weight function, rather than the spherical harmonic or the noisy polarization map. Provided th at the weight function varies slowly compared R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 112 to the pixel scale, this can make it easier to take derivatives numerically; one only has to compute W i ( x ) , W 2 (x). The first term in Eq. (4.14) is the usual pseudo- Ci estimator for 5-modes; the remaining two terms are interpreted as higher-spin counterterms which cancel the E —> B mixing. We will see th at this form of the estimator is convenient when optimizing the weight function numerically given a noise map (§4.7). We conclude this section by discussing a general property of the estimator: the weight function must be apodized so th at both W (x ) and its derivative vanish at the boundary. It is illuminating to see how this requirement arises from both ways of defining the estimator. For the pure pseudo- 6 ^ estimator, it is seen (from the mode in brackets in Eq. (4.14), which multiplies the Stokes parameters in a pixel) th at the statistical weight of a pixel is given by a combination of W (x) and its first two derivatives. Therefore, if either W( x) or its derivative have nonzero boundary values, the statistical weight will con tain a delta function on the boundary, which leads to a divergent estimator. (One might try to cure this divergence by defining W \ , W 2 with the delta function terms omitted from the derivatives; the estimator will then be finite, but since the rela tions (4.15) do not hold strictly, the estimator will mix E —> B .) This reasoning also shows th a t even though W( x) is apodized near the boundary, the statistical weight of pixels near the boundary need not be small, since the counterterms W\ and W 2 will be largest there. In fact, for the optimized weight functions th at we will consider in subsequent sections, we have found th at the statistical weight is roughly uniformly distributed. For the XB estimator, the divergence arises in a different way. Although the estimator appears not to involve derivatives of W(x) , the field \ B h-as a noise power spectrum on small scales which grows as t If either W (x) or its derivative is nonzero Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 113 on the boundary, then jjkF(/) 112 decays as 1/T4 or slower for large I (Eq. (4.10)) which causes the integral (4.9) to diverge. From this perspective, the need to apodize is understood as a consequence of small-scale noise fluctuations in \ B with a very blue power spectrum. 4.5 C hoosing th e optim al weight function In pseudo-C'^ power spectrum estimation, an im portant practical issue is choosing the pixel weight function W (x ) to minimize the variance of the estimator. Although no general procedure has been proposed, several rules of thum b for optimizing W (x ) under different limiting conditions have appeared. In the noise-dominated limit, in verse noise weighting {W(x) = l /a^(x)) is optimal; in the signal-dominated limit, uniform weighting is best possible (Challinor & Chon, 2005). If the power spectrum is being estimated on scales smaller than fluctuations in the noise level, the FKP ansatz (Feldman et al., 1994) has become the industry standard for galaxy surveys. The preceding rules of thumb implicitly assume th at the signal + noise power spectrum is white on small scales. However, we have seen th at when choosing the weight function for pure 5-m ode estimation, one includes an extra factor t 4 in the power spectrum. This qualitatively affects the optimization and invalidates the rules of thumb. For example, W (x ) and its first derivative must vanish on survey bound aries, in contrast to the white noise case. In this section, we will describe a general ansatz for the optimal weight function, which makes sense for an arbitrary A'pix-by-Apjx matrix C representing the total covariance (signal plus noise). In particular, it can be applied to a noise power spectrum which grows as any power of £, and so provides a uniform framework for both pure and ordinary pseudo-Cy power spectrum estimation. We will see th at our Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 114 ansatz reproduces the above rules of thumb in the appropriate limits, and correctly apodizes W( x ) if the power spectrum is blue on small scales. First we introduce some notation. The Apix-by-Apjx signal covariance in the bandpower we are estimating will be denoted C a , where a is an index ranging over £ bands, and we normalize it so th at = 1. If M\ and M 2 are two Apix-by-jVpjx matrices, then [M\ * Mq\ denotes the Arp;xby-Apix m atrix obtained by elementwise multiplication: [Mi * M2]ij = (A/i - (4.16) The multiplication is always performed in a basis where each row or column cor responds to one pixel; otherwise the operation defined by (4.16) would be basisdependent. Now we can state the ansatz for optimizing the pseudo-6 ^ weight function which will be the main result of this chapter: the optimal weight function W opt (thought of as a length-Npix vector) is given by W op, = [C“ . C ] - 1 l Vec (4.17) where l Vec deontes the length-A^pjx vector consisting of all l ’s. It may seem strange th at the ansatz (4.17) includes the * operation (4.16) which is specific to the pixel-space basis. Loosely speaking, in p s e u d o - p o w e r spectrum estimation, we are constrained to use a filter (multiplying by the mask) which is diagonal in pixel space, and so the pixel basis is given special significance. Before giving the derivation of the ansatz, we note some general properties. W ithout using the * operation, it seems to be impossible to achieve all of the following at once: R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 115 1. If the noise is uncorrelated between pixels, and the noise dominates, then Wopt corresponds to inverse noise weighting. This follows from Eq. (4.17) upon noting th at C is diagonal in the pixel basis, say C jj = crfSij, so th at (C a *C ) = C and (Wopt)t = <T_2- Note th at in this case, p s e u d o - p o w e r spectrum estimation is optimal, in the sense th at the Cramer-Rao inequality is saturated. 2. If the signal + noise power spectrum grows as a power law I^ on small scales, and A > 0, then Wopt is apodized near the boundary. The number of derivatives which vanish at the boundary is given by (A/2 —1 ), in accord with the discussion at the end of §4.3.1. As the wavenumber £q where one is estimating the power spectrum increases, the apodization length decreases. These statements will be proved in §4.5.2. 3. Eq. (4.17) reduces to the FKP prescription in the regime where FKP applies. This is shown in Appendix G. The matrix form of the ansatz (4.17) may suggest th at A/pjx-by-Apjx m atrix operations are needed to compute the optimal weight function, which would be problematic since such operations are computationally prohibitive in situations where pseudoestimators are used. (If they were feasible, then one could do a maximum likelihood analysis instead.) However, in Appendix H, we will see th at a conjugate gradient approach may be used to compute the RHS of Eq. (4.17) even in surveys where the pixel count is large. R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 116 4-5.1 D e riva tio n Now we give a heuristic derivation of the ansatz (4.17). The idea is to find the weight function W such th at the pseudo-Q estimator (defined in Eq. (4.12)) Ca [d] = diWiC^Wjdj (4.18) ij is as close as possible to the optimal estimator O a [ d] d= X ^ C - ^ C - 1) ^ - • (4 -1 9 ) ij In Eqs. (4.18), (4.19) we have denoted the data vector (a length-A^pjx vector whose covariance m atrix is C) by d. How should “close” be defined? We first note th at both estimators can be written as Monte Carlo averages C o Ml = ^ §>c Oa{d] = ) (4-20> where the notation (-)$a means th at an expectation value is taken over an auxiliary field &a with covariance C a . From the expressions (4.20), a natural definition of “close” is th at the expectation value e ={(E - £ *“c.7dJ I ) ij / ' R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. <4-21) 117 be minimized. Here, the expectation value is taken over both the auxiliary field <&a with covariance C Q and the data vector d with covariance C . We minimize E by setting its derivative with respect to the weight function to zero: i ij d ,$ Q 13 (no sum on k). (4.22) i.e., W opt = ( C * C a ) - 1l vec. This completes the derivation of Eq. (4.17). 4 -5 .2 A variation al f o r m o f the a n sa tz In Eq. (4.17), we have written the ansatz for the optimal weight function W opt as a m atrix equation. There is an equivalent formulation as a variational principle which connects to the discussion in §4.3.1 and will also be directly useful in later sections: W opt is the weight function which minimizes the expectation value of the pseudo bandpower Ca JjWiWj C y C g (4.23) subject to the normalization condition ^ Wi = const., Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (4.24) 118 This is shown by differentiating Eq. (4.23) with respect to IT), obtaining y~] Cj j C f j W j = j 1 (no sum on i) (4.25) i.e., Wopt = (C * C ^ - i l v e c . (Strictly speaking, the RHS in Eq. (4.25) should be equal to a Lagrange multiplier A, but the overall normalization of the weight function is arbitrary.) In the previous subsection we showed th at W opt is the weight function which makes the pseudo-Cy estimator approximate the optimal estimator as closely as pos sible. The variational principle gives another interpretation of W opt which matches the discussion from §4.3.1: W opt is the weight function which minimizes the total aliasing from both signal outside the bandpower and noise. Even if known perfectly on average, say because the power spectrum on other scales was given, this aliasing will contribute to the noise in the estimator thus degrading the signal to noise. The variational principle also gives a simple proof of property # 2 at the end of §4.5, th a t (A/ 2 —1 ) derivatives of W opt vanish at the boundary if the power spectrum is a power law on small scales. From Eq. (4.10), if n is the number of derivatives which vanish, then the contribution to the expectation value (Ca ) from small scales (large I) is given by (4.26) Therefore, n > (A/2 — 1) is a necessary condition for the integral to converge, and so W opt must satisfy this condition, since it is chosen to minimize the expectation value. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 119 4.6 Som e analytic solutions To build intuition for the ansatz (4.17), we give some analytic solutions in the case where the survey region is a circle of radius R and the covariance C is given by a powerlaw power spectrum C , where A = 2 or 4. We assume uniform noise throughout the survey region, so th at the covariance is completely described by a power spectrum, but statistical isotropy is broken by the survey boundary. More realistic surveys and power spectra will be considered in subsequent sections. We use the flat sky approximation throughout. 4-6.1 Case 1: Z2 p o w e r spectrum Let us first consider the case where the power spectrum is proportional to i2 , so th at the operator C is equal to (—V 2). Computing the operator (C * C a ) involves some subtleties, so we show the details explicitly. In the position space basis, the matrix elements of C and C a are given by (C )^ = - V 2S2( x - y ) {Ca)xy = J q{Zq\x ~ y\) ■ (4.27) where £q denotes the wavenumber where we are estimating the power spectrum. From the definition (4.16), the m atrix elements of (C * C a ) are ( C * C a )xy = = [ - V 262( x - y j \ j 0(e0\ x - y \ ) —V 2S2(x — y ) + £ q S2 ( x — y) (4.28) (4.29) Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 120 0.8 ^ 0.8 10R-" 0.6 0.4 0.2 0.2 0.2 0.6 0.4 0.8 1 0.2 r /R 0.4 0.6 0.8 r/R Figure 4.2: Optimal window functions WoPt(r), given by Eqs. (4.31), (4.33) for a signal+noise power spectrum proportional to 12 (left panel) or / 4 (right panel). where we have used Jo(0) = 1, J' ( 0) = 0, Jq (0) = —1/2. This calculation shows that the operator (C * C a ) is equal to (—V 2 + ^q). Combining Eqs. (4.17), (4.28), the optimal weight function satisfies the differential equation (—V 2 + £Q)Wopt(x) = const. (4.30) with Dirichlet boundary conditions (as proved in §4.5.2). The solution to (4.30) is given by ( 4 ' 3 1 ) The solutions to (4.31) are shown in the left panel of Fig. 4.2. For the A = 2 power spectrum, the weight functions are apodized near the boundaries with an apodization length which decreases with increasing £q. This is in contrast to the white noise case where the weight function would be uniform, independently of £q, if the noise is homogeneous throughout the survey. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 121 4 -6 .2 Case 2: /4 p o w e r spectrum We next consider the case of a signal + noise power spectrum which is proportional to C. This case corresponds to pure B-mode estimation in the noise-dominated limit. A calculation similar to (4.28) shows th at the differential equation for the weight function is ( - 'V 4 + 4 ^ V 2 - 4 ) W = const. (4.32) with Dirichlet + Neumann boundary conditions. The solution to (4.32) is w = * + W - r ) h ( e + R ) - e - I Q( £+r) h( £- R) £+I0( £ - R) h ( £+R) - £ - I 0C + R ) h C - R ) opU ; { } where £ ± d= ( 2 ± V 3 )1/2£q (4.34) This solution is shownin the right panel of Fig. 4.2. Relative to the A = 2 case (left panel), the apodization length is larger, and the apodization issuch th at both W( x) and its derivative vanish on the boundary. 4.7 N um erical solutions Let us summarize our results so far. We have shown that E —» B mixing in pseudoCi power spectrum estimation can be eliminated by taking the curl of the map, then estimating the power spectrum of the resulting scalar field. Because the curl operation results in a noise power spectrum which grows as Cg oc £4, the weight function must be smoothly apodized near boundaries in order to control contamination from small-scale power. This technique is mathematically equivalent to the pure pseudo-C^ formalism from §3, in which “counterterms” involving spin-1 and spin-2 weights are added to the Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 122 ordinary pseudo-C'^ estimator to cancel E —> 5 mixing. We have proposed a general ansatz (4.17) for optimizing pseudo-C/; weight functions in the presence of arbitrary signal and noise covariance, and constructed analytic solutions in special cases. For practical application, one needs a method for solving the ansatz numerically, giving a specification of the noise. The most general noise model we will consider is uncorrelated between pixels, isotropic in each pixel, but with an arbitrary pixeldependent amplitude: For this noise model, we have implemented a “black-box” procedure which starts from the noise RMS a(x) in each pixel and outputs 5-m ode and 5-m ode weight functions in each ^ band. The procedure uses the variational principle from §4.5.2, finding weight functions which minimize the quantity (Ca )■ In this and the next section, we will describe qualitative features of the solutions and study estimator performance, deferring implementational details of the method to Appendix H. We work in the pure pseudo-Q formalism, with one minor modification. As we have described it in §4.4, the spin-1 and spin-2 weights are always given in terms of the spin- 0 piece by: W i(x ) = S W ( x ) W 2(x) = c> S W (x ), (4.36) In our numerical optimization procedure, we do not impose Eqs. (4.36) as constraints; each 5-m ode “weight function” actually consists of three independent pieces: a spin0 (scalar) function Wq(x), a spin-1 (vector) field W\(x), and a spin-2 (tensor) field W 2 {x). Note th a t for EE power spectrum estimation, we do not include higher-spin Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 123 counterterms, and so each .E-mode weight function is simply a scalar W(x). In order to differentiate between different flavors of estimators, we will use the following terminology in the rest of this chapter. If both counterterms are absent (i.e., W\ = W 2 = 0), we will refer to the E-mode estimator as “ordinary pseudo-Cy’. We will reserve the term “pure pseudo-Cy’ for the situation where the relations (4.36) hold exactly, e.g. because W (x ) is of known analytical form and W \ , W 2 are obtained by simply evaluating derivatives in each pixel. In this case the E-mode estimator will have zero E —> E leakage (except perhaps from aliasing artifacts in a finite pixelization). Finally, in a case where the relations (4.36) hold approximately, we will refer to the E-mode estimator as “p s e u d o - w i t h counterterms” . This will be the situation for the weight functions produced by our numerical optimization procedure; since the relations (4.36) are not imposed as constraints, they do not hold strictly, and so there will be some nonzero mixing in the estimator. However, we will see shortly th at the relations will hold approximately, and the level of E —> E mixing for optimized weight functions is small in practice. This is because reducing the mixing helps minimize the quantity (Ca ), so our optimization procedure prefers weight functions which satisfy Eqs. (4.36) to a good approximation. In fact, this is the main reason why we have found it convenient to work in the counterterm formalism, rather than passing to the curl x b (which we showed was equivalent in §4.4): the variational optimization procedure sidesteps implementational issues associated with taking derivatives in an irregular spherical pixelization. In Fig. 4.3, we show the result of our optimization procedure for (. = 20, a spherical cap shaped survey with radius 10°, and three choices of noise level: 100, 20, and 5 //K-arcmin. For the largest noise level, both higher-spin counterterms are small and the weighting is roughly uniform. In this regime, our optimization procedure reduces to ordinary p s e u d o - p o w e r spectrum estimation with tophat weighting. For the Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Figure 4.3: 5-m ode weight functions for varying noise levels, for I = 2 0 and a 1 0 ° spherical cap with homogeneous noise. As descrbed in the text, each weight function consists of spin-0 , spin- 1 , and spin - 2 pieces, which we show in the left, center, and right columns. The top three rows show the 5-m ode weight functions produced by our optimization procedure for noise levels 100, 20, and 5 /iK-arcmin (from top to bottom). In the bottommost row we show the analytic solution given by Eq. (4.33). Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 125 50 : E - m o d e p o w e r s p e c tr u m "t 30 0 1000 500 2000 1500 0.015 0.1 "— “ B - m o d e pow er s p e c tr u m T /S = 0 (le n sin g BB) T /S = 0 .0 5 0.08 0.01 0.06 0.04 0.005 0.02 0 50 100 150 200 200 4 00 600 6 00 1000 Figure 4.4: Power spectrum errors for the 10° homogeneous mock survey described in §4.7, using pseudo-Q power spectrum estimation with counterterms (left/blue) or optimal estimators (right/red). For the B -mode power spectrum, we compute power spectrum errors in a fiducial model with T / S = 0 but also show a spectrum with T / S = 0.05 for comparison. smallest noise level, the weight function is apodized, both counterterms are present, and all components of the weight function have nearly converged to the analytic solution (4.33), which represents the limit E E B B . Here our procedure reduces to pure pseudo-Q with weighting th at can be understood by solving the differential equation (4.32). Our optimization procedure therefore smoothly interpolates between ordinary and pure p s e u d o - a s the noise level is improved, “turning on” the higherspin counterterms as they are needed to reduce E —»• B mixing below the noise floor. (We will illustrate this in a different way in the next section.) Next we consider power spectrum errors from our estimators for a spherical cap shaped mock survey with radius 10°, homogeneous noise level 5.75 pK-arcmin and R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 126 Gaussian beam $FWHM = 8 arcmin. These values have been chosen to roughly model the fiducial realistic survey which will be treated in more detail in the next section. We have used an azimuthally symmetric mock survey so th at optimal, or max imum likelihood, power spectrum estimation will be computationally feasible, even though the pixel count is large enough th at this would normally be impossible. The computational speedup is obtained because both signal and noise covariance m atri ces are diagonal in m; for details see Appendix F. This allows us to compare our estimators to optimal in a baseline survey which approximates our fiducial realistic experiment (§4.8) within the constraint of azimuthal symmetry. In Fig. 4.4, we show power spectrum errors for the mock survey using both pseudoC'i with weight functions produced by our optimization procedure, and optimal power spectrum estimators. It is seen th at our method is slightly suboptimal at very low I but rapidly becomes optimal. For example, the B-mode bandpower RMS is 72% optimal in the lowest band, 88% optimal in the second-lowest, and 91% optimal in the third lowest. We emphasize th at this level of performance for 5-m odes is much better than could be obtained at this noise level using ordinary pseudo-Q) estimators (Challinor & Chon, 2005), and th at we have obtained it using a completely autom ated procedure for generating weight functions from the noise distribution. 4.8 A realistic exam ple In the preceding section we considered a homogeneous, azimuthally symmetric mock survey whose noise distribution was much simpler than a real CMB experiment. We conclude by considering a more realistic example. We will use noise maps made from preliminary simulations of the EBEX experiment (courtesy of Sam Leach). The noise map is shown in Fig. 4.5. The simulation used the scan strategy proposed by EBEX R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 127 12.04 ^K-arcmin 6.02 5.25 4.72 4.32 4.01 3.76 3.55 Figure 4.5: Survey region, noise distribution, and point source mask for our fiducial realistic experiment. The side length of the bounding square is 25°. The map is based on preliminary simulations of the upcoming balloon experiment EBEX. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 128 and a realistic focal plane configuration to estimate the number of hits for every 1 arcminute square pixel after a 14 day long duration balloon flight. The details of the scan, focal plane configuration, number of detectors and ultim ate sensitivity are not of particular interest here and are probably subject to change before the flight takes place. However the simulations give a good illustration of the level of complexity of the noise maps and sky coverage for upcoming ground and balloon based experiments. Thus they serve as a nice test bed for our methods. Since the map is not azimuthally symmetric, computing optimal power spectrum errors is computationally prohibitive and we will not be able to compare our method to optimal. Instead we will compare p s e u d o - p o w e r spectrum errors for the fiducial realistic experiment to those obtained for the azimuthally symmetric mock survey in the preceding section, which roughly approximates the sky coverage and noise distribution. In Fig. 4.5, we have generated randomly-positioned point sources with average number density 0.04 deg- 2 , and masked each point source to radius 17 arcmin, cor responding to the 5a level of the beam (assumed Gaussian with % W H M — 8 arcmin). W ith these parameters, 1% of the survey area is excluded by the point source mask. The area th at will need to be cut out to sufficiently mask point sources to measure 5-m ode polarization is at this point rather uncertain. We have based our choice of 1% on point source simulations conducted at SISSA for the Planck satellite reference sky (courtesy of Carlo Baccigalupi). Our choice should thus be seen as an illustrative example to help us understand how our method would deal with the resulting holes. We will analyze the map with and without the mask, to isolate the impact of point sources when measuring 5-modes. For analyzing a very inhomogeneous map such as this, with noise fluctuations on large and small scales, an autom ated procedure for optimizing weight functions is a Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. Figure 4.6: Weight functions for the fiducial realistic experiment. Each of the three rows corresponds to a different bandpower, with multipole ranges (frnin , ^max) given from top to bottom by (30,70), (190,230) and (510,550). W ithin each row from left to right, the E-mode weight function, the spin-0 piece of the E-mode weight function, and the spin-1 and spin-2 pieces of the E-mode weight function are shown. Figure 4.7: A zoomed view of the (Anim4nax) = (190,230) E-mode weight function near point sources. From left to right, the panels show the noise map, the scalar piece of the weight function, and the two higher-spin counterterms. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 130 practical necessity. Using the optimization procedure described in §4.7, we generated weight functions for both E-mode and E-mode power spectrum estimation; some representative weight functions are shown in Fig. 4.6. Considering E-mode weight functions first, our method generates weight functions which are more inhomogeneous than those given by the FK P ansatz (Appendix G) but similar in performance. There is a slight improvement at low £; e.g. for the lowest .E-mode bandpower, the the bandpower RMS given by our estimator is ~ 10% better than FKP. Note th at in §4.7, we did not discuss E-mode weight functions, since they are very simple for the homogeneous mock survey. Turning now to E-mode weight functions, one qualitative feature is th at their statistical weight is concentrated near the center of the survey (compared to the Emode weight functions). This is for signal-to-noise reasons, since the E-mode signal is weak and the center of the survey is least noisy. Another feature of the E-mode weight functions is th at the higher-spin countert erms are relatively smooth across the map, even though the scalar piece of the weight function has structure on small scales matching similar structure in the noise. Instead, our numerical optimization procedure prefers to associate counterterms with “large scale” features such as survey boundaries or point sources, but not with small-scale features in the noise. From Fig. 4.6, one sees th at the counterterms are sourced mainly by boundaries at low £ and by point sources at intermediate to high £. A zoomed-in view of the E-mode weight function near point sources is shown in Fig. 4.7. The behavior of the weight function near the “internal” boundaries of point sources is similar to the behavior near external survey boundaries seen in Fig. 4.3: the scalar piece of the weight function is apodized, and the counterterms are large near the boundary. One consequence of the counterterms being smooth, even though the scalar piece R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 131 E-*B c o n ta m in a tio n ( o rd in a r y p se u d o -C ,) E-*B c o n ta m in a tio n (p s e u d o -C j w ith c o u n te r te r m s ) n o ise p o w er s p e c tr u m le n s in g BB p o w e r s p e c tr u m ___ ____ — ^ i /M ■“N 0.01 + 0.001 0.0001 200 400 600 800 1000 Figure 4.8: Contribution of aliased .E-modes to the pseudo power spectrum C f B for the fiducial realistic experiment with point source mask, with signal and noise power spectra shown for comparison. Using counterterms from our numerical optimization procedure, the level of E —> B mixing is well below the noise floor, in contrast to ordinary pseudo-Q. of the E-mode weight functions have small-scale structure, is th at the relations (4.36) are not strictly satisfied. Therefore, the estimator is not completely pure; there is some nonzero level of E —►B mixing. This is quantified in Fig. 4.8, where we show the mean contribution of aliased E -modes to the pseudo power spectrum Cg. In the pseudo-C^ construction, this contribution is always removed in the mean by the debiasing step, but acts as a source of extra variance. It is seen th at our optimization procedure produces counterterms which reduce E —> B mixing well below the level of the noise, even though strictly speaking, the mixing is nonzero, i.e. the estimator is not pure. (In Fig. 4.8, some “stair-step” features can be seen which arise because we do not generate a separate set of weight functions in every band, e.g. we use the same weight function between £ = 510 and £ = 830 to save CPU time.) Next we study the power spectrum errors which are obtained using these weight functions. In Fig. 4.9, three sets of power spectrum errors are compared. Considering the leftmost set of error bars first (i.e. pseudo-C^ with counterterms and without the R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 132 50 ; E - m o d e p o w e r s p e c tr u m 40 ft- 30 0 1000 500 0.03 0.15 - B - m o d e p o w er s p e c tr u m T /S = 0 T /S = 0 .0 5 01 2000 1500 0.02 S. u„ 0.01 0.05 + 0 50 100 150 200 200 400 600 800 1000 Figure 4.9: Comparison of three sets of pseudo-Q power spectrum errors for the fiducial realistic experiment. Weight functions were optimized independently in each of the three cases. Left/blue: P s e u d o - w i t h counterterms, without the point source mask. Center/magenta: Pseudo-Q with counterterms, with the mask. Right/red: Ordinary pseudo-C^, with the mask. For F'-mode power spectrum errors, the three cases are so similar th at only one set of error bars is shown. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 133 point source mask), we would like to know, how do these errors compare to optimal? As previously remarked, we are unable to make this comparison directly since the fiducial experiment lacks azimuthal symmetry. However, the pseudo-C^ errors for the fiducial realistic experiment can be compared to the p s e u d o - e r r o r s which were obtained previously (Fig. 4.4) for an azimuthally symmetric mock survey with roughly the same sky coverage and noise level. We find th at the two are roughly equal; the RMS bandpower errors for the fiducial realistic experiment for both EE and BB are ~ 10% worse th an the mock survey at low I and ~ 10% better at high £. It seems plausible th at this difference is simply due to inhomogeneity in the noise, and th at our method remains near-optimal for the fiducial realistic experiment, although we cannot say this with absolute certainty since direct computation of the optimal errors is not feasible. Now let us compare the errors with and without the point source mask, for pseudoC(t estimators with counterterms (i.e., the left and middle error bars in Fig. 4.9). We find that the point source mask degrades the errors on BB by ~ 20% for £ < 250 but the effect becomes small for large £, or for EE on all angular scales. Even though a tiny fraction of sky area is masked, we find th at the impact on B-mode errors is not negligible. This is to be expected since the point source mask does decrease the number of pure B-modes in the survey region. However, even taking a conservative estimate for the number density of point sources, we have found th at this effect is not large. Finally, we compare pseudo-C^ with counterterms to ordinary p s e u d o - ( i . e . , the middle and right error bars in Fig. 4.9). It is seen th at at this noise level, including counterterms in the estimator is essential for E - B separation. If ordinary pseudoestimators are used instead, the RMS B-mode errors are worse by a factor ~2-3 at intermediate to high £, and a factor ~ 10 at low £. If this comparison is repeated Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 134 without the point source mask, we find th at the two become equal at high t but remain drastically different at low I. This suggests th at the suboptimality of ordinary pseudo-C^ is mainly due to E —>B mixing from survey boundaries at low i and point sources at high t. 4.9 D iscussion We have given a general treatm ent of E —> B mixing and argued th at when estimating B-mode power, no information is lost by passing to the curl X B ■ The analogue for a scalar field, passing to the Laplacian, would lose the largest-scale modes on a cut sky, but for a spin - 2 field it is the ambiguous modes which are lost in this way. This presents a general approach to constructing estimators which are pure, i.e. which filter out contributions from ambiguous modes. We considered the estimator defined by applying p s e u d o - e s t i m a t i o n to the scalar field x jj, and proved th at it is mathematically equivalent to the pure pseudo-6 ^ estimator from §4.4. The equivalence sheds light on properties of the estimator, e.g. the requirement of a smoothly apodized weight function can be understood as arising from colored noise in xb- Perhaps most importantly, it allows us to consider ordinary and pure pseudo- Ci estimators in a unified way when studying the problem of optimizing the weight function. Our central result is a general ansatz for optimizing the pseudo- 6 ^ weight function. The ansatz can be expressed either in m atrix form (Eq. (4.17)), or as a variational principle: the optimized weight function minimizes the total expectation value (Ca ) of the estimator, subject to the constraint ^2x W (x ) = 1. Although our emphasis has been on next-generation ground-based polarization experiments, and we have only considered noise which is uncorrelated between pixels (Eq. (4.35)), the ansatz is Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 135 Mode-counting estimate (Eq. (4.37)) Homogeneous circular mock survey, optimal estimators (Fig. 4.4) Homogeneous circular mock survey, p s e u d o - w i t h counterterms (Fig. 4.4) “Realistic” noise map, no point source mask, p s e u d o - w i t h counterterms (Fig. 4.9) “Realistic” noise map, point source mask applied, pseudo-C^ with counterterms (Fig. 4.9) Ordinary pseudo- ( T /S ) lff 0.0032 ^(-^lens) 0.095 0.0039 0.095 0.0043 0.096 0.0045 0.082 0.0056 0.0536 0.085 0.258 Table 4.1: Forecasts for the l a upper limit on (T /S ) and the fractional error on the amplitude Aiens of the lensing 5-mode, for various levels of approximation to the fiducial realistic experiment. much more general: in principle it applies to arbitrary signal and noise covariance. In particular, it should be useful for the tem perature power spectrum, but we have not pursued this here. We have shown our the ansatz can be solved numerically for an arbitrary spatial noise distribution, and studied the performance of our estimators for several mock surveys, culminating in a full simulation of a realistic experiment with characteristics based on the upcoming balloon borne EBEX: complicated boundaries, inhomogeneous noise and a randomly generated point source mask (Fig. 4.5). This is summarized in Tab. 4.1, where we have compressed the 5 - mode power spectrum errors obtained for each survey into two numbers: the l a upper limit on (T/ S ) and the fractional error <r(yliens) on the overall amplitude of the lensing B - mode. In computing cr(Tiens), we have used the WMAP3 best-fit model (Spergel et al., 2006) with <jg = 0.74; in a different fiducial model it would scale roughly as cr(Ajens) oc a ^ 1. (We have also treated the power spectrum covariance as Gaussian; including non-Gaussianity in the lensing 5-m ode is expected to give a ~ 10% correction to cr(A|ens) at noise levels of R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 136 our fiducial experiment (Smith et al., 2006) but the effect can become large for lower noise.) As a baseline, we have shown in the first row of the table the naive “modecounting” estimate, given by assigning uncorrelated errors to each bandpower as follows: 1 / C BB Cov(At , Mb) = - f sky £ ( 2 C + 1) \^C J \ 2 + N(J (4-37) where Afy is the noise power spectrum. In the last row, we have shown for comparison the result of using ordinary p s e u d o - e s t i m a t o r s without counterterms. Each row of the table isolates one effect which might be worrying for E - B separa tion: mixing from the survey boundary, suboptimality of pseudo-6 ^, inhomogeneous noise with small-scale features, and the point source mask. Our conclusion is th at each of these has a small impact using our method, although it is worth noting th at when combined, these effects do result in a value of ( T / S ) i a which differs from the baseline mode-counting estimate by 75%. Considering all of these in combination, we have given a complete treatm ent of “geometric” effects which arise from the mask and inhomogeneous noise coupling E and B in a realistic survey. It should be mentioned th at there are other, non-geometric effects not considered here which will have an impact in real CMB polarization experiments, such as ( 1 //) noise, removal of ground-synchronous modes, systematic errors, and foreground con tamination. In particular, the power spectrum errors we have presented in §4.8 should not be regarded as a “bottom line” forecast for EBEX, as these effects have not been included. However, geometric mixing of E and B has been a practical concern for next generation experiments; we have given a general solution to the problem and demonstrated the method for noise distributions with the complexity of a real ex periment. Moreover, we have shown how to generate optimized weight functions in Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 137 a completely autom ated way, thus removing an outstanding practical obstacle in the pseudo-Q method. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. A P P E N D IX A SP IN -1 SPH ER IC A L H A R M O N IC T R A N SFO R M S As discussed at the end of §3.4.2, our algorithm for fast evaluation of B^m requires an implementation of fast spherical harmonic transforms for vector fields. This will also be an ingredient in the algorithm, to be presented in Appendix D, for computing the transfer matrices K ^ f ure. In this appendix, we write down the recursion and initial conditions which are needed: ( l ^ + l ,m ) ^ + l ,m = (i V (Ye% ~ (co s0 )Y ^ — (A-l) h .j = tabiYgf = {- ^ - J (si/ - 1 e y l* [(cos 9)xa + iy«\ Here, (,A<m) = ( 1 /Q V (< 2 - 1) ( < 2 - m 2 )/(4<2 - 1). An alternate approach to implementing fast spin - 1 transforms appears in (Lewis, 2005). 138 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. A P P E N D IX B F IN IT E D IF F E R E N C IN G IN A N IR R E G U L A R SPH ER IC A L PIXELIZATIO N Our definition of B£m (3.24) requires a prescription for computing the spin-1 and spin - 2 weights W a , W ab (3.25) from W ( x ) . If W ( x ) is of known analytical form, then a trivial “prescription” consists of carrying out the covariant derivatives by hand; otherwise, a finite differencing scheme must be used. Since this is nontrivial in an irregular spherical pixelization such as Healpix, we present the details of one such scheme here. At each pixel center x, we define a m atrix Ma^(x) by (B.l) Throughout this appendix, the notation ^ x/ denotes a sum over pixel neighbors x.' to the pixel x, and (x; —x)-1- denotes projection of the 3-vector (x; —x) into the plane perpendicular to the unit vector x. After the projection, (x; —x)-1- is a tangent vector at x, and M is a rank-2 symmetric tensor, as the index notation suggests. Now, if W (x) is any pixelized map, we define its “finite difference” gradient by V f DW'(x) d=Sf M -^ x ) £ ( / ( x ' ) - /(x ))(x ' - For purposes of this paper, we need the following extension. x ) ±b (B.2) If ua(x) is a finite pixelized (tangent) vector field, we define its “finite difference” covariant derivative 139 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 140 by ^ ( P x/ ^ x w(x ') - w(x ))b(x ' - x ) _Lc (B.3) where Vx/ ^ Kv(x') denotes the tangent vector at x obtained by parallel translating v(x') along the great circle arc connecting x and x /. This parallel translation is needed to covariantly difference tangent vectors at distinct points x, x ;. If one were to simply difference their components in the 9, ip coordinate system instead, then the resulting finite differencing scheme would suffer from coordinate artifacts near the poles. Using this finite differencing scheme, one takes the spin - 1 weight W a in Eq. (3.25) to be and the spin - 2 weight Wa^ to be the traceless symmetric part of We have tested this prescription for the case of cosine apodization, by comparing the pure p s e u d o - e s t i m a t o r s which result from finite differencing and analytic differentiation of the cosine weight function. For a fixed random CMB realization, we find th a t the difference between the two versions is negligible. R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. A P P E N D IX C CO RRELATIO N F U N C T IO N S FO R FIELDS OF A R B IT R A R Y SPIN For the transfer m atrix calculations in Appendix D, we need expressions for correla tion functions in terms of power spectra, for all combinations of spin-0 , spin- 1 , and spin-2fields. Wepresent these results in a form which generalizes to higher spins as well. Aspin s field ( —00 < s < 00 ) is a function (s f ) whose value at x depends on a choice of orthonormal basis vectors {ei, £ 2 } at x. Under the right-handed rotation = e '2 = (cos 0 )ei + (sin 0 )e 2 (C.l) —(sin 0 )ei + (cos#)e 2 (sf) must transform as ( s f ) ' = e~is^(sf ) . There is a spin-raising operator 6 and a spin-lowering operator 6 which transform a spin s field into fields of spin (s + 1 ) and (s —1) respectively. In the frame { e i , e 2 } = {xa,Ha}, these operators take the form &(sf) = - s i n s (0 ) &(sf) = - s i n - s (0) ^ + i c s c ( d ) - ^ j sin“ s (0 )(s/ ) - i c s c ( 9 ) - ^ j sins (0)(s/ ) An orthonormal basis for spin s fields is given by the spin harmonics &: Penrose, 1966; Goldberg et al., 1967), which are given for s > 0 (Newman by ,Ytm = \ l w T § . (6)S{Ylm) 141 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (c'2) 142 These are nonzero only for £ > |s|. Correlation functions between fields of spin s, s' can be expressed in terms of the following function, which generalizes the Legendre polynomial P^(x • x 7) in the case s = s' = 0 . P f V x ') # ^ (S^ m (x ){ e1,e2}={X,y})* m = —£ X (slY£ m ( x ' ) { e i,e2}={X',F'}) (C-3) We emphasize th at the spin harmonics on the right-hand side are evaluated in the “two-point” frame { e i , e 2 } = { X , Y } , { e^ e^ } = {X 7, Y7}, not the frame { e i , e 2 } = { %a,Ua}• W ith this choice of frame, the right-hand side is rotationally invariant and therefore depends only on the separation (x • x 7), as implied on the left-hand side. It can be shown (Ng & Liu, 1999, Eq. (3.17)) th at P^s is related to the reduced Wigner D-function d{OU , by PVf s’(cos 9) = ( — 1 )sd{,(9). o Using standard results on Wigner D-functions (Varshalovich et al., 1988), one obtains the spin-label symmetries, orthogonality relation, and product rule: Pess\ z ) = p f s (z) = f\zp g {z)p g {.z) ( - 1 )1+sp Y = s\ = ( - l ) Sl + * i + S 2 + 4 ^ ( 2 ^ 3 + 1) £3 l\ £2 \ -4 -4 z) (C.4) 2^ (C.5) / Pg4(z)Pt*4(z) - £3 X \ £\ i2 £3 -si s2 si -I- S2 p si+s2,s[+s^z) si + s'2 J Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. ) (C6) 143 Likewise, the following recursion and initial conditions may be used to evaluate Pj/S . Using the spin-label symmetries (C.4), it suffices to give initial conditions for the recusion assuming s > s' \: P lV m W — (2 ^ + 1 ) z In the recursion (C.7), we have defined = y/ — s 2 ) ( ^ 2 — s'2)/£. The recursion (C.7) with accompanying initial condition (C.8 ) is the main result of this appendix. We have used the language of spin-s fields, which permits the result to be stated in a uniform way for all spins. Since the transfer m atrix calculations of Appendix D use the (equivalent) tensor language in spins < 2, the rest of the appendix is devoted to translating between the two. As a byproduct, we will obtain expressions for the correlation functions between all combinations of fields with spins s, s' < 2. In the case where both spins are nonzero, it will be convenient to use the linear combinations (C.9) instead of P f s' , Pg'~S ■ The basic observation which relates the tensor and spin-s formalisms is th a t the frame-dependent vectors m=(ei+ie2)/2 m = (ei —ie2)/2 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (C.10) 144 have spins 1 and -1 respectively. (This notation follows Okamoto & Hu (2003) but our normalization differs by a factor l/y/2.) In terms of these, the vector and tensor harmonics Y ^ , Y ^ can be written Y (£m)a 'E Y d(£m)bc m )b r = ( - l Yt m ) ™ a ~ {lY l m ) m a = - ( - 2 Yern)m bm c - { 2 Yern)m bm c- (C.ll) These are parity-even combinations of {ma,m a} and {mjmc, m^nic}. (n q\ of P f s : The following sums can be evaluated using (C.ll) and the definition (C.3) (o. 1 2 ) 47T m ~ m E 4^r ,,/x E YL ^ m ) b ' A ^ ) m 2^+1 = — £ r Pe2(*)Q,bH m Y YU mi)a^ ) * M' Y((tm)bld^ L w A * ' )' ~= m rE , ,/X _ E^fmlfecW^fmlfc'c'^) _ m ^ 1 (Q1 e 2( z ) X aQ'b,d + R}2(z)YaU ^ ) 2 ^ + 1 4?r {Qf^QbcQ'b'd + RT (z)QbCUy^) To evaluate analagous sums containing Y ^ or Y ^ , note th at the 90° vector rotation Va —►—eaf)V^ sends Y ^ —>Y ^ , X —> Y , Y —> (—X); and the 45° tensor rotation n a6 -*■ ( - eac n cfe ~ e6cn ca)/2 sends Y ^ -»■ Y ^ , Q -> {/, £/ -»■ (-£?). For example, the following sum will be used in Appendix D: E Yi Z ) a ( A y {f m )bcA ' ) = m ~ R f W a Q 'v A R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. (C.13) 145 It is obtained from the fifth equation in (C.12) by applying a 45° rotation to the indices be, and no rotation to the index a. As an application, using sums of the form (C. 1 2 ), one can write down all corre lation functions between Gaussian fields with spins < 2 which arise from parity-even power spectra: <r T > = (x x > = (c .i4 ) 47T + I (TX ) = I I (TQ) = I <w> = EI 2^ t (c ! e Q? W+cfBR f M) <x = EI t ~(cF«12(z)+ <u ir> = E ^ f F ^ f w + c F o f w ) t (Y U > = E ^ 47T tF ^ w + c p ^ w ) I Here, correlation functions on the left-hand side are defined in the “two-point” basis from §3.2; e.g. (XQ) denotes the correlation between the X a component of a spin-1 field and the Q'a^ component of a spin - 2 field, at points x, x 7 with separation z = (x-x7). Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 146 Correlation functions for parity-odd power spectra are given by: <T y > = (c .i5 ) i = <X Y > 4?r E e <xf/> = E ^ ( c?B«f2W- c f E RP( z)) t = E ^ ( C f £«<2w - cfBflJ2W) I = E ^ 47T I c f BM ? w - - R ? w ) For spins 0 and 2, these results have already appeared in the literature (Zaldarriaga, 1998; Tegmark & de Oliveira-Costa, 2001), with the following explicit formulas for P®2, Q22, P 22 used in place of the recursion relation (C.7): p 0 2 (2) = * - (lE + ^ ) [(<-l)<(<+l)(<’+ 2)]1/2 q22(2) — !)£{£ + 1 )(^ + 2 ) ™ R l ^ (£ + ~ ~ 4 2) P j_ , (z) - ( £ - ( £ - l ) £ ( £ + l ) ( £ + 1) z P j{ z ) 2 )(l - z 2) (C'16) (See also (Challinor & Lewis, 2005) for some spin - 1 equations.) The advantage of the present treatm ent is th a t it generalizes straightforwardly to all spins, including spins higher than 2 , and allows use of a single uniform recursion relation (C.7) rather than many explicit formulas of type (C.16). Our approach is similar to Ng & Liu (1999), who compute correlation functions covariantly for spins 0 and 2 . Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. A P P E N D IX D C O M P U T IN G T R A N S F E R M ATRIC ES There is still one ingredient missing from our formalism: an algorithm for efficient calculation of the transfer matrices K ^ f ure, which are needed to debias the power spectrum estimators (Eq. (3.10)). We have found it more convenient to compute transfer matrices using correlation functions than in harmonic space. Since this ap proach has not been used previously in the literature, we first illustrate the method for the simpler case of tem perature pseudo-C^ estimators (§D.l), before treating the case of pure polarization estimators (§D.2). In §D.3, we address implementational issues and show th at the computational cost of computing the transfer matrices is o (cL x )- D .l Transfer m atrices for tem perature p seu d o-C /s In the temperature-only case, p s e u d o - e s t i m a t o r s are constructed in a way which is completely analagous to the polarization case (Eqs. (3.8)-(3.11)). We briefly sum marize the necessary definitions; details can be found in Hansen et al. (2002). Pseudo multipoles and power spectra are defined by Ttm = ^T(x)W (x)yta (x)* (D.l) X 5 7 7 = a (d .2 ) m = —l Assuming no noise bias, the (£max) -by- (£max) transfer matrix Kfji is defined by (C7) = Ka c y V 147 R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. (D.3) 148 The purpose of this section is to compute K ^ . We first establish a key result: evaluation of a sum of the form S = ^W(x)F(x-x')W(x'), (D.4) XX' where F ( z ) is a polynomial in 2 of degree < (-max- Expanding F in Legendre poly nomials, F(z) = Yhi<l TTVCLCC FgP^z), we evaluate the sum as follows: S = J2 W(x)W(x!)FiPe(x • x') xx' (D.5) Aqr = E 2lTTH/(x)H,(x')F*Y«>"(x)*y<™ (x') x x 'm = E £<£„ m It will be useful to rewrite this as an integral involving F(z) instead of a sum involving F f S = 87 t 2 J d z ( W W (z)F(z), (D.6) where (W W {z)i l f 1 J2 \Wtm\2Pt (z) (D.7) £ < lmax m can be thought of as an estimator of the correlation function of W(x), pretending th at W{x) is a Gaussian field on the full sky which is zero (by coincidence!) outside the survey region. Armed with the key result (D.6), the transfer m atrix is calculated as follows. By definition, the m atrix element Kggi is the expectation value of C j T (Eq. (D.2)), given Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 149 the signal covariance (T (x)T (x/)) = (2£f + 1 ) /( 47 t)P^/(x • x'). Therefore, Ku = 11+ = 27r £ W ^ ) W ^ ' ) Yl m Y Y Yl m ( ^ Y P^ (x -x ') xx'm d2 W - ( D -8 ) We have now arrived at our desired expression for Kpji. We defer the issue of efficient evaluation of Eq. (D.8 ) to §D.3. For now, we note th at Eq. (D.8 ) is equivalent to the form commonly seen in the literature (e.g. Hivon et al., 2001, Eq. A31), 2 £ £! £" 47T (D.9) 0 l"m" 0 0 by virtue of the definition (D.7) of ( w w (z) and the identity / J £ £! £" (D.10) dz P i(z )P ti(z)Ptn(z) = 2 0 D .2 0 0 T he transfer m atrix K ^ f ure Computing the transfer m atrix K ^ f ure defined in Eq. (3.29) is completely analagous to the temperature-only case, but the bookkeeping is considerably more complicated owing to the presence of spin-0, spin-1, and spin-2 weights W (x), W a(x), W}JC{x). We will need to use the results in Appendix C, in which it is shown how to compute correlation functions between fields of these spins from all parity-even power spectra. Throughout this section, we will use the abbreviated notations Ylm =f Ytrn (x ) Ylm =f Ylm (x ) z =f (x • x') Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 150 in equations which contain a pair of points x, x 7. We first consider an analog of the “key result” Eq. (D.4), in which F is replaced by a 3-by-3 m atrix which contracts all combinations of spin-0, spin-1, and spin-2 weights: W(x) S = X J ( W (x ) ^ a (x ) W bc(x) ) F ( x ,x /) (D .ll) W a' (x') ^ W h'c\ x ' ) ) We make the assumption th at F is built from parity-even power spectra which are zero for multipoles larger than £max■ (This is the analog of the assumption, from the preceding section, th at F ( z ) is a polynomial of degree < (-max-) To be precise, let F f w , F f G, F f E , F GE, F e e , F GC, F g b , F b b be arbitrary power spectra. Note th at we denote the spin-0 “field” by W , and consider only parity-even combinations of fields. Then F is assumed to be of the form, ( jpWWy* v 1 p£ F (x ,x ') E £“ E£max m ipWGy* y/G I £m (£m)a’ jpGGyG* ylG £ (£m)a (£m)a1 -pGEyE* y/G I (£m)bc (£m)a' £m £m 77'WGyG* \rl £ (£m)a £m pW EV E* \rl £ (£m)bc £m / jpWWy* y /E \ £ £m (£m)b'd zpGEyG* y /E £ (£m)a (£m)Vd rpEEyE* y /E £ (£m)bc (fmlbV / 0 0 0 n jpCCyC * y/C £ (£m)a (£m)a' TpCByB* y/C £ (£m)bc (£m)a' xpCByC* y/B £ (£m)a (£rn)b'd zpBByB * y/B £ (£m)bc (£m)b'c? n \ (D.12) ) Note the bi-tensor structure of F: each m atrix entry carries a set of tensor indices which contract with the weight functions in (D .ll) so that S is a scalar. The significance of the sum (D .ll), with F-m atrix of the form (D.12), will appear shortly. For now, we forge ahead and evaluate S, using the method of the preceding Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 151 section. First, plugging the form (D.12) for F into the definition (D .ll) of S, we get: ( pW W = £ ( tm W?m < . pWG pW E \ pWG pGG pGE pW E pGE pE E / C V ( pCC WSn 7>BB V \ J \ pCB \ t?BB W(m w ,B * (D-13) J This expression for S is the analog of Eq. (D.5) from the preceding section. Our goal is to get an expression analagous to Eq. (D.6 ), in which the functions f W X {z), ... }. appear directly instead of the power spectra (z), . .. }. We first define correlation functions which are constructed from the weight functions in the same way th at one constructs correlation functions from parity-even power R eproduced with permission of the copyright owner. Further reproduction prohibited without permission. 152 spectra (Eq. (C.14)): -WW h (D.14) Y.W 'tm W tm Pt) lm »XX lm cw x - s B r t f f 1) lm cY Y ^ 1+ C k'&g J1) lm lm CQQ i E ( « « f +wfcwL*P) lm (XQ ^ Z ( w £ w fmQ}2 + w ' S w ' M 2) lm cUU YU C t E<<>*wfmRf +w?ZwLQ?) 47T it E f^ m + WeZ Wh n Q f ) lm Eq. (D.13) can then be rewritten as an integral containing the functions { f W W (z), f w x (z),...y. s = 8x2 f 1 dz t w w ( z ) f w w (z) + 2Cw x ( z ) f w x (z) + 2t;WQ( z ) f WQ(z) +CX X (z)SX X (z) + ( Y Y ( z ) f Y Y (z) + i Y Y ( z ) f Y Y (z) + 2CX % ) / X % ) + 2CY U ( z ) f Y U (z) + < « « ( * ) / « « M + ( V U { z ) f V U (z) (D.15) To show this, it is easiest to work backwards, substituting Eqs. (D.14) into Eq. (D.15), and ending up with Eq. (D.13). One writes Q fs\ R f s> in terms of Pj?s> (Eq. (C.9)), and uses the orthogonality relation for PgS (Eq. (C.5)). Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 153 We have now arrived at our desired “key result” : an expression for S in terms of the functions { f W W (z), f W X (z), . . We now proceed to calculate the transfer matrix element K ^ f ure. (The case of K ^ f ure will be treated shortly.) This matrix element is the expectation value (C i d e / , r \ T i d ! e (x')) 'u J w _ {Ude(x)Ude = ^ BB,pure\ (.R given the signal covariance f ( z ) Q d e Q ,d'e' + Q f { z ) U d e U ,d' e'). (D.16) A short calculation, using only the definition of c ^ B,pure (Eqs. (3.24), (3.28)), shows th at ( Kt r e = E ( W (x) W a (x) W 6c(x) ]F ^/(X ,X /) W (x) W xx' y ^ (D.17) a\-x!) w 6V(x ') j where the F-m atrix is given by F « /(x ,x ') = A liiy [ R f ( z ) Q ^ Q ld'‘ ' + Q f ( z ) U d ‘ U " t <! ] \rlB (£m)d'e' B* (£m)de Y; £ E N'eTdj Y (^ )f N ' F M , ' Y $ n)r m=—£ \ Y £YdebcY (m x J \ N £Td!e'b'dY£m We will evaluate the right-hand side of Eq. (D.17) using the key result (D.15), but first we simplify the F-m atrix using identities from Appendix C. We show the details for the ( 1 ,2 ) m atrix entry, / ^ r>(T2) at/ 2£ -)-1 r p22f„\f\der\id!e' , r\22(„\riderr/d'e'~\ \ ' -i/-B* rp {£m)f U' ~ £n ( 2 e + l ) L-R V Q U J 2 ^ Y{£m)deTd'e'a'J Y(£m)f m=—£ (D.18) Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 154 To do the sum over m, we use identity (C.13), obtaining It is then straightforward to do the index contractions and obtain: (D.20) This procedure can be used to simplify the remaining entries of F ^ /; when the dust has settled, one finds (D.21) We would like to apply the key result (D.15) with F-m atrix given by this form, but first there is an annoying technicality: our derivation of Eq. (D.15) assumed th at the F-m atrix was built from parity-even power spectra, in the sense th at Eq. (D.12) is satisfied. We claim th at this is so for the F-m atrix in (D.21), with £max = £ + £'■ To make this statem ent intuitively plausible, note th at the F-m atrix in (D.21) is a rotationally invariant, parity-even object which is constructed by multiplying objects of spins £ and t ' . A formal proof can be given by writing Q fs' , R f s' in terms of P^s' Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 155 (Eq. (C.9)), and using the product rule for P^s' (Eq. (C.6 )). This can be done one m atrix entry at a time; again we supply the details only for the ( 1 ,2 ) m atrix entry: r( U ) •W ( * '* ') = Nf - j 0 j r ( Q ? Q ? + R ? R ? ) K > = N ' i % ^ ( pt 2pf E \i-e'\<e»<e+£' +plr (D.22) l p ^ ) x 'u' .(2£f + 1)(2£" + 1) (47r)^ I X £' £ plO y> a' , 2 £’ E + 1 £ £’ £ m x Yp'm" (x /) (1 21 We have shown th a t F ^ ,’ ' is of the form which appears on the RHS of Eq. (D. 1 2 ), with £max = £ + £'. W ith this final technicality out of the way, we can use the key result (D.15), with F-m atrix given by (D.21), to evaluate the right-hand side of Eq. (D.17). This gives the transfer m atrix in the form: K +Pure = 27T J ^d z ^ ( M z ) Q p ( z ) + B £ (z)R ft (z )) Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (D.23) 156 where the functions Ap(z), Bp(z) are defined by A e = ( w w Q f - 2 N e, CW X Q}2 + 2N£CWQP P + n £2(cx y y rV) x q\1 + C (D.24) - 2N£N £, ( XQ P$1 + NfcQQPe° B£ = Cw w R f - 2 N j , ( w x R e12 + N £2(( x x R e11 + ( y y Q}1) (D.25) —2N£N £( y u Pp1 + N 2( u u Pp® To compute AT^fwre, one modifies the signal covariance (D.16) by exchanging Q22 <->• R 22; this modification carries through to the end of the calculation and shows: K e ? Ure =2nJ l dz ( ~ 4 / 1) + B e(z ) Q } h z )) (D.26) Eqs. (D.23) and (D.26) are the main results of this appendix, and show how to compute the transfer matrices K ^ f ure for any choice of weight functions. We emphasize th a t even though the derivation is lengthy, the final result is simple. Each term in the integrals (D.23), (D.26) is a product of three functions: a generalized Legendre polynomial of degree a generalized Legendre polynomial of degree £, and a correlation function { ( W W (z), ( w x (z), . .. } which depends only on the pixel weight functions (D.14). The final form of the transfer matrices is simpler in position space than in harmonic space, where the three-way multiplication would be replaced by a sum involving 3j symbols, similar to Eq. (D.9), but with many terms. Finally, we discuss the case in which multipoles are binned into bandpowers. In this case, the bandpower transfer matrix K ^ is related to by: K ^ , = PbpK^jjPp/y, where the matrices P and P define the binning (§3.3). For each band- Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 157 power, we define “binned” versions of the functions Ag(z), Bg(z), Q“ g?(z), R p ( z ) : A b(z) = Y, P b t M * ) B b(z) = I Y (D.27) P btB dz) I Qv(*) = Yff (~ 1 7 ^ ) pevQe(*) v / M*) = Yt, V / Pt>vR?(z) In terms of these, the bandpower transfer matrices are given by: D .3 K W UTe = 2* f K W * ‘ = ^ j \ z i z ( ( M z ) Q b '( z ) + B b( z ) R u { z ) ) A + B b( z ) Q y ( z ) ) b( z ) R y ( z ) (D.28) Efficient calculation o f th e transfer m atrix integrals We have now shown th a t the transfer matrices K ^ , can be represented in integral form (D.28). In this section, we give an algorithm for evaluating the integrals to machine precision, whose running time is 0 ( 1 ^ ^ ) . Here, £max is the largest value of (£ + £') for which a transfer matrix K ^ , must be computed. This is the same cost, within a constant factor, of evaluating the estimators once (§3.4.2). The first observation is that, when computing correlation functions { C ^ ^ (-2)? ^ 1 7 I(Z), j. usjrig Eq (D.14), it is only necessary to sum over multipoles £ < £max- This is because, as argued at the end of the preceding section, the F-m atrix can be w ritten in the form (D.12), and by Eq. (D.13), only multipoles of the weight functions with £ < £max contribute to S. The second observation is th at the integrands in (D.28) are polynomials of degree < 2£max. This can be seen term-by-term after plugging in the definitions of Ag{z), Bg(z). For some terms, such as ( w w (z)Q‘p ( z ) Q ^‘ ( z ) , all three factors are poly Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 158 nomials; for others, such as £ (z)Q^2(z)Q‘p ( z ) , the last is a polynomial and the first two are polynomials times V l — z 2. There are no terms with an odd number of V l — z 2 factors. Because of this second observation, the integrals can be done exactly using Gauss- Legendre quadrature (Press et al., 1992, §4.5) with {(-max + 1) points. Our algorithm for evaluating the integrals is therefore given as follows. First, we compute spherical harmonic transforms {Wgm , W7^ . W^m , W ^ , W ^ } of the weight functions. Second, we compute the correlation functions { ( W W {z), ( W X (z), ... } at each of the {(max + 1) quadrature points, using Eq. (D.14). Third, we compute the 4A^an^ values { A ^ z ) , Bb{z), Qb{z): Rb(z)} at each quadrature point z, using (D.27). Fourth, we loop over bands b, £/, computing K ^ f ure by doing the integrals in (D.28) by Gauss-Legendre quadrature. Let us consider the running time of each of these stages. The first stage is 0{fimax), using fast spherical harmonic transforms. The second stage can be done in time 0{£max) by first summing over m in Eq. (D.14)) at fixed I and then summing over I at fixed z. The third stage is 0{Nband(max)> evaluating the generalized Legendre poly nomials by recursion (C.7). The fourth stage is 0 { N 2and( max) since Gauss-Legendre quadrature is 0 { ( max) once everything has been precomputed at the quadrature points. Putting this together, and noting th at Nband < (m a x , the computational cost of computing the transfer matrices K ^ f ure is 0{(„iax). Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. A P P E N D IX E C O M P U T IN G NO ISE BIA S In addition to the transfer matrix K^jfure, the estimators presented in this paper also require computing noise bias terms N E E , f^BB^)ure (Eq. (3.29)). For real ex periments, which include such complications as 1 / / noise, noise bias for all types of pseudo-Q estimators must be computed by Monte Carlo. Indeed, a practical ad vantage of the pseudo-Cy framework is th at unbiased estimators can be constructed given only Monte Carlo simulations of the noise; no other representation of the noise covariance is required. However, for theoretical studies, it is convenient to have an exact formula for the noise bias in simple cases. In this appendix, we consider noise which is uncorrelated between pixels, and isotropic in each pixel, but not necessarily homogeneous: (Q(x)Q(x)) = {U(x)U(x)) = c^x)2^ / (E.l) Here, we represent the noise by its per-pixel RMS tem perature cr(x) (i.e., units fiK rather than /iK-arcmin). W ith noise covariance given by (E.l), the noise bias defined in (3.29) is given by fiEE = (E.2) x pjBB,pure = ^ X W (x ) 2 + i v 'V a (x)W a (x) + 2N$Wbc( x ) W bc(x) 47T This is derived starting from the definitions of E^m , B i m (Eqs. (3.8), (3.24)) using identity (C. 1 2 ) from Appendix C. 159 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. A P P E N D IX F FISH ER M A T R IX EVALUATIO N W IT H A ZIM U TH A L SY M M E T R Y In this appendix, we present the details of our method for fast exact evaluation of the Fisher m atrix (3.31), in the case of inhomogeneous, but azimuthally symmetric, noise. A similar method, in the context of CMB tem perature, appeared in (Oh et al., 1999). Since we only consider uncorrelated noise in this paper, the noise covariance can be written <Q(x), Q(x')) = (£/(x), f/(x')> = i((S)2<S<2>(x - x'), (F.l) where rj(9) is arbitrary. If we change variables from {Q, U} to i U = (Q ± iU), and Fourier transform in the azimuthal coordinate </?, by defining ~ /‘2 7r n±(0)=/ dv [Q±iU}(e,<P)ei’nf J0 (f.2) then the noise covariance (F .l) is still diagonal (in both m and 9): (iC(tf)n+(s')) = (n-*(»)n-,(«')> = - «')W (f .3) The point of this change of variables is th at the signal covariance is also diagonal in m (but still dense in 9). Using the results of (Zaldarriaga & Seljak, 1997), one can 160 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 161 show th at the signal covariance is given by: n + * (0 ) n™*(0 ) \ n m' + ,(0') KJ n m! _ ,(0') v ' xEe ' - 2 ,- i X t n (»•»') ^ ^rnm! / ^ (F.4) where we have defined SS ‘ x f j e , s') =f (cfE± c f B)(sYtm(e, mAimtf, o » (F.5) Eqs. (F.3) and (F.4) can be used to efficiently calculate the Fisher matrix FbV = iTV S^So + N r 'S y fS o + N ) - 1) (F. 6 ) For concrete calculation, the continuous coordinate 6 is replaced with a set of N equally spaced values 6\, O2 , ■■•, with spacing A 6. The (2iV)-by-(27V) noise and signal covariances of the variables { II^ ( 0 i), • • •, 11^ ( 0 ^ ) , n ^ ( 0 i), • ■•, I I ^ ( 0 jv)} are then given by N (m ) 4 -7T ( V(9i)2 X sin(0j) lJ A6 0 \ ' c(m ) ij ^ E i 0 ^ (F.7) V(6i)2 s . sin(flj) ll ) - 2 ,- 2 * ,+m(S„S,) 2 - 2 X i m (ei ,ej ) - 2 ,2 * ^ ,« ,) N 2,2x em (oi ,oj ) R e p r o d u c e d with permission of the copyright owner. Further reproduction prohibited without permission. 162 The Fisher m atrix (F. 6 ) is given by summing over m and tracing over 9: -i Fw = E 5 771— = ,&(s£m)(sjjm )+ N ) - 1 S<,m)(skm )+ N ) - 1) (F. 8 ) T Y lm a x i l r f s f (S<0) + + N ) - 1) 'W'Tnax + J 2 T r(s[m)(s[,m) + N ) _1s [7 l)(S^m) + N ) _ 1) m=1 where “Tr” is a (< 2 N)-hy-(2N) trace. Evaluating the spin harmonics by recursion in t, the computational cost of each Fisher m atrix element is 0 ( N Q m max), versus O(Npix) for brute force calculation without exploiting azimuthal symmetry. (For the mock surveys studied in this paper, with Healpix resolution N slc[e = 256, one has N q ~ TO and Npix ~ 104.) Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. A P P E N D IX G RELATION TO FK P In this appendix we prove the assertion, stated without proof at the end of §4.5, that our ansatz for the optimal weight function reduces to the FK P (Feldman et al., 1994) approximation under appropriate conditions. The FKP approximation is used to incorporate the effect of inhomogeous noise when estimating the amplitude of modes th a t are much shorter than the size of the survey and also of the typical variation scale of the noise. FK P propose a weight w f K P oc t-i , <x? + S!-1C V (G .l) -5 where Iq is the wavenumber where the power spectrum is being estimated, Q, is the solid angle of each pixel, and af the noise variance in pixel i. Working in the flat sky approximation, we will now show th at under the FKP assumptions, W PKP satisfies Eq. (4.25). Assuming th a t we are estimating power in a narrow band a near Iq we have, o ij CT where lo = Iq ( cos _ 2r — °i ij "f" = / f d2l ( G ' 2 ) [ d(pior -ih-(xi-x-j) // ' 2tt ip\Q, sin</?i0). Plugging this into Eq. (4.25) we get, E c v c ^ w f KP = J 2 ^ Stj + I ^ L c ie- ^ - ^ (/ 2 tt /a + f1-1(7,, do 163 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (G.3) 164 The first term, coming from the noise contribution to the covariance, simply gives: 2 - 5— -----■ o f + n - 'C io (G.4) To make progress on the second term we need to use some of the FKP assumptions. First we willuse th a t we are interested in a situation in which the noise varies slowly and write, ^ o - i n ~ / ( xi) = / ( x *) + V /(x*)(xj - Xj) + • ■• , aj + “ c lo ~2 (G.5) so th at the second term in (G.3) becomes £ / (S n sr ( /(x ,) + V /(x ,)(Xj - X i) + • • •) (G.6) 3 For the first of these terms we get: 3 /21 + O ( l//0i2) (G.7) h where R is the size scale of the patch. Now let us check what happens with the additional terms in Eq. (G.6 ) describing anisotropy of the noise. The additional contribution is proportional to: V x / • V loq o, Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (G.8 ) 165 which is suppressed by 1/I qR n with respect to the leading term where Rjy gives the scale of variation of the noise. Putting all of this together, we see that E C ijC?j W ' f K P = 1 + 0 ( \ / l aR) + O ( l / l 0R N ) j (G.9) i.e., for diagonal inhomogenous noise when considering modes much shorter than both the scale of variation of the noise and the patch, our ansatz (4.17) for the pseudo-Q weight function reduces to FKP. Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. A P P E N D IX H N U M E R IC A L O PTIM IZA TIO N OF W E IG H T F U N C T IO N S In this appendix we supply the details of our procedure, described briefly in §4.7, for numerically optimizing p s e u d o - w e i g h t functions given the noise RMS cr(x) in each pixel. We first treat the more difficult case of optimizing the B-mode weight function; as we will then see, optimizing the B-mode weight function can be obtained as a special case. As described in §4.7, the B-mode “weight function” really consists of three functions W (x), W\{x), W 2 (x) with spins 0, 1 , 2 respectively. These are varied independently in order to minimize the total expectation value (Ca ) of the pseudoQ estimator (with contributions from signal and noise), subject to the normalization constraint E w w = 1- (H 1 ) X To solve this minimization problem, we first rewrite the expectation value (Ca ) to be minimized in a form which makes the dependence on the weight functions easier to understand. Considering first the noise contribution to (Ca ), a short calculation shows th at |W l(x )p , \Wi(x)\{C„)n„ise = ^ T , w e A * ) { w { x f + 4 J f lx , v- l)(« + 2 ) (t - m i + l ) ( l + 1) (H-2) where W( denotes the t weighting within the bandpower under consideration. (Through out this paper we have taken this to be = £(£ + 1)/(27t) inside the band and zero outside the band, as in Eq. (4.12).) The signal contribution to (Ca ) is more difficult to compute, but follows from the 166 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 167 results of Appendix D. In order to write it down, we decompose the pieces of the weight function in spherical harmonics: nw £m Y ,W (x )Y fJ x ) (H.3) nG £m (J fh S a h h iE L M ' a£m X nE £rn a£m V ^ - i — 2 + ~ 2Ye*m ^ 1 2Y‘ m ( x ) ~ ~ 2Y‘m (x ) (The spin-2 piece is the E /B decomposition from Eq. (4.2), whereas the spin-1 piece is the gradient/curl decomposition of a vector field.) The signal contribution to (Ca } can then be written in the following form: ( Ge r WW (Ca) sig =E £m nW* £m „G* a£m E* £m rG£WG /~<WE U£ + 0C* nB* £m £m r cc U£ U£ r CB r BB r'W G r W E \ r GG r GE s-iGE U£ r CB s iE E £ / afm J ( a„W £m G a£m nE \ im \ (H.4) "?m ) Here, the power spectra which appear depend on the bandpower weighting Wg and signal power spectra. For the latter, it is convenient to introduce the notation C E = Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 168 (c F ^ ',sig ± c f ^ ' 3*g). The power spectra in Eq. (H.4) are then given by: c,w w -L 21" + 1 dz y 21 " a WG = L Cp d22(z )d22(z ) ~ ^ v *^2 ,—2 167T Vi" d z T . (. + Ifcr y V(<" - ! ) « " + 2) + &£/ < A - 2 ( Z) A . - 2 ( Z) 2t" + 1 \ -L dzT ,( ^ m nn \ J W'i» t V ( < " - W " + ! ) « ’" + 2) X^ 0 2 (Z) —C,^ ^ 2 2 (-2:)^ 0 2 (2:) + C\GG - VI" X r <CC = L d,2_2(z)dQ2(z ) 2£" + l \ 4W, 'I" 16tt y ( ^ - l ) ( £ " + 2 ) dzY. { m nn \ / (^) 2 Vty>// 1' X4 l ( 2:) _ n WE 2 (tyt1 Xk4-2( j i (^) - y. ^i._i:) '4, - 2 ( < f e W 2<" + 1i i,- 1:-) 4W> 16 tt y ( £ " - l ) ( f ' + 2) m nn \ £ 'l " x |^C'^’rfil(^)^22(2:)^ ll( 2:) + Cp rfi!_ i(^)^2 ,-2 (2:) ^ l - l ( 2:) y > /2 ^ '+ l\ a GE 2 Wff, 167r J ( £ " - l ) ( £ " + 2) x/£"(£" + l) Zfjf\ x \C p di 2{z)d 22 (z) ~ C p d [ _ 2 (z)d 2 - 2 (z ) ^01 (z ) a , y-^/2£" + l \ CB - S . dzY [ /~iEE ■ / 16tt J (£" - \)(£» + 2 ) y /£"(£" + 1 ) m nn \ X \Cp d i2(z)d22(z) + C£l d{ _2(z)d2_2(z) ^oi(z) 1\ we„ 16tt J (£"^L)f?"(f?" + !)(£" + 2 ) ' 2 £" + dz y Vi" x c-BB 2Won 2 2 (^)^ 2 2 (z) / _ Cp d 2 - 2 ( z ) d 2 -2 ( z ) ^ 0 0 (z) 2£" + 1 -L ^ e ( VI" v 16tt y (^" - 1)£"(£" + l ) { £ " + 2) x C ^ d 22 (z).d^{z) + Cp d 2 _ 2 (z )d 2 , - 2 (z ) ^ 0 0 (z ) Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. (H.5) 169 where d? ,(z) is the reduced Wigner rotation m atrix element (Varshalovich et ah, 1988); note th at dg0(z) is just the Legendre polynomial Pg(z). We have written the power spectra as position-space integrals; if convenient these expressions can be converted to harmonic space using the identity / 1 / // d z 4 , ± A z)4,±2(z) 4 - s , ± 2 T A z) = 2 1 £ £' s' =F2 £" ±2 X =F s' y (H.6) Armed with the expressions (H.2) and (H.4) for the noise and signal contribu tions to {Ca ), a procedure for minimizing (Ca ) — (Ca )sig + (Ca) noise subject to the constraint (H .l) is not difficult to come by. Representing the three components of the 5-m ode weight function by a single length-(5iVpjx) vector w, both (C'a )sig and (Ca) noise are quadratic in w, so we can write them compactly as ' rp rp (Ca ) = (Cot)sig d" (Ca)noise = ^ Qs'W T w Qnw (H.7) where Qs and Qn are (5Arpix)-by-(5Arpjx) matrices. In this notation, the constraint (H.l) can be written vT w = 1, where v is a vector with Apjx entries equal to 1(corre sponding to the spin - 0 piece of the weight function) and (4Apix) entries equal to 0 (corresponding to the higher-spin pieces). The solution to this minimization problem is simply _ (Qs + Q n ^ V W°P t ~ vT(Qs + Q n ) ~ lv /TJ oN (H 8) Since dense inversion of a (57Vpix)-by-(5A^pjx) m atrix is computationally infeasible, our approach is to compute (Qs + Q n)- 1 w using conjugate gradient inversion (Press et al., 1992). In order to use conjugate gradient inversion, two ingredients are needed. First, Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 170 one needs an efficient way to multiply a vector by ( Q s + Q n ) , even though this matrix will be too large to store in dense form. This is easily obtained from Eqs. (H.4) and (H.2 ); note th at multiplying by Q s requires transforming from pixel to harmonic space and back. The second ingredient is a preconditioner, or approximate inverse of (Qs + Q n ) , used to speed convergence; the better the preconditioner approximates (Qs + Q n)- 1 , the more rapidly the conjugate gradient search will converge. We use the following simple preconditioner, obtained by keeping only the diagonal of (Qs + Q n ) , then inverting: \ a0 + a \x ) a f + a 2(x) 0 2 + a 2(x) (H',9) where we have defined i °\ 4 v 7 = e ( ^ W i 7 = £ i v ( ^ v W g + c?c) £ + q b b ). 7 The performance of this preconditioner depends mainly on the noise level. For noise levels > 20 /iK-arcmin, only a few CG iterations are needed; for the noise levels of our fiducial experiment (~ 5.75 /iK-arcmin), around 100 iterations are needed; and for noise levels < 1 /iK-arcmin, the CG search does not converge. A better preconditioner may accelerate weight function optimization, but we have not attem pted to find one in this paper. In summary, we have now arrived at a method for numerically optimizing the 5-m ode weight function, given the bandpower (specified by the t weighting W#) and Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 171 noise RMS a(x). One precomputes the power spectra in Eqs. (H.5), then computes (Qs + Q n)~lv by conjugate gradient inversion, using the preconditioner (H.9), to obtain the optimized weight function given by (H.8 ). Finally, let us discuss the completely analagous but simpler problem of optimizing the E-mode weight function. In this case, the weight function is just a scalar W(x). The noiseand signal contributions to (Ca ) are given (in analogy to Eqs. (H.2), (H.4) above) by (Ca)noise = £ E Wt <T2 ( x ) W ( x f (H .ll) £x (C a)sit (H-12) = £m where c ww drf r1 dz ^2 (^± i) w><4w\c+42(z)dZ(z) + c^dl_2{z)4[_2(z) £'£" (H.13) Interpreting these as an iVpjx-by-A/pix m atrix equation (Ca) = (Ca)noise T (Ca)sig = w Qn'W T w QgW (H.14) the optimized E-mode weight function is given by (Q s + Q n ) ~ l V W^ = v T {Qs + Qn)- ^ - <H' 15) where v is a length-Epjx vector consisting of all l ’s. Eq. (H.15) is evaluated using conjugate gradient inversion with preconditioner given by W (x ) —> W ( x )/ (ctq ' 02 + cr(x)2). Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. R EFE R E N C E S Amarie, M., Hirata, C., & Seljak, U. 2005, Phys. Rev., D72, 123006 Barkats, D., et al. 2005, Astrophys. J., 619, L127 Bond, J. R., Jaffe, A. H., & Knox, L. 1998, Phys. Rev., D57, 2117 Bunn, E. F. 2002, Phys. Rev., D65, 043003 Bunn, E. F., Zaldarriaga, M., Tegmark, M., k Oliveira-Costa, A. d. 2003, Phys. Rev., D67, 023501 Challinor, A., k Chon, G. 2005, Mon. Not. Roy. Astron. Soc., 360, 509 Challinor, A., k Lewis, A. 2005, Phys. Rev., D71, 103010 Efstathiou, G. 2006, Mon. Not. Roy. Astron. Soc., 370, 343 Feldman, H. A., Kaiser, N., &; Peacock, J. A. 1994, Astrophys. J., 426, 23 Goldberg, J. N., et al. 1967, J. Math. Phys., 8 , 2155 Gorski, K. M., et al. 2005, Astrophys. J., 622, 759 Hansen, F. K., & Gorski, K. M. 2003, Mon. Not. Roy. Astron. Soc., 343, 559 Hansen, F. K., Gorski, K. M., k Hivon, E. 2002, Mon. Not. Roy. Astron. Soc., 336, 1304 Hinshaw, G., et al. 2006 Hirata, C. M., k Seljak, U. 2003, Phys. Rev., D 6 8 , 083002 Hivon, E., et al. 2001 Hu, W., k Okamoto, T. 2002, Astrophys. J., 574, 566 Kamionkowski, M., Kosowsky, A., k Stebbins, A. 1997a, Phys. Rev. Lett., 78, 2058 — . 1997b, Phys. Rev., D55, 7368 Kovac, J., et al. 2002, Nature, 420, 772 Lewis, A. 2003, Phys. Rev., D 6 8 , 083509 — . 2005, Phys. Rev., D71, 083008 Lewis, A., Challinor, A., & Turok, N. 2002, Phys. Rev., D65, 023505 172 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission. 173 Montroy, T. E., et al. 2006, Astrophys. J., 647, 813 Newman, E., & Penrose, R. 1966, J. Math. Phys., 7, 863 Ng, K.-W., & Liu, G.-C. 1999, Int. J. Mod. Phys., D 8 , 61 Oh, S. P., Spergel, D. N., &; Hinshaw, G. 1999, Astrophys. J., 510, 551 Okamoto, T., & Hu, W. 2003, Phys. Rev., D67, 083002 Oxley, P., et al. 2004, Proc. SPIE Int. Soc. Opt. Eng., 5543, 320 Press, W. H., Flannery, B. P., Teukolsky, S. A., & Vetterling, W. T. 1992, Numerical Recipes: The Art of Scientific Computing, 2nd edn. (Cambridge (UK) and New York: Cambridge University Press) Readhead, A. C. S., et al. 2004 Seljak, U., & Zaldarriaga, M. 1997, Phys. Rev. Lett., 78, 2054 Sievers, J. L., et al. 0900 Smith, K. M., Hu, W., & Kaplinghat, M. 2006, Phys. Rev., D74, 123002 Smoot, G. F., et al. 1992, Astrophys. J., 396, LI Spergel, D. N., et al. 2003, Astrophys. J. Suppl., 148, 175 — . 2006 Taylor, A. C., et al. 2004 Tegmark, M. 1997, Phys. Rev., D55, 5895 Tegmark, M., & de Oliveira-Costa, A. 2001, Phys. Rev., D64, 063001 Varshalovich, D. A., Moskalev, A. N., & Khersonskii, V. K. 1988, Quantum Theory of Angular Momentum (World Scientific) Wandelt, B. D., Hivon, E., & Gorski, K. M. 1998 — . 2001, Phys. Rev., D64, 083003 Yoon, K. W., et al. 2006 Zaldarriaga, M. 1998, Astrophys. J., 503, 1 Zaldarriaga, M., & Seljak, U. 1997, Phys. Rev., D55, 1830 — . 1998, Phys. Rev., D58, 023003 Reproduced with permission of the copyright owner. Further reproduction prohibited without permission.

1/--страниц