# Statistical analysis of the cosmic microwave background: Power spectra and foregrounds

код для вставкиСкачатьSTATISTICAL ANALYSIS OF THE COSMIC MICROWAVE BACKGROUND: POW ER SPECTRA AND FOREGROUNDS BY IAN J. O ’DWYER B.A.(Hons.), Cambridge University, 1994 M.A. (Cantab.), Cambridge University, 1996 M.S., University of Illinois at Urbana-Champaign, 2001 DISSERTATION Submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy in Astronomy in the Graduate College of the University of Illinois at Urbana-Champaign, 2005 Urbana, Illinois R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. UMI Number: 3199102 INFORMATION TO USERS The quality of this reproduction is dependent upon the quality of the copy submitted. Broken or indistinct print, colored or poor quality illustrations and photographs, print bleed-through, substandard margins, and improper alignment can adversely affect reproduction. In the unlikely event that the author did not send a complete manuscript and there are missing pages, these will be noted. Also, if unauthorized copyright material had to be removed, a note will indicate the deletion. ® UMI UMI Microform 3199102 Copyright 2006 by ProQuest Information and Learning Company. All rights reserved. This microform edition is protected against unauthorized copying under Title 17, United States Code. ProQuest Information and Learning Company 300 North Zeeb Road P.O. Box 1346 Ann Arbor, Ml 48106-1346 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. C ertificate of C ommittee A pproval University o f Illinois at U rbana-Cham paign G raduate C ollege July 1,2005 We hereby recommend that the thesis by: IAN J. O'DWYER Entitled: Statistical Analysis of the Cosmic Microwave Background: Power Spectra and Foregrounds Be accepted in partial fulfillment o f the requirements fo r the degree of: Doctor of Philosophy Signatures: ^ i— tr"" — D irector o f Research - Benjamin Wandelt H ead o f D epartm ent - Lewts Snyder A C om m ittee on Final Exam ination C hairperson - Benjamin Wandelt Committee M ember - Robert Brunner C ommittee M em ber - Edmund Sutton Committee M ember - Laird Thompson Committee M em ber - Committee Member - j M m i * Required for doctoral degree but not for m aster’s degree R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. A b stract In this thesis I examine some of the challenges associated with analyzing Cosmic Mi crowave Background (CMB) data and present a novel approach to solving the problem of power spectrum estimation, which is called MAGIC (MAGIC Allows Global Infer ence of Covariance). I begin with a brief overview of Big Bang cosmology and describe the proper ties of the CMB and the underlying theory. I then look specifically at the problems associated with power spectrum estimation. In light of the computational difficulty of a brute force approach to power spectrum estimation I review several approaches which have been applied to the problem and show an example application of such an approximate method to experimental CMB data from the Background Emission Anisotropy Scanning Telescope (BEAST). BEAST is a ground based CMB experi ment that is located at the White Mountain Research Station, Barcroft CA run by the University of California Office of Research. After examining this approximate power spectrum estimator I introduce MAGIC, a new approach to power spectrum estimation which is based on a Bayesian statisti cal analysis of the data and utilizes Gibbs Sampling. Rather than trying to directly compute the power spectrum likelihood, a task which has hitherto been computation ally impossible, we choose to sample from possible realizations of the power spectrum likelihood given the experimental data and thus build up a complete statistical de scription of each individual multipole component, Ci, of the power spectrum. I have implemented this technique and will demonstrate its application to the all-sky Wilkiniii R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. son Microwave Anistropy Probe WMAP data. The results are in broad agreement with those obtained originally by the WMAP team. However, since MAGIC gen erates a full description of each Ci it is possible to examine several issues raised by the best-fit WMAP power spectrum, for example the perceived lack of power in the quadrapole and octopole (1=2 and 3). It is found that the distribution of CVs at low I are significantly non-Gaussian and that the observed ’low’ values at 1=2 and 3 are in fact consistent with the best-fit power spectrum. For example, the best fit ACDM model is consistent with the C 2 inferred from the MAGIC analysis of the combined WMAP Q +V +W bands, with a 10% probability of an even larger theoretical C 2 . Based on the exact analysis presented here, the “low quadrupole issue” can therefore be attributed to a statistical fluctuation. Finally, I examine the issue of Galactic foreground contamination on CMB exper iments and describe the principle foregrounds; dust emission, free-free emission and synchrotron emission. I show that it is possible to include the foreground compo nents in a self-consistent fashion within the statistical framework of MAGIC and give explicit examples of how this might be achieved. I then provide models of the three pimary Galactic foregrounds which are currently being utilized in a re-analysis of the WMAP data. Foreground contamination will become an increasingly important issue in CMB data analysis as we try to detect very small polarization signals in high resolution CMB datasets such as Planck. The ability of this new algorithm to pro duce an exact power spectrum in a computationally feasible time, coupled with the foreground component separation and removal is an exciting development in CMB data analysis. When considered with current algorithmic developments such as the ability to include asymmetric beam shapes and deal with polarized data, the future of the Gibbs sampling approach shows great promise. iv R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. For Mum and Dad v R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. A ck n ow led gm en ts Since the earliest times, man has wondered about his origins and his place in the Universe. We human beings wonder who we are and where we came from, and these questions stir the imagination of even the least curious among us. This curiosity has led from humble beginnings to the modern science of cosmology. Using the latest technology and observational methods we are able to probe the Universe in an amazing variety of new ways, and we are just beginning to uncover some of its secrets. After spending some very enjoyable time in the financial industry I realized that I was not getting any closer to the answers to any of the really interesting questions in life. I applied to graduate school and started my career taking classes and working in observational astronomy. It was not long before I became interested in cosmology and that interest ultimately led to this thesis. First and foremost, my thanks go to Ben Wandelt, an excellent advisor who has guided me along the path of the graduate student and frequently had to back-track to find me stuck in some muddy hole of my own making. I would like to thank Ben for his ceaseless struggle to cram cosmologically useful information into my brain and for being a friend as well as an advisor. The BEAST experiment was a difficult one to analyze and there were many obsta cles to be overcome before the CMB power spectrum was obtained. I had no previous experience with a CMB data set and Peter Meinhold provided ceaseless assistance and guidance for which I am very grateful. The sampling approach to CMB power spectrum estimation was first suggested by vi R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. a group at the University of Illinois (Wandelt, Larson, and Lakshminarayanan, 2004) and simultaneously an equivalent approach was proposed by a group at NASA’s Jet Propulsion Laboratories (JPL) (Jewell, Levin, and Anderson, 2004). Subsequent to these first ideas being published the groups from Illinois and JPL agreed to collaborate on this project, and its development has been a joint effort ever since. The results published in this thesis, which relate to the implementation of the MAGIC algorithm, are my own original work. However, it would be unfair and misleading to imply that I have been the only researcher working on the development of these algorithms and their application to cosmologically interesting data and I have endeavored to avoid this implication in the text. I wish to acknowledge and thank my collaborators at JPL, in particular Jeff Jewell and Hans Christian Eriksen for their hard work and useful discussions. At Illinois I owe a debt of gratitude to David Larson and Arun Lakshminarayanan for all their early work on MAGIC and frequent stimulating dialogue. I would like to thank Anthony Lewis for the use of his parallel spherical har monic transform routines, which were adapted for use in the MAGIC algorithm and helped speed it up greatly. My thanks also go to Carlo Baccigalupi and the Planck CTP Working Group for making their CMB foreground extrapolation codes publicly available. Last and not least, of course, I owe a huge debt of gratitude to my parents for their constant support, encouragement and love which has sustained me throughout my graduate student career, but especially at the low points. Heidi Hazeu gave me the confidence it took to give up a very comfortable life in finance to come and do what I love and Lew Snyder and You-Hua Chu made the transition to Illinois possible. Finally, my thanks go to my dog Karma for going running with me when I needed to get away from it all. vii R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. Table o f C on ten ts List o f T ables................................................................................................................ xi List o f F igu res.............................................................................................................. xii Chapter 1 In trod u ction ......................................................................................... 1 Chapter 2 T heory o f the Cosm ic M icrowave B ackground.......................... 2.1 Intro d u ctio n .................................................................................................. 2.2 The Big B a n g ............................................................................................... 2.2.1 Fluctuations in the C M B ............................................................... 2.3 N o ta tio n ........................................................................................................ 2.4 Newtonian Perturbation T h e o ry ............................................................... 2.5 The CMB Angular Power S p ectru m ......................................................... 2.5.1 Cosmic Variance ............................................................................ 2.5.2 Example Power S p e c tr u m ............................................................ 2.6 Primary CMB Anisotropies ...................................................................... 2.6.1 Primordial P erturbations............................................................... 2.6.2 Ordinary Sachs Wolfe E ffe c t......................................................... 2.6.3 Acoustic O scillatio n s..................................................................... 2.7 Secondary CMB A n iso tro p y ..................................................................... 2.7.1 Integrated Sachs Wolfe E f f e c t ...................................................... 2.7.2 Early Integrated Sachs-Wolfe E ffe c t............................................ 2.7.3 Late Integrated Sachs-Wolfe E f f e c t ............................................ 2.7.4 Rees-Sciama E f f e c t........................................................................ 2.7.5 Gravitational Lensing..................................................................... 2.7.6 R eionization..................................................................................... 2.8 C o nclusions.................................................................................................. 6 6 7 9 9 11 13 15 16 17 18 18 19 23 23 24 24 24 25 25 26 Chapter 3 Power Spectrum E stim ation -Challenges and Solutions .. 27 3.1 In troduction................................................................................................. 27 3.2 The Problem .............................................................................................. 27 3.3 Solutions........................................................................................................ 29 3.3.1 The Brute Force A pproach........................................................... 29 3.3.2 Experiment Specific T ech n iq u es.................................................. 29 3.3.3 Approximate Methods for PowerSpectrum Estimation . . . . 30 3.4 The MASTER M e th o d .............................................................................. 30 viii R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 3.4.1 F orm alism ............................................................................................ 3.4.2 B in n in g ................................................................................................ 31 33 C hapter 4 A pplication o f an A pproxim ate Power Spectrum E stim ator to B E A ST D a t a ..................................................................................................... 35 4.1 In tro d u ctio n ...................................................................................................... 35 36 4.2 Description of the BEAST experiment ....................................................... 37 4.3 BEAST and M A S T E R ................................................................................... 4.3.1 The Data and Map M a k in g ........................................................ 37 40 4.3.2 Beam and Pointing R eco n stru ctio n ............................................... 40 4.3.3 Foreground R e m o v a l......................................................................... 4.3.4 D ata Analysis P ip e lin e ..................................................................... 41 46 4.3.5 Error C h e c k in g .................................................................................. 4.3.6 Computational C onsiderations........................................................ 47 48 4.4 The Power S p e c tru m ..................................................................................... 4.5 C o nclusions..................................................................................................... 50 C hapter 5 A Bayesian Approach to Power Spectrum E stim ation M A G IC ...................................................................................................................... 52 5.1 In troduction..................................................................................................... 52 5.2 Required MathematicalBackground............................................................. 53 5.2.1 Bayesian S ta tis t ic s ........................................................................... 53 5.2.2 Gibbs S am plin g................................................................................. 54 5.2.3 Conjugate Gradient M e th o d ........................................................... 55 5.3 The MAGIC A lgorithm .................................................................................. 56 5.3.1 B ack g ro u n d ........................................................................................ 56 5.3.2 The MAGIC alg o rith m ..................................................................... 57 5.3.3 Deriving the Sampling E q u a tio n s .................................................. 58 5.3.4 Constructing the D en sities.............................................................. 59 5.3.5 Signal Sam pling................................................................................. 61 5.3.6 Including Experimental Beams ..................................................... 62 62 5.3.7 Ci Sam pling....................................................................................... 5.4 Computational S c a lin g ................................................................................. 63 Chapter 6 A pplication to W M A P d a ta .......................................................... 65 6.1 In troduction.................................................................................................... 65 65 6.2 The WMAP D a ta .......................................................................................... 6.3 Applying MAGIC to WMAP .................................................................... 67 6.3.1 Im plem entation................................................................................. 67 6.3.2 Foreground R e m o v a l....................................................................... 69 6.4 Computational C onsiderations..................................................................... 71 6.4.1 Parallelization.................................................................................... 72 6.4.2 Facilities............................................................................................. 73 6.5 Mean Signal M a p .......................................................................................... 74 6 .6 Power Spectrum .............................................................................................. 75 ix R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 6.7 Discussions and Conclusions 77 Chapter 7 D etailed Foreground M o d e ls.......................................................... 80 80 7.1 In tro d u ctio n ................................................................................................. 7.2 The F o re g ro u n d s........................................................................................ 82 7.2.1 Synchrotron Emission ............................................................ 82 7.2.2 Free-Free E m iss io n .......................................................................... 83 84 7.2.3 D u s t ................................................................................................... 7.2.4 Point S o u rc e s .................................................................................... 85 7.3 Model E q u a tio n s ........................................................................................ 86 7.4 Foreground Equations .............................................................................. 87 88 7.5 Completing the BayesianD e sc rip tio n ....................................................... 7.5.1 Deriving the Sampling E q u a tio n s .................................................. 90 7.5.2 Including Instrument C h a ra c te ristic s........................................... 92 92 7.6 Implementations ........................................................................................ 93 7.6.1 Possible Choices of Basis V e c to r s ................................................. 97 7.7 A Specific CMB Foreground B a sis ........................................................... 7.7.1 Dust Basis Set ................................................................................ 98 ............................................................ 100 7.7.2 Synchrotron Basis Set 7.7.3 Free-Free Basis S e t .......................................................................... 102 7.8 C o nclusions................................................................................................. 104 Chapter 8 C o n clu sio n ............................................................................................ 106 8.1 Summary of P ro g re ss................................................................................. 106 8.2 The Promise of Gibbs SamplingM eth o d s................................................ 108 R e fer e n c es......................................................................................................................109 V ita ................................................................................................................................... 113 x R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. List o f Tables 2.1 Brief Thermal History of the Universe........................................... 2.2 Parameters used to calculate example power spectrum.............. 17 6.1 Characteristics of the WMAP frequency bands........................... 66 8 7.1 Parameter values for two-component foreground model dust extrapo lation 99 7.2 Free-free brightness temperatures per unit Rayleigh assumingTe=7000K. 103 xi R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. List o f F igures 2.1 2.2 4.1 4.2 4.3 4.4 4.5 4.6 4.7 4.8 Example power spectrum. Power spectrum generated using CMBFAST and cosmological parameters determined by the WMAP team. Variation in the acoustic peaks of the CMB with changing values of Photograph of the BEAST telescope......................................................... The HEALPix pixelization scheme............................................................. Comparison of simulated and actual BEAST maps in Kelvin............... Overview of the steps in the BEAST simulation and analysis pipeline. Unbinned transfer function for BEAST..................................................... Mode-mode coupling kernel for the BEAST experiment........................ CMB anisotropy power spectrum for the BEAST ex p erim en t Combined CMB anisotropy power spectrum for BEAST and other ex periments......................................................................................................... 16 23 37 39 42 43 44 46 49 50 6.9 The WMAP Satellite.................................................................................... The WMAP Beam Characteristics............................................................. The WMAP KpO Mask................................................................................ A sampled WMAP residual monopole/dipole m ap................................. A sampled WMAP infinite variance foreground map............................. The posterior mean CMB signal for WMAP............................................ The WMAP Power Spectra......................................................................... The probability distributions of the Ce samples for a selected set of i. Graph for evaluating theoretical power spectrum models at the low I. 7.1 7.2 7.3 7.4 Dust emission map extrapolated from 100pm DIRBE maps..................... 100 Synchrotron spectral index m ap................................................................. 101 Synchrotron emission map extrapolated from 408MHz Haslam map. . 101 Free-free emission map................................................................................. 104 6.1 6.2 6.3 6.4 6.5 6 .6 6.7 6 .8 xii R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 66 67 68 70 71 74 76 77 78 C hapter 1 In trod u ction The Cosmic Microwave Background (CMB) radiation provides a window into very early times in the history of our universe. The existence of such a background was predicted as a consequence of Big Bang models of the universe by Gamow [1949] and Alpher Sz Herman [1950]. When Penzias and Wilson inadvertently observed the CMB in 1965 [Penzias & Wilson, 1965], researchers at Princeton University led by Robert Dicke immediately realized the importance of the discovery [Dicke et ah, 1965]. Although the presence of the 2.725K background had been established, it was another 28 years before we were first able to study the CMB in any detail with the COBE/DMR satellite mission. This was the first time that the tiny temperature anisotropies predicted by General Relativity and inflationary theories of the universe were observed in the CMB. These anisotropies are the result of physical processes at very early times in the evolution of our universe and allow us to examine the early universe and quantitatively describe it. I provide an overview of CMB theory and examine the physical processes which produce the CMB in detail in chapter 2 . When taken together with other constraints on the evolution of the universe such as big bang nucleosynthesis, we are able to formulate and test our theories of the universe and discard those which are inconsistent with our observations of the CMB. Observations of the CMB can also be used to constrain new ideas for models of the 1 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. universe such as those motivated by string theory. If we are to use the CMB in this way, we need a convenient way to characterize its properties. As we see in Chapter 2, for a Gaussian, isotropic CMB the power spectrum of CMB anisotropies provides a convenient way to do this and contains all of the the statistical information about the CMB. Since the COBE mission, many experiments have measured the temperature anisotropies in the CMB and for each the power spectrum has been estimated. Per haps the most advanced to date, certainly at angular scales down to a few tenths of a degree is the WMAP satellite mission, but many other experiments have probed the power spectrum at other scales, allowing us to build up a fairly complete picture of the anisotropies in the CMB. From the power spectrum we can extract cosmological parameters which describe our knowledge of the history, geometry and content of the universe. In addition, we can compute uncertainties on these parameters to tell us how likely a particular cosmological model is, given the experimental data. Power spectrum calculation is a mathematically, numerically and computationally complex and expensive task, which we examine in Chapter 3. We will examine the nature of this complexity and look at some approaches to the problem which have been pro posed over the last few years. Typically, these fall into two categories. The first are methods which can generate an exact power spectrum in a computationally feasible way, but which make strong assumptions in order to do this. Such approaches typi cally assume certain scanning strategies, or are limited to white noise models and so forth. The second class of approaches are the approximate power spectrum estima tors. These have the benefit of computational feasibility and a great deal of flexibility in terms of their ability to handle varied scanning strategies, noise properties and so on. However, as the name suggests, these estimators do not produce an exact power spectrum, but are approximate at some level. Until recently, this was the state of the art and no general, exact and computationally feasible power spectrum estimation 2 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. technique was available. Thus far we have established the utility of the power spectrum in describing the temperature anisotropies in the CMB, and we have examined some well tested power spectrum estimation tools and methods, albeit that these tools are approximate or may be of limited applicability. In Chapter 4, I will go on to demonstrate the appli cation of an approximate power spectrum estimator, the MASTER method [Hivon et al. , 2002], to data from the BEAST CMB experiment. The MASTER method is a pseudo-power spectrum estimator, meaning th at we calculate the power spectrum on a sky which has had pixels excised from it. These pixels might be removed for a variety of reasons, for example, due to contamination from Galactic foregrounds such as dust, free-free and synchrotron emission. Clearly, the power spectrum of this cut sky is not the same as the power spectrum of the full sky and the MASTER method uses Monte Carlo simulations to attem pt to correct for this difference and produce a full-sky power spectrum. BEAST is a ground-based CMB experiment located at the White Mountain Research Station in Barcroft, CA. Originally designed as a bal loon borne experiment, it gathered the data analyzed in this thesis in a ground-based observing mode from July to December 2001 and subsequently in February, August and September of 2002. As a CMB experiment located at a conventionally accessible observing site, rather than at the South Pole, BEAST presented unique challenges in data analysis and power spectrum estimation. The data were heavily contaminated with correlated noise and presented other difficulties typical of ground based CMB observations. Once the data were analyzed, a CMB power spectrum consistent with previous determinations was obtained. While approximate approaches to power spectrum estimation undoubtedly have their uses, they are, by definition, sub-optimal. Given the enormous expense of launching a satellite mission such as WMAP or the upcoming Planck mission, it would be nice to have a computationally feasible way to extract the exact power 3 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. spectrum from the data. Indeed, a novel solution has been developed by Wandelt et al. [2004] and Jewell et al. [2004], and the implementation of this method and its application to the WMAP data will be the main thrust of this thesis. The central idea of this statistical approach is not to calculate the power spectrum directly, which is where much of the computational cost and complexity arises, but to sample from possible realizations of the power spectrum conditioned on the data. Ideas from Bayesian statistics and Gibbs Sampling help to make this new scheme possible. In Chapter 5, I describe this new method, called MAGIC, in detail and look at its benefits and limitations. The computational demands of this method still require the use of supercomputing facilities and massive parallelization of the code and these computational demands are also discussed in detail. Since the WMAP data is the current state of the art in all-sky CMB experiments, it is a natural place to test this new method. After all, a new technique is of little practical value if its implementation is not possible. After briefly describing the WMAP experiment, I develop an implementation of MAGIC for the WMAP data in Chapter 6 . This includes extracting the power spectrum and also a full description of the likelihoods for each including any non-Gaussianities present. This allows some features of the WMAP data that have caused concern and debate to be addressed in a quantitative way. For example, it is possible to address the issue of a perceived lack of power in the quadrapole term. Once the full posterior for the C2 is explored, MAGIC shows that the WMAP data are, in fact, consistent with an even lower value of the quadrapole moment. One area of CMB data analysis that has been treated in a somewhat unsatis factory fashion in power spectrum estimation codes to date, is removing foreground contamination from the data. Even within the MAGIC method described in Chapter 5, the treatm ent of foregrounds is extremely simplistic. Dust, synchrotron emission and free-free emission from our Galaxy all contaminate CMB data and in order to 4 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. obtain an accurate estimate of the power spectrum, particularly for very small signals such as those from polarization, these foregrounds need to be estimated and removed. In Chapter 7, I present a method for self-consistently including foreground models in the Gibbs sampling framework of MAGIC. The method is quite general and does not restrict the foregrounds to any particular basis set, but rather allows them to be represented in the basis which is most convenient for their description. For example, the monopole and dipole components are most easily dealt with in the spherical har monic basis, while more traditional foregrounds such as dust are perhaps more simply treated in pixel space. I discuss some possible choices of specific foreground models which could be used to analyze foreground contamination in CMB data. Having established the mathematical description of the foreground sampling and discussed some possible choices of basis vector, I develop a specific set of foreground basis vec tors. This is a set of maps generated for each of the principal foreground components found in experimental CMB data. These maps are currently being used to attem pt to analyze the foreground contamination in the WMAP data. A distinct advantage of Gibbs Sampling over current CMB foreground removal techniques is that estimates in the uncertainties of the foreground removal scheme are obtained. This allows a meaningful comparison to be made with models of the various foregrounds and also estimates of the foreground contamination observed at other frequencies. The progress which has been made in the thesis is summarized in Chapter 8 . This includes a discussion of what has been achieved with Gibbs sampling to date and possible future directions for the Gibbs sampling approach. Applications to large, high resolution datasets such as the Planck data, polarized CMB data and more complex foreground modeling are all discussed. 5 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. C hapter 2 T h eory o f th e C osm ic M icrow ave B ackground 2.1 In tr o d u ctio n There are many excellent textbooks and treatments of the principles underlying the cosmic microwave background (CMB) (e.g. Kolb and Turner [1994], Peebles [1993], Peacock [1999], Hu & Dodelson [2002]) and they present a much more complete treatment than is possible here. I would point the reader to these texts for the details and will present only a pedagogical outline of the CMB sufficient to understand the CMB power spectrum estimation techniques to be discussed later. The presence of the CMB is a direct and almost unique prediction of the big bang theory. The serendipitous discovery of this uniform background radiation by Penzias and Wilson in 1965 [Penzias & Wilson, 1965] provided strong evidence for the big bang theory and has created intense interest in cosmology. Subsequent observations (e.g. COBE [Smoot et al., 1991] and WMAP [Bennett et al., 2003a]) of the CMB have found small fluctuations in this background on the order of one part in one hundred thousand. These fluctuations are thought to be the seeds of structure formation in the Universe and studying them can tell us a great deal about the nature of the 6 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. Universe and its contents. We now look at the place of the CMB in the early universe and give an overview of the nature of the physical perturbations which give rise to the anisotropies in the CMB. We will then introduce the power spectrum as a way of summarizing the statistics of the CMB and discuss the origins of the peak structure which is seen in the power spectrum. 2.2 T h e B ig B a n g The Big Bang theory posits that the universe began in a state of extremely high temperature and density and has been expanding ever since. General Relativity predicts a singularity at the ’beginning’ of the Universe where the temperature and density were infinite. At this scale, the scientific laws which we use to describe our universe, primarily General Relativity at these scales, break down. W ithout a theory of quantum gravity, we are unable to examine the history of the universe during this era, called the Planck Epoch. After about 10~43 s, the universe moves into a regime which can be examined using General Relativity. General Relativity provides one of the two so-called pillars on which the Big Bang theory rests. The second pillar is the Cosmological Principle, which says that on large scales the universe is homogeneous and isotropic. W ith these two pillars, the entire theoretical basis of Big Bang Cosmology is established and specific observable properties of the universe follow. Following the Big Bang, as time passes beyond the first 10“ 43 seconds and the Universe expands and cools, several important milestones are reached in the evolution of the universe. A list of the main events in the thermal history of the Universe can be found in Table 2.1. The epoch which is of most interest to us in the context of the CMB occurs when the temperature of the Universe has fallen to about 3000K. Prior to this time, the 7 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. T able 2.1: Brief Thermal History of the Universe. Time Since Big Bang Event Temperature Planck Era - physics unclear > 1032 K Inflation - exponential expansion 1027 K Neutrinos decouple - neutrino background 1010 K 3 min Nucleosynthesis - light elements form 109 K 300,000 yrs Recombination - CMB photons released 3000 K < 1 0 ~ 43 1 0 “ 35 2 s s s 109 yrs Galaxies begin forming 20 K 1 0 10 yrs Small scale structure forms (e.g. us!) 3K universe most recently existed as a tightly coupled fluid of photons and baryons. The tight coupling is provided by photons Compton-scattering off free electrons, resulting in a very short mean free path. As the temperature drops to about 3,000K the baryons and electrons are able to recombine to form Hydrogen and the universe becomes transparent to radiation. This process is referred to by cosmologists as recombination. Recombination is not instantaneous, but over a short period of time the mean free path of the photons increases. Eventually the number density of free electrons has dropped sufficiently that the photons are able to free-stream unhindered through the Universe. The surface formed on the sky at the time of recombination, when the photons scatter off electrons for the last time, is referred to as the surface of last scattering (SLS). Since recombination is not instantaneous, the SLS will have a finite thickness. It is the free-streaming photons which, if the Big Bang theory is correct, will be observed today as the CMB. Since the photons are limited to traveling at the speed of light and the universe has expanded by a factor of ~ 103 since the photons left the SLS, we expect to observe them at a lower temperature today then when they were emitted. Indeed, the observed temperature of the CMB, which is an almost perfect black-body at 2.725K, is exactly what we expect to see given the Big Bang theory. 8 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 2.2.1 F lu ctu ation s in the C M B Although the detection of the black body radiation from the CMB gives us strong evidence for the Big Bang theory, there is a great deal more cosmological information hiding in the CMB. In 1991, the Cosmic Background Explorer (COBE) discovered fluctuations in the CMB at the level of 1/100,000 K [Smoot et al., 1991]. A large part of these fluctuations is the signature of tiny density perturbations in the very early universe. As we shall see, there are other contributions to the temperature fluctuations we observe at our detectors here on Earth and each can tell us something about the universe in which we live. It is thought that the density perturbations grow by gravitational instability into the structures we see in the Universe today such as galaxies and clusters of galaxies. Therefore, determining the nature, size and distribution of these fluctuations will give us a significant insight into the process of structure formation in the universe. COBE was only able to map the largest scale fluctuations on the sky, but a number of subsequent experiments have mapped out these fluctuations down to angular scales on the sky of a few arcseconds. W ith this increased resolution has come an unprecedented level of accuracy in determining the parameters which we use to describe the universe. 2.3 N o ta tio n In the following, I shall assume the Friedmann-Robertson-Walker (FRW) model for the universe [Friedmann, 1922, 1924; Robertson, 1935, 1936; Walker, 1936]. FRW showed that for an homogeneous and isotropic universe, filled with a perfect fluid, the solution of Einstein’s equation for the line element ds can be written ds2 = g ^ x ^ x v = —dt 2 + a(t ) 2 1 dr + d©2) . —k r 2 9 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. (2 . 1 ) The independent components of Einstein’s equation become, d a \2 k 55 j + ^ 8irG . = — p + A' . (2'2) and _ ad2aa ‘adt2 / / da aa\ \2 a; k ^ ~ \a d t J I + 777w a(t)2 = ~ 8^ GP- ,__ . (2-3) I will now define the symbols used in these equations and these symbols and conventions will also be used throughout this thesis, is the metric tensor and the space-time vector x^ has one time and three spatial components, dt is an infinitesimal element of physical time, dr an element of comoving length and dQ an element of solid angle. The speed of light, c = l and in what follows I also set the Planck and Boltzmann constants to unity (h = ks = 1). G is the Gravitational constant. a(t) is the scale factor which describes the size of the universe and is assumed to be a function of time, t. The Hubble parameter is defined as H — Today, H = 100 h km /s/M pc and the WMAP best-fit value for h is 0.71^q!o3- This value will be assumed here, on the odd occasion that a value is required, k can take the values - 1 , 0 or + 1 , corresponding to an open, fiat or closed universe and -^yi is often referred to as the curvature term. A is the cosmological constant, p is the density of the fluid filling the universe and p the pressure. The critical density of the universe will be denoted P crit 8trG Q will then be used to denote the ratio of various other densities to the critical density. The ratio of total density to critical density is Qtot = of Qtot greater than 1, equal to 1 or less than 1 and values correspond to closed, flat and open universes respectively. I also define the m atter density ratio flm, the baryon density ratio Qb and the vacuum energy density ratio flA, and note th at for a flat universe Qtot = Gm+G A. The particle horizon is the region which separates the visible universe from the part of the universe from which light signals have not yet had time to reach us. The size of the Hubble horizon at time t is defined as H ~l (t) = 10 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 2.4 N e w to n ia n P er tu rb a tio n T h eo ry Here I present a brief overview of the perturbations in the CMB using a classical Jeans analysis (see, e.g. Kolb and Turner [1994]). For super-horizon scales, a more complex framework is required and the reader is referred to the texts mentioned in the introduction to this chapter for these details. However, on sub-horizon scales, the Newtonian treatment will suffice and provide us with some physical intuition about the fluctuations in the CMB. For a self gravitating, perfect fluid of one species, we have the continuity equation ^ + V • (pv) = 0, (2.4) the Euler equation dv 1 + (v • V ) v = — V P - V ^ , at p (2-5) V 2^ = 4txGp. ( 2 .6 ) -5 7 and the Poisson equation G is Newton’s gravitational constant and T is the Newtonian gravitational potential. p is the pressure, p the density and v the velocity of the fluid. If we now consider the expansion of the universe (i.e., the fluid), we find Po(Cr) = ^ypo(*o)> (2.7) V°(C r ) = ~~77r a at > (2‘8) ^0 (t, r) = Po(t, r ) r 2, (2.9) where ao and to are respectively the scale factor and time today. This is only valid if p <C p, that is, in the m atter dominated regime. We are interested in perturbing this solution and we will do this in the comoving position x, where r = ^ x . So we have p(t,x) = p0(t)[l +6(t,x)}, 11 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. (2 . 1 0 ) v ( t , x) = v 0( t , x ) + Vi ( t , x ) , (2.11) ^ (i,x ) = ^ 0 (i,x ) + (2 . 1 2 ) and all the perturbation terms are small compared to the unperturbed term. Now substitute eq. (2.10) - (2.12) into eq. (2.4) - (2.6), Fourier transform the result and keep only the linear terms to give ^-8k + i — k - v = 0, at a d ( a\ * U (2.13) 2 ., j v + ik c * + i k 'l' ‘ = 0 ’ (214) *‘= - ^ (£)’*. <**> where cs is the sound speed, cj = ?. These equations can be manipulated to give J t Sk + l l E I t 6 ~ k + O'^ = °- We can write this in terms of conformal time p, where ^ 4 + - 4 + [CjA;2 a <2-16> ^ 4nGpo(rj) ( — ) ] 4 = 0. \oo / (2.17) Now considering the quantity in square brackets in eq. (2.17), wedefine the Jeans wavenumber bysetting this to zero, cf k? —AirGpop-^ = 0. Thisgives = cs For k kj, that is for wavelengthsshorter a0 than the Jeans length, the expression in square brackets in eq. (2.17) reduces to the positive quantity c2k'2 and the result is a damped harmonic oscillator with angular frequency to = csk and damping rate For wavelengths much longer than the Jeans length eq. (2.17) simplifies to 4 + -4 a = 4nGp0(ri) \ ao / 8k. (2.19) For a m atter dominated universe a oc r f and po(p) = ^ ( i r ) 27? 2, Now if we substi tute into eq.(2.19) we find 4 4—4 p p1 = 12 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. (2.20) with solutions 6k oc 77 3 and 6k oc i f . At late times, the solution oc r f dominates and from eq.(2.6) we find th at is constant with respect to time. From the analysis it 2 can be seen that perturbations on scales larger than the Jeans scale grow as 6 oc a oc £3 while perturbations on scales smaller than the horizon oscillate as acoustic waves. The previous discussion is oversimplified in two main ways. First, it can only deal with sub-horizon scale fluctuations. A full General Relativistic treatm ent is required to treat super-horizon fluctuations which is beyond the scope of this work. Secondly, a single component perfect fluid has been assumed. Hopefully some insight has been given into the evolution of the primordial perturbations and the interested reader can find more details in the references. We will now move on to examine how these perturbations manifest themselves in the CMB anisotropy power spectrum. 2.5 T h e C M B A n gu lar P ow er S p ectru m W ith an understanding of the evolution of the fluctuations in the CMB, we turn to the problem of summarizing the properties of the CMB anisotropies. The usual way to do this is via the power spectrum. For isotropic, Gaussian fluctuations the power spectrum provides a complete statistical description of the CMB temperature anisotropy. If we consider ourselves to be at the center of an imaginary sphere looking out onto the surface of last scattering we expect to observe small differences in the temperature of this surface when we move from point to point on the sky. We are dealing with a spherical geometry, so it is typically most convenient to represent our observations of the temperature fluctuations, in some unit direction r, in the spherical harmonic basis £rji 00 y ( r) = E & E a*nY tm (r), (2.21) e=0 m = —t where T is the direction averaged temperature over the sky T = J dQT(9, <j>) and 6T(d, (j)) = T(0, (ft) —T. Here I have introduced an alternative way of representing the 13 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. unit vector r as a function of 9 and (j) since both representations of the unit vector will be useful later. The fluctuation power in each I mode is defined as Ct = (\aem\2). ( 2 .2 2 ) In order to investigate the temperature fluctuations at different scales, it is useful to perform a Fourier expansion so that O-tm- Qlm — (2.23) k Each aim is proportional to the effective temperature fluctuation at the surface of last scattering which in turn isproportional to the primordial perturbation 4k. Assuming Gaussian initial conditions, as predicted by inflation, we can write the probability distribution of 4k 1 _ii«ki2 P(<Sk) = - _ e 2^T . V27T0-k (2.24) We can now write down some of the expectation values of the quantity 4k <4k) = 0, <|<fk|2> = a 2, <|4k|4) = 3a l (2.25) and (4k4k ) = 0 , when k ^ k', (2.26) which isjust a consequence of the <fk being independent random variables. Now, since aem oc6k, the statistical properties of the aem will be proportional to those of the 4k. Similarly, since aem = a*n> the a^m’s and hence the C? , will also reflect the statistical properties of the primordial fluctuations and follow the relations in eq.(2.25). To be explicit {a>em) ~ 0, (a*emaiim>) = Sw&mm'Ci, (|n£m,|4) = 3(7f, where the * designates the complex conjugate. 14 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. (2.27) 2.5.1 C osm ic Variance It is not possible to measure the ensemble average of the Cg exactly. To see this, consider a perfect CMB experiment with no noise. If we measure 6 T /T and use eq.(2.21) for the case of a map with Npix pixels, we get A ^pix - = 7 ^ E T ( rr)YL M , (2.28) PtX p = 1 where r p isthe unit vector in the direction of pixel p. This means th at we canmeasure for a given I (21 + 1 )aim (one for each value of m at a given £).However, these (2^ + 1) agm only sample from the distribution of the agm . If we choose to calculate the average Cg, for example, we get c i= ^ E <2-29> m— — i Since we approximate the true Cg with this mean, the error we make is ((Cl - Cc? ) = (C,2) - 2(C,)C, + C? = ™yTy|E « l“'’”l1>+ E <K-l2l“^'l2))]-2Ci5?i-TE(!®'">l2> ' ' m m'^rn m ( 2 3°) Now applying the relations from eq.(2.28) we find ( ( C i - C , ? ) = 1I2 — C l (2.31) This quantity is called cosmic variance and it is a result of the fact th at we have only one universe to observe. If we had an infinity of universes in which to observe the CMB, then we could reduce the cosmic variance to zero. This is not the case, and the fractional uncertainty in our Cg estimate is y . It can be seen from this that the cosmic variance is highest for low multipoles £, where we have the fewest samples of the agm . To return to our idealized experiment, if we calculate C 2 from o u r p erfec t d a ta , th e fra c tio n a l u n c e rta in ty we will h av e in th e value is 63%. E ven at £=100, cosmic variance causes an uncertainty in C 100 of ~10%. If we consider an experiment with noise, which all real CMB experiments will have, or incomplete sky coverage, the error will exceed that due to cosmic variance. 15 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 6000 4 4000 2000 0 200 400 600 M ullfpole M o m e n t 800 1000 t F ig u re 2.1: Example power spectrum. The power spectrum was generated using CMBFAST [Seljak & Zaldarriaga, 1996] and cosmological parameters determined by the WMAP team. Parameter values are listed in table 2.2. The structure of the acoustic peaks described in the text can clearly be seen. 2.5.2 E xam ple Power Spectrum In order to be more concrete in the following discussion of anisotropies in the CMB and their effect on the shape of the power spectrum, the power spectrum for a fiducial cosmological model based on the WMAP data is shown in Figure 2.1. This power spectrum was generated using the CMBFAST code [Seljak & Zaldarriaga, 1996]. The main parameters used for this model are shown in Table 2.2. The positions, heights and numbers of the peaks are dependent on a number of physical processes in the early universe which provides us with a convenient way of investigating those processes. The anisotropies in the CMB can be divided into primary, which are the imprint of processes at the SLS, and secondary, which are the imprint of processes which occur between the photons leaving the SLS and arriving at the detector here on Earth. 16 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. T able 2.2: Cosmological parameters used to calculate example power spectrum shown in fig. 2 . 1 ____________________________________________ 2.6 Parameter Value Description T c m b (K) 2.725 CMB Temperature ^tot 1 .0 Total Density nA 0.73 Dark Energy Density Qb 0.044 Baryon Density &CDM 0.266 CDM Density 0.27 Matter Density n 7 (cm-3) 410.4 CMB Photon Density ns 0.93 Scalar Spectral Index h 0.71 Hubble Constant Tr 0.17 Reionization Optical Depth Zr 20 Redshift to Reionization P rim a ry C M B A n iso tr o p ies There are three sources of primary anisotropy in the CMB. • Photons which were in a potential well at the point they last scattered will have the imprint of a gravitational redshift from ’climbing’ out of the well. • The photons which last scattered off m atter which had a net flow away from us will have the imprint of a Doppler redshift. • The photons will carry the imprint of whatever form of primordial density per turbations were present at the SLS. For example, adiabatic perturbations. We can write the primary temperature fluctuations at the SLS as Y (r ) = I* - r ' v + e ° ] ^ ’ 17 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. (2-32) where ©0 is the monopole of the photon brightness fluctuation at a position on the SLS and r)sis is the conformal time at the SLS. I now examine each of these terms in order. 2.6.1 Prim ordial P erturbations At late times (that is, times much later than the time of matter-radiation equal ity), the cosmologically interesting fluctuation modes are all within the horizon and Newtonian density perturbation theory discussed in the previous section provides all the analytical power required. However, at times much earlier than matter-radiation equality, when the interesting modes are outside the horizon, there are two orthog onal types of primordial fluctuations which must be considered. The two types of fluctuations are Curvature, or adiabatic fluctuations and isocurvature, or isothermal fluctuations. At this point, isocurvature primordial perturbations are excluded by the experimental data and reasonable cosmological models and so are not considered further here. Adiabatic fluctuations are fluctuations in the energy density (Sp ^ 0) which can be characterized as fluctuations in the local value of the spatial curvature. By the equivalence principle, all species must participate in these fluctuations, that is (233 ) nB nx s where s is the entropy density and B refers to baryons and X to any other species. 2.6.2 Ordinary Sachs W olfe Effect On super-horizon scales, there is no microphysics and only the gravitational potential, 'L, has an effect on the fluctuations. Eq. (2.32) becomes ^ ( r ) = [* + eo]„„„ 18 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. (2.34) This is called the ordinary Sachs-Wolf effect. For adiabatic initial fluctuations, for example, this gives ^f(r) = 2.6.3 A cou stic O scillations In order to obtain a feel for the oscillations on sub-horizon scales, we can write the detailed harmonic oscillator solution for a multi-component fluid as ^ [(1 + R)Q 0 ] + j Q o = - j (1 + R)V - ^ [ ( 1 + R M where 0o is the monopole temperature fluctuation in Fourier space. R = ^ (2.35) is the baryon-to-photon ratio and 77 is the conformal time, with the dots representing deriva tives with respect to 77. T is the gravitational potential and $ is a general relativistic term representing perturbation in the space of the Friedmann-Robertson-Walker met ric. k is the Fourier mode wavenumber. The sound speed in the fluid is cs = j and the reason or keeping the two terms separate will soon become apparent. We now examine the behavior of this equation as we make various approximations. If we assume time-independence for R, T and <f>, then the solution is a simple harmonic oscillator ©0 (77, k) = —(1 + R ) ^ + Acos(kcsrj) + Bsin(kcsrj), (2.36) where A and B are constants determined by the initial conditions. Each k-mode now represents a plane wave propagating at the speed of sound, cs, in the photon-baryon fluid. Physically, one can think of competition between gravity and radiation pressure as giving rise to these oscillations. The fluid will tend to gather in the gravitational potential wells, which in turn will increase the radiation pressure in the well. This pressure will tend to push the fluid out of the well until enough fluid has left the well that gravity asserts itself once more and the fluid starts to flow back into the well again. If we use this description, then the terms in eq. (2.35) can have a physical meaning attached to them. The first term on the LHS is the inertia of the fluid, the 19 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. second term is due to radiation pressure and the RHS is the gravitational potential term. Now consider adiabatic initial perturbations with R=0. We will still assume the universe is m atter dominated (e.g. CDM is dominant). In this case, on large scales, the temperature fluctuation is 0 O= —§ty. If we go back far enough in the evolution of our universe, the horizon will be very small and all fluctuations can be thought of as super-horizon. Thus, we can write the initial conditions for the temperature fluctuations as ©o(0) = —f'k and 0 O= 0. Using this in eq.(2.36), we get ©o(r?) = - ' k + ^ c o s ( k c srisis) (2.37) Considering only the gravitational and adiabatic fluctuation terms from eq.(2.32) we can write the observed temperature fluctuation as ^fcos(kcsr)sis) (2.38) This will result in a series of peaks in the angular power spectrum at angular scales which correspond to kcsrjsis = nir where n = 1,2, 3.... Physically, this will correspond to scales (2.39) So the first peak of the power spectrum, which can be seen in fig. 2.1, will be caused by a k-mode that has oscillated one-half of a full period before the time of last scattering, the second peak corresponds to a wave that has gone through a full period and so forth. Now we will consider the Doppler term which was ignored above. The photons and baryons in the fluid are tightly coupled, so n7 oc n,b and the baryons are also non-relativistic so rib oc pb- Since the photons follow the blackbody distribution, we have n7 oc T 3. This gives us pb oc T 3, so we can write 6b(rj, x) = 3©(??, x). Using this in the continuity equation (2.4), we obtain k • v fc = 3 i0 o. 20 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. (2.40) Now taking parallel and perpendicular components of v*,, we find th a t the differential of the perpendicular term is proportional to a ” 1 so it vanishes and the parallel term is V|| = x®ok, where k is a unit vector in the k direction. So we have 3i • VkiVsis) « ^-©o(/?s/s)k. (2.41) Taking the derivative of eq. (2.37) and equating the result with eq. (2.41) we obtain A % Vk(Vsis) = —k -j='!>sin(kcsr}sis) (2.42) The problem here is that the sine term from the Doppler fluctuations has the same amplitude but opposite phase to the cosine term from the ['k + 0 O] component of the fluctuations and the acoustic peaks will be erased. Since we know that the acoustic peaks are observed, something must be wrong with our simple assumptions. The problem arises from our neglect of the dynamical effect of the baryons by setting R=0. We will relax the assumption R—0 and see what happens, although we still assume no time dependence for R. If R ^ 0 the sound speed is lower in the photon-baryon fluid, cs = However, R also enters into the gravitational term on the right of eq. (2.36) and this will be bigger by a factor of (1 + R ). We can write Qo (?7) = -( 1 + R ) y + ^(1 + 3R)^cos(kcsrj), U (2.43) and the effective temperature fluctuation becomes [0O+ 'k] (rjsis) = - R 'k + ^(1 + 3R)tycos(kcsr)sis). (2.44) Where [©o+d'] oscillated around zero in our previous solution, it now oscillates around —R$!. Also, the cosine oscillations are now (1+3R) times as large as the previous case. If we perform the same calculation as before for the Doppler term we find the Doppler contribution is amplified by only a factor -^= = k . Hence the cosine term is a factor y /(l + R) larger than the sine term and the acoustic peaks have been recovered. The baryons contribute to the effective mass of the fluid, but not to the pressure, so 21 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. they cause a reduction in the angular frequency of the oscillations, causing the fluid to fall further into the potential well, hence the offset of the oscillations from zero. Since a higher baryon density at last scattering will produce a smaller R, we would expect that the higher the baryon density at the SLS, the higher the acoustic peaks should be. An example of a CMB power spectrum for 5 different values of fit is shown in fig. 2.2. The suppression of the first peak can clearly be seen as gets smaller. In addition to this effect, the peaks corresponding to odd values of n are enhanced relative to the even peaks due to the zero-point offset of the oscillations. Although we do expect to see less power between the peaks, we also do not expect to see the power spectrum dip to zero in this model because the Doppler effect will partially fill the gaps. We can also estimate the position of the first peak in a flat universe. The distance to the SLS is D = r/o — r)su, which is ~ r/0 for r/0 3> r/s/.s. Also, in m atter dominated i 1 1 epochs rj oc az. This gives D = Baft and r]s[s = B a 2ls. The physical scale A is projected to the angular scale 9 as 9= A ~ D 7T a0 ~ _ L ( i o 3) - | V/37T (2.45) and zs[s ~ 103. This scale corresponds where we have used the fact that ^ to about 0.33 deg or £ ~ 170. For the simple model we have developed, this is a reasonable estimate. In fact, detailed models predict the first peak to be closer to I ~ 220. Omitting any detailed discussion, I note that if there is time dependence in the T and $ terms, causing gravitational driving, there will be an enhancement of the acoustic peaks. Damping of the power spectrum on small scales can be caused by the finite thickness of the SLS (cancellation damping) or the fact that the mean free path of the photons increases gradually and not instantaneously (Silk damping). The foregoing discussion is intended only as a simple introduction to the acoustic peaks with some physical intuition and the interested reader is directed to the references for detailed discussions of these topics. 22 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. <N _o + 0 200 400 600 M u ltip o le M o m e n t 800 1000 I F ig u re 2.2: Variation in the acoustic peaks of the CMB with changing values of fib. The cosmological parameters from table 2.2 are used here, except that now fib is varied. Qm is kept constant by varying the CDM density Qcdm (he. = &cdm + — 0.27. The five spectra shown are for f2f,=0.02(yellow), 0.05 (brown), 0.10 (black), 0.15(red) and 0.20(green). The best-fit value for from WMAP is ttb = 0.0441°;°“ . The effect of variations in £2*, in changing the position and height of the acoustic peaks in the CMB can clearly be seen. 2 .7 S econ d ary C M B A n iso tro p y In addition to the primary anisotropies detailed above, the CMB can carry the im print of secondary anisotropies which occur between the photons leaving the SLS and arriving at our detectors. 2.7.1 Integrated Sachs W olfe Effect The Integrated Sachs-Wolfe (ISW) effect occurs when photons travel through timevarying gravitational potentials on their way from the SLS to Earth. If the potential well was temporally constant, the photon would experience no net redshift since the blueshift on entering the well would be exactly canceled by the redshift on exiting the 23 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. well. However, if the potential changes while the photon is in the well, the blueshift and redshift will not cancel exactly and there will a net fluctuation seen by an observer of the CMB. The Integrated Sachs Wolfe effect is typically split into an early and late component. 2.7.2 Early Integrated Sachs-W olfe Effect The time between matter-radiation equality and last scattering can have an effect on the growth or decay of gravitational potential wells. Adiabatic models have constant potential wells if the universe is fully m atter dominated and slightly deeper but con stant gravitational potentials in the fully radiation dominated regime. This implies that close to the time of matter-radiation equality, the wells are in a decaying mode. Since matter-radiation equality happens closest to the time of last scattering in uni verses where the m atter density is small, the early ISW effect is largest in universes for which Qoh2 is small. 2.7.3 Late Integrated Sachs-W olfe Effect For adiabatic models, the gravitational potential decays after the m atter dominated epoch if the universe is negatively curved, or in the presence of a cosmological con stant. Conversely, the potential will grow in the presence of positive spatial curvature. Therefore, if Qtot ^ 1, we expect a changing potential at later times, which gives rise to the Late ISW effect. 2.7.4 R ees-Sciam a Effect The Rees-Sciama effect produces fluctuations in the CMB via the time dependent gravitational potential during non-linear phases of the evolution of the universe. This is, for example, at late times when structures such as clusters and galaxies start to form. However, for most current models of this effect, the anisotropies introduced in 24 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. the CMB will be too small to observe, even with next-generation experiments such as Planck. 2.7.5 G ravitational Lensing Gravitational lensing and primordial gravitational waves can also cause anisotropy in the CMB. Gravitational lensing will cause a general smearing of the power in the CMB power spectrum, shifting some of the power from the peaks. Gravitational waves can cause tensor fluctuations in the metric, leaving a signature on CMB photons. This effect is primarily at low multipoles (I < 30). 2.7.6 R eionization If the CMB photons are scattered between the SLS and the observer, there will be an impact on our observations of the CMB. Reionization may occur locally or globally and both will have an effect on the CMB power spectrum. Photons may get scattered by local reionization such as in hot clusters, where free electrons will scatter the incoming photons. If the cluster has a net motion towards or away from us, the photons will experience a blueshift or redshift accordingly. This is the kinematic Sunyaev-Zel’dovich effect. The black body spectrum of the photons may also be affected by the high temperature of the free electrons in clusters. This results in a redshift below 218 GHz and a blueshift above 218 GHz. It is not thought that local reionization would have a large effect on the CMB power spectrum. If the universe were to reionize at an early time (say z ~ 100) this would have a major impact on the CMB. When looking in a particular direction, we would see photons from different parts of the sky Thomson-scattered into our line of sight. This would result in our seeing some sort of averaged power spectrum in any given direction on the sky. This has the effect of suppressing power on small angular scales in the power spectrum. The WMAP experiment showed that reionization may have 25 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. actually occurred much earlier than had been previously thought. The WMAP value for the age of the universe at reionization is tr = 180tgo°Myr (z ~ 20). 2.8 C o n clu sio n s I have briefly outlined some of the physics behind the peaks in the CMB power spectrum and have shown how the location and heights of the peaks are linked to the conditions in the early universe. In the next chapter I will go on to examine the difficulties associated with estimating the CMB power spectrum. 26 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. C hapter 3 P ow er S p ectru m E stim a tio n C hallen ges and S olu tion s 3.1 In tr o d u ctio n In the previous chapter we saw that the power spectrum provides a convenient method of summarizing the information contained in the CMB anisotropies. We now go on to look at the nature of the power spectrum in more detail and uncover the computational difficulty associated with calculating it directly. We will then examine some current methods for overcoming this difficulty, including an approximate pseudoCe estimator. 3.2 T h e P ro b lem In what follows, we shall assume the CMB is Gaussian and isotropic, an assumption which has a strong basis both in theory and in the observed data. In this case, all of the properties of the CMB temperature anisotropies can be summarized by the power spectrum. In the previous chapter, I introduced the CMB power spectrum and discussed the physical origins of the structure we see in it. Here we are concerned 27 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. with the mechanics of generating the power spectrum from some set of experimental CMB data. The power spectrum is an attem pt to summarize the likelihood of a given set of Ci conditioned on the data, d or a sky map m. This Gaussian likelihood can be written P (S(C e)\d,N) oc (3.1) where S is the CMB signal, N is the experimental noise and C contains information about the covariance properties of the noise and the signal. From this equation it can be seen that direct calculation of the power spectrum would require the evaluation of a matrix determinant. In the best case, the operation count for this scales as Npix where NPiX is the number of pixels in the map. Alternatively, in the case that one is required to begin from time ordered data rather than CMB sky maps, the scaling would go as N?od where N tod is the number of samples in the time-ordered data. This would be the case for correlated, non-stationary, non-white noise, for example. Assuming the best case scenario for a typical megapixel CMB map, such as those obtained from the WMAP experiment, a single power spectrum calculation would require many years. For example, for a map of 3 x 106 pixels, a single determinant calculation would require ~ 1019 operations. Assuming a processor capable of 109 floating point operations per second (FLOPS) operating at 100% efficiency, the computational time would be ~ 10los or ~ 300 years. While parallelization may help this situation somewhat, typical methods for computing the power spectrum require multiple computations of the power spectrum and the computational time would still be several years. This is far too long to be a useful tool for cosmological analysis and the situation is considerably worse if one has to work with the time ordered data. 28 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 3.3 S o lu tio n s In order to overcome this challenge and extract power spectra from CMB data, there are currently three possible classes of solutions which are outlined below. 3.3.1 T h e B ru te Force A pproach For experiments covering only a small fraction of the sky or with low resolution, such as COBE, Npix is a small number and a brute force approach can be used. The COBEDMR data consisted of just 6,144 pixels and so the number of operations required for the brute force matrix determinant calculation in eq.(3.1) is ~ 1012 floating point operations. This is quite feasible on even a modest desktop computer today. As men tioned above, modern, all-sky, high resolution CMB experiments produce megapixel maps and so the brute force approach is mainly of historical interest. 3.3.2 E xperim ent Specific Techniques For certain experimental sky scanning strategies it is possible to exploit symmetries in the scan to reduce the computational complexity to a more tractable level and obtain an exact power spectrum. For example, Wandelt and Hansen [2003] showed that for an experiment that scans on rings which can be combined into an n-torus, the power spectrum estimation can be done in order N 2 operations. The operation count for a megapixel map is ~ 1012, which again is quite feasible. This method can also account for correlated noise, arbitrary beam shapes and partial sky coverage. The power spectrum resulting from this method is exact. An alternative approach, which is also exact, was proposed by Oh et al. [1999]. This approach also scales as N 2, although at the cost of assuming white noise, azimuthally symmetric beams and with limited ability to model foregrounds. How closely the requirements of such methods match real-world CMB experiments is a 29 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. m atter of some debate, although there is no doubt that for the class of observa tions for which they are applicable these methods are much faster than a brute force approach. 3.3.3 A pproxim ate M eth od s for Pow er Spectrum E stim ation In addition to the exact approaches outlined above, which are typically of very limited applicability, a class of general, approximate, but computationally feasible methods have also been proposed. One such method which has been applied to several exper imental data sets is the MASTER method of Hivon et al. [2002], This method, or close variants on it, have been successfully employed in the analysis of several CMB experiments, including the BEAST experiment which is discussed in detail in the next chapter. 3.4 T h e M A S T E R M eth o d The MASTER method is a binned pseudo-Q estimator [Hivon et al. , 2002; Wandelt, Hivon & Gorski , 2001]. The benefits of this estimator are its ease of implementa tion and the flexibility it offers, which allows testing the analysis with a number of cuts and filtering schemes designed to remove Galactic, terrestrial and instrumental foregrounds from experimental data. The term pseudo-Q simply refers to a power spectrum which is calculated on a map of the sky which has had pixels excised from it (e.g. to remove Galactic foreground contamination) and no corrections have been made for the couplings between the spherical harmonic modes introduced by this cut. This is clearly different to the full sky CVs, whose calculation is the ultimate goal of power spectrum estimation. In order to attem pt to undo the effects of the cut-sky map, the MASTER method uses a de-biasing scheme calibrated against Monte Carlo simulations. P s e u d o - a r e 30 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. calculated on the noisy maps over the observed region on the sky. The expectation values of these Pseudo-Q are modeled in terms of an ansatz which involves, as param eters, an instrumental transfer function Fg and a noise bias term Nt- These terms are estimated from Monte Carlo simulations of CMB signal and of experimental noise. 3.4.1 Form alism I briefly summarize the MASTER formalism here, following the development in Hivon et al. [2002]. For an experiment which covers a fraction of the sky f sky and uses a position dependent weighting scheme W (u ) (e.g. to reduce edge effects or down-weight noisy pixels), we can write (3‘2) fsky = i r l as the ith moment of the weighting scheme. Further, if we expand the window function in spherical harmonic space we get = m <3-3> Now we apply the window function IV(u) to a temperature fluctuation map AT(u) and decompose into pseudo-spherical harmonics dem giving atm = J d u A T (u)W (u )Y ;m(u) T (p)W (p )Y ^(p ), (3.4) v where we approximate the sum over the sky as a sum over the pixels, p, in the map, which each have area We can now write the pseudo-power spectrum as Ci = • ^ v j - m——t (3-5) We now relate the ensemble averages of the pseudo-CV to the full-sky Ci via {Ct) = Y , Mw (C t,), v 31 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. (3.6) where Mw is the mode-mode coupling kernel which accounts for the couplings intro duced by the cut sky. This kernel is described in detail in Appendix A of Hivon et al. [2002] and depends only on the geometry of the cut sky. We can also include the effects of instrumental noise, time ordered data filtering and the instrumental beam (3.7) where B f combines the smoothing effects of both the instrumental beam and the finite pixel size of the CMB map. (Nt) is the average noise power spectrum and Ft is a transfer function that attem pts to undo the effects of data processing, such as filtering, on the time-ordered data. This Ft is not theoretically motivated, but is a simplifying ansatz that attempts to correct for the processing of the time-ordered data. A fiducial input power spectrum is now assumed and a number of noise-free fullsky realizations (i.e. maps) of this power spectrum are simulated. These maps are then ’observed’ using the experimental scanning strategy and data processing pipeline and the pseudo-power spectra Ct are computed and the ensemble average taken. In the same way, noise maps are simulated by taking random realizations of realistic simulations of the experimental noise and “observing” the resulting maps with the data processing pipeline. The power spectra of the processed maps are then aver aged to produce (Nt){mc)- We can now use eq.(3.7) to calculate Ft. Intuitively, this approach is appealing. We are attempting to undo the effects of our data processing so we take some known data and process it with our experimental data pipeline. We know what the data looked like when it went into the pipeline and we know how it looks after processing and the difference is encapsulated in the transfer function Ft. However, it is worth pointing out once more that this is an ansatz to describe a much more complex situation and is not theoretically motivated. In practice an iterative scheme is used to solve for the Ft and this is described in Hivon et al. [2002]. 32 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 3.4.2 B inning In order to reduce correlations between adjacent Ci th at are induced by the cut sky, a binning scheme is introduced. First we introduce the flattened power spectrum Ce — 1(1 + In order to bin the power spectrum, a binning operator Py is defined such that the binned power spectrum Cb = PyCe. The reciprocal of this, Qy, corresponding to an interpolation operation is also defined Phi = < ;f o < p(b) 1 i(f+b 2tt/M-l) >) ’ low .(< low 11 Z 0, otherwise — l low < / < /k+b — r — low and ^ + 1 ) ’ — low — — low Q lb 0, otherwise, where we have nuns bins, indexed by b. After somemanipulation of (3.7) we can write the binned Ci estimator as (C h) = K ^ P v t ( ( C i ) - ( N i ) ) , (3.8) where K hl) = PbiKwQi’v = PbiMii'FiiB},Qiiv ■ (3.9) To the extent to which the MASTER ansatz models the expectation values of the pseudo-C^ and to which our Monte Carlo procedure mimics the acquisition of the real data, we are guaranteed an unbiased power spectrum result. The unbiased power spectrum estimator of the full sky power spectrum, C*, is then given by C h = Kg}Pv l( ( C i ) - ( N i ) ) . (3.10) Lastly, we want to compute the error bars on the binned power spectrum esti mate C b- A number of Monte Carlo simulations containing both signal and noise 33 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. are performed. The covariance matrix of the estimates is calculated by computing the pseudo-6^ estimator on these simulations. The diagonal elements of the binned covariance matrix are the variances of the binned power spectrum. The result of applying this technique is an approximate, un-biased estimator of the CMB power spectrum which is obtained at relatively low computational cost in a short time. Although the approximate nature of the estimator is a drawback, its ease of application and speed have made it popular for many CMB experiments and a modified version of it was used for the 1st year WMAP data. In the next chapter I describe in detail the application of the formalism outlined above to the BEAST CMB data. 34 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. C hapter 4 A p p lication o f an A p p roxim ate Pow er S p ectru m E stim ator to B E A S T D a ta 4.1 In tr o d u ctio n I present the application of the approximate power spectrum estimator discussed in the previous chapter to experimental data obtained from the Background Emis sion Anisotropy Scanning Telescope (BEAST). This provides a useful insight into the business of power spectrum estimation and will provide a solid foundation for the later discussion of an exact power spectrum estimator that has been developed called MAGIC (Magic Allows Global Inference of Covariance). I briefly describe the experi ment and then give a detailed description of the application of the MASTER method. The formalism of MASTER was detailed in the previous chapter and in this chapter I will focus on the mechanics of implementation. I will then discuss features of the data which make power spectrum estimation challenging, discuss the computational challenges and examine the power spectrum that was obtained from the data. 35 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 4.2 D e sc r ip tio n o f th e B E A S T ex p erim en t BEAST is a 2.2 meter off axis telescope, currently configured with an 8 element mixed Q (38-45 GHz) and Ka (26-36 GHz) band focal plane, and a modulating flat mirror. BEAST was designed as a high altitude balloon system and had two flights: May 20-21, 2000 and October 16, 2000. Subsequent to the second flight BEAST was reconfigured to take advantage of the University of California White Mountain Research Station, Barcroft Station, CA at an altitude of 3.8 km in the Eastern Sierra Mountains of California. This site has excellent seeing conditions, with precipitable water vapor typically half that of Mauna Kea. It is also accessible by conventional means, in contrast to other CMB experiments which may be located in inhospitable reaches such as the South Pole. A photograph of the telescope is shown in Fig. 4.1. The instrument was fully installed and operational at Barcroft in July, 2001, and took data nearly continuously until December 2001 (except for weather and several equipment failures due to power surges and lightning). Two more weeks of data were obtained in February 2002. A second data taking campaign proceeded in August and September of 2002. The data used for determining the power spectrum are taken from all three of these campaigns. The data presented here were gathered using the BEAST telescope in a fixed elevation mode. The telescope is kept at a fixed elevation near 90 degrees and the rotation of the Earth provides the map scanning. A rotating flat mirror produces scanning in the vertical direction. This strategy results in a sky coverage which forms an annulus centered on the North Celestial Pole. The annulus is 9 degrees wide and is located between 33 and 42 degrees in declination. The BEAST focal plane consists of an array of cryogenic High Electron Mobility Transistor (HEMT) receivers. Two at 30 GHz and six at 40 GHz. The effective receiver sensitivity in the white noise limit is less then 400 m K /s1^2 and the rms noise is about 20K. 36 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. F ig u re 4.1: Photograph of the BEAST telescope in its sliding roof garage at UC White Mountain research Station, Barcroft Station, CA. A more detailed discussion of the BEAST instrument can be found in the Childers et al. [2005] and a more detailed discussion of the optics can be found in Figueiredo et al. [2005]. 4.3 B E A S T and M A S T E R 4.3.1 T he D a ta and M ap M aking A total of 682 individual hours of experimental data are used for the analysis. The data are naturally divided into 55 minute sections by hourly calibration cycles. These 55 minute sections are a useful size for several reasons. In addition to the natural delineation by calibration cycle, 55 minutes is a very manageable data size for ma nipulation in the IDL software package on a desktop computer. The most important effect of this choice of 55 minutes is on the atmospheric offset subtraction described below. The sensitivity of the results to varying the time-scale of our atmospheric offset removal from the fiducial hour down to a minimum (set by sky rotation) of 600 37 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. seconds was tested and no significant changes were observed. The data have been inspected and spurious signal events, e.g. due to aircraft, have been removed. For detailed discussion of the data processing see Meinhold et al. [2005]. The data includes both the signal measured by the experiment and the experimental pointing at that instant. This information is used to construct a sky map of the observed signal. For all the maps created in the d ata analysis the HEALPix pixelization scheme is used. HEALPix 1 [Gorski et al, 1999] (Hierarchical Equal Area isoLatitude Pixelization) is, as the name suggests, a scheme for pixelizing the sphere. HEALPix is a mathematical structure that supports discretization of functions on a sphere at a variety of resolutions. It also allows statistical analysis to be performed on all-sky data sets in a quick and accurate manner. HEALPix has three main properties: • A sphere hierarchically tessellated into curvilinear quadrilaterals. Resolution is increased by simply sub-dividing each quadrilateral into 4 new ones. • Areas of all pixels at a given resolution are identical. • Pixels are distributed on lines of constant latitude. This is very useful for rapid calculation of spherical harmonic coefficients. The HEALPix pixelization scheme is shown in Fig. 4.2 at four different resolutions. The resolution of the map, that is the number of pixels nPiX, is determined by the n Side parameter of HEALPix. n side is the number of individual pixels which lie along one edge of a base HEALPix tile, shown in dark gray in fig. 4.2. The relationship between nPiX and n Side is nPiX = 12 x n s2ide. Another property of the HEALPix scheme which will be useful when considering computational scaling, is the fact that the pixels align on rings as a result of the isolatitudinal structure of the pixelization scheme. 1h t t p : / /w w w .eso .o rg /scien ce/h ealp ix / 38 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. F ig u re 4.2: The HEALPix pixelization scheme. The spheres contain 12, 48, 192 and 768 pixels corresponding to n Side parameters of 1, 2, 4 and 8 respectively. For a given HEALPix resolution parameter n Side, there are 4 x n side — 1 rings in the pixelization scheme. The BEAST maps were all analyzed with an HEALPix n Side parameter of 512, corresponding to a map containing 3,145,728 pixels and 2047 rings. Given the size of the experimental beam and the high sampling frequency which is possible with a ground based instrument (450Hz for BEAST), the effects of pixel smoothing are negligible and are ignored here. For the experimental data, HEALPix maps are created and the CMB power spectrum calculated using the routines available as part of the HEALPix anafast package. Further details of the map making process can be found in Meinhold et al. [2005]. 39 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 4.3.2 B eam and P oin tin g R econstru ction In order to produce an accurate CMB power spectrum from the BEAST data, a detailed knowledge of the experimental beam shape and pointing is required. Com parison of the amplitude and morphology of Cygnus A in the BEAST maps with simple Gaussian simulations of the beam leads to the conclusion that it is well mod eled by a circularly symmetric Gaussian profile with an effective FWHM of 23' ± 1 '. The measured telescope resolution is 19' ± 2'. The smearing from the design resolu tion to the measured effective FWHM of 23' is due to a combination of unmeasured pointing errors (telescope flex, long term telescope sag), residual errors in the point ing reconstruction algorithm, smoothing due to the finite HEALPix resolution of 6.9', and smearing from the initial flat rotation sectors (about 6.7'). 4.3.3 Foreground R em oval A foreground mask is applied to remove the Galaxy and point source contamination from known sources. All pixels with Galactic latitude |b |< 17.5° are removed from the analysis. The analysis pipeline was tested with a range of Galactic latitude cuts and it was found that below |b|= 17.5° there was significant Galactic foreground contamination, while above this Galactic latitude our results were relatively insensi tive to the choice of the cut. In addition to this, a separate analysis of the Galactic foregrounds for the BEAST experiment [Mejia et al., 2005] calculated the percentage contribution of each foreground to the observed temperature of the CMB in both the Q and Ka bands. It was found that using a mask for |b |< 17.5° gives an optimal compromise between maximizing the sky fraction observed by the experiment and minimizing the amount of foreground contamination. In this work it was also found that residual Galactic foregrounds outside the mask are small and they are ignored here. In order to remove the contaminating effects of point sources, a simple algorithm 40 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. to identify potential sources in the BEAST field was applied. First, the raw data map was smoothed with a 20 arc minute FWHM Gaussian. All pixels in this smoothed map with at least 100 observations and a data value (corrected for smoothing) at least 3 times the noise level were flagged. The noise level at each pixel was estimated from the ensemble of measurements at each pixel. A potential source was identified with the local maximum of neighboring marked pixels. There were 12 potential sources thus identified in the region |b|> 17.5°. Several of these have positions coincident with bright 3C radio sources and have fluxes of several Jy; others are probably false triggers caused by noise fluctuations. Pixels within a 1 degree diameter circle centered on each potential source were flagged and were excluded from the subsequent power spectrum analysis pipeline. The positions of these are evident in fig. 4.3. 4.3.4 Fig. D a ta A nalysis P ip elin e 4.4 shows an overview of the steps in the BEAST simulation and analysis pipeline. These steps are discussed in detail below. The first check on the MASTER method and BEAST pipeline was to test whether the initial input power spectrum had any effect on the final BEAST power spectrum. If the MASTER method is correctly implemented, the final power spectrum should be independent of the fiducial input spectrum used to calculate the transfer function described in §3.4.1. The pipeline was tested with two fiducial cosmological models and it was found that the final power spectrum was unchanged by this choice. The first model was a set of reasonable current estimates for cosmological parameters prior to the WMAP data release and the second was the best-fit power spectrum published by the WMAP team [Bennett et al, 2003a]. For the final analysis, the WMAP power spectrum was used to create random realizations of the pure CMB sky. The set of pure signal maps simulated using the fiducial cosmological model were scanned using the experimental pointing strategy read directly from information con41 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. S im u l a te d BEAST N o ise Map S i m u l a l o d BEAST S i g m i ] Ma p 0 00020 BEAST D a ta M ap S im u l a te d 3EAST S ig n a l »N o ise M ap •0C010 (2 * 0 0, JO.'Ji (249 0. 00 .0 ; G a l te b c Figure 4.3: Comparison of simulated and actual BEAST maps in Kelvin. The noise dominated nature of the BEAST data can be seen by comparing the noise map to the BEAST data map. Note that the temperature scale of the simulated signal map is one fifth th at of the other maps. 42 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. Sim ulated Noise TOD Simulated Signal TOD Experimental Tim e Ordered Data (TOD) Filtering: Template removal and high A dd individual signal and noise maps together and calculate Cl estimate. Use these to estim ate error bars Map Making Routine: Project TO D onto sky using experimental scanning strategy Calculate pseudo- C l for each map. Take average of Signal Cls and average of Noise Cls Use MASTER formalism to calculate transfer function and produce binned Cl estimates Calculate experimental window function and use to calculate mode-mode coupling kernel BEAST power spectrum estimate F ig u re 4.4: Overview of the steps in the BEAST simulation and analysis pipeline. tained in the time-ordered data (TOD) files. It is expected that the time-averaged atmospheric contributions to the data will vary with elevation. To remove this fore ground a function of elevation angle is fitted to the TOD for each hour and subtracted from the TOD samples. Subsequently a 10Hz high pass filter is used. The simulation has now been subjected to exactly the same scanning and filtering as the real BEAST data and we project this simulated data back into a “BEAST processed” sky map. Power spectra were then generated from each signal map and averaged to produce an average signal-only power spectrum, as required in eq. (3.7). Since all of the pointing information is available, it is possible to create the exper imental window function on the sky. This is a simple geometrical construction which is 1 for any HEALPix pixel which the experiment observes and 0 elsewhere. This 43 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 2.0 0 .5 0 .0 0 .5 0 2 00 400 600 M ultipo le I F ig u re 4.5: Unbinned transfer function for BEAST. Monte Carlo noise is visible, which is smoothed by the binning process. The turnover at £ ~ 550 is caused by the ill conditioned mode-mode coupling kernel window function is used to calculate the mode-mode coupling kernel, M u' , which de pends only on the geometry of the observed region of sky. The ansatz for the expected p s e u d o - t h a t was proposed in Hivon et al. [2002] and discussed in detail in §3.4.1 is used here also. From the signal-only simulations the transfer function can also be calibrated. As a reminder, this equation is Fe = Mee}(C£s)(Ce) - \ B ? ) - \ (4.1) where (Ces) are the signal-only pseudo CJ( and (Ce) are the best-fit theory Ct from the WMAP experiment. Bf is the experimental beam, a Gaussian with FWHM of 23' in this case. Since the coupling kernel is ill conditioned we use an iterative approach for computing M^e}(Ces). The actual transfer function obtained for the BEAST experiment is shown in Fig. 4.5. To construct noise-only maps, our signal estimate for the map is subtracted from each sample in the experimental TOD and it is assumed that each hourly segment 44 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. of experimental data is now noise-dominated. It is further assumed that the noise is piecewise stationary over one hour sections of data and th at each one hour noise chunk is independent. The noise power spectra are estimated using a windowed FFT on each hourly segment, as described in Press et al. [1986]. Synthetic noise simulations can then be generated which have the same power spectrum as the actual noise from the experiment. The simulated noise TOD is filtered in the same manner as for the actual data and the signal simulations, and was then projected into a noise sky map. The average noise power spectrum was then calculated. Comparisons of the data map and the maps created in the simulation pipeline are shown in Fig.4.3. Now the Ci estimate is given by a M~a HCt, - (N't )) = -------- H b | ------- < 4 ' 2 ) where (Ne>) are the pseudo-CV from the noise Monte Carlo simulations and Ce are the pseudo-q from the data. In practice a binned version of the above equation is used as discussed in detail in §3.4.2. The binned mode-mode coupling kernel is shown in Fig. 4.6. By averaging the power spectrum over bins in £, the correlations between the Ce bins which were introduced by the sky cut are effectively reduced and the errors on the resulting power spectrum estimator are also reduced. Various different binning schemes were examined and finally a bin width of At' = 55 was chosen. The width of the diagonal band in the mode-mode coupling kernel is approximately 50 elements in I space. In order to produce uncorrelated t-bins, a bin width of at least this size is required. A bin width of 55 should assure this, while providing for a maximal number of uncorrelated binned points in the power spectrum. 40 signal and 40 noise simulations were created for the analysis, and the final step is to create a set of sky simulations by adding the signal simulations and noise simulations together. The covariance matrix Cw of the binned power spectrum is calculated from these simulations and the diagonal elements give the error bars on 45 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. Figure 4.6: Mode-mode coupling kernel for the BEAST experiment. The z-axis is logarithmically scaled in order to show the off diagonal elements, which decrease rapidly. The width of the diagonal is approximately 25 in I either side of the peak. In order to avoid correlations between the bins in our final power spectrum, we therefore choose a bin width of 55 in I. the binned power spectrum estimator. The power spectrum obtained from this process is discussed in §4.4. 4.3.5 Error Checking At various stages in the pipeline tests for systematic errors were performed. For example, checks were performed for atmospheric and electrical noise at the TOD and map level. Jack-knife analyses were also conducted on different cuts of the data and no significant discrepancies were detected. Comparison was made of the power spectra obtained from maps of the first-half of the data against the second-half and odd numbered horns against even numbered hours. The results of these tests were found to be consistent and any systematic errors present are thought to be below the noise level. 46 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. Calibration uncertainties for BEAST are dominated by uncertainties in atmo spheric temperature and limited by the lack of celestial calibration sources in the region of the sky scanned by BEAST, with the exception of Cygnus A. Based on calibration tests with Cygnus A, the calibration uncertainty is estimated to be ± 6%. The calibration of the instrument is discussed fully in Childers et al. [2005]. 4.3.6 C om p u tation al C onsiderations Despite being computationally feasible, the computing resources required to imple ment MASTER are still considerable. The code for the BEAST analysis pipeline was written and executed on an IBM SP RS/6000 (seaborg) at the National Energy Research Scientific Computing Center. The code was parallelized using MPI and the final production runs were executed on 640 processors. The parallelization scheme was implemented at the level of the hourly data segments and a master-slave ap proach to the parallelization was taken. Each slave processor works on a one hour data segment and returns the results to a master processor which keeps track of the results and reconstructs the entire time ordered data stream in the correct order. The slave then looks for additional work units until the entire TOD had been read and analyzed. This scheme worked well because of the computational scalings involved in our analysis. Our Fourier Transform operations (such as filtering the data) scale as A log N, where N is the number of data points in the time ordered data stream. The spherical harmonic transforms required by the analysis pipeline scale as 'rings, where l max is the maximum value of I to be included in the analysis (which was 600 for the BEAST analysis) and Nrings is the number of HEALPix rings in the maps, as described is §4.3.1. Finally, the Input/O utput requirements for the pipeline scale as Ntod■ By analyzing each hour of data individually, it was possible to speed the code up by a factor of ~ Nproc, where Nproc is the number of slave processors. This is only approximately true since there is some overhead involved with initializing the code 47 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. and housekeeping tasks performed by the master processor, which cannot be reduced by parallelization. In treating each hour of data independently we make the assump tion that there are no correlations in the noise between the hourly data segments. While this is an approximation, the high pass filtering we performed should ensure that it is a good approximation by removing large scale correlations from the data. In order to further minimize computational time, the HEALPix routines synfast (which makes a sky map from a power spectrum) and anafast (which calculates the power spectrum from a sky map) were modified so that they only synthesize and analyze the portion of the sky where BEAST scans. Since we are only interested in pseudo-C* for most of the simulations, cutting the rest of the empty sky does not cause any issues. Since the data set read in for the BEAST simulations is ~ 80GB and the output maps for 40 MC runs are ~1.7TB, compression algorithms for storing the output maps on disk were also implemented. The final power spectrum estimate required 40 Monte Carlo simulations. This allows the error bars on the power spectrum to be estimated to ~20% accuracy. This final run took approximately 1000 CPU hours on seaborg. 4 .4 T h e P ow er S p ectru m The CMB power spectrum extracted from the BEAST data is shown in Fig. 4.7. The power spectrum is shown in Table 1 . The 1-cr error bars shown in the figure should be interpreted with some caution. 40 Monte Carlo simulations allows the error bars to be calculated to within 20%, which is sufficient for our purposes here, but more simulations would lead to more accurate error bars. In addition, the Monte Carlo simulations are used to calculate the transfer function (Fe), which is then used to produce the Ce estimates and these same simulations are used to calculate the error bar on these estimates. Therefore, our estimate of the error bars on the power 48 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 8000 6000 t CJ ^ 4000 + ' —/ 2000 0 0 200 400 600 M ultipole M om ent I Figure 4.7: CMB anisotropy power spectrum for the BEAST experiment. Error bars are lcr spectrum is not unbiased and we underestimate the size of these error bars. Following on from eq. (3.10) we can introduce a binned noise estimate on the sky Nb = K h)PVi(Nt)Mc (4.3) In calculating the 8 binned Ci estimates, we effectively compute a binned transfer function Tb and a binned noise estimate N b for each bin. There are 40 Monte Carlo simulations of noise to estimate N b and 40 signal simulations to estimate Tb. Based on the number of degrees of freedom used to produce these 8 binned N b and Tb, it is estimated th at the bias in the error bar is approximately 15%, which is of the same order as the Monte Carlo uncertainty in the errors. However, since this latter effect is a systematic bias, the comparison of the BEAST power spectrum estimates and the resulting parameter estimates to WMAP should be taken as ”worst-case” consistency checks. A x 2 comparison of the BEAST data and the WMAP data was performed. For this comparison the WMAP data was assumed to have zero error. We find a x 2 49 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 8000 6000 n — i— i— i i m --------------------1---------- 1-------- 1— r —i— i tt WMAP ACBAR Boomerang02 DASI CBI MAT98 MAXIMA VSA Beast Beast o O A <1 □ 4000 ------------------1----------- 1----1— i— i— i i i * © • u 2000 0 i i i iii 10 Figure 4.8: Combined CMB anisotropy power spectrum for BEAST and other ex periments. parameter of 15.02. W ith 9 degrees of freedom this means a larger value of x 2 would occur approximately 10% of the time, so the BEAST power spectrum is marginally consistent with the WMAP result. The BEAST power spectrum over-plotted with a selection of recent CMB experiments is shown in Figure 4.8 4.5 C o n clu sio n s I have presented the angular power spectrum of the cosmic microwave background as measured by the BEAST experiment. This demonstrates that it is possible to extract cosmological signal from an easily accessible, ground based CMB experiment which is dominated by correlated noise and th at the resulting power spectrum is statistically consistent with previous results. I successfully implemented the MASTER method and although this method is approximate, it proved to be flexible and robust and, in the final run, produced a power spectrum with less than 1000 CPU hours of computational time. The existence 50 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. of the natural time scale for splitting the problem (55 minutes between instrument re calibrations) greatly simplified the parallelization of the BEAST analysis and provided a natural slicing when testing for systematics (e.g. even vs. odd hours). At the time it was analyzed the BEAST CMB dataset was one of the largest TO D ’s analyzed and this proved feasible within the MASTER framework. This suggests th at the analysis of future, larger CMB datasets (e.g. Planck) should be computationally feasible. Approximate power spectrum estimators have their uses, particularly for very large datasets where analysis must be performed in the domain of the time-ordered data. They also provide a method for quickly obtaining an approximate power spec trum from the data, highlighting areas in the data which might give cause for concern and providing a guide for an exact power spectrum estimate from the data. However, a method which combines an exact analysis with the flexibility and computational feasibility of an approximate estimator would be optimal. In the next chapter I shall go on to introduce such a method and and then demonstrate its application to the first-year WMAP data in Chapter 6. 51 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. C hapter 5 A B ayesian A pproach to Pow er S p ectru m E stim a tio n - M A G IC 5.1 In tr o d u ctio n In this chapter I outline a new approach to CMB power spectrum estimation. This approach was developed simultaneously by Jewell et al. [2004], Wandelt et al. [2004] and Wandelt [2004]. The particular version of the approach which I will discuss here is called MAGIC (MAGIC Allows Global Inference of Covariance) and has several advantages over current methods of power spectrum estimation. It is a sampling technique, and, assuming sufficient samples are available, the method is exact, unlike the approximate methods detailed in the two previous chapters. In addition, the computational scaling is very favorable, and MAGIC is roughly as fast as any of the approximate methods, even for a very large dataset. A detailed discussion of the computational scaling is given in §5.4. It also lends itself naturally to parallelization, thus further speeding up computational time, assuming the availability of sufficient computing resources. Before discussing the algorithm and method in detail I first provide a brief overview of Bayesian statistics and Gibbs Sampling. 52 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 5.2 R eq u ired M a th em a tica l B ack grou n d 5.2.1 B ayesian S tatistics For our purposes only a working knowledge of Bayesian statistics is required. Bayes’ Theorem [Bayes , 1763] is a simple tool which allows calculation of conditional prob abilities. We begin with a definition of conditional probability: The probability of a hypothesis H conditional on a given body of data D is the ratio of the unconditional probability of the conjunction of the hypothesis with the data to the unconditional probability of the data alone, i.e. P(H\D)P(D) = P ( H and D) = P{D\H)P(H) (5.1) where P(H\D) is read as the ‘probability of the hypothesis H given the data D.’ In his posthumously published work Bayes [1763] provided us with a relation between conditional and marginal probabilities. Dividing (5.1) by P(D) we find P(H\D) = P{D\H)P(H) (5.2) P(D) Each term in (5.2) has a conventional name. P(H) is the prior probability of H or simply “the prior” . The prior encodes any knowledge we might have about the hypothesis H before we know anything about the data D. P(H\D) is the posterior probability or “the posterior” . This is posterior in the sense that our knowledge of this distribution is gained after we examine the data D. P(D\H) is the likelihood for D given H and is sometimes written L(D\H). P(D) is the prior or marginal probability of D, which is also sometimes called the evidence. This behaves as a normalizing constant in Bayes’ Theorem. Often we do not have any prior knowledge of the data D and so this constant is ignored and we simply write P(H\D) oc P(D\H)P(H). 53 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. (5.3) 5.2.2 G ibbs Sam pling Gibbs sampling is a general method for probabilistic inference which is especially useful for applications where incomplete information is present. Geman and Geman [1984] introduced the idea of Gibbs sampling in the context of image processing. Tanner & Wong [1987] showed the usefulness of the method for problems with incom plete data and Gelfand and Smith [1990] demonstrated its broader applicability in a Bayesian context. One way to think of Gibbs sampling is as a special case of the Metropolis-Hastings algorithm where the proposal density q is the product of full conditional distributions. The Metropolis-Hastings algorithm is a method for producing a correlated sequence of draws from a target density that is difficult to sample by more traditional methods. For example, consider the density q(ip\ip') where we wish to draw ip1 from the source density 7r('0) given the current value ip- The Metropolis-Hastings algorithm is then a two step process. We draw a value from the generating density function and then accept or reject this value as the next step in the Markov chain with probability P(ip,ip'), where PtyW ) = >!] 0, if V) otherwise. If we reject the new value, the next sampled value is the same as the current value. Gibbs sampling corresponds to the case of sampling from a target P( X , Y) with a knowledge of P ( X \ Y ) and P{Y \X ). Furthermore, for the case of Gibbs sampling, P(ip\ip') = 1. Now consider a vector of parameters of interest $ and x nuisance parameters (@i . . . 0 X), both of which are initially unknown. We also have some observed data D. We begin at a random initial starting point (©i°\ ©2° \ ■■•, ©i°\ and then iterate repeatedly steps 1 to 4 1. Sample ©i*+1^ from P(© i|© 21\ . . . , 6x \ D). 54 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 2. Sample ©J+1) from P ( 0 2|0 (1i+1), ©$\ . . . , # , <f>(i), D). 3. Sample ©%+l) from P ( 0 D|0 (/ +1), . . . , 0 ^ , 4>« £>). 4. Sample from P(4>|©f+1), . . . , 6{t l\ 4>(i), D). The vectors 0 (° \ 0 ^ , . . . , 0® represent the realization of a Markov chain, where the transition probability kernel from 0 ' to 0 is defined as: k(&, ©) = p (©i|02, .. . , ex,*,D)* p ( e ' 21©;, ©3, ...,©*,$, d)* pt© '!© ;,...,© '^,*,/)), (5.4) The joint distribution of ( 0 ^ , . . . , ©x\ (f>(d) converges geometrically to P ( 0 i , . . . , 0 X, $ |P ) as i —> oo. This is different to Metropolis Hastings because in each step only one parameter, © d , is allowed to change. The strength of sampling in the context of power spectrum estimation is that the joint distribution of the parameters will converge to the joint probability of the parameters given the observed data. We shall see how this is particularly useful in the context of the MAGIC algorithm shortly. 5.2.3 C onjugate G radient M eth od In order to solve the systems of linear equations which will arise in our implementa tion of the MAGIC algorithm we prefer a computational technique th at is simple to implement and computationally fast. A good choice is the conjugate gradient algo rithm. This method is described in Press et al. [1986] and another excellent resource is the paper by Schewchuck 1. Here I present an outline of the central ideas sufficient to understand the following material. The conjugate gradient method is an algorithm for solving systems of linear equa tions of the type A x = b . It is an iterative approach to finding the nearest local ^ t t p r / Z www.cs.cm u .ed u /~ q u ak e-p ap ers/p ain less-co n ju g ate-g rad ien t.p d f 55 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. minimum of a function of n variables and uses conjugate directions to achieve this rather than the local gradient. It outperforms techniques such as steepest decent, particularly in the case that the vicinity of the minimum of the system is shaped like a long narrow valley. The algorithm proceeds by generating vector sequences of approximations to the solution of the system of equations, x, residuals, r, corresponding to these iterates and search directions, d, used in updating the iterates and residuals. For an equation of the form A x = b the iterative scheme is: d( p) =r ( 0) = b - A x ( p ) , (5.5) r b)r W ^ ~ »E (i+ l) = ( 5 ' 6 ) ^(i) “I- r(t+1) = r(i) - at(i)Ad(i), a r (i+ l)r b + l) P(i+1) - rT r , s (5.8) f.N. > d(i+1) = ^(i+i) + P(i+i)d(i)- ( 5 ‘9 ) (5.10) The convergence properties of the conjugate gradient method can be improved by the use of a preconditioner.That is, we provide an approximate solution, M, to the system of equations with the property that M ~ xr is easy to compute. The choice of a suitable preconditioner for the MAGIC algorithm is discussed later in this chapter. 5.3 T h e M A G IC A lg o rith m 5.3.1 Background The sampling approach to CMB power spectrum estimation was first suggested by a group at the University of Illinois (Wandelt, Larson, and Lakshminarayanan, 2004) and simultaneously an equivalent approach was proposed by a group at NASA’s Jet 56 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. Propulsion Laboratories (JPL) (Jewell, Levin, and Anderson, 2004). Subsequent to these first expositions of the Gibbs Sampling method being published, the groups from Illinois and JPL agreed to collaborate on this project and its development has since been a joint effort. At Illinois, a basic version of the MAGIC algorithm was developed and applied to the COBE data. An independent computational algorithm called COMMANDER was developed at the JPL. Having two codes enabled our collaboration to cross check our results and provided opportunities to test different methods of coding the algorithm and the effects of variations in foreground removal techniques. All of the results published in this thesis are taken from the work I performed at Illinois, unless otherwise noted. These results were continually checked against, and agreed closely with, those obtained simultaneously at JPL. 5.3.2 T h e M A G IC algorithm Here I summarize the MAGIC algorithm in what I hope will be a clear and concise fashion. However, if further clarification, or an alternative description is required, the reader is referred to the original treatments by Wandelt [2004], Jewell et al. [2004] and Wandelt et al. [2004]. The main difference between the MAGIC approach and previous attem pts at power spectrum estimation is that rather than trying to solve the maximum likelihood problem directly, we sample from the probability density of the power spectrum Ci given the data d, P(C^|d). While directly sampling from this density is difficult or impossible, we can sample from the joint posterior density P(Ce, s|d), where s is the CMB signal map. Expectation values of any statistic of the Cg will converge to the expectation values of P(Cg\d). The theory of Gibbs sampling discussed in §5.2.2 tells us that if we can sample from the conditional densities P (s |G , d) and P(CV|s, d) then iterating the following two sampling equations will, after an initial burn-in period, 57 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. lead to a sample from the joint posterior P(Ce, s|d): si+1^ F ( s | Q , d ) , (5.11) P(Ce\si+1). (5.12) C f1 Here the symbol denotes drawing a random realization from the density on the right. 5.3.3 D erivin g th e Sam pling Equations We now examine the steps required to sample from the probability densities on the right hand side of (5.11) and (5.12). The first step is to choose a model for this data. Here we choose: d = A(s + f ) + n , (5.13) where d is the experimental data, A is the pointing matrix which maps the sky into the time ordered domain, s is the CMB signal and n is the experimental noise in the time ordered data. / represents the foreground contributions we wish to include. For CMB experiments the principal foregrounds are free-free, dust and synchrotron emission, but others arepossible. We will turn in Chapter 7to a detailed discussion of the treatm ent of foreground contamination and note that for the time being we choose a simplified treatment which does not require sampling over the foreground contributions. / will then disappear from the equations that follow, but will be resurrected in Chapter 7. The object of our analysis is to estimate the joint posterior P(s,Ce,d). We can rewrite this equation in a number of useful ways by applying Bayes’ Theorem: P(s,Ct,d) = P ( s, C e)P(d) = P(d\s,Ce)P(s,Ce) = P{d\s,Ct)P(s\Ct)P(Ct). 58 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. (5.14) The right hand side of these equations then tells us how to go about sampling the left hand side. We may wish to include other components in the joint density, such as cosmological parameters, which we can also estimate, but for now we content ourselves with the above description. Now we consider modifying our equations to account for the more complex and realistic situation of multiple experiments or multiple frequency channels within one experiment. The model equation for the ith frequency channel in our CMB experi ment now becomes di = A i S Jr n i. 5.3.4 (5.15) C onstructin g th e D en sities We now turn to the task of constructing the various components on right hand side of equation (5.14), in order that we may sample from them and construct the posterior on the left hand side. Two of the terms are easily dealt with. While more general choices are possible (e.g. Wandelt et al. [2004]) we set the Ci prior P(Ce) to a constant and can then ignore it in what follows. Since we do not enter a discussion of model selection we will also ignore the evidence P(d) in the following. We now derive the equations for each channel of data, i, that is to be included in the analysis and combine these at the end. In order to construct the likelihood P(d|s) combining all channels, we note that p( d| S) = n p ^ i s)’ (5-i e ) i so we construct a Gaussian likelihood: p (di\s) = y/\2irNi\ (5.17) where N) is the noise covariance for channel i in the time-ordered data domain. 59 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. Next we have the probability of the signal given the Ce = v i t s r ' Ts^ ( 5 -i 8 ) where S is the signal covariance. In spherical harmonic space we have S — ^ ^ SlrnCg $tti5rnmtSlfm! I m llm l lslm = Cm = (5-19) i m The matrix of Ce isblock diagonal. For each I, m runs from -i to +1, so there are blocks of 21 + 1within the Ce matrix. The signal prior can therefore be written ■ ■ 1 1 W4) =II---- sre"2 I ' <5'20) Now we can construct the posterior: P(s, Ce, d) = P{d\s)P{s\Ce)P(Ce) 1 e - f E Adi-AisfN-'ldi-Ai*} y/\2itNi I 1 (5.21) - l . T g - i , x cQnst y/\2nS\ The procedure for deriving the sampling equations from this density is to differ entiate with respect to the parameter of interest while holding all other paramters constant. This corresponds to computing the mean of the conditional density for that parameter. We differentiate once to obtain the conditional mean and a second time to obtain the conditional covariance of the parameter of interest. We are then able to set up a sampling scheme which will allow us to implement the Gibbs sampling equations (5.11) and (5.12). From (5.22), it can be seen that as long as the prefactors of each exponential do not depend on the variable of interest, they will not affect the differentiation and can safely be ignored. For the log posterior, this gives us: In(P(s,Ce,x,d)) oc - Ais]TN ^ l \di - Ais] - ^ sTS ~ 1s . 60 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. (5.23) 5.3.5 Signal Sam pling For the conditional mean we want J^(ln(P(s, Ce, d)) = 0, so ~ i Ais]TNtr l [di ~ A is\ ~ \ s T s ~ls) = 0> (5‘24) Evaluating (5.24) gives J 2 ATNt~1[di - A ix } - S - 1x = 0, i (5.25) where x is now the mean signal. We also note that the matrix N has so far referred to the noise in the time ordered data. It is often more convenient to work in terms of maps and the relationship between the time ordered data noise Ntod and the map noise Nmap is Nmap = A TN todA. Furthermore ATN~oldA m = AT N~olddi. So we have N - 1m = (N~1 + S ~ 1)x. (5.26) This equation is more numerically convenient to sample from if we rearrange it slightly. Multiplying both sides by S 1^2 gives S 1^2N ~ 1m = (S1/2N ~ 1 + S~1/2)x. Now taking a factor of S~l!2 out of the bracket on the right hand side gives S ll2N ~ lm = (S ll2N~lS 1/2 + l ) £ - 1/2x. (5.27) We now have to find the second differential of (5.23) to obtain the conditional covari ance DSig of the signal, which is trivial D;i=(N~'+S~'). (5.28) In order to be able to sample the signal we need to generate both a map with mean given by (5.27) and a fluctuation term. The fluctuation term is not trivial to sam ple because we need to simulate a map with covariance D sig without being able to compute its square root. However, we can compute the square root of S because it is 61 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. diagonal in spherical harmonic space and we can compute N 1/ 2 = A r N ^ from the time ordered data. So the solution is to generate two vectors £ and x of independent Gaussian random variates of zero mean and unit variance and then solve (5.27) for a different left hand side, £ + S 1/ 2 iV-1/2x = ( S ^ N ^ S 1' 2 + 1)S ~ l/2y. (5.29) In practice it is computationally cheaper to solve (5.27) and (5.29) together since the parenthesis on the right hand side need only be computed once. S ll2N ~ lm + £ + S 1/ 2 7V-1/2x = {Sll2N ~ lS ll2 + lJ S " 1/2^ + y). 5.3.6 (5.30) Including E xperim ental B eam s It is quite straightforward to include the experimental beams in the analysis detailed above. We can simply modify the noise matrix for each channel Nt by applying the beam to it, giving [Bj] 2 Nt l B f . We now replace N equations with B ? N ~ 1 B f and with 1 throughout the preceding B? N{ 2 . In the version of the MAGIC algorithm implemented in the next chapter, the experimental beams are taken into account in this way. 5.3.7 Cg Sam pling In order to find the sampling equation for the Ce, we examine eq. (5.20). The density for the Ce factorizes to an inverse Gamma distribution due to the special form of S and sampling from this is very simple. For each £ we compute = J2m=-i ls/ml2 anfl a p-vector of (2£ — 1 ) Gaussian random variates with zero mean and unit variance. Then (5.31) \P i \2 62 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. Given this sample from P(Ce, s|d), the marginalization over s to obtain P(C*|d) is trivially accomplished: just ignore the s in the tuple (Ce, s). Of course this is not to say that the s sample is purely ancillary in the analysis: it is also of interest to explore P (s|d ) which is a full representation of the CMB signal content of the data. 5.4 C o m p u ta tio n a l S calin g The scaling of this Monte Carlo method depends on three factors and can be repre sented Ci wjL + C2Ntod log N tod + C3A U (5.32) where C\, C2 and C3 are variable scaling factors. The C\ term is the computational scaling for performing F F T ’s, the C2 term is the scaling of spherical harmonic trans forms and C 3 is the time taken to read one TOD element from disk if insufficient RAM is available on the machine to hold the full TOD array. Which of these terms dom inates will depend on many factors, including the machine architecture, the amount of available RAM, the efficiency with which the compiler can optimize fast Fourier Transforms (F F T ’s) and spherical harmonic transforms (SHT’s), the efficiency of the specific code implementation, the ease of parallelizing the code and so forth. We perform the sampling in (5.30), by means of solving linear systems of equations using the Conjugate Gradient algorithm described in 5.2.3. The computational cost of this method is therefore highly dependent on the choice of a good preconditioner for this system. This choice can have an impact on the number of iterations which we require for convergence of our solver and hence can reduce (or increase) both C\ and C2 in eq. (5.32). We experimented with several preconditioners and found examples which gave convergence in ~ 20 iterations. The number of samples we need to produce in order to adequately describe the statistics of the power spectrum will depend upon the degree of correlation between 63 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. successive samples. The longer the correlation length, the more samples we require and therefore the more SHT’s and F F T ’s we must perform, once again increasing C i and C '2 • Furthermore, a long burn-in period would also result in an increased number of required samples, although we did not experience extended burn-in times with MAGIC. A detailed discussion of the computational scaling issues and the choice and con struction of preconditioners can be found in Eriksen et al. [2004]. Application of the algorithm described here to the first year WMAP data is described in the next chapter. 64 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. C hapter 6 A p p lication to W M A P d ata 6.1 In tr o d u ctio n In this chapter I show how MAGIC was applied to the 1st year WMAP data in order to estimate the CMB power spectrum. I give a brief overview of the WMAP experiment and a characterization of the data. Further information on the WMAP data can be found in Bennett et al. [2003a] and references therein. I go on to give details of our implementation of the MAGIC algorithm and the computational resources required. Finally I present the power spectrum obtained and compare this with the power spectrum obtained by the WMAP team. I also investigate some of the possible anomalies which have been suggested to exist in the WMAP data, such as a perceived lack of power in the quadrapole moment. 6.2 The W M A P D a ta The WMAP satellite was launched in June 2001 and reached its L2 orbital position in July of that year. Our analysis is based upon the published first-year data. The experiment is of differential design, with two 1.5m Gregorian telescopes sitting backto-back on the satellite. A diagram of the satellite is shown in Figure 6.1. 65 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. upper oinni antenna .4 x 1 6 m primary reflectors dual b ad i-to b aek Gregorian optics “ secondary mitectora Focal Plane Array (FPA) box m ■■■■ < passive thermal radiatois . Seed horns truss structure with microwave diffraction shielding • « ^ T N. top deck — ‘ thermally isolating instrument cylinder Receiver Box (RXB) inside -tv - star trsckers (2) thrusters (3) Z reaction w heels S3) A +x ' deployed sdai array wf web shielding warm S/C and ' Instrument electronics -'*w+Y F ig u re 6.1: The WMAP Satellite. Image taken from WMAP Explanatory Supple ment T able 6.1: Characteristics of the WMAP frequency bands. K Ka Q V W 20-25 28-36 35-46 53-69 82-106 AVeff 4 5 8 13 19 Number of Radiometers 2 2 4 4 8 10.1-7.7 8.9-6.7 7.8-6.0 8.8-7.4 9.7-8.3 0.88 0.66 0.51 0.35 0.22 Band Frequency Range (GHz) ©fwhm (deg) Beam Size (deg) The data from the WMAP experiment consists of observations in 5 frequency bands from 23-100 GHz. Each focal plane contains 10 radiometers, 4 in W-band (94 GHz), 2 in V-Band (61 GHz), 2 in Q-band (41 GHz), 1 in Ka-band (33 GHz) and 1 in K-band (23 GHz). Most of the cosmologically interesting information is contained in the Q, V and W-bands. W-band also has the best resolution with a beam size of 0.22 degrees. The characteristics of the radiometers are shown in Table 6.1 The characteristics of the beams for each radiometer are established using obser- 66 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. .-> (</< a r<) r a ! f-'lu rn M osdu S ii.il f i f-’o n i l t 4 £ (1 J -4 *4 V ..™ | it< q i i i _• ] - . . 4 A „,m Figure 6.2: The WMAP Beam Characteristics. Courtesy of the WMAP Science Team. vations of Jupiter and the beam profiles are shown in Figure 6.2. It is important to account for these differing beam profiles during the power spec trum estimation process. These beam profiles, the sky masks and various other useful tools and data were downloaded from the WMAP legacy archive h 6.3 A p p ly in g M A G IC to 6.3.1 Im plem entation W M A P We used two independently developed implementations of the algorithm to produce our results: MAGIC [Wandelt, 2004] and Commander [Jewell et al., 2004]. These are both parallel implementations of the sampling algorithm. Having results from both MAGIC and Commander allowed us to cross-check results and provided valuable redundancy. Our tests of the codes are detailed in Eriksen et al. [2004] and these showed th at both codes were producing results consistent with each other. In both codes our noise model for each channel consisted of combining the published number of observations per pixel with the noise per sample for each channel and assuming 1h ttp :/ / lam bda.gsfc.n a sa.g o v /p ro d u c t / m ap 67 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. Figure 6.3: The WMAP KpO Mask. Blue regions are typically excised from most current power spectrum estimation techniques. We assign artificially high noise to these pixels so that MAGIC considers these pixels to be uninformative. The small regions which are masked outside the Galactic plane are point sources. the noise to be uncorrelated. It was also shown in Eriksen et al. [2004] that residual correlated noise in the WMAP maps has negligible effect on the results. The data input to these codes were the first year WMAP maps for all 8 of the cosmologically interesting frequency bands (2 at Q-, 2 at V- and 4 at W-band). For all channels the foreground template corrected maps following Bennett et al. [2003a] were used. Two analyses were performed with these data. First the data in the V and W channels was analyzed separately by band, using unweighted averages of the channel maps and computing an effective beam window function. For these runs the conservative KpO mask was used and single, long chains, containing 1000 samples for each band were run. The KpO mask is shown in 6.3. Q-band was analyzed more aggressively, combining the channels at the likelihood level [Eriksen et al., 2004], and using the smaller WMAP Kp2 mask to assess potential foreground contamination. These analyses will be labeled Q, V and W in the following. The second analysis uses the optimal combination of all 8 channels in the likeli hood, using their respective noise specifications and beam window functions. This 68 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. joint analysis will be referred to as QVW in the following. For the QVW analysis the more aggressive Kp2 mask was used. Both the Q and QVW analyses were done in parallel, short chains (~10 chains, ~100 samples each) initialized with plausible start ing spectra inspired by the WMAP analysis [Hinshaw et al, 2003]. The Q and QVW analysis shown here were performed at JPL using the COMMANDER algorithm, although similar analyses have also been performed using MAGIC. 6.3.2 Foreground R em oval The WMAP cleaned maps are used for the MAGIC analysis, which have had fore ground removal algorithms applied to them. However, we apply a second set of foreground removal tools to eradicate any residual foreground contamination. We include a stochastic model of the monopole and dipole contribution to the WMAP data. This is encoded as a uniform improper prior which expresses our ignorance of the residual level of monopole and dipole in the data. In practice, this means adding a sampling step to the MAGIC algorithm which searches for components in the map with unknown amplitude, but which only have contributions in spherical harmonic space in the (£, m) component where £ = 0 or 1 and m runs from —£ to +£. In the limit of infinite variance in the monopole/dipole component, the sampling equation for the residual monopole/dipole map, tumd, simplifies to m MD = m R + N^x, (6-1) where m,R is the residual map constructed by taking the original WMAP cleaned map and subtracting the latest signal sample and the latest samples of all other components (e.g. the infinite variance foreground described below), x is a vector of N p iX random variables with zero mean and unit variance and N is the noise matrix for the WMAP channel under consideration. An example residual monopole/dipole map is shown in fig 6.4. 69 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. Figure 6.4: A sampled WMAP residual monopole/dipole map. The units are Kelvin. The second foreground component that is included is a stochastic model of the residual Galactic foreground contribution to the data. The WMAP kpO mask, as shown in fig.6.3, is taken and the variance of the noise in the pixels contained inside the mask is artificially increased. In theory this variance should be infinite, but we are limited by machine precision to just using the largest values for the variance that are allowed on a 32 bit processor. We refer to this foreground as the infinite variance foreground (IVF). In the limit of this infinite variance, the sampling equation for the IVF map, m jv F , is essentially the same as eq. (6.1), uiivf = mR + N^X- A example IVF map is shown in fig. 6.5. The result of the IVF approach is that the MAGIC algorithm considers the pixels inside the kpO mask to be uninformative for the purposes of sampling the CMB signal. However, the algorithm will attem pt to reconstruct possible realizations of the CMB signal within the masked region based on the data outside the mask. This can be seen in fig.6.6. Unfortunately, the IVF is not a panacea for all the ills of foreground contamination. The disadvantage of using the IVF approach is that the algorithm has a longer “memory” because the IVF component carries information from sample to sample which would otherwise have been ignored by the algorithm. 70 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. -0.0010 0.0010 Figure 6.5: A sampled WMAP infinite variance foreground map. The units are Kelvin. The WMAP KpO Mask is used to assign artificially high noise to the pixels within the mask. The minimum and maximum values of the map have been limited to better illustrate the nature of the IVF. This manifests itself as a higher correlation length between samples. This results in less independent samples for a given number of Gibbs samples. The Commander algorithm did not include an IVF component and the correlation length of the samples there was estimated to be ~15, while the MAGIC algorithm has a correlation length closer to ~ 50. On a positive note, the IVF results in faster convergence of the conjugate gradient solver, thus producing samples slightly faster than the Commander approach. 6.4 C o m p u ta tio n a l C o n sid era tio n s While the MAGIC algorithm represents a viable method for exact power spectrum estimation, it still requires significant computational resources, and efficient parallelization is essential. In real world data analysis situations it is unlikely that a single set of samples of the power spectrum will be sufficient. Even after a period of testing which typically involves short runs of just a few samples, there may still be developments and improvements to the code after which it is desirable to generate 71 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. another set of samples. In addition to this, future CMB experiments will include several enhancements which will increase the computational expense. Polarization information will create three times as many maps as are currently being analyzed for temperature alone. Experiments such as Planck will be at higher resolution than the WMAP data requiring a HEALPix n side parameter of 1024 or perhaps 2048, (giving maps of 12 x 106 or ~ 50 x 106 pixels respectively). In addition, analysis will be required up to a higher maximum multipole moment lmax of 2000 or more. Recall that the computational scaling of the spherical harmonic transform is imaxN ringS and Nrings = 4ATsj(ie —1. It can be seen that compared with the WMAP analysis, a Planck analysis might easily require a factor of 256 times the computing power to produce the same sample rate. This needs to be kept in mind when considering the adequacy of the current sampling rate and attem pts that have been made to improve this. In order to be a useful cosmological data analysis tool, the algorithm needs to produce ~ 500 independent samples in a period of approximately one day. Our first incarnation of the algorithm for the WMAP data produced only a few samples a day. 6.4.1 P arallelization The most simple available parallelization of the algorithm is to run several Gibbs Sampling chains in parallel on N nodes different machines (or different nodes of the same machine) to produce more samples. As long as each chain is initialized with a different random number seed, one can obtain a factor of Nnodes more samples per day this way. Despite implementing this simple step, the sample rate per day was still too low for the MAGIC algorithm, numbering in the few tens of samples per day. Since a large fraction of the computing time for MAGIC is spent in calculating spherical harmonic transforms (SHT), this was a natural target for parallelization. Fully parallel SHT routines, based on code originally provided by Anthony Lewis, were implemented th at operate by decomposing the sphere into individual HEALPix 72 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. rings, performing the analysis on these rings in parallel and then reconstructing the data after the analysis. In addition to having different chains running simultaneously, it is now possible to have each chain generate many more samples per day, on the order of a few hundred. Although each of these samples is not independent (the correlation length for MAGIC samples is ~50), this provides a sufficient number of uncorrelated samples in a reasonable amount of time. This sample rate was achieved even for an analysis containing all 8 cosmologically interesting channels of WMAP. 6.4.2 Facilities The codes were run both at the facilities of the National Energy Research Council and at the NCSA facilities at the University of Illinois. The seaborg machine at NERSC is a ~ 6000 processor machine running previous generation Power 3 processors each of which is capable of approximately 1.3 GFlops peak performance, tungsten and cop per, the two NCSA machines are more modern and use Intel Xeon processors capable of 6.4Gflops per processor (in double precision). The NCSA machines are clearly considerably faster and we found our parallel SHT routines were able to obtain bet ter compiler optimization benefits and platform specific optimizations on tungsten, seaborg has a higher number of total processors and is more suited to some truly mas sively parallel tasks. Our initial analysis and runs were performed on seaborg using ~ 256 processors. However, after implementing the parallel SHT routines and discov ering the increased compiler and platform optimization performance enhancements available on tungsten, most of the more recent analysis has been performed there. On tungsten, blocks of around 64 or 128 processors were used for the MAGIC runs. This leads to the few hundred samples per day rate quoted in the previous section. 73 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. Figure 6.6: The posterior mean CMB signal for WMAP. That is, (s)p(s|d) of the CMB signal s given the WMAP first year data, including all 8 channels in the Q, V and W bands. This map represents the CMB signal content of the WMAP data. It contains detail where the data warrants it and smoothes where the data is poor. For example, the algorithm is able to reconstruct the largest scale modes in the Galactic plane based on the signal phases and correlations it discovers in regions of the map outside the Galactic plane. The smallest scale which can be reproduced in the Galactic plane is set by the mask size and is a natural consequence of Wiener filtering. 6.5 M ean S ign al M ap Figure 6.6 shows the Bayes estimate (posterior mean) for the signal QVW analysis. This map represents the CMB signal content in the Q, V, and W bands of the WMAP data. This map shows detail where the data warrants it and smoothes where the data is poor. The CMB signal is clearly visible outside the Galaxy and the algorithm has reconstructed the signal in the Galactic plane for the largest scale modes, consis tent with the signal phases and correlations it has discovered on the high latitude sky. There is insufficient information in the map for the algorithm to reconstruct the smaller scale modes in the Galactic plane. This result is a non-linear generaliza tion of the Wiener filter which does not assume prior information about the signal correlations but discovers them from the data. 74 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 6.6 P ow er S p ectru m Figure 6.7 shows the power spectra we obtain for Q-, V- and W-band and the com bined QVW analysis. Large scale features in the power spectra are reproduced across all the channels, although there are some differences in the details. This is to be ex pected, particularly given the simplistic nature of the foreground removal algorithm used. In Figure 6.8, we show the probability distribution of the Cg samples for some I of interest. Since the original WMAP analysis was released there has been some controversy regarding a lack of power at the largest angular scales (lowest Ps) e.g. [Efstathiou, 2004; Slosar, Seljak, & Makarov, 2004]. It can be seen from the plots that our estimates of these C /s are consistent with the original determinations made by the WMAP team, but that there are significant, non-Gaussian uncertainties on our estimates which indicate that low values of Ce for the largest angular scales are not so unexpected. We also over-plot the results using the WMAP internal linear combination map with the Kp2 mask from Efstathiou [2004] in our figure. It can be seen that there is good agreement between the values obtained in our analysis and that of Efstathiou. For easy reference we show the cumulative probability distributions for the lowest I in Figure 6.9. One can read off the probability that the actual theory Ce_ are lower than the prediction from any particular model. Probabilities close to 0 or 1 indicate a poor fit, or out-lier. A search over all strongly signal dominated I from 2 to 350 using our V and W band analyses resulted in 10 outliers (within 0.01 of 0 or 1) for V band (I =54, 73, 114, 117, 121, 179, 181, 209, 300, 322) and 9 outliers for W band (£=73, 82, 117, 121, 181, 261, 334, 341, 344), with four of these outliers (£ = 73, 117, 121, 181) common between V and W bands. For 349 tests this corresponds to only a slightly higher number of outliers than expected based on counting statistics, and is entirely unremarkable if we only count those outliers that appear in both bands. 75 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. Q-band 1oooo c 8000 6000 4000 2000 0 100 200 300 Multipole Moment t 400 500 400 500 400 500 400 500 V —b a n d 10000 L ............................................ ................ ........................ 8000 6000 4000 2000 0 100 200 300 Multipole Moment I W -band 10000 6000 o. 4000 2000 0 100 200 300 Multipole Moment I 10000 r Q +V +W -band Com bined ---- —......................... —...................... 8000 6000 o 4000 2000 0 100 200 300 Multipole Moment I F ig u re 6.7: The WMAP Power Spectra. Four power spectra obtained from the Gibbs sampling algorithm. The uppermost panel shows the Q-band power spectrum, then V-band, W-band and finally the combined analysis from all eight cosmologically interesting WMAP channels. The red line is the modal Ce value for each I. The V and W results were obtained using the conservative KpO mask while the Q-band result uses the Kp2 mask. This was done to assess potential foreground contamination, for which no evidence is found. 76 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 0 0 1100 t(<+1)<V2n 0 * * ) 2200 3250 « l + 1 )0 ^2 * 6500 0*K>) 0 0 3250 t{t* 1)CV2f! OxK*) 6500 3250 W +1>V2fT 6500 0*K*) 0 0 3250 t(<+ 1}C(/ rT (MKr) 6500 3250 t( t+ 1>cy2* IjiK’) 6500 2 0 3250 6500 0 !(»+1)Cy2n (,**) 0 3250 ti.t* l)<V 2n 6500 (#*K') 0 3250 1(1+1JC /2 * (>*•) 6500 3250 t(i+ 1 )cy 2 n (^K1) 6500 F ig u re 6.8: The probability distributions of the Ce samples for a selected set of I. This gives the probability of obtaining a value of Ce within a given histogram bin. We plot results for the significantly non-Gaussian low-1 multipoles and for selected higher values based on their deviation from the best-fit A-CDM model. There are 50 bins in each histogram. Red, green and blue histograms are Q, V and W-band respectively. Black is the combined QVW analysis. The dotted vertical line is the WMAP best estimate of the Ce value. The solid line is the WMAP best fit theory Ce s and the dash-dot-dot line is the average of the samples from our algorithms. For I < 7 the dashed lines show the WMAP-ILC values from Efstathiou [2004] which compare favorably with our results. Based on this £-by-£ analysis the “bite” in the spectrum does not seem particularly extraordinary, with only one outlier in the range of 200-220. However, all Ce at 205 < i < 210 are quite far below the best fit and will most likely be noted as highly significant outliers in a full multivariate analysis of the joint posterior density. 6 .7 D iscu ssio n s and C on clu sion s The Gibbs sampling technique introduced by Jewell et al. [2004] and Wandelt et al. [2004] has been implemented and applied to the WMAP data. The presentation of results has focused on the inferred marginalized likelihoods for each i. I find that these are broadly consistent with previous determinations of the power spectrum, in particular the original analysis by the WMAP team. However, by virtue of using a Bayesian analysis it is possible to present a more complete picture of the uncertainties in the estimates across all relevant scales than has previously been possible. Previous likelihood-based work to generate a full probabilistic description of individual Ce 77 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 1=5 JT 0.6 _Q O O k_ XI CL 0.4 0.2 0.0 0 1000 2000 3000 e(e+ i)c(/2 7 i O k2) 4000 F ig u re 6.9: Graph for evaluating theoretical power spectrum models at the low £. We plot the probability P (C jheory < Ct) against £{£ + l)Ct/2ir for the largest angular scales probed by WMAP. Theorists can evaluate the goodness of fit of their model Ct with the WMAP data by reading off the probability from this graph. If P is within 0.025 of 1 or 0, the model would be ruled out at the 5% level based on that £ alone. The vertical dotted line is the WMAP best fit theory assuming a constant scalar spectral index n s. The probability is ~90% that the the actual theory C 2 is smaller. (e.g. Slosar, Seljak, & Makarov [2004]) were limited to looking at only the lowest £ due to the computational difficulty involved in the calculations for £ greater than ~ 10. The analysis presented here covers the region from £ = 2 to £ = 500. A multivariate treatment like MAGIC allows a rigorous assessment of the significance of anomalies in the power spectrum, some of which have caused much debate within the cosmology community. The analysis presented here shows that the perceived lack of power on large angular scales is, in fact, not an issue when the full likelihoods are examined. The data support even lower values of the lowest Ct . The “bite” in the power spectrum near the peak around a Ct of ~209 is also evident in the MAGIC analysis, though again, the likelihood distributions are marginally consistent with the best-fit theory values. In addition, MAGIC allows the use of the full multivariate posterior 78 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. density as the input to parameter estimation. Work in this direction is already under way, for example Chu et al. [2004]. This is particularly interesting since the Bayesian approach yields a converging analytic approximation to the likelihood of Ct given the data, it is not necessary to construct parametric approximations to the likelihood such as the Gaussian or shifted log-normal approximations. In the next chapter I discuss the inclusion of a more thorough treatm ent of foreground contamination in the MAGIC algorithm. 79 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. C hapter 7 D eta iled Foreground M od els 7.1 In tr o d u ctio n An appealing feature of the Gibbs sampling approach described so far is its ability to incorporate virtually any real-world complications, as discussed by Wandelt [2004], Jewell et al. [2004] and Wandelt et al. [2004]. A few examples of this flexibility are applications to 1 / / noise, asymmetric beams or arbitrary sky coverage. Foreground estimation and removal is also of great interest and can be included in a consistent way within the Gibbs framework. For the analysis presented so far we have limited ourselves to including two simple foreground components: (1) a stochastic model of the monopole and dipole in the maps, and (2) a simple stochastic model of the Galac tic foreground component. These models are encoded in terms of uniform, improper priors that express our ignorance of the monopole and dipole in the maps as well as the foreground contributions in the regions that are flagged in the foreground masks supplied by the WMAP team. Traditionally this masked region is excluded from the analysis altogether. In the Bayesian approach it is modeled as a region where the fore grounds are completely unknown. Sampling from the posterior density reconstructs the signal in the masked region, based on the correlation structure discovered on the unmasked portion of the sky. The details of these foreground models were discussed 80 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. in Chapter 6 and are outlined in the above papers as well as in Eriksen et al. [2004]. While this simple approach to foreground estimation and removal was adequate for the purposes of demonstrating the feasibility of the Gibbs sampling approach, a more thorough treatment is required. If we are to detect very small signals, such as polarization in experimental data we need to remove foreground contamination as completely and accurately as possible. The basic Bayesian formalism for treating foregrounds has been developed by Jewell et al. [1999] and Wandelt et al. [2004], but here I will present a specific implementation within the context of Gibbs sampling and suggest possible choices of foreground models. As the quality of CMB data continues to improve and polarization data become available, the correct treatm ent and removal of foregrounds from the data will become increasingly important. Our interest in foreground emission is twofold; We wish to remove the foreground components from our CMB data, but there is also interest in establishing the foregrounds themselves, as they affect many observational astronomy programs and are intrinsically interesting to many astronomers. If full likelihood distributions for given foregrounds can be created, it will finally be possible to confront foreground models with CMB foreground data which includes an estimate of the uncertainties. Although the WMAP team included a treatment of foregrounds in their analysis of the first year data [Bennett et al., 2003b], no uncertainties in the foregrounds were estimated, resulting in foreground maps of somewhat limited utility. Here I present a quite general framework for including foreground components within the Gibbs sampling algorithm. Foregrounds can be included in real space, spherical harmonic space or any other basis which presents itself as suitable for a particular foreground. The method can also be made as complex or simple as required. The simplest case is somewhat equivalent to template removal, with the addition of fully describing the uncertainties in the template fit. The other extreme is equivalent to estimating and removing foregrounds in a complete basis (for example, pixel by 81 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. pixel). 7.2 T h e F oregrounds Foregrounds and their effects on observations of the CMB are the subject of several good papers and reviews, and here I list a selection. A good source of background material on microwave foregrounds is the 1999 review edited by Tegmark and OliveriaCosta1. Bouchet & Gispert [1999] derive some semi-analytic constraints on foreground contamination, and Bennett et al. [2003b] look at foregrounds in the context of the WMAP experiment. Tegmark et al. [2003] describe a second analysis of the fore grounds in the WMAP data and produce new cleaned maps. Tegmark & Efstathiou [1996] discuss methods for foreground removal, and, as previously mentioned, Jewell et al. [1999] look at foreground removal in a Bayesian context. Papers relating to specific foreground components are listed in the sections below which describe each foreground. I point the reader to these papers for detailed descriptions and provide only a high level overview here. There are three main sources of microwave radiation from our Galaxy which are of most interest in CMB data analysis. Synchrotron emission, free-free emission and emission from dust all need to be estimated and removed from the data. Additionally, microwave emission from Galactic and extra-Galactic point sources also needs to be considered. I now examine each of these foreground sources in turn. 7.2.1 Synchrotron Em ission For a review of synchrotron emission in the context of CMB observations see Smoot [1999]. Synchrotron emission is caused by the acceleration of relativistic electrons in a magnetic field. In terms of the Galaxy, these are cosmic ray electrons, from 11999, A SP Conf. Ser. 181: M icrowave Foregrounds 82 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. supernovae and other high energy astrophysical phenomena, which are accelerated in interstellar magnetic fields. Since interstellar magnetic fields are typically weak, very high energy electrons are required to produce the observed synchrotron emission. Our understanding of the mechanism of emission is on firm theoretical ground but the Galactic synchrotron emission is extremely complex, with synchrotron self-absorption, foreground thermal absorption and a variety of other effects complicating the picture. The predicted spatial distribution is also poorly constrained. Given our incomplete knowledge of the sources of both cosmic ray electrons and interstellar magnetic fields, this is perhaps not surprising. The energy spectrum of cosmic ray electrons is expressed as a relativistic electron number density distribution with an empirical power law N (E ) ~ E " 1. The flux density spectral index a is related to 7 by a = —( 7 — l ) / 2 for frequencies above a few GHz where synchrotron self-absorption becomes unimportant. Since the electron density, N(E), and the Galactic magnetic field B show significant variation across the sky, the synchrotron emission exhibits a significant variation with frequency and position. The diffuse component of synchrotron emission in the Galaxy, which is thought to account for ~ 90% of emission shows an empirical variation in a over the range —0.5 < a < —1.1 [Lawson et al., 1987). Currently, the main source of observational information about the spatial distri bution of synchrotron emission in the Galaxy comes from the 408MHz maps produced by Haslam et al. [1981]. We see a ~ 5° band of strong synchrotron emission around the Galaxy together with fainter emission extending a significant height out of the Galactic plane. 7.2.2 Free-Free Em ission Galactic free-free emission, or Bremsstrahlung, is primarily due to electron-ion scat tering. The frequency dependence of this foreground is the best established (y2-15±a02), 83 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. but the amplitude and spatial distribution are not. Unfortunately, dust dominates at higher frequencies and synchrotron emission at lower frequencies, with the CMB dominating at intermediate frequencies, so obtaining a clear picture of free-free emis sion is very difficult. Ho; maps can provide a template for the free-free emission, except in regions where there is a high dust optical depth. For regions of lower dust optical depth, corrections can be made to the H a maps to provide a good guide to the distribution of free-free emission on the sky. For WMAP the Finkbeiner [2003] all sky Ha map is used to estimate the free-free emission in regions of low dust optical depth and this remains a reasonable approach. It should be noted that the use of Ha maps as tracers of the free-free emission is only an approximation, with several possible sources of error. 7.2.3 D u st The morphology and properties of dust grains is a complex and interesting topic which is far beyond the scope of this thesis, and I provide only a brief overview here. For a review of the topic and its implications for cosmological measurements see e.g. Finkbeiner & Schlegel [1999] and Draine & Lazarian [1999a] and references therein. Dust emission at microwave frequencies has three possible sources. In the infrared at ~ 100/irn the emission is due mainly to vibrational electric dipole emission, that is, emission due to thermal fluctuations in the charge distribution in the grain. The tail of this emission can extend down to lower frequencies (e.g. 10-60 GHz). The other thermal mechanism by which dust grains may radiate at microwave frequencies is via magnetic dipole emission (i.e. through thermal fluctuations in the magnetization of the grain). This mechanism might make a significant contribution to dust emission at higher frequencies, particularly if dust grains contain magnetic materials, although current observations show no evidence for this. Observations at higher frequencies have shown an excess emission from dust, or a source tightly correlated with the dust, 84 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. above th at expected from the above thermal mechanisms. Draine & Lazarian [1999a] speculate th at if there were a large population of small spinning dust grains, for which there is strong independent evidence, they could radiate by rotational electric dipole emission, that is, emission due to the rotating electric dipole moment of the rapidly spinning grains. Recent work [Finkbeiner et al., 2004] seems to lend support to the presence of a spinning dust emission component. Current efforts to model the dust seem to be converging on a two component model. At higher frequencies (v > 500GHz) a v'l& emissivity law is observed, which is attributed to the dominance of carbon dust species at higher frequencies and more recently to the spinning dust mentioned above. At lower frequencies, compounds such as olivine ([Mg, Fe ^S iO t) and orthopyroxene ([Mg, Fe] Si0 3) dominate the emission and the emissivity profile is flattened to z/1'5. Diffuse thermal dust emission has been extensively examined at infrared wave lengths (e.g. by FIRAS on COBE and IRAS) and its spatial distribution at these wavelengths is fairly well established, even if the exact mechanisms for the emission are not certain. In order to use these observations in the context of the CMB fore ground removal, the typical approach is to extrapolate from the infrared maps (e.g. Schlegel et al. [1998]) to produce microwave dust maps [Finkbeiner et al., 1999]. 7.2.4 P oint Sources There are many point sources of microwave radiation and these need to be removed from the data. Point sources differ from other foregrounds in th at they are typically identified before the analysis is performed, and the pixels which contain the point source are simply excised from the data. Catalogs of such point sources are avail able. Conversely, it may be desirable to search through our data in order to identify previously unknown point sources, particularly if a new experiment has significantly higher sensitivity than previous efforts. It would be possible to perform an approxi85 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. mate analysis using Gaussian priors for the point sources as in Hobson & McLachlan [2003]. A full treatment and search algorithm within the Gibbs Sampling framework would likely require non-Gaussian priors, and this is beyond the scope of the current discussion. In this work, the WMAP point source catalog, which contains 208 point sources, is used. Pixels where a point source is present are assigned artificially high noise so th at the algorithm considers these pixels to be uninformative. 7.3 M o d el E q u ation s We will use the same model equation as (5.13) d - A(s + / ) + n, where again d is the experimental data, A is the pointing matrix that maps the sky into the time ordered domain, s is the CMB signal and n is the experimental noise in the time ordered data. / represents the foreground contributions we wish to include, and where we discarded this term in the previous analysis, we will now keep it. As before, the objective of our analysis is to estimate the joint posterior P(s, Ct, f , d). We can rewrite this equation in a number of useful ways by applying Bayes’ Theorem: P ( s , C t , f , d ) = P( s,C t,f\d)P(d) = P(d\s,f,Ct)P(s,f,Ct) = P(d\s, /, Ct)P(s\Ct)P(f)P(Ce), (7.1) where foreground and signal have been assumed to be independent. The right hand side of these equations then tells us how to go about sampling the left hand side. We may wish to include other components in the joint density, such as cosmological parameters, which we can also estimate, but again we will content ourselves with the above description. 86 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. For multiple experiments or multiple frequency channels within one experiment the model equation becomes di — Ai(s + Fi) + rij, (7.2) where the index i represents each of the various frequency channels. We also want to consider the case of multiple foregrounds and we shall represent them with the index j. So we have (7.3) 7.4 F oregrou nd E q u ation s We now consider how we will treat the foregrounds in order that we can develop a way to sample them. I will not discuss specific choices of model here, but leave everything in general terms and defer a detailed discussion of possible model choices to 7.6.1. We will divide the sphere up into a suitable basis for our analysis. Note that the basis does not have to be spatial and could, for example, be the spherical harmonic basis if a particular foreground is most easily represented in this basis. The elements of the basis are denoted with the index k. For each foreground under consideration (denoted by the index j ) we will assume some prior knowledge of the distribution of this foreground in whichever basis that we choose to describe it. We denote this foreground as / . We then say that the distribution of our foreground template in the given basis is correct, but that we do not know the amplitude of the foreground, x, that is supported by our data. It is these x parameters which we will be sampling over and which will characterize our knowledge of th e foregrounds. O u r ta s k th e n , is to find th e likelihoods of o u r x parameters. We can write the j th foreground in channel i as (7.4) k 87 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. The total foreground contribution to channel i, would be P'i — ^ 'J'Eijkfijkjk Mathematically, the fak for /c = 1, 2 , . . . define a basis that spans the space of possible foregrounds. Each fak is the A;th basis vector of the j th foreground in the ith channel. The i index allows for variations in the spatial structure of a given foreground between different frequency bands. Because of the generality of this representation this method allows for a great deal of flexibility. 7.5 C o m p letin g th e B ayesian D e sc r ip tio n We now turn to the task of constructing the various components on the right hand side of equation (7.1), in order that we may sample from them and construct the posterior on the left hand side. We will treat the Ce prior P(Cg) and the evidence P(d) as in Chapter 5 and ignore them in what follows. We now derive the equations for each channel of data, i, that is to be included in the analysis and combine these in the end. In order to construct the likelihood P(d\s, F ), we note that P(d\s,F) = l [ P ( d i\s,Fi), i (7.6) and so we construct a Gaussian: pal,. p \~ 1 /7 7 n where N is the noise covariance in the time-ordered data domain. We can substitute for the Fi using (7.5) giving P (di\si) F y ~ ____ C ~^ xiikfiik)FN , x ijkfijk)} ( 7 .8 ) where the f tjk are known, from the specification of the foreground prior as described in §7.6.1. R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. Next we have the probability of the signal given the CV, which is independent of the foreground and so does not change from Chapter 5. (7.9) P{s\Ct) where S is the signal covariance. Since it is independent of the foreground, it will not appear in the sampling equations for the foregrounds after we differentiate and sampling from it will be identical to eq.(5.12). Previously we represented the foreground with the term P (f ), but from the above discussion of our approach to the foregrounds, it should be clear th at what we really want to investigate is the distribution of the values of the Xijk parameters. We can then use these to construct the foreground maps as required. Hence we take P(x) oc P( f ) - Assuming our parameters to be drawn from a multivariate Gaussian distri bution, we have 2 1 2 i j k [x i j k x i j k \ r C i3] oil j l k A x i ' j ' k l x i'i'k‘ (7.10) where Cijk%>j'k' is the covariance of the x^k components and x^k is the mean. We may include information about the frequency or spatial interdependence between the x ’s by specifying C and x. For example if we expect the foreground contribution to be correlated between channels, possibly channels at different frequencies, we can include this in the covariance matrix C. A simple example of this would be examining the W-band of the W M A P data, which contains four channels. Since all of these channels are at the same frequency, we might expect the foreground contributions of each of the foregrounds to be almost perfectly correlated between these four channels, and we can introduce this through the covariance C. Specifically, this would be in the ii' component of the C matrix. One could also imagine situations in which spatial correlations are present, so we could couple through the kk' component of the C matrix to include this and similarly in the j j ' component for correlations between the 89 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. various foregrounds. The detailed choices which might be made for the C matrix are discussed in §6. Finally, we are now in a position to construct the posterior: P(s, Ce, x, d) = P(d\s,f)P(s\Ce)P(x)P(Ce) 1 "I V |27to| 7.5.1 e~^ ^~‘lm 1^1 (7.11) o ~ k ' P , d di - A i ( s + T , j k x i j k f i i k ) ] T N i 1 [ d < - ^ i( s + E ^ f c X i j k f i j k ) ] "1 —r— v |27rC| x const (7 12) D erivin g th e Sam pling Equations The procedure for deriving the sampling equations from this density is identical to that in Chapter 5. We differentiate with respect to the parameter of interest, in this case the x^k while holding all other parameters constant. Following similar arguments to the derivation of the signal sampling equations, we find th at the log posterior is: ln(P(s, Ce, x, d)) oc ~ i ~2 - Ai(s + ^ xijkf ijk)]T N r ^ d i - Ai(s + ^ X i j k f i j k ) ] jk jk ~~ ^i3^r ^'ijki>i>k>\x i'3'k' ~ Xi'j'k'}- mQ (7-13) ijk Since the term for P(s|C*) does not depend on x, this will disappear in the differ entiation and so we can ignore it. For the conditional mean we want Qx® (ln(P(s, Ce, x, d)) = 0, so d 1 (—2 'y ' 1^* _ Ai(s + ^ ^ %ijkfijk)\ Ni [di —Ai(s + ^ ^x ijkfijk)]) dximn 2 i. jk jk Xijk] ^ijki'j'k'\Xi'i'k' Xi'j'k’]) ~ 0- 2 ijk We note that g ^ ( J 2 j k x ijkfijk) = Sufimn90 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. (7-14) Evaluating (7.14) gives - 'y]([fiuAifimn}T N r l [dj - Ai{s + x ijkfijk)]) + Clmnijk\Xijk i jk x ijk] — 0. (7.15) We now rearrange in terms of x l]k and as before make substitutions for Nmap — A t N - \ A and (ATN ^ A ) m = A r Nto\d l. So we have (^ 1 ij k & ilflm n N m ap.fi jk + C lm n ijk^X ijk = f lm n ^m a p i(m i ~ S) T ^ l m n i j k X ijkj (?-16) ijk which gives us an expression for the mean x^kWe now have to find the second differential to obtain the conditional covariance D of the x parameters, that is —Aj(s + i jk Xjjkfijk)] Ntod. [di — Ai(s + y ^x ijkfjjk)}) jk j[x ijk ~ Xijk] Cijki'j'k'[Xi'j'k' ~ x i'j'k']) = JT O X °P (y « ]([fiilAiflm n\ [di — Ai(s + i Xjjkfijk)]) ~ C\ m n i j k X i j k + C lm n ijk X ijk) jk (7.17) - D - 1 - - Y l d W i m n f N r ^ i o A i U , ) - C ^ nopq, i (7.18) and again we replace AT N~o\ A with N ~l , giving D-' = £ ( [& W L A i , / . * , ) + C,i!„v r P-:19) i The final step in creating the foreground sampling equation is to generate a fluctuation term with the covariance given by eq. (7.19), in an analogous way to that for the signal sampling in eq. (5.29). The required fluctuation term yijk is ( J 2 &ilfLnNmaPifijk + C ^ nijk)yijk = C - ^ + ^(tf«./y*W ~^/zm n) ijk i (7-20) where the ( and x are vectors of Gaussian random variates with a length of the number of individual x^k components for all channels i, foregrounds j and basis vectors k. 91 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. As in the signal sampling, we can combine eq. (7.16) and eq. (7.20) and solve for (x + y)ijk- We now have the sampling equations for the foregrounds and our MAGIC algo rithm includes an extra sampling step: xm ^ P(x\s, d), si+1 7.5.2 (7.21) P(s\Ci, d, x i+1 (7.22) P(C*|si+1). (7.23) Including Instrum ent C haracteristics For real-world experiments it is useful to be able to include the characteristics of the instrument in the data analysis. For example, we can take into account different beam shapes in different channels (either at the same frequency or at different frequencies) by simply including the beam in the f,jk so th at fijk = Btfffk and f ? k = where f ^ sk is the unsmoothed foreground component. We can also include the frequency response of the detector in each channel in our analysis by exploiting the presence of the i index. Each of the foreground basis vectors fijk can be constructed by convolving the instrument frequency response, bfu) in that channel with the foreground model: fijk = 7.6 J d v fjk(y) bi{y) (7.24) Im p lem en ta tio n s In order to implement the ideas outlined in the previous section we need to make several choices. First, we have to select the basis in which we will represent the various foregrounds. One idea would be to choose physical sky maps as the basis. For example to define a specific instance of a foreground model we could simply supply 92 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. the algorithm with maps which are zero everywhere, except in the region which is to be covered by the kth basis vector. The size of our spatial regions is completely flexible. Secondly, we need to populate the covariance matrix C and we examine possible ways of achieving this in the following section. While a spatial basis may be the easiest choice, some foregrounds might be better represented in a spherical harmonic or wavelet basis and the model outlined here allows complete flexibility with regard to these choices. 7.6.1 P ossib le C hoices of B asis V ectors In order to elucidate the ideas presented so far, I consider two specific foreground models and how these might be implemented within the Gibbs framework. Consider a CMB experiment with three channels, e.g. a low, intermediate and high frequency band, which are labeled f= l, 2 and 3 and consider the three main foregrounds, free-free emission (j = 1), dust (j — 2) and synchrotron emission (j = 3). For simplicity in the first model, we assume all three channels to be uncorrelated and further th at there are no correlations between any of the foregrounds. Clearly this is a significant oversimplification, but it will provide a point of departure from which to form more complex models. For our foregrounds we take actual experimental maps of each component (extrap olated where necessary from lower or higher frequency observations) and use them to construct 100 different foreground maps at each frequency i. Different maps could be produced with plausible spatial variations in the spectral index used to extrapolate to microwave frequencies (e.g. for synchrotron) or in the model parameters (e.g. for dust). By reducing the number of basis vectors we limit the algorithm to exploring the dominant modes of uncertainty in the foreground model. Each of the k maps will have an associated Xij factor and our Gibbs algorithm will determine these x^k conditioned on the data, in order to select the optimal combination of the 100 maps 93 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. to form the actual foreground map. To summarize then, we have three channels with three foregrounds per channel each containing a basis set of 100 maps and again note that the presence of the i index on the fijk allows for variations in the spatial structure of each foreground between frequency bands. This is a total of 900 CMB foreground basis vectors. Our covariance matrix C will have 9002 = 180000 elements, which is computationally straight forward to invert by brute force. Furthermore, due to the special conditions we imposed regarding correlations between foregrounds and channels (i.e. th at there are none) the matrix will be sparse and block diagonal. To illustrate this, we consider a matrix Cijki>j>k', where the j components run along the columns and the j 1components down the rows. Since there are no correlations between foregrounds the off diagonal elements will be zero, i.e. \ Cii j k i ' j fk f — C ilfc i'lfc ' 0 b 0 C i 2 k i '2 k ’ 0 (7.25) 0 C a k i 'i k ’J V 0 If we now consider each block of the diagonal our assumption that the channels are uncorrelated gives further diagonal structure to the sub-matrix. For the synchrotron foreground j f = [1,1] and with i running across the rows and i' down the columns, we have C 'ilk i'\k ' — C llk U k ’ 0 0 0 C 2lk21k' 0 0 0 ^ (7.26) C3\k3lk' J In order to complete the description of the covariance matrix C, we have to specify the nature of the covariances between k and k' and then populate each of the diagonal elements of (7.25) for the three possible values of j . Due to our choice of basis, this is quite straightforward. We set up the expected values Xjjk of our foreground components to be 1 in each basis. We then assign variances to indicate our uncertainty 94 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. in our foreground basis vectors and use these to generate the diagonal elements of the covariance matrix. The covariance matrix C is now fully defined for this simple model and we can plug this into our sampling equations along with our definitions of the foregrounds, the fak in order to start the Gibbs sampling process. At each iteration the sampling algorithm will return a vector x^k which define the relative amount of each foreground map k which will contribute to the total map of that particular foreground at th at particular frequency. In order to implement more complicated real world models there are several ways to proceed. Staying within the framework outlined above but now introducing corre lations between foregrounds both spatially and in frequency, we need to populate the off-diagonal elements of (7.25). We set up the expected values x^k of our foreground components to sum to 1 in each basis. That is, we assume that our set of basis vectors spans the range of all possible foreground models. We then assign variances to indicate our uncertainty in our foreground basis vectors. The covariance matrix is similar to before, but now need to vary i and j in addition to k. If we consider the 3rd element of the first row in (7.25), corresponding to the dust-synchrotron covariance, the j j ' = (1,3) sub-matrix would be C i \ k i f3k' C n k l3 k ’ C 21 fcl3 fc' C3lJfcl3fc' Cuk23k’ C ,21fc22fc/ C z \k 2 3 k ’ \pUk33h’ C 2 1*33 /0' (7.27) C : n k : i3 k ’ j and the covariance matrix for the kk' in element ii' = (1,2), for example, is Covarnk23k' = Corri ik23k' Vvarnkvar23k' (7.28) and similarly for all the possible combinations of i and i! for each combination of j and j 1. We may wish to set the covariance in the off-diagonal elements by considering the level of correlation between the various foreground components. For example it may be desirable to have very high correlation between multiple channels at the same 95 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. frequency when considering the same foreground. Once the covariance matrix is thus populated, the sampling can proceed as before. Since the C matrix is no longer sparse or block diagonal, the matrix inversion will have a significantly increased computa tional cost. Since this inversion need only be performed once, this should not be too troublesome. If the number of elements in the C matrix is increased significantly, by increasing the number of maps in the k basis for example, the inversion may become too computationally expensive, so a balance must be struck between the level of detail in the foreground model and the size of the matrix G. The above are just two examples of possible ways of modeling the foregrounds, and it is quite possible to take any number of entirely different approaches to specifying the model. For example, we could perhaps split the sky up into bands in galactic latitude and make each band into a basis vector. Each basis vector has a known spatial distribution of foreground in it, but with unknown amplitude x and the sampling algorithm will find these x, conditioned on the data. In this case we may want to ensure that there are not large discontinuities in the foreground amplitudes at the borders of the bands. That is, we want to introduce some positive correlation between nearby regions in k-space, so that the x^k vary somewhat smoothly across the final foreground map. We might choose something like C (x ,x ’) = A 2r * ' 2 + l ]n, (7.29) so th at the correlation between x and x' depends on the angular distance between the centers of the two regions. A is just a scaling constant. For a simple case we can include information about the correlation between the bands both in terms of frequency and foreground component by providing some simple model of correlation as a function of height above the Galactic plane. Alternatively, we might find a useful way to represent a particular foreground in spherical harmonic space and this would then form our basis for th a t foreground. 96 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. In each case, careful consideration has to be given to the effect our choices have on the structure of the covariance matrix C. If the computational cost of inverting the matrix becomes prohibitive it is still possible to implement an iterative PCG scheme as long as C is structured in the chosen basis. This is in the spirit of the Wiener filtering approach described in e.g., Tegmark & Efstathiou [1996] and Bouchet & Gispert [1999]. 7.7 A S p ecific C M B F oregrou nd B asis W ith the theoretical framework for foreground sampling firmly established and the suggestions for possible choices of foreground basis sets outlined above, I will now go on to examine one such choice of basis in detail. Specifically, the basis will be a set of maps of the three principal foreground components: dust, free-free and syn chrotron emission. I consider two cases of this model. One in which just one map per foreground, per channel is included, which is somewhat equivalent to the template removal schemes which are currently employed on CMB data sets. It is of interest to see if the MAGIC algorithm can produce similar results to current foreground removal techniques. It will also provide a description of the uncertainty in such template re moval schemes, a feature which has hitherto been lacking. The second model is a set of multiple foreground maps (i.e. basis vectors) per foreground, per channel. This will allow the algorithm a great deal of flexibility to select the optimal combination of basis vectors that best describes each foreground at a particular frequency. Below I detail the method for generating multiple maps of each of the CMB foreground components. Work is currently in progress to use this specific choice of basis vectors to analyze the foreground contamination in the WMAP data, but unfortunately no firm results have yet been produced. 97 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 7.7.1 D u st B asis Set For the dust foreground, the multi-component model of Finkbeiner et al. [1999] is used. For such a model with k components, the emission, /, at frequency v can be written r p’" ______ I2kfkQ k(v)B u(Tpk)______ Y .k fkQk{vo)BV(1{Tpk) K Wo{ak,Tpk) p’100’ 1 where f k is a normalization factor for the fcth component, Tpk is the temperature in pixel p of component k, K Wo is the DIRBE color-correction factor [Boggess et al., 1992] and /p,ioo is the 100/im flux in pixel p in the DIRBE-calibrated 100p m map, as discussed in Schlegel et al. [1998]. BV{T) is the Planck function at temperature T. Q{v) is the ratio of the emission efficiency cross section to the classical cross section of the grain. Following the development in Finkbeiner et al. [1999] and skipping some of the detail, we note that the emission opacity (effective area per mass) for a spherical grain of radius a, Kem(i>), is related to Q{u) by (7.3D The frequency dependence is taken to be a power law Kem{u) = Kem{uQ){v/uQ)a\ (7.32) where a k is the emissivity index and Kem(r/o) is the opacity of species k at a reference frequency of u0 — 3 THz. We will define f k as the fraction of power absorbed and re-emitted by component k, so the power fractions are forced to sum to unity J2fk = 1k (7-33) If we demand that each grain species in our multi-component model is in equilibrium with the interstellar radiation field, the energy of all species is related by r -t Ki Jo r «r(i/)B „(rO *- = - t Kj Jo 98 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. (7.34) T able 7.1: Parameter values for multi-component foreground model dust extrapola tion. ._____ ._____ ._______ .______ . ai where k* isthe effective field. Using eq. Q?2 h qi/Qi 1.50 2.60 0.0309 11.2 1.60 2.65 0.0330 12.0 1.65 2.70 0.0360 13.0 1.70 2.75 0.0390 14.0 1.80 2.80 0.0420 15.0 absorption opacity of species j to the interstellar radiation (7.31) we can solve for the temperature of onecomponent as a function of the other rp4+ai Q }^ {a j) i \a i-a irp 4 + o c j Ti T> tn - (7-35) where q = Ke" > 0)/K*, (7.36) is the ratio of far-IR emission cross section to the UV/optical absorption cross sec tion. We now have a two-species dust model which is parameterized by three global parameters (fk,Qk,ctk), where qk is the ratio of opacities, and one parameter that varies with position on the sky Tk(x). Finkbeiner et al. [1999] find th at a two component model with the following param eters produces a %2 per degree of freedom of 1.85. ot\ = 1.67, a-i = 2.70, f \ = 0.0363 and ^ = 13.0. This is the model that I use for the template fitting mode, as described in the introduction to this chapter. For the multi-band, multi-component mode, 5 maps are generated with various realistic values for a.\. a,2 , /ia n d ^ extrapolated from Table 2 of Finkbeiner et al. [1999]. The values for the parameters for each of the 5 maps are shown in 7.1. An example of a dust map extrapolated to the WMAP W-Band (90 GHz) using this model with the best fit parameters is shown in fig. 7.1. 99 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 0 6370 Figure 7.1: Dust emission map extrapolated from 100/rra DIRBE maps. Units are mK and the color scheme of the map has been histogram equalized in order to illustrate the spatial distribution of emission on the sky. 7.7.2 Synchrotron B asis Set In order to generate synchrotron maps at a given frequency the usual approach is taken, which is to begin with a map of synchrotron emission at 408M H z as provided by Haslam et al. [1981]. This map is then scaled to the relevant WMAP frequencies via = U 4 0 8 (7. 37) where x = Iw/ kTc m b ■ The spatial variation in the synchrotron spectral index a is encoded in a map which varies over the sky. This map is used unchanged for the template mode analysis. In order to generate different five different foreground components (fijk for j=synchrotron) for the multi-component foreground mode, I scale this spectral index map by -10%, -5%, 0, +5% and +10%. The initial spectral index map is shown in fig. 7.2. The resulting map at 90 GHz, obtained by applying eq.(7.37) is shown in fig. 7.3. 100 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. Figure 7.2: Synchrotron spectral index map used to determine the value of the spectral index in eq. 7.37. In order to generate additional possible synchrotron foreground maps for the multi-component foreground removal mode, this map is scaled by -10%, -5%, 0%, +5% and +10%. 0 .0 0 0 2 9 • 0 27 Figure 7.3: Synchrotron emission map extrapolated from 408MHz Haslam map. Units are mK and a histogram equalized color scheme is used for illustrative purposes. 101 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 7.7.3 Free-Free B asis Set The free-free emission maps are based on the H a emission maps of Schlegel et al. [1998]. These maps are converted to the relevant WMAP frequency and a dust extinction correction is applied based on the Schlegel et al. [1998] extinction map at H a and the procedure described in Dickinson et al. [2003]. The fraction of dust used in calculating the H a absorption is a parameter and can rim from 0 (no absorption) to 1 (full extra-Galactic absorption). A full description of the procedure is given in Dickinson et al. [2003]. The free-free spectral index (5 is a slow function of frequency and electron temper ature [Bennett et al., 1992]. This must be accounted for or a significant discrepancy can occur in the predicted radio emission over a wide range of frequencies. Oster [1961] gives an accurate formalism for the optical depth for free-free emission as = 3.014 x 10-2T - £ f ^Hz° X M 4.995 x lO"2^ ) + 1.5M Te(k))](£M cm-6pc), (7.38) where EM is the emission measure, defined as E M = f nidi. Alternatively the approximation of Altenhoff et al. [1960] can be used Tealtenhoff = 8-235 x 10- 2Te^ f ^ x (E M cm-epc). (7.39) Dickinson et al. [2003] uses a factor, o, from Mezger & Henderson [1967] which is the ratio between these two formulae. Teoat°r= 0 . 3 6 6 x M 4-995 x lO”2^ 1) + 1.5/n(Te(K))]. (7.40) altenhoff The brightness temperature in Kelvin can now be written Tb = 8.235 x l O - ' a ^ T ^ f a + 0.08){EM crn-epc), (7.41) where the factor (1+0.08) is the contribution from the fraction of Heatoms which are assumed to be singly ionized. 102 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. Table 7.2: Free-free brightness temperatures per unit Rayleigh assuming Te=7000K Frequency Free-free brightness temperature (GHz) per unit Rayleigh 30 5.83 44 2.56 70 0.94 100 0.43 ( /iK ) One more relation for Ha intensity is required to obtain the relationship between radio emission and H a emission. This is taken from Valls-Gabaud [1998] in units of ergcm^2s ^ 1sr~l I (H a ) = 9.41 x 1 0 - 8 T4- 1 0 1 7 1 0 - ° O 29/r 4 (£;Mcm- 6pc). (7.42) The relationship between radio emission and H a emission from eq.(7.42) and (7.41) can then be shown as = 8.396 x 103 a ^ ^ T 4° '667 100 029/T4(l + 0.08), (7.43) * Ha where T4 is the electron temperature in units of 104K . The free-free brightness tem perature conversion factors are given in Dickinson et al. [2003] and those of relevance to WMAP are shown in table 7.2. It is worth noting th at there is a ± 1 0 0 0 A' un certainty in the electron temperature, which causes a ~ 10% error in eq. 7.43. This formalism is used to create the free-free maps at WMAP frequencies which can then be used in the foreground sampling algorithm. For each of the different foreground models listed above, code provided by Carlo Baccigalupi2 and the Planck CTP Working Group was used to generate the foreground fijk s and I am grateful for their assistance. 2h ttp ://w w w .s is s a .it/ bacci/w ork/co sm o lo g y /co sm ic microwave b ack g ro u n d /fo reg ro u n d s/ 103 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. 0.0 3.5 F ig u re 7.4: Free-free emission map in mK, histogram equalized color scheme for illustrative purposes. The map was extrapolated to 90GHz 7.8 C o n clu sio n s I have presented a detailed formalism for representing foregrounds within the Gibbs Sampling framework. The method is flexible and is not restricted to a specific basis set for the description of a particular foreground. It is also computationally fast, assuming one can precompute the covariance matrix C. I have examined in detail some possible choices of foreground model and covariance matrix, C. I also discussed the importance of making careful choices to maintain the computational feasibility of the Gibbs algorithm. Although any number of foreground components is supported in principle, the matrix inversion on the LHS of eq. (7.16) and eq. (7.20) must be computationally feasible. The requirement for calculating the inverse of this matrix, which is an N^lements computational operation, is likely to be the limiting factor on the number of foreground components that could be included in a given model. Extension to polarization is also straightforward to include in the framework, although this also leads to an increased set of basis vectors and thus a 104 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. larger C matrix to invert. Finally, I presented very specific foreground maps which can be used as a basis set for performing foreground removal on CMB data sets. This particular basis set is currently being implemented to analyze and remove the foreground contamination in the WMAP data, and results should be forthcoming shortly.. 105 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. C hapter 8 C onclusion 8.1 S u m m ary o f P rogress Below I summarize the progress which has been made in the course of working on this thesis. 1. Application of an approximate power spectrum estimator to BEAST data 2. Development and refinement of Gibbs Sampling method for PS Estimation 3. Application of MAGIC to WMAP data 4. Development of specific framework for treating foregrounds within MAGIC 5. Presentation of a specific set of foreground basis vectors for analyzing the WMAP data I have described the problems with power spectrum estimation for CMB experi ments and noted some methods for overcoming these difficulties. I demonstrated the application of an approximate method (MASTER) to the BEAST data and showed that it was possible, even with noisy, ground based data, to extract a power spectrum which was consistent with previous determinations of the CMB power spectrum. The 106 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. result was consistent with the WMAP data at the 2a level. I discussed the compu tational demands of such a method and steps which can be taken to parallelize the algorithm to accelerate the data analysis. I then went on to discuss the main topic of this thesis, the MAGIC method. This is a Gibbs sampling approach to power spectrum estimation and has the advantage of being both exact, in the limit of sufficient Gibbs samples, and computationally feasible. The algorithm is flexible and can account for experimental beam shapes, polarization and foreground contamination, in addition to a range of other real-world experimental complications. I discussed the computational scaling of the method and the specific details of the parallelization scheme which was employed. I demonstrated the application of this new method to the first year WMAP data and obtained a power spectrum which was in broad agreement with the determination made by the WMAP team, but which difFered in the details. Because the MAGIC algorithm explores the full posterior likelihood of each C e , it is possible to examine the detailed differences between the MAGIC result and the WMAP team result and also assess how well the best-fit cosmological model is supported by the WMAP data. I found that, with a full characterization of the error bars, it was possible to determine that most of the deviations of the data from the best-fit cosmological model were within the errors and that assuming Gaussian error bars is inaccurate, particularly at low I. I went on to describe a method for including foregrounds within the framework of Gibbs sampling. The formalism is flexible and allows foregrounds to be dealt with in the basis set which is most convenient for their sampling. I discussed choices of basis vector in broader terms, and I then detailed two specific models for foreground removal in the context of the WMAP data: a simple template removal type analysis and a more complex analysis involving multiple foreground maps and multiple basis vectors for each frequency channel. I noted that work to implement this scheme on 107 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. the WMAP data is well advanced and firm results should shortly be forthcoming. 8.2 T h e P ro m ise o f G ib b s S am p lin g M e th o d s The Gibbs sampling approach offers the opportunity to perform an exact power spec trum analysis on CMB data sets in a computationally feasible way. This is the first time that an exact scheme of broad applicability has been available. W ith the up coming launch of the Planck satellite scheduled for 2007, the need for comprehensive power spectrum estimation tools has never been greater. The computational chal lenges of expanding the MAGIC algorithm are considerable. I showed that the Planck data will probably require at least 256 times more computing power then the WMAP analysis and that is assuming th at there is no need to go back to the time ordered domain in order to account for correlated noise. The extension of MAGIC to handle polarized CMB data and asymmetric experimental beams are both important projects and both are under way. The possible inclusion of parameter estimation within the sampling algorithm is also very intriguing, rather than the post-hoc method that is currently used. Including any or all of these features in MAGIC, or any Gibbs sampling algorithm will lead to a dramatic increase in computational time for the sampling process and clever parallelization schemes. Advances in raw computing power will be needed if the algorithm is to grow with the size and complexity of the observations. Gibbs sampling offers us the hope of exact power spectrum estimation on large and complex CMB datasets, and this is an exciting and challenging time to be involved in the business of extracting cosmological and astrophysical information from observations of the cosmic microwave background. 108 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. R eferen ces Alpher, R. A., & Herman, R. C. 1950, Reviews of Modern Physics, 22, 153 Altenhoff W., Mezger P.G., Wendker H., Westerhout G. 1960, Veroff Sternwarte Bonn, No 59, 48 Thomas Bayes (1763), ”An Essay towards solving a Problem in the Doctrine of Chances” , Philosophical Transactions of the Royal Society of London, 53 Bennett, C. L., et al. 1992, ApJL, 396, L7 Bennett, C. L. et al. 2003a, ApJS, 148, 1 Bennett, C. L. et al. 2003b, ApJS, 148, 97 Boggess, N. W., et al. 1992, ApJ, 397, 420 Bouchet, F. R., & Gispert, R. 1999, New Astronomy, 4, 443 Childers, J., et al. 2005, ApJS, 158, 124 Chu, M., Eriksen, H. K., Knox, L., Gorski, K. M., Jewell, J. B., Larson, D. L., O’Dwyer, I. J., & Wandelt, B. 2005, PRD, 71, 103002 Dicke, R. H., Peebles, P. J. E., Roll, P. G., & Wilkinson, D. T. 1965, ApJ, 142, 414 Dickinson, C., Davies, R. D., & Davis, R. J. 2003, MNRAS, 341, 369 Draine, B. T., & Lazarian, A. 1999, ASP Conf. Ser. 181: Microwave Foregrounds, 181, 133 Draine, B. T., h Lazarian, A. 1999, ApJ, 512, 740 Draine, B. T., & Lazarian, A. 1998, ApJ, 508, 157 Efstathiou, G. 2004, MNRAS, 348, 885 Eriksen, H. K., et al. 2004, ApJS, 155, 227 Figueiredo, N., et al. 2005, ApJS, 158, 118 Finkbeiner, D. P., Davis, M., h Schlegel, D. J. 1999, ApJ, 524, 867 109 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. Finkbeiner, D. P., & Schlegel, D. J. 1999, ASP Conf. Ser. 181: Microwave Fore grounds, 181, 101 Finkbeiner, D. P., Schlegel, D. J., Frank, C., &; Heiles, C. 2002, ApJ, 566, 898 Finkbeiner, D. P. 2003, ApJS, 146, 407 Finkbeiner, D. P., Langston, G. I., h Minter, A. H. 2004, ApJ, 617, 350 Friedmann, A. 1922, Z. Phys., 10, 377 Friedmann, A. 1924, Z. Phys., 21, 326 Gamow, G. 1949, Reviews of Modern Physics, 21, 367 Gelfand, A. E. and Smith, A. F. M. 1990, J. Am. Stat. Asso. 85, 398 Geman, S. & Geman, D. 1984, IEEE Transactions on Pattern Analysis and Machine Intelligence, 12:609-628 Gorski, K. M., Hivon, E., & Wandelt, B. D., 1999, in Evolution of Large-Scale Struc ture: from Recombination to Garching, ed. A. J. Banday, R. K. Sheth, & L. N. da Costa (Garching, Germany: European Southern Observatory), 37 Keith Grainge et al. Mon. Not. R. Astron. Soc. 000, 15 (2002) Hanany, S. et al. 2000, ApJL, 545, L5 Haslam, C. G. T., Klein, U., Salter, C. J., Stoffel, H., Wilson, W. E., Cleary, M. N., Cooke, D. J., & Thomasson, P. 1981, AAP, 100, 209 Hinshaw, G. et al. 2003, ApJS, 148, 135 Hivon, E., Gorski, K. M., Netterfield, C. B., Crill, B. P., Prunet, S., & Hansen, F. 2002, ApJ, 567, 2 Hobson, M. P., & McLachlan, C. 2003, MNRAS, 338, 765 Hu, W., & Dodelson, S. 2002, ARA&A, 40, 171 Huey, G, Cyburt, R. H., Wandelt, B.D., 2003, astro-ph/0307080, Phys. Rev. D, in press Jewell, J., h et al. 1999, ASP Conf. Ser. 181: Microwave Foregrounds, 181, 357 Jewell, J., Levin, S., & Anderson, C. H. 2004, ApJ, 609, 1 Knox, L., Bond, J. R., Jaffe, A. H., Segal, M., & Charbonneau, D. 1998, PRD, 58, 83004 Kolb, E.W. and Turner, M.S., The Early Universe, 1994, Addison-Wesley 110 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. Kuo, C. L. et al. 2002, Bulletin of the American Astronomical Society, 34, 649 Lawson, K. D., Mayer, C. J., Osborne, J. L., Sz Parkinson, M. L. 1987, MNRAS, 225, 307 Meinhold, P. R., et al. 2005, ApJS, 158, 101 Mejia, J., et al. 2005, ApJS, 158, 109 Mezger, P. G., & Henderson, A. P. 1967, ApJ, 147, 471 Miller, A.D. et al. 1999, ApJ, 524, L1-L4 O’Dwyer, I. J., et al. 2004, ApJL, 617, L99 Oh, S. P., Spergel, D. N., & Hinshaw, G. 1999, ApJ, 510 551 Oster L., Rev. mod. Phys., 33, 525 Padin, S. et. al., 2001, ApJ 549, LI Peacock, J. A. 1999, Cosmological physics. Publisher: Cambridge, UK: Cambridge University Press, 1999. ISBN: 0521422701 Peebles, P.J.E., Principles of Physical Cosmology, 1993, Princeton University Press Penzias, A. A., & Wilson, R. W. 1965, ApJ, 142, 419 Press, W.H., Flannery, B.P., Teukolsky, S.A., & Vetter, W.T., Numerical Recipes The Art of Scientific Computing, Cambridge University Press, Cambridge 1986 Robertson, H. P. 1935, ApJ, 82, 284 Robertson, H. P. 1936, ApJ, 83, 187 Schlegel, D. J., Finkbeiner, D. P., & Davis, M. 1998, ApJ, 500, 525 Seljak, U., &; Zaldarriaga, M. 1996, ApJ, 469, 437 Slosar, A., Seljak, U., & Makarov, A. 2004, PRD, 69, 123003 Smoot, G. F., et al. 1991, ApJL, 371, LI Smoot, G. F. 1999, ASP Conf. Ser. 181: Microwave Foregrounds, 181, 61 Spergel, D. N., et al. 2003, ApJS, 148, 175S Tanner, M. A., & Wong, W. H. 1987, Journal of the American Statistical Association 82, 528-550 Tanner, M. A. 1996, Tools for statistical inference, 3rd ed. Springer-Verlag, New York. Ill R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. Tegmark, M., & Efstathiou, G. 1996, MNRAS, 281, 1297 Tegmark, M., de Oliveira-Costa, A., & Hamilton, A. J. 2003, PRD, 68, 123523 Valls-Gabaud, D. 1998, Publications of the Astronomical Society of Australia, 15, 111 Walker, A. G. 1936, Proc. Lond. Math. Soc. (2) 42, 90 Wandelt, B.D., Hivon, E.F, Gorski, K.M. 2001, PRD, 64, 083003 Wandelt, B. D., Larson, D. L., &; Lakshminarayanan, A. 2004, PRD, 70, 083511 Wandelt, B. D. 2004, in “Stanford 2003, Statistical problems in particle physics, astrophysics and cosmology” , Lyons, L. et al. (Eds.), p. 229 [astro-ph/0401623] Wandelt, B. D. and Hansen, F. 2003, PRD, 67, 023001 112 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. C urriculum V ita e Ian John O’Dwyer Astronomy Department, University of Illinois, 1004 W. Green St., Urbana, IL, 61801 [email protected] Office Phone: 217 333 9343 U niversity Education 1999-present Astronomy Department, University of Illinois PhD 2005 (expected) MS 2001 (Supervisor: P rofessor B .D . W andelt) Thesis title: ‘Statistical Analysis of the Cosmic Microwave Background: Power Spectra and Foregrounds ’ 1991-1994 University of Cambridge, Cambridge, UK MA (Cantab.) 1996 Engineering BA(Hons) 1994 W ork Experience 1994-1999 Analyst, Goldman, Sachs and Co., London, UK and Tokyo, Japan Observing E xperience Wrote proposal and was awarded observing time on the NOAO Kitt Peak 2.1m tele scope. Completed several observing runs on the lm telescope at Mount Laguna, CA. Teaching Experience Instructor - Astronomy 100, Summer 2005 Teaching Assistant - Various undergraduate and graduate level classes 1999-2005 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. IT Skills • Fortran 90 and C programming languages • MPI and OPENMP parallelization schemes on several large scale architectures • AWK and PERL scripting languages • IDL and Mathematica • I RAF data reduction package Sum m er Schools and W orkshops 2004 Santa Fe Cosmology Workshop, Santa Fe, NM 2002 Chalonge School, Palermo, Sicily Awards 2002 Univeristy of Illinois teaching award Fellowship to attend 2002 Chalonge School Croxton and Garry award for achievement in Chemistry Professional Organizations Graduate student member of American Physical Society Graduate student member of American Astronomical Society R eferees USA P r o fe sso r M. B ersa n elli Physics D epartm ent University of Milano via Veloria 16 20133 Milano Ita ly email: [email protected] uiuc.edu Tel: 1-217-333-9357 email: [email protected] Tel: 02-50317619 P ro fesso r B . D . W a n d e lt Physics D epartm ent University of Illinois 1110 W. Green Street U rb a n a, IL 61801 P ro fesso r B . D . F ie ld s D epartm ent of A stronom y University of Illinois 1002 W Greeen St U rb a n a ,IL 61801 USA email: [email protected] astro.uiuc.edu Tel: 1-217-333-5529 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission. P u b lic a tio n s and P re -P rin ts (available at http://www.astro.uiuc.edu/~iodvryer/pubs.php) Cosm ological Param eter C onstraints as Derived from W M A P D ata via Gibbs Sam pling and the Blackwell-Rao E stim ator M. Chu, H.K. Eriksen, L. Knox, K. Gorski, J.B. Jewell, D.L. Larson, I. J. O’Dwyer, B.D. Wandelt, 2005, PRD, 71, 103002 Bayesian Power Spectrum A nalysis o f th e First-Year W M A P D ata I. J. O’Dwyer, H.K. Eriksen, B.D. Wandelt, J.B. Jewell, D.L. Larson, K. M. Gorski, A.J. Banday, S. Levin, P.B. Lilje, 2004, ApJL, 617, L99 Power Spectrum E stim ation from H igh-R esolution M aps by Gibbs Sam pling H. K. Eriksen, I. J. O’Dwyer, J. B. Jewell, B. D. Wandelt, D. L. Larson, K. M. Gorski, S. Levin, A. J. Banday, P. B. Lilje, 2004, ApJS, 155, 227 T he CM B Power Spectrum from the Background Em ission A nisotropy Scanning T elescope (B E A ST ) Experim ent Ian J. O’Dwyer, Marco Bersanelli, Jeffrey Childers, Newton Figueiredo, Doron Halevi, Gregory G. Huey, Philip M. Lubin, Davide Maino, Nazzareno Mandolesi, Joshua Marvil, Peter R. Meinhold, Jorge Mejia, Paolo Natoli, Hugh O’Neill, Agenor Pina, Michael D. Seiffert, Nathan C. Stebor, Camilo Tello, Thyrso Villela, Benjamin D. Wandelt, Brian Williams, Carlos Alexandre Wuensche, 2005, ApJS, 158, 93 A M ap of th e Cosm ic M icrowave Background from the B E A S T E xperi m ent Peter R. Meinhold, Marco Bersanelli, Jeffrey Childers, Newton Figueiredo, Todd C. Gaier, Doron Halevi, Charles R. Lawrence, Miikka Kangas, Alan Levy, Philip M. Lubin, Marco Malaspina, Nazzareno Mandolesi, Joshua Marvil, Jorge Mejia, Paolo Natoli, Ian O’Dwyer, Hugh O’Neill, Shane Parendo, Agenor Pina, Michael D. Seif fert, Nathan C. Stebor, Camilo Tello, Fabrizio Villa, Thyrso Villela, Benjamin D. Wandelt, Brian Williams, Carlos Alexandre Wuensche, 2005, ApJS, 158, 101 G alactic Foreground C ontribution to the B E A ST CM B A nisotropy Maps Jorge Mejfa, Marco Bersanelli, Carlo Burigana, Jeff Childers, Newton Figueiredo, Miikka Kangas, Philip Lubin, Davide Maino, Nazzareno Mandolesi, Josh Marvil, Peter Meinhold, Ian O’Dwyer, Hugh O ’Neill, Paola Platania, Michael Seiffert, Nathan Stebor, Camilo Tello, Thyrso Villela, Benjamin Wandelt, Carlos Alexandre Wuensche, 2005, ApJS, 158, 109 Hard X-ray Em ission A ssociated w ith W hite Dwarfs I. J. O’Dwyer, Y.-H. Chu, R. A. Gruendl, M. A. Guerrero, R. F. Webbink , 2003, AJ, 125, 2239 Variable H -a Line Em ission from the Central Star o f th e H elix N ebula Robert A. Gruendl, You-Hua Chu, Ian J. O’Dwyer, Martin A. Guerrero , 2001, AJ, 122, 308 R eproduced with perm ission of the copyright owner. Further reproduction prohibited without permission.

1/--страниц