Constructing Mock X-ray Observations


The photon_simulator analysis module has been deprecated; it is no longer being updated, and it will be removed in a future version of yt. Users are encouraged to download and use the pyXSIM package instead.


If you just want to create derived fields for X-ray emission, you should go here instead.

The photon_simulator analysis module enables the creation of simulated X-ray photon lists of events from datasets that yt is able to read. The simulated events then can be exported to X-ray telescope simulators to produce realistic observations or can be analyzed in-line.

For detailed information about the design of the algorithm in yt, check out the SciPy 2014 Proceedings..

The algorithm is based off of that implemented in PHOX for SPH datasets by Veronica Biffi and Klaus Dolag. There are two relevant papers:

Biffi, V., Dolag, K., Bohringer, H., & Lemson, G. 2012, MNRAS, 420, 3545

Biffi, V., Dolag, K., Bohringer, H. 2013, MNRAS, 428, 1395

The basic procedure is as follows:

  1. Using a spectral model for the photon flux given the gas properties, and an algorithm for generating photons from the dataset loaded in yt, produce a large number of photons in three-dimensional space associated with the cells of the dataset.

  2. Use this three-dimensional dataset as a sample from which to generate photon events that are projected along a line of sight, Doppler and cosmologically shifted, and absorbed by the Galactic foreground.

  3. Optionally convolve these photons with instrument responses and produce images and spectra.

We’ll demonstrate the functionality on a realistic dataset of a galaxy cluster to get you started.


Currently, the photon_simulator analysis module only works with grid-based data.

Creating an X-ray observation of a dataset on disk

import yt
#yt.enable_parallelism() # If you want to run in parallel this should go here!
from yt.analysis_modules.photon_simulator.api import *
from yt.utilities.cosmology import Cosmology


For parallel runs using mpi4py, the call to yt.enable_parallelism should go before the import of the photon_simulator module, as shown above.

We’re going to load up an Athena dataset of a galaxy cluster core:

ds = yt.load("MHDSloshing/virgo_low_res.0054.vtk",

First, to get a sense of what the resulting image will look like, let’s make a new yt field called "density_squared", since the X-ray emission is proportional to \(\rho^2\), and a weak function of temperature and metallicity.

def _density_squared(field, data):
    return data["density"]**2
ds.add_field("density_squared", function=_density_squared, units="g**2/cm**6")

Then we’ll project this field along the z-axis.

prj = yt.ProjectionPlot(ds, "z", ["density_squared"], width=(500., "kpc"))
prj.set_cmap("density_squared", "gray_r")

In this simulation the core gas is sloshing, producing spiral-shaped cold fronts.


To work out the following examples, you should install AtomDB and get the files from the xray_data auxiliary data package (see the Auxiliary Data Files for use with yt’s Photon Simulator for details on the latter). Make sure that in what follows you specify the full path to the locations of these files.

To generate photons from this dataset, we have several different things we need to set up. The first is a standard yt data object. It could be all of the cells in the domain, a rectangular solid region, a cylindrical region, etc. Let’s keep it simple and make a sphere at the center of the domain, with a radius of 250 kpc:

sp = ds.sphere("c", (250., "kpc"))

This will serve as our data_source that we will use later. Next, we need to create the SpectralModel instance that will determine how the data in the grid cells will generate photons. By default, two options are available. The first, XSpecThermalModel, allows one to use any thermal model that is known to XSPEC, such as "mekal" or "apec":

mekal_model = XSpecThermalModel("mekal", 0.01, 10.0, 2000)

This requires XSPEC and PyXspec to be installed. The second option, TableApecModel, utilizes the data from the AtomDB tables. We’ll use this one here:

apec_model = TableApecModel("$SPECTRAL_DATA/spectral",
                            0.01, 20.0, 20000,

The first argument sets the location of the AtomDB files, and the next three arguments determine the minimum energy in keV, maximum energy in keV, and the number of linearly-spaced bins to bin the spectrum in. If the optional keyword thermal_broad is set to True, the spectral lines will be thermally broadened.


SpectralModel objects based on XSPEC models (both the thermal emission and Galactic absorption models mentioned below) only work in Python 2.7, since currently PyXspec only works with Python 2.x.

Now that we have our SpectralModel that gives us a spectrum, we need to connect this model to a PhotonModel that will connect the field data in the data_source to the spectral model to actually generate photons. For thermal spectra, we have a special PhotonModel called ThermalPhotonModel:

thermal_model = ThermalPhotonModel(apec_model, X_H=0.75, Zmet=0.3,

Where we pass in the SpectralModel, and can optionally set values for the hydrogen mass fraction X_H and metallicity Z_met. If Z_met is a float, it will assume that value for the metallicity everywhere in terms of the solar metallicity. If it is a string, it will assume that is the name of the metallicity field (which may be spatially varying).

The ThermalPhotonModel iterates over “chunks” of the supplied data source to generate the photons, to reduce memory usage and make parallelization more efficient. For each chunk, memory is set aside for the photon energies that will be generated. photons_per_chunk is an optional keyword argument which controls the size of this array. For large numbers of photons, you may find that this parameter needs to be set higher, or if you are looking to decrease memory usage, you might set this parameter lower.

The method keyword argument is also optional, and determines how the individual photon energies are generated from the spectrum. It may be set to one of two values:

  • method="invert_cdf": Construct the cumulative distribution function of the spectrum and invert it, using uniformly drawn random numbers to determine the photon energies (fast, but relies on construction of the CDF and interpolation between the points, so for some spectra it may not be accurate enough).

  • method="accept_reject": Generate the photon energies from the spectrum using an acceptance-rejection technique (accurate, but likely to be slow).

method="invert_cdf" (the default) should be sufficient for most cases.

Next, we need to specify “fiducial” values for the telescope collecting area, exposure time, and cosmological redshift. Remember, the initial photon generation will act as a source for Monte-Carlo sampling for more realistic values of these parameters later, so choose generous values so that you have a large number of photons to sample from. We will also construct a Cosmology object:

A = 3000.
exp_time = 4.0e5
redshift = 0.05
cosmo = Cosmology()

Now, we finally combine everything together and create a PhotonList instance:

photons = PhotonList.from_scratch(sp, redshift, A, exp_time,
                                  thermal_model, center="c",

By default, the angular diameter distance to the object is determined from the cosmology and the cosmological redshift. If a Cosmology instance is not provided, one will be made from the default cosmological parameters. The center keyword argument specifies the center of the photon distribution, and the photon positions will be rescaled with this value as the origin. This argument accepts the following values:

  • A NumPy array or list corresponding to the coordinates of the center in units of code length.

  • A YTArray corresponding to the coordinates of the center in some length units.

  • "center" or "c" corresponds to the domain center.

  • "max" or "m" corresponds to the location of the maximum gas density.

  • A two-element tuple specifying the max or min of a specific field, e.g., ("min","gravitational_potential"), ("max","dark_matter_density")

If center is not specified, from_scratch will attempt to use the "center" field parameter of the data_source.

from_scratch takes a few other optional keyword arguments. If your source is local to the galaxy, you can set its distance directly, using a tuple, e.g. dist=(30, "kpc"). In this case, the redshift and cosmology will be ignored. Finally, if the photon generating function accepts any parameters, they can be passed to from_scratch via a parameters dictionary.

At this point, the photons are distributed in the three-dimensional space of the data_source, with energies in the rest frame of the plasma. Doppler and/or cosmological shifting of the photons will be applied in the next step.

The photons can be saved to disk in an HDF5 file:


Which is most useful if it takes a long time to generate the photons, because a PhotonList can be created in-memory from the dataset stored on disk:

photons = PhotonList.from_file("my_photons.h5")

This enables one to make many simulated event sets, along different projections, at different redshifts, with different exposure times, and different instruments, with the same data_source, without having to do the expensive step of generating the photons all over again!

To get a set of photon events such as that observed by X-ray telescopes, we need to take the three-dimensional photon distribution and project it along a line of sight. Also, this is the step at which we put in the realistic values for the telescope collecting area, cosmological redshift and/or source distance, and exposure time. The order of operations goes like this:

  1. From the adjusted exposure time, redshift and/or source distance, and telescope collecting area, determine the number of photons we will actually observe.

  2. Determine the plane of projection from the supplied normal vector, and reproject the photon positions onto this plane.

  3. Doppler-shift the photon energies according to the velocity along the line of sight, and apply cosmological redshift if the source is not local.

  4. Optionally, alter the received distribution of photons via an energy-dependent galactic absorption model.

  5. Optionally, alter the received distribution of photons using an effective area curve provided from an ancillary response file (ARF).

  6. Optionally, scatter the photon energies into channels according to the information from a redistribution matrix file (RMF).

First, if we want to apply galactic absorption, we need to set up a spectral model for the absorption coefficient, similar to the spectral model for the emitted photons we set up before. Here again, we have two options. The first, XSpecAbsorbModel, allows one to use any absorption model that XSpec is aware of that takes only the Galactic column density \(N_H\) as input:

N_H = 0.1
abs_model = XSpecAbsorbModel("wabs", N_H)

The second option, TableAbsorbModel, takes as input an HDF5 file containing two datasets, "energy" (in keV), and "cross_section" (in \(cm^2\)), and the Galactic column density \(N_H\):

abs_model = TableAbsorbModel("tbabs_table.h5", 0.1)

Now we’re ready to project the photons. First, we choose a line-of-sight vector normal. Second, we’ll adjust the exposure time and the redshift. Third, we’ll pass in the absorption SpectrumModel. Fourth, we’ll specify a sky_center in RA and DEC on the sky in degrees.

Also, we’re going to convolve the photons with instrument responses. For this, you need a ARF/RMF pair with matching energy bins. This is of course far short of a full simulation of a telescope ray-trace, but it’s a quick-and-dirty way to get something close to the real thing. We’ll discuss how to get your simulated events into a format suitable for reading by telescope simulation codes later. If you just want to convolve the photons with an ARF, you may specify that as the only response, but some ARFs are unnormalized and still require the RMF for normalization. Check with the documentation associated with these files for details. If we are using the RMF to convolve energies, we must set convolve_energies=True.

ARF = "acisi_aimpt_cy17.arf"
RMF = "acisi_aimpt_cy17.rmf"
normal = [0.0,0.0,1.0]
events = photons.project_photons(normal, exp_time_new=2.0e5, redshift_new=0.07, dist_new=None,
                                 absorb_model=abs_model, sky_center=(187.5,12.333), responses=[ARF,RMF],
                                 convolve_energies=True, no_shifting=False, north_vector=None,

In this case, we chose a three-vector normal to specify an arbitrary line-of-sight, but "x", "y", or "z" could also be chosen to project along one of those axes.

project_photons takes several other optional keyword arguments.

  • no_shifting (default False) controls whether or not Doppler shifting of photon energies is turned on.

  • dist_new is a (value, unit) tuple that is used to set a new angular diameter distance by hand instead of having it determined by the cosmology and the value of the redshift. Should only be used for simulations of nearby objects.

  • For off-axis normal vectors, the north_vector argument can be used to control what vector corresponds to the “up” direction in the resulting event list.

  • psf_sigma may be specified to provide a crude representation of a PSF, and corresponds to the standard deviation (in degrees) of a Gaussian PSF model.

Let’s just take a quick look at the raw events object:

{'eobs': YTArray([  0.32086522,   0.32271389,   0.32562708, ...,   8.90600621,
         9.73534237,  10.21614256]) keV,
 'xsky': YTArray([ 187.5177707 ,  187.4887825 ,  187.50733609, ...,  187.5059345 ,
        187.49897546,  187.47307048]) degree,
 'ysky': YTArray([ 12.33519996,  12.3544496 ,  12.32750903, ...,  12.34907707,
        12.33327653,  12.32955225]) degree,
 'ypix': array([ 133.85374195,  180.68583074,  115.14110561, ...,  167.61447493,
        129.17278711,  120.11508562]),
 'PI': array([ 27,  15,  25, ..., 609, 611, 672]),
 'xpix': array([  86.26331108,  155.15934197,  111.06337043, ...,  114.39586907,
        130.93509652,  192.50639633])}

We can bin up the events into an image and save it to a FITS file. The pixel size of the image is equivalent to the smallest cell size from the original dataset. We can specify limits for the photon energies to be placed in the image:

events.write_fits_image("sloshing_image.fits", clobber=True, emin=0.5, emax=7.0)

The resulting FITS image will have WCS coordinates in RA and Dec. It should be suitable for plotting in ds9, for example. There is also a great project for opening astronomical images in Python, called APLpy:

import aplpy
fig = aplpy.FITSFigure("sloshing_image.fits", figsize=(10,10))
fig.show_colorscale(stretch="log", vmin=0.1, cmap="gray_r")
fig.set_axis_labels_font(family="serif", size=16)
fig.set_tick_labels_font(family="serif", size=16)

Which is starting to look like a real observation!


The binned images that result, even if you convolve with responses, are still of the same resolution as the finest cell size of the simulation dataset. If you want a more accurate simulation of a particular X-ray telescope, you should check out Storing events for future use and for reading-in by telescope simulators.

We can also bin up the spectrum into energy bins, and write it to a FITS table file. This is an example where we’ve binned up the spectrum according to the unconvolved photon energy:

events.write_spectrum("virgo_spec.fits", bin_type="energy", emin=0.1, emax=10.0, nchan=2000, clobber=True)

We can also set bin_type="channel". If we have convolved our events with response files, then any other keywords will be ignored and it will try to make a spectrum from the channel information that is contained within the RMF. Otherwise, the channels will be determined from the emin, emax, and nchan keywords, and will be numbered from 1 to nchan. For now, we’ll stick with the energy spectrum, and plot it up:

import as pyfits
f ="virgo_spec.fits")
pylab.loglog(f["SPECTRUM"].data.field("ENERGY"), f["SPECTRUM"].data.field("COUNTS"))
pylab.xlim(0.3, 10)
pylab.xlabel("E (keV)")

We can also write the events to a FITS file that is of a format that can be manipulated by software packages like CIAO and read in by ds9 to do more standard X-ray analysis:

events.write_fits_file("my_events.fits", clobber=True)


We’ve done some very low-level testing of this feature, and it seems to work, but it may not be consistent with standard FITS events files in subtle ways that we haven’t been able to identify. Please email if you find any bugs!

Two EventList instances can be added together, which is useful if they were created using different data sources:

events3 = events1+events2


This only works if the two event lists were generated using the same parameters!

Finally, a new EventList can be created from a subset of an existing EventList, defined by a ds9 region (this functionality requires the pyregion package to be installed):

circle_events = events.filter_events("circle.reg")

Creating a X-ray observation from an in-memory dataset

It may be useful, especially for observational applications, to create datasets in-memory and then create simulated observations from them. Here is a relevant example of creating a toy cluster and evacuating two AGN-blown bubbles in it.

First, we create the in-memory dataset (see Loading Generic Array Data for details on how to do this):

import yt
import numpy as np
from yt.utilities.physical_ratios import cm_per_kpc, K_per_keV
from yt.units import mp
from yt.utilities.cosmology import Cosmology
from yt.analysis_modules.photon_simulator.api import *
import aplpy

R = 1000. # in kpc
r_c = 100. # in kpc
rho_c = 1.673e-26 # in g/cm^3
beta = 1.
T = 4. # in keV
nx = 256

bub_rad = 30.0
bub_dist = 50.0

ddims = (nx,nx,nx)

x, y, z = np.mgrid[-R:R:nx*1j,

r = np.sqrt(x**2+y**2+z**2)

dens = np.zeros(ddims)
dens[r <= R] = rho_c*(1.+(r[r <= R]/r_c)**2)**(-1.5*beta)
dens[r > R] = 0.0
temp = T*K_per_keV*np.ones(ddims)
rbub1 = np.sqrt(x**2+(y-bub_rad)**2+z**2)
rbub2 = np.sqrt(x**2+(y+bub_rad)**2+z**2)
dens[rbub1 <= bub_rad] /= 100.
dens[rbub2 <= bub_rad] /= 100.
temp[rbub1 <= bub_rad] *= 100.
temp[rbub2 <= bub_rad] *= 100.

This created a cluster with a radius of 1 Mpc, a uniform temperature of 4 keV, and a density distribution from a \(\beta\)-model. We then evacuated two “bubbles” of radius 30 kpc at a distance of 50 kpc from the center.

Now, we create a yt Dataset object out of this dataset:

data = {}
data["density"] = (dens, "g/cm**3")
data["temperature"] = (temp, "K")
data["velocity_x"] = (np.zeros(ddims), "cm/s")
data["velocity_y"] = (np.zeros(ddims), "cm/s")
data["velocity_z"] = (np.zeros(ddims), "cm/s")

bbox = np.array([[-0.5,0.5],[-0.5,0.5],[-0.5,0.5]])

ds = yt.load_uniform_grid(data, ddims, 2*R*cm_per_kpc, bbox=bbox)

where for simplicity we have set the velocities to zero, though we could have created a realistic velocity field as well. Now, we generate the photon and event lists in the same way as the previous example:

sphere = ds.sphere("c", (1.0,"Mpc"))

A = 3000.
exp_time = 2.0e5
redshift = 0.05
cosmo = Cosmology()

apec_model = TableApecModel("/Users/jzuhone/Data/atomdb_v2.0.2",
                            0.01, 20.0, 20000)
abs_model = TableAbsorbModel("tbabs_table.h5", 0.1)

thermal_model = ThermalPhotonModel(apec_model, photons_per_chunk=40000000)
photons = PhotonList.from_scratch(sphere, redshift, A,
                                  exp_time, thermal_model, center="c")

events = photons.project_photons([0.0,0.0,1.0],

events.write_fits_image("img.fits", clobber=True)

which yields the following image:

fig = aplpy.FITSFigure("img.fits", figsize=(10,10))
fig.show_colorscale(stretch="log", vmin=0.1, vmax=600., cmap="jet")
fig.set_axis_labels_font(family="serif", size=16)
fig.set_tick_labels_font(family="serif", size=16)

Storing events for future use and for reading-in by telescope simulators

If you want a more accurate representation of an observation taken by a particular instrument, there are tools available for such purposes. For the Chandra telescope, there is the venerable MARX. For a wide range of instruments, both existing and future, there is SIMX. We’ll discuss two ways to store your event files so that they can be input by these and other codes.

The first option is the most general, and the simplest: simply dump the event data to an HDF5 file:


This will dump the raw event data, as well as the associated parameters, into the file. If you want to read these events back in, it’s just as simple:

events = EventList.from_h5_file("my_events.h5")

You can use event data written to HDF5 files to input events into MARX using this code.

The second option, for use with SIMX, is to dump the events into a SIMPUT file:

events.write_simput_file("my_events", clobber=True, emin=0.1, emax=10.0)

which will write two files, "my_events_phlist.fits" and "my_events_simput.fits", the former being a auxiliary file for the latter.


You can only write SIMPUT files if you didn’t convolve the photons with responses, since the idea is to pass unconvolved photons to the telescope simulator.

The following images were made from the same yt-generated events in both MARX and SIMX. They are 200 ks observations of the two example clusters from above (the Chandra images have been reblocked by a factor of 4):

../../_images/ds9_sloshing.png ../../_images/ds9_bubbles.png

In November 2015, the structure of the photon and event HDF5 files changed. To convert an old-format file to the new format, use the convert_old_file utility:

from yt.analysis_modules.photon_simulator.api import convert_old_file
convert_old_file("old_photons.h5", "new_photons.h5", clobber=True)
convert_old_file("old_events.h5", "new_events.h5", clobber=True)

This utility will auto-detect the kind of file (photons or events) and will write the correct replacement for the new version.

At times it may be convenient to write several EventLists to disk to be merged together later. This can be achieved with the merge_files utility. It takes a list of

from yt.analysis_modules.photon_simulator.api import merge_files
merge_files(["events_0.h5", "events_1.h5", "events_2.h5"], "merged_events.h5",
             add_exposure_times=True, clobber=False)

At the current time this utility is very limited, as it only allows merging of EventLists which have the same parameters, with the exception of the exposure time. If the add_exposure_times argument to merge_files is set to True, the lists will be merged together with the exposure times added. Otherwise, the exposure times of the different files must be equal.