# Calculating Dataset Information

These recipes demonstrate methods of calculating quantities in a simulation, either for later visualization or for understanding properties of fluids and particles in the simulation.

## Average Field Value

This recipe calculates the global average of a given field, weighted by another field. See Processing Objects: Derived Quantities for more information.

import yt

# Load the dataset. The path is a placeholder; substitute your own output.
ds = yt.load("path/to/dataset")

field = "temperature"  # The field to average
weight = "cell_mass"  # The weight for the average

ad = ds.all_data()  # This is a region describing the entire box,
# but note it doesn't read anything in yet!

# We now use our 'quantities' call to get the average quantity
average_value = ad.quantities.weighted_average_quantity(field, weight)

print("Average %s (weighted by %s) is %0.3e %s" % (field, weight, average_value, average_value.units))
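
The arithmetic behind a weighted average can be sketched without yt. This is a minimal illustration in plain Python, not yt's implementation; the function name `weighted_average` and the sample values are made up for the example.

```python
def weighted_average(values, weights):
    """Return sum(w_i * x_i) / sum(w_i) for two equal-length sequences."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical cell temperatures (K) and cell masses (arbitrary units).
temperature = [1.0e4, 2.0e4, 4.0e4]
cell_mass = [1.0, 1.0, 2.0]

avg = weighted_average(temperature, cell_mass)
print("Mass-weighted average temperature: %0.3e K" % avg)
```

The heaviest cell pulls the result toward its own temperature, which is why a mass-weighted average can differ noticeably from a plain mean.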


## Mass Enclosed in a Sphere

This recipe constructs a sphere and then sums the total mass in particles and fluids in the sphere. See Available Objects and Processing Objects: Derived Quantities for more information.

import yt

# Load the dataset. The path is a placeholder; substitute your own output.
ds = yt.load("path/to/dataset")

# Create a 1 Mpc radius sphere, centered on the max density.
sp = ds.sphere("max", (1.0, "Mpc"))

# Use the total_quantity derived quantity to sum up the
# values of the cell_mass and particle_mass fields
# within the sphere.
baryon_mass, particle_mass = sp.quantities.total_quantity(["cell_mass", "particle_mass"])

print("Total mass in sphere is %0.3e Msun (gas = %0.3e Msun, particles = %0.3e Msun)" % \
      ((baryon_mass + particle_mass).in_units('Msun'), \
       baryon_mass.in_units('Msun'), particle_mass.in_units('Msun')))
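
Conceptually, the enclosed mass is a sum over every element whose position falls inside the sphere. The sketch below shows that selection and sum in plain Python; `mass_within_sphere` and all sample positions and masses are hypothetical, and yt's own selection of cells is more involved than a simple point-in-sphere test.

```python
import math


def mass_within_sphere(positions, masses, center, radius):
    """Sum the masses of all elements whose position lies within radius of center."""
    total = 0.0
    for pos, mass in zip(positions, masses):
        if math.dist(pos, center) <= radius:
            total += mass
    return total


center = (0.0, 0.0, 0.0)
radius = 1.0

# Hypothetical gas cells and particles: (x, y, z) positions and masses.
gas_positions = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0), (2.0, 0.0, 0.0)]
gas_masses = [1.0, 2.0, 4.0]
particle_positions = [(0.0, 0.9, 0.0), (0.0, 0.0, 1.5)]
particle_masses = [10.0, 20.0]

gas_total = mass_within_sphere(gas_positions, gas_masses, center, radius)
particle_total = mass_within_sphere(particle_positions, particle_masses, center, radius)
print("gas = %0.3e, particles = %0.3e, total = %0.3e"
      % (gas_total, particle_total, gas_total + particle_total))
```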


## Global Phase Plot

This is a simple recipe to show how to open a dataset, plot a global phase diagram, save it, and quit. See 2D Phase Plots for more information.

import yt

# Load the dataset. The path is a placeholder; substitute your own output.
ds = yt.load("path/to/dataset")

# This is an object that describes the entire box
ad = ds.all_data()

# We plot the average velocity magnitude (mass-weighted) in our object
# as a function of density and temperature
plot = yt.PhasePlot(ad, "density", "temperature", "velocity_magnitude")

# save the plot
plot.save()


## Radial Velocity Profile

This recipe demonstrates how to subtract off a bulk velocity on a sphere before calculating the radial velocity within that sphere. See 1D Profile Plots for more information on creating profiles and Field Parameters for an explanation of how the bulk velocity is provided to the radial velocity field function.

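import yt
import matplotlib.pyplot as plt

# Load the dataset. The path is a placeholder; substitute your own output.
ds = yt.load("path/to/dataset")

# Get the first sphere
sp0 = ds.sphere(ds.domain_center, (500., "kpc"))

# Compute the bulk velocity from the cells in this sphere
bulk_vel = sp0.quantities.bulk_velocity()

# Get the second sphere
sp1 = ds.sphere(ds.domain_center, (500., "kpc"))

# Set the bulk velocity field parameter
sp1.set_field_parameter("bulk_velocity", bulk_vel)

# Radial profile without correction
rp0 = yt.create_profile(sp0, "radius", "radial_velocity",
                        units={"radius": "kpc"},
                        logs={"radius": False})

# Radial profile with correction for bulk velocity
rp1 = yt.create_profile(sp1, "radius", "radial_velocity",
                        units={"radius": "kpc"},
                        logs={"radius": False})

# Make a plot using matplotlib

fig = plt.figure()
ax = fig.add_subplot(111)

ax.plot(rp0.x.value, rp0["radial_velocity"].in_units("km/s").value,
        rp1.x.value, rp1["radial_velocity"].in_units("km/s").value)

ax.set_xlabel(r"$\mathrm{r\ (kpc)}$")
ax.set_ylabel(r"$\mathrm{v_r\ (km/s)}$")
ax.legend(["Without Correction", "With Correction"])

fig.savefig("%s_profiles.png" % ds)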
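
To see why the correction matters, the following plain-Python sketch computes a radial velocity with and without a bulk velocity subtracted. It assumes the bulk velocity is a mass-weighted mean; the helper names and the two-cell sample data are invented for illustration and do not reproduce yt's field code.

```python
import math


def bulk_velocity(velocities, masses):
    """Mass-weighted mean velocity of a set of cells (assumed definition)."""
    total = sum(masses)
    return tuple(
        sum(m * v[i] for v, m in zip(velocities, masses)) / total
        for i in range(3)
    )


def radial_velocity(position, velocity, center, bulk=(0.0, 0.0, 0.0)):
    """Project (velocity - bulk) onto the unit vector pointing away from center."""
    r = [p - c for p, c in zip(position, center)]
    r_mag = math.sqrt(sum(x * x for x in r))
    dv = [v - b for v, b in zip(velocity, bulk)]
    return sum(x * d for x, d in zip(r, dv)) / r_mag


center = (0.0, 0.0, 0.0)
positions = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
velocities = [(3.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
masses = [1.0, 1.0]

bulk = bulk_velocity(velocities, masses)
uncorrected = [radial_velocity(p, v, center) for p, v in zip(positions, velocities)]
corrected = [radial_velocity(p, v, center, bulk) for p, v in zip(positions, velocities)]
print("uncorrected:", uncorrected)
print("corrected:  ", corrected)
```

In this toy case both cells move in the same direction, so the uncorrected values suggest one cell is infalling. Once the shared drift is removed, both are seen to be moving outward at the same speed.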


## Simulation Analysis

This recipe uses DatasetSeries to calculate the density extrema of a series of outputs, whose names it guesses in advance. It will run in parallel and take advantage of multiple MPI tasks. See Parallel Computation With yt and Time Series Analysis for more information.

import yt
import collections

# Enable parallelism in the script (assuming it was called with
# mpirun -np <n_procs> )
yt.enable_parallelism()

# By using wildcards such as ? and * with the load command, we can load up a
# Time Series containing all of these datasets simultaneously.
# The pattern is a placeholder; adjust it to match your output names.
ts = yt.load("path/to/output_????")

# Calculate and store density extrema for all datasets along with redshift
# in a data dictionary with entries as tuples

# Create an empty dictionary
data = {}

# Iterate through each dataset in the Time Series (using piter allows it
# to happen in parallel automatically across available processors)
for ds in ts.piter():
    ad = ds.all_data()
    extrema = ad.quantities.extrema("density")

    # Fill the dictionary with extrema and redshift information for each dataset
    data[ds.basename] = (extrema, ds.current_redshift)

# Convert dictionary to ordered dictionary to get the right order
od = collections.OrderedDict(sorted(data.items()))

# Print out all the values we calculated.
print("Dataset      Redshift        Density Min      Density Max")
print("---------------------------------------------------------")
for key, val in od.items():
    print("%s       %05.3f          %5.3g g/cm^3   %5.3g g/cm^3" % \
          (key, val[1], val[0][0], val[0][1]))


## Smoothed Fields

This recipe demonstrates how to create a smoothed field, corresponding to a user-created derived field, using the add_volume_weighted_smoothed_field() method. See Using yt to view and analyze Gadget outputs for how to work with Gadget data.

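import yt
from yt.fields.particle_fields import add_volume_weighted_smoothed_field

# Path to the Gadget snapshot. This is a placeholder; substitute your own file.
fname = "path/to/snapshot.hdf5"

unit_base = {'UnitLength_in_cm'         : 3.08568e+21,
             'UnitMass_in_g'            :   1.989e+43,
             'UnitVelocity_in_cm_per_s' :      100000}

bbox_lim = 1e5  # kpc

bbox = [[-bbox_lim, bbox_lim],
        [-bbox_lim, bbox_lim],
        [-bbox_lim, bbox_lim]]

ds = yt.load(fname, unit_base=unit_base, bounding_box=bbox)

# Create a derived field, the metal density.
def _metal_density(field, data):
    density = data['PartType0', 'Density']
    Z = data['PartType0', 'metallicity']
    return density * Z

# Add it to the dataset.
ds.add_field(('PartType0', 'metal_density'), function=_metal_density,
             units="g/cm**3", particle_type=True)

# Add the corresponding smoothed field to the dataset.
# The 'Coordinates' and 'Masses' field names are assumed here; check them
# against the fields in your snapshot.
add_volume_weighted_smoothed_field('PartType0', 'Coordinates', 'Masses',
                                   'SmoothingLength', 'Density',
                                   'metal_density', ds.field_info)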

# Define the region where the disk galaxy is. (See the Gadget notebook for
# details. Here I make the box a little larger than needed to eliminate the
# margin effect.)
center = ds.arr([31996, 31474, 28970], "code_length")
box_size = ds.quan(250, "code_length")
left_edge = center - box_size/2*1.1
right_edge = center + box_size/2*1.1
box = ds.box(left_edge=left_edge, right_edge=right_edge)

# And make a projection plot!
yt.ProjectionPlot(ds, 'z',
                  ('deposit', 'PartType0_smoothed_metal_density'),
                  center=center, width=box_size, data_source=box).save()


## Time Series Analysis

This recipe shows how to calculate a number of quantities on a set of datasets. Note that it is parallel aware; if you only want to run in serial, the loop `for ds in ts:` works identically. See Parallel Computation With yt and Time Series Analysis for more information.

import yt
import matplotlib.pyplot as plt
import numpy as np

# Enable parallelism in the script (assuming it was called with
# mpirun -np <n_procs> )
yt.enable_parallelism()

# By using wildcards such as ? and * with the load command, we can load up a
# Time Series containing all of these datasets simultaneously.
# The pattern is a placeholder; adjust it to match your output names.
ts = yt.load("path/to/output_????")

storage = {}

# By using the piter() function, we can iterate on every dataset in
# the TimeSeries object.  By using the storage keyword, we can populate
# a dictionary where the dataset is the key, and sto.result is the value
# for later use when the loop is complete.

# The serial equivalent of piter() here is just "for ds in ts:" .

for store, ds in ts.piter(storage=storage):

    # Create a sphere of radius 100 kpc at the center of the dataset volume
    sphere = ds.sphere("c", (100., "kpc"))
    # Calculate the entropy within that sphere
    entr = sphere["entropy"].sum()
    # Store the current time and sphere entropy for this dataset in our
    # storage dictionary as a tuple
    store.result = (ds.current_time.in_units('Gyr'), entr)

# Convert the storage dictionary values to a Nx2 array, so they can be easily
# plotted
arr = np.array(list(storage.values()))

# Plot up the results: time versus entropy
plt.semilogy(arr[:,0], arr[:,1], 'r-')
plt.xlabel("Time (Gyr)")
plt.ylabel("Entropy (ergs/K)")
plt.savefig("time_versus_entropy.png")
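
Because the plot draws a line through the points in the order they appear, it can help to sort the stored results by time before splitting them into columns. The sketch below uses a made-up `storage` dictionary with hypothetical integer keys and plain floats in place of yt quantities.

```python
# Hypothetical results: key -> (time in Gyr, entropy), deliberately out of order.
storage = {2: (0.3, 5.0), 0: (0.1, 2.0), 1: (0.2, 3.0)}

# Sorting the (time, entropy) tuples orders them by time first.
rows = sorted(storage.values())

# Split the rows into two parallel columns for plotting.
times = [t for t, _ in rows]
entropies = [s for _, s in rows]

print(times)
print(entropies)
```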


## Simple Derived Fields

This recipe demonstrates how to create a simple derived field, thermal_energy_density, and then generate a projection from it. See Creating Derived Fields and Projection Plots for more information.

import yt

# Load the dataset. The path is a placeholder; substitute your own output.
ds = yt.load("path/to/dataset")

# You can create a derived field by manipulating any existing derived fields
# in any way you choose.  In this case, let's just make a simple one:
# thermal_energy_density = 3/2 nkT

# First create a function which yields your new derived field
def thermal_energy_dens(field, data):
    return (3.0/2.0)*data['gas', 'number_density'] * data['gas', 'kT']

# Then add it to your dataset and define the units
ds.add_field(("gas", "thermal_energy_density"), function=thermal_energy_dens,
             units="erg/cm**3", sampling_type="cell")

# It will now show up in your derived_field_list
for i in sorted(ds.derived_field_list):
    print(i)

# Let's use it to make a projection
yt.ProjectionPlot(ds, "x", "thermal_energy_density", weight_field="density", width=(200, 'kpc')).save()
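
As a sanity check on magnitudes, the formula 3/2 nkT can be evaluated by hand for a single cell. This sketch uses the Boltzmann constant in CGS units and made-up values for the number density and temperature; the function name is hypothetical.

```python
BOLTZMANN_ERG_PER_K = 1.380649e-16  # Boltzmann constant in erg/K


def thermal_energy_density(number_density, temperature):
    """Return 3/2 * n * k_B * T in erg/cm**3 (n in cm**-3, T in K)."""
    return 1.5 * number_density * BOLTZMANN_ERG_PER_K * temperature


# Hypothetical hot, diffuse gas: n = 1e-2 cm**-3, T = 1e7 K.
e_th = thermal_energy_density(1.0e-2, 1.0e7)
print("Thermal energy density: %0.3e erg/cm**3" % e_th)
```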


## Complicated Derived Fields

This recipe demonstrates how to use the add_gradient_fields() method to generate gradient fields and use them in a more complex derived field.

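import numpy as np
import yt

# Open a dataset from when there's a lot of sloshing going on.
# The path is a placeholder; substitute your own output.
ds = yt.load("path/to/dataset")

# Define the components of the gravitational acceleration vector field by
# taking the gradient of the gravitational potential
grad_fields = ds.add_gradient_fields(("gas", "gravitational_potential"))

# We don't need to do the same for the pressure field because yt already
# has pressure gradient fields. Now, define the "degree of hydrostatic
# equilibrium" field.

def _hse(field, data):
    # Remember that g is the negative of the potential gradient
    gx = -data["density"] * data["gravitational_potential_gradient_x"]
    gy = -data["density"] * data["gravitational_potential_gradient_y"]
    gz = -data["density"] * data["gravitational_potential_gradient_z"]
    hx = data["pressure_gradient_x"] - gx
    hy = data["pressure_gradient_y"] - gy
    hz = data["pressure_gradient_z"] - gz
    h = np.sqrt((hx * hx + hy * hy + hz * hz) / (gx * gx + gy * gy + gz * gz))
    return h

ds.add_field(('gas', 'HSE'), function=_hse, units="", take_log=False,
             sampling_type="cell",
             display_name='Hydrostatic Equilibrium')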

# The gradient operator requires periodic boundaries.  This dataset has
# open boundary conditions.  We need to hack it for now (this will be fixed
# in future version of yt)
ds.periodicity = (True, True, True)

# Take a slice through the center of the domain
slc = yt.SlicePlot(ds, 2, ["density", "HSE"], width=(1, 'Mpc'))

slc.save("hse")
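
The quantity plotted here compares the residual force, pressure gradient minus gravitational force density, with the gravitational force density itself. A value of 0 means exact hydrostatic equilibrium. The sketch below evaluates that ratio for a single point in plain Python; the function name and the sample numbers are invented for illustration.

```python
import math


def hse_ratio(grad_pressure, density, grad_potential):
    """Return |grad P - rho g| / |rho g| with g = -grad(phi)."""
    # Gravitational force density, rho * g, component by component.
    rho_g = [-density * dphi for dphi in grad_potential]
    # Residual between the pressure gradient and the gravitational force density.
    residual = [dp - g for dp, g in zip(grad_pressure, rho_g)]
    norm = lambda vec: math.sqrt(sum(x * x for x in vec))
    return norm(residual) / norm(rho_g)


density = 2.0
grad_potential = (0.0, 0.0, 3.0)  # so rho * g = (0, 0, -6)

balanced = hse_ratio((0.0, 0.0, -6.0), density, grad_potential)
unbalanced = hse_ratio((0.0, 0.0, -3.0), density, grad_potential)
print(balanced, unbalanced)
```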


## Using Particle Filters to Calculate Star Formation Rates

This recipe demonstrates how to use a particle filter to calculate the star formation rate in a galaxy evolution simulation. See Filtering Particle Fields for more information.

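import yt
import numpy as np
from matplotlib import pyplot as plt

# Select only particles with a positive creation time, i.e. formed stars
def formed_star(pfilter, data):
    filter = data["all", "creation_time"] > 0
    return filter

yt.add_particle_filter("formed_star", function=formed_star, filtered_type='all',
                       requires=["creation_time"])

filename = "IsolatedGalaxy/galaxy0030/galaxy0030"

ds = yt.load(filename)
ds.add_particle_filter('formed_star')
ad = ds.all_data()
masses = ad['formed_star', 'particle_mass'].in_units('Msun')
formation_time = ad['formed_star', 'creation_time'].in_units('yr')

time_range = [0, 5e8] # years
n_bins = 1000
hist, bins = np.histogram(formation_time, bins=n_bins, range=time_range,)
inds = np.digitize(formation_time, bins=bins)
time = (bins[:-1] + bins[1:])/2

# Mass formed in each time bin divided by the bin width
sfr = np.array([masses[inds == j+1].sum()/(bins[j+1]-bins[j])
                for j in range(len(time))])
sfr[sfr == 0] = np.nan

plt.plot(time/1e6, sfr)
plt.xlabel('Time  [Myr]')
plt.ylabel(r'SFR  [M$_\odot$ yr$^{-1}$]')
plt.savefig("filter_sfr.png")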
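
The binning step amounts to summing the stellar mass formed in each time interval and dividing by the interval's width. The sketch below does this in plain Python with half-open bins; the function name and the sample star particles are hypothetical, and the bin-edge handling differs slightly from `np.histogram`, which includes the right edge of the last bin.

```python
def star_formation_rate(formation_times, masses, t_min, t_max, n_bins):
    """Return (bin centers, mass formed per unit time) over [t_min, t_max)."""
    width = (t_max - t_min) / n_bins
    mass_per_bin = [0.0] * n_bins
    for t, m in zip(formation_times, masses):
        if t_min <= t < t_max:
            # Clamp the index to guard against floating-point round-off.
            index = min(int((t - t_min) / width), n_bins - 1)
            mass_per_bin[index] += m
    centers = [t_min + (i + 0.5) * width for i in range(n_bins)]
    rates = [m / width for m in mass_per_bin]
    return centers, rates


# Hypothetical star particles: formation times in yr, masses in Msun.
formation_times = [1.0e6, 2.0e6, 1.2e7]
star_masses = [100.0, 300.0, 500.0]

centers, rates = star_formation_rate(formation_times, star_masses, 0.0, 2.0e7, 2)
print(centers)
print(rates)
```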


## Making a Turbulent Kinetic Energy Power Spectrum

This recipe shows how to use yt to read data and put it on a uniform grid to interface with the NumPy FFT routines and create a turbulent kinetic energy power spectrum. (Note: the dataset used here is of low resolution, so the turbulence is not very well-developed. The spike at high wavenumbers is due to non-periodicity in the z-direction).

import numpy as np
import matplotlib.pyplot as plt
import yt

"""
Make a turbulent KE power spectrum.  Since we are stratified, we use
a rho**(1/3) scaling to the velocity to get something that would
look Kolmogorov (if the turbulence were fully developed).

Ultimately, we aim to compute:

    E(k) = integral (1/2) Vhat(k) . Vhat*(k) dS

where V = rho**n U is the density-weighted velocity field, Vhat is the
FFT of V, and Vhat* is its complex conjugate.

(Note: sometimes we normalize by 1/volume to get a spectral
energy density spectrum).

"""

def doit(ds):

    # An FFT operates on uniformly gridded data.  We'll use the yt
    # covering grid for this.

    max_level = ds.index.max_level

    # np.prod replaces np.product, which newer NumPy releases no longer provide
    ref = int(np.prod(ds.ref_factors[0:max_level]))

    low = ds.domain_left_edge
    dims = ds.domain_dimensions*ref

    nx, ny, nz = dims

    nindex_rho = 1./3.

    Kk = np.zeros( (nx//2+1, ny//2+1, nz//2+1))

    for vel in [("gas", "velocity_x"), ("gas", "velocity_y"),
                ("gas", "velocity_z")]:

        Kk += 0.5*fft_comp(ds, ("gas", "density"), vel,
                           nindex_rho, max_level, low, dims)

    # wavenumbers
    L = (ds.domain_right_edge - ds.domain_left_edge).d

    kx = np.fft.rfftfreq(nx)*nx/L[0]
    ky = np.fft.rfftfreq(ny)*ny/L[1]
    kz = np.fft.rfftfreq(nz)*nz/L[2]

    # physical limits to the wavenumbers
    kmin = np.min(1.0/L)
    kmax = np.min(0.5*dims/L)

    kbins = np.arange(kmin, kmax, kmin)
    N = len(kbins)

    # bin the Fourier KE into radial kbins
    kx3d, ky3d, kz3d = np.meshgrid(kx, ky, kz, indexing="ij")
    k = np.sqrt(kx3d**2 + ky3d**2 + kz3d**2)

    whichbin = np.digitize(k.flat, kbins)
    ncount = np.bincount(whichbin)

    E_spectrum = np.zeros(len(ncount)-1)

    for n in range(1,len(ncount)):
        E_spectrum[n-1] = np.sum(Kk.flat[whichbin==n])

    k = 0.5*(kbins[0:N-1] + kbins[1:N])
    E_spectrum = E_spectrum[1:N]

    index = np.argmax(E_spectrum)
    kmax = k[index]
    Emax = E_spectrum[index]

    plt.loglog(k, E_spectrum)
    # overplot a Kolmogorov k**(-5/3) slope through the peak for reference
    plt.loglog(k, Emax*(k/kmax)**(-5./3.), ls=":", color="0.5")

    plt.xlabel(r"$k$")
    plt.ylabel(r"$E(k)dk$")

    plt.savefig("spectrum.png")

def fft_comp(ds, irho, iu, nindex_rho, level, low, delta ):

    cube = ds.covering_grid(level, left_edge=low,
                            dims=delta,
                            fields=[irho, iu])

    rho = cube[irho].d
    u = cube[iu].d

    nx, ny, nz = rho.shape

    # do the FFTs -- note that since our data is real, there will be
    # too much information here.  fftn puts the positive freq terms in
    # the first half of the axes -- that's what we keep.  Our
    # normalization has an '8' to account for this clipping to one
    # octant.
    ru = np.fft.fftn(rho**nindex_rho * u)[0:nx//2+1,0:ny//2+1,0:nz//2+1]
    ru = 8.0*ru/(nx*ny*nz)

    return np.abs(ru)**2

# Load the dataset and build the spectrum.
# The path is a placeholder; substitute your own output.
ds = yt.load("path/to/dataset")
doit(ds)
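
The radial binning step collapses the three-dimensional Fourier power onto shells of constant |k|. The sketch below illustrates that idea in plain Python on a handful of made-up modes. The function name, the sample modes, and the bin edges are all hypothetical, and the recipe itself does this with `np.digitize` rather than `bisect`.

```python
import math
from bisect import bisect_right


def bin_by_wavenumber(k_vectors, power, edges):
    """Sum power over shells edges[i] <= |k| < edges[i+1]."""
    spectrum = [0.0] * (len(edges) - 1)
    for k_vec, p in zip(k_vectors, power):
        k_mag = math.sqrt(sum(c * c for c in k_vec))
        i = bisect_right(edges, k_mag) - 1
        # Modes below the first edge or at/above the last edge are dropped.
        if 0 <= i < len(spectrum):
            spectrum[i] += p
    return spectrum


# Hypothetical Fourier modes (kx, ky, kz) and their power.
k_vectors = [(1, 0, 0), (0, 1, 0), (1, 1, 0), (2, 0, 0), (0, 0, 0)]
power = [2.0, 3.0, 1.0, 4.0, 9.0]
edges = [1.0, 2.0, 3.0]

spectrum = bin_by_wavenumber(k_vectors, power, edges)
print(spectrum)
```

The k = 0 mode, which carries the mean of the field, falls below the first edge and is excluded, much as the recipe discards the lowest bin before plotting.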