Symbolic Units¶
This section describes yt’s symbolic unit capabilities. This is provided as quick introduction for those who are already familiar with yt but want to learn more about the unit system. Please see General Data Analysis and Visualizing Data for more detail about querying, analyzing, and visualizing data in yt.
Originally the unit system was a part of yt proper but since the yt 4.0 release,
the unit system has been split off into its own library, unyt
.
For a detailed discussion of how to use unyt
, we suggest taking a look at
the unyt documentation available at https://unyt.readthedocs.io/, however yt
adds additional capabilities above and beyond what is provided by unyt
alone, we describe those capabilities below.
Selecting data from a data object¶
The data returned by yt will have units attached to it. For example, let’s query
a data object for the ('gas', 'density')
field:
>>> import yt
>>> ds = yt.load('IsolatedGalaxy/galaxy0030/galaxy0030')
>>> dd = ds.all_data()
>>> dd['gas', 'density']
unyt_array([4.92775113e-31, 4.94005233e-31, 4.93824694e-31, ...,
1.12879234e-25, 1.59561490e-25, 1.09824903e-24], 'g/cm**3')
We can see how we get back a unyt_array
instance. A unyt_array
is a
subclass of NumPy’s NDarray type that has units attached to it:
>>> dd['gas', 'density'].units
g/cm**3
It is straightforward to convert data to different units:
>>> dd['gas', 'density'].to('Msun/kpc**3')
unyt_array([7.28103608e+00, 7.29921182e+00, 7.29654424e+00, ...,
1.66785569e+06, 2.35761291e+06, 1.62272618e+07], 'Msun/kpc**3')
For more details about working with unyt_array
, see the the documentation for unyt
.
Applying Units to Data¶
A unyt_array
can be created from a list, tuple, or NumPy array using
multiplication with a Unit
object. For convenience, each yt dataset has a
units
attribute one can use to obtain unit objects for this purpose:
>>> data = np.random.random((100, 100))
>>> data_with_units = data * ds.units.gram
All units known to the dataset will be available via ds.units
, including
code units and comoving units.
Derived Field Units¶
Special care often needs to be taken to ensure the result of a derived field will come out in the correct units. The yt unit system will double-check for you to make sure you are not accidentally making a unit conversion mistake. To see what that means in practice, let’s define a derived field corresponding to the square root of the gas density:
>>> import yt
>>> import numpy as np
>>> def root_density(field, data):
... return np.sqrt(data['gas', 'density'])
>>> ds = yt.load('IsolatedGalaxy/galaxy0030/galaxy0030')
>>> ds.add_field(("gas", "root_density"), units="(g/cm**3)**(1/2)",
... function=root_density, sampling_type='cell')
>>> ad = ds.all_data()
>>> ad['gas', 'root_density']
unyt_array([7.01979425e-16, 7.02855059e-16, 7.02726614e-16, ...,
3.35975050e-13, 3.99451486e-13, 1.04797377e-12], 'sqrt(g)/cm**(3/2)')
No special unit logic needs to happen inside of the function: the result of
np.sqrt
will have the correct units:
>>> np.sqrt(ad['gas', 'density'])
unyt_array([7.01979425e-16, 7.02855059e-16, 7.02726614e-16, ...,
3.35975050e-13, 3.99451486e-13, 1.04797377e-12], 'sqrt(g)/cm**(3/2)')
One could also specify any other units that have dimensions of square root of density and yt would automatically convert the return value of the field function to the specified units. An error would be raised if the units are not dimensionally equivalent to the return value of the field function.
Code Units¶
All yt datasets are associated with a “code” unit system that corresponds to
whatever unit system the data is represented in on-disk. Let’s take a look at
the data in an Enzo simulation, specifically the ("enzo", "Density")
field:
>>> import yt
>>> ds = yt.load('Enzo_64/DD0043/data0043')
>>> ad = ds.all_data()
>>> ad["enzo", "Density"]
unyt_array([6.74992726e-02, 6.12111635e-02, 8.92988636e-02, ...,
9.09875931e+01, 5.66932465e+01, 4.27780263e+01], 'code_mass/code_length**3')
we see we get back data from yt in units of code_mass/code_length**3
. This
is the density unit formed out of the base units of mass and length in the
internal unit system in the simulation. We can see the values of these units by
looking at the length_unit
and mass_unit
attributes of the dataset
object:
>>> ds.length_unit
unyt_quantity(128, 'Mpccm/h')
>>> ds.mass_unit
unyt_quantity(4.89045159e+50, 'g')
And we can see that both of these have values of 1 in the code unit system.
>>> ds.length_unit.to('code_length')
unyt_quantity(1., 'code_length')
>>> ds.mass_unit.to('code_mass')
unyt_quantity(1., 'code_mass')
In addition to length_unit
and mass_unit
, there are also time_unit
,
velocity_unit
, and magnetic_unit
attributes for this dataset. Some
frontends also define a density_unit
, pressure_unit
,
temperature_unit
, and specific_energy
attribute. If these are not defined
then the corresponding unit is calculated from the base length, mass, and time unit.
Each of these attributes corresponds to a unit in the code unit system:
>>> [un for un in dir(ds.units) if un.startswith('code')]
['code_density',
'code_length',
'code_magnetic',
'code_mass',
'code_metallicity',
'code_pressure',
'code_specific_energy',
'code_temperature',
'code_time',
'code_velocity']
You can use these unit names to convert arbitrary data into a dataset’s code unit system:
>>> u = ds.units
>>> data = 10**-30 * u.g / u.cm**3
>>> data.to('code_density')
unyt_quantity(0.36217187, 'code_density')
Note how in this example we used ds.units
instead of the top-level unyt
namespace or yt.units
. This is because the units from ds.units
know
about the dataset’s code unit system and can convert data into it. Unit objects
from unyt
or yt.units
will not know about any particular dataset’s unit
system.
Comoving units for Cosmological Simulations¶
The length unit of the dataset I used above uses a cosmological unit:
>>> print(ds.length_unit)
128 Mpccm/h
In English, this says that the length unit is 128 megaparsecs in the comoving frame, scaled as if the hubble constant were 100 km/s/Mpc. Although \(h\) isn’t really a unit, yt treats it as one for the purposes of the unit system.
As an aside, Darren Croton’s research note on the history, use, and interpretation of \(h\) as it appears in the astronomical literature is pretty much required reading for anyone who has to deal with factors of \(h\) every now and then.
In yt, comoving length unit symbols are named following the pattern < length
unit >cm
, i.e. pccm
for comoving parsec or mcm
for a comoving
meter. A comoving length unit is different from the normal length unit by a
factor of \((1+z)\):
>>> u = ds.units
>>> print((1*u.Mpccm)/(1*u.Mpc))
0.9986088499304777 dimensionless
>>> 1 / (1 + ds.current_redshift)
0.9986088499304776
As we saw before, h is treated like any other unit symbol. It has dimensionless units, just like a scalar:
>>> (1*u.Mpc)/(1*u.Mpc/u.h)
unyt_quantity(0.71, '(dimensionless)')
>>> ds.hubble_constant
0.71
Using parsec as an example,
pc
Proper parsecs, \(\rm{pc}\).
pccm
Comoving parsecs, \(\rm{pc}/(1+z)\).
pccm/h
Comoving parsecs normalized by the scaled hubble constant, \(\rm{pc}/h/(1+z)\).
pc/h
Proper parsecs, normalized by the scaled hubble constant, \(\rm{pc}/h\).
Overriding Code Unit Definitions¶
On occasion, you might have a dataset for a supported frontend that does not
have the conversions to code units accessible or you may want to change them
outright. yt
provides a mechanism so that one may provide their own code
unit definitions to yt.load
, which override the default rules for a given
frontend for defining code units.
This is provided through the units_override
argument to yt.load
. We’ll
use an example of an Athena dataset. First, a call to yt.load
without
units_override
:
>>> ds = yt.load("MHDSloshing/virgo_low_res.0054.vtk")
>>> ds.length_unit
unyt_quantity(1., 'cm')
>>> ds.mass_unit
unyt_quantity(1., 'g')
>>> ds.time_unit
unyt_quantity(1., 's')
>>> sp1 = ds1.sphere("c", (0.1, "unitary"))
>>> print(sp1["gas", "density"])
[0.05134981 0.05134912 0.05109047 ... 0.14608461 0.14489453 0.14385277] g/cm**3
This particular simulation is of a galaxy cluster merger so these density values are way, way too high. This is happening because Athena does not encode any information about the unit system used in the simulation or the output data, so yt cannot infer that information and must make an educated guess. In this case it incorrectly assumes the data are in CGS units.
However, we know a priori what the unit system should be, and we can supply
a units_override
dictionary to yt.load
to override the incorrect
assumptions yt is making about this dataset. Let’s define:
>>> units_override = {"length_unit": (1.0, "Mpc"),
... "time_unit": (1.0, "Myr"),
... "mass_unit": (1.0e14, "Msun")}
The units_override
dictionary can take the following keys:
length_unit
time_unit
mass_unit
magnetic_unit
temperature_unit
and the associated values can be (value, "unit")
tuples, unyt_quantity
instances, or floats (in the latter case they are assumed to have the
corresponding cgs unit). Now let’s reload the dataset using our
units_override
dict:
>>> ds = yt.load("MHDSloshing/virgo_low_res.0054.vtk",
... units_override=units_override)
>>> sp = ds.sphere("c",(0.1,"unitary"))
>>> print(sp["gas", "density"])
[3.47531683e-28 3.47527018e-28 3.45776515e-28 ... 9.88689766e-28
9.80635384e-28 9.73584863e-28] g/cm**3
and we see how the data now have much more sensible values for a galaxy cluster merge simulation.
Comparing Units From Different Simulations¶
The code units from different simulations will have different conversions to physical coordinates. This can get confusing when working with data from more than one simulation or from a single simulation where the units change with time.
As an example, let’s load up two enzo datasets from different redshifts in the same cosmology simulation, one from high redshift:
>>> ds1 = yt.load('Enzo_64/DD0002/data0002')
>>> ds1.current_redshift
7.8843748886903
>>> ds1.length_unit
unyt_quantity(128, 'Mpccm/h')
>>> ds1.length_unit.in_cgs()
unyt_quantity(6.26145538e+25, 'cm')
And another from low redshift:
>>> ds2 = yt.load('Enzo_64/DD0043/data0043')
>>> ds2.current_redshift
0.0013930880640796
>>> ds2.length_unit
unyt_quantity(128, 'Mpccm/h')
>>> ds2.length_unit.in_cgs()
unyt_quantity(5.55517285e+26, 'cm')
Now despite the fact that 'Mpccm/h'
means different things for the two
datasets, it’s still a well-defined operation to take the ratio of the two
length units:
>>> ds2.length_unit / ds1.length_unit
unyt_quantity(8.87201539, '(dimensionless)')
Because code units and comoving units are defined relative to a physical unit
system, unyt
is able to give the correct answer here. So long as the result
comes out dimensionless or in a physical unit then the answer will be
well-defined. However, if we want the answer to come out in the internal units
of one particular dataset, additional care must be taken. For an example where
this might be an issue, let’s try to compute the sum of two comoving distances
from each simulation:
>>> d1 = 12 * ds1.units.Mpccm
>>> d2 = 12 * ds2.units.Mpccm
>>> d1 + d2
unyt_quantity(118.46418468, 'Mpccm')
>>> d2 + d1
unyt_quantity(13.35256754, 'Mpccm')
So this is definitely weird - addition appears to not be associative anymore!
However, both answers are correct, the confusion is arising because "Mpccm"
is ambiguous in these expressions. In situations like this, unyt
will use
the definition for units from the leftmost term in an expression, so the first
example is returning data in high-redshift comoving megaparsecs, while the
second example returns data in low-redshift comoving megaparsecs.
Wherever possible it’s best to do calculations in physical units when working with more than one dataset. If you need to use comoving units or code units then extra care must be taken in your code to avoid ambiguity.