Fields and Unit Conversion

Notebook

In the past, querying a data object with a field name returned a NumPy ndarray . In the new unit system, data object queries will return a YTArray, a subclass of ndarray that preserves all of the nice properties of ndarray, including broadcasting, deep and shallow copies, and views.

Selecting data from an object

YTArray is 'unit-aware'. Let's show how this works in practice using a sample Enzo dataset:

In [1]:
import yt
ds = yt.load('IsolatedGalaxy/galaxy0030/galaxy0030')

dd = ds.all_data()
maxval, maxloc = ds.find_max('density')

dens = dd['density']
In [2]:
print (maxval)
7.73426503924e-24 g/cm**3
In [3]:
print (dens)
[  4.92775113e-31   4.94005233e-31   4.93824694e-31 ...,   1.12879234e-25
   1.59561490e-25   1.09824903e-24] g/cm**3
In [4]:
mass = dd['cell_mass']

print ("Cell Masses in CGS: \n", mass, "\n")
print ("Cell Masses in MKS: \n", mass.in_mks(), "\n")
print ("Cell Masses in Solar Masses: \n", mass.in_units('Msun'), "\n")
print ("Cell Masses in code units: \n", mass.in_units('code_mass'), "\n")
Cell Masses in CGS: 
 [  4.41963696e+38   4.43066975e+38   4.42905051e+38 ...,   6.03437074e+36
   8.52994081e+36   5.87109032e+37] g 

Cell Masses in MKS: 
 [  4.41963696e+35   4.43066975e+35   4.42905051e+35 ...,   6.03437074e+33
   8.52994081e+33   5.87109032e+34] kg 

Cell Masses in Solar Masses: 
 [ 222269.24695249  222824.1002147   222742.66680395 ...,    3034.76292816
    4289.817327     29526.47097354] Msun 

Cell Masses in code units: 
 [  5.44645036e-06   5.46004640e-06   5.45805096e-06 ...,   7.43633493e-08
   1.05117003e-07   7.23511960e-07] code_mass 

In [5]:
dx = dd['dx']
print ("Cell dx in code units: \n", dx, "\n")
print ("Cell dx in centimeters: \n", dx.in_cgs(), "\n")
print ("Cell dx in meters: \n", dx.in_units('m'), "\n")
print ("Cell dx in megaparsecs: \n", dx.in_units('Mpc'), "\n")
Cell dx in code units: 
 [ 0.03125     0.03125     0.03125    ...,  0.00012207  0.00012207
  0.00012207] code_length 

Cell dx in centimeters: 
 [  9.64375000e+22   9.64375000e+22   9.64375000e+22 ...,   3.76708984e+20
   3.76708984e+20   3.76708984e+20] cm 

Cell dx in meters: 
 [  9.64375000e+20   9.64375000e+20   9.64375000e+20 ...,   3.76708984e+18
   3.76708984e+18   3.76708984e+18] m 

Cell dx in megaparsecs: 
 [ 0.03125327  0.03125327  0.03125327 ...,  0.00012208  0.00012208
  0.00012208] Mpc 

Unit conversions

YTArray defines several user-visible member functions that allow data to be converted from one unit system to another:

  • in_units
  • in_cgs
  • in_mks
  • in_base
  • convert_to_units
  • convert_to_cgs
  • convert_to_mks
  • convert_to_base

The first method, in_units, returns a copy of the array in the units denoted by a string argument:

In [6]:
print (dd['density'].in_units('Msun/pc**3'))
[  7.28103608e-09   7.29921182e-09   7.29654424e-09 ...,   1.66785569e-03
   2.35761291e-03   1.62272618e-02] Msun/pc**3

in_cgs and in_mks return a copy of the array converted to CGS and MKS units, respectively:

In [7]:
print (dd['pressure'])
print (dd['pressure'].in_cgs())
print (dd['pressure'].in_mks())
[  6.45004535e-19   6.47838014e-19   6.47258644e-19 ...,   1.32223013e-13
   1.71428000e-13   7.90189327e-13] dyne/cm**2
[  6.45004535e-19   6.47838014e-19   6.47258644e-19 ...,   1.32223013e-13
   1.71428000e-13   7.90189327e-13] g/(cm*s**2)
[  6.45004535e-20   6.47838014e-20   6.47258644e-20 ...,   1.32223013e-14
   1.71428000e-14   7.90189327e-14] kg/(m*s**2)

in_cgs and in_mks are just special cases of the more general in_base, which can convert a YTArray to a number of different unit systems:

In [8]:
print (dd['pressure'].in_base('imperial')) # Imperial/English base units
print (dd['pressure'].in_base('galactic')) # Base units of kpc, Msun, Myr
print (dd['pressure'].in_base('planck')) # Base units in the Planck system
print (dd['pressure'].in_base()) # defaults to cgs if no argument given
[  4.33423036e-20   4.35327046e-20   4.34937728e-20 ...,   8.88497628e-15
   1.15194297e-14   5.30982712e-14] lbm/(ft*s**2)
[  9.96813770e-04   1.00119273e-03   1.00029735e-03 ...,   2.04342316e+02
   2.64931147e+02   1.22118770e+03] Msun/(Myr**2*kpc)
[  1.39202025e-133   1.39813533e-133   1.39688496e-133 ...,
   2.85357857e-128   3.69968324e-128   1.70535164e-127] m_pl/(l_pl*t_pl**2)
[  6.45004535e-19   6.47838014e-19   6.47258644e-19 ...,   1.32223013e-13
   1.71428000e-13   7.90189327e-13] g/(cm*s**2)

in_base can even take a dataset as the argument to convert the YTArray into the base units of the dataset:

In [9]:
print (dd['pressure'].in_base(ds)) # The IsolatedGalaxy dataset from above
[  1.33105548e-01   1.33690275e-01   1.33570714e-01 ...,   2.72860355e+04
   3.53765232e+04   1.63066425e+05] code_mass/(code_length*code_time**2)

yt defines a number of unit systems, and new unit systems may be added by the user, which can also be passed to in_base. To learn more about the unit systems, how to use them with datasets and other objects, and how to add new ones, see Unit Systems.

The rest of the methods do in-place conversions:

In [10]:
dens = dd['density']
print (dens)

dens.convert_to_units('Msun/pc**3')
print (dens)
[  4.92775113e-31   4.94005233e-31   4.93824694e-31 ...,   1.12879234e-25
   1.59561490e-25   1.09824903e-24] g/cm**3
[  7.28103608e-09   7.29921182e-09   7.29654424e-09 ...,   1.66785569e-03
   2.35761291e-03   1.62272618e-02] Msun/pc**3

One possibly confusing wrinkle when using in-place conversions is if you try to query dd['density'] again, you'll find that it has been converted to solar masses per cubic parsec:

In [11]:
print (dd['density'])

dens.convert_to_units('g/cm**3')

print (dens)
[  7.28103608e-09   7.29921182e-09   7.29654424e-09 ...,   1.66785569e-03
   2.35761291e-03   1.62272618e-02] Msun/pc**3
[  4.92775113e-31   4.94005233e-31   4.93824694e-31 ...,   1.12879234e-25
   1.59561490e-25   1.09824903e-24] g/cm**3

Since the unit metadata is preserved and the array values are still correct in the new unit system, all numerical operations will still be correct.

One of the nicest aspects of this new unit system is that the symbolic algebra for mathematical operations on data with units is performed automatically by sympy. This example shows how we can construct a field with density units from two other fields that have units of mass and volume:

In [12]:
print (dd['cell_mass'])
print (dd['cell_volume'].in_units('cm**3'))

print ((dd['cell_mass']/dd['cell_volume']).in_cgs())
[  4.41963696e+38   4.43066975e+38   4.42905051e+38 ...,   6.03437074e+36
   8.52994081e+36   5.87109032e+37] g
[  8.96887209e+68   8.96887209e+68   8.96887209e+68 ...,   5.34586435e+61
   5.34586435e+61   5.34586435e+61] cm**3
[  4.92775113e-31   4.94005233e-31   4.93824694e-31 ...,   1.12879234e-25
   1.59561490e-25   1.09824903e-24] g/cm**3

Electrostatic/Electromagnetic Units

Electromagnetic units can be a bit tricky, because the units for such quantities in different unit systems can have entirely different dimensions, even if they are meant to represent the same physical quantities. For example, in the SI system of units, current in Amperes is a fundamental unit of measure, so the unit of charge "coulomb" is equal to one ampere-second. On the other hand, in the Gaussian/CGS system, there is no equivalent base electromagnetic unit, and the electrostatic charge unit "esu" is equal to one $\mathrm{cm^{3/2}g^{-1/2}s^{-1}}$ (which does not have any apparent physical significance). yt recognizes this difference:

In [13]:
q1 = yt.YTArray(1.0,"C") # coulombs
q2 = yt.YTArray(1.0,"esu") # electrostatic units / statcoulomb

print ("units =", q1.in_mks().units, ", dims =", q1.units.dimensions)
print ("units =", q2.in_cgs().units, ", dims =", q2.units.dimensions)
units = A*s , dims = (current_mks)*(time)
units = cm**(3/2)*sqrt(g)/s , dims = (length)**(3/2)*sqrt((mass))/(time)
In [14]:
B1 = yt.YTArray(1.0,"T") # tesla
B2 = yt.YTArray(1.0,"gauss") # gauss

print ("units =", B1.in_mks().units, ", dims =", B1.units.dimensions)
print ("units =", B2.in_cgs().units, ", dims =", B2.units.dimensions)
units = kg/(A*s**2) , dims = (mass)/((current_mks)*(time)**2)
units = sqrt(g)/(sqrt(cm)*s) , dims = sqrt((mass))/(sqrt((length))*(time))

To convert between these two systems, use Unit Equivalencies.

Working with views and converting to ndarray

There are two ways to convert the data into a numpy array. The most straightforward and safe way to do this is to create a copy of the array data. The following cell demonstrates four equivalent ways of doing this, in increasing degree of terseness.

In [15]:
import numpy as np

dens = dd['cell_mass']

print (dens.to_ndarray())
print (np.array(dens))
print (dens.value)
print (dens.v)
[  4.41963696e+38   4.43066975e+38   4.42905051e+38 ...,   6.03437074e+36
   8.52994081e+36   5.87109032e+37]
[  4.41963696e+38   4.43066975e+38   4.42905051e+38 ...,   6.03437074e+36
   8.52994081e+36   5.87109032e+37]
[  4.41963696e+38   4.43066975e+38   4.42905051e+38 ...,   6.03437074e+36
   8.52994081e+36   5.87109032e+37]
[  4.41963696e+38   4.43066975e+38   4.42905051e+38 ...,   6.03437074e+36
   8.52994081e+36   5.87109032e+37]

Since we have a copy of the data, we can mess with it however we wish without disturbing the original data returned by the yt data object.

Another way to touch the raw array data is to get a view. A numpy view is a lightweight array interface to a memory buffer. There are four ways to create views of YTArray instances:

In [16]:
print (dd['cell_mass'].ndarray_view())
print (dd['cell_mass'].view(np.ndarray))
print (dd['cell_mass'].ndview)
print (dd['cell_mass'].d)
[  4.41963696e+38   4.43066975e+38   4.42905051e+38 ...,   6.03437074e+36
   8.52994081e+36   5.87109032e+37]
[  4.41963696e+38   4.43066975e+38   4.42905051e+38 ...,   6.03437074e+36
   8.52994081e+36   5.87109032e+37]
[  4.41963696e+38   4.43066975e+38   4.42905051e+38 ...,   6.03437074e+36
   8.52994081e+36   5.87109032e+37]
[  4.41963696e+38   4.43066975e+38   4.42905051e+38 ...,   6.03437074e+36
   8.52994081e+36   5.87109032e+37]

When working with views, rememeber that you are touching the raw array data and no longer have any of the unit checking provided by the unit system. This can be useful where it might be more straightforward to treat the array as if it didn't have units but without copying the data.

In [17]:
density_values = dd['density'].d
density_values[0:10] = 0

# The original array was updated
print (dd['density'])
[  0.00000000e+00   0.00000000e+00   0.00000000e+00 ...,   1.12879234e-25
   1.59561490e-25   1.09824903e-24] g/cm**3

Round-Trip Conversions to and from Other Unit Systems

Finally, a YTArray or YTQuantity may be converted to an AstroPy quantity, which is a NumPy array or a scalar associated with units from AstroPy's units system. You may use this facility if you have AstroPy installed.

Some examples of converting from AstroPy units to yt:

In [18]:
from astropy import units as u

x = 42.0 * u.meter
y = yt.YTQuantity.from_astropy(x)
/usr/lib64/python3.4/site-packages/IPython/kernel/__init__.py:13: ShimWarning: The `IPython.kernel` package has been deprecated. You should import from ipykernel or jupyter_client instead.
  "You should import from ipykernel or jupyter_client instead.", ShimWarning)
In [19]:
print (x, type(x))
print (y, type(y))
42.0 m <class 'astropy.units.quantity.Quantity'>
42.0 m <class 'yt.units.yt_array.YTQuantity'>
In [20]:
a = np.random.random(size=10) * u.km/u.s
b = yt.YTArray.from_astropy(a)
In [21]:
print (a, type(a))
print (b, type(b))
[ 0.77753381  0.06505615  0.61308168  0.42650944  0.3674397   0.62112214
  0.02316259  0.43257555  0.48916812  0.92512479] km / s <class 'astropy.units.quantity.Quantity'>
[ 0.77753381  0.06505615  0.61308168  0.42650944  0.3674397   0.62112214
  0.02316259  0.43257555  0.48916812  0.92512479] km/s <class 'yt.units.yt_array.YTArray'>

It also works the other way around, converting a YTArray or YTQuantity to an AstroPy quantity via the method to_astropy. For arrays:

In [22]:
temp = dd["temperature"]
atemp = temp.to_astropy()
In [23]:
print (temp, type(temp))
print (atemp, type(atemp))
[  9336.29589844   9353.96386719   9349.01464844 ...,  11824.51855469
  11588.16699219  10173.02148438] K <class 'yt.units.yt_array.YTArray'>
[  9336.29589844   9353.96386719   9349.01464844 ...,  11824.51855469
  11588.16699219  10173.02148438] K <class 'astropy.units.quantity.Quantity'>

and quantities:

In [24]:
from yt.units import kboltz
kb = kboltz.to_astropy()
In [25]:
print (kboltz, type(kboltz))
print (kb, type(kb))
1.3806488e-16 erg/K <class 'yt.units.yt_array.YTQuantity'>
1.3806488e-16 erg / K <class 'astropy.units.quantity.Quantity'>

As a sanity check, you can show that it works round-trip:

In [26]:
k1 = kboltz.to_astropy()
k2 = yt.YTQuantity.from_astropy(kb)
print (k1 == k2)
False
In [27]:
c = yt.YTArray.from_astropy(a)
d = c.to_astropy()
print (a == d)
[ True  True  True  True  True  True  True  True  True  True]

We can also do the same thing with unitful quantities from the Pint package, using essentially the same procedure:

In [28]:
from pint import UnitRegistry
ureg = UnitRegistry()
v = 1000.*ureg.km/ureg.s
w = yt.YTQuantity.from_pint(v)
In [29]:
print (v, type(v))
print (w, type(w))
1000.0 kilometer / second <class 'pint.unit.build_quantity_class.<locals>.Quantity'>
1000.0 km/s <class 'yt.units.yt_array.YTQuantity'>
In [30]:
ptemp = temp.to_pint()
In [31]:
print (temp, type(temp))
print (ptemp, type(ptemp))
[  9336.29589844   9353.96386719   9349.01464844 ...,  11824.51855469
  11588.16699219  10173.02148438] K <class 'yt.units.yt_array.YTArray'>
[  9336.29589844   9353.96386719   9349.01464844 ...,  11824.51855469
  11588.16699219  10173.02148438] kelvin <class 'pint.unit.build_quantity_class.<locals>.Quantity'>

(2)_Fields_and_unit_conversion.ipynb; 2)_Fields_and_unit_conversion_evaluated.ipynb; 2)_Fields_and_unit_conversion.py)

Derived Fields

The following example creates a derived field for the square root of the cell volume.

Notebook
In [1]:
import yt
import numpy as np

# Function defining the derived field
def root_cell_volume(field, data):
   return np.sqrt(data['cell_volume'])

# Load the dataset
ds = yt.load('HiresIsolatedGalaxy/DD0044/DD0044')

# Add the field to the dataset, linking to the derived field function and
# units of the field
ds.add_field(("gas", "root_cell_volume"), units="cm**(3/2)", function=root_cell_volume)

# Access the derived field like any other field
ad = ds.all_data()
ad['root_cell_volume']
Out[1]:
YTArray([  5.61663061e+33,   5.61663061e+33,   5.61663061e+33, ...,
         7.02078826e+32,   7.02078826e+32,   7.02078826e+32]) cm**(3/2)

No special unit logic needs to happen inside of the function - np.sqrt will convert the units of the density field appropriately:

Notebook
In [1]:
import yt
import numpy as np

ds = yt.load('HiresIsolatedGalaxy/DD0044/DD0044')
ad = ds.all_data()

print(ad['cell_volume'].in_cgs())
print(np.sqrt(ad['cell_volume'].in_cgs()))
[  3.15465394e+67   3.15465394e+67   3.15465394e+67 ...,   4.92914678e+65
   4.92914678e+65   4.92914678e+65] cm**3
[  5.61663061e+33   5.61663061e+33   5.61663061e+33 ...,   7.02078826e+32
   7.02078826e+32   7.02078826e+32] cm**(3/2)

That said, it is necessary to specify the units in the call to the add_field function. Not only does this ensure the returned units will be exactly what you expect, it also allows an in-place conversion of units, just in case the function returns a field with dimensionally equivalent units.

For example, let’s redo the above example but ask for units of Mpc**(3/2):

Notebook
In [1]:
import yt
import numpy as np

def root_cell_volume(field, data):
   return np.sqrt(data['cell_volume'])

ds = yt.load('HiresIsolatedGalaxy/DD0044/DD0044')

# Here we set the default units to Mpc^(3/2)
ds.add_field(("gas", "root_cell_volume"), units="Mpc**(3/2)", function=root_cell_volume)

ad = ds.all_data()
ad['root_cell_volume']
Out[1]:
YTArray([ 0.00103622,  0.00103622,  0.00103622, ...,  0.00012953,
        0.00012953,  0.00012953]) Mpc**(3/2)