Data selection

Learn yt workshop

Nathan Goldbaum

University of Illinois at Urbana Champaign

National Center of Supercomputing Applications

Units and unit conversions

* The yt.units namespace

* Creating arrays with units

* Stripping units

* Unit conversions

* Code units

* Unit systems

Slides adapted from Britton Smith's Python in Astronomy 2015 talk

Data on disk has no inherent physical meaning

yt lets you think about the data using a physically motivated interface

yt lets you think about the data using a physically motivated interface

In [1]:
import yt

#
# download from:
# https://hub.yt/data/goldbaum2016a/feedback_20pc/simulation_outputs/DD0600.tar.gz
# (3.5 GB)
# 
ds = yt.load('DD0600/DD0600')
yt : [INFO     ] 2016-10-09 13:37:24,758 Parameters: current_time              = 0.6
yt : [INFO     ] 2016-10-09 13:37:24,759 Parameters: domain_dimensions         = [64 64 64]
yt : [INFO     ] 2016-10-09 13:37:24,760 Parameters: domain_left_edge          = [ 0.  0.  0.]
yt : [INFO     ] 2016-10-09 13:37:24,761 Parameters: domain_right_edge         = [ 1.  1.  1.]
yt : [INFO     ] 2016-10-09 13:37:24,762 Parameters: cosmological_simulation   = 0.0

Allowing you to forget about what your data looks like as a file format

And select only the data you want to select

And only the data you want to select

Physical objects to select data are called "data objects"

In [2]:
from yt.units import kpc

sp = ds.sphere(ds.domain_center, 20*kpc)
Parsing Hierarchy : 100%|██████████| 2653/2653 [00:00<00:00, 14526.88it/s]
yt : [INFO     ] 2016-10-09 13:37:25,062 Gathering a field list (this may take a moment.)

yt natively deals with multiresolution data

Data from each selected cell is returned as a flat, 1D array, with unit metadata attached

In [3]:
print(sp['density'])
/usr/local/lib/python3.5/site-packages/yt/data_objects/grid_patch.py:190: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  startIndex[2]:endIndex[2]] = tofill
[  1.84507879e-27   1.78474718e-27   1.10993722e-28 ...,   2.91673478e-25
   1.61332254e-25   1.57167186e-25] g/cm**3

Data objects can be queried for many fields

In [4]:
print(sp['temperature'])
[  17707.57526156   17662.39859408  299983.90668785 ...,    3875.69765547
    4339.59251383    4443.71003037] K

Data objects can be queried for many fields

In [5]:
from pprint import pprint

# fields that are defined in the on-disk data file
pprint(ds.field_list)
[('all', 'creation_time'),
 ('all', 'dynamical_time'),
 ('all', 'metallicity_fraction'),
 ('all', 'particle_index'),
 ('all', 'particle_mass'),
 ('all', 'particle_position_x'),
 ('all', 'particle_position_y'),
 ('all', 'particle_position_z'),
 ('all', 'particle_type'),
 ('all', 'particle_velocity_x'),
 ('all', 'particle_velocity_y'),
 ('all', 'particle_velocity_z'),
 ('enzo', 'Cooling_Time'),
 ('enzo', 'Dark_Matter_Density'),
 ('enzo', 'Density'),
 ('enzo', 'GasEnergy'),
 ('enzo', 'Metal_Density'),
 ('enzo', 'Temperature'),
 ('enzo', 'TotalEnergy'),
 ('enzo', 'x-velocity'),
 ('enzo', 'y-velocity'),
 ('enzo', 'z-velocity'),
 ('io', 'creation_time'),
 ('io', 'dynamical_time'),
 ('io', 'metallicity_fraction'),
 ('io', 'particle_index'),
 ('io', 'particle_mass'),
 ('io', 'particle_position_x'),
 ('io', 'particle_position_y'),
 ('io', 'particle_position_z'),
 ('io', 'particle_type'),
 ('io', 'particle_velocity_x'),
 ('io', 'particle_velocity_y'),
 ('io', 'particle_velocity_z')]

Data objects can be queried for many fields

In [6]:
# fields that yt can calculate given the fields in ds.field_list
# (only showing gas fields to fit on one screen)
pprint([f for f in ds.derived_field_list if f[0]=='gas'])
[('gas', 'H_nuclei_density'),
 ('gas', 'He_nuclei_density'),
 ('gas', 'angular_momentum_magnitude'),
 ('gas', 'angular_momentum_x'),
 ('gas', 'angular_momentum_y'),
 ('gas', 'angular_momentum_z'),
 ('gas', 'averaged_density'),
 ('gas', 'baroclinic_vorticity_magnitude'),
 ('gas', 'baroclinic_vorticity_x'),
 ('gas', 'baroclinic_vorticity_y'),
 ('gas', 'baroclinic_vorticity_z'),
 ('gas', 'cell_mass'),
 ('gas', 'cell_volume'),
 ('gas', 'chandra_emissivity'),
 ('gas', 'cooling_time'),
 ('gas', 'courant_time_step'),
 ('gas', 'cutting_plane_velocity_x'),
 ('gas', 'cutting_plane_velocity_y'),
 ('gas', 'cutting_plane_velocity_z'),
 ('gas', 'cylindrical_radial_velocity'),
 ('gas', 'cylindrical_radial_velocity_absolute'),
 ('gas', 'cylindrical_tangential_velocity'),
 ('gas', 'cylindrical_tangential_velocity_absolute'),
 ('gas', 'dark_matter_density'),
 ('gas', 'density'),
 ('gas', 'density_gradient_magnitude'),
 ('gas', 'density_gradient_x'),
 ('gas', 'density_gradient_y'),
 ('gas', 'density_gradient_z'),
 ('gas', 'dx'),
 ('gas', 'dy'),
 ('gas', 'dynamical_time'),
 ('gas', 'dz'),
 ('gas', 'emission_measure'),
 ('gas', 'entropy'),
 ('gas', 'jeans_mass'),
 ('gas', 'kT'),
 ('gas', 'kinetic_energy'),
 ('gas', 'mach_number'),
 ('gas', 'matter_density'),
 ('gas', 'matter_mass'),
 ('gas', 'mazzotta_weighting'),
 ('gas', 'mean_molecular_weight'),
 ('gas', 'metal_density'),
 ('gas', 'metal_mass'),
 ('gas', 'metallicity'),
 ('gas', 'number_density'),
 ('gas', 'path_element_x'),
 ('gas', 'path_element_y'),
 ('gas', 'path_element_z'),
 ('gas', 'pressure'),
 ('gas', 'pressure_gradient_magnitude'),
 ('gas', 'pressure_gradient_x'),
 ('gas', 'pressure_gradient_y'),
 ('gas', 'pressure_gradient_z'),
 ('gas', 'radial_mach_number'),
 ('gas', 'radial_velocity'),
 ('gas', 'radial_velocity_absolute'),
 ('gas', 'shear'),
 ('gas', 'shear_criterion'),
 ('gas', 'shear_mach'),
 ('gas', 'sound_speed'),
 ('gas', 'specific_angular_momentum_magnitude'),
 ('gas', 'specific_angular_momentum_x'),
 ('gas', 'specific_angular_momentum_y'),
 ('gas', 'specific_angular_momentum_z'),
 ('gas', 'sz_kinetic'),
 ('gas', 'szy'),
 ('gas', 'tangential_over_velocity_magnitude'),
 ('gas', 'tangential_velocity'),
 ('gas', 'temperature'),
 ('gas', 'thermal_energy'),
 ('gas', 'velocity_cylindrical_radius'),
 ('gas', 'velocity_cylindrical_theta'),
 ('gas', 'velocity_cylindrical_z'),
 ('gas', 'velocity_divergence'),
 ('gas', 'velocity_divergence_absolute'),
 ('gas', 'velocity_magnitude'),
 ('gas', 'velocity_spherical_phi'),
 ('gas', 'velocity_spherical_radius'),
 ('gas', 'velocity_spherical_theta'),
 ('gas', 'velocity_x'),
 ('gas', 'velocity_y'),
 ('gas', 'velocity_z'),
 ('gas', 'vertex_x'),
 ('gas', 'vertex_y'),
 ('gas', 'vertex_z'),
 ('gas', 'vorticity_growth_magnitude'),
 ('gas', 'vorticity_growth_magnitude_absolute'),
 ('gas', 'vorticity_growth_timescale'),
 ('gas', 'vorticity_growth_x'),
 ('gas', 'vorticity_growth_y'),
 ('gas', 'vorticity_growth_z'),
 ('gas', 'vorticity_magnitude'),
 ('gas', 'vorticity_squared'),
 ('gas', 'vorticity_stretching_magnitude'),
 ('gas', 'vorticity_stretching_x'),
 ('gas', 'vorticity_stretching_y'),
 ('gas', 'vorticity_stretching_z'),
 ('gas', 'vorticity_x'),
 ('gas', 'vorticity_y'),
 ('gas', 'vorticity_z'),
 ('gas', 'x'),
 ('gas', 'xray_emissivity'),
 ('gas', 'y'),
 ('gas', 'z')]

Spatial information is not lost

In [7]:
sp['x']
Out[7]:
YTArray([ 0.48510742,  0.48510742,  0.48510742, ...,  0.49382782,
        0.49382782,  0.49382782]) code_length
In [8]:
sp['x'].to('kpc')
Out[8]:
YTArray([ 635.84010707,  635.84010707,  635.84010707, ...,  647.270109  ,
        647.270109  ,  647.270109  ]) kpc

Spatial information is not lost

In [9]:
import numpy as np

np.unique(sp['dx']).to('kpc')
Out[9]:
YTArray([ 0.02      ,  0.04000001,  0.08000001,  0.16000003,  0.32000005,
        0.64000011]) kpc

Data objects return fields as unit-aware numpy arrays

In [10]:
from yt.units import kboltz, mh

pressure = sp['density']*sp['temperature']*kboltz/(0.6*mh)

pressure.in_units('Pa')
Out[10]:
YTArray([  4.49178775e-16,   4.33382728e-16,   4.57763947e-16, ...,
         1.55414682e-14,   9.62532417e-15,   9.60180328e-15]) Pa
In [11]:
pressure.in_units('dyne/cm**2')
Out[11]:
YTArray([  4.49178775e-15,   4.33382728e-15,   4.57763947e-15, ...,
         1.55414682e-13,   9.62532417e-14,   9.60180328e-14]) dyne/cm**2

Derived quantities turn fields into single values

In [12]:
sp['cell_mass']
Out[12]:
YTArray([  1.42104128e+37,   1.37457513e+37,   8.54850542e+35, ...,
         6.85549442e+34,   3.79195384e+34,   3.69405809e+34]) g

Derived quantities turn fields into single values

\begin{equation} M = \sum_i m_i \end{equation}
In [13]:
sp.quantities.total_quantity('cell_mass')
Out[13]:
1.466891758119552e+43 g
In [14]:
sp.min('temperature')
Out[14]:
37.49965095080129 K
In [15]:
sp.max('temperature')
Out[15]:
22794011.739036545 K
In [16]:
sp.mean('temperature', weight='cell_mass')
Out[16]:
5616.2938135343575 K
In [17]:
sp.mean('temperature', weight='cell_volume')
Out[17]:
3586755.8584530945 K
In [18]:
from IPython.display import IFrame

IFrame("http://yt-project.org/docs/dev/analyzing/objects.html#available-derived-quantities", width=700, height=600)
Out[18]:

Create new fields by defining a python function

In [19]:
def my_entropy(field, data):
    return (kboltz * data['temperature'] / data['number_density']**(2/3))

ds.add_field('entropy', function=my_entropy, units="keV*cm**2")

In [20]:
sp['entropy']
Out[20]:
YTArray([  1.42992470e-01,   1.45824074e-01,   1.57784025e+01, ...,
         1.07049000e-03,   1.77882067e-03,   1.85353934e-03]) cm**2*keV

Data Containers

In [21]:
ad = ds.all_data()

box = ds.box(ds.domain_left_edge + 500*kpc, ds.domain_right_edge - 500*kpc)

sp = ds.sphere(ds.domain_center, 500*kpc)

disk = ds.disk(ds.domain_center, [0, 0, 1], 500*kpc, 100*kpc)

ray = ds.ray(ds.domain_left_edge, ds.domain_right_edge)
In [22]:
IFrame("http://yt-project.org/doc/analyzing/objects.html#available-objects", width=700, height=600)
Out[22]:

Particle filters

In [23]:
from yt.data_objects.particle_filters import add_particle_filter

def young_star(pfilter, data):
    filter = data.ds.current_time - data[pfilter.filtered_type, "creation_time"]
    filter.convert_to_units('Myr')
    return filter < 10

add_particle_filter("young_stars", function=young_star, filtered_type='io',
                    requires=["creation_time"])
In [24]:
ds.add_particle_filter('young_stars')

ad = ds.all_data()

print(ad['io', 'particle_mass'].shape)
print(ad['young_stars', 'particle_mass'].shape)
/usr/local/lib/python3.5/site-packages/yt/data_objects/grid_patch.py:190: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  startIndex[2]:endIndex[2]] = tofill
(26111208,)
(85493,)
This uses ParticlePlot, a new feature in yt 3.3
In [25]:
from yt import ParticlePlot

p = ParticlePlot(ds, 
                 ('young_stars', 'particle_position_x'), 
                 ('young_stars', 'particle_position_y'), 
                 ('young_stars', 'particle_mass'), 
                 width=(10, 'kpc'))

p.set_unit(('young_stars', 'particle_mass'), 'Msun')
yt : [INFO     ] 2016-10-09 13:38:25,054 xlim = 0.496185 0.503815
yt : [INFO     ] 2016-10-09 13:38:25,055 ylim = 0.496185 0.503815
yt : [INFO     ] 2016-10-09 13:38:25,057 xlim = 0.496185 0.503815
yt : [INFO     ] 2016-10-09 13:38:25,058 ylim = 0.496185 0.503815
yt : [INFO     ] 2016-10-09 13:38:25,072 Splatting (('young_stars', 'particle_mass')) onto a 800 by 800 mesh
/usr/local/lib/python3.5/site-packages/yt/data_objects/grid_patch.py:190: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  startIndex[2]:endIndex[2]] = tofill
Out[25]:

In [26]:
from yt import ProjectionPlot

prj = ProjectionPlot(ds, 2, ('gas', 'density'), width=(15, 'kpc'))
/usr/local/lib/python3.5/site-packages/yt/data_objects/grid_patch.py:190: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  startIndex[2]:endIndex[2]] = tofill
yt : [INFO     ] 2016-10-09 13:38:57,473 Projection completed
yt : [INFO     ] 2016-10-09 13:38:57,474 xlim = 0.494278 0.505722
yt : [INFO     ] 2016-10-09 13:38:57,475 ylim = 0.494278 0.505722
yt : [INFO     ] 2016-10-09 13:38:57,476 xlim = 0.494278 0.505722
yt : [INFO     ] 2016-10-09 13:38:57,477 ylim = 0.494278 0.505722
yt : [INFO     ] 2016-10-09 13:38:57,479 Making a fixed resolution buffer of (('gas', 'density')) 800 by 800
In [27]:
dd = ds.sphere(ds.domain_center, (10, 'kpc'))

prj = ProjectionPlot(ds, 2, 'density', data_source=dd, width=(20, 'kpc'))

prj.set_zlim('density', 1e-4, 1e0)
/usr/local/lib/python3.5/site-packages/yt/data_objects/grid_patch.py:190: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  startIndex[2]:endIndex[2]] = tofill
yt : [INFO     ] 2016-10-09 13:39:02,159 Projection completed
yt : [INFO     ] 2016-10-09 13:39:02,160 xlim = 0.492371 0.507629
yt : [INFO     ] 2016-10-09 13:39:02,160 ylim = 0.492371 0.507629
yt : [INFO     ] 2016-10-09 13:39:02,162 xlim = 0.492371 0.507629
yt : [INFO     ] 2016-10-09 13:39:02,163 ylim = 0.492371 0.507629
yt : [INFO     ] 2016-10-09 13:39:02,165 Making a fixed resolution buffer of (('gas', 'density')) 800 by 800
Out[27]:

In [28]:
ProjectionPlot(ds, 2, 'temperature', weight_field='density', width=(15, 'kpc'))
/usr/local/lib/python3.5/site-packages/yt/data_objects/grid_patch.py:190: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  startIndex[2]:endIndex[2]] = tofill
yt : [INFO     ] 2016-10-09 13:39:12,107 Projection completed
yt : [INFO     ] 2016-10-09 13:39:12,107 xlim = 0.494278 0.505722
yt : [INFO     ] 2016-10-09 13:39:12,108 ylim = 0.494278 0.505722
yt : [INFO     ] 2016-10-09 13:39:12,110 xlim = 0.494278 0.505722
yt : [INFO     ] 2016-10-09 13:39:12,110 ylim = 0.494278 0.505722
yt : [INFO     ] 2016-10-09 13:39:12,112 Making a fixed resolution buffer of (('gas', 'temperature')) 800 by 800
Out[28]:

In [29]:
import numexpr as ne
from yt.units import G
from numpy import pi

def _total_dynamical_time(field, data):
    dens = data['density']
    dmd = data['dark_matter_density']
    dens.convert_to_cgs()
    dmd.convert_to_cgs()
    tdyn = ne.evaluate("sqrt(3*pi/(32*G*(dens + dmd)))")
    tdyn = tdyn*yt.units.s
    return tdyn

def _sfr(field, data):
    dens = data['density']
    tdyn = data['total_dynamical_time']
    Mu = np.float64(data.ds.parameters['Mu'])
    dens.convert_to_cgs()
    tdyn.convert_to_cgs()
    sfr = ne.evaluate("where(dens*Mu/mh > 50, 0.01*dens/tdyn, 0)")
    sfr = sfr*(yt.units.g/yt.units.cm**3/yt.units.s)
    return sfr

ds.add_field('total_dynamical_time', _total_dynamical_time, units='s')
ds.add_field('sfr_density', _sfr, units='g/cm**3/s')
In [30]:
prj = yt.ProjectionPlot(ds, 2, 'sfr_density', width=(10, 'kpc'))

prj.set_unit('sfr_density', 'Msun/kpc**2/yr')

prj.set_zlim('sfr_density', 1e-1, 3e1)
yt : [INFO     ] 2016-10-09 13:39:23,437 Projection completed
yt : [INFO     ] 2016-10-09 13:39:23,438 xlim = 0.496185 0.503815
yt : [INFO     ] 2016-10-09 13:39:23,439 ylim = 0.496185 0.503815
yt : [INFO     ] 2016-10-09 13:39:23,440 xlim = 0.496185 0.503815
yt : [INFO     ] 2016-10-09 13:39:23,441 ylim = 0.496185 0.503815
yt : [INFO     ] 2016-10-09 13:39:23,443 Making a fixed resolution buffer of (('gas', 'sfr_density')) 800 by 800
Out[30]:

PhasePlot and ProfilePlot

In [31]:
from yt import PhasePlot

PhasePlot(ds.all_data(), 'density', 'temperature', 'cell_mass', 
          weight_field=None, fractional=True)
Out[31]:

In [33]:
max_val, max_loc = ds.find_max(('gas', 'density'))

print (max_val, max_loc)
yt : [INFO     ] 2016-10-09 13:39:54,991 Max Value is 3.20734e-21 at 0.5017318725585888 0.4996719360351562 0.5001907348632812
3.2073369118751583e-21 g/cm**3 [ 0.50173187  0.49967194  0.50019073] code_length
In [34]:
PhasePlot(ds.sphere(max_loc, (3, 'kpc')), 'density', 'temperature', 'cell_mass', 
          weight_field=None, fractional=True)
/usr/local/lib/python3.5/site-packages/yt/data_objects/grid_patch.py:190: VisibleDeprecationWarning: using a non-integer number instead of an integer will result in an error in the future
  startIndex[2]:endIndex[2]] = tofill
Out[34]: