Low-Level Data Inspection: Accessing Raw Data¶
yt can not only provide high-level access to data, such as through slices, projections, object queries and the like, but it can also provide low-level access to the raw data.
Note
This section is tuned for patch- or block-based simulations. Future versions of yt will enable more direct access to particle and oct based simulations. For now, these are represented as patches, with the attendant properties.
For a more basic introduction, see yt Quickstart and more specifically Starting Out and Loading Data.
Examining Grid Hierarchies¶
yt organizes grids in a hierarchical fashion; a coarser grid that contains (or
overlaps with) a finer grid is referred to as its parent. yt organizes these
only a single level of refinement at a time. To access grids, the grids
attribute on a GridIndex
object. (For
fast operations, a number of additional arrays prefixed with grid
are also
available, such as grid_left_edges
and so on.) This returns an instance of
AMRGridPatch
, which can be queried for
either data or index information.
The AMRGridPatch
object itself provides
the following attributes:
Children
: a list of grids contained within this one, of one higher level of refinementParent
: a single object or a list of objects this grid is contained within, one level of refinement coarserchild_mask
: a mask of 0’s and 1’s, representing where no finer data is available in refined grids (1) or where this grid is covered by finer regions (0). Note that to get back the final data contained within a grid, one can multiple a field by this attribute.child_indices
: a mask of booleans, where False indicates no finer data is available. This is essentially the inverse ofchild_mask
.child_index_mask
: a mask of indices into theds.index.grids
array of the child grids.LeftEdge
: the left edge, in native code coordinates, of this gridRightEdge
: the right edge, in native code coordinates, of this griddds
: the width of a cell in this gridid
: the id (not necessarily the index) of this grid. Defined such that subtracting the property_id_offset
gives the index intods.index.grids
.NumberOfParticles
: the number of particles in this gridOverlappingSiblings
: a list of sibling grids that this grid overlaps with. Likely only defined for Octree-based codes.
In addition, the method
get_global_startindex()
can be
used to get the integer coordinates of the upper left edge. These integer
coordinates are defined with respect to the current level; this means that they
are the offset of the left edge, with respect to the left edge of the domain,
divided by the local dds
.
To traverse a series of grids, this type of construction can be used:
g = ds.index.grids[1043]
g2 = g.Children[1].Children[0]
print(g2.LeftEdge)
Examining Grid Data¶
Once you have identified a grid you wish to inspect, there are two ways to examine data. You can either ask the grid to read the data and pass it to you as normal, or you can manually intercept the data from the IO handler and examine it before it has been unit converted. This allows for much more raw data inspection.
To access data that has been read in the typical fashion and unit-converted as normal, you can access the grid as you would a normal object:
g = ds.index.grids[1043]
print(g["gas", "density"])
print(g["gas", "density"].min())
To access the raw data (as found in the file), use
g = ds.index.grids[1043]
rho = g["gas", "density"].in_base("code")
Finding Data at Fixed Points¶
One of the most common questions asked of data is, what is the value at this specific point. While there are several ways to find out the answer to this question, a few helper routines are provided as well. To identify the finest-resolution (i.e., most canonical) data at a given point, use the point data object:
from yt.units import kpc
point_obj = ds.point([30, 75, 80]*kpc)
density_at_point = point_obj['gas', 'density']
The point data object works just like any other yt data object. It is special because it is the only zero-dimensional data object: it will only return data at the exact point specified when creating the point data object. For more information about yt data objects, see Data Objects.
If you need to find field values at many points, the
find_field_values_at_points()
function may be more efficient. This function returns a nested list of field
values at multiple points in the simulation volume. For example, if one wanted
to find the value of a mesh field at the location of the particles in a
simulation, one could do:
ad = ds.all_data()
ppos = ad["all", "particle_position"]
ppos_den_vel = ds.find_field_values_at_points(
[("gas", "density"), ("gas", "velocity_x")],
ppos
)
In this example, ppos_den_vel
will be a list of arrays. The first array will
contain the density values at the particle positions, the second will contain
the x velocity values at the particle positions.
Examining Grid Data in a Fixed Resolution Array¶
If you have a dataset, either AMR or single resolution, and you want to just stick it into a fixed resolution numpy array for later examination, then you want to use a Covering Grid. You must specify the maximum level at which to sample the data, a left edge of the data where you will start, and the resolution at which you want to sample.
For example, let’s use the sample dataset
Enzo_64
. This dataset is at a resolution of 64^3 with 5 levels of AMR,
so if we want a 64^3 array covering the entire volume and sampling just the
lowest level data, we run:
import yt
ds = yt.load("Enzo_64/DD0043/data0043")
all_data_level_0 = ds.covering_grid(level=0, left_edge=[0, 0.0, 0.0], dims=[64, 64, 64])
Note that we can also get the same result and rely on the dataset to know its own underlying dimensions:
all_data_level_0 = ds.covering_grid(
level=0, left_edge=[0, 0.0, 0.0], dims=ds.domain_dimensions
)
We can now access our underlying data at the lowest level by specifying what field we want to examine:
print(all_data_level_0["gas", "density"].shape)
# (64, 64, 64)
print(all_data_level_0["gas", "density"])
# array([[[ 1.92588925e-31, 1.74647692e-31, 2.54787518e-31, ...,
print(all_data_level_0["gas", "temperature"].shape)
# (64, 64, 64)
If you create a covering grid that spans two child grids of a single parent grid, it will fill those zones covered by a zone of a child grid with the data from that child grid. Where it is covered only by the parent grid, the cells from the parent grid will be duplicated (appropriately) to fill the covering grid.
Let’s say we now want to look at that entire data volume and sample it at a higher resolution (i.e. level 2). As stated above, we’ll be oversampling under-refined regions, but that’s OK. We must also increase the resolution of our output array by a factor of 2^2 in each direction to hold this new larger dataset:
all_data_level_2 = ds.covering_grid(
level=2, left_edge=[0, 0.0, 0.0], dims=ds.domain_dimensions * 2**2
)
And let’s see what’s the density in the central location:
print(all_data_level_2["gas", "density"].shape)
(256, 256, 256)
print(all_data_level_2["gas", "density"][128, 128, 128])
1.7747457571203124e-31
There are two different types of covering grids: unsmoothed and smoothed. Smoothed grids will be filled through a cascading interpolation process; they will be filled at level 0, interpolated to level 1, filled at level 1, interpolated to level 2, filled at level 2, etc. This will help to reduce edge effects. Unsmoothed covering grids will not be interpolated, but rather values will be duplicated multiple times.
To sample our dataset from above with a smoothed covering grid in order to reduce edge effects, it is a nearly identical process:
all_data_level_2_s = ds.smoothed_covering_grid(
2, [0.0, 0.0, 0.0], ds.domain_dimensions * 2**2
)
print(all_data_level_2_s["gas", "density"].shape)
(256, 256, 256)
print(all_data_level_2_s["gas", "density"][128, 128, 128])
1.763744852165591e-31
Covering grids can also accept a data_source
argument, in which case only
the cells of the covering grid that are contained by the data_source
will be
filled. This can be useful to create regularized arrays of more complex
geometries. For example, if we provide a sphere, we see that the covering grid
shape is the same, but the number of cells with data is less
sp = ds.sphere(ds.domain_center, (0.25, "code_length"))
cg_sp = ds.covering_grid(
level=0, left_edge=[0, 0.0, 0.0], dims=ds.domain_dimensions, data_source=sp
)
print(cg_sp["gas", "density"].shape)
(64, 64, 64)
print(cg_sp["gas", "density"].size)
262144
print(cg_sp["gas", "density"][cg_sp["gas", "density"] != 0].size)
17256
The data_source
can be any 3D Data Container. Also
note that the data_source
argument is only available for the covering_grid
at present (not the smoothed_covering_grid
).
Examining Image Data in a Fixed Resolution Array¶
In the same way that one can sample a multi-resolution 3D dataset by placing
it into a fixed resolution 3D array as a
Covering Grid, one can
also access the raw image data that is returned from various yt functions
directly as a fixed resolution array. This provides a means for bypassing the
yt method for generating plots, and allows the user the freedom to use
whatever interface they wish for displaying and saving their image data.
You can use the FixedResolutionBuffer
to accomplish this as described in Slice, Projections, and other Images: The Fixed Resolution Buffer.
High-level Information about Particles¶
There are a number of high-level helpers attached to Dataset
objects to find
out information about the particles in an output file. First, one can check if
there are any particles in a dataset at all by examining
ds.particles_exist
. This will be True
for datasets the include particles
and False
otherwise.
One can also see which particle types are available in a dataset. Particle types
that are available in the dataset’s on-disk output are known as “raw” particle
types, and they will appear in ds.particle_types_raw
. Particle types that
are dynamically defined via a particle filter of a particle union will also
appear in the ds.particle_types
list. If the simulation only has one
particle type on-disk, its name will by 'io'
. If there is more than one
particle type, the names of the particle types will be inferred from the output
file. For example, Gadget HDF5 files have particle type names like PartType0
and PartType1
, while Enzo data, which usually only has one particle type,
will only have a particle named io
.
Finally, one can see the number of each particle type by inspecting
ds.particle_type_counts
. This will be a dictionary mapping the names of
particle types in ds.particle_types_raw
to the number of each particle type
in a simulation output.