yt.loaders module

This module gathers all user-facing functions with a load_ prefix.

yt.loaders.load(fn: Union[str, os.PathLike[str]], *args, hint: Optional[str] = None, **kwargs)[source]

Load a Dataset or DatasetSeries object. The data format is automatically discovered, and the exact return type is the corresponding subclass of yt.data_objects.static_output.Dataset. A yt.data_objects.time_series.DatasetSeries is created if the first argument is a pattern.

Parameters
  • fn (str, os.Pathlike[str]) – A path to the data location. This can be a file name, directory name, a glob pattern, or a url (for data types that support it).

  • hint (str, optional) – Only classes whose name include a hint are considered. If loading fails with a YTAmbiguousDataType exception, this argument can be used to lift ambiguity. Hints are case insensitive.

  • arguments, if any, are passed down to the return class. (Additional) –

Returns

Raises
yt.loaders.load_amr_grids(grid_data, domain_dimensions, bbox=None, sim_time=0.0, length_unit=None, mass_unit=None, time_unit=None, velocity_unit=None, magnetic_unit=None, periodicity=True, True, True, geometry='cartesian', refine_by=2, unit_system='cgs', default_species_fields=None, *, parameters=None)[source]

Load a set of grids of data into yt as a StreamHandler. This should allow a sequence of grids of varying resolution of data to be loaded directly into yt and analyzed as would any others. This comes with several caveats:

  • Units will be incorrect unless the unit system is explicitly specified.

  • Some functions may behave oddly, and parallelism will be disappointing or non-existent in most cases.

  • Particles may be difficult to integrate.

  • No consistency checks are performed on the index

Parameters
  • grid_data (list of dicts) – This is a list of dicts. Each dict must have entries “left_edge”, “right_edge”, “dimensions”, “level”, and then any remaining entries are assumed to be fields. Field entries must map to an NDArray or a function with the signature (grid_object, field_name) -> NDArray. The grid_data may also include a particle count. If no particle count is supplied, the dataset is understood to contain no particles. The grid_data will be modified in place and can’t be assumed to be static. Grid data may also be supplied as a tuple of (NDarray or function, unit string) to specify the units.

  • domain_dimensions (array_like) – This is the domain dimensions of the grid

  • length_unit (string or float) – Unit to use for lengths. Defaults to unitless. If set to be a string, the bbox dimensions are assumed to be in the corresponding units. If set to a float, the value is a assumed to be the conversion from bbox dimensions to centimeters.

  • mass_unit (string or float) – Unit to use for masses. Defaults to unitless.

  • time_unit (string or float) – Unit to use for times. Defaults to unitless.

  • velocity_unit (string or float) – Unit to use for velocities. Defaults to unitless.

  • magnetic_unit (string or float) – Unit to use for magnetic fields. Defaults to unitless.

  • bbox (array_like (xdim:zdim, LE:RE), optional) – Size of computational domain in units specified by length_unit. Defaults to a cubic unit-length domain.

  • sim_time (float, optional) – The simulation time in seconds

  • periodicity (tuple of booleans) – Determines whether the data will be treated as periodic along each axis

  • geometry (string or tuple) – “cartesian”, “cylindrical”, “polar”, “spherical”, “geographic” or “spectral_cube”. Optionally, a tuple can be provided to specify the axis ordering – for instance, to specify that the axis ordering should be z, x, y, this would be: (“cartesian”, (“z”, “x”, “y”)). The same can be done for other coordinates, for instance: (“spherical”, (“theta”, “phi”, “r”)).

  • refine_by (integer or list/array of integers.) – Specifies the refinement ratio between levels. Defaults to 2. This can be an array, in which case it specifies for each dimension. For instance, this can be used to say that some datasets have refinement of 1 in one dimension, indicating that they span the full range in that dimension.

  • default_species_fields (string, optional) – If set, default species fields are created for H and He which also determine the mean molecular weight. Options are “ionized” and “neutral”.

  • parameters (dictionary, optional) – Optional dictionary used to populate the dataset parameters, useful for storing dataset metadata.

Examples

>>> grid_data = [
...     dict(
...         left_edge=[0.0, 0.0, 0.0],
...         right_edge=[1.0, 1.0, 1.0],
...         level=0,
...         dimensions=[32, 32, 32],
...         number_of_particles=0,
...     ),
...     dict(
...         left_edge=[0.25, 0.25, 0.25],
...         right_edge=[0.75, 0.75, 0.75],
...         level=1,
...         dimensions=[32, 32, 32],
...         number_of_particles=0,
...     ),
... ]
...
>>> for g in grid_data:
...     g[("gas", "density")] = (
...         np.random.random(g["dimensions"]) * 2 ** g["level"],
...         "g/cm**3",
...     )
...
>>> ds = load_amr_grids(grid_data, [32, 32, 32], length_unit=1.0)
yt.loaders.load_archive(fn: Union[str, pathlib.Path], path: str, ratarmount_kwa: Optional[Dict] = None, mount_timeout: float = 1.0, *args, **kwargs)yt.data_objects.static_output.Dataset[source]

Load archived data with yt.

This is a wrapper around load() to include mounting and unmounting the archive as a read-only filesystem and load it.

Parameters
  • fn (str) – The filename of the archive containing the dataset.

  • path (str) – The path to the dataset in the archive.

  • ratarmount_kwa (dict, optional) – Optional parameters to pass to ratarmount to mount the archive.

  • mount_timeout (float, optional) – The timeout to wait for ratarmount to mount the archive. Default is 1s.

Notes

  • The function is experimental and may work or not depending on your setup.

  • Any additional keyword argument is passed down to load().

  • This function requires ratarmount to be installed.

  • This function does not work on Windows system.

yt.loaders.load_hdf5_file(fn: Union[str, os.PathLike[str]], root_node: Optional[str] = '/', fields: Optional[List[str]] = None, bbox: numpy.ndarray = None, nchunks: int = 0, dataset_arguments: Optional[dict] = None)[source]

Create a (grid-based) yt dataset given the path to an hdf5 file.

This function accepts a filename, as well as (potentially) a bounding box, the root node where fields are stored, and the number of chunks to attempt to decompose the object into. This function will then introspect that HDF5 file, attempt to determine the available fields, and then supply a yt.data_objects.static_output.Dataset object. However, unlike the other loaders, the data is not required to be preloaded into memory, and will only be loaded on demand.

This does not yet work with particle-type datasets.

Parameters
  • fn (str, os.Pathlike[str]) – A path to the hdf5 file that contains the data.

  • root_node (str, optional) – If the fields to be loaded are stored under an HDF5 group object, specify it here. Otherwise, the fields are assumed to be at the root level of the HDF5 file hierarchy.

  • fields (list of str, optional) – The fields to be included as part of the dataset. If this is not included, all of the datasets under root_node will be included. If your file contains, for instance, a “parameters” node at the root level next to other fields, it would be (mistakenly) included. This allows you to specify only those that should be included.

  • bbox (array_like (xdim:zdim, LE:RE), optional) – If supplied, this will be the bounding box for the dataset. If not supplied, it will be assumed to be from 0 to 1 in all dimensions.

  • nchunks (int, optional) – How many chunks should this dataset be split into? If 0 or not supplied, yt will attempt to ensure that there is one chunk for every 64**3 zones in the dataset.

  • dataset_arguments (dict, optional) – Any additional arguments that should be passed to yt.loaders.load_amr_grids, including things like the unit length and the coordinates.

Returns

This returns an instance of a dataset, created with load_amr_grids, that can read from the HDF5 file supplied to this function. An open handle to the HDF5 file is retained.

Return type

yt.data_objects.static_output.Dataset object

Raises

FileNotFoundError – If fn does not match any existing file or directory.

yt.loaders.load_hexahedral_mesh(data, connectivity, coordinates, length_unit=None, bbox=None, sim_time=0.0, mass_unit=None, time_unit=None, velocity_unit=None, magnetic_unit=None, periodicity=True, True, True, geometry='cartesian', unit_system='cgs', *, parameters=None)[source]

Load a hexahedral mesh of data into yt as a StreamHandler.

This should allow a semistructured grid of data to be loaded directly into yt and analyzed as would any others. This comes with several caveats:

  • Units will be incorrect unless the data has already been converted to cgs.

  • Some functions may behave oddly, and parallelism will be disappointing or non-existent in most cases.

  • Particles may be difficult to integrate.

Particle fields are detected as one-dimensional fields. The number of particles is set by the “number_of_particles” key in data.

Parameters
  • data (dict) – This is a dict of numpy arrays, where the keys are the field names. There must only be one. Note that the data in the numpy arrays should define the cell-averaged value for of the quantity in in the hexahedral cell.

  • connectivity (array_like) – This should be of size (N,8) where N is the number of zones.

  • coordinates (array_like) – This should be of size (M,3) where M is the number of vertices indicated in the connectivity matrix.

  • bbox (array_like (xdim:zdim, LE:RE), optional) – Size of computational domain in units of the length unit.

  • sim_time (float, optional) – The simulation time in seconds

  • mass_unit (string) – Unit to use for masses. Defaults to unitless.

  • time_unit (string) – Unit to use for times. Defaults to unitless.

  • velocity_unit (string) – Unit to use for velocities. Defaults to unitless.

  • magnetic_unit (string) – Unit to use for magnetic fields. Defaults to unitless.

  • periodicity (tuple of booleans) – Determines whether the data will be treated as periodic along each axis

  • geometry (string or tuple) – “cartesian”, “cylindrical”, “polar”, “spherical”, “geographic” or “spectral_cube”. Optionally, a tuple can be provided to specify the axis ordering – for instance, to specify that the axis ordering should be z, x, y, this would be: (“cartesian”, (“z”, “x”, “y”)). The same can be done for other coordinates, for instance: (“spherical”, (“theta”, “phi”, “r”)).

  • parameters (dictionary, optional) – Optional dictionary used to populate the dataset parameters, useful for storing dataset metadata.

yt.loaders.load_octree(octree_mask, data, bbox=None, sim_time=0.0, length_unit=None, mass_unit=None, time_unit=None, velocity_unit=None, magnetic_unit=None, periodicity=True, True, True, over_refine_factor=None, num_zones=2, partial_coverage=1, unit_system='cgs', default_species_fields=None, *, parameters=None)[source]

Load an octree mask into yt.

Octrees can be saved out by calling save_octree on an OctreeContainer. This enables them to be loaded back in.

This will initialize an Octree of data. Note that fluid fields will not work yet, or possibly ever.

Parameters
  • octree_mask (np.ndarray[uint8_t]) – This is a depth-first refinement mask for an Octree. It should be of size n_octs * 8 (but see note about the root oct below), where each item is 1 for an oct-cell being refined and 0 for it not being refined. For num_zones != 2, the children count will still be 8, so there will still be n_octs * 8 entries. Note that if the root oct is not refined, there will be only one entry for the root, so the size of the mask will be (n_octs - 1)*8 + 1.

  • data (dict) – A dictionary of 1D arrays. Note that these must of the size of the number of “False” values in the octree_mask.

  • bbox (array_like (xdim:zdim, LE:RE), optional) – Size of computational domain in units of length

  • sim_time (float, optional) – The simulation time in seconds

  • length_unit (string) – Unit to use for lengths. Defaults to unitless.

  • mass_unit (string) – Unit to use for masses. Defaults to unitless.

  • time_unit (string) – Unit to use for times. Defaults to unitless.

  • velocity_unit (string) – Unit to use for velocities. Defaults to unitless.

  • magnetic_unit (string) – Unit to use for magnetic fields. Defaults to unitless.

  • periodicity (tuple of booleans) – Determines whether the data will be treated as periodic along each axis

  • partial_coverage (boolean) – Whether or not an oct can be refined cell-by-cell, or whether all 8 get refined.

  • default_species_fields (string, optional) – If set, default species fields are created for H and He which also determine the mean molecular weight. Options are “ionized” and “neutral”.

  • num_zones (int) – The number of zones along each dimension in an oct

  • parameters (dictionary, optional) – Optional dictionary used to populate the dataset parameters, useful for storing dataset metadata.

Example

>>> import numpy as np
>>> oct_mask = np.zeros(25)
... oct_mask[[0,  5,  7, 16]] = 8
>>> octree_mask = np.array(oct_mask, dtype=np.uint8)
>>> quantities = {}
>>> quantities["gas", "density"] = np.random.random((22, 1))
>>> bbox = np.array([[-10.0, 10.0], [-10.0, 10.0], [-10.0, 10.0]])
>>> ds = load_octree(
...     octree_mask=octree_mask,
...     data=quantities,
...     bbox=bbox,
...     num_zones=1,
...     partial_coverage=0,
... )
yt.loaders.load_particles(data, length_unit=None, bbox=None, sim_time=None, mass_unit=None, time_unit=None, velocity_unit=None, magnetic_unit=None, periodicity=True, True, True, geometry='cartesian', unit_system='cgs', data_source=None, default_species_fields=None, *, parameters=None)[source]

Load a set of particles into yt as a StreamParticleHandler.

This will allow a collection of particle data to be loaded directly into yt and analyzed as would any others. This comes with several caveats:

  • There must be sufficient space in memory to contain all the particle data.

  • Parallelism will be disappointing or non-existent in most cases.

  • Fluid fields are not supported.

Note: in order for the dataset to take advantage of SPH functionality, the following two fields must be provided: * (‘io’, ‘density’) * (‘io’, ‘smoothing_length’)

Parameters
  • data (dict) – This is a dict of numpy arrays or (numpy array, unit name) tuples, where the keys are the field names. Particles positions must be named “particle_position_x”, “particle_position_y”, and “particle_position_z”.

  • length_unit (float) – Conversion factor from simulation length units to centimeters

  • bbox (array_like (xdim:zdim, LE:RE), optional) – Size of computational domain in units of the length_unit

  • sim_time (float, optional) – The simulation time in seconds

  • mass_unit (float) – Conversion factor from simulation mass units to grams

  • time_unit (float) – Conversion factor from simulation time units to seconds

  • velocity_unit (float) – Conversion factor from simulation velocity units to cm/s

  • magnetic_unit (float) – Conversion factor from simulation magnetic units to gauss

  • periodicity (tuple of booleans) – Determines whether the data will be treated as periodic along each axis

  • data_source (YTSelectionContainer, optional) – If set, parameters like bbox, sim_time, and code units are derived from it.

  • default_species_fields (string, optional) – If set, default species fields are created for H and He which also determine the mean molecular weight. Options are “ionized” and “neutral”.

  • parameters (dictionary, optional) – Optional dictionary used to populate the dataset parameters, useful for storing dataset metadata.

Examples

>>> pos = [np.random.random(128 * 128 * 128) for i in range(3)]
>>> data = dict(
...     particle_position_x=pos[0],
...     particle_position_y=pos[1],
...     particle_position_z=pos[2],
... )
>>> bbox = np.array([[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]])
>>> ds = load_particles(data, 3.08e24, bbox=bbox)
yt.loaders.load_sample(fn: Optional[str] = None, *, progressbar: bool = True, timeout=None, **kwargs)[source]

Load sample data with yt.

This is a simple wrapper around load() to include fetching data with pooch from remote source.

The data registry table can be retrieved and visualized using get_data_registry_table(). The filename column contains usable keys that can be passed as the first positional argument to load_sample. Some data samples contain series of datasets. It may be required to supply the relative path to a specific dataset.

Parameters
  • fn (str) – The filename of the dataset to load, as defined in the data registry table.

  • progressbar (bool) – display a progress bar (tqdm).

  • timeout (float or int (optional)) – Maximal waiting time, in seconds, after which download is aborted. None means “no limit”. This parameter is directly passed to down to requests.get via pooch.HTTPDownloader

Notes

  • This function is experimental as of yt 4.0.0, do not rely on its exact behaviour.

  • Any additional keyword argument is passed down to load().

  • In case of collision with predefined keyword arguments as set in the data registry, the ones passed to this function take priority.

  • Datasets with slashes ‘/’ in their names can safely be used even on Windows. On the contrary, paths using backslashes ‘' won’t work outside of Windows, so it is recommended to favour the UNIX convention (‘/’) in scripts that are meant to be cross-platform.

  • This function requires pandas and pooch.

  • Corresponding sample data live at https://yt-project.org/data

yt.loaders.load_simulation(fn, simulation_type, find_outputs=False)[source]

Load a simulation time series object of the specified simulation type.

Parameters
  • fn (str, os.Pathlike, or byte (types supported by os.path.expandusers)) – Name of the data file or directory.

  • simulation_type (str) – E.g. ‘Enzo’

  • find_outputs (bool) – Defaults to False

Raises
yt.loaders.load_uniform_grid(data, domain_dimensions, length_unit=None, bbox=None, nprocs=1, sim_time=0.0, mass_unit=None, time_unit=None, velocity_unit=None, magnetic_unit=None, periodicity=True, True, True, geometry='cartesian', unit_system='cgs', default_species_fields=None, *, cell_widths=None, parameters=None)[source]

Load a uniform grid of data into yt as a StreamHandler.

This should allow a uniform grid of data to be loaded directly into yt and analyzed as would any others. This comes with several caveats:

  • Units will be incorrect unless the unit system is explicitly specified.

  • Some functions may behave oddly, and parallelism will be disappointing or non-existent in most cases.

  • Particles may be difficult to integrate.

Particle fields are detected as one-dimensional fields.

Parameters
  • data (dict) – This is a dict of numpy arrays, (numpy array, unit spec) tuples. Functions may also be supplied in place of numpy arrays as long as the subsequent argument nprocs is not specified to be greater than 1. Supplied functions much accepts the arguments (grid_object, field_name) and return numpy arrays. The keys to the dict are the field names.

  • domain_dimensions (array_like) – This is the domain dimensions of the grid

  • length_unit (string) – Unit to use for lengths. Defaults to unitless.

  • bbox (array_like (xdim:zdim, LE:RE), optional) – Size of computational domain in units specified by length_unit. Defaults to a cubic unit-length domain.

  • nprocs (integer, optional) – If greater than 1, will create this number of subarrays out of data

  • sim_time (float, optional) – The simulation time in seconds

  • mass_unit (string) – Unit to use for masses. Defaults to unitless.

  • time_unit (string) – Unit to use for times. Defaults to unitless.

  • velocity_unit (string) – Unit to use for velocities. Defaults to unitless.

  • magnetic_unit (string) – Unit to use for magnetic fields. Defaults to unitless.

  • periodicity (tuple of booleans) – Determines whether the data will be treated as periodic along each axis

  • geometry (string or tuple) – “cartesian”, “cylindrical”, “polar”, “spherical”, “geographic” or “spectral_cube”. Optionally, a tuple can be provided to specify the axis ordering – for instance, to specify that the axis ordering should be z, x, y, this would be: (“cartesian”, (“z”, “x”, “y”)). The same can be done for other coordinates, for instance: (“spherical”, (“theta”, “phi”, “r”)).

  • default_species_fields (string, optional) – If set, default species fields are created for H and He which also determine the mean molecular weight. Options are “ionized” and “neutral”.

  • parameters (dictionary, optional) – Optional dictionary used to populate the dataset parameters, useful for storing dataset metadata.

Examples

>>> np.random.seed(int(0x4D3D3D3))
>>> bbox = np.array([[0.0, 1.0], [-1.5, 1.5], [1.0, 2.5]])
>>> arr = np.random.random((128, 128, 128))
>>> data = dict(density=arr)
>>> ds = load_uniform_grid(data, arr.shape, length_unit="cm", bbox=bbox, nprocs=12)
>>> dd = ds.all_data()
>>> dd[("gas", "density")]
unyt_array([0.76017901, 0.96855994, 0.49205428, ..., 0.78798258,
            0.97569432, 0.99453904], 'g/cm**3')
yt.loaders.load_unstructured_mesh(connectivity, coordinates, node_data=None, elem_data=None, length_unit=None, bbox=None, sim_time=0.0, mass_unit=None, time_unit=None, velocity_unit=None, magnetic_unit=None, periodicity=False, False, False, geometry='cartesian', unit_system='cgs', *, parameters=None)[source]

Load an unstructured mesh of data into yt as a StreamHandler.

This should allow an unstructured mesh data to be loaded directly into yt and analyzed as would any others. Not all functionality for visualization will be present, and some analysis functions may not yet have been implemented.

Particle fields are detected as one-dimensional fields. The number of particles is set by the “number_of_particles” key in data.

In the parameter descriptions below, a “vertex” is a 3D point in space, an “element” is a single polyhedron whose location is defined by a set of vertices, and a “mesh” is a set of polyhedral elements, each with the same number of vertices.

Parameters
  • connectivity (list of array_like or array_like) – This should either be a single 2D array or list of 2D arrays. If this is a list, each element in the list corresponds to the connectivity information for a distinct mesh. Each array can have different connectivity length and should be of shape (N,M) where N is the number of elements and M is the number of vertices per element.

  • coordinates (array_like) – The 3D coordinates of mesh vertices. This should be of size (L, D) where L is the number of vertices and D is the number of coordinates per vertex (the spatial dimensions of the dataset). Currently this must be either 2 or 3. When loading more than one mesh, the data for each mesh should be concatenated into a single coordinates array.

  • node_data (dict or list of dicts) – For a single mesh, a dict mapping field names to 2D numpy arrays, representing data defined at element vertices. For multiple meshes, this must be a list of dicts. Note that these are not the values as a function of the coordinates, but of the connectivity. Their shape should be the same as the connectivity. This means that if the data is in the shape of the coordinates, you may need to reshape them using the connectivity array as an index.

  • elem_data (dict or list of dicts) – For a single mesh, a dict mapping field names to 1D numpy arrays, where each array has a length equal to the number of elements. The data must be defined at the center of each mesh element and there must be only one data value for each element. For multiple meshes, this must be a list of dicts, with one dict for each mesh.

  • bbox (array_like (xdim:zdim, LE:RE), optional) – Size of computational domain in units of the length unit.

  • sim_time (float, optional) – The simulation time in seconds

  • length_unit (string) – Unit to use for length. Defaults to unitless.

  • mass_unit (string) – Unit to use for masses. Defaults to unitless.

  • time_unit (string) – Unit to use for times. Defaults to unitless.

  • velocity_unit (string) – Unit to use for velocities. Defaults to unitless.

  • magnetic_unit (string) – Unit to use for magnetic fields. Defaults to unitless.

  • periodicity (tuple of booleans) – Determines whether the data will be treated as periodic along each axis

  • geometry (string or tuple) – “cartesian”, “cylindrical”, “polar”, “spherical”, “geographic” or “spectral_cube”. Optionally, a tuple can be provided to specify the axis ordering – for instance, to specify that the axis ordering should be z, x, y, this would be: (“cartesian”, (“z”, “x”, “y”)). The same can be done for other coordinates, for instance: (“spherical”, (“theta”, “phi”, “r”)).

  • parameters (dictionary, optional) – Optional dictionary used to populate the dataset parameters, useful for storing dataset metadata.

Examples

Load a simple mesh consisting of two tets.

>>> # Coordinates for vertices of two tetrahedra
>>> coordinates = np.array(
...     [
...         [0.0, 0.0, 0.5],
...         [0.0, 1.0, 0.5],
...         [0.5, 1, 0.5],
...         [0.5, 0.5, 0.0],
...         [0.5, 0.5, 1.0],
...     ]
... )
>>> # The indices in the coordinates array of mesh vertices.
>>> # This mesh has two elements.
>>> connectivity = np.array([[0, 1, 2, 4], [0, 1, 2, 3]])
>>> # Field data defined at the centers of the two mesh elements.
>>> elem_data = {("connect1", "elem_field"): np.array([1, 2])}
>>> # Field data defined at node vertices
>>> node_data = {
...     ("connect1", "node_field"): np.array(
...         [[0.0, 1.0, 2.0, 4.0], [0.0, 1.0, 2.0, 3.0]]
...     )
... }
>>> ds = load_unstructured_mesh(
...     connectivity, coordinates, elem_data=elem_data, node_data=node_data
... )