Testing

yt includes a testing suite that can be run on the codebase to ensure that no functionality has been broken. This testing suite is based on the Nose testing framework and consists of two components: unit tests and answer tests. Unit tests confirm that an isolated piece of functionality runs without failure for inputs with known correct outputs. Answer tests verify the integration and compatibility of individual code units by generating output from user-visible yt functions and comparing the results against outputs of the same functions produced with older versions of the yt codebase. This ensures consistency in results as development proceeds.

The testing suite should be run locally by developers to make sure they aren’t checking in any code that breaks existing functionality. To further this goal, we also maintain a continuous integration server that runs the full test suite against each commit to the yt version control repository, confirming that nothing has broken recently.

Unit Testing

What do Unit Tests Do

Unit tests are tests that operate on some small piece of machinery and verify that it works as intended. yt uses the Nose framework for running unit tests. In practice, this means that we write scripts containing assert statements; Nose identifies those scripts, runs them, and verifies that the assertions hold and the code runs without crashing.

How to Run the Unit Tests

One can run the unit tests very straightforwardly from any python interpreter that can import the yt module:

import yt
yt.run_nose()

If you are developing new functionality, it is sometimes more convenient to use the Nose command line interface, nosetests. You can run the unit tests using nose by navigating to the base directory of the yt git repository and invoking nosetests:

$ cd $YT_GIT
$ nosetests

where $YT_GIT is the path to the root of the yt git repository.

If you want to run a specific unit test (rather than the entire suite), you can do so by specifying the path of the test relative to the $YT_GIT directory. For example, if you want to run the plot_window tests, you’d run:

$ nosetests yt/visualization/tests/test_plotwindow.py

How to Write Unit Tests

yt provides several pieces of testing assistance, all in the yt.testing module. Describing them in detail is somewhat outside the scope of this document, as in some cases they belong to other packages. However, a few come in handy (a short example follows the list):

  • fake_random_ds() provides the ability to create a random dataset, with several fields and divided into several different grids, that can be operated on.
  • assert_equal() can operate on arrays.
  • assert_almost_equal() can operate on arrays and accepts the number of decimal places to which results must agree.
  • assert_allclose_units() raises an error if two arrays are not equal up to a desired absolute or relative tolerance. This wraps numpy’s assert_allclose to correctly verify unit consistency as well.
  • amrspace() provides the ability to create AMR grid structures.
  • expand_keywords() provides the ability to iterate over many values for keywords.
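
For instance, here is a minimal sketch exercising a few of these helpers; the grid size, tolerance, and unit values are arbitrary, and the assertions are only illustrative:

from yt.testing import fake_random_ds, assert_almost_equal, assert_allclose_units
from yt.units import km, m

# a small random dataset, split across four grids
ds = fake_random_ds(16, nprocs=4)
ad = ds.all_data()

# with the default negative=False, the random density field is non-negative
assert ad["gas", "density"].min() >= 0

# the cell volumes should sum to the unit domain volume, to 7 decimal places
assert_almost_equal(float(ad["index", "cell_volume"].sum()), 1.0, 7)

# unit-aware comparison: 1 km and 1000 m agree to within the relative tolerance
assert_allclose_units(1.0 * km, 1000.0 * m, rtol=1e-10)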

To create new unit tests:

  1. Create a new tests/ directory next to the file containing the functionality you want to test and add an empty __init__.py file to it.
  2. Inside that directory, create a new python file prefixed with test_ and including the name of the functionality.
  3. Inside that file, create one or more routines prefixed with test_ that accept no arguments. The test function should do some work that tests some functionality and should also verify that the results are correct using assert statements or functions.
  4. Tests can yield a tuple of the form function, argument_one, argument_two, etc. For example, yield my_test, 'banana', 2.0 will be captured by nose, and the my_test function will be run with the provided arguments (a minimal sketch of this pattern appears below).
  5. Use fake_random_ds to test on datasets, and be sure to test for several values of the nprocs keyword, so that domain decomposition can be exercised as well.
  6. Test multiple combinations of options by using the expand_keywords() function, which will enable much easier iteration over options.

For an example of how to write unit tests, look at the file yt/data_objects/tests/test_covering_grid.py, which covers a great deal of functionality.
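
For a shorter, self-contained sketch that follows the steps above, a hypothetical yt/some_module/tests/test_something.py might contain something like the following (the module path, grid size, and nprocs values are made up for illustration):

from yt.testing import fake_random_ds, assert_equal


def _check_cell_count(nprocs):
    # the total cell count must not depend on how the domain is decomposed
    ds = fake_random_ds(16, nprocs=nprocs)
    ad = ds.all_data()
    assert_equal(ad["index", "ones"].size, 16**3)


def test_cell_count():
    # yield-style test: nose calls _check_cell_count once per nprocs value
    for nprocs in [1, 2, 4, 8]:
        yield _check_cell_count, nprocs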

Debugging failing tests

When writing new tests, often one exposes bugs or writes a test incorrectly, causing an exception to be raised or a failed test. To help debug issues like this, nose can drop into a debugger whenever a test fails or raises an exception. This can be accomplished by passing --pdb and --pdb-failures to the nosetests executable. These options will drop into the pdb debugger whenever an error is raised or a failure happens, respectively. Inside the debugger you can interactively print out variables and go up and down the call stack to determine the context for your failure or error.

$ nosetests --pdb --pdb-failures

In addition, one can debug more crudely using print statements. To do this, you can add print statements to the code as normal. However, the test runner will capture all print output by default. To ensure that output gets printed to your terminal while the tests are running, pass -s to the nosetests executable.

Lastly, to quickly debug a specific failing test, it is best to only run that one test during your testing session. This can be accomplished by explicitly passing the name of the test function or class to nosetests, as in the following example:

$ nosetests yt.visualization.tests.test_plotwindow:TestSetWidth

This nosetests invocation will only run the tests defined by the TestSetWidth class.

Finally, to determine which test is failing while the tests are running, it helps to run the tests in “verbose” mode. This can be done by passing the -v option to the nosetests executable.

All of the above nosetests options can be combined. So, for example to run the TestSetWidth tests with verbose output, letting the output of print statements come out on the terminal prompt, and enabling pdb debugging on errors or test failures, one would do:

$ nosetests --pdb --pdb-failures -v -s yt.visualization.tests.test_plotwindow:TestSetWidth

Answer Testing

What do Answer Tests Do

Answer tests test actual data, and many operations on that data, to make sure that answers don’t drift over time. This is how we test frontends, as opposed to operations, in yt.

How to Run the Answer Tests

The very first step is to make a directory and copy over the data against which you want to test. Currently, we test:

NMSU ART

  • D9p_500/10MpcBox_HartGal_csf_a0.500.d

ARTIO

  • sizmbhloz-clref04SNth-rs9_a0.9011/sizmbhloz-clref04SNth-rs9_a0.9011.art

Athena

  • ShockCloud/id0/Cloud.0050.vtk
  • MHDBlast/id0/Blast.0100.vtk
  • RamPressureStripping/id0/rps.0062.vtk
  • MHDSloshing/virgo_low_res.0054.vtk

Boxlib

  • RadAdvect/plt00000
  • RadTube/plt00500
  • StarParticles/plrd01000

Chombo

  • TurbBoxLowRes/data.0005.3d.hdf5
  • GaussianCloud/data.0077.3d.hdf5
  • IsothermalSphere/data.0000.3d.hdf5
  • ZeldovichPancake/plt32.2d.hdf5
  • KelvinHelmholtz/data.0004.hdf5

Enzo

  • DD0010/moving7_0010 (available in tests/ in the yt distribution)
  • IsolatedGalaxy/galaxy0030/galaxy0030
  • enzo_tiny_cosmology/DD0046/DD0046
  • enzo_cosmology_plus/DD0046/DD0046

FITS

  • radio_fits/grs-50-cube.fits
  • UnigridData/velocity_field_20.fits

FLASH

  • WindTunnel/windtunnel_4lev_hdf5_plt_cnt_0030
  • GasSloshingLowRes/sloshing_low_res_hdf5_plt_cnt_0300

Gadget

  • IsothermalCollapse/snap_505
  • IsothermalCollapse/snap_505.hdf5
  • GadgetDiskGalaxy/snapshot_200.hdf5

GAMER

  • InteractingJets/jet_000002
  • WaveDarkMatter/psiDM_000020

Halo Catalog

  • owls_fof_halos/groups_001/group_001.0.hdf5
  • owls_fof_halos/groups_008/group_008.0.hdf5
  • gadget_fof_halos/groups_005/fof_subhalo_tab_005.0.hdf5
  • gadget_fof_halos/groups_042/fof_subhalo_tab_042.0.hdf5
  • rockstar_halos/halos_0.0.bin

MOAB

  • c5/c5.h5m

RAMSES

  • output_00080/info_00080.txt

Tipsy

  • halo1e11_run1.00400/halo1e11_run1.00400
  • agora_1e11.00400/agora_1e11.00400
  • TipsyGalaxy/galaxy.00300

OWLS

  • snapshot_033/snap_033.0.hdf5

These datasets are available at http://yt-project.org/data/.

Next, modify the file ~/.yt/config to include a section [yt] with the parameter test_data_dir. Set this to point to the directory with the test data you want to test with. Here is an example config file:

[yt]
test_data_dir = /Users/tomservo/src/yt-data

More data will be added over time. To run the answer tests, you must first generate a set of test answers locally on a “known good” revision, then update to the revision you want to test, and run the tests again using the locally stored answers.

Let’s focus on running the answer tests for a single frontend. It’s possible to run the answer tests for all the frontends, but due to the large number of test datasets we currently use, this is not normally done except on the yt project’s continuous integration server.

$ cd $YT_GIT
$ nosetests --with-answer-testing --local --local-dir $HOME/Documents/test --answer-store --answer-name=local-tipsy yt.frontends.tipsy

This command will create a set of local answers from the tipsy frontend tests and store them in $HOME/Documents/test (this can but does not have to be the same directory as the test_data_dir configuration variable defined in your .yt/config file) in a file named local-tipsy. To run the tipsy frontend’s answer tests using a different yt changeset, update to that changeset, recompile if necessary, and run the tests using the following command:

$ nosetests --with-answer-testing --local --local-dir $HOME/Documents/test --answer-name=local-tipsy yt.frontends.tipsy

The results from a nose testing session are pretty straightforward to understand: the result for each test is printed directly to STDOUT. If a test passes, nose prints a period; if a test fails, it prints an F; and if the test raises an exception or otherwise errors out, it prints an E. Explicit descriptions for each test are also printed if you pass -v to the nosetests executable. If you want to also run tests for the ‘big’ datasets, then you will need to pass --answer-big-data to nosetests. For example, to run the tests for the OWLS frontend, do the following:

$ nosetests --with-answer-testing --local --local-dir $HOME/Documents/test --answer-store --answer-big-data yt.frontends.owls

How to Write Answer Tests

Tests can be added in the file yt/utilities/answer_testing/framework.py. You can find examples there of how to write a test. Here is a trivial example:

class MaximumValueTest(AnswerTestingTest):
    _type_name = "MaximumValue"
    _attrs = ("field",)
    def __init__(self, ds_fn, field):
        super(MaximumValueTest, self).__init__(ds_fn)
        self.field = field

    def run(self):
        v, c = self.ds.find_max(self.field)
        result = np.empty(4, dtype="float64")
        result[0] = v
        result[1:] = c
        return result

    def compare(self, new_result, old_result):
        assert_equal(new_result, old_result)

This test calculates the value and location of the maximum of a field, packs them into the array result, and returns that from run; compare then verifies that the new and old results are exactly equal.

To write a new test:

  • Subclass AnswerTestingTest
  • Add the attributes _type_name (a string) and _attrs (a tuple of strings, one for each attribute that defines the test – see how this is done for projections, for instance)
  • Implement the two routines run and compare. The first should return a result and the second should compare a new result against an old result. Neither should yield; both should actually return. If you need additional arguments to the test, implement an __init__ routine. (A sketch of how such a test is driven appears after this list.)
  • Keep in mind that everything returned from run will be stored. So if you are going to return a huge amount of data, please ensure that the test only gets run for small data. If you want a fast way to measure something as being similar or different, either an md5 hash (see the grid values test) or a sum and std of an array act as good proxies. If you must store a large amount of data for some reason, try serializing the data to a string (e.g. using numpy.ndarray.dumps), and then compressing the data stream using zlib.compress.
  • Typically for derived values, we compare to 10 or 12 decimal places. For exact values, we compare exactly.
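
As a quick illustration of how such a test is driven, a test module could yield instances of it from a routine decorated with requires_ds; the sketch below uses one of the datasets listed above, and the function name and field choices are only illustrative:

from yt.utilities.answer_testing.framework import requires_ds

# one of the datasets listed above, located under test_data_dir
g30 = "IsolatedGalaxy/galaxy0030/galaxy0030"

@requires_ds(g30)
def test_maximum_value():
    # yield one MaximumValueTest (defined above) per field of interest
    for field in ("density", "temperature"):
        yield MaximumValueTest(g30, field)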

How To Write Answer Tests for a Frontend

To add a new frontend answer test, first write a new set of tests for the data. The Enzo example in yt/frontends/enzo/tests/test_outputs.py is considered canonical. Do these things:

  • Create a new directory, tests, inside the frontend’s directory.
  • Create a new file, test_outputs.py in the frontend’s tests directory.
  • Create a new routine that operates similarly to the routines you can see in Enzo’s output tests.
    • This routine should test a number of different fields and data objects.
    • The test routine itself should be decorated with @requires_ds(test_dataset_name). This decorator can accept the argument big_data=True if the test is expensive. The test_dataset_name should be a string containing the path you would pass to the yt.load function. It does not need to be the full path to the dataset, since the path will be automatically prepended with the location of the test data directory. See The Configuration File for more information about the test_data_dir configuration option.
    • There are small_patch_amr and big_patch_amr routines that you can yield from to execute a bunch of standard tests. In addition we have created sph_answer which is more suited for particle SPH datasets. This is where you should start, and then yield additional tests that stress the outputs in whatever ways are necessary to ensure functionality.

If you are adding to a frontend that has a few tests already, skip the first two steps.
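
Putting these steps together, a rough sketch of such a frontend test module might look like the following; the dataset path and field names here are placeholders rather than real test data, and the Enzo tests mentioned above remain the canonical reference:

from yt.testing import assert_equal
from yt.utilities.answer_testing.framework import \
    requires_ds, data_dir_load, small_patch_amr

# placeholder dataset path under test_data_dir; substitute your frontend's test data
my_output = "MyCodeData/output_0010"
_fields = ("density", "velocity_magnitude")

@requires_ds(my_output)
def test_my_output():
    # basic sanity check that the dataset loads and is identified correctly
    ds = data_dir_load(my_output)
    assert_equal(str(ds), "output_0010")
    # yield the standard battery of grid, field, and projection answer tests
    for test in small_patch_amr(my_output, _fields):
        test_my_output.__name__ = test.description
        yield test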

How to Write Image Comparison Tests

We have a number of tests designed to compare images as part of yt. We make use of some functionality from matplotlib to automatically compare images and detect differences, if any. Image comparison tests are used in the plotting and volume rendering machinery.

The easiest way to use the image comparison tests is to make use of the GenericImageTest class. This class takes three arguments:

  • A dataset instance (e.g. something you load with yt.load or data_dir_load)
  • A function the test machinery can call which will save an image to disk. The test class will then find any images that get created and compare them with the stored “correct” answer.
  • An integer specifying the number of decimal places to use when comparing images. A smaller number of decimal places will produce a less stringent test. Matplotlib uses an L2 norm on the full image to do the comparison tests, so this is not a pixel-by-pixel measure, and surprisingly large variations will still pass the test if the strictness of the comparison is not high enough.

You must decorate your test function with requires_ds, otherwise the answer testing machinery will not be properly set up.

Here is an example test function:

from yt.utilities.answer_testing.framework import \
    GenericImageTest, requires_ds, data_dir_load

from matplotlib import pyplot as plt

# my_ds should be a string giving the path to a test dataset under
# test_data_dir, e.g. one of the datasets listed above
my_ds = "IsolatedGalaxy/galaxy0030/galaxy0030"

@requires_ds(my_ds)
def test_my_ds():
    ds = data_dir_load(my_ds)

    def create_image(filename_prefix):
        plt.plot([1, 2], [1, 2])
        plt.savefig(filename_prefix)

    test = GenericImageTest(ds, create_image, 12)

    # this ensures the test has a unique key in the
    # answer test storage file
    test.prefix = "my_unique_name"

    # this ensures a nice test name in nose's output
    test_my_ds.__name__ = test.description

    yield test

Another good example of an image comparison test is the PlotWindowAttributeTest defined in the answer testing framework and used in yt/visualization/tests/test_plotwindow.py. This test shows how a new answer test subclass can be used to programmatically test a variety of different methods of a complicated class using the same test class. This sort of image comparison test is more useful if you are finding yourself writing a ton of boilerplate code to get your image comparison test working. The GenericImageTest is more useful if you only need to do a one-off image comparison test.

Enabling Answer Tests on Jenkins

Before any code is added to or modified in the yt codebase, each incoming changeset is run against all available unit and answer tests on our continuous integration server. While unit tests are discovered automatically by nose itself, answer tests require a definition of which set of tests constitutes a given answer. Configuration for the integration server is stored in tests/tests.yaml in the main yt repository:

answer_tests:
   local_artio_000:
      - yt/frontends/artio/tests/test_outputs.py
# ...
other_tests:
   unittests:
      - '-v'
      - '-s'

Each element under answer_tests defines an answer name (local_artio_000 in the above snippet) and specifies a list of files/classes/methods that will be validated (yt/frontends/artio/tests/test_outputs.py in the above snippet). On the testing server this is translated to:

$ nosetests --with-answer-testing --local --local-dir ... --answer-big-data \
   --answer-name=local_artio_000 \
   yt/frontends/artio/tests/test_outputs.py

If the answer doesn’t exist on the server yet, nosetests is run twice, and during the first pass --answer-store is added to the command line.

Updating Answers

In order to regenerate answers for a particular set of tests it is sufficient to change the answer name in tests/tests.yaml e.g.:

--- a/tests/tests.yaml
+++ b/tests/tests.yaml
@@ -25,7 +25,7 @@
     - yt/analysis_modules/halo_finding/tests/test_rockstar.py
     - yt/frontends/owls_subfind/tests/test_outputs.py

-  local_owls_000:
+  local_owls_001:
     - yt/frontends/owls/tests/test_outputs.py

   local_pw_000:

would regenerate the answers for the OWLS frontend.

When adding tests to an existing set of answers (like local_owls_000 or local_varia_000), it is considered best practice to first submit a pull request adding the tests WITHOUT incrementing the version number. Then, allow the tests to run (resulting in “no old answer” errors for the missing answers). If no other failures are present, you can then increment the version number to regenerate the answers. This way, we can avoid accidentally covering up test breakages.

Adding New Answer Tests

In order to add a new set of answer tests, it is sufficient to extend the answer_tests list in tests/tests.yaml e.g.:

--- a/tests/tests.yaml
+++ b/tests/tests.yaml
@@ -60,6 +60,10 @@
     - yt/analysis_modules/absorption_spectrum/tests/test_absorption_spectrum.py:test_absorption_spectrum_non_cosmo
     - yt/analysis_modules/absorption_spectrum/tests/test_absorption_spectrum.py:test_absorption_spectrum_cosmo

+  local_gdf_000:
+    - yt/frontends/gdf/tests/test_outputs.py
+
+
 other_tests:
   unittests:

Restricting Python Versions for Answer Tests

If for some reason a test can be run only for a specific version of python, it is possible to indicate this by adding a [py2] or [py3] tag. For example:

answer_tests:
   local_test_000:
      - yt/test_A.py  # [py2]
      - yt/test_B.py  # [py3]

would result in test_A.py being run only for python2 and test_B.py being run only for python3.