Testing¶
yt includes a testing suite that one can run on the code base to ensure that no breaks in functionality have occurred. This suite is based on the pytest testing framework and consists of two types of tests:
- Unit tests. These make sure that small pieces of code run (or fail) as intended in predictable contexts. See Unit Testing.
- Answer tests. These generate outputs from the user-facing yt API and compare them against the outputs produced using an older, “known good”, version of yt. See Answer Testing.
These tests ensure consistency in results as development proceeds.
We recommend that developers run tests locally on changed features when developing to help ensure that the new code does not break any existing functionality. To further this goal and ensure that changes do not propagate errors or have unintended consequences on the rest of the codebase, the full test suite is run through our continuous integration (CI) servers. CI is run on every push to open pull requests, on a variety of computational platforms, using GitHub Actions and a continuous integration server at the University of Illinois. The full test suite may take several hours to run, so we do not recommend running it locally.
Unit Testing¶
What Do Unit Tests Do¶
Unit tests are tests that operate on some small piece of code and verify that it behaves as intended. In practice, this means that we write simple assertions (assert statements) about results and then let pytest run them. A test is considered a success when no error (in particular, no AssertionError) occurs.
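For example, a unit test is just a function whose name starts with test_ and whose body makes assertions; this trivial sketch (not taken from the yt codebase) passes because no AssertionError is raised:

def test_addition():
    # pytest reports this test as passing because the assertion holds
    assert 1 + 1 == 2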
How to Run the Unit Tests¶
One can run the unit tests by navigating to the base directory of the yt git repository and invoking pytest:
$ cd $YT_GIT
$ pytest
where $YT_GIT is the path to the root of the yt git repository.
If you only want to run tests in a specific file, you can do so by specifying the path of the test file relative to the $YT_GIT/ directory. For example, if you want to run the plot_window tests, you’d run:
$ pytest yt/visualization/tests/test_plotwindow.py
from the yt source code root directory.
Additionally, if you only want to run a specific test in a test file (rather than all of the tests contained in the file), such as test_all_fields in test_plotwindow.py, you can do so by running:
$ pytest yt/visualization/tests/test_plotwindow.py::test_all_fields
from the yt source code root directory.
See the pytest documentation for more on how to invoke pytest and select tests.
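For instance, pytest’s -k option selects tests by keyword expression. The following sketch reuses test names mentioned elsewhere in this document and runs only the matching tests from that file:

$ pytest yt/visualization/tests/test_plotwindow.py -k "test_all_fields or TestSetWidth"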
Unit Test Tools¶
yt provides several helper functions and decorators to write unit tests. These tools all reside in the yt.testing module. Describing them all in detail is outside the scope of this document, as in some cases they belong to other packages.
How To Write New Unit Tests¶
To create new unit tests:

1. Create a new tests/ directory next to the file containing the functionality you want to test and add an empty __init__.py file to it. If a tests/ directory already exists, there is no need to create a new one.
2. Inside this tests/ directory, create a new Python file prefixed with test_ and including the name of the functionality or source file being tested. If a file testing the functionality you’re interested in already exists, please add your tests to that existing file.
3. Inside this new test_ file, create functions prefixed with test_ that accept no arguments.
4. Each test function should do some work that tests some functionality and should also verify that the results are correct using assert statements or functions.
5. If a dataset is needed, use fake_random_ds, fake_amr_ds, or fake_particle_ds (the former two of which have support for particles that may be utilized) and be sure to test several combinations of nprocs so that domain decomposition can be tested as well.
6. To iterate over multiple options, or combinations of options, use the @pytest.mark.parametrize decorator.
For an example of how to write unit tests, look at the file yt/data_objects/tests/test_covering_grid.py, which covers a great deal of functionality.
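As a complementary illustration, here is a minimal sketch of such a test file; the module path, test name, and checked quantity are hypothetical, while fake_random_ds, its nprocs keyword, and the ("index", "ones") field are part of yt:

# hypothetical content of yt/some_module/tests/test_example.py
import pytest

from yt.testing import fake_random_ds


@pytest.mark.parametrize("nprocs", [1, 2, 4])
def test_cell_count_is_decomposition_independent(nprocs):
    # build a small in-memory dataset, split into `nprocs` subdomains
    ds = fake_random_ds(16, nprocs=nprocs)
    ad = ds.all_data()
    # the total number of cells should not depend on the domain decomposition
    assert ad["index", "ones"].size == 16**3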
Debugging Failing Tests¶
When writing new tests, one often exposes bugs or writes a test incorrectly, causing an exception to be raised or a failed test. To help debug issues like this, pytest can drop into a debugger whenever a test fails or raises an exception; this is enabled with the --pdb option described below.
In addition, one can debug more crudely using print statements. To do this, you can add print statements to the code as normal. However, the test runner will capture all print output by default. To ensure that output gets printed to your terminal while the tests are running, pass -s (which will disable stdout and stderr capturing) to the pytest executable:
$ pytest -s
Lastly, to quickly debug a specific failing test, it is best to only run that one test during your testing session. This can be accomplished by explicitly passing the name of the test function or class to pytest, as in the following example:
$ pytest yt/visualization/tests/test_plotwindow.py::TestSetWidth
This pytest invocation will only run the tests defined by the TestSetWidth class. See the pytest documentation for more on the various ways to invoke pytest.
Finally, to determine which test is failing while the tests are running, it helps to run the tests in “verbose” mode. This can be done by passing the -v option to the pytest executable:
$ pytest -v
All of the above pytest options can be combined. So, for example, to run the TestSetWidth tests with verbose output, letting the output of print statements come out on the terminal prompt, and enabling pdb debugging on errors or test failures, one would do:
$ pytest yt/visualization/tests/test_plotwindow.py::TestSetWidth -v -s --pdb
More pytest options can be found by using the --help flag:
$ pytest --help
Answer Testing¶
Note
This section documents answer tests run with pytest. The plan is to switch to using pytest for answer tests at some point in the future, but currently (July 2024), answer tests are still implemented and run with nose. We generally encourage developers to use pytest for any new tests, but if you need to change or update one of the older nose tests, or are, e.g., writing a new frontend, an older version of this documentation describes how the nose tests work.
Note
Given that nose never had support for Python 3.10 (which as of yt 4.4 is our oldest supported version), it is necessary to patch it to get tests running. This is the command we run on CI to this end:
$ find .venv/lib/python3.10/site-packages/nose -name '*.py' -exec sed -i -e s/collections.Callable/collections.abc.Callable/g '{}' ';'
What Do Answer Tests Do¶
Answer tests use actual data to test reading, writing, and various manipulations of that data. Answer tests are how we test frontends, as opposed to operations, in yt.
In order to ensure that each of these operations is performed correctly, we store gold standard versions of yaml files called answer files. More generally, an answer file is a yaml file containing the results of having run the answer tests, which can be compared to a reference, enabling us to verify that results do not drift over time.
How to Run the Answer Tests¶
In order to run the answer tests locally:

1. Create a directory to hold the data you’ll be using for the answer tests you’ll be writing or running. This directory should be outside the yt git repository, in a place that is logical to where you would normally store data.
2. Add folders of the required data to this directory. Other yt data, such as IsolatedGalaxy, can be downloaded to this directory as well.
3. Tell yt where it can find the data. This is done by setting the config parameter test_data_dir to the path of the directory with the test data downloaded from https://yt-project.org/data/. For example:
$ yt config set yt test_data_dir /Users/tomservo/src/yt-data
This should only need to be done once (unless you change where you’re storing the data, in which case you’ll need to repeat this step so yt looks in the right place).
4. Generate or obtain a set of gold standard answer files. In order to generate gold standard answer files, switch to a “known good” version of yt and then run the answer tests as described below. Once done, switch back to the version of yt you wish to test.

Now you’re ready to run the answer tests!
As an example, let’s focus on running the answer tests for the tipsy frontend. Let’s also assume that we need to generate a gold standard answer file. To do this, we first switch to a “known good” version of yt and run the following command from the top of the yt git directory (i.e., $YT_GIT) in order to generate the gold standard answer file:
Note
It’s possible to run the answer tests for all the frontends, but due to the large number of test datasets we currently use, this is not normally done except on the yt project’s continuous integration server.
$ cd $YT_GIT
$ pytest --with-answer-testing --answer-store --local-dir="$HOME/Documents/test" -k "TestTipsy"
The --with-answer-testing option tells pytest that we want to run answer tests. Without this option, the unit tests will be run instead of the answer tests. The --answer-store option tells pytest to save the results produced by each test to a local gold standard answer file. Omitting this option is how we tell pytest to compare the results to a gold standard instead. The --local-dir option specifies where the gold standard answer file will be saved (or is already located, in the case that --answer-store is omitted). The -k option tells pytest that we only want to run tests whose name matches the given pattern.
Note
The path specified by --local-dir can, but does not have to, be the same directory as the test_data_dir configuration variable. However, it is best practice to keep the data that serves as input to yt separate from the answers produced by yt’s tests.
Note
The value given to the -k option (e.g., “TestTipsy”) is the name of the class containing the answer tests. You do not need to specify the path.
The newly generated gold standard answer file will be named tipsy_answers_xyz.yaml, where xyz denotes the version number of the gold standard answers. The answer version number is determined by the answer_version attribute of the class being tested (e.g., TestTipsy.answer_version).
Note
Changes made to yt sometimes result in known, expected changes to the way certain operations behave, which necessitates updating the gold standard answer files. This is accomplished by changing the version number stored in the answer_version attribute of each affected answer test class (e.g., TestTipsy.answer_version).
Once the gold standard answer file has been generated, we switch back to the version of yt we want to test, recompile if necessary, and run the tests using the following command:
$ pytest --with-answer-testing --local-dir="$HOME/Documents/test" -k "TestTipsy"
The result of each test is printed to STDOUT. If a test passes, pytest prints a period. If a test fails, encounters an exception, or errors out for some reason, then an F is printed. Explicit descriptions for each test are also printed if you pass -v to the pytest executable. Similar to the unit tests, the -s and --pdb options can be passed as well.
How to Write Answer Tests¶
To add a new answer test:

1. Create a new directory called tests inside the directory where the component you want to test resides and add an empty __init__.py file to it.
2. Create a new file in the tests directory that will hold the new answer tests. The name of the file should begin with test_.
3. Create a new class whose name begins with Test (e.g., TestTipsy).
4. Decorate the class with pytest.mark.answer_test. This decorator is used to tell pytest which tests are answer tests; note that tests that do not have this decorator are considered to be unit tests.
5. Add the following three attributes to the class: answer_file = None, saved_hashes = None, and answer_version = "000". These attributes are used by the hashing fixture (discussed below) to automate the creation of new answer files as well as facilitate the comparison to existing answers.
6. Add methods to the class that test a number of different fields and data objects. If these methods perform calculations or data manipulation, they should store the result in an ndarray, if possible. This array should be added to the hashes dictionary (see below) like so: self.hashes.update({<test_name>: <array>}), where <test_name> is the name of the function from yt/utilities/answer_testing/answer_tests.py that is being used and <array> is the ndarray holding the result.
If you are adding to a frontend that has tests already, simply add methods to the existing test class.
There are several things that can make the test writing process easier:

- yt/utilities/answer_testing/testing_utilities.py contains a large number of helper functions.
- Most frontends end up needing to test much of the same functionality as other frontends. As such, a list of functions that perform such work can be found in yt/utilities/answer_testing/answer_tests.py.
- Fixtures! You can find the set of fixtures that have already been built for yt in $YT_GIT/conftest.py. If you need or want to add additional fixtures, please add them there.
- The parametrize decorator is extremely useful for performing iteration over various combinations of test parameters, and it should be used whenever possible. Using this decorator allows pytest to write the names and values of the test parameters to the generated answer files, which can make debugging failing tests easier, since one can easily see exactly which combination of parameters was used for a given test.
- It is also possible to employ the requires_ds decorator to ensure that a test does not run unless a specific dataset is found, but this is not necessary: if the dataset is parametrized over, then the ds fixture found in the root conftest.py file performs the same check and marks the test as failed if the dataset isn’t found.
Here is what a minimal example might look like for a new frontend:

# Content of yt/frontends/new_frontend/tests/test_outputs.py
import pytest

from yt.utilities.answer_testing.answer_tests import field_values

# Parameters to test with
ds1 = "my_first_dataset"
ds2 = "my_second_dataset"
field1 = ("Gas", "Density")
field2 = ("Gas", "Temperature")
obj1 = None
obj2 = ("sphere", ("c", (0.1, "unitary")))


@pytest.mark.answer_test
class TestNewFrontend:
    answer_file = None
    saved_hashes = None
    answer_version = "000"

    @pytest.mark.usefixtures("hashing")
    @pytest.mark.parametrize("ds", [ds1, ds2], indirect=True)
    @pytest.mark.parametrize("field", [field1, field2], indirect=True)
    @pytest.mark.parametrize("dobj", [obj1, obj2], indirect=True)
    def test_fields(self, ds, field, dobj):
        self.hashes.update({"field_values": field_values(ds, field, dobj)})
Answer test examples can be found in yt/frontends/enzo/tests/test_outputs.py.
Creating and Updating Image Baselines for pytest-mpl Tests¶
We use pytest-mpl for image comparison tests. These tests take the form of functions, which must be decorated with @pytest.mark.mpl_image_compare and return a matplotlib.figure.Figure object.
The collection of reference images is kept as a git submodule in tests/pytest_mpl_baseline/.
There are four situations where updating reference images may be necessary:

- adding new tests
- bugfixes
- intentional change of style in yt
- the old baseline fails with a new version of matplotlib, but the changes are not noticeable to the human eye
The process of updating images is the same in all cases. It involves opening two Pull Requests (PRs) that we’ll number PR1 and PR2.

1. Open a Pull Request (PR1) to yt’s main repository with the code changes.
2. Wait for the test jobs to complete.
3. Go to the “Checks” tab on the PR page (https://github.com/yt-project/yt/pull/<PR number>/checks).
   - If all tests passed, you’re done!
   - If tests other than image tests failed, fix them, and go back to step 2.
   - Otherwise, if only image tests failed, navigate to the “Build and Tests” job summary page.
4. At the bottom of that page, you’ll find “Artifacts”. Download yt_pytest_mpl_results.zip, unzip it, and open fig_comparison.html therein; this document is an interactive report of the test job. Inspect the failed test results and verify that any differences are either intended or insignificant. If they are not, fix the code and go back to step 2.
5. Clone https://github.com/yt-project/yt_pytest_mpl_baseline.git.
6. Download the other artifact (yt_pytest_mpl_new_baseline.zip), which contains the new baseline, and unzip it within your clone of yt_pytest_mpl_baseline.
7. Create a branch, commit all changes, and open a Pull Request (PR2) to https://github.com/yt-project/yt_pytest_mpl_baseline (PR2 should link to PR1).
8. Wait for this second PR to be merged.
9. Now it’s time to update PR1: navigate back to your local copy of yt’s main repository and run the following commands:
$ git submodule update --init
$ cd tests/pytest_mpl_baseline
$ git checkout main
$ git pull
$ cd ../
$ git add pytest_mpl_baseline
$ git commit -m "update image test baseline"
$ git push
10. Go back to step 2. This time everything should pass. If not, ask for help!
Note
Though it is technically possible to (re)generate reference images locally, it is best not to, because at a pixel level, matplotlib’s behaviour is platform-dependent. By letting CI runners generate images, we ensure pixel-perfect comparison is possible in CI, which is where image comparison tests are most often run.
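If you want to compare your local figures against the existing baseline (without regenerating it), you can run the image tests with pytest-mpl’s comparison mode; the test selection below is only an example, and yt’s CI configuration may pass additional options:

$ pytest --mpl --mpl-baseline-path=tests/pytest_mpl_baseline yt/visualization/tests/

Keep in mind the platform-dependence caveat above: small pixel-level differences on your machine do not necessarily indicate a real regression.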
How to Write Image Comparison Tests (deprecated API)¶
Warning
This section describes a deprecated API. New test code should follow the pytest-mpl approach described in Creating and Updating Image Baselines for pytest-mpl Tests above.
Many of yt’s operations involve creating and manipulating images. As such, we have a number of tests designed to compare images. These tests employ functionality from matplotlib to automatically compare images and detect differences, if any. Image comparison tests are used in the plotting and volume rendering machinery.
The easiest way to use the image comparison tests is to make use of the generic_image function. As an argument, this function takes a function the test machinery can call, which will save an image to disk. The test will then find any images that get created and compare them with the stored “correct” answer.
Here is an example test function (from yt/visualization/tests/test_raw_field_slices.py):
import pytest

import yt
from yt.utilities.answer_testing.answer_tests import generic_image
from yt.utilities.answer_testing.testing_utilities import data_dir_load, requires_ds

# Test data
raw_fields = "Laser/plt00015"


def compare(ds, field):
    def slice_image(im_name):
        sl = yt.SlicePlot(ds, "z", field)
        sl.set_log("all", False)
        image_file = sl.save(im_name)
        return image_file

    gi = generic_image(slice_image)
    # generic_image returns a list. In this case, there's only one entry,
    # which is a np array with the data we want
    assert len(gi) == 1
    return gi[0]


@pytest.mark.answer_test
@pytest.mark.usefixtures("temp_dir")
class TestRawFieldSlices:
    answer_file = None
    saved_hashes = None
    answer_version = "000"

    @pytest.mark.usefixtures("hashing")
    @requires_ds(raw_fields)
    def test_raw_field_slices(self, field):
        ds = data_dir_load(raw_fields)
        gi = compare(ds, field)
        self.hashes.update({"generic_image": gi})
Note
The inner function slice_image can create any number of images, as long as the corresponding filenames conform to the prefix.
Another good example of an image comparison test is the plot_window_attribute function defined in yt/utilities/answer_testing/answer_tests.py and used in yt/visualization/tests/test_plotwindow.py. This sort of image comparison test is more useful if you find yourself writing a ton of boilerplate code to get your image comparison test working. The generic_image function is more useful if you only need to do a one-off image comparison test.
Updating Answers¶
In order to regenerate answers for a particular set of tests, it is sufficient to change the answer_version attribute in the desired test class.
When adding tests to an existing set of answers (like local_owls_000.yaml or local_varia_000.yaml), it is considered best practice to first submit a pull request adding the tests WITHOUT incrementing the version number. Then, allow the tests to run (resulting in “no old answer” errors for the missing answers). If no other failures are present, you can then increment the version number to regenerate the answers. This way, we can avoid accidentally covering up test breakages.
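As an illustrative sketch (the class body is abbreviated and the new value is hypothetical), bumping the version for the Tipsy answers discussed above looks like this:

import pytest


@pytest.mark.answer_test
class TestTipsy:
    answer_file = None
    saved_hashes = None
    # bumping this string causes a new gold standard file
    # (e.g. tipsy_answers_001.yaml) to be written on the next --answer-store run
    answer_version = "001"  # previously "000"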
Handling yt Dependencies¶
Our dependencies are specified in pyproject.toml. Hard dependencies are found in project.dependencies, while optional dependencies are specified in project.optional-dependencies. The full target contains the specs to run our test suite, which are intended to be as modern as possible (we don’t set upper limits to versions unless we need to). The test target specifies the tools needed to run the tests, but not needed by yt itself.
Documentation and typechecking requirements are found in requirements/, and used in tests/ci_install.sh.
Python version support. We vow to follow numpy’s deprecation plan regarding our supported versions of Python and numpy, defined formally in NEP 29, but generally support larger version intervals than that document recommends.
Third party dependencies. We attempt to make yt compatible with a wide variety of upstream software versions. However, sometimes a specific version of a project that yt depends on causes breakage and must be blacklisted in the tests, or a more experimental, optionally required project might change sufficiently that the yt community decides not to support an old version of that project.
Note. Some of our optional dependencies are not trivial to install and their support may vary across platforms.
If you would like to add a new dependency for yt (even an optional dependency) or would like to update the version of a yt dependency, you must edit the pyproject.toml file. For new dependencies, simply append the name of the new dependency to the end of the relevant dependency list, along with a pin to the latest version number of the package. To update a package’s version, simply update the version number in the entry for that package.
Finally, we also run a set of tests with “minimal” dependencies installed. When adding tests that depend on an optional dependency, you can wrap the test with the yt.testing.requires_module decorator to ensure it does not run during the minimal dependency tests (see yt/frontends/amrvac/tests/test_read_amrvac_namelist.py for a good example).
If for some reason you need to update the listing of packages that are installed for the “minimal” dependency tests, you will need to update minimal_requirements.txt.