How to Develop yt¶
Note
If you already know how to use version control and are comfortable with handling it yourself, the quickest way to contribute to yt is to fork us on BitBucket, make your changes, and issue a pull request. The rest of this document is just an explanation of how to do that.
yt is a community project!
We are very happy to accept patches, features, and bugfixes from any member of the community! yt is developed using Mercurial, primarily because it enables very easy and straightforward submission of changesets. We’re eager to hear from you, and if you are developing yt, we encourage you to subscribe to the developer mailing list.
Please feel free to hack around, commit changes, and send them upstream. If you’re new to Mercurial, these three resources are pretty great for learning the ins and outs:
Keep in touch, and happy hacking! We also provide doc/coding_styleguide.txt and an example of a fiducial docstring in doc/docstring_example.txt. Please read them before hacking on the codebase, and feel free to email any of the mailing lists for help with the codebase.
Licensing¶
All contributed code must be GPL-compatible; we ask that you consider licensing under the GPL version 3, but we will consider submissions of code that are BSD-like licensed as well. If you’d rather not license in this manner, but still want to contribute, just drop me a line and I’ll put a link on the main wiki page to wherever you like!
Bootstrapping Your Development Environment¶
Getting up and running with developing yt can be somewhat daunting. To assist with that, yt provides a ‘bootstrap’ script that handles a couple of the more annoying items on the checklist – getting set up on BitBucket, creating a pasteboard, and adding a couple handy extensions to Mercurial. As time goes on, we hope that we will be able to use the extensions added during this process to both issue forks and pull requests to BitBucket, enabling much more rapid and easy development. To run the script, on the command line type:
$ yt bootstrap_dev
Note
Although the bootstrap script will manipulate and modify your ~/.hgrc and possibly your BitBucket repositories, it will ask before doing anything. You should feel free to Ctrl-C out at any time. If you wish to inspect the source code of the bootstrap script, it is located in yt/utilities/command_line.py in the function do_bootstrap.
Here is the list of items that the script will attempt to accomplish, along with a brief motivation of each.
- Ensure that the yt-supplemental repository is checked out into YT_DEST. To make sure that the extensions we’re going to use to facilitate Mercurial usage are checked out and ready to go, we optionally clone the repository here. If you’ve run with a recent install script, this won’t be necessary.
- Create an ~/.hgrc if it does not exist, and add your username. Because Mercurial’s changesets are all signed with a username, we make sure that your username is set in your ~/.hgrc. The script will prompt you for what you would like it to be. When committing to yt, we strongly prefer you set it to be of the form “Firstname Lastname <email@address.com>”. If you want to skip this step, simply set the configuration value yourself in ~/.hgrc. Any of the above-listed tutorials on hg can help with this.
- Create a BitBucket user if you do not have one. Because yt is developed on the source code hosting site BitBucket, we make sure that you’re set up to have a username there. You should not feel obliged to do this step if you do not want to, but it provides a much more convenient mechanism for sharing changes, reporting issues, and contributing to the yt wiki. It also provides a location to host an unlimited number of publicly accessible repositories, if you wish to share other pieces of code with other users. (See Included hg Extensions for more information about this.)
- Turn on the hgbb and cedit extensions in ~/.hgrc. This sets up these extensions, described below. It amounts to adding them to the [extensions] section and adding your BitBucket username to the [bb] section.
- Create a pasteboard repository. This is the step that is probably the most fun. yt now comes with pasteboard facilities. A pasteboard is like a pastebin, except designed to be more persistent – it’s a versioned repository that contains scripts with descriptions, which are automatically posted to the web. You can download from your pasteboard programmatically using the yt pasteboard command, and you can download from other pasteboards using the yt pasteboard_grab command. This repository will be created on BitBucket, and will be of the name your_username.bitbucket.org, which is also the web address it will be hosted at.
And that’s it! If you run into any trouble, please email yt-dev with your concerns, questions or error messages. This should put you in a good place to start developing yt efficiently.
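For readers who would rather make the ~/.hgrc changes by hand, here is a minimal Python 3 sketch of the kind of edit described in the list above. It is not the code of do_bootstrap. The function name ensure_hgrc_settings, the option name username under [bb], and the extension paths are assumptions made for illustration; [ui] username is Mercurial's standard setting for the committer name. configparser only understands simple hgrc files and drops comments when it rewrites one, so try this on a copy first.

```python
import configparser
import os
import tempfile


def ensure_hgrc_settings(path, username, bb_username, extensions):
    """Add a username and extension entries to an hgrc-style file.

    Existing values are left untouched; only missing ones are added.
    Returns the list of "section.option" keys that were added.
    """
    # hgrc files use an INI-like syntax, so configparser can read simple ones.
    config = configparser.ConfigParser(interpolation=None)
    config.optionxform = str  # keep option names case-sensitive
    config.read(path)  # a missing file is treated as empty

    wanted = [("ui", "username", username),
              ("bb", "username", bb_username)]  # option name is an assumption
    for name, location in extensions.items():
        wanted.append(("extensions", name, location))

    added = []
    for section, option, value in wanted:
        if not config.has_section(section):
            config.add_section(section)
        if not config.has_option(section, option):
            config.set(section, option, value)
            added.append("%s.%s" % (section, option))

    with open(path, "w") as handle:
        config.write(handle)
    return added


if __name__ == "__main__":
    # Write to a throwaway file rather than the real ~/.hgrc.
    demo_path = os.path.join(tempfile.mkdtemp(), "hgrc")
    print(ensure_hgrc_settings(
        demo_path,
        "Firstname Lastname <email@address.com>",
        "your_username",
        {"hgbb": "/hypothetical/path/hgbb.py",
         "cedit": "/hypothetical/path/cedit.py"},
    ))
```

Running the function a second time on the same file adds nothing, which mirrors the script's stated habit of not overwriting what you already have.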
Included hg Extensions¶
Mercurial is written in Python, and as such is easily extensible by scripts. It comes with a number of extensions (descriptions of which you can find on the Mercurial wiki under UsingExtensions). Some of my favorites are transplant, extdiff, color and progress. yt now comes bundled with a few additional extensions, which should make interacting with other repositories and BitBucket a bit easier.
The first of these is hgbb, which is a Mercurial extension that interacts with the public-facing BitBucket-API. It adds several commands, and you can get information about these commands by typing:
$ hg help COMMANDNAME
It also adds the URL specifier bb://USERNAME/reponame for convenience; this means you can reference bb://sskory/yt to see Stephen’s yt fork, for instance.
The most fun of these commands are:
- bbcreate
- This creates a new repository on BitBucket and clones it locally. This is really cool and very convenient when developing.
- bbforks
- This shows the status of all known forks of a given repository, and can show the incoming and outgoing changesets. You can use this to see what changesets are different between yours and another repository.
As time goes on, and as the BitBucket API is expanded to cover things like forking and pull requests, we hope that this extension will also expand.
The other extension that is currently bundled with yt is the cedit extension. This adds the ability to add, remove and set configuration options from the command line. This brings with it the ability to add new sources for Mercurial repositories – for instance, if you become aware of a different source repository you want to be able to pull from, you can add it as a source and then pull from it directly.
The new commands you may be interested in are:
- cedit
- Set an option in either the local or the global configuration file.
- addsource
- Add a mercurial repo to the [paths] section of the local repository.
How To Get The Source Code¶
If you just want to look at the source code, you already have it on your computer. Go to the directory where you ran the install_script.sh, then go to $YT_DEST/src/yt-hg . In this directory are a number of subdirectories with different components of the code, although most of them are in the yt subdirectory. Feel free to explore here. If you’re looking for a specific file or function in the yt source code, use the unix find command:
$ find <DIRECTORY_TREE_TO_SEARCH> -name '<FILENAME>'
The above command will find the FILENAME in any subdirectory in the DIRECTORY_TREE_TO_SEARCH. Alternatively, if you’re looking for a function call or a keyword in an unknown file in a directory tree, try:
$ grep -R <KEYWORD_TO_FIND> <DIRECTORY_TREE_TO_SEARCH>
This can be very useful for tracking down functions in the yt source.
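If you prefer to search from within Python, the two commands above can be approximated with the standard library. This is a minimal Python 3 sketch, not part of yt; the function names find_files and grep_tree are hypothetical. Unlike grep -R, grep_tree only looks inside files matching a pattern (*.py by default) and does plain substring matching.

```python
import fnmatch
import os


def find_files(root, pattern):
    """Return sorted paths under root whose file name matches pattern."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in fnmatch.filter(filenames, pattern):
            matches.append(os.path.join(dirpath, name))
    return sorted(matches)


def grep_tree(root, keyword, pattern="*.py"):
    """Return (path, line_number, line) for each line containing keyword."""
    hits = []
    for path in find_files(root, pattern):
        # errors="replace" keeps odd bytes from aborting the search.
        with open(path, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if keyword in line:
                    hits.append((path, number, line.rstrip("\n")))
    return hits
```

For example, grep_tree(root, "do_bootstrap"), with root set to the path of your source checkout, would list every line in the Python sources that mentions that function.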
While you can edit this source code and execute it on your local machine, you will be unable to share your work with others in the community (or get feedback on your work). If you want to submit your modifications to the yt project, follow the directions below.
How To Get The Source Code For Editing¶
yt is hosted on BitBucket, and you can see all of the yt repositories at http://hg.yt-project.org/ . With the yt installation script you should have a copy of Mercurial for checking out pieces of code. Make sure you have followed the steps above for bootstrapping your development environment (to ensure you have a BitBucket account, etc.).
In order to access the source code for yt, we ask that you make a “fork” of the main yt repository on BitBucket. A fork is simply an exact copy of the main repository (along with its history) that you own and can modify as you please. You can create a personal fork by visiting the yt BitBucket webpage at https://bitbucket.org/yt_analysis/yt/wiki/Home . After logging in, you should see an option near the top right labeled “fork”. Click this option, and then click the fork repository button on the subsequent page. You now have a forked copy of the yt repository for your own personal use.
This forked copy exists on the bitbucket repository, so in order to access it locally, follow the instructions at the top of that webpage for that forked repository, namely run at a local command line:
$ hg clone http://bitbucket.org/<USER>/<REPOSITORY_NAME>
This downloads that new forked repository to your local machine, so that you can access it, read it, make modifications, etc. It will put the repository in a local directory of the same name as the repository in the current working directory. You can see any past state of the code by using the hg log command. For example, the following command would show you the last 5 changesets (modifications to the code) that were submitted to that repository.
$ cd <REPOSITORY_NAME>
$ hg log -l 5
Using the revision specifier (the number or hash identifier next to each changeset), you can update the local repository to any past state of the code (a previous changeset or version) by executing the command:
$ hg up revision_specifier
Lastly, if you want to use this newly downloaded version of your yt repository as the active version of yt on your computer (i.e. the one which is executed when you run yt from the command line or from yt.mods import *), then you must “activate” it.
The first time you do this with a new repository, you have to copy some config files over from your yt installation directory (where yt was initially installed by install_script.sh). Try this:
$ cp $YT_DEST/src/yt-hg/*.cfg <REPOSITORY_NAME>
Then, every time you want to “activate” a different yt repository, run:
$ cd <REPOSITORY_NAME>
$ python2.7 setup.py develop
This will rebuild all C modules as well.
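After activating a repository it can be worth confirming which copy Python will actually import. The following is a minimal sketch that assumes a Python 3 interpreter (importlib.util.find_spec does not exist under the python2.7 shown above); the helper name module_location is hypothetical.

```python
import importlib.util


def module_location(name):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    # After "setup.py develop", this should point inside the activated
    # repository directory rather than the original installation.
    print(module_location("yt"))
```

If the printed path is not inside the repository you just activated, the develop step probably ran from a different directory.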
How To Submit Changes¶
So now you’ve made some cool modifications to the yt source, and you want to share them with the rest of the community. But wait, how can we trust that your modifications aren’t going to break the rest of yt? (After all, mine have several times!) That is why we have you submit your code using the “pull request” mechanism on BitBucket. A pull request submits your modifications for inclusion, but they must be reviewed and tested by other users of the code before they are pulled into the main repository.
When you’re ready to submit them to the main repository, simply go to the BitBucket page for your personal fork of the yt_analysis yt repository, and click the button to issue a pull request (at top right).
Make sure you notify yt_analysis and put in a little description. That’ll notify the core developers that you’ve got something ready to submit, and we will review it and (hopefully!) merge it in. If it goes well, you may end up with push access to the main repository.
How To Download/Test Someone Else’s Pull Request¶
Go to the bitbucket yt repository webpage. Follow the instructions for cloning the repo. You must clone a new version of yt on your local machine (for the purposes of testing) by running a command like:
$ hg clone https://bitbucket.org/yt_analysis/yt yt-testing
At the top of the webpage, click on the pull request tab, and click on the specific pull request you want to test.
Click on the changeset hash (e.g. 3ec2c245c827) just below the name of the pull request and the name of the user who submitted the pull request.
This page has the details of the changesets that were made in the pull request. To view how these results are different from the existing repository (i.e. the one that you already cloned), click the “compare fork” button in the upper right of the page.
This page has instructions on how to merge this pull request into your local cloned copy of yt (so that you can test it on your own machine). In this case, it should give you these commands (because it recognizes that the pull request came from the submitter’s forked branch of yt):
$ hg pull -r yt https://bitbucket.org/<USERNAME>/<REPOSITORY_NAME>
$ hg update yt
$ hg merge yt
After running these commands on your local testing copy of yt, you will want to use this testing yt as your default (at least temporarily) so that you can test how this version of yt behaves with whatever testing scripts you use. You can “activate” it by copying the config files into the new yt-testing directory and running setup.py from within it:
$ cp yt-old/src/yt-hg/*.cfg yt-testing
$ cd yt-testing
$ python setup.py develop
Now do whatever tests that you need to do on the pull-requested version of yt, using your normal from yt.mods import * or whatever. When you are done testing this version of yt, just get rid of the yt-testing directory tree, and “reactivate” your old version of yt:
$ rm -rf yt-testing
$ cd yt-old/src/yt-hg
$ python setup.py develop
If you want to accept the changeset or reject it (if you have sufficient privileges) or comment on it, you can do so from its pull request webpage.
How To Read The Source Code¶
yt is organized into several sub-packages, each of which governs a different conceptual regime.
- frontends
- This is where interfaces to codes are created. Within each subdirectory of yt/frontends/ there must exist the following files, even if empty:
- data_structures.py, where subclasses of AMRGridPatch, StaticOutput and AMRHierarchy are defined.
- io.py, where a subclass of IOHandler is defined.
- misc.py, where any miscellaneous functions or classes are defined.
- definitions.py, where any definitions specific to the frontend are defined (e.g., header formats).
- visualization
- This is where all visualization modules are stored. This includes plot collections, the volume rendering interface, and pixelization frontends.
- data_objects
- All objects that handle data, processed or unprocessed, not explicitly defined as visualization are located in here. This includes the base classes for data regions, covering grids, time series, and so on. This also includes derived fields and derived quantities.
- analysis_modules
- This is where all mechanisms for processing data live. This includes things like clump finding, halo profiling, halo finding, and so on. This is something of a catchall, but it serves as a level of greater abstraction than simply data selection and modification.
- gui
- This is where all GUI components go. Typically this will be some small tool used for one or two things, which contains a launching mechanism on the command line.
- utilities
- All broadly useful code that doesn’t clearly fit in one of the other categories goes here.
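The file layout required of each yt/frontends/ subdirectory is easy to check mechanically. Here is a minimal Python 3 sketch; the function name missing_frontend_files is hypothetical and not part of yt, and the file names are the four listed under frontends above.

```python
import os


def missing_frontend_files(frontend_dir):
    """Return the required files that are absent from frontend_dir."""
    # File names taken from the frontends entry in the list above.
    required = ("data_structures.py", "io.py", "misc.py", "definitions.py")
    return [name for name in required
            if not os.path.isfile(os.path.join(frontend_dir, name))]
```

An empty list means the directory has every file the layout asks for; the check says nothing about whether the expected subclasses are actually defined inside them.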
How To Use Branching¶
If you are planning on making a large change to the code base that may not be ready for many commits, or if you are planning on breaking some functionality and rewriting it, you are encouraged to create a new named branch. You can mark the current repository as a new named branch by executing:
$ hg branch new_feature_name
The next commit and all subsequent commits will be contained within that named branch. At this point, add your branch on the ExistingBranches wiki page.
To merge changes in from another branch, you would execute:
$ hg merge some_other_branch
Note also that you can use revision specifiers instead of some_other_branch. When you are ready to merge back into the main branch, execute this process:
$ hg merge name_of_main_branch
$ hg commit --close-branch
$ hg up -C name_of_main_branch
$ hg merge name_of_feature_branch
$ hg commit
When you execute the merge you may have to resolve conflicts. Once you resolve conflicts in a file, you can mark it as resolved by doing:
$ hg resolve -m path/to/conflicting/file.py
Please be careful when resolving conflicts in files.
Once your branch has been merged in, mark it as closed on the wiki page.
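If you merge feature branches back often, it may help to keep the sequence in one place. This minimal Python sketch only builds the command lists from the merge-back steps above and runs nothing; the function name merge_back_commands is hypothetical. Each entry could be handed to subprocess.check_call once you are happy with the order, stopping to resolve conflicts whenever a merge reports them.

```python
def merge_back_commands(feature_branch, main_branch):
    """Return the hg invocations, in order, for merging a branch back."""
    return [
        ["hg", "merge", main_branch],        # bring the main branch in
        ["hg", "commit", "--close-branch"],  # commit and close the feature
        ["hg", "up", "-C", main_branch],     # switch to the main branch
        ["hg", "merge", feature_branch],     # merge the feature into it
        ["hg", "commit"],                    # record the merge
    ]


if __name__ == "__main__":
    for argv in merge_back_commands("new_feature_name",
                                    "name_of_main_branch"):
        print("$ " + " ".join(argv))
```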
Code Style Guide¶
To keep things tidy, we try to stick with a couple simple guidelines.
General Guidelines¶
- In general, follow PEP-8 guidelines.
- Classes are ConjoinedCapitals, methods and functions are lowercase_with_underscores.
- Use 4 spaces, not tabs, to represent indentation.
- Line widths should not be more than 80 characters.
- Do not use nested classes unless you have a very good reason to, such as requiring a namespace or class-definition modification. Classes should live at the top level. __metaclass__ is exempt from this.
- Do not use unnecessary parentheses in conditionals. if((something) and (something_else)) should be rewritten as if something and something_else. Python is more forgiving than C.
- Avoid copying memory when possible. For example, don’t do a = a.reshape(3,4) when a.shape = (3,4) will do, and a = a * 3 should be na.multiply(a, 3, a).
- In general, avoid all double-underscore method names: __something is usually unnecessary.
- Doc strings should describe input, output, behavior, and any state changes that occur on an object. See the file doc/docstring_example.txt for a fiducial example of a docstring.
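As an illustration of several of these guidelines together, here is a small hypothetical class that is not taken from the yt source. It shows a ConjoinedCapitals class name, lowercase_with_underscores methods, 4-space indentation, a conditional without extra parentheses, and a docstring covering input, output and state changes. The docstring layout is an assumption; doc/docstring_example.txt remains the reference.

```python
class RunningTotal(object):
    """Accumulate values in place.

    Parameters
    ----------
    initial : float
        Starting value of the total.
    """

    def __init__(self, initial=0.0):
        self.total = initial
        self.count = 0

    def add_values(self, values, skip_negative=False):
        """Add each entry of values to the total.

        Parameters
        ----------
        values : iterable of float
            Numbers to accumulate.
        skip_negative : bool
            If True, negative entries are ignored.

        Returns
        -------
        float
            The updated total.

        State changes: self.total and self.count are modified.
        """
        for value in values:
            # No extra parentheses around the condition.
            if skip_negative and value < 0:
                continue
            self.total += value
            self.count += 1
        return self.total
```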
API Guide¶
- Do not import “*” from anything other than yt.funcs.
- Internally, only import from source files directly; instead of: from yt.visualization.api import PlotCollection do from yt.visualization.plot_collection import PlotCollection.
- Numpy is to be imported as na not np. While this may change in the future, for now this is the correct idiom.
- Do not use too many keyword arguments. If you have a lot of keyword arguments, then you are doing too much in __init__ and not enough via parameter setting.
- In function arguments, place spaces after commas. def something(a,b,c) should be def something(a, b, c).
- Don’t create a new class to replicate the functionality of an old class – replace the old class. Too many options makes for a confusing user experience.
- Parameter files external to yt are a last resort.
- The usage of the **kwargs construction should be avoided. If they cannot be avoided, they must be explained, even if they are only to be passed on to a nested function.
Variable Names and Enzo-isms¶
- Avoid Enzo-isms. This includes but is not limited to:
- Hard-coding parameter names that are the same as those in Enzo. The following translation table should be of some help. Note that the parameters are now properties on a StaticOutput subclass: you access them like pf.refine_by .
- RefineBy => refine_by
- TopGridRank => dimensionality
- TopGridDimensions => domain_dimensions
- InitialTime => current_time
- DomainLeftEdge => domain_left_edge
- DomainRightEdge => domain_right_edge
- CurrentTimeIdentifier => unique_identifier
- CosmologyCurrentRedshift => current_redshift
- ComovingCoordinates => cosmological_simulation
- CosmologyOmegaMatterNow => omega_matter
- CosmologyOmegaLambdaNow => omega_lambda
- CosmologyHubbleConstantNow => hubble_constant
- Do not assume that the domain runs from 0 to 1. This is not true everywhere.
- Variable names should be short but descriptive.
- No globals!
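When porting a script that still uses Enzo parameter names, the translation table above can be written down as a lookup. This is a minimal Python sketch with hypothetical helper names (enzo_to_yt_names, get_parameter, domain_fraction); it assumes only that pf exposes the listed attributes, as the text says a StaticOutput subclass does. The last helper shows one way to avoid assuming the domain runs from 0 to 1.

```python
def enzo_to_yt_names():
    """Return the Enzo parameter name -> yt attribute name table."""
    return {
        "RefineBy": "refine_by",
        "TopGridRank": "dimensionality",
        "TopGridDimensions": "domain_dimensions",
        "InitialTime": "current_time",
        "DomainLeftEdge": "domain_left_edge",
        "DomainRightEdge": "domain_right_edge",
        "CurrentTimeIdentifier": "unique_identifier",
        "CosmologyCurrentRedshift": "current_redshift",
        "ComovingCoordinates": "cosmological_simulation",
        "CosmologyOmegaMatterNow": "omega_matter",
        "CosmologyOmegaLambdaNow": "omega_lambda",
        "CosmologyHubbleConstantNow": "hubble_constant",
    }


def get_parameter(pf, enzo_name):
    """Look up an Enzo-named parameter through its yt attribute on pf."""
    return getattr(pf, enzo_to_yt_names()[enzo_name])


def domain_fraction(pf, position):
    """Express position as a fraction of the domain along each axis."""
    edges = zip(position, pf.domain_left_edge, pf.domain_right_edge)
    return [(x - left) / float(right - left) for x, left, right in edges]
```

New code should read pf.refine_by and friends directly; a lookup like this is only a bridge while old scripts are being converted.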
Project Ideas¶
There are lots of places in yt where new extensions could be added, or new functionality put in place. Here are a few.
Adding Support for a New Code¶
yt strives to be a general-purpose analysis tool for astrophysical data. To that end, we’d like to shore up our support for codes besides Enzo, as well as ensure that the other codes we support – Orion, Tiger, etc – are well-supported.
A page has been set up on the Trac site to describe the method of adding support for a new code to yt. Please feel free to use it as a reference, but if you would like some assistance, drop a line to one of the mailing lists (see Search/Ask the Mailing List) for more help.
GUIs and Interactive Exploration¶
The 1.7 release adds some functionality for interactive exploration of data, but this could be greatly expanded. In particular, the volume rendering could be converted to a hardware-based volume renderer (some simple sketches of this exist in the Mercurial repository), the VTK interface could be improved, and the various components that form the interface for examining data could be integrated.
Simulated Observations¶
The functionality to construct simulated observations is an end goal for yt. Transforming simulation output into mock observations is the Stanley Cup.
Bug Fixes¶
If you have simple bug fixes, please feel free to attach them to a ticket on the bug tracker (you might have to register first) or to email them to one of the developers directly. We’re always happy to hear about the things we’ve done wrong, and how you’ve fixed them!
Fields and Extensions¶
yt comes with a bunch of derived fields. However, if you have constructed some that add interesting analysis quantities, please feel free to send them to one of the developers!
Additionally, if you have a sub-module that extends yt in a fun or exciting way, we’d be very happy to include it.
Analysis Code and Examples¶
Because yt can be a bit difficult to become fully acquainted with, we encourage you to share your analysis scripts. Specifically, we will provide you with free repository space to store any analysis scripts that went into the writing of a paper. Through this, we hope to build up a library not only of usage-cases, but of real-world examples of plot generation and data analysis.
If you are interested in submitting your scripts, please contact Matt Turk at matthewturk@gmail.com, or just create a repository on BitBucket and send over the URL!