Conda build framework

Building a package requires a recipe. A recipe is a flat directory which contains the following files:

  • meta.yaml (metadata file)
  • build.sh (Unix build script which is executed using bash)
  • bld.bat (Windows build script which is executed using cmd)
  • run_test.py (optional Python test file)
  • patches to the source (optional, see below)
  • other resources, which are not included in the source and cannot be generated by the build scripts.

The same recipe should be used to build a package on all platforms.

When building a package, the following steps are invoked:

  1. read the metadata
  2. download the source (into a cache)
  3. extract the source in a source directory
  4. apply the patches
  5. create a build environment (build dependencies are installed here)
  6. run the actual build script. The current working directory is the source directory, and the environment variables described below are set. The build script installs into the build environment.
  7. do some necessary post processing steps: shebang, rpath, etc.
  8. add conda metadata to the build environment
  9. package up the new files in the build environment into a conda package
  10. test the new conda package:
      - create a test environment with the package (and its dependencies)
      - run the test scripts

There are example recipes for many conda packages in the conda-recipes repo.

The conda skeleton command can help to make skeleton recipes for common repositories, such as PyPI.

The meta.yaml file

All the metadata in the recipe is specified in the meta.yaml file. All sections are optional except for package/name and package/version.

package:
  name: bsdiff4       # lower case name of package, may contain '-' but no spaces
  version: "1.1.4"    # version of package. Should use the PEP-386 verlib
                      # conventions. Note that YAML will interpret
                      # versions like 1.0 as floats, meaning that 1.0 will
                      # be the same as 1. To avoid this, always put the
                      # version in quotes, so that it will be interpreted
                      # as a string.

source:
  # The source section specifies where the source code of the package is coming
  # from, it may be coming from a source tarball like:
  fn: bsdiff4-1.1.4.tar.gz
  url: https://pypi.python.org/packages/source/b/bsdiff4/bsdiff4-1.1.4.tar.gz
  md5: 29f6089290505fc1a852e176bd276c43
  sha1: f0a2c9a30073449cfb7d171c57552f3109d93894
  sha256: 5a022ff4c1d1de87232b1c70bde50afbb98212fd246be4a867d8737173cf1f8f
  # or from git:
  git_url: git@github.com:ilanschnell/bsdiff4.git
  git_tag: 1.1.4
  # or from hg:
  hg_url: ssh://hg@bitbucket.org/ilanschnell/bsdiff4
  hg_tag: 1.1.4
  # or from svn:
  svn_url: https://github.com/ilanschnell/bsdiff
  svn_rev: 1.1.4
  svn_ignore_externals: yes # (defaults to no)

  # Patches may optionally be applied to the source
  patches:
    - my.patch    # the patch file is expected to be found in the recipe

# Note, the source section is optional. If you want to specify a source
# location locally, the easiest way is to not specify the source here, but
# to just add something like
#
# cp -r $RECIPE_DIR/../src .
# cd src
# ...
#
# in build.sh (and similarly in bld.bat). This assumes the source is
# shipped alongside the recipe in src.

build:
  # The build number should be incremented for new builds of the same version
  number: 1       # (defaults to 0)
  string: abc     # (defaults to default conda build string plus the build number)

  # Optional Python entry points
  entry_points:
    # This creates an entry point named bsdiff4 that calls bsdiff4.cli.main_bsdiff4()
    - bsdiff4 = bsdiff4.cli:main_bsdiff4
    - bspatch4 = bsdiff4.cli:main_bspatch4

  # If osx_is_app is set, entry points will use python.app instead of python on Mac OS X
  osx_is_app: yes # (defaults to no)

  # Whether binary files should be made relocatable (using
  # install_name_tool on OS X or patchelf on Linux). See the "making
  # packages relocatable" section below for more information on this.
  binary_relocation: false # (defaults to true)

  # See the features section below for more information on features

  # Defines what features a package has
  features:
    - feature1

  # Indicates that installing this package should enable (track) the given
  # features. A package does not need to have a feature to track it.
  track_features:
    - feature2

  # Preserve the Python egg directory. This is needed for some packages
  # that use setuptools specific features.
  preserve_egg_dir: yes # (default no)

  # A regular expression describing files to not install using soft
  # links. If hard links are not possible and this is set, the package
  # will be installed via copying. By default all files are considered
  # safe for soft linking.
  no_softlink: (bin/path1\.py|bin/path2) # Don't softlink bin/path1.py or bin/path2

  # Used instead of build.sh or bld.bat. For short build scripts, this can
  # be more convenient. You may need to use selectors (see below) to use
  # different scripts for different platforms.
  script: python setup.py install

  # Files that should have the placeholder prefix
  # (/opt/anaconda1anaconda2anaconda3) replaced with the install prefix at
  # installation.  Note that conda build does this automatically for the
  # build prefix. See the "Making Packages Relocatable" section below.
  has_prefix_files:
    - bin/file1
    - lib/file2

# the build and runtime requirements. Dependencies of these requirements
# are included automatically.
requirements:
  # Packages required to build the package. python and numpy must be
  # listed explicitly if they are required.
  build:
    - python
  # Packages required to run the package. These are the dependencies that
  # will be installed automatically whenever the package is installed.
  run:
    - python
    - argparse # [py26]

test:
  # files which are copied from the recipe into the (temporary) test
  # directory which are needed during testing
  files:
    - test-data.txt
  # in addition to the run-time requirements, you can specify requirements
  # needed during testing. The run time requirements specified above are
  # included automatically.
  requires:
    - nose
  # commands that the package is expected to install and that must run
  # successfully
  commands:
    - bsdiff4 -h
    - bspatch4 -h
  # Python imports
  imports:
    - bsdiff4

  # The script run_test.py will be run automatically if it is part of the
  # recipe

about:
  home: https://github.com/ilanschnell/bsdiff4
  license: BSD
  summary: binary diff and patch using the BSDIFF4-format
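
The quoting advice for package/version can be illustrated without a YAML parser. The sketch below only assumes what the comment in the package section states, namely that an unquoted version such as 1.0 is read as a float while a quoted one stays a string. The helper name version_as_parsed is hypothetical and is not part of conda.

```python
def version_as_parsed(raw, quoted):
    """Mimic what a recipe reader sees for the version field.

    Assumption: an unquoted value like 1.0 is loaded as a float,
    while a quoted value is kept as a string.
    """
    return raw if quoted else float(raw)


# Unquoted: 1.0 compares equal to 1, and 1.10 collapses to 1.1.
print(version_as_parsed("1.0", quoted=False) == 1)
print(version_as_parsed("1.10", quoted=False) == 1.1)

# Quoted: the version text survives unchanged.
print(version_as_parsed("1.10", quoted=True))
```

This is why a release such as 1.10 could be confused with 1.1 unless the version is quoted.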

Specifying versions in requirements

Each element in the list of build and run-time requirements is a match specification, i.e. a string, which (when split by spaces) has 1, 2 or 3 parts:

  • the first part is always the (exact) name

  • the second part refers to the version, and may contain special characters

    | means “or”, e.g. 1.0|1.2 matches either version 1.0 or 1.2

    * means (in terms of regex) r'.*'

    Example:

    1.0|1.4* matches 1.0, 1.4, 1.4.1b2, but not 1.2 (when there are 3 parts, the second part has to be the exact version)

  • the third part is always the (exact) build string
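
The matching rules above can be sketched in a few lines of Python. This is an illustration of the description in this section, not conda's own implementation. The function names and the build string py27_0 are hypothetical.

```python
import re


def version_pattern(version_spec):
    """Translate a version part into a regex: '|' is 'or', '*' is '.*'."""
    alternatives = [
        re.escape(alt).replace(r"\*", ".*")
        for alt in version_spec.split("|")
    ]
    return re.compile("(?:" + "|".join(alternatives) + ")")


def spec_matches(spec, name, version, build):
    """Check one package (name, version, build) against a match specification."""
    parts = spec.split()
    if not 1 <= len(parts) <= 3:
        raise ValueError("a match specification has 1, 2 or 3 parts")
    if parts[0] != name:          # the name is always exact
        return False
    if len(parts) == 1:
        return True
    if len(parts) == 3:           # exact version and exact build string
        return parts[1] == version and parts[2] == build
    return version_pattern(parts[1]).fullmatch(version) is not None


for candidate in ["1.0", "1.4", "1.4.1b2", "1.2"]:
    print(candidate, spec_matches("bsdiff4 1.0|1.4*", "bsdiff4", candidate, "py27_0"))
```

With the specification bsdiff4 1.0|1.4*, the loop reports a match for 1.0, 1.4 and 1.4.1b2, and no match for 1.2.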

Preprocessing selectors

In addition, you can add selectors to any line, which are used as part of a preprocessing stage. Before the yaml file is read, each selector is evaluated, and if it is False, the line that it is on is removed. A selector is of the form # [<selector>] at the end of a line.

For example

source:
  url: http://path/to/unix/source    # [not win]
  url: http://path/to/windows/source # [win]

A selector is just a valid Python expression, which is evaluated during preprocessing. The following variables are defined. Unless otherwise stated, the variables are booleans.

linux True if the platform is Linux
linux32 True if the platform is Linux and the Python architecture is 32-bit
linux64 True if the platform is Linux and the Python architecture is 64-bit
armv6 True if the platform is Linux and the Python architecture is armv6l
osx True if the platform is OS X
unix True if the platform is Unix (OS X or Linux)
win True if the platform is Windows
win32 True if the platform is Windows and the Python architecture is 32-bit
win64 True if the platform is Windows and the Python architecture is 64-bit
py The Python version as a two digit string (like '27'). See also the CONDA_PY environment variable below.
py3k True if the Python major version is 3
py2k True if the Python major version is 2
py26 True if the Python version is 2.6
py27 True if the Python version is 2.7
py33 True if the Python version is 3.3
np The NumPy version as a two digit string (like '17'). See also the CONDA_NPY environment variable below.

Because the selector is any valid Python expression, complicated logic is possible.

source:
  url: http://path/to/windows/source      # [win]
  url: http://path/to/python2/unix/source # [unix and py2k]
  url: http://path/to/python3/unix/source # [unix and py3k]

Note that a selector deletes only the line that it is on, so you may need to put the same selector on multiple lines.

source:
  url: http://path/to/windows/source     # [win]
  md5: 30fbf531409a18a48b1be249052e242a  # [win]
  url: http://path/to/unix/source        # [unix]
  md5: 88510902197cba0d1ab4791e0f41a66e  # [unix]
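
The preprocessing stage can be approximated as follows. This is a minimal sketch of the behaviour described in this section (evaluate the selector, drop the line when it is False), not the code conda build actually uses. The regular expression and the function name apply_selectors are assumptions made for the example.

```python
import re

# Matches "<content>  # [<selector>]" at the end of a line.
SELECTOR_RE = re.compile(r"^(.*?)\s*#\s*\[(.+)\]\s*$")


def apply_selectors(text, variables):
    """Return text without the lines whose selector evaluates to False."""
    kept = []
    for line in text.splitlines():
        match = SELECTOR_RE.match(line)
        if match and not eval(match.group(2), {}, dict(variables)):
            continue  # selector is False: remove only this line
        kept.append(line)
    return "\n".join(kept)


recipe = """source:
  url: http://path/to/windows/source     # [win]
  md5: 30fbf531409a18a48b1be249052e242a  # [win]
  url: http://path/to/unix/source        # [unix]
  md5: 88510902197cba0d1ab4791e0f41a66e  # [unix]"""

print(apply_selectors(recipe, {"win": False, "unix": True}))
```

Because each line is judged on its own, the md5 line needs the same selector as the url line it belongs to. Otherwise both checksums would survive on every platform.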

Environment variables set during the build process

The following environment variables are set, both on Unix (build.sh) and on Windows (bld.bat) during the build process:

ARCH Either 32 or 64, to specify whether the build is 32-bit or 64-bit. The value depends on the ARCH environment variable, and defaults to the architecture the interpreter running conda was compiled with.
CONDA_BUILD=1 Always set.
SRC_DIR Path to where source is unpacked (or cloned). If the source file is not a recognized file type (right now, .zip, .tar, .tar.bz2, and .tar.xz), this is a directory containing a copy of the source file.
PREFIX Build prefix where build script should install to.
RECIPE_DIR Directory of recipe.
PKG_NAME Name of the package being built.
PKG_VERSION Version of the package being built.
PKG_BUILDNUM Build number of the package being built.
PATH Prepended by the build prefix bin directory.
PYTHON Path to python executable in build prefix (note that python is only installed in the build prefix when it is listed as a build requirement).
PY3K 1 when Python 3 is installed in build prefix, else 0.
STDLIB_DIR Python standard library location
SP_DIR Python’s site-packages location
PY_VER Python version building against

When building “unix-style” packages on Windows, which are then usually statically linked to executables, we do this in a special Library directory under the build prefix. The following environment variables are only defined in Windows:

LIBRARY_PREFIX <build prefix>\Library
LIBRARY_BIN <build prefix>\Library\bin
LIBRARY_INC <build prefix>\Library\include
LIBRARY_LIB <build prefix>\Library\lib
SCRIPTS <build prefix>\Scripts
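
The relationship between the build prefix and the Windows-only variables can be written down directly. The sketch below only restates the list above. The function name and the example prefix C:\bld are hypothetical, and ntpath is used so that the result is the same on any platform.

```python
import ntpath


def windows_library_vars(build_prefix):
    """Derive the Windows-only variables from the build prefix."""
    library = ntpath.join(build_prefix, "Library")
    return {
        "LIBRARY_PREFIX": library,
        "LIBRARY_BIN": ntpath.join(library, "bin"),
        "LIBRARY_INC": ntpath.join(library, "include"),
        "LIBRARY_LIB": ntpath.join(library, "lib"),
        "SCRIPTS": ntpath.join(build_prefix, "Scripts"),
    }


for name, value in sorted(windows_library_vars("C:\\bld").items()):
    print(name, value)
```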

On non-Windows (Linux and Mac OS X), we have:

PKG_CONFIG_PATH Path to pkgconfig directory.
HOME Standard $HOME environment variable.

On Mac OS X, we have:

OSX_ARCH i386 or x86_64, depending on Python build
CFLAGS -arch flag.
CXXFLAGS Same as CFLAGS.
LDFLAGS Same as CFLAGS.
MACOSX_DEPLOYMENT_TARGET Same as the Anaconda Python. Currently 10.5.

On Linux, we have:

LD_RUN_PATH <build prefix>/lib

All of the above environment variables are also set during the test process, except with the test prefix instead of the build prefix everywhere.

Note that build.sh is run with bash -x -e (the -x makes it echo each command that is run, and the -e makes it exit whenever a command in the script returns a nonzero exit status). You can revert this in the script if you need to by using the set command.

Environment variables that affect the build process

CONDA_PY Should be 26, 27, or 33. This is the Python version used to build the package.
CONDA_NPY Should be either 16 or 17. This is the NumPy version used to build the package.
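
The Python-related selector variables listed earlier line up with the CONDA_PY value. The following sketch shows one plausible way to derive them. It assumes the selectors are computed directly from the two digit string, which this document implies but does not spell out. The function name python_selectors is hypothetical.

```python
def python_selectors(conda_py):
    """Build the Python-related selector variables from a CONDA_PY value."""
    major = conda_py[0]
    return {
        "py": conda_py,            # two digit string, e.g. '27'
        "py2k": major == "2",
        "py3k": major == "3",
        "py26": conda_py == "26",
        "py27": conda_py == "27",
        "py33": conda_py == "33",
    }


print(python_selectors("33"))
```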

Pre/Post link/unlink scripts

You can add scripts pre-link.sh, post-link.sh, or pre-unlink.sh (or .bat for Windows) to the recipe, which will be run before the package is installed, after it is installed, and before it is removed, respectively. If these scripts exit nonzero the installation/removal will fail.

Environment variables are set in these scripts:

PREFIX The install prefix.
PKG_NAME The name of the package.
PKG_VERSION The version of the package.
PKG_BUILDNUM The build number of the package.

No output is shown from these scripts, but they may write to $PREFIX/.messages.txt, which is shown after conda completes all actions.
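
A link script can leave a note for the user by appending to the messages file. The helper below is a sketch only. It assumes a Python interpreter is available when the script runs, and the function name add_install_message is hypothetical.

```python
import os


def add_install_message(prefix, message):
    """Append one line to <prefix>/.messages.txt and return the file path."""
    path = os.path.join(prefix, ".messages.txt")
    with open(path, "a") as handle:
        handle.write(message.rstrip("\n") + "\n")
    return path


# In a real link script the prefix would come from the PREFIX variable:
#     add_install_message(os.environ["PREFIX"], "bsdiff4: post-link finished")
```

Opening the file in append mode keeps messages written by other packages during the same conda run.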

Post-build version

In some cases, you may not know the version of the package until after it is built. In this case, you can write a file named __conda_version__.txt to the source directory, and the contents of the file will be used as the version (and the version from the meta.yaml will be ignored).
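
A build step written in Python could record the version like this. It is a minimal sketch: the function name write_conda_version is hypothetical, and whether a trailing newline is tolerated is not stated here, so none is written.

```python
import os


def write_conda_version(src_dir, version):
    """Write the version to __conda_version__.txt in the source directory."""
    path = os.path.join(src_dir, "__conda_version__.txt")
    with open(path, "w") as handle:
        handle.write(version)
    return path


# During a build, the source directory is available as SRC_DIR:
#     write_conda_version(os.environ["SRC_DIR"], "1.1.4")
```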

Features

Features are a way to track differences in two packages that have the same name and version. For example, a feature might indicate a specialized compiler or runtime, or a fork of a package. The canonical example of a feature is the mkl feature in Anaconda Accelerate. Packages that are compiled against MKL, such as NumPy, have the mkl feature set. The mkl package has the mkl feature set in track_features, so that installing it installs the mkl feature.

Features should be thought of as features of the environment the package is installed into, not of the package itself. The reason is that when a feature is installed, conda automatically switches to a package with that feature if one exists. For instance, when the mkl feature is installed, regular numpy is removed and the numpy package with the mkl feature is installed. Enabling a feature does not install any packages that are not already installed, but all packages with that feature that are later installed into that environment will be preferred.

Feature names are independent of package names—it is a coincidence that mkl is both the name of a package and the feature that it tracks.

To install a feature, install a package that tracks it. To remove a feature, use conda remove --features
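
The preference described in this section can be pictured with a toy model. The selection rule below is an assumption made for illustration (prefer the build with the most enabled features) and is not how conda resolves packages internally. All names are hypothetical.

```python
def preferred_build(candidates, enabled_features):
    """Pick one build of a package for an environment.

    candidates: list of (label, set_of_features) for the same package
    name and version.
    enabled_features: set of features tracked in the environment.
    Toy rule (an assumption): among the builds whose features are all
    enabled, choose the one that has the most features.
    """
    usable = [c for c in candidates if c[1] <= enabled_features]
    if not usable:
        return None
    return max(usable, key=lambda c: len(c[1]))[0]


numpy_builds = [("regular numpy", set()), ("numpy with mkl", {"mkl"})]
print(preferred_build(numpy_builds, set()))
print(preferred_build(numpy_builds, {"mkl"}))
```

With no feature enabled the regular build is chosen, and once mkl is tracked in the environment the mkl build is preferred.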

Making Packages Relocatable

Often, the most difficult thing about building a conda package is making it relocatable. Relocatable means that the package can be installed into any prefix. Otherwise, the package would only be usable in the same environment in which it was built.

Conda build does the following things automatically to make packages relocatable:

  • Binary object files are converted to use relative paths using install_name_tool on Mac OS X and patchelf on Linux.
  • The build prefix is replaced in any text (non-binary) file with the prefix placeholder, /opt/anaconda1anaconda2anaconda3, and the file is added to the has_prefix file in the package metadata. When conda installs the package, the placeholder prefix is replaced with the install prefix in all files in info/has_prefix. See Package metadata for more information.
  • You can manually add files to has_prefix by listing them in build/has_prefix_files in the meta.yaml (see above). The files listed here should have the placeholder prefix (/opt/anaconda1anaconda2anaconda3).
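
The placeholder round trip for text files can be sketched as two string replacements. This illustrates the idea only and is not conda's implementation. The function names and the example paths are hypothetical.

```python
PLACEHOLDER = "/opt/anaconda1anaconda2anaconda3"


def to_placeholder(text, build_prefix):
    """Build time: swap the build prefix for the placeholder."""
    return text.replace(build_prefix, PLACEHOLDER)


def to_install_prefix(text, install_prefix):
    """Install time: swap the placeholder for the real install prefix."""
    return text.replace(PLACEHOLDER, install_prefix)


script = "#!/home/user/build-env/bin/python\nprint('hi')\n"
packaged = to_placeholder(script, "/home/user/build-env")
installed = to_install_prefix(packaged, "/srv/envs/demo")
print(packaged.splitlines()[0])
print(installed.splitlines()[0])
```

The packaged file never contains the build prefix, so the same package can be installed into any prefix.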