Haskell Version Lockdown

Update

This post contains all the details you need about freezing dependencies. We have since published an update that covers an existing solution you may find easier to use than the one described here, and explains how it is being integrated into cabal.

Motivation

Cabal is a great tool for library authors. As library authors ourselves, we could list a few minor nitpicks, but we have few substantial complaints.

What we often forget is that the needs of library authors are different from those of application builders. It has been said before that Cabal is not a package manager. But cabal-install's limitations go beyond not managing user dependencies properly: it still does not provide the essential tools that application builders need.

Application builders need to produce reliable, replayable builds. Haskellers will often attempt to do this with a .cabal file. But .cabal file versioning is meant for library authors to specify the widest version ranges that they hope will work with their package. Pegging packages to specific versions in a .cabal file will eventually fail because the dependencies of those dependencies are not pegged.

Solution Brief

At docmunch we came up with a simple solution to this problem: write out a file containing the exact versions of all packages being used and check it in to version control. The file looks like this:

components:
  executable 'foo':
    - deepseq-1.3.0.1
    - base-4.6.0.1
  test suite 'foo-test':
    - HUnit-1.2.5.2

Our build server will then use this lock file to guarantee that the binary that gets shipped to our application servers uses the same versions of dependencies as we used in development and testing.
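To make the layout of the lock file concrete, here is a minimal sketch of reading it back into a list of components using only base. The names Component and parseLock are hypothetical, and the parser assumes exactly the indented layout shown above; it is not a general YAML parser and is not the code we run in production.

```haskell
import Data.Char (isSpace)
import Data.List (dropWhileEnd, isPrefixOf, isSuffixOf)

-- | One component from the lock file: its header text and its pinned packages.
-- (Hypothetical type, introduced only for this illustration.)
data Component = Component
  { componentName :: String
  , componentDeps :: [String]
  } deriving (Eq, Show)

-- | Parse the simple layout shown above. Assumes every component header
-- ends in ':' and every pinned package sits on its own "- name-version" line.
parseLock :: String -> [Component]
parseLock = go . filter (not . null) . map trim . lines
  where
    go [] = []
    go (l : rest)
      | Just name <- header l =
          -- Collect the dependency lines that directly follow this header.
          let (deps, rest') = span isDep rest
          in Component name (map (drop 2) deps) : go rest'
      | otherwise = go rest

    isDep = ("- " `isPrefixOf`)

    header l
      | l == "components:" = Nothing      -- top-level key, not a component
      | ":" `isSuffixOf` l = Just (init l) -- strip the trailing colon
      | otherwise          = Nothing

-- | Strip leading and trailing whitespace.
trim :: String -> String
trim = dropWhileEnd isSpace . dropWhile isSpace
```

Keeping the component header as plain text (for example executable 'foo') avoids guessing at how component names should be interpreted; a real tool would probably want a proper YAML library instead.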

Related Work

or, just because Ruby does it does not mean that it is criminally unsafe.

Our solution is essentially the same as the Gemfile.lock used with Ruby gems, but not as heavyweight, since Haskell does not resolve dependencies at application startup the way Ruby does (no bundle exec is required).

Talking with some other industrial users also validated what we are doing: most adopt techniques for limiting what can be installed, and attempt to achieve the same end result.

Solution Details

Generating a .lock.yaml file

We build myProject.lock.yaml by setting cabal-version: >=1.8 and Build-Type: Custom in myProject.cabal. Then, we add this Setup.hs file to the root of our project directory:

Every time cabal runs the configure step, it will write out a new lock file. You will want to run your configure step with all components enabled (cabal configure --enable-tests, etc).

Just adding the myProject.lock.yaml file to your project will make dependency differences between users visible in code diffs, which is good, but it ends up creating conflicts if you don't actually use it for installation.

The Setup.hs code is slow and uses partial functions that we are not even sure are safe. We hope the community will fork the code, do a better job of figuring out the Cabal APIs, and help improve it.

Installing dependencies with your .lock.yaml file

The real gains from the .lock.yaml file approach, however, are from replayable builds. Here's a somewhat ugly but perfectly functional command to do exactly that:

cat myProject.lock.yaml | grep ' -' | cut -d ' ' -f 6 | sort | uniq | xargs cabal install -j --force-reinstalls;
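For anyone who would rather not depend on the exact column positions that cut relies on, the same extraction can be sketched in Haskell with only base. The function names lockedPackages and installArgs are hypothetical; the sketch assumes each pinned package sits on its own "- name-version" line, and it only builds the argument list rather than invoking cabal.

```haskell
import Data.Char (isSpace)
import Data.List (isPrefixOf, nub, sort)

-- | Every pinned package id in the lock file, sorted and de-duplicated
-- (the equivalent of the grep | cut | sort | uniq part of the pipeline).
lockedPackages :: String -> [String]
lockedPackages contents =
  nub . sort $
    [ takeWhile (not . isSpace) (drop 2 l)   -- drop the "- " marker
    | l <- map (dropWhile isSpace) (lines contents)
    , "- " `isPrefixOf` l
    ]

-- | Arguments to pass to cabal, using the same flags as the shell command.
installArgs :: [String] -> [String]
installArgs pkgs = ["install", "-j", "--force-reinstalls"] ++ pkgs
```

In GHCi, fmap (installArgs . lockedPackages) (readFile "myProject.lock.yaml") shows the arguments that would be handed to cabal.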

The file is in YAML format to make it easier to install dependencies for individual cabal components. If you install the latest version of yaml with cabal install yaml, you also get a yaml2json executable that we created. yaml2json myProject.lock.yaml will produce JSON, and there are many available command-line JSON tools (perhaps a commenter can point out a Haskell tool).

Library authors vs. Application developers

Library authors have no need for lock files since they need to build across as many versions as possible. However, they may find it useful to publish lock files of successfully built versions.

Application developers no longer need to peg packages to specific versions in their .cabal file. Instead, they specify the version ranges that they want to install from when they change or upgrade their dependencies.

Usage with cabal-meta

This versioning solution is similar to how cabal-meta functions. cabal-meta keeps a separate list of special packages to install and feeds that to cabal-install. cabal-meta was mainly designed to deal with building multiple local packages at once, which to a certain extent you can now do with the add-source command from cabal-dev or the new cabal sandbox feature. cabal-meta also helps automatically build remote dependencies. One problem with cabal-meta is that it is a separate executable, and nothing stops you from accidentally bypassing it and just using cabal or cabal-dev. We will look into cabal-meta integration with lock files in the future.

Usage with cabal 1.18 sandboxes

If you aren't using cabal sandboxes, please immediately stop what you are doing, read Mikhail's awesome introduction to Cabal Sandboxes, upgrade cabal, and type cabal sandbox init in your project. We have been using the new sandboxes for weeks now, and they are great. We don't have to remember to use cabal-dev instead of cabal; we just have to type cabal sandbox init once.

Another very important feature of Cabal 1.18 is the ability to build individual targets. So if you have a library foo and a test-suite foo-test, you can type cabal build foo to build the library and cabal build foo-test to build the test suite. This makes using a lock file easier because you can always start with cabal configure --enable-tests to write out the lock file and then choose your build target later.

Using Cabal 1.18 combined with our lock files means we now spend none of our time on installation issues that tools should easily solve for us. Right before we rolled out our lock file, one of our team members had an installation failure on the continuous integration server. This is the kind of build issue that is still commonplace with Haskell. After rolling out the lock file we had him re-merge his branch and install from the lock file, and the build passed, proving the value of what we had done.

It was just posted that Hackage2 is going to be officially released soon. In the span of a few months, package management in Haskell is going from ridiculously harder than in mainstream programming languages to being on par with them.