This should be familiar to most Python programmers, but here’s a brief summary
anyway: a requirements file contains a list of dependencies for an application
(not a library), often with
specific version information. Many requirements files are generated with
`pip freeze > requirements.txt`.¹ Here’s an example:
    $ cat requirements.txt
    blessings==1.6
    bpython==0.14.2
    curtsies==0.1.19
    flake8==2.4.1
    greenlet==0.4.7
    jedi==0.9.0
    mccabe==0.3.1
    msgpack-python==0.4.6
    neovim==0.0.38
    pep257==0.6.0
    pep8==1.5.7
    pyflakes==0.8.1
    Pygments==2.0.2
    requests==2.7.0
    six==1.9.0
Most importantly, requirements files serve several purposes:

1. They ensure that our environments are consistent across different machines. Reproducible environments are crucial to preventing bugs, incompatibilities, and other breakage introduced by changes between versions of libraries.
2. They communicate to our fellow developers what the code we’ve written relies on. Given the requirements file of a project, we can generally guess the kinds of things it’s going to do before we read one line of code: `requests` suggests that an application is going to communicate with HTTP servers, and `msgpack-python` tells us that we’ll probably be using msgpack as an interchange format.
3. They record exactly which library versions our applications run against, so that auditing services can check our deployments for available updates and security advisories.
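Because a pinned requirements file is machine-readable, tooling can consume it directly. Here’s a minimal sketch of parsing one in Python; it handles only the simple `name==version` lines shown above, not comments, extras, or environment markers:

```python
# Sketch: parse pinned "name==version" lines from a requirements file.
# Real requirements files also allow comments, extras, and markers;
# this only handles the plain pip-freeze style shown above.

def parse_pins(text):
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comments
        name, _, version = line.partition("==")
        pins[name] = version
    return pins

sample = """\
blessings==1.6
bpython==0.14.2
six==1.9.0
"""

print(parse_pins(sample))
# {'blessings': '1.6', 'bpython': '0.14.2', 'six': '1.9.0'}
```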
Refer again to the file above. Where did all of those things come from? Here’s the command that created that environment:

    $ pip install bpython flake8 jedi neovim pep257
Notice any differences? We asked for five packages and got fifteen back. That’s exactly what we want for purposes #1 and #3: for both of those use cases, we need to know exactly which library versions our application is being deployed against, and our update notifications and security alerts are only useful if the auditing services are checking the versions actually running on our servers. However, it does very little to address purpose #2: relaying information from one person to another.
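To make that five-versus-fifteen gap concrete, here’s a quick sketch separating the packages we named from the ones pip resolved for us, with the package names copied from the listings above:

```python
# Which of the fifteen pinned packages did we actually ask for?
requested = {"bpython", "flake8", "jedi", "neovim", "pep257"}
frozen = {
    "blessings", "bpython", "curtsies", "flake8", "greenlet",
    "jedi", "mccabe", "msgpack-python", "neovim", "pep257",
    "pep8", "pyflakes", "Pygments", "requests", "six",
}
transitive = sorted(frozen - requested)  # pulled in as dependencies
print(transitive)
print(f"{len(requested)} requested, {len(transitive)} transitive")
# 5 requested, 10 transitive
```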
Here’s how that `pip install` resulted in that list of requirements:

    $ pipdeptree
    bpython==0.14.2
      - Pygments [installed: 2.0.2]
      - requests [installed: 2.7.0]
      - curtsies [required: >=0.1.18, installed: 0.1.19]
        - blessings [required: >=1.5, installed: 1.6]
      - greenlet [installed: 0.4.7]
      - six [required: >=1.5, installed: 1.9.0]
    flake8==2.4.1
      - pyflakes [required: >=0.8.1, installed: 0.8.1]
      - mccabe [required: >=0.2.1, installed: 0.3.1]
      - pep8 [required: >=1.5.7, installed: 1.5.7]
    jedi==0.9.0
    neovim==0.0.38
      - msgpack-python [required: >=0.4.0, installed: 0.4.6]
      - greenlet [installed: 0.4.7]
    pep257==0.6.0
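pipdeptree builds this listing from installed package metadata. The rendering itself is simple; here’s a toy version that works from a hand-copied fragment of the graph above rather than real metadata (a real tool would query something like `importlib.metadata`):

```python
# Sketch: render a pipdeptree-style listing from a plain dict.
# The graph below is hand-copied from the neovim entry above.

def render(pkg, graph, depth=0, out=None):
    out = [] if out is None else out
    out.append("  " * depth + ("- " if depth else "") + pkg)
    for dep in graph.get(pkg, []):
        render(dep, graph, depth + 1, out)  # recurse into dependencies
    return out

graph = {
    "neovim==0.0.38": ["msgpack-python", "greenlet"],
}
print("\n".join(render("neovim==0.0.38", graph)))
# neovim==0.0.38
#   - msgpack-python
#   - greenlet
```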
It turns out that `neovim` required a bunch of libraries. For a brand-new virtualenv like this one, this is all pretty readable. Once we start looking at projects that have a lot of high-level dependencies, making sense of this output gets a lot harder. Additionally, when updates to libraries like `bpython` happen, e.g. when it drops support for old, outdated versions of Python, dependencies like `six` will be left over: unused by any library, but still sticking around with each new deployment.
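Detecting those leftovers is a reachability question: walk the dependency graph from your top-level packages and flag anything pinned that you never reach. A sketch, using a hypothetical graph in which `bpython` has dropped its dependency on `six`:

```python
# Sketch: find pinned packages no longer reachable from top-level requirements.
# Hypothetical scenario: bpython no longer depends on six, but six is
# still pinned in requirements.txt from an earlier pip freeze.

def reachable(roots, graph):
    seen = set()
    stack = list(roots)
    while stack:  # depth-first walk of the dependency graph
        pkg = stack.pop()
        if pkg in seen:
            continue
        seen.add(pkg)
        stack.extend(graph.get(pkg, []))
    return seen

graph = {"bpython": ["curtsies", "greenlet"], "curtsies": ["blessings"]}
pinned = {"bpython", "curtsies", "greenlet", "blessings", "six"}
leftover = pinned - reachable({"bpython"}, graph)
print(leftover)  # {'six'}
```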
Here’s where I say something controversial within the Python community: Ruby got it right.²
Bundler is a tool for managing Ruby environments; it can be seen as Ruby’s answer to virtualenv. In addition to isolating environments and installing dependencies, it provides separate files (`Gemfile` and `Gemfile.lock`) for human-readable and machine-readable³ dependency specifications. My proposal: the Python community needs to take a similar approach.
While we need better tools⁴ to address this problem, it’s largely a social one. Before we can solve it, we need to agree that the situation needs improvement. For now, I’m using the `pip-compile` command from pip-tools (this lets me keep separate `requirements.txt` files for my applications), and I think you should, too.
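For reference, the pip-tools split looks roughly like this: a hand-written `requirements.in` holding only top-level dependencies, and a generated, fully pinned `requirements.txt`. Both files below are hypothetical excerpts for the environment above, and the exact annotations vary by pip-tools version:

```text
# requirements.in (hand-written, human-readable)
bpython
flake8
jedi
neovim
pep257

# requirements.txt (generated by `pip-compile requirements.in`, excerpt)
blessings==1.6        # via curtsies
bpython==0.14.2
curtsies==0.1.19      # via bpython
```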
1. Please curate your requirements files with more care than this. Simply dumping the output from `pip freeze` will likely lead to packages that are meant solely for development becoming permanent members of your deployment environments.
2. While I’m at it, so did node.js. NPM includes a command called `shrinkwrap` that produces a full, version-locked list of dependencies based on `package.json`. Because of how Python’s import system works, this would be incredibly difficult (if not impossible) to pull off.
3. Both of these files are actually in machine-readable formats, but only the `Gemfile` addresses purpose #2 above.
4. There are several tools aside from `pip-compile` available. I considered each of these before finding and deciding on pip-tools. This list is here mainly as a reference for why each tool was not right for me.
   - pbundler: a clone of Bundler, last updated in 2012.
   - pundle: a clone of Bundler that reimplements standard Python tooling instead of working with it.
   - pundler: looks interesting; my second choice, but it is not as mature as pip-tools, and it was broken with the latest versions of pip when I last tried it (it tends to be difficult to use, write, and maintain software as a library when it was intended to be an application).