2.13. Using containers

2.13.1. Running tools inside Docker

Docker containers simplify software installation by providing a complete known-good runtime for software and its dependencies. However, containers are also purposefully isolated from the host system, so running a tool inside a Docker container requires additional work to ensure that input files are available inside the container and that output files can be recovered from it. A CWL runner can perform this work automatically, allowing you to use Docker to simplify your software management while avoiding the complexity of invoking and managing Docker containers.

One of the responsibilities of the CWL runner is to adjust the paths of input files to reflect the location where they appear inside the container.

The following example runs a simple Node.js script inside a Docker container; the script prints “Hello World” to standard output.

docker.cwl
#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
baseCommand: node
hints:
  DockerRequirement:
    dockerPull: node:slim
inputs:
  src:
    type: File
    inputBinding:
      position: 1
outputs:
  example_out:
    type: stdout
stdout: output.txt

docker-job.yml
src:
  class: File
  path: hello.js

Before we run this, let's break it down and see what each part does. Most of it has been explained in previous sections; the only part that is really new is the DockerRequirement section.

baseCommand: node
hints:
  DockerRequirement:
    dockerPull: node:slim

baseCommand: node tells CWL which command to run, and the DockerRequirement entry under hints tells the runner to run that command inside a container and how to find the container image. The dockerPull: parameter takes the same value that you would pass to a docker pull command, that is, the name of the container image. You can also specify the tag, which is good practice when using containers for reproducible research. In this case we have used the image node:slim.
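To make the name-and-tag structure of a dockerPull value concrete, here is a minimal Python sketch. The function split_image_reference is a hypothetical helper written for this illustration; it is not part of cwltool or Docker, and it deliberately ignores digest references of the form name@sha256:....

```python
def split_image_reference(reference):
    """Split an image reference 'name[:tag]' into (name, tag).

    Hypothetical helper for illustration only. When no tag is given,
    Docker falls back to the tag 'latest', so the same default is used here.
    """
    # A colon separates the tag only when it comes after the last '/';
    # otherwise it belongs to a registry port such as 'localhost:5000/node'.
    last_slash = reference.rfind("/")
    colon = reference.rfind(":")
    if colon > last_slash:
        return reference[:colon], reference[colon + 1:]
    return reference, "latest"


name, tag = split_image_reference("node:slim")
print(name, tag)
```

An untagged reference such as node resolves to the moving latest tag, which is why naming an explicit tag helps reproducibility.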

Create a “hello.js” file and invoke cwltool with the tool description and the input object on the command line:

$ echo "console.log(\"Hello World\");" > hello.js
$ cwltool docker.cwl docker-job.yml
INFO /opt/hostedtoolcache/Python/3.9.13/x64/bin/cwltool 3.1.20220913185150
INFO Resolved 'docker.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/using-containers/docker.cwl'
INFO ['udocker', 'pull', 'node:slim']
Info: downloading layer sha256:31b3f1ad4ce1f369084d0f959813c51df0ca17d9877d5ee88c2db6ff88341430
Info: downloading layer sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
Info: downloading layer sha256:c973db49e5a069ed8510ed67e27bf48843590ef201615806a2f8eed21c0a52c8
Info: downloading layer sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
Info: downloading layer sha256:d5e77defed7104abe7707cb4c0f3e507468f8a7a4ad8bd5a4292846eea72709d
Info: downloading layer sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
Info: downloading layer sha256:03a8016e9395070743fecbc7275973d378ad60de29f9158ca6815fe9ed2e0208
Info: downloading layer sha256:c8fcf5ec85715a1345a4dfdd513b19d26a5635eefacae5e3c2745f23827ac617
Info: downloading layer sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
Info: downloading layer sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4
INFO [job docker.cwl] /tmp/iom4rcrl$ udocker \
    --quiet \
    run \
    --volume=/tmp/iom4rcrl:/EMFdDO \
    --volume=/tmp/ekmngf89:/tmp \
    --volume=/home/runner/work/user_guide/user_guide/src/_includes/cwl/using-containers/hello.js:/var/lib/cwl/stgf9af1beb-04bf-4523-a14d-8542cf06572a/hello.js \
    --workdir=/EMFdDO \
    --rm \
    --env=TMPDIR=/tmp \
    --env=HOME=/EMFdDO \
    node:slim \
    node \
    /var/lib/cwl/stgf9af1beb-04bf-4523-a14d-8542cf06572a/hello.js > /tmp/iom4rcrl/output.txt
INFO [job docker.cwl] Max memory used: 20MiB
INFO [job docker.cwl] completed success
{
    "example_out": {
        "location": "file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/using-containers/output.txt",
        "basename": "output.txt",
        "class": "File",
        "checksum": "sha1$648a6a6ffffdaa0badb23b8baf90b6168dd16b3a",
        "size": 12,
        "path": "/home/runner/work/user_guide/user_guide/src/_includes/cwl/using-containers/output.txt"
    }
}
INFO Final process status is success
$ cat output.txt
Hello World

Notice that the CWL runner has constructed a Docker command line (here via udocker) to run the script.

In this example, the path to the script hello.js is /home/runner/work/user_guide/user_guide/src/_includes/cwl/using-containers/hello.js outside the container but /var/lib/cwl/stgf9af1beb-04bf-4523-a14d-8542cf06572a/hello.js inside the container, as reflected in the invocation of the node command.
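The path rewriting described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how cwltool is actually implemented: build_container_command is a hypothetical helper, the /var/lib/cwl staging root and the stg prefix are copied from the log above, and the host path /home/me/hello.js is made up for the example.

```python
import posixpath
import uuid


def build_container_command(image, base_command, host_inputs,
                            staging_root="/var/lib/cwl", engine="docker"):
    """Return (argv, path_map) for running base_command on host_inputs.

    Hypothetical helper for illustration; a real CWL runner also mounts
    working and temporary directories and sets environment variables.
    """
    argv = [engine, "run", "--rm"]
    path_map = {}
    for host_path in host_inputs:
        # Give each input its own staging directory so basenames cannot clash.
        staging_dir = posixpath.join(staging_root, "stg" + str(uuid.uuid4()))
        container_path = posixpath.join(staging_dir, posixpath.basename(host_path))
        path_map[host_path] = container_path
        # Bind-mount the host file at its location inside the container.
        argv.append(f"--volume={host_path}:{container_path}")
    argv.append(image)
    argv.append(base_command)
    # The tool itself only ever sees the container-side paths.
    argv.extend(path_map[p] for p in host_inputs)
    return argv, path_map


argv, path_map = build_container_command("node:slim", "node", ["/home/me/hello.js"])
print(" ".join(argv))
```

The key point is that the same mapping is used twice: once to build the --volume arguments and once to rewrite the arguments passed to the tool, so the two can never disagree.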