2.4. Inputs

2.4.1. Essential Input Parameters

The inputs of a tool are a list of input parameters that control how the tool runs. Each parameter has an id, which names the parameter, and a type, which describes what values are valid for that parameter.

Available primitive types are string, boolean, int, long, float, double, and null; complex types are array and record; in addition, there are the special types File, Directory, and Any.

The following example demonstrates input parameters of different types that appear on the command line in different ways.

First, create a file called inp.cwl, containing the following:

inp.cwl
#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
baseCommand: echo
inputs:
  example_flag:
    type: boolean
    inputBinding:
      position: 1
      prefix: -f
  example_string:
    type: string
    inputBinding:
      position: 3
      prefix: --example-string
  example_int:
    type: int
    inputBinding:
      position: 2
      prefix: -i
      separate: false
  example_file:
    type: File?
    inputBinding:
      prefix: --file=
      separate: false
      position: 4

outputs: []

Create a file called inp-job.yml:

inp-job.yml
example_flag: true
example_string: hello
example_int: 42
example_file:
  class: File
  path: whale.txt

Notice that “example_file”, as a File type, must be provided as an object with the fields class: File and path.

Next, create an empty file called whale.txt by running touch on the command line:

$ touch whale.txt

Now invoke cwltool with the tool description and the input object on the command line, using the command cwltool inp.cwl inp-job.yml. The following shows the command and its expected output:

$ cwltool inp.cwl inp-job.yml
INFO /opt/hostedtoolcache/Python/3.9.13/x64/bin/cwltool 3.1.20220913185150
INFO Resolved 'inp.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/inp.cwl'
INFO [job inp.cwl] /tmp/lcsrtu7y$ echo \
    -f \
    -i42 \
    --example-string \
    hello \
    --file=/tmp/gif_kzx0/stg6f45617e-72da-48dd-b667-2af2bfadbfc9/whale.txt
-f -i42 --example-string hello --file=/tmp/gif_kzx0/stg6f45617e-72da-48dd-b667-2af2bfadbfc9/whale.txt
INFO [job inp.cwl] completed success
{}
INFO Final process status is success

Tip

Where did those `/tmp` paths come from?

The CWL reference runner (cwltool) and other runners create temporary directories with symbolic (“soft”) links to your input files. This ensures that tools do not accidentally access files that were not explicitly specified.

The field inputBinding is optional and indicates whether and how the input parameter should appear on the tool’s command line. If inputBinding is missing, the parameter does not appear on the command line. Let’s look at each example in detail.

example_flag:
  type: boolean
  inputBinding:
    position: 1
    prefix: -f

Boolean types are treated as flags. If the input parameter “example_flag” is true, the prefix is added to the command line. If it is false, no flag is added.

example_string:
  type: string
  inputBinding:
    position: 3
    prefix: --example-string

String types appear on the command line as literal values. The prefix is optional; if provided, it appears as a separate argument on the command line before the parameter value. In the example above, this is rendered as --example-string hello.

example_int:
  type: int
  inputBinding:
    position: 2
    prefix: -i
    separate: false

Integer (and floating point) types appear on the command line in their decimal text representation. When the option separate is false (the default value is true), the prefix and value are combined into a single argument. In the example above, this is rendered as -i42.
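For contrast, here is a minimal sketch of the same parameter with separate left out, so that the default of true applies. Only the omission of separate differs from the definition above; with an input value of 42, the prefix and the value would then be passed as two arguments, -i 42.

```yaml
example_int:
  type: int
  inputBinding:
    position: 2
    prefix: -i
    # separate is not set, so it defaults to true:
    # the prefix and the value become two arguments (-i 42)
```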

example_file:
  type: File?
  inputBinding:
    prefix: --file=
    separate: false
    position: 4

File types appear on the command line as the path to the file. A question mark ? at the end of the parameter type indicates that the parameter is optional. In the example above, this is rendered as --file=/tmp/random/path/whale.txt. However, if the “example_file” parameter were not provided in the input, nothing would appear on the command line.
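As a sketch of the optional case, the input object below leaves out example_file. The file name inp-job-nofile.yml is hypothetical and used only for illustration; the other values are the same as in inp-job.yml.

```yaml
# inp-job-nofile.yml (hypothetical file name)
example_flag: true
example_string: hello
example_int: 42
# example_file is omitted, which is allowed because its type is File?
```

With this input object, the command line should be constructed as echo -f -i42 --example-string hello, with no --file= argument.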

Input files are read-only. If you wish to update an input file, you must first copy it to the output directory.

The value of position determines where the parameter should appear on the command line. Positions are relative to one another, not absolute. As a result, positions do not have to be sequential: three parameters with positions 1, 3, 5 will result in the same command line as 1, 2, 3. More than one parameter can have the same position (ties are broken using the parameter name), and the position field itself is optional. The default position is 0.
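The fragment below is a small sketch of these rules, using hypothetical parameter names. It assumes a tool whose baseCommand is echo.

```yaml
inputs:
  first:
    type: string
    inputBinding: {}      # no position given, so the default of 0 applies
  second:
    type: string
    inputBinding:
      position: 10        # gaps between positions do not matter
  third:
    type: string
    inputBinding:
      position: 20
```

Given the values a, b, and c for first, second, and third, the arguments should be ordered as echo a b c, the same as if the positions were 0, 1, and 2.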

The baseCommand field will always appear in the final command line before the parameters.
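To illustrate the earlier point that a parameter without inputBinding does not appear on the command line, here is a minimal sketch with hypothetical parameter names, again assuming a baseCommand of echo.

```yaml
inputs:
  message:
    type: string
    inputBinding:
      position: 1
  label:
    type: string
    # no inputBinding: a value for label must still be supplied in the
    # input object, but it is not added to the command line
```

With message set to hello and label set to run1, only hello should follow echo on the command line.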

2.4.2. Array Inputs

It is easy to add arrays of input parameters to the command line. There are two ways to specify an array parameter. The first is to provide a type field with type: array and items defining the valid data types that may appear in the array. Alternatively, brackets [] may be added after the type name to indicate that the input parameter is an array of that type.

array-inputs.cwl
#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
inputs:
  filesA:
    type: string[]
    inputBinding:
      prefix: -A
      position: 1

  filesB:
    type:
      type: array
      items: string
      inputBinding:
        prefix: -B=
        separate: false
    inputBinding:
      position: 2

  filesC:
    type: string[]
    inputBinding:
      prefix: -C=
      itemSeparator: ","
      separate: false
      position: 4

outputs:
  example_out:
    type: stdout
stdout: output.txt
baseCommand: echo

array-inputs-job.yml
filesA: [one, two, three]
filesB: [four, five, six]
filesC: [seven, eight, nine]

Now invoke cwltool providing the tool description and the input object on the command line:

$ cwltool array-inputs.cwl array-inputs-job.yml
INFO /opt/hostedtoolcache/Python/3.9.13/x64/bin/cwltool 3.1.20220913185150
INFO Resolved 'array-inputs.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/array-inputs.cwl'
INFO [job array-inputs.cwl] /tmp/bmin9qgq$ echo \
    -A \
    one \
    two \
    three \
    -B=four \
    -B=five \
    -B=six \
    -C=seven,eight,nine > /tmp/bmin9qgq/output.txt
INFO [job array-inputs.cwl] completed success
{
    "example_out": {
        "location": "file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt",
        "basename": "output.txt",
        "class": "File",
        "checksum": "sha1$91038e29452bc77dcd21edef90a15075f3071540",
        "size": 60,
        "path": "/home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt"
    }
}
INFO Final process status is success
$ cat output.txt
-A one two three -B=four -B=five -B=six -C=seven,eight,nine

The inputBinding can appear either on the outer array parameter definition or the inner array element definition, and these produce different behavior when constructing the command line, as shown above. In addition, the itemSeparator field, if provided, specifies that array values should be concatenated into a single argument separated by the item separator string.

Note that the arrays of inputs are specified inside square brackets [] in array-inputs-job.yml. Arrays can also be expressed over multiple lines, where array values that are not defined with an associated key are marked by a leading -. This will be demonstrated in the next lesson and is discussed in more detail in the YAML Guide. You can specify arrays of arrays, arrays of records, and other complex types.
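For reference, the same input object can be written in the multi-line form. This is a sketch of an equivalent array-inputs-job.yml; both forms describe the same YAML lists.

```yaml
filesA:
  - one
  - two
  - three
filesB:
  - four
  - five
  - six
filesC:
  - seven
  - eight
  - nine
```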

2.4.3. Advanced Inputs

Sometimes an underlying tool has several arguments that must be provided together (they are dependent) or several arguments that cannot be provided together (they are exclusive). You can use records and type unions to group parameters together to describe these two conditions.

record.cwl
#!/usr/bin/env cwl-runner

cwlVersion: v1.0
class: CommandLineTool
inputs:
  dependent_parameters:
    type:
      type: record
      name: dependent_parameters
      fields:
        itemA:
          type: string
          inputBinding:
            prefix: -A
        itemB:
          type: string
          inputBinding:
            prefix: -B
  exclusive_parameters:
    type:
      - type: record
        name: itemC
        fields:
          itemC:
            type: string
            inputBinding:
              prefix: -C
      - type: record
        name: itemD
        fields:
          itemD:
            type: string
            inputBinding:
              prefix: -D
outputs:
  example_out:
    type: stdout
stdout: output.txt
baseCommand: echo

record-job1.yml
dependent_parameters:
  itemA: one
exclusive_parameters:
  itemC: three

$ cwltool record.cwl record-job1.yml
INFO /opt/hostedtoolcache/Python/3.9.13/x64/bin/cwltool 3.1.20220913185150
INFO Resolved 'record.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/record.cwl'
ERROR Workflow error, try again with --debug for more information:
Invalid job input record:
record-job1.yml:1:1: the `dependent_parameters` field is not valid because
                       missing required field `itemB`

In the first example, you can’t provide itemA without also providing itemB.

record-job2.yml
dependent_parameters:
  itemA: one
  itemB: two
exclusive_parameters:
  itemC: three
  itemD: four

$ cwltool record.cwl record-job2.yml
INFO /opt/hostedtoolcache/Python/3.9.13/x64/bin/cwltool 3.1.20220913185150
INFO Resolved 'record.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/record.cwl'
record-job2.yml:6:3: Warning: invalid field `itemD`, expected one of: 'itemC'
INFO [job record.cwl] /tmp/futnxhto$ echo \
    -A \
    one \
    -B \
    two \
    -C \
    three > /tmp/futnxhto/output.txt
INFO [job record.cwl] completed success
{
    "example_out": {
        "location": "file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt",
        "basename": "output.txt",
        "class": "File",
        "checksum": "sha1$329fe3b598fed0dfd40f511522eaf386edb2d077",
        "size": 23,
        "path": "/home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt"
    }
}
INFO Final process status is success
$ cat output.txt
-A one -B two -C three

In the second example, itemC and itemD are exclusive, so only itemC is added to the command line and itemD is ignored.

record-job3.yml
dependent_parameters:
  itemA: one
  itemB: two
exclusive_parameters:
  itemD: four

$ cwltool record.cwl record-job3.yml
INFO /opt/hostedtoolcache/Python/3.9.13/x64/bin/cwltool 3.1.20220913185150
INFO Resolved 'record.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/record.cwl'
INFO [job record.cwl] /tmp/tpnbj3m9$ echo \
    -A \
    one \
    -B \
    two \
    -D \
    four > /tmp/tpnbj3m9/output.txt
INFO [job record.cwl] completed success
{
    "example_out": {
        "location": "file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt",
        "basename": "output.txt",
        "class": "File",
        "checksum": "sha1$77f572b28e441240a5e30eb14f1d300bcc13a3b4",
        "size": 22,
        "path": "/home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt"
    }
}
INFO Final process status is success
$ cat output.txt
-A one -B two -D four

In the third example, only itemD is provided, so it appears on the command line.