2.4. Inputs
2.4.1. Essential Input Parameters
The inputs field of a tool is a list of input parameters that control how the tool is run. Each parameter has an id, which names the parameter, and a type, which describes what values are valid for that parameter.
Available primitive types are string, boolean, int, long, float, double, and null; complex types are array and record; in addition there are special types File, Directory and Any.
The following example demonstrates input parameters of different types that appear on the command line in different ways.
First, create a file called inp.cwl, containing the following:
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool
baseCommand: echo
inputs:
  example_flag:
    type: boolean
    inputBinding:
      position: 1
      prefix: -f
  example_string:
    type: string
    inputBinding:
      position: 3
      prefix: --example-string
  example_int:
    type: int
    inputBinding:
      position: 2
      prefix: -i
      separate: false
  example_file:
    type: File?
    inputBinding:
      prefix: --file=
      separate: false
      position: 4
outputs: []
Create a file called inp-job.yml:
example_flag: true
example_string: hello
example_int: 42
example_file:
  class: File
  path: whale.txt
Notice that “example_file”, as a File type, must be provided as an object with the fields class: File and path.
Next, create an empty file called whale.txt by running touch whale.txt on the command line.
$ touch whale.txt
Now invoke cwltool with the tool description and the input object on the command line, using the command cwltool inp.cwl inp-job.yml. The following boxed text shows this command and its expected output:
$ cwltool inp.cwl inp-job.yml
INFO /opt/hostedtoolcache/Python/3.9.13/x64/bin/cwltool 3.1.20220913185150
INFO Resolved 'inp.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/inp.cwl'
INFO [job inp.cwl] /tmp/lcsrtu7y$ echo \
-f \
-i42 \
--example-string \
hello \
--file=/tmp/gif_kzx0/stg6f45617e-72da-48dd-b667-2af2bfadbfc9/whale.txt
-f -i42 --example-string hello --file=/tmp/gif_kzx0/stg6f45617e-72da-48dd-b667-2af2bfadbfc9/whale.txt
INFO [job inp.cwl] completed success
{}
INFO Final process status is success
Tip
Where did those `/tmp` paths come from?
The CWL reference runner (cwltool) and other runners create temporary directories with symbolic (“soft”) links to your input files. This ensures that tools do not accidentally access files that were not explicitly specified.
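The staging idea in this tip can be sketched in a few lines of Python. This is a simplified illustration only, not cwltool's actual code: the function name stage_inputs and the "stg" directory prefix are hypothetical, and real runners do considerably more work.

```python
import os
import tempfile


def stage_inputs(paths):
    """Link each input file into a fresh temporary directory.

    Returns the staged paths, which are what the tool would see
    on its command line instead of the original locations.
    """
    stage_dir = tempfile.mkdtemp(prefix="stg")  # hypothetical prefix
    staged = []
    for path in paths:
        link = os.path.join(stage_dir, os.path.basename(path))
        # A symbolic link exposes only this one file, not its siblings.
        os.symlink(os.path.abspath(path), link)
        staged.append(link)
    return staged


if __name__ == "__main__":
    # Self-contained demo: create an empty input file, then stage it.
    work_dir = tempfile.mkdtemp()
    whale = os.path.join(work_dir, "whale.txt")
    open(whale, "w").close()
    print(stage_inputs([whale]))
```

The printed path points into a temporary directory, much like the --file=/tmp/... argument in the output above.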
The field inputBinding is optional and indicates whether and how the input parameter should appear on the tool’s command line. If inputBinding is missing, the parameter does not appear on the command line. Let’s look at each example in detail.
example_flag:
  type: boolean
  inputBinding:
    position: 1
    prefix: -f
Boolean types are treated as a flag. If the input parameter “example_flag” is “true”, then the prefix is added to the command line. If it is “false”, no flag is added.
example_string:
  type: string
  inputBinding:
    position: 3
    prefix: --example-string
String types appear on the command line as literal values. The prefix is optional; if provided, it appears as a separate argument on the command line before the parameter value. In the example above, this is rendered as --example-string hello.
example_int:
  type: int
  inputBinding:
    position: 2
    prefix: -i
    separate: false
Integer (and floating point) types appear on the command line with their decimal text representation. When the option separate is false (the default value is true), the prefix and value are combined into a single argument. In the example above, this is rendered as -i42.
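The flag, string and integer rules described so far can be summarized in a short Python model. It is a sketch for building intuition, not the code cwltool runs; the function name render_argument is hypothetical.

```python
def render_argument(prefix, value, separate=True):
    """Return the command-line tokens for one bound input value."""
    if isinstance(value, bool):
        # Booleans only switch the prefix on or off; the value is never printed.
        return [prefix] if value else []
    text = str(value)  # strings are literal; numbers use their decimal text form
    if prefix is None:
        return [text]
    if separate:
        return [prefix, text]   # two arguments: prefix, then value
    return [prefix + text]      # one argument: prefix and value joined


print(render_argument("-f", True))                   # ['-f']
print(render_argument("--example-string", "hello"))  # ['--example-string', 'hello']
print(render_argument("-i", 42, separate=False))     # ['-i42']
```

The boolean check comes first because Python treats bool as a subclass of int, so True would otherwise be rendered as the text "True".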
example_file:
  type: File?
  inputBinding:
    prefix: --file=
    separate: false
    position: 4
File types appear on the command line as the path to the file. A question mark ? at the end of the parameter type indicates that the parameter is optional. In the example above, this is rendered as --file=/tmp/random/path/whale.txt. However, if the “example_file” parameter were not provided in the input, nothing would appear on the command line.
Input files are read-only. If you wish to update an input file, you must first copy it to the output directory.
The value of position determines where the parameter appears on the command line. Positions are relative to one another, not absolute. As a result, positions do not have to be sequential: three parameters with positions 1, 3, 5 will result in the same command line as 1, 2, 3. More than one parameter can have the same position (ties are broken using the parameter name), and the position field itself is optional. The default position is 0.
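The ordering rule can be pictured as a sort on the pair (position, parameter name). The following Python sketch assumes exactly that and nothing more; order_parameters is a hypothetical helper, and the dictionaries stand in for the inputBinding sections of inp.cwl.

```python
def order_parameters(bindings):
    """Sort parameter names by (position, name); a missing position counts as 0."""
    return sorted(
        bindings,
        key=lambda name: (bindings[name].get("position", 0), name),
    )


inp_bindings = {
    "example_flag": {"position": 1},
    "example_string": {"position": 3},
    "example_int": {"position": 2},
    "example_file": {"position": 4},
}
print(order_parameters(inp_bindings))
# ['example_flag', 'example_int', 'example_string', 'example_file']

# Gaps do not matter: positions 1, 3, 5 order the same way as 1, 2, 3.
spaced = {"a": {"position": 1}, "b": {"position": 3}, "c": {"position": 5}}
dense = {"a": {"position": 1}, "b": {"position": 2}, "c": {"position": 3}}
print(order_parameters(spaced) == order_parameters(dense))  # True
```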
The baseCommand field will always appear in the final command line before the parameters.
2.4.2. Array Inputs
It is easy to add arrays of input parameters to the command line. There are two ways to specify an array parameter. The first is to provide a type field with type: array and items defining the valid data types that may appear in the array. Alternatively, brackets [] may be added after the type name to indicate that the input parameter is an array of that type.
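The two type suffixes seen so far, ? for optional parameters and [] for arrays, are shorthand for longer type definitions. The Python sketch below shows one way to picture the expansion. The helper name expand_type is hypothetical, and handling combined suffixes recursively is a choice made for this sketch rather than a statement about any particular runner.

```python
def expand_type(shorthand):
    """Expand the '?' and '[]' suffixes of a type name into the long form."""
    if shorthand.endswith("?"):
        # Optional: a union of null and the remaining type.
        return ["null", expand_type(shorthand[:-1])]
    if shorthand.endswith("[]"):
        # Array: the remaining type becomes the item type.
        return {"type": "array", "items": expand_type(shorthand[:-2])}
    return shorthand


print(expand_type("File?"))     # ['null', 'File']
print(expand_type("string[]"))  # {'type': 'array', 'items': 'string'}
```

The second result has the same shape as the long-form array definition written with type: array and items.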
array-inputs.cwl:
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool
inputs:
  filesA:
    type: string[]
    inputBinding:
      prefix: -A
      position: 1
  filesB:
    type:
      type: array
      items: string
      inputBinding:
        prefix: -B=
        separate: false
    inputBinding:
      position: 2
  filesC:
    type: string[]
    inputBinding:
      prefix: -C=
      itemSeparator: ","
      separate: false
      position: 4
outputs:
  example_out:
    type: stdout
stdout: output.txt
baseCommand: echo
array-inputs-job.yml:
filesA: [one, two, three]
filesB: [four, five, six]
filesC: [seven, eight, nine]
Now invoke cwltool, providing the tool description and the input object on the command line:
$ cwltool array-inputs.cwl array-inputs-job.yml
INFO /opt/hostedtoolcache/Python/3.9.13/x64/bin/cwltool 3.1.20220913185150
INFO Resolved 'array-inputs.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/array-inputs.cwl'
INFO [job array-inputs.cwl] /tmp/bmin9qgq$ echo \
-A \
one \
two \
three \
-B=four \
-B=five \
-B=six \
-C=seven,eight,nine > /tmp/bmin9qgq/output.txt
INFO [job array-inputs.cwl] completed success
{
"example_out": {
"location": "file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt",
"basename": "output.txt",
"class": "File",
"checksum": "sha1$91038e29452bc77dcd21edef90a15075f3071540",
"size": 60,
"path": "/home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt"
}
}
INFO Final process status is success
$ cat output.txt
-A one two three -B=four -B=five -B=six -C=seven,eight,nine
The inputBinding can appear either on the outer array parameter definition or the inner array element definition, and these produce different behavior when constructing the command line, as shown above.
In addition, the itemSeparator field, if provided, specifies that array values should be concatenated into a single argument separated by the item separator string.
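The three array styles in array-inputs.cwl can be modelled in Python as follows. This is a rough sketch under simplifying assumptions, not cwltool's implementation: render_outer and render_inner are hypothetical names, and the combination of separate: false on an outer binding without itemSeparator is deliberately not modelled.

```python
def render_outer(values, prefix, item_separator=None, separate=True):
    """Binding on the array itself: the prefix is emitted once."""
    if not values:
        return []  # an empty array adds nothing to the command line
    items = [str(v) for v in values]
    if item_separator is None:
        # Prefix first, then each element as its own argument (filesA).
        return [prefix] + items
    joined = item_separator.join(items)  # one combined argument (filesC)
    return [prefix, joined] if separate else [prefix + joined]


def render_inner(values, prefix, separate=True):
    """Binding on the array items: the prefix is repeated per element (filesB)."""
    tokens = []
    for v in values:
        tokens += [prefix, str(v)] if separate else [prefix + str(v)]
    return tokens


args = (
    render_outer(["one", "two", "three"], "-A")
    + render_inner(["four", "five", "six"], "-B=", separate=False)
    + render_outer(["seven", "eight", "nine"], "-C=", item_separator=",", separate=False)
)
print(" ".join(args))
# -A one two three -B=four -B=five -B=six -C=seven,eight,nine
```

The printed line matches the contents of output.txt shown above.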
Note that the arrays of inputs are specified inside square brackets [] in array-inputs-job.yml. Arrays can also be expressed over multiple lines, where array values that are not defined with an associated key are marked by a leading -. This will be demonstrated in the next lesson and is discussed in more detail in the YAML Guide.
You can specify arrays of arrays, arrays of records, and other complex types.
2.4.3. Advanced Inputs
Sometimes an underlying tool has several arguments that must be provided together (they are dependent) or several arguments that cannot be provided together (they are exclusive). You can use records and type unions to group parameters together to describe these two conditions.
record.cwl:
#!/usr/bin/env cwl-runner
cwlVersion: v1.0
class: CommandLineTool
inputs:
  dependent_parameters:
    type:
      type: record
      name: dependent_parameters
      fields:
        itemA:
          type: string
          inputBinding:
            prefix: -A
        itemB:
          type: string
          inputBinding:
            prefix: -B
  exclusive_parameters:
    type:
      - type: record
        name: itemC
        fields:
          itemC:
            type: string
            inputBinding:
              prefix: -C
      - type: record
        name: itemD
        fields:
          itemD:
            type: string
            inputBinding:
              prefix: -D
outputs:
  example_out:
    type: stdout
stdout: output.txt
baseCommand: echo
record-job1.yml:
dependent_parameters:
  itemA: one
exclusive_parameters:
  itemC: three
$ cwltool record.cwl record-job1.yml
INFO /opt/hostedtoolcache/Python/3.9.13/x64/bin/cwltool 3.1.20220913185150
INFO Resolved 'record.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/record.cwl'
ERROR Workflow error, try again with --debug for more information:
Invalid job input record:
record-job1.yml:1:1: the `dependent_parameters` field is not valid because
missing required field `itemB`
In the first example, you can’t provide itemA without also providing itemB.
record-job2.yml:
dependent_parameters:
  itemA: one
  itemB: two
exclusive_parameters:
  itemC: three
  itemD: four
$ cwltool record.cwl record-job2.yml
INFO /opt/hostedtoolcache/Python/3.9.13/x64/bin/cwltool 3.1.20220913185150
INFO Resolved 'record.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/record.cwl'
record-job2.yml:6:3: Warning: invalid field `itemD`, expected one of: 'itemC'
INFO [job record.cwl] /tmp/futnxhto$ echo \
-A \
one \
-B \
two \
-C \
three > /tmp/futnxhto/output.txt
INFO [job record.cwl] completed success
{
"example_out": {
"location": "file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt",
"basename": "output.txt",
"class": "File",
"checksum": "sha1$329fe3b598fed0dfd40f511522eaf386edb2d077",
"size": 23,
"path": "/home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt"
}
}
INFO Final process status is success
$ cat output.txt
-A one -B two -C three
In the second example, itemC and itemD are exclusive, so only itemC is added to the command line and itemD is ignored.
record-job3.yml:
dependent_parameters:
  itemA: one
  itemB: two
exclusive_parameters:
  itemD: four
$ cwltool record.cwl record-job3.yml
INFO /opt/hostedtoolcache/Python/3.9.13/x64/bin/cwltool 3.1.20220913185150
INFO Resolved 'record.cwl' to 'file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/record.cwl'
INFO [job record.cwl] /tmp/tpnbj3m9$ echo \
-A \
one \
-B \
two \
-D \
four > /tmp/tpnbj3m9/output.txt
INFO [job record.cwl] completed success
{
"example_out": {
"location": "file:///home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt",
"basename": "output.txt",
"class": "File",
"checksum": "sha1$77f572b28e441240a5e30eb14f1d300bcc13a3b4",
"size": 22,
"path": "/home/runner/work/user_guide/user_guide/src/_includes/cwl/inputs/output.txt"
}
}
INFO Final process status is success
$ cat output.txt
-A one -B two -D four
In the third example, only itemD is provided, so it appears on the command line.
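The behavior of the three job files can be reproduced with a small Python model. It assumes that a record is valid only when every listed field is present, and that a union picks the first alternative the value satisfies. That first-match rule is an assumption chosen to match the outputs above, not a description of cwltool's validation logic, and all function names are hypothetical.

```python
def check_record(value, fields):
    """Return True when the record supplies every required field."""
    return all(name in value for name in fields)


def pick_alternative(value, alternatives):
    """Return the first alternative (a list of field names) the value satisfies."""
    for fields in alternatives:
        if check_record(value, fields):
            return fields
    raise ValueError("value matches none of the alternatives")


def render_record(value, fields, prefixes):
    """Emit prefix/value pairs for the selected fields; other keys are ignored."""
    tokens = []
    for name in fields:
        tokens += [prefixes[name], value[name]]
    return tokens


prefixes = {"itemA": "-A", "itemB": "-B", "itemC": "-C", "itemD": "-D"}
exclusive = [["itemC"], ["itemD"]]  # the two record alternatives in record.cwl

# record-job1.yml: itemB is missing, so the dependent record is invalid.
print(check_record({"itemA": "one"}, ["itemA", "itemB"]))  # False

# record-job2.yml: both keys given; the first alternative wins, itemD is dropped.
job2 = {"itemC": "three", "itemD": "four"}
print(render_record(job2, pick_alternative(job2, exclusive), prefixes))  # ['-C', 'three']

# record-job3.yml: only itemD given, so the second alternative is chosen.
job3 = {"itemD": "four"}
print(render_record(job3, pick_alternative(job3, exclusive), prefixes))  # ['-D', 'four']
```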