How-to Guide: Creating Nested Workflows in CWL
This guide explains how to build and use nested workflows in CWL by leveraging the SubworkflowFeatureRequirement.
The focus is on workflow composition and the integration of subworkflows to create reusable components.
Objective
- Main Workflow: Accepts inputs and calls a subworkflow (rgb-composite) as a single step.
- Subworkflow (rgb-composite): Performs a series of steps to process data and produce the desired output.
Key Blocks
- SubworkflowFeatureRequirement
The SubworkflowFeatureRequirement allows workflows to include other workflows as steps.
requirements:
  SubworkflowFeatureRequirement: {}
- Subworkflow Step Definition
The main workflow calls the subworkflow using:
step_rgb_composite:
  in:
    stac-item: stac-item
    bands: bands
  out:
    - rgb-tif
  run: "#rgb-composite"
  - run: "#rgb-composite": Links to the rgb-composite subworkflow.
  - Inputs and Outputs: The subworkflow accepts inputs (stac-item, bands) and produces an output (rgb-tif).
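A run value that starts with "#" refers to another process defined in the same document, which suggests that the main workflow, the subworkflow, and the tools live together in a single $graph. The following is a minimal sketch of that layout, not the guide's actual file. The ids inner and echo, the message input, and the cwlVersion are assumptions chosen only to keep the example small and self-contained.

```yaml
# Minimal packed document: one main workflow, one subworkflow, one tool.
# All names except "main" are hypothetical; cwlVersion is assumed.
cwlVersion: v1.2
$graph:
  - class: Workflow
    id: main
    requirements:
      SubworkflowFeatureRequirement: {}   # needed because a step runs a Workflow
    inputs:
      message:
        type: string
    outputs:
      result:
        type: File
        outputSource: step_inner/result
    steps:
      step_inner:
        run: "#inner"                     # resolves to the entry with id: inner
        in:
          message: message
        out:
          - result

  - class: Workflow
    id: inner
    inputs:
      message:
        type: string
    outputs:
      result:
        type: File
        outputSource: step_echo/out
    steps:
      step_echo:
        run: "#echo"                      # resolves to the tool with id: echo
        in:
          message: message
        out:
          - out

  - class: CommandLineTool
    id: echo
    baseCommand: echo
    inputs:
      message:
        type: string
        inputBinding:
          position: 1
    outputs:
      out:
        type: stdout                      # capture standard output as a File
    stdout: message.txt
```

In the guide's workflow, main plays the same role as here, rgb-composite takes the place of inner, and the stac, rio_stack, and rio_color tools take the place of echo.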
Steps
- Define the Subworkflow
The rgb-composite subworkflow performs the following:
  - Fetch band-specific asset URLs using the stac tool.
  - Stack the asset TIFFs into a single file using the rio_stack tool.
  - Apply color correction to generate the RGB composite using the rio_color tool.
Subworkflow Definition (rgb-composite)
class: Workflow
id: rgb-composite
requirements:
  InlineJavascriptRequirement: {}
  NetworkAccess:
    networkAccess: true
  ScatterFeatureRequirement: {}
inputs:
  stac-item:
    type: string
  bands:
    type: string[]
outputs:
  rgb-tif:
    outputSource: step_color/rgb
    type: File
steps:
  step_curl:
    in:
      stac_item: stac-item
      common_band_name: bands
    out:
      - hrefs
    run: "#stac"
    scatter: common_band_name
    scatterMethod: dotproduct
  step_stack:
    in:
      tiffs:
        source: step_curl/hrefs
    out:
      - stacked
    run: "#rio_stack"
  step_color:
    in:
      stacked:
        source: step_stack/stacked
    out:
      - rgb
    run: "#rio_color"
- Define the Main Workflow
The main workflow invokes the rgb-composite subworkflow:
class: Workflow
id: main
requirements:
  SubworkflowFeatureRequirement: {}
  InlineJavascriptRequirement: {}
  NetworkAccess:
    networkAccess: true
  ScatterFeatureRequirement: {}
inputs:
  stac-item:
    type: string
  bands:
    type: string[]
    default: ["red", "green", "blue"]
outputs:
  rgb-tif:
    outputSource: step_rgb_composite/rgb-tif
    type: File
steps:
  step_rgb_composite:
    in:
      stac-item: stac-item
      bands: bands
    out:
      - rgb-tif
    run: "#rgb-composite"
Requirements:
- SubworkflowFeatureRequirement: Enables the use of nested workflows.
- ScatterFeatureRequirement: Allows a step to run once per element of an input array (in the subworkflow, step_curl runs once per band).
Inputs:
- stac-item: URL to a STAC item.
- bands: Array of band names (default: ["red", "green", "blue"]).
Outputs:
- rgb-tif: The RGB composite file produced by the subworkflow.
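The inputs listed above can also be collected in a job parameters file rather than passed as command-line flags. The sketch below assumes a file named params.yml (a hypothetical name) and reuses the STAC item URL from this guide. The bands entry repeats the workflow default and could be left out.

```yaml
# params.yml (hypothetical file name): input values for the main workflow.
# Keys must match the input ids declared in the workflow.
stac-item: https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2B_53HPA_20210723_0_L2A
bands:        # optional here, since the workflow already defaults to these
  - red
  - green
  - blue
```

With such a file in place, the workflow could be started with cwltool nested-workflow.cwl params.yml, which keeps long URLs and array values out of the shell command.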
- Run the Workflow
To execute the main workflow, use the following command:
cwltool nested-workflow.cwl \
--stac-item https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2B_53HPA_20210723_0_L2A
INFO /opt/hostedtoolcache/Python/3.13.2/x64/bin/cwltool 3.1.20250110105449
INFO Resolved '../cwl-workflows/nested-workflow.cwl' to 'file:///home/runner/work/how-to/how-to/cwl-workflows/nested-workflow.cwl'
INFO [workflow ] start
INFO [workflow ] starting step step_rgb_composite
INFO [step step_rgb_composite] start
INFO [workflow step_rgb_composite] start
INFO [workflow step_rgb_composite] starting step step_curl
INFO [step step_curl] start
INFO [job step_curl] /tmp/4pstl9oz$ docker \
run \
-i \
--mount=type=bind,source=/tmp/4pstl9oz,target=/sIwfdC \
--mount=type=bind,source=/tmp/w1q31ww5,target=/tmp \
--workdir=/sIwfdC \
--read-only=true \
--log-driver=none \
--user=1001:118 \
--rm \
--cidfile=/tmp/44ns03nx/20250303112226-830562.cid \
--env=TMPDIR=/tmp \
--env=HOME=/sIwfdC \
docker.io/curlimages/curl:latest \
curl \
https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2B_53HPA_20210723_0_L2A > /tmp/4pstl9oz/message
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 10156 100 10156 0 0 86543 0 --:--:-- --:--:-- --:--:-- 86803
INFO [job step_curl] completed success
INFO [step step_curl] start
INFO [job step_curl_2] /tmp/spb5gh87$ docker \
run \
-i \
--mount=type=bind,source=/tmp/spb5gh87,target=/sIwfdC \
--mount=type=bind,source=/tmp/d8_81km9,target=/tmp \
--workdir=/sIwfdC \
--read-only=true \
--log-driver=none \
--user=1001:118 \
--rm \
--cidfile=/tmp/_y4hfyt3/20250303112227-838673.cid \
--env=TMPDIR=/tmp \
--env=HOME=/sIwfdC \
docker.io/curlimages/curl:latest \
curl \
https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2B_53HPA_20210723_0_L2A > /tmp/spb5gh87/message
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 10156 100 10156 0 0 50333 0 --:--:-- --:--:-- --:--:-- 50527
INFO [job step_curl_2] completed success
INFO [step step_curl] start
INFO [job step_curl_3] /tmp/3ei0m00x$ docker \
run \
-i \
--mount=type=bind,source=/tmp/3ei0m00x,target=/sIwfdC \
--mount=type=bind,source=/tmp/avykuq6d,target=/tmp \
--workdir=/sIwfdC \
--read-only=true \
--log-driver=none \
--user=1001:118 \
--rm \
--cidfile=/tmp/5m27zsom/20250303112228-846239.cid \
--env=TMPDIR=/tmp \
--env=HOME=/sIwfdC \
docker.io/curlimages/curl:latest \
curl \
https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2B_53HPA_20210723_0_L2A > /tmp/3ei0m00x/message
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 10156 100 10156 0 0 88837 0 --:--:-- --:--:-- --:--:-- 89087
INFO [job step_curl_3] completed success
INFO [step step_curl] completed success
INFO [workflow step_rgb_composite] starting step step_stack
INFO [step step_stack] start
INFO [job step_stack] /tmp/nfvjqhbs$ docker \
run \
-i \
--mount=type=bind,source=/tmp/nfvjqhbs,target=/sIwfdC \
--mount=type=bind,source=/tmp/7x4450t6,target=/tmp \
--workdir=/sIwfdC \
--read-only=true \
--user=1001:118 \
--rm \
--cidfile=/tmp/oor_ywqb/20250303112229-870511.cid \
--env=TMPDIR=/tmp \
--env=HOME=/sIwfdC \
--env=CPL_VSIL_CURL_ALLOWED_EXTENSIONS=.tif \
--env=GDAL_HTTP_MERGE_CONSECUTIVE_RANGES=YES \
--env=GDAL_TIFF_INTERNAL_MASK=YES \
ghcr.io/eoap/how-to/rio:1.0.0 \
rio \
stack \
https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/53/H/PA/2021/7/S2B_53HPA_20210723_0_L2A/B04.tif \
https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/53/H/PA/2021/7/S2B_53HPA_20210723_0_L2A/B03.tif \
https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/53/H/PA/2021/7/S2B_53HPA_20210723_0_L2A/B02.tif \
stacked.tif
INFO [job step_stack] Max memory used: 1253MiB
INFO [job step_stack] completed success
INFO [step step_stack] completed success
INFO [workflow step_rgb_composite] starting step step_color
INFO [step step_color] start
INFO [job step_color] /tmp/l5ph5hqb$ docker \
run \
-i \
--mount=type=bind,source=/tmp/l5ph5hqb,target=/sIwfdC \
--mount=type=bind,source=/tmp/mevdklt8,target=/tmp \
--mount=type=bind,source=/tmp/nfvjqhbs/stacked.tif,target=/var/lib/cwl/stg16553708-985f-4c8d-b90c-7d69dfbcb08b/stacked.tif,readonly \
--workdir=/sIwfdC \
--read-only=true \
--user=1001:118 \
--rm \
--cidfile=/tmp/iu4zx4n9/20250303112254-438602.cid \
--env=TMPDIR=/tmp \
--env=HOME=/sIwfdC \
ghcr.io/eoap/how-to/rio:1.0.0 \
rio \
color \
-j \
-1 \
--out-dtype \
uint8 \
/var/lib/cwl/stg16553708-985f-4c8d-b90c-7d69dfbcb08b/stacked.tif \
rgb.tif \
'gamma 3 0.95, sigmoidal rgb 35 0.13'
INFO [job step_color] Max memory used: 705MiB
INFO [job step_color] completed success
INFO [step step_color] completed success
INFO [workflow step_rgb_composite] completed success
INFO [step step_rgb_composite] completed success
INFO [workflow ] completed success
INFO Final process status is success
- Expected Output
Intermediate Outputs:
- URLs of band-specific TIFFs (hrefs).
- Stacked TIFF file (stacked.tif).
Final Output:
- RGB composite TIFF file (rgb-tif).
{
    "rgb-tif": {
        "location": "file:///home/runner/work/how-to/how-to/docs/rgb.tif",
        "basename": "rgb.tif",
        "class": "File",
        "checksum": "sha1$a4f17dfb37856da4d4efc24fbaed4b77f220b75d",
        "size": 361747464,
        "path": "/home/runner/work/how-to/how-to/docs/rgb.tif"
    }
}
Key Takeaways
Modularity with Subworkflows:
- Use SubworkflowFeatureRequirement to encapsulate reusable workflows.
- Subworkflows simplify complex workflows by isolating specific logic.
Integration of Subworkflows:
- Define subworkflow steps in the main workflow.
- Use run to link the subworkflow.
Reusability:
- Subworkflows can be reused in multiple workflows, promoting modularity and efficiency.
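As an illustration of reuse, a second workflow in the same document could call rgb-composite with a different band selection. This is only a sketch under stated assumptions: the id false-color, the step and output names, and the band names "nir", "red", "green" are hypothetical, and the entry is assumed to sit in the same $graph as rgb-composite so that "#rgb-composite" resolves.

```yaml
# Hypothetical extra entry for the $graph that already contains rgb-composite.
- class: Workflow
  id: false-color
  requirements:
    SubworkflowFeatureRequirement: {}
  inputs:
    stac-item:
      type: string
  outputs:
    false-color-tif:
      type: File
      outputSource: step_false_color/rgb-tif
  steps:
    step_false_color:
      run: "#rgb-composite"               # same subworkflow, reused unchanged
      in:
        stac-item: stac-item
        bands:
          default: ["nir", "red", "green"]   # assumed band names, fixed for this workflow
      out:
        - rgb-tif
```

The subworkflow itself is untouched. Only the values wired into its inputs differ, which is what makes it a reusable component.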
This approach makes it easy to manage and scale CWL workflows by leveraging nested subworkflows.