How-to Guide: Creating Nested Workflows in CWL
This guide explains how to build and use nested workflows in CWL by leveraging the SubworkflowFeatureRequirement.
The focus is on the workflow composition and the integration of subworkflows to create reusable components.
Objective
- Main Workflow: Accepts inputs and calls a subworkflow (rgb-composite) as a single step.
- Subworkflow (rgb-composite): Performs a series of steps to process data and produce the desired output.
Key Blocks
- SubworkflowFeatureRequirement
The SubworkflowFeatureRequirement allows workflows to include other workflows as steps.
requirements:
  SubworkflowFeatureRequirement: {}
- Subworkflow Step Definition
The main workflow calls the subworkflow using:
step_rgb_composite:
  in:
    stac-item: stac-item
    bands: bands
  out:
    - rgb-tif
  run: "#rgb-composite"
- run: "#rgb-composite": Links the step to the rgb-composite subworkflow, which is defined under that id in the same CWL document.
- Inputs and Outputs: The subworkflow accepts inputs (stac-item, bands) and produces an output (rgb-tif).
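To make the "#rgb-composite" reference concrete, the outline below sketches how a single CWL file could hold both workflows and the tools they call in one $graph list. It is a layout sketch only, not a complete document: the bodies are elided, the cwlVersion value is an assumption (this guide does not state it; NetworkAccess needs v1.1 or later), and the three CommandLineTool entries are inferred from the run references used in the subworkflow.

```yaml
# Layout sketch of a multi-process CWL document (bodies elided, not runnable as-is).
cwlVersion: v1.2            # assumption: version not stated in this guide
$graph:
  - class: Workflow
    id: main                # entry point, addressed as "#main"
    # requirements, inputs, outputs, steps ...

  - class: Workflow
    id: rgb-composite       # target of run: "#rgb-composite"
    # requirements, inputs, outputs, steps ...

  - class: CommandLineTool
    id: stac                # target of run: "#stac"
    # ...

  - class: CommandLineTool
    id: rio_stack           # target of run: "#rio_stack"
    # ...

  - class: CommandLineTool
    id: rio_color           # target of run: "#rio_color"
    # ...
```

When a $graph document is run without naming a process, cwltool looks for the entry whose id is main, which is consistent with the command in this guide not needing a fragment.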
Steps
- Define the Subworkflow
The rgb-composite subworkflow performs the following:
- Fetch band-specific asset URLs using the stac tool.
- Stack the asset TIFFs into a single file using the rio_stack tool.
- Apply color correction to generate the RGB composite using the rio_color tool.
Subworkflow Definition (rgb-composite)
class: Workflow
id: rgb-composite
requirements:
  InlineJavascriptRequirement: {}
  NetworkAccess:
    networkAccess: true
  ScatterFeatureRequirement: {}
inputs:
  stac-item:
    type: string
  bands:
    type: string[]
outputs:
  rgb-tif:
    outputSource: step_color/rgb
    type: File
steps:
  # Runs the stac tool once per band name and collects one href per run.
  step_curl:
    in:
      stac_item: stac-item
      common_band_name: bands
    out:
      - hrefs
    run: "#stac"
    scatter: common_band_name
    scatterMethod: dotproduct
  # Stacks the per-band TIFFs into a single file.
  step_stack:
    in:
      tiffs:
        source: step_curl/hrefs
    out:
      - stacked
    run: "#rio_stack"
  # Applies color correction to produce the RGB composite.
  step_color:
    in:
      stacked:
        source: step_stack/stacked
    out:
      - rgb
    run: "#rio_color"
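The tools behind the run references are not shown in this guide. As an illustration of what one of them could look like, here is a hypothetical rio_stack CommandLineTool reconstructed from the docker command that appears in the run log (image, environment variables, and the rio stack arguments). The input and output names match the step_stack wiring above; everything else, including the use of arguments for the fixed output filename, is an assumption rather than the project's actual definition.

```yaml
# Hypothetical sketch of the "#rio_stack" entry; not the project's actual tool.
class: CommandLineTool
id: rio_stack
requirements:
  DockerRequirement:
    dockerPull: ghcr.io/eoap/how-to/rio:1.0.0   # image seen in the run log
  EnvVarRequirement:
    envDef:
      CPL_VSIL_CURL_ALLOWED_EXTENSIONS: .tif
      # Quoted so YAML keeps the literal string instead of a boolean.
      GDAL_HTTP_MERGE_CONSECUTIVE_RANGES: "YES"
      GDAL_TIFF_INTERNAL_MASK: "YES"
  NetworkAccess:
    networkAccess: true                         # the TIFFs are read over HTTPS
baseCommand: [rio, stack]
arguments:
  # Fixed output filename, placed after the input URLs.
  - valueFrom: stacked.tif
    position: 2
inputs:
  tiffs:
    type: string[]        # URLs collected from step_curl/hrefs
    inputBinding:
      position: 1         # each URL becomes its own argument
outputs:
  stacked:
    type: File
    outputBinding:
      glob: stacked.tif
```

With three band URLs this would produce a command of the form rio stack <url> <url> <url> stacked.tif, the same shape as the step_stack job in the log.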
- Define the Main Workflow
The main workflow invokes the rgb-composite subworkflow:
class: Workflow
id: main
requirements:
  SubworkflowFeatureRequirement: {}
  InlineJavascriptRequirement: {}
  NetworkAccess:
    networkAccess: true
  ScatterFeatureRequirement: {}
inputs:
  stac-item:
    type: string
  bands:
    type: string[]
    default: ["red", "green", "blue"]
outputs:
  rgb-tif:
    outputSource: step_rgb_composite/rgb-tif
    type: File
steps:
  step_rgb_composite:
    in:
      stac-item: stac-item
      bands: bands
    out:
      - rgb-tif
    run: "#rgb-composite"
Requirements:
- SubworkflowFeatureRequirement: Enables the use of nested workflows.
- ScatterFeatureRequirement: Allows a step to run once per band, so multiple bands can be processed in parallel when the runner supports it.
Inputs:
- stac-item: URL to a STAC item.
- bands: Array of band names (default: ["red", "green", "blue"]).
Outputs:
- rgb-tif: The RGB composite file produced by the subworkflow.
- Run the Workflow
To execute the main workflow, use the following command:
cwltool nested-workflow.cwl \
  --stac-item https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2B_53HPA_20210723_0_L2A
INFO /opt/hostedtoolcache/Python/3.13.3/x64/bin/cwltool 3.1.20250110105449
INFO Resolved '../cwl-workflows/nested-workflow.cwl' to 'file:///home/runner/work/how-to/how-to/cwl-workflows/nested-workflow.cwl'
INFO [workflow ] start
INFO [workflow ] starting step step_rgb_composite
INFO [step step_rgb_composite] start
INFO [workflow step_rgb_composite] start
INFO [workflow step_rgb_composite] starting step step_curl
INFO [step step_curl] start
INFO [job step_curl] /tmp/pufx_v3d$ docker \
run \
-i \
--mount=type=bind,source=/tmp/pufx_v3d,target=/yZUdzu \
--mount=type=bind,source=/tmp/7eymyjv6,target=/tmp \
--workdir=/yZUdzu \
--read-only=true \
--log-driver=none \
--user=1001:118 \
--rm \
--cidfile=/tmp/cakv9aaf/20250620071829-260192.cid \
--env=TMPDIR=/tmp \
--env=HOME=/yZUdzu \
docker.io/curlimages/curl:latest \
curl \
https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2B_53HPA_20210723_0_L2A > /tmp/pufx_v3d/message
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 10156 100 10156 0 0 44020 0 --:--:-- --:--:-- --:--:-- 43965
INFO [job step_curl] completed success
INFO [step step_curl] start
INFO [job step_curl_2] /tmp/sxcka5p1$ docker \
run \
-i \
--mount=type=bind,source=/tmp/sxcka5p1,target=/yZUdzu \
--mount=type=bind,source=/tmp/hionhjgm,target=/tmp \
--workdir=/yZUdzu \
--read-only=true \
--log-driver=none \
--user=1001:118 \
--rm \
--cidfile=/tmp/3t4e8won/20250620071830-267272.cid \
--env=TMPDIR=/tmp \
--env=HOME=/yZUdzu \
docker.io/curlimages/curl:latest \
curl \
https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2B_53HPA_20210723_0_L2A > /tmp/sxcka5p1/message
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 10156 100 10156 0 0 47808 0 --:--:-- --:--:-- --:--:-- 47680
INFO [job step_curl_2] completed success
INFO [step step_curl] start
INFO [job step_curl_3] /tmp/twxvo_8b$ docker \
run \
-i \
--mount=type=bind,source=/tmp/twxvo_8b,target=/yZUdzu \
--mount=type=bind,source=/tmp/rwhcvgjb,target=/tmp \
--workdir=/yZUdzu \
--read-only=true \
--log-driver=none \
--user=1001:118 \
--rm \
--cidfile=/tmp/vpt14fg_/20250620071831-274759.cid \
--env=TMPDIR=/tmp \
--env=HOME=/yZUdzu \
docker.io/curlimages/curl:latest \
curl \
https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2B_53HPA_20210723_0_L2A > /tmp/twxvo_8b/message
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0
100 10156 100 10156 0 0 45382 0 --:--:-- --:--:-- --:--:-- 45339
INFO [job step_curl_3] completed success
INFO [step step_curl] completed success
INFO [workflow step_rgb_composite] starting step step_stack
INFO [step step_stack] start
INFO [job step_stack] /tmp/e53l9l2z$ docker \
run \
-i \
--mount=type=bind,source=/tmp/e53l9l2z,target=/yZUdzu \
--mount=type=bind,source=/tmp/rn2471xj,target=/tmp \
--workdir=/yZUdzu \
--read-only=true \
--user=1001:118 \
--rm \
--cidfile=/tmp/29_fbisl/20250620071832-296183.cid \
--env=TMPDIR=/tmp \
--env=HOME=/yZUdzu \
--env=CPL_VSIL_CURL_ALLOWED_EXTENSIONS=.tif \
--env=GDAL_HTTP_MERGE_CONSECUTIVE_RANGES=YES \
--env=GDAL_TIFF_INTERNAL_MASK=YES \
ghcr.io/eoap/how-to/rio:1.0.0 \
rio \
stack \
https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/53/H/PA/2021/7/S2B_53HPA_20210723_0_L2A/B04.tif \
https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/53/H/PA/2021/7/S2B_53HPA_20210723_0_L2A/B03.tif \
https://sentinel-cogs.s3.us-west-2.amazonaws.com/sentinel-s2-l2a-cogs/53/H/PA/2021/7/S2B_53HPA_20210723_0_L2A/B02.tif \
stacked.tif
INFO [job step_stack] Max memory used: 1199MiB
INFO [job step_stack] completed success
INFO [step step_stack] completed success
INFO [workflow step_rgb_composite] starting step step_color
INFO [step step_color] start
INFO [job step_color] /tmp/rri4c0zo$ docker \
run \
-i \
--mount=type=bind,source=/tmp/rri4c0zo,target=/yZUdzu \
--mount=type=bind,source=/tmp/m1rd6m6g,target=/tmp \
--mount=type=bind,source=/tmp/e53l9l2z/stacked.tif,target=/var/lib/cwl/stgb368b707-6529-4134-9bbd-6700eb174bde/stacked.tif,readonly \
--workdir=/yZUdzu \
--read-only=true \
--user=1001:118 \
--rm \
--cidfile=/tmp/thowdx3l/20250620071851-051982.cid \
--env=TMPDIR=/tmp \
--env=HOME=/yZUdzu \
ghcr.io/eoap/how-to/rio:1.0.0 \
rio \
color \
-j \
-1 \
--out-dtype \
uint8 \
/var/lib/cwl/stgb368b707-6529-4134-9bbd-6700eb174bde/stacked.tif \
rgb.tif \
'gamma 3 0.95, sigmoidal rgb 35 0.13'
INFO [job step_color] Max memory used: 786MiB
INFO [job step_color] completed success
INFO [step step_color] completed success
INFO [workflow step_rgb_composite] completed success
INFO [step step_rgb_composite] completed success
INFO [workflow ] completed success
INFO Final process status is success
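The inputs passed on the command line above can also be supplied through a job file, which is convenient when the bands default needs to be overridden. The sketch below assumes a file named job.yml (a hypothetical name); the keys are the main workflow's input ids, and the band names shown are the documented defaults.

```yaml
# job.yml (hypothetical file name): inputs for the main workflow
stac-item: https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2B_53HPA_20210723_0_L2A
bands:          # optional; omit to use the workflow default
  - red
  - green
  - blue
```

The file would be passed as the second positional argument: cwltool nested-workflow.cwl job.yml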
- Expected Output
Intermediate Outputs:
- URLs of band-specific TIFFs (hrefs).
- Stacked TIFF file (stacked.tif).
Final Output:
- RGB composite TIFF file (rgb-tif).
{
    "rgb-tif": {
        "location": "file:///home/runner/work/how-to/how-to/docs/rgb.tif",
        "basename": "rgb.tif",
        "class": "File",
        "checksum": "sha1$dc1b292898d647116fdb31ec9c04be3a6ff9e5e9",
        "size": 361747464,
        "path": "/home/runner/work/how-to/how-to/docs/rgb.tif"
    }
}
Key Takeaways
Modularity with Subworkflows:
- Declare SubworkflowFeatureRequirement so that a workflow can run other workflows as steps and encapsulate reusable logic.
- Subworkflows simplify complex workflows by isolating specific logic.
Integration of Subworkflows:
- Define subworkflow steps in the main workflow.
- Use the run field of the step to link it to the subworkflow.
Reusability:
- Subworkflows can be reused in multiple workflows, promoting modularity and efficiency.
This approach makes it easy to manage and scale CWL workflows by leveraging nested subworkflows.
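As a sketch of the reusability point, a second workflow could call the same rgb-composite subworkflow once per STAC item by scattering the step. The workflow id main-multi, the input stac-items, and the output rgb-tifs are hypothetical names introduced for illustration; only the step wiring and run: "#rgb-composite" come from this guide. The entry is assumed to sit alongside rgb-composite in the same document.

```yaml
# Hypothetical second workflow that reuses the rgb-composite subworkflow.
class: Workflow
id: main-multi                      # hypothetical id
requirements:
  SubworkflowFeatureRequirement: {}
  ScatterFeatureRequirement: {}
inputs:
  stac-items:                       # hypothetical input: several STAC item URLs
    type: string[]
  bands:
    type: string[]
    default: ["red", "green", "blue"]
outputs:
  rgb-tifs:                         # one composite per STAC item
    outputSource: step_rgb_composite/rgb-tif
    type: File[]
steps:
  step_rgb_composite:
    in:
      stac-item: stac-items
      bands: bands                  # not scattered: same bands for every item
    out:
      - rgb-tif
    run: "#rgb-composite"
    scatter: stac-item              # run the whole subworkflow once per item
```

Because the subworkflow is addressed only through its id and its declared inputs and outputs, nothing inside rgb-composite has to change for this second caller.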