Water bodies detection

Goal

Wrap the water bodies detection Python command line tool as a Common Workflow Language CommandLineTool and execute it with a CWL runner.

Lab

This step has a dedicated lab available at /workspace/quickwin/practice-labs/CommandLineTool.ipynb

How to wrap a step as a CWL CommandLineTool

The CWL document below shows the water bodies detection Python command line tool step wrapped as a CWL CommandLineTool:

cwl-cli/detect-water-body.cwl
cwlVersion: v1.0

class: CommandLineTool
id: detect-water-body
requirements:
    EnvVarRequirement:
      envDef:
        PYTHONPATH: /app
    ResourceRequirement:
      coresMax: 1
      ramMax: 512
    DockerRequirement:
      dockerPull: localhost/detect-water-body:latest  
baseCommand: ["python", "-m", "app"]
arguments: []
inputs:
  item:
    type: string
    inputBinding:
        prefix: --input-item
  aoi:
    type: string
    inputBinding:
        prefix: --aoi
  epsg:
    type: string
    inputBinding:
        prefix: --epsg
  band:
    type:
      - type: array
        items: string
        inputBinding:
          prefix: '--band'

outputs:
  water-body:
    outputBinding:
        glob: .
    type: Directory

Let's break down the key components of this CWL document:

  • cwlVersion: v1.0: Specifies the version of the CWL specification that this document follows.
  • class: CommandLineTool: Indicates that this CWL document defines a command-line tool.
  • id: detect-water-body: Provides a unique identifier for this tool, which can be used to reference it in workflows.
  • requirements: Specifies the requirements and dependencies of the tool. In this case, it defines the following:
    • EnvVarRequirement: Sets environment variables. Here, it sets the PYTHONPATH environment variable to /app.
    • ResourceRequirement: Specifies resource requirements for running the tool: at most 1 CPU core (coresMax) and at most 512 MiB of RAM (ramMax).
    • DockerRequirement: Specifies the container image to be used. The tool is executed in a container created from the image localhost/detect-water-body:latest.
  • baseCommand: Defines the base command to be executed in the container. In this case, it's running a Python module called "app" with the command python -m app.
  • arguments: This section is empty, meaning there are no additional command-line arguments specified here. The tool is expected to receive its arguments via the input parameters.
  • inputs: Describes the input parameters for the tool, including their types and how they are bound to command-line arguments. The tool expects the following inputs:
    • item: A string representing the input STAC item (image) to be processed, bound to the --input-item argument.
    • aoi: A string representing the area of interest (AOI) as a bounding box, bound to the --aoi argument.
    • epsg: A string representing the EPSG code for the coordinate system, bound to the --epsg argument.
    • band: An array of strings with the names of the bands to be extracted. Because the inputBinding is declared on the array type, each element is passed with its own --band argument.
  • outputs: Specifies the tool's output. It defines an output parameter named water-body, which is of type Directory. The outputBinding uses glob: ., which matches the tool's output working directory, so everything the tool writes there is collected as a single Directory.
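
The way the inputs above turn into a command line can be sketched in Python. This is only an approximation of what a CWL runner does, not cwltool's implementation: the `build_command` helper and the `PREFIXES` table are hypothetical names, and the argument order (by input name, since no position is set) is an assumption based on the CWL sorting rule. Because every argument carries its own prefix, the order should not matter to the tool.

```python
import shlex

# Prefixes copied from the inputBinding entries of the CWL document above.
PREFIXES = {
    "item": "--input-item",
    "aoi": "--aoi",
    "epsg": "--epsg",
    "band": "--band",
}

BASE_COMMAND = ["python", "-m", "app"]


def build_command(job):
    """Approximate the argv a CWL runner derives from the job inputs."""
    argv = list(BASE_COMMAND)
    # Assumption: with no explicit position, bindings are ordered by input name.
    for name in sorted(PREFIXES):
        value = job[name]
        # An array whose type carries the inputBinding repeats the prefix
        # once per element (--band green --band nir).
        values = value if isinstance(value, list) else [value]
        for element in values:
            argv += [PREFIXES[name], str(element)]
    return argv


job = {
    "item": "https://earth-search.aws.element84.com/v0/collections/"
            "sentinel-s2-l2a-cogs/items/S2B_10TFK_20210713_0_L2A",
    "aoi": "-121.399,39.834,-120.74,40.472",
    "epsg": "EPSG:4326",
    "band": ["green", "nir"],
}

# Print the command as a single shell-quoted line.
print(shlex.join(build_command(job)))
```

The printed line is what would run inside the container, with PYTHONPATH set to /app by the EnvVarRequirement.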

Steps

Clean up the /workspace/quickwin/runs folder:

rm -fr /workspace/quickwin/runs/*

Run the CWL document using the cwltool CWL runner to execute the water bodies detection:

terminal
export WORKSPACE=/workspace/quickwin

command -v podman >/dev/null 2>&1 && { 
    flag="--podman"
}

cwltool ${flag} \
    --outdir ${WORKSPACE}/runs \
    ${WORKSPACE}/cwl-cli/detect-water-body.cwl \
    --item "https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2B_10TFK_20210713_0_L2A" \
    --aoi="-121.399,39.834,-120.74,40.472" \
    --epsg "EPSG:4326" \
    --band "green" \
    --band "nir" 
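
The podman detection in the shell snippet above can also be expressed in Python, for instance when the run is launched from a notebook. This is a minimal sketch: `cwltool_command` is a hypothetical helper, and it only builds the argument list, leaving the actual execution to `subprocess.run`.

```python
import shutil


def cwltool_command(workspace, job_args, podman_available=None):
    """Build the cwltool argv, adding --podman only when podman is installed."""
    if podman_available is None:
        # Same check as `command -v podman` in the shell version.
        podman_available = shutil.which("podman") is not None
    argv = ["cwltool"]
    if podman_available:
        argv.append("--podman")
    argv += [
        "--outdir", f"{workspace}/runs",
        f"{workspace}/cwl-cli/detect-water-body.cwl",
    ]
    return argv + list(job_args)


argv = cwltool_command(
    "/workspace/quickwin",
    ["--aoi=-121.399,39.834,-120.74,40.472", "--epsg", "EPSG:4326",
     "--band", "green", "--band", "nir"],
)
print(argv)
```

Passing the result to `subprocess.run(argv, check=True)` would start the run, provided cwltool is on the PATH. The `--item` argument is left out of the example for brevity and has to be added as in the shell command.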

Expected outcome

The folder /workspace/quickwin/runs contains:

(base) jovyan@jupyter-fbrito--training:~/quickwin$ tree runs
runs
└── poz7ftyy
    ├── S2B_10TFK_20210713_0_L2A
    │   ├── S2B_10TFK_20210713_0_L2A.json
    │   └── otsu.tif
    ├── catalog.json
    └── otsu.tif

2 directories, 4 files
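
To check the result programmatically, the catalog.json files under the runs folder can be read with the standard library. The sketch below assumes the catalog follows the usual STAC layout, where each produced item is referenced by a link with rel set to "item"; `list_stac_items` is a hypothetical helper name.

```python
import json
from pathlib import Path


def list_stac_items(runs_dir):
    """Return (catalog path, item href) pairs found under runs_dir."""
    root = Path(runs_dir)
    if not root.is_dir():
        return []
    found = []
    # The runner writes into a randomly named sub-folder, so search recursively.
    for catalog_path in sorted(root.rglob("catalog.json")):
        catalog = json.loads(catalog_path.read_text())
        for link in catalog.get("links", []):
            # Assumption: items are referenced with rel == "item".
            if link.get("rel") == "item":
                found.append((catalog_path, link.get("href")))
    return found


if __name__ == "__main__":
    for catalog_path, href in list_stac_items("/workspace/quickwin/runs"):
        print(f"{catalog_path}: {href}")
```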

Extra

The CWL runner cwltool also accepts a YAML file with the parameters:

params.yaml
item: https://earth-search.aws.element84.com/v0/collections/sentinel-s2-l2a-cogs/items/S2B_10TFK_20210713_0_L2A
aoi: "-121.399,39.834,-120.74,40.472"
epsg: "EPSG:4326"
band: 
- green
- nir

and run it with:

terminal
export WORKSPACE=/workspace/quickwin

cwltool \
    --podman \
    --outdir ${WORKSPACE}/runs \
    ${WORKSPACE}/cwl-cli/detect-water-body.cwl \
    ${WORKSPACE}/cwl-cli/params.yaml
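
The parameters file can also be generated rather than written by hand. Since cwltool accepts the job file in JSON as well as YAML, a minimal sketch using only the Python standard library is shown below; `make_job` is a hypothetical helper, and the keys mirror the input names of the CWL document.

```python
import json


def make_job(item, aoi, epsg, bands):
    """Build the job parameters using the input names of the CWL document."""
    return {"item": item, "aoi": aoi, "epsg": epsg, "band": list(bands)}


job = make_job(
    item="https://earth-search.aws.element84.com/v0/collections/"
         "sentinel-s2-l2a-cogs/items/S2B_10TFK_20210713_0_L2A",
    aoi="-121.399,39.834,-120.74,40.472",
    epsg="EPSG:4326",
    bands=["green", "nir"],
)

# Print the job as JSON; redirect the output to a file to use it with cwltool.
print(json.dumps(job, indent=2))
```

Saving the output to a file (for example params.json, a name chosen here for illustration) and passing that file to cwltool in place of params.yaml should give the same run.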