JSON Schema generation¶
A simple usage of the library that, given a CWL document, generates a JSON Schema for its inputs and outputs.
1. Parsing¶
In this sample, the CWL document is loaded from a remote public URL.
In [1]:
from cwl_loader import load_cwl_from_location
from cwl2ogc import BaseCWLtypes2OGCConverter

workflow_id = "pattern-12"
cwl_document = load_cwl_from_location(
    "https://raw.githubusercontent.com/eoap/application-package-patterns/refs/heads/main/cwl-workflow/pattern-12.cwl"
)

workflow = None
for wf in cwl_document:
    if workflow_id == wf.id.split("#")[-1]:
        workflow = wf
        break

if workflow is not None:
    cwl_converter = BaseCWLtypes2OGCConverter(workflow)
else:
    raise ValueError(f"'#{workflow_id}' not found in input $graph")
2026-05-13 12:25:59.263 | DEBUG | cwl_loader:load_cwl_from_location:240 - Loading CWL document from https://raw.githubusercontent.com/eoap/application-package-patterns/refs/heads/main/cwl-workflow/pattern-12.cwl...
2026-05-13 12:25:59.379 | DEBUG | cwl_loader:_load_cwl_from_stream:243 - Reading stream from https://raw.githubusercontent.com/eoap/application-package-patterns/refs/heads/main/cwl-workflow/pattern-12.cwl...
2026-05-13 12:25:59.406 | DEBUG | cwl_loader:load_cwl_from_stream:210 - CWL data of type <class 'ruamel.yaml.comments.CommentedMap'> successfully loaded from stream
2026-05-13 12:25:59.407 | DEBUG | cwl_loader:load_cwl_from_yaml:147 - No needs to update the Raw CWL document since it targets already the v1.2
2026-05-13 12:25:59.408 | DEBUG | cwl_loader:load_cwl_from_yaml:151 - Parsing the raw CWL document to the CWL Utils DOM...
2026-05-13 12:25:59.794 | DEBUG | cwl_loader:load_cwl_from_yaml:160 - Raw CWL document successfully parsed to the CWL Utils DOM!
2026-05-13 12:25:59.795 | DEBUG | cwl_loader:load_cwl_from_yaml:162 - Dereferencing the steps[].run...
2026-05-13 12:25:59.796 | DEBUG | cwl_loader:_on_process:54 - Checking if https://raw.githubusercontent.com/eoap/application-package-patterns/refs/heads/main/cwl-workflow/pattern-12.cwl#crop must be externally imported...
2026-05-13 12:25:59.796 | DEBUG | cwl_loader:_on_process:58 - run_url: https://raw.githubusercontent.com/eoap/application-package-patterns/refs/heads/main/cwl-workflow/pattern-12.cwl - uri: https://raw.githubusercontent.com/eoap/application-package-patterns/refs/heads/main/cwl-workflow/pattern-12.cwl
2026-05-13 12:25:59.797 | DEBUG | cwl_loader:_on_process:54 - Checking if https://raw.githubusercontent.com/eoap/application-package-patterns/refs/heads/main/cwl-workflow/pattern-12.cwl#norm_diff must be externally imported...
2026-05-13 12:25:59.797 | DEBUG | cwl_loader:_on_process:58 - run_url: https://raw.githubusercontent.com/eoap/application-package-patterns/refs/heads/main/cwl-workflow/pattern-12.cwl - uri: https://raw.githubusercontent.com/eoap/application-package-patterns/refs/heads/main/cwl-workflow/pattern-12.cwl
2026-05-13 12:25:59.798 | DEBUG | cwl_loader:_on_process:54 - Checking if https://raw.githubusercontent.com/eoap/application-package-patterns/refs/heads/main/cwl-workflow/pattern-12.cwl#otsu must be externally imported...
2026-05-13 12:25:59.798 | DEBUG | cwl_loader:_on_process:58 - run_url: https://raw.githubusercontent.com/eoap/application-package-patterns/refs/heads/main/cwl-workflow/pattern-12.cwl - uri: https://raw.githubusercontent.com/eoap/application-package-patterns/refs/heads/main/cwl-workflow/pattern-12.cwl
2026-05-13 12:25:59.799 | DEBUG | cwl_loader:load_cwl_from_yaml:166 - steps[].run successfully dereferenced! Dereferencing the FQNs...
2026-05-13 12:25:59.800 | DEBUG | cwl_loader:load_cwl_from_yaml:170 - CWL document successfully dereferenced! Now verifying steps[].run integrity...
2026-05-13 12:25:59.800 | DEBUG | cwl_loader:load_cwl_from_yaml:176 - All steps[].run link are resolvable!
2026-05-13 12:25:59.801 | DEBUG | cwl_loader:load_cwl_from_yaml:179 - Sorting Process instances by dependencies....
2026-05-13 12:25:59.802 | DEBUG | cwl_loader:load_cwl_from_yaml:181 - Sorting process is over.
2026-05-13 12:25:59.803 | DEBUG | cwl_loader:_load_cwl_from_stream:253 - Stream from https://raw.githubusercontent.com/eoap/application-package-patterns/refs/heads/main/cwl-workflow/pattern-12.cwl successfully load!
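The lookup loop in the cell above can be factored into a small helper. The following is a minimal sketch: `find_process` is a hypothetical name, not part of cwl_loader or cwl2ogc, and the `SimpleNamespace` objects are stand-ins for the parsed processes, assuming only that each one exposes an `id` attribute ending in `#<identifier>`.

```python
from types import SimpleNamespace


def find_process(processes, process_id):
    """Return the first process whose id fragment equals process_id, else None."""
    for process in processes:
        # ids look like "<document URL>#<identifier>": compare the fragment only
        if process.id.split("#")[-1] == process_id:
            return process
    return None


# Stand-in objects (assumption): real entries come from load_cwl_from_location
graph = [
    SimpleNamespace(id="https://example.org/pattern-12.cwl#crop"),
    SimpleNamespace(id="https://example.org/pattern-12.cwl#pattern-12"),
]

workflow = find_process(graph, "pattern-12")
print(workflow.id)
print(find_process(graph, "missing"))
```

Returning `None` instead of raising leaves the caller free to decide how to report a missing identifier, as the original cell does with `ValueError`.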
2. Inputs JSON Schema generation¶
Once the document is parsed, invoke the cwl2ogc APIs to convert the CWL inputs to a JSON Schema:
In [2]:
import sys
cwl_converter.dump_inputs_json_schema(stream=sys.stdout, pretty_print=True)
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://eoap.github.io/cwl2ogc/pattern-12/inputs.yaml",
"description": "The schema to represent a pattern-12 inputs definition",
"type": "object",
"required": [
"aoi",
"bands",
"item",
"cropped-collection",
"ndwi-collection",
"water-bodies-collection"
],
"properties": {
"aoi": {
"$ref": "#/$defs/aoi"
},
"bands": {
"$ref": "#/$defs/bands"
},
"item": {
"$ref": "#/$defs/item"
},
"cropped-collection": {
"$ref": "#/$defs/cropped-collection"
},
"ndwi-collection": {
"$ref": "#/$defs/ndwi-collection"
},
"water-bodies-collection": {
"$ref": "#/$defs/water-bodies-collection"
}
},
"additionalProperties": false,
"$defs": {
"aoi": {
"type": "object",
"properties": {
"bbox": {
"type": "array",
"items": {
"type": "number",
"format": "double"
}
},
"crs": {
"type": "string",
"enum": [
"CRS84",
"CRS84h"
]
}
},
"required": [
"bbox",
"crs"
]
},
"bands": {
"type": "array",
"items": {
"type": "string"
},
"default": [
"green",
"nir"
]
},
"item": {
"oneOf": [
{
"type": "string",
"format": "uri"
},
{
"$ref": "https://schemas.stacspec.org/v1.0.0/item-spec/json-schema/item.json"
},
{
"$ref": "https://schemas.stacspec.org/v1.0.0/collection-spec/json-schema/collection.json"
},
{
"$ref": "https://schemas.opengis.net/ogcapi/features/part1/1.0/openapi/schemas/featureCollectionGeoJSON.yaml"
}
]
},
"cropped-collection": {
"type": "string",
"format": "uri"
},
"ndwi-collection": {
"type": "string",
"format": "uri"
},
"water-bodies-collection": {
"type": "string",
"format": "uri"
}
}
}
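Each entry under `properties` points to its definition through a local `$ref` such as `#/$defs/aoi`. As a rough illustration of how such a reference is followed, the sketch below resolves a JSON Pointer fragment against a trimmed copy of the schema above, using only the standard library. The `resolve_local_ref` helper is hypothetical and handles only same-document references that traverse objects.

```python
def resolve_local_ref(schema: dict, ref: str) -> dict:
    """Follow a same-document reference such as '#/$defs/aoi'."""
    if not ref.startswith("#/"):
        raise ValueError(f"Not a local reference: {ref}")
    node = schema
    for token in ref[2:].split("/"):
        # Undo JSON Pointer escaping: '~1' means '/', '~0' means '~'
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[token]
    return node


# Trimmed copy of the inputs schema shown above
inputs_schema = {
    "required": ["aoi", "bands"],
    "properties": {
        "aoi": {"$ref": "#/$defs/aoi"},
        "bands": {"$ref": "#/$defs/bands"},
    },
    "$defs": {
        "aoi": {"type": "object", "required": ["bbox", "crs"]},
        "bands": {"type": "array", "items": {"type": "string"}},
    },
}

for name, prop in inputs_schema["properties"].items():
    definition = resolve_local_ref(inputs_schema, prop["$ref"])
    print(name, "->", definition["type"])
```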
2.1 Inputs validation¶
The schema can be used to validate an inputs dictionary (JSON Schema validation errors are expected in the example below):
In [3]:
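from jsonschema import Draft202012Validator
from jsonschema.exceptions import SchemaError


def validate(schema: dict, data: dict):
    try:
        # Instantiating a validator does not check the schema, so check it explicitly
        Draft202012Validator.check_schema(schema)
    except SchemaError as schema_error:
        print(
            f"An error occurred while checking the schema with {Draft202012Validator.__name__}: {schema_error.message}"
        )
        return

    validator = Draft202012Validator(schema)
    violations = 0
    # iter_errors returns a lazy iterator, which is always truthy: count while iterating
    for error in validator.iter_errors(data):
        violations += 1
        # Path entries can be integers (array indexes), so convert them before joining
        schema_path = ".".join(str(part) for part in error.schema_path)
        instance_path = "/".join(str(part) for part in error.path)
        print(f"[{schema_path}] - #/{instance_path}: {error.message}")

    if violations == 0:
        print("No JSON Schema violations detected!")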
Define the inputs to be validated:
In [4]:
inputs = {
    "aoi": "-118.985,38.432,-118.183,38.938",
    "filesB": "EPSG:4326",
    "bands": ["green", "nir08"],
    "item": "https://planetarycomputer.microsoft.com/api/stac/v1/collections/landsat-c2-l2/items/LC08_L2SP_042033_20231007_02_T1",
}

validate(cwl_converter.get_inputs_json_schema(), inputs)
[required] - #/: 'cropped-collection' is a required property
[required] - #/: 'ndwi-collection' is a required property
[required] - #/: 'water-bodies-collection' is a required property
[properties.aoi.type] - #/aoi: '-118.985,38.432,-118.183,38.938' is not of type 'object'
After reporting these violations, the cell stops with an exception raised while the validator resolves a remote $ref inside the oneOf of the item definition: the response retrieved for featureCollectionGeoJSON.yaml cannot be decoded as JSON. The exception chain, with intermediate frames omitted, is:
---------------------------------------------------------------------------
JSONDecodeError: Expecting value: line 1 column 1 (char 0)

The above exception was the direct cause of the following exception:

Unretrievable: 'https://schemas.opengis.net/ogcapi/features/part1/1.0/openapi/schemas/featureCollectionGeoJSON.yaml'

The above exception was the direct cause of the following exception:

Unresolvable: https://schemas.opengis.net/ogcapi/features/part1/1.0/openapi/schemas/featureCollectionGeoJSON.yaml

The above exception was the direct cause of the following exception:

_WrappedReferencingError: Unresolvable: https://schemas.opengis.net/ogcapi/features/part1/1.0/openapi/schemas/featureCollectionGeoJSON.yaml
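The traceback above originates from a remote `$ref` that the validator tries to download while evaluating the `item` input. To see which external documents a generated schema depends on, one possibility is to walk the schema and collect every `$ref` that does not point inside the same document. This is a minimal standard-library sketch; `collect_remote_refs` is a hypothetical helper, not part of cwl2ogc or jsonschema.

```python
def collect_remote_refs(node, found=None):
    """Recursively gather every '$ref' value that is not a same-document pointer."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        ref = node.get("$ref")
        # Same-document references start with '#'; anything else is external
        if isinstance(ref, str) and not ref.startswith("#"):
            found.add(ref)
        for value in node.values():
            collect_remote_refs(value, found)
    elif isinstance(node, list):
        for value in node:
            collect_remote_refs(value, found)
    return found


# Trimmed copy of the 'item' definition from the inputs schema
item_schema = {
    "properties": {"item": {"$ref": "#/$defs/item"}},
    "$defs": {
        "item": {
            "oneOf": [
                {"type": "string", "format": "uri"},
                {"$ref": "https://schemas.stacspec.org/v1.0.0/item-spec/json-schema/item.json"},
                {"$ref": "https://schemas.opengis.net/ogcapi/features/part1/1.0/openapi/schemas/featureCollectionGeoJSON.yaml"},
            ]
        }
    },
}

for url in sorted(collect_remote_refs(item_schema)):
    print(url)
```

In the notebook, the same helper could be applied to the dictionary returned by `cwl_converter.get_inputs_json_schema()` to list the documents that would need to be reachable, or provided locally, before validation.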
3. Outputs JSON Schema generation¶
Users can reuse the BaseCWLtypes2OGCConverter instance to convert the CWL outputs to the JSON Schema:
In [5]:
cwl_converter.dump_outputs_json_schema(stream=sys.stdout, pretty_print=True)
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"$id": "https://eoap.github.io/cwl2ogc/pattern-12/outputs.yaml",
"description": "The schema to represent a pattern-12 outputs definition",
"type": "object",
"required": [
"cropped",
"ndwi",
"water_bodies"
],
"properties": {
"cropped": {
"$ref": "#/$defs/cropped"
},
"ndwi": {
"$ref": "#/$defs/ndwi"
},
"water_bodies": {
"$ref": "#/$defs/water_bodies"
}
},
"additionalProperties": false,
"$defs": {
"cropped": {
"type": "array",
"items": {
"oneOf": [
{
"type": "string",
"format": "uri"
},
{
"$ref": "https://schemas.stacspec.org/v1.0.0/item-spec/json-schema/item.json"
},
{
"$ref": "https://schemas.stacspec.org/v1.0.0/collection-spec/json-schema/collection.json"
},
{
"$ref": "https://schemas.opengis.net/ogcapi/features/part1/1.0/openapi/schemas/featureCollectionGeoJSON.yaml"
}
]
}
},
"ndwi": {
"oneOf": [
{
"type": "string",
"format": "uri"
},
{
"$ref": "https://schemas.stacspec.org/v1.0.0/item-spec/json-schema/item.json"
},
{
"$ref": "https://schemas.stacspec.org/v1.0.0/collection-spec/json-schema/collection.json"
},
{
"$ref": "https://schemas.opengis.net/ogcapi/features/part1/1.0/openapi/schemas/featureCollectionGeoJSON.yaml"
}
]
},
"water_bodies": {
"oneOf": [
{
"type": "string",
"format": "uri"
},
{
"$ref": "https://schemas.stacspec.org/v1.0.0/item-spec/json-schema/item.json"
},
{
"$ref": "https://schemas.stacspec.org/v1.0.0/collection-spec/json-schema/collection.json"
},
{
"$ref": "https://schemas.opengis.net/ogcapi/features/part1/1.0/openapi/schemas/featureCollectionGeoJSON.yaml"
}
]
}
}
}
3.1 Outputs validation¶
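The schema can be used to validate an outputs dictionary (JSON Schema validation errors are expected in the example below, since the dictionary omits the required outputs and contains an unexpected key):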
In [6]:
outputs = {"example_out": "In girum imus nocte et consumimur igni"}
validate(cwl_converter.get_outputs_json_schema(), outputs)
[required] - #/: 'cropped' is a required property
[required] - #/: 'ndwi' is a required property
[required] - #/: 'water_bodies' is a required property
[additionalProperties] - #/: Additional properties are not allowed ('example_out' was unexpected)
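The four violations above come from two keywords of the outputs schema: `required` and `additionalProperties: false`. The following sketch mimics only those two checks in plain Python, to make explicit why `example_out` is rejected. `top_level_violations` is a hypothetical helper and not a substitute for a real JSON Schema validator; its messages are simplified.

```python
def top_level_violations(schema: dict, data: dict) -> list[str]:
    """Report missing required keys and, when disallowed, unexpected keys."""
    violations = [
        f"'{name}' is a required property"
        for name in schema.get("required", [])
        if name not in data
    ]
    # additionalProperties defaults to allowing extra keys when absent
    if schema.get("additionalProperties", True) is False:
        allowed = schema.get("properties", {})
        for name in data:
            if name not in allowed:
                violations.append(f"'{name}' was unexpected")
    return violations


# Trimmed copy of the outputs schema shown above
outputs_schema = {
    "required": ["cropped", "ndwi", "water_bodies"],
    "properties": {"cropped": {}, "ndwi": {}, "water_bodies": {}},
    "additionalProperties": False,
}

outputs = {"example_out": "In girum imus nocte et consumimur igni"}
for message in top_level_violations(outputs_schema, outputs):
    print(message)
```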