Developer Onboarding
Goal
Contribute safely to the CLI tools and CWL workflows, especially stac-zarr.
Repository map
- command-line-tools/stac-collection/: STAC packaging step for NDWI/Otsu outputs
- command-line-tools/stac-zarr/: Zarr v3 + STAC writer
- command-line-tools/occurrence/: consumer tool (reads Zarr STAC output and derives occurrence)
- command-line-tools/stac-eopf-product/: EOPF-style Zarr output
- cwl-workflow/: producer and consumer workflow definitions
- docs/: MkDocs content and notebook-backed walkthroughs
Local setup (uv + Task, recommended)
uv sync
Useful commands:
task test:unit:scoped
task cwl:run:producer
task cwl:test:e2e
task containers:build:producer
Local setup (pip editable, alternative)
python -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install -e command-line-tools/stac-collection
pip install -e command-line-tools/stac-zarr
pip install -e command-line-tools/occurrence
pip install -e command-line-tools/stac-eopf-product
Common checks
task test:unit:scoped
task cwl:check:release
stac-zarr implementation contract
stac-zarr is Collection-driven:
- collection.item_assets is required
- measurement list is derived from collection.item_assets
- each input Item must contain all declared measurement keys
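The contract above can be sketched as a small pre-flight check on plain STAC dictionaries. This is a minimal illustration, not stac-zarr's actual validation code: the function names `measurement_keys` and `missing_measurements` and the sample documents are hypothetical, and only the standard STAC fields `item_assets` and `assets` are assumed.

```python
from typing import Any


def measurement_keys(collection: dict[str, Any]) -> list[str]:
    """Return the measurement keys declared in collection["item_assets"]."""
    item_assets = collection.get("item_assets")
    if not item_assets:
        # Mirrors the rule that collection.item_assets is required.
        raise ValueError("collection.item_assets is required")
    return list(item_assets)


def missing_measurements(collection: dict[str, Any], item: dict[str, Any]) -> list[str]:
    """Return declared measurement keys that are absent from item["assets"]."""
    assets = item.get("assets", {})
    return [key for key in measurement_keys(collection) if key not in assets]


# Hypothetical minimal documents, for illustration only.
collection = {"id": "water-bodies", "item_assets": {"ndwi": {}, "otsu": {}}}
complete_item = {
    "id": "item-1",
    "assets": {"ndwi": {"href": "ndwi.tif"}, "otsu": {"href": "otsu.tif"}},
}
partial_item = {"id": "item-2", "assets": {"ndwi": {"href": "ndwi.tif"}}}

print(missing_measurements(collection, complete_item))  # []
print(missing_measurements(collection, partial_item))   # ['otsu']
```

An Item that returns a non-empty list here would violate the contract, since it lacks at least one measurement declared by the Collection.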
stac-zarr also writes:
- STAC rel: store and asset-level metadata (cube:*, proj:*, raster:*)
- Zarr v3 root conventions metadata (zarr_conventions, multiscales, proj:*, spatial:*)
- overview pyramids under measurements_overviews/
Overview controls:
- --overview-levels
- --continuous-overview-reducer (mean|max|median|nearest)
- --categorical-overview-reducer (mean|max|median|nearest)
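To make the reducer names concrete, here is a rough pure-Python sketch of what one overview level could look like when each 2x2 block of a grid is collapsed to a single value. It is an assumption-laden illustration and not the stac-zarr implementation: the function `downsample_2x`, the `REDUCERS` table, the 2x downsampling factor, and the choice of the top-left sample for `nearest` are all hypothetical.

```python
import statistics
from typing import Callable, Sequence

# Each reducer maps the values of one block to a single overview value.
REDUCERS: dict[str, Callable[[Sequence[float]], float]] = {
    "mean": statistics.fmean,
    "max": max,
    "median": statistics.median,
    # Assumption: "nearest" keeps the block's top-left sample.
    "nearest": lambda values: values[0],
}


def downsample_2x(grid: list[list[float]], reducer: str) -> list[list[float]]:
    """Halve both dimensions by reducing each 2x2 block (even-sized grid assumed)."""
    reduce_block = REDUCERS[reducer]
    out = []
    for r in range(0, len(grid), 2):
        row = []
        for c in range(0, len(grid[0]), 2):
            block = [grid[r][c], grid[r][c + 1], grid[r + 1][c], grid[r + 1][c + 1]]
            row.append(reduce_block(block))
        out.append(row)
    return out


grid = [
    [1, 2, 5, 6],
    [3, 4, 7, 8],
    [0, 0, 1, 1],
    [0, 4, 1, 9],
]
print(downsample_2x(grid, "mean"))     # [[2.5, 6.5], [1.0, 3.0]]
print(downsample_2x(grid, "nearest"))  # [[1, 5], [0, 1]]
```

Note how `mean` produces values that never occurred in the input, while `nearest` only ever returns existing samples; that difference is the usual reason to pick reducers separately for continuous and categorical layers.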
When changing CLI options, update:
- cwl-workflow/app-water-bodies.cwl
- docs pages (docs/input-stac.md, docs/compliance/stac-zarr-best-practices.md)
Regenerating Pydantic convention models
stac-zarr can regenerate schema-derived Pydantic models (instead of hand-editing) with:
task models:generate:all
Or run individually:
task models:generate:spatial
task models:generate:geo-proj
task models:generate:multiscales
Generated files are written to:
command-line-tools/stac-zarr/stac_zarr/models/generated/
Notes:
- geo-proj schema is modular, so generation creates a package:
  - command-line-tools/stac-zarr/stac_zarr/models/generated/geo_proj/__init__.py
  - command-line-tools/stac-zarr/stac_zarr/models/generated/geo_proj/_internal.py
  - command-line-tools/stac-zarr/stac_zarr/models/generated/geo_proj/projjson.py
- spatial and multiscales currently generate single modules:
  - .../generated/spatial.py
  - .../generated/multiscales.py
- Runtime stac-zarr uses generated convention/spatial models; legacy hand-written models were removed.
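After regenerating, it can help to confirm that the expected modules exist before committing. The following is a small sketch using only the standard library; the helper `missing_generated_files` is hypothetical and not part of the repository, and it assumes that the abbreviated `.../generated/` paths for spatial and multiscales refer to the same generated directory listed above.

```python
from pathlib import Path

# Output directory named in this guide, relative to the repository root.
GENERATED_DIR = Path("command-line-tools/stac-zarr/stac_zarr/models/generated")

# Assumption: spatial.py and multiscales.py live directly under GENERATED_DIR.
EXPECTED_FILES = [
    "geo_proj/__init__.py",
    "geo_proj/_internal.py",
    "geo_proj/projjson.py",
    "spatial.py",
    "multiscales.py",
]


def missing_generated_files(repo_root: Path) -> list[str]:
    """Return the expected generated files that are not present under repo_root."""
    base = repo_root / GENERATED_DIR
    return [name for name in EXPECTED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    # Run from the repository root after `task models:generate:all`.
    missing = missing_generated_files(Path.cwd())
    print("missing:", missing if missing else "none")
```

The check only verifies that the files are present; it does not validate their contents against the schemas.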