Input STAC Requirements

Mandatory use of the Item Assets Extension

The tooling that reads a STAC Catalog and produces STAC/Zarr outputs expects the input STAC Catalog to contain a STAC Collection with the Item Assets extension defined.

The item_assets definitions are used as the authoritative source for deriving the measurements written to the output Zarr store (native Zarr v3 or Zarr v2 following EOPF conventions).

Note: Collections without item_assets are considered invalid inputs for this tool.

Why item_assets is required

The conversion process produces a Zarr layout of the form:

data.zarr/
└── measurements/
    ├── <measurement-1>/
    ├── <measurement-2>/
    └── ...

Each Zarr measurement group is derived from a corresponding Item Asset definition in the Collection.

The Item Assets extension provides:

  • The canonical list of measurements to materialize
  • Stable measurement identifiers (asset keys)
  • Semantic metadata (title, description, roles)
  • Media type and band definitions

This avoids relying on:

  • implicit inspection of Items
  • asset presence heuristics
  • dataset-specific assumptions

Expected Collection structure

The input STAC Collection MUST:

  • Declare the Item Assets extension
  • Define an item_assets object
  • Include one entry per measurement to be written

Minimal example:

{
  "type": "Collection",
  "stac_version": "1.1.0",

  "stac_extensions": [
    "https://stac-extensions.github.io/item-assets/v1.0.0/schema.json"
  ],

  "item_assets": {
    "water-bodies": {
      "title": "Water Bodies",
      "description": "Water bodies classification",
      "roles": ["data"],
      "type": "application/vnd.zarr; version=3",
      "bands": [
        {
          "name": "water-bodies",
          "description": "Water bodies classification"
        }
      ]
    },
    "water-bodies-confidence": {
      "title": "Water Bodies Confidence",
      "description": "Confidence of water bodies detection",
      "roles": ["data"],
      "type": "application/vnd.zarr; version=3"
    }
  }
}
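The three requirements above can be checked on the parsed Collection JSON before any conversion starts. The following is a minimal sketch, not the tool's actual code: the function name `check_collection` and the prefix constant are hypothetical, and matching the extension by URL prefix (so that any published version is accepted) is an assumption.

```python
# Hypothetical helper; names are illustrative and not taken from the tool.
ITEM_ASSETS_PREFIX = "https://stac-extensions.github.io/item-assets/"


def check_collection(collection: dict) -> list[str]:
    """Return a list of problems; an empty list means the requirements hold."""
    problems = []

    # Requirement 1: the Item Assets extension is declared.
    # Assumption: any version of the extension schema is acceptable.
    extensions = collection.get("stac_extensions", [])
    if not any(url.startswith(ITEM_ASSETS_PREFIX) for url in extensions):
        problems.append("Item Assets extension is not declared")

    # Requirements 2 and 3: item_assets exists and declares at least one entry.
    item_assets = collection.get("item_assets")
    if not isinstance(item_assets, dict):
        problems.append("item_assets object is missing")
    elif not item_assets:
        problems.append("item_assets declares no measurements")

    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report everything that is wrong with a Collection in a single pass.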

How item_assets is used by the tool

For each entry in collection.item_assets:

| Item Assets field | Usage in Zarr output                          |
|-------------------|-----------------------------------------------|
| Asset key         | Name of the Zarr measurement group            |
| title             | Zarr group attribute (title)                  |
| description       | Zarr group attribute (description)            |
| bands             | Variables created under the measurement group |
| roles             | Informational (not mapped to storage layout)  |
| type              | Validation of expected data model             |

The tool does not infer measurements from Items. Only measurements explicitly declared in item_assets are materialized.
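The field mapping can be illustrated with plain dictionaries, without writing any Zarr data. This is a sketch under assumptions: `plan_measurements` is a hypothetical name, the returned structure is illustrative only, and treating an entry without `bands` as having no variables is a guess, since the handling of such entries is not described here.

```python
def plan_measurements(collection: dict) -> dict:
    """Map each item_assets entry to a planned measurements/<key> group."""
    plan = {}
    for key, asset in collection.get("item_assets", {}).items():
        # title and description become group attributes when present.
        attrs = {
            name: asset[name]
            for name in ("title", "description")
            if name in asset
        }
        # Each band name becomes a variable under the measurement group.
        # Assumption: an entry without bands yields an empty variable list.
        variables = [band["name"] for band in asset.get("bands", [])]
        # roles and type are not part of the storage layout, so they are skipped.
        plan[f"measurements/{key}"] = {"attrs": attrs, "variables": variables}
    return plan
```

Applied to the minimal example Collection, this yields two planned groups, `measurements/water-bodies` (with one variable, `water-bodies`) and `measurements/water-bodies-confidence`.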

Relationship with Items

  • Items are used only as a source of data
  • Items MAY contain additional assets
  • Assets not declared in item_assets are ignored

This allows:

  • heterogeneous Items
  • sparse or partial Item coverage
  • future Item evolution without breaking the Zarr layout
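Ignoring undeclared assets amounts to filtering each Item's `assets` by the keys declared in `item_assets`. A minimal sketch, assuming Items are available as parsed JSON dictionaries; the function name `select_declared_assets` is hypothetical.

```python
def select_declared_assets(item: dict, declared_keys: set[str]) -> dict:
    """Keep only the Item assets whose key is declared in item_assets."""
    return {
        key: asset
        for key, asset in item.get("assets", {}).items()
        if key in declared_keys
    }


# Usage: an Item carrying an extra "thumbnail" asset and lacking one declared key.
declared = {"water-bodies", "water-bodies-confidence"}
item = {
    "assets": {
        "water-bodies": {"href": "data.zarr/measurements/water-bodies"},
        "thumbnail": {"href": "preview.png"},
    }
}
selected = select_declared_assets(item, declared)  # only "water-bodies" remains
```

An Item that lacks a declared key simply contributes nothing for that measurement, which is what makes sparse or partial coverage possible.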

Validation behavior

If any of the following conditions are met, the tool fails fast:

  • item_assets is missing
  • a key declared in item_assets is not present in any Item (each key must appear in at least one Item)
  • required band variables cannot be resolved

This ensures the output Zarr store is:

  • deterministic
  • schema-driven
  • reproducible
  • aligned with the STAC Zarr Best Practices
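The first two fail-fast checks can be sketched on parsed JSON as follows. Several points are assumptions: the exception type `InvalidInputError` and the function `validate_inputs` are hypothetical, and the second condition is read as "a declared key that no Item provides", which is the reading compatible with partial Item coverage. The band-resolution check is omitted because it requires opening the underlying data.

```python
class InvalidInputError(ValueError):
    """Hypothetical error type; the tool's real exception is not documented here."""


def validate_inputs(collection: dict, items: list[dict]) -> None:
    """Raise on the first violated condition, before any output is written."""
    item_assets = collection.get("item_assets")
    if not item_assets:
        raise InvalidInputError("Collection has no item_assets")

    # Collect every asset key seen across all Items.
    seen = set()
    for item in items:
        seen.update(item.get("assets", {}))

    # Assumption: a declared key must be provided by at least one Item.
    missing = sorted(set(item_assets) - seen)
    if missing:
        raise InvalidInputError(
            f"item_assets keys not found in any Item: {missing}"
        )
```

Running these checks before any writing starts means a failed validation leaves no partial Zarr store behind.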

Rationale (design choice)

Using item_assets as the measurement contract:

  • aligns with STAC best practices
  • avoids Item-level duplication
  • supports Collection-only data models
  • maps cleanly to the EOPF measurements/* layout
  • works equally well for native and virtual Zarr stores

This approach treats the STAC Collection as the data model and Items as data carriers, which is consistent with datacube-oriented workflows.