Data Flow Management
The Best Practice for Earth Observation Application Package addresses data flow management of input and output EO product files. It defines data stage-in and data stage-out rules for Applications that require staged files, generate files that need to be staged out, or both.
Data stage-in definition
Data stage-in is the process of retrieving the inputs and making them available to the processing. Processing inputs are provided as catalogue references, and the Platform is responsible for translating those references into files available to the local processing.
Data stage-out definition
Data stage-out is the process of uploading the output files generated by the processing to one or more external systems and making them available for later use. The Platform retrieves the processing outputs and automatically stores them on external persistent storage. Additionally, the Platform should publish the metadata of the outputs to a Catalogue and provide their references as an output.
Platform data flow management
For the data stage-in, the Platform creates a local STAC Catalog that serves as the input files manifest for the Application. The Catalog contains a STAC Item whose Assets have an accessible href, either local or remote (e.g. a COG).
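To illustrate how an Application might consume such a manifest, the following is a minimal sketch using only the Python standard library. The function names (list_staged_assets, write_example_manifest), the item identifier, the asset keys and the example.com URL are hypothetical and are not defined by the Best Practice. The sketch assumes that the catalog links to its items with rel "item" and that relative hrefs are resolved against the file that contains them.

```python
import json
import tempfile
from pathlib import Path
from urllib.parse import urlparse


def list_staged_assets(catalog_path):
    """Return {item_id: {asset_key: href}} for each Item linked from a local STAC Catalog."""
    catalog_path = Path(catalog_path).resolve()
    catalog = json.loads(catalog_path.read_text())
    staged = {}
    for link in catalog.get("links", []):
        if link.get("rel") != "item":
            continue
        # Item links are resolved relative to the catalog file.
        item_path = (catalog_path.parent / link["href"]).resolve()
        item = json.loads(item_path.read_text())
        assets = {}
        for key, asset in item.get("assets", {}).items():
            href = asset["href"]
            # Remote hrefs (e.g. a COG over HTTPS) are kept unchanged;
            # relative local hrefs are resolved against the item file.
            if not urlparse(href).scheme and not Path(href).is_absolute():
                href = str((item_path.parent / href).resolve())
            assets[key] = href
        staged[item["id"]] = assets
    return staged


def write_example_manifest(root):
    """Write a tiny, hypothetical stage-in manifest under `root`; return the catalog path."""
    root = Path(root)
    item_dir = root / "S2-scene"
    item_dir.mkdir(parents=True, exist_ok=True)
    item = {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": "S2-scene",
        "geometry": None,
        "properties": {"datetime": "2021-07-13T10:00:00Z"},
        "links": [],
        "assets": {
            "green": {"href": "./B03.tif"},  # local file (assumed name)
            "nir": {"href": "https://example.com/cogs/B08.tif"},  # remote COG (assumed URL)
        },
    }
    (item_dir / "S2-scene.json").write_text(json.dumps(item, indent=2))
    catalog = {
        "type": "Catalog",
        "stac_version": "1.0.0",
        "id": "stage-in",
        "description": "Input files manifest (illustrative)",
        "links": [{"rel": "item", "href": "./S2-scene/S2-scene.json"}],
    }
    catalog_file = root / "catalog.json"
    catalog_file.write_text(json.dumps(catalog, indent=2))
    return catalog_file


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        catalog_file = write_example_manifest(tmp)
        for item_id, assets in list_staged_assets(catalog_file).items():
            print(item_id, assets)
```

A production Application would more likely rely on a STAC client library than on hand-written JSON parsing; the standard library is used here only to keep the sketch self-contained.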
For the data stage-out, the Application creates a local STAC Catalog as the output files manifest. The Catalog describes the results' metadata and the assets' locations, enabling the Platform to provide the processing results in the OGC API - Processes response.
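As an illustration of the Application side of stage-out, the sketch below writes a catalog.json and one STAC Item describing the generated files, again using only the Python standard library. The function name write_output_manifest, the directory layout, the "water-bodies" identifier and the asset key are assumptions made for this example; the Best Practice does not prescribe them. Geometry is left null for brevity, whereas a real result would normally carry its footprint.

```python
import json
import os
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def write_output_manifest(out_dir, item_id, asset_files):
    """Write catalog.json plus one STAC Item under `out_dir`; return the catalog path.

    `asset_files` maps an asset key to the path of a file produced by the processing.
    """
    out_dir = Path(out_dir)
    item_dir = out_dir / item_id
    item_dir.mkdir(parents=True, exist_ok=True)

    assets = {}
    for key, file_path in asset_files.items():
        # hrefs relative to the item file keep the manifest relocatable,
        # so the Platform can move the whole directory during stage-out.
        rel_href = os.path.relpath(file_path, item_dir)
        assets[key] = {"href": rel_href, "roles": ["data"]}

    item = {
        "type": "Feature",
        "stac_version": "1.0.0",
        "id": item_id,
        "geometry": None,  # assumption: footprint omitted in this sketch
        "properties": {
            "datetime": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
        },
        "links": [{"rel": "root", "href": "../catalog.json"}],
        "assets": assets,
    }
    (item_dir / f"{item_id}.json").write_text(json.dumps(item, indent=2))

    catalog = {
        "type": "Catalog",
        "stac_version": "1.0.0",
        "id": "results",
        "description": "Output files manifest (illustrative)",
        "links": [
            {"rel": "root", "href": "./catalog.json"},
            {"rel": "item", "href": f"./{item_id}/{item_id}.json"},
        ],
    }
    catalog_file = out_dir / "catalog.json"
    catalog_file.write_text(json.dumps(catalog, indent=2))
    return catalog_file


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        result = Path(tmp) / "water-bodies" / "result.tif"
        result.parent.mkdir(parents=True)
        result.write_bytes(b"")  # placeholder for a real output file
        print(write_output_manifest(tmp, "water-bodies", {"data": result}))
```

Writing the catalog at the root of the output directory gives the Platform a single, predictable entry point from which to discover every result file.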
Example
The data flow management concepts, mapped to a Water Body Detection application, are depicted below.