data_ingestion/spaceeye/spaceeye_preprocess_ensemble

Runs the SpaceEye preprocessing pipeline with an ensemble of cloud segmentation models. The workflow fetches both Sentinel-1 and Sentinel-2 tiles that cover the input geometry and time range and preprocesses them, it also computes improved cloud masks using cloud and shadow segmentation models. Cloud probabilities are computed with an ensemble of five models.

graph TD inp1>user_input] out1>s2_raster] out2>s1_raster] out3>cloud_mask] tsk1{{s2}} tsk2{{s1}} tsk1{{s2}} -- raster/s2_products --> tsk2{{s1}} inp1>user_input] -- user_input --> tsk1{{s2}} inp1>user_input] -- user_input --> tsk2{{s1}} tsk1{{s2}} -- raster --> out1>s2_raster] tsk2{{s1}} -- raster --> out2>s1_raster] tsk1{{s2}} -- mask --> out3>cloud_mask]

Sources

  • user_input: Time range and geometry of interest.

Sinks

  • s2_raster: Sentinel-2 rasters.

  • s1_raster: Sentinel-1 rasters.

  • cloud_mask: Cloud and cloud shadow mask.

Parameters

  • pc_key: Planetary Computer API key.

Tasks

  • s2: Downloads and preprocesses Sentinel-2 imagery that covers the input geometry and time range, and computes improved cloud masks using an ensemble of cloud and shadow segmentation models.

  • s1: Downloads and preprocesses tiles of Sentinel-1 imagery that intersect with the input Sentinel-2 products in the input time range.

Workflow Yaml

name: spaceeye_preprocess_ensemble
sources:
  user_input:
  - s2.user_input
  - s1.user_input
sinks:
  s2_raster: s2.raster
  s1_raster: s1.raster
  cloud_mask: s2.mask
parameters:
  pc_key: '@SECRET(eywa-secrets, pc-sub-key)'
tasks:
  s2:
    workflow: data_ingestion/sentinel2/preprocess_s2_ensemble_masks
    parameters:
      pc_key: '@from(pc_key)'
  s1:
    workflow: data_ingestion/sentinel1/preprocess_s1
    parameters:
      pc_key: '@from(pc_key)'
edges:
- origin: s2.raster
  destination:
  - s1.s2_products
description:
  short_description: Runs the SpaceEye preprocessing pipeline with an ensemble of
    cloud segmentation models.
  long_description: The workflow fetches both Sentinel-1 and Sentinel-2 tiles that
    cover the input geometry and time range and preprocesses them, it also computes
    improved cloud masks using cloud and shadow segmentation models. Cloud probabilities
    are computed with an ensemble of five models.
  sources:
    user_input: Time range and geometry of interest.
  sinks:
    s2_raster: Sentinel-2 rasters.
    s1_raster: Sentinel-1 rasters.
    cloud_mask: Cloud and cloud shadow mask.
  parameters:
    pc_key: Planetary Computer API key.