data_ingestion/sentinel2/preprocess_s2_ensemble_masks

Downloads and preprocesses Sentinel-2 imagery that covers the input geometry and time range, and computes improved cloud masks using an ensemble of cloud and shadow segmentation models.

This workflow selects a minimum set of tiles that covers the input geometry, downloads Sentinel-2 imagery for the selected time range, and preprocesses it by generating a single multi-band raster at 10m resolution. It then improves cloud masks by merging the product mask with cloud and shadow masks computed using an ensemble of cloud and shadow segmentation models.

graph TD
    inp1>user_input]
    out1>raster]
    out2>mask]
    tsk1{{s2}}
    tsk2{{cloud}}
    tsk1{{s2}} -- raster/s2_raster --> tsk2{{cloud}}
    tsk1{{s2}} -- mask/product_mask --> tsk2{{cloud}}
    inp1>user_input] -- user_input --> tsk1{{s2}}
    tsk1{{s2}} -- raster --> out1>raster]
    tsk2{{cloud}} -- mask --> out2>mask]

Sources

  • user_input: Time range and geometry of interest.

Sinks

  • raster: Sentinel-2 L2A rasters with all bands resampled to 10m resolution.

  • mask: Cloud masks at 10m resolution.

Parameters

  • min_tile_cover: Minimum coverage of the region of interest (RoI) required for a set of tiles to be considered sufficient.

  • max_tiles_per_time: Maximum number of tiles used to cover the RoI on each date.

  • cloud_thr: Confidence threshold above which a pixel is classified as cloud.

  • shadow_thr: Confidence threshold above which a pixel is classified as shadow.

  • pc_key: Optional Planetary Computer API key.
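To make the interaction between min_tile_cover and max_tiles_per_time concrete, here is a minimal sketch of a greedy tile selection. It is not the workflow's actual implementation: the grid-cell representation of the RoI, the function name select_tiles, and the reading of min_tile_cover as a fraction between 0 and 1 are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical representation: the RoI and each tile footprint are sets of grid cells.
Cell = Tuple[int, int]


def select_tiles(
    roi: Set[Cell],
    tiles: Dict[str, Set[Cell]],
    min_tile_cover: float,
    max_tiles_per_time: int,
) -> Tuple[List[str], float]:
    """Greedily pick tiles until the covered RoI fraction reaches min_tile_cover
    (assumed to be a fraction in [0, 1]) or max_tiles_per_time tiles are chosen."""
    if not roi:
        raise ValueError("roi must not be empty")
    chosen: List[str] = []
    covered: Set[Cell] = set()
    while len(chosen) < max_tiles_per_time and len(covered) / len(roi) < min_tile_cover:
        # Pick the unused tile that adds the most still-uncovered RoI cells.
        best, best_gain = None, 0
        for tile_id in sorted(tiles):
            if tile_id in chosen:
                continue
            gain = len((tiles[tile_id] & roi) - covered)
            if gain > best_gain:
                best, best_gain = tile_id, gain
        if best is None:  # no remaining tile adds coverage
            break
        chosen.append(best)
        covered |= tiles[best] & roi
    return chosen, len(covered) / len(roi)


# Toy example: an 8-cell RoI and three overlapping tile footprints.
roi = {(x, y) for x in range(4) for y in range(2)}
tiles = {
    "T1": {(0, 0), (1, 0), (0, 1), (1, 1), (2, 0)},
    "T2": {(2, 0), (3, 0), (2, 1), (3, 1)},
    "T3": {(1, 0), (2, 0)},
}
print(select_tiles(roi, tiles, min_tile_cover=1.0, max_tiles_per_time=2))
```

In this toy setup, lowering max_tiles_per_time to 1 stops the selection after the first tile even though the requested coverage has not been reached, which is the trade-off the two parameters control.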

Tasks

  • s2: Downloads and preprocesses Sentinel-2 imagery that covers the input geometry and time range.

  • cloud: Improves cloud masks by merging the product cloud mask with cloud and shadow masks computed by an ensemble of machine learning segmentation models.
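The merge performed by the cloud task can be pictured with a small pure-Python sketch. It assumes that ensemble members output per-pixel probabilities, that the ensemble is combined by averaging, and that the final mask is the union of the product mask with the thresholded cloud and shadow predictions. None of these details are confirmed by this page, and the function names are hypothetical.

```python
from typing import List, Sequence

Grid = List[List[float]]


def mean_probability(predictions: Sequence[Grid]) -> Grid:
    """Average per-pixel probabilities over the ensemble members (assumed combination rule)."""
    rows, cols = len(predictions[0]), len(predictions[0][0])
    return [
        [sum(p[r][c] for p in predictions) / len(predictions) for c in range(cols)]
        for r in range(rows)
    ]


def merge_masks(
    product_mask: List[List[int]],
    cloud_probs: Sequence[Grid],
    shadow_probs: Sequence[Grid],
    cloud_thr: float,
    shadow_thr: float,
) -> List[List[bool]]:
    """Flag a pixel if the product mask flags it, or if the averaged cloud or
    shadow probability exceeds its threshold. True means 'masked' here."""
    cloud = mean_probability(cloud_probs)
    shadow = mean_probability(shadow_probs)
    return [
        [
            bool(product_mask[r][c])
            or cloud[r][c] > cloud_thr
            or shadow[r][c] > shadow_thr
            for c in range(len(product_mask[0]))
        ]
        for r in range(len(product_mask))
    ]


# Toy 2x2 example with two ensemble members per model type.
product = [[1, 0], [0, 0]]
cloud_members = [[[0.1, 0.9], [0.2, 0.4]], [[0.3, 0.7], [0.2, 0.4]]]
shadow_members = [[[0.0, 0.0], [0.6, 0.1]], [[0.0, 0.0], [0.8, 0.1]]]
print(merge_masks(product, cloud_members, shadow_members, cloud_thr=0.5, shadow_thr=0.5))
```

Raising cloud_thr or shadow_thr in this sketch makes the model-derived part of the mask more conservative, while pixels flagged by the product mask stay masked regardless.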

Workflow YAML

name: preprocess_s2_ensemble_masks
sources:
  user_input:
  - s2.user_input
sinks:
  raster: s2.raster
  mask: cloud.mask
parameters:
  min_tile_cover: null
  max_tiles_per_time: null
  cloud_thr: null
  shadow_thr: null
  pc_key: null
tasks:
  s2:
    workflow: data_ingestion/sentinel2/preprocess_s2
    parameters:
      min_tile_cover: '@from(min_tile_cover)'
      max_tiles_per_time: '@from(max_tiles_per_time)'
      pc_key: '@from(pc_key)'
  cloud:
    workflow: data_ingestion/sentinel2/improve_cloud_mask_ensemble
    parameters:
      cloud_thr: '@from(cloud_thr)'
      shadow_thr: '@from(shadow_thr)'
edges:
- origin: s2.raster
  destination:
  - cloud.s2_raster
- origin: s2.mask
  destination:
  - cloud.product_mask
description:
  short_description: Downloads and preprocesses Sentinel-2 imagery that covers the
    input geometry and time range, and computes improved cloud masks using an ensemble
    of cloud and shadow segmentation models.
  long_description: This workflow selects a minimum set of tiles that covers the input
    geometry, downloads Sentinel-2 imagery for the selected time range, and preprocesses
    it by generating a single multi-band raster at 10m resolution. It then improves
    cloud masks by merging the product mask with cloud and shadow masks computed using
    an ensemble of cloud and shadow segmentation models.
  sources:
    user_input: Time range and geometry of interest.
  sinks:
    raster: Sentinel-2 L2A rasters with all bands resampled to 10m resolution.
    mask: Cloud masks at 10m resolution.
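
The '@from(...)' entries in the YAML above forward top-level workflow parameters to the inner tasks. The sketch below illustrates that mapping with a hand-written dictionary mirroring the relevant parts of the definition. The resolver function is hypothetical and is not the engine's own code; a value of None simply means that no override was supplied.

```python
import re
from typing import Any, Dict

# Matches strings of the form '@from(parameter_name)'.
FROM_PATTERN = re.compile(r"^@from\((\w+)\)$")

# Hand-written mirror of the 'parameters' and 'tasks' sections of the YAML.
WORKFLOW: Dict[str, Any] = {
    "parameters": {
        "min_tile_cover": None,
        "max_tiles_per_time": None,
        "cloud_thr": None,
        "shadow_thr": None,
        "pc_key": None,
    },
    "tasks": {
        "s2": {
            "parameters": {
                "min_tile_cover": "@from(min_tile_cover)",
                "max_tiles_per_time": "@from(max_tiles_per_time)",
                "pc_key": "@from(pc_key)",
            }
        },
        "cloud": {
            "parameters": {
                "cloud_thr": "@from(cloud_thr)",
                "shadow_thr": "@from(shadow_thr)",
            }
        },
    },
}


def resolve_task_parameters(
    workflow: Dict[str, Any], overrides: Dict[str, Any]
) -> Dict[str, Dict[str, Any]]:
    """Replace each '@from(name)' reference with the top-level value of 'name',
    after applying user overrides. Illustrative only."""
    unknown = set(overrides) - set(workflow["parameters"])
    if unknown:
        raise KeyError(f"unknown workflow parameters: {sorted(unknown)}")
    top_level = {**workflow["parameters"], **overrides}
    resolved: Dict[str, Dict[str, Any]] = {}
    for task_name, task in workflow["tasks"].items():
        resolved[task_name] = {}
        for key, value in task.get("parameters", {}).items():
            match = FROM_PATTERN.match(value) if isinstance(value, str) else None
            resolved[task_name][key] = top_level[match.group(1)] if match else value
    return resolved


print(resolve_task_parameters(WORKFLOW, {"cloud_thr": 0.7}))
```

Overriding cloud_thr at the top level reaches only the cloud task, while the s2 task keeps its parameters unset, which matches how the two tasks list disjoint parameter sets in the definition.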