data_ingestion/weather/download_herbie

Downloads forecast data for provided location & time range using herbie python package. Herbie is a python package that downloads recent and archived numerical weather prediction (NWP) model outputs from different cloud archive sources. Its most popular capability is to download HRRR model data. NWP data in GRIB2 format can be read with xarray+cfgrib. Model data Herbie can retrieve includes the High Resolution Rapid Refresh (HRRR), Rapid Refresh (RAP), Global Forecast System (GFS), National Blend of Models (NBM), Rapid Refresh Forecast System - Prototype (RRFS), and ECMWF open data forecast products (ECMWF).

graph TD inp1>user_input] out1>forecast] tsk1{{list_herbie}} tsk2{{download_herbie}} tsk1{{list_herbie}} -- product/herbie_product --> tsk2{{download_herbie}} inp1>user_input] -- input_item --> tsk1{{list_herbie}} tsk2{{download_herbie}} -- forecast --> out1>forecast]

Sources

  • user_input: Time range and geometry of interest.

Sinks

  • forecast: Grib file with the requested forecast.

Parameters

  • model: Model name as defined in the models template folder. CASE INSENSITIVE Below are examples of model types ‘hrrr’ HRRR contiguous United States model ‘hrrrak’ HRRR Alaska model (alias ‘alaska’) ‘rap’ RAP model ‘gfs’ Global Forecast System (atmosphere) ‘gfs_wave’ Global Forecast System (wave) ‘rrfs’ Rapid Refresh Forecast System prototype for more information see https://herbie.readthedocs.io/en/latest/user_guide/model_info.html

  • product: Output variable product file type (sfc (surface fields), prs (pressure fields), nat (native fields), subh (subhourly fields)). Not specifying this will use the first product in model template file.

  • frequency: frequency in hours of the forecast

  • forecast_lead_times: Forecast lead time in the format [start_time, end_time, increment] (in hours). This parameter can be None, and in this case see parameter ‘forecast_start_date’ for more details. You cannot specify ‘forecast_lead_times’ and ‘forecast_start_date’ at the same time.

  • forecast_start_date: latest datetime (in the format “%Y-%m-%d %H:%M”) for which analysis (zero lead time) are retrieved. After this datetime, forecasts with progressively increasing lead times are retrieved. If this parameter is set to None and ‘forecast_lead_times’ is also set to None, then the workflow returns analysis (zero lead time) up to the latest analysis available, and from that point it returns forecasts with progressively increasing lead times.

  • search_text: It’s a regular expression used to search on GRIB2 Index files and allow you to download just the layer of the file required instead of complete file. For more information on search_text refer to below url. https://blaylockbk.github.io/Herbie/_build/html/user_guide/searchString.html

Tasks

  • list_herbie: Lists herbie products.

  • download_herbie: Download herbie grib files.

Workflow Yaml

name: download_herbie
sources:
  user_input:
  - list_herbie.input_item
sinks:
  forecast: download_herbie.forecast
parameters:
  model: hrrr
  product: null
  frequency: 1
  forecast_lead_times: null
  forecast_start_date: null
  search_text: :TMP:2 m
tasks:
  list_herbie:
    op: list_herbie
    parameters:
      model: '@from(model)'
      product: '@from(product)'
      frequency: '@from(frequency)'
      forecast_lead_times: '@from(forecast_lead_times)'
      forecast_start_date: '@from(forecast_start_date)'
      search_text: '@from(search_text)'
  download_herbie:
    op: download_herbie
edges:
- origin: list_herbie.product
  destination:
  - download_herbie.herbie_product
description:
  short_description: Downloads forecast data for provided location & time range using
    herbie python package.
  long_description: Herbie is a python package that downloads recent and archived
    numerical weather prediction (NWP) model outputs from different cloud archive
    sources. Its most popular capability is to download HRRR model data. NWP data
    in GRIB2 format can be read with xarray+cfgrib. Model data Herbie can retrieve
    includes the High Resolution Rapid Refresh (HRRR), Rapid Refresh (RAP), Global
    Forecast System (GFS), National Blend of Models (NBM), Rapid Refresh Forecast
    System - Prototype (RRFS), and ECMWF open data forecast products (ECMWF).
  sources:
    user_input: Time range and geometry of interest.
  sinks:
    forecast: Grib file with the requested forecast.
  parameters:
    model: Model name as defined in the models template folder. CASE INSENSITIVE Below
      are examples of model types 'hrrr' HRRR contiguous United States model 'hrrrak'
      HRRR Alaska model (alias 'alaska') 'rap' RAP model 'gfs' Global Forecast System
      (atmosphere) 'gfs_wave' Global Forecast System (wave) 'rrfs' Rapid Refresh Forecast
      System prototype for more information see https://herbie.readthedocs.io/en/latest/user_guide/model_info.html
    product: Output variable product file type (sfc (surface fields), prs (pressure
      fields), nat (native fields), subh (subhourly fields)). Not specifying this
      will use the first product in model template file.
    frequency: frequency in hours of the forecast
    forecast_lead_times: Forecast lead time in the format [start_time, end_time, increment]
      (in hours). This parameter can be None, and in this case see parameter 'forecast_start_date'
      for more details. You cannot specify 'forecast_lead_times' and 'forecast_start_date'
      at the same time.
    forecast_start_date: latest datetime (in the format "%Y-%m-%d %H:%M") for which
      analysis (zero lead time) are retrieved. After this datetime, forecasts with
      progressively increasing lead times are retrieved. If this parameter is set
      to None and 'forecast_lead_times' is also set to None, then the workflow returns
      analysis (zero lead time) up to the latest analysis available, and from that
      point it returns forecasts with progressively increasing lead times.
    search_text: It's a regular expression used to search on GRIB2 Index files and
      allow you to download just the layer of the file required instead of complete
      file. For more information on search_text refer to below url. https://blaylockbk.github.io/Herbie/_build/html/user_guide/searchString.html