data_ingestion/weather/download_herbie
Downloads forecast data for a given location and time range using the Herbie Python package. Herbie downloads recent and archived numerical weather prediction (NWP) model output from several cloud archive sources. Its most popular capability is downloading HRRR model data. NWP data in GRIB2 format can be read with xarray and cfgrib. Models that Herbie can retrieve include the High Resolution Rapid Refresh (HRRR), Rapid Refresh (RAP), Global Forecast System (GFS), National Blend of Models (NBM), Rapid Refresh Forecast System prototype (RRFS), and ECMWF open data forecast products (ECMWF).
Sources
user_input: Time range and geometry of interest.
Sinks
forecast: GRIB file with the requested forecast.
Parameters
model: Model name as defined in the models template folder (case insensitive). Examples of model types:
    'hrrr': HRRR contiguous United States model
    'hrrrak': HRRR Alaska model (alias 'alaska')
    'rap': RAP model
    'gfs': Global Forecast System (atmosphere)
    'gfs_wave': Global Forecast System (wave)
    'rrfs': Rapid Refresh Forecast System prototype
  For more information, see https://herbie.readthedocs.io/en/latest/user_guide/model_info.html
product: Product file type: sfc (surface fields), prs (pressure fields), nat (native fields), or subh (subhourly fields). If not specified, the first product in the model template file is used.
frequency: Frequency of the forecast, in hours.
forecast_lead_times: Forecast lead times in the format [start_time, end_time, increment] (in hours). This parameter can be None; in that case, see 'forecast_start_date' for the resulting behavior. 'forecast_lead_times' and 'forecast_start_date' cannot be specified at the same time.
forecast_start_date: Latest datetime (in the format "%Y-%m-%d %H:%M") for which analyses (zero lead time) are retrieved. After this datetime, forecasts with progressively increasing lead times are retrieved. If both this parameter and 'forecast_lead_times' are None, the workflow returns analyses (zero lead time) up to the latest analysis available, and from that point on it returns forecasts with progressively increasing lead times.
search_text: Regular expression used to search the GRIB2 index files, so that only the required layers of a file are downloaded instead of the complete file. For more information on search_text, see https://blaylockbk.github.io/Herbie/_build/html/user_guide/searchString.html
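The rules for 'forecast_lead_times' and 'forecast_start_date' can be illustrated with a small validation sketch. The helper name `resolve_forecast_params` is hypothetical and not part of the workflow. The sketch also assumes that the lead-time triple behaves like Python's `range` (end exclusive), which the parameter description does not confirm.

```python
from datetime import datetime
from typing import List, Optional, Tuple

DATE_FORMAT = "%Y-%m-%d %H:%M"  # format stated for forecast_start_date


def resolve_forecast_params(
    forecast_lead_times: Optional[List[int]] = None,
    forecast_start_date: Optional[str] = None,
) -> Tuple[Optional[List[int]], Optional[datetime]]:
    """Hypothetical helper: validate both parameters and normalize them."""
    # The two parameters are mutually exclusive.
    if forecast_lead_times is not None and forecast_start_date is not None:
        raise ValueError(
            "'forecast_lead_times' and 'forecast_start_date' cannot both be set"
        )

    lead_hours = None
    if forecast_lead_times is not None:
        if len(forecast_lead_times) != 3:
            raise ValueError("expected [start_time, end_time, increment]")
        start, end, increment = forecast_lead_times
        # Assumption: end_time is exclusive, as with Python's range().
        lead_hours = list(range(start, end, increment))

    start_date = None
    if forecast_start_date is not None:
        # Raises ValueError if the string does not match DATE_FORMAT.
        start_date = datetime.strptime(forecast_start_date, DATE_FORMAT)

    return lead_hours, start_date
```

Under the exclusive-end assumption, `resolve_forecast_params(forecast_lead_times=[0, 6, 2])` yields the lead hours 0, 2, and 4, while passing both parameters raises a `ValueError`.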
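To show how a search_text value selects layers, the sketch below applies a regular expression to a few index lines using Python's `re` module. The sample lines are made up for illustration and only imitate the colon-separated layout of a GRIB2 index file. `select_layers` is a hypothetical helper, not Herbie's implementation.

```python
import re
from typing import List

# Illustrative index lines (not real data): message number, byte offset,
# reference date, variable, level, forecast step.
SAMPLE_INDEX = [
    "1:0:d=2023011512:REFC:entire atmosphere:anl:",
    "2:350000:d=2023011512:TMP:2 m above ground:anl:",
    "3:700000:d=2023011512:TMP:500 mb:anl:",
    "4:990000:d=2023011512:UGRD:10 m above ground:anl:",
]


def select_layers(index_lines: List[str], search_text: str) -> List[str]:
    """Return the index lines matched anywhere by the regular expression."""
    pattern = re.compile(search_text)
    return [line for line in index_lines if pattern.search(line)]


# The workflow default selects only the 2 m temperature layer.
two_metre_temperature = select_layers(SAMPLE_INDEX, ":TMP:2 m")

# Alternation selects every layer of either variable.
temperature_or_wind = select_layers(SAMPLE_INDEX, ":(TMP|UGRD):")
```

Because the pattern is a regular expression, including the surrounding colons (as in `:TMP:2 m`) keeps it from matching variable names that merely contain the same letters.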
Tasks
list_herbie: Lists herbie products.
download_herbie: Downloads herbie grib files.
Workflow YAML
name: download_herbie
sources:
  user_input:
  - list_herbie.input_item
sinks:
  forecast: download_herbie.forecast
parameters:
  model: hrrr
  product: null
  frequency: 1
  forecast_lead_times: null
  forecast_start_date: null
  search_text: :TMP:2 m
tasks:
  list_herbie:
    op: list_herbie
    parameters:
      model: '@from(model)'
      product: '@from(product)'
      frequency: '@from(frequency)'
      forecast_lead_times: '@from(forecast_lead_times)'
      forecast_start_date: '@from(forecast_start_date)'
      search_text: '@from(search_text)'
  download_herbie:
    op: download_herbie
edges:
- origin: list_herbie.product
  destination:
  - download_herbie.herbie_product
description:
  short_description: Downloads forecast data for provided location & time range using
    herbie python package.
  long_description: Herbie is a python package that downloads recent and archived
    numerical weather prediction (NWP) model outputs from different cloud archive
    sources. Its most popular capability is to download HRRR model data. NWP data
    in GRIB2 format can be read with xarray+cfgrib. Model data Herbie can retrieve
    includes the High Resolution Rapid Refresh (HRRR), Rapid Refresh (RAP), Global
    Forecast System (GFS), National Blend of Models (NBM), Rapid Refresh Forecast
    System - Prototype (RRFS), and ECMWF open data forecast products (ECMWF).
  sources:
    user_input: Time range and geometry of interest.
  sinks:
    forecast: Grib file with the requested forecast.
  parameters:
    model: Model name as defined in the models template folder. CASE INSENSITIVE Below
      are examples of model types 'hrrr' HRRR contiguous United States model 'hrrrak'
      HRRR Alaska model (alias 'alaska') 'rap' RAP model 'gfs' Global Forecast System
      (atmosphere) 'gfs_wave' Global Forecast System (wave) 'rrfs' Rapid Refresh Forecast
      System prototype for more information see https://herbie.readthedocs.io/en/latest/user_guide/model_info.html
    product: Output variable product file type (sfc (surface fields), prs (pressure
      fields), nat (native fields), subh (subhourly fields)). Not specifying this
      will use the first product in model template file.
    frequency: frequency in hours of the forecast
    forecast_lead_times: Forecast lead time in the format [start_time, end_time, increment]
      (in hours). This parameter can be None, and in this case see parameter 'forecast_start_date'
      for more details. You cannot specify 'forecast_lead_times' and 'forecast_start_date'
      at the same time.
    forecast_start_date: latest datetime (in the format "%Y-%m-%d %H:%M") for which
      analysis (zero lead time) are retrieved. After this datetime, forecasts with
      progressively increasing lead times are retrieved. If this parameter is set
      to None and 'forecast_lead_times' is also set to None, then the workflow returns
      analysis (zero lead time) up to the latest analysis available, and from that
      point it returns forecasts with progressively increasing lead times.
    search_text: It's a regular expression used to search on GRIB2 Index files and
      allow you to download just the layer of the file required instead of complete
      file. For more information on search_text refer to below url. https://blaylockbk.github.io/Herbie/_build/html/user_guide/searchString.html