Usage¤
Pretraining¤
Data Preparation¤
The code for downloading and preprocessing CMIP6 data is coming soon
Training¤
For example, to pretrain ClimaX on MPI-ESM dataset on 8 GPUs usepython src/climax/pretrain/train.py --config configs/pretrain_climax.yaml \
--trainer.strategy=ddp --trainer.devices=8 \
--trainer.max_epochs=100 \
--data.batch_size=16 \
--model.lr=5e-4 --model.beta_1="0.9" --model.beta_2="0.95" \
--model.weight_decay=1e-5
Tip
Make sure to update the paths of the data directories in the config files (or override them via the CLI).
Pretrained checkpoints¤
We provide two pretrained checkpoints, one was pretrained on 5.625deg data, and the other was pretrained on 1.40625deg data. Both checkpoints were pretrained using all 5 CMIP6 datasets.
Usage: We can load the checkpoint by passing the checkpoint url to the training script. See below for examples.
Global Forecasting¤
Data Preparation¤
First, download ERA5 data from WeatherBench. The data directory should look like the following
5.625deg
|-- 10m_u_component_of_wind
|-- 10m_v_component_of_wind
|-- 2m_temperature
|-- constants.nc
|-- geopotential
|-- relative_humidity
|-- specific_humidity
|-- temperature
|-- toa_incident_solar_radiation
|-- total_precipitation
|-- u_component_of_wind
|-- v_component_of_wind
Then, preprocess the netcdf data into small numpy files and compute important statistics
python src/data_preprocessing/nc2np_equally_era5.py \
--root_dir /mnt/data/5.625deg \
--save_dir /mnt/data/5.625deg_npz \
--start_train_year 1979 --start_val_year 2016 \
--start_test_year 2017 --end_year 2019 --num_shards 8
The preprocessed data directory will look like the following
5.625deg_npz
|-- train
|-- val
|-- test
|-- normalize_mean.npz
|-- normalize_std.npz
|-- lat.npy
|-- lon.npy
Training¤
To finetune ClimaX for global forecasting, use
For example, to finetune ClimaX on 8 GPUs usepython src/climax/global_forecast/train.py --config configs/global_forecast_climax.yaml \
--trainer.strategy=ddp --trainer.devices=8 \
--trainer.max_epochs=50 \
--data.root_dir=/mnt/data/5.625deg_npz \
--data.predict_range=72 --data.out_variables=['z_500','t_850','t2m'] \
--data.batch_size=16 \
--model.pretrained_path='https://huggingface.co/tungnd/climax/resolve/main/5.625deg.ckpt' \
--model.lr=5e-7 --model.beta_1="0.9" --model.beta_2="0.99" \
--model.weight_decay=1e-5
--model.pretrained_path=""
.
Regional Forecasting¤
Data Preparation¤
We use the same ERA5 data as in global forecasting and extract the regional data on the fly during training. If you have already downloaded and preprocessed the data, you do not have to do it again.
Training¤
To finetune ClimaX for regional forecasting, use
For example, to finetune ClimaX on North America using 8 GPUs, usepython src/climax/regional_forecast/train.py --config configs/regional_forecast_climax.yaml \
--trainer.strategy=ddp --trainer.devices=8 \
--trainer.max_epochs=50 \
--data.root_dir=/mnt/data/5.625deg_npz \
--data.region="NorthAmerica"
--data.predict_range=72 --data.out_variables=['z_500','t_850','t2m'] \
--data.batch_size=16 \
--model.pretrained_path='https://huggingface.co/tungnd/climax/resolve/main/1.40625deg.ckpt' \
--model.lr=5e-7 --model.beta_1="0.9" --model.beta_2="0.99" \
--model.weight_decay=1e-5
--model.pretrained_path=""
.
Climate Projection¤
Data Preparation¤
First, download ClimateBench data. ClimaX can work with either the original ClimateBench data or the regridded version. In the experiment in the paper, we regridded to ClimateBench data to 5.625 degree. To do that, run
python src/data_preprocessing/regrid_climatebench.py /mnt/data/climatebench/train_val \
--save_path /mnt/data/climatebench/5.625deg/train_val --ddeg_out 5.625
python src/data_preprocessing/regrid_climatebench.py /mnt/data/climatebench/test \
--save_path /mnt/data/climatebench/5.625deg/test --ddeg_out 5.625
Training¤
To finetune ClimaX for climate projection, use
For example, to finetune ClimaX on 8 GPUs usepython python src/climax/climate_projection/train.py --config configs/climate_projection.yaml \
--trainer.strategy=ddp --trainer.devices=8 \
--trainer.max_epochs=50 \
--data.root_dir=/mnt/data/climatebench/5.625deg \
--data.out_variables="tas" \
--data.batch_size=16 \
--model.pretrained_path='https://huggingface.co/tungnd/climax/resolve/main/5.625deg.ckpt' \
--model.out_vars="tas" \
--model.lr=5e-4 --model.beta_1="0.9" --model.beta_2="0.99" \
--model.weight_decay=1e-5
--model.pretrained_path=""
.
Visualization¤
Coming soon