1. Quickstart Guide

This section provides a recipe for an end-to-end run of nested-EAGLE on Ursa. Ursa is currently the only supported platform; support for additional platforms is planned.

Note

GNU make version 3.82 or higher is required.
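
One way to confirm this requirement is to compare the installed version against 3.82. The following is a minimal sketch and is not part of EAGLE: version_ge is a hypothetical helper, and the sketch assumes GNU sort (for its -V option) and a make whose --version output begins with a line of the form "GNU Make X.Y".

```shell
# version_ge A B: succeed if version A is greater than or equal to version B.
# Assumes GNU sort, whose -V option orders version strings.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

if command -v make >/dev/null 2>&1; then
  # GNU make is assumed to print e.g. "GNU Make 4.3" as its first line.
  make_version=$(make --version | head -n 1 | awk '{print $3}')
  if version_ge "$make_version" 3.82; then
    echo "make $make_version satisfies the 3.82 requirement"
  else
    echo "make $make_version is older than 3.82" >&2
  fi
else
  echo "make was not found on PATH" >&2
fi
```

The comparison is version-aware rather than numeric, so 4.3 is treated as newer than 3.82.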

Complete the following steps from the src/ directory.

Note

The EAGLE runtime software environment currently requires over 50 GB of disk space. Consider available space, quota, etc. when choosing where to clone the EAGLE repository and run the following steps.
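
A rough check of available space before cloning might look like the sketch below. It is illustrative only: free_gb is a hypothetical helper, target_dir is a placeholder for wherever the repository would be cloned, and df reports filesystem free space, which does not reflect any per-user or per-project quota.

```shell
# free_gb DIR: print the space available on DIR's filesystem, in whole GiB.
free_gb() {
  # -P selects the POSIX one-line-per-filesystem format; -k reports KiB.
  df -Pk "$1" | awk 'NR == 2 { printf "%d\n", $4 / 1024 / 1024 }'
}

required_gb=50
target_dir=.   # placeholder: the directory where EAGLE would be cloned
available_gb=$(free_gb "$target_dir")

# The guide says "over 50 GB", so exactly 50 is treated as insufficient.
if [ "$available_gb" -le "$required_gb" ]; then
  echo "only ${available_gb} GB free in $target_dir; more than ${required_gb} GB is needed" >&2
else
  echo "${available_gb} GB free in $target_dir"
fi
```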

1.1. Building and Running EAGLE

  1. Create all environments

    make env cudascript=ursa
    

    This step creates the runtime software environment, comprising conda virtual environments to support data preparation, training, inference, and verification. The conda/ subdirectory it creates is self-contained and can be removed and recreated by running the make env command again, as long as pipeline steps are not currently running.

    Developers who will be modifying Python driver code should replace make env with make devenv, which will create the same environments but also install additional code-quality tools for formatting, linting, shellchecking, typechecking, unit testing, and YAML linting.

  2. Create the EAGLE YAML config

    make config compose=base:ursa > eagle.yaml
    

    The config target operates on .yaml files in the config/ directory, so this command composes config/base.yaml and config/ursa.yaml and redirects the composed config into eagle.yaml.

  3. Set the app.base value in eagle.yaml to the absolute path to the current src/ directory.

    The run directories from subsequent steps, along with the output of those steps, will be created in the run/<expname> subdirectory of app.base, where <expname> is the value of app.experiment_name.

    Verify the app.account value. The default configuration sets app.account to epic. If you do not have access to the epic account on Ursa, update this value to an account you are authorized to use.
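
The value for app.base can be obtained from the shell rather than typed by hand. This is a small sketch, assuming it is run from the src/ directory; the warning about the directory name is an added convenience, not an EAGLE requirement.

```shell
# Run from the src/ directory: pwd prints its absolute path.
base=$(pwd)
echo "app.base: $base"

# Sanity check (added for this sketch): warn if this is not a src directory.
case "$base" in
  */src) ;;
  *) echo "warning: the current directory is not named src" >&2 ;;
esac
```

The printed path can then be pasted into eagle.yaml as the app.base value.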

  4. Create training data

    make data config=eagle.yaml
    

    This step provisions data required for training and inference. The data target delegates to the targets grids-and-meshes, zarr-gfs, and zarr-hrrr, which can also be run individually (e.g. make grids-and-meshes config=eagle.yaml). Note that grids-and-meshes, which runs locally, must be run first. The zarr-gfs and zarr-hrrr targets submit batch jobs, so they can be run in quick succession. Do not proceed until their batch jobs complete successfully (see the files run/<expname>/data/*.out).
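
To inspect the batch job output files without opening each one, a helper along these lines can print the last few lines of every .out file in a run directory. It is a sketch only: tail_outs is a hypothetical helper, the EXPNAME variable and its fallback value default are placeholders that should match app.experiment_name, and the sketch does not itself decide whether a job succeeded.

```shell
# tail_outs DIR [N]: print the last N lines (default 5) of each *.out file
# in DIR. Returns nonzero if DIR contains no .out files.
tail_outs() {
  dir=$1
  n=${2:-5}
  found=0
  for f in "$dir"/*.out; do
    [ -e "$f" ] || continue      # unmatched glob: nothing to show
    found=1
    printf '== %s ==\n' "$f"
    tail -n "$n" "$f"
  done
  if [ "$found" -eq 0 ]; then
    echo "no .out files in $dir yet" >&2
    return 1
  fi
}

# Placeholder experiment name; it should match app.experiment_name.
expname=${EXPNAME:-default}
tail_outs "run/$expname/data" 3 || true
```

The same helper can be pointed at run/<expname>/training or run/<expname>/inference in the later steps.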

  5. Train the ML model

    make training config=eagle.yaml
    

    This step trains a model using data provisioned by the previous step. It submits a batch job; do not proceed until the batch job completes successfully (see the file run/<expname>/training/runscript.training.out).

  6. Run inference

    make inference config=eagle.yaml
    

    This step performs inference, producing a forecast. It submits a batch job; do not proceed until the batch job completes successfully (see the file run/<expname>/inference/runscript.inference.out).

  7. Verify the model

    make vx-grid-global config=eagle.yaml
    make vx-grid-lam config=eagle.yaml
    make vx-obs-global config=eagle.yaml
    make vx-obs-lam config=eagle.yaml
    

    Before running verification, the WXVX driver will run prewxvx to prepare forecast output from the previous step. See the files run/<expname>/vx/prewxvx/{global,lam}/runscript.prewxvx-*.out for details.

    These steps perform verification of the global or LAM forecasts against gridded analyses (*-grid-*) or PrepBUFR observations (*-obs-*) as truth. Each submits a batch job, so the four make commands can be run in quick succession to get all the batch jobs running in parallel. When each batch job completes, MET .stat files and .png plot files can be found under the stats/ and plots/ subdirectories of run/<expname>/vx/grid2{grid,obs}/{global,lam}/run/. The files run/<expname>/vx/*.log contain the logs from each verification run.
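
The brace notation in the path above stands for four run directories. The sketch below spells them out and counts the .stat and .png files found under each; vx_run_dirs and count_files are hypothetical helpers, and the EXPNAME variable with its fallback default is a placeholder for the value of app.experiment_name.

```shell
# vx_run_dirs EXPNAME: print the four directories that
# run/<expname>/vx/grid2{grid,obs}/{global,lam}/run/ expands to.
vx_run_dirs() {
  for truth in grid obs; do
    for domain in global lam; do
      printf 'run/%s/vx/grid2%s/%s/run\n' "$1" "$truth" "$domain"
    done
  done
}

# count_files DIR PATTERN: count files under DIR matching PATTERN,
# printing 0 if DIR does not exist yet.
count_files() {
  if [ -d "$1" ]; then
    find "$1" -type f -name "$2" | wc -l | tr -d ' '
  else
    echo 0
  fi
}

expname=${EXPNAME:-default}   # placeholder experiment name
vx_run_dirs "$expname" | while read -r dir; do
  stats=$(count_files "$dir/stats" '*.stat')
  plots=$(count_files "$dir/plots" '*.png')
  echo "$dir: $stats .stat files, $plots .png files"
done
```

Run from the app.base directory, this gives a quick view of which of the four verification jobs have produced output so far.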

  8. Make additional visualization outputs

    make vis-grid-global config=eagle.yaml
    make vis-grid-lam config=eagle.yaml
    make vis-obs-global config=eagle.yaml
    make vis-obs-lam config=eagle.yaml
    

    These steps first call the postwxvx tool from eagle-tools to create and save a series of netCDF files, containing all relevant statistics for each variable, in the corresponding wxvx directory. They then create a series of basic plots (provided by DataArray.plot() from the xarray library) in the run/<expname>/visualization/grid2{grid,obs}/{global,lam}/plots-basic directory.

    For the grid-based vis-grid-global and vis-grid-lam targets, additional error plots (forecast vs truth differences) will be created under run/<expname>/visualization/grid2grid/{global,lam}/plots-spatial-stats/. These plots depend on two config settings:

      1. The config value at key-path vx.grid2grid.{global,lam}.wxvx.wxvx.ncdiffs must be set to true, which instructs MET to produce netCDF difference files during verification.

      2. The config block at key-path visualization.grid2grid.{global,lam}.visualization.spatial_stat_plots, which enables and configures plot generation, must be present.
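
A key-path names a nested location in the YAML config, with each dot-separated component one level deeper. The sketch below prints the nesting implied by a key-path so it is easier to locate in eagle.yaml; keypath_to_yaml is a hypothetical helper written for this illustration, and its output shows structure only, not the full contents of the real config.

```shell
# keypath_to_yaml KEYPATH VALUE: print the nested YAML mapping implied by a
# dot-separated key-path, using two spaces of indentation per level.
keypath_to_yaml() {
  rest=$1
  value=$2
  indent=""
  # While a dot remains, emit the leading component and descend one level.
  while [ "${rest#*.}" != "$rest" ]; do
    printf '%s%s:\n' "$indent" "${rest%%.*}"
    rest=${rest#*.}
    indent="$indent  "
  done
  printf '%s%s: %s\n' "$indent" "$rest" "$value"
}

keypath_to_yaml vx.grid2grid.global.wxvx.wxvx.ncdiffs true
```

For the global domain, this prints vx at the top level and ncdiffs: true five levels below it; the lam key-path differs only in its third component.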

  9. Run inference in near-real-time (NRT)

    1. Create the EAGLE NRT config

    make config compose=base:ursa:nrt > nrt-composed.yaml
    
    2. Set the app.base value in nrt-composed.yaml to the absolute path to the current src/ directory.

    This should match the path used when generating the main EAGLE config above.

    Two additional paths may require attention:

    • inference.anemoi.checkpoint_dir
    • grids_and_meshes.rundir

    If you are following only the quickstart workflow, you do not need to modify these values. The config automatically pulls both paths from the quickstart run. However, if you ran multiple experiments or stored outputs in a different location, update these paths so they point to the correct directories.

    3. Realize the EAGLE NRT config

    make realize config=nrt-composed.yaml > nrt.yaml
    

    This creates the final config used to begin an NRT run. The realize step is required because it freezes the NOW environment variable across the entire configuration: jobs may be submitted at different times, and freezing ensures that a consistent timestamp is used throughout the run.
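
The general idea of freezing a timestamp can be illustrated in plain shell. This sketch does not show how make realize is implemented; the names now_unfrozen and NOW_FROZEN and the timestamp format are hypothetical, chosen only to contrast re-evaluating the clock with capturing it once.

```shell
# Unfrozen: every call re-reads the clock, so steps that run at different
# times see different values.
now_unfrozen() { date -u +%Y%m%d%H%M%S; }

a=$(now_unfrozen)
sleep 1
b=$(now_unfrozen)
echo "unfrozen: $a then $b"

# Frozen: read the clock once and reuse the stored value everywhere.
NOW_FROZEN=$(date -u +%Y%m%d%H%M%S)
first=$NOW_FROZEN
sleep 1
second=$NOW_FROZEN
echo "frozen:   $first then $second"
```

The two unfrozen values differ by at least a second, while the frozen value is identical on both reads, which is the property the realized config provides to jobs submitted at different times.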

    4. Load current initial conditions

    make data config=nrt.yaml
    
    5. Run inference

    make inference config=nrt.yaml
    

    Your forecast will be saved to path/to/eagle/src/run/default/nrt_inference/YYYY/MM/DD/HH/inference.
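
To make the YYYY/MM/DD/HH layout concrete, the sketch below builds the output path for a given cycle. nrt_inference_dir is a hypothetical helper, the YYYYMMDDHH input format is an assumption made for this example, and /path/to/eagle/src stands in for the real app.base value.

```shell
# nrt_inference_dir BASE CYCLE: print the NRT inference output directory for
# CYCLE, given as a YYYYMMDDHH string (input format assumed for this sketch).
nrt_inference_dir() {
  base=$1
  cycle=$2
  yyyy=$(printf '%s' "$cycle" | cut -c1-4)
  mm=$(printf '%s' "$cycle" | cut -c5-6)
  dd=$(printf '%s' "$cycle" | cut -c7-8)
  hh=$(printf '%s' "$cycle" | cut -c9-10)
  printf '%s/run/default/nrt_inference/%s/%s/%s/%s/inference\n' \
    "$base" "$yyyy" "$mm" "$dd" "$hh"
}

# Placeholder base path and an arbitrary example cycle.
nrt_inference_dir /path/to/eagle/src 2025010206
```

For the example cycle, the printed path ends in nrt_inference/2025/01/02/06/inference.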