
Introduction to xwrf


Overview

This notebook will introduce the basics of gridded, labeled data with Xarray. Since Xarray introduces additional abstractions on top of plain arrays of data, our goal is to show why these abstractions are useful and how they frequently lead to simpler, more robust code.

We’ll cover these topics:

  1. Create a DataArray, one of the core object types in Xarray
  2. Understand how to use named coordinates and metadata in a DataArray
  3. Combine individual DataArrays into a Dataset, the other core object type in Xarray
  4. Subset, slice, and interpolate the data using named coordinates
  5. Open netCDF data using Xarray
  6. Basic subsetting and aggregation of a Dataset
  7. Brief introduction to plotting with Xarray

Prerequisites

Concepts                  Importance   Notes
NumPy Basics              Necessary
Introduction to Xarray    Necessary    Understanding of data structures
Intermediate NumPy        Helpful      Familiarity with indexing and slicing arrays
Datetime                  Helpful      Familiarity with time formats and the timedelta object
Understanding of NetCDF   Helpful      Familiarity with metadata structure
  • Time to learn: 30 minutes

Imports

Similar to how numpy is imported as np and pandas as pd, you will often see xarray imported under the shortened namespace xr.

from datetime import timedelta

import cmweather
import xarray as xr
import xwrf
import glob

import matplotlib.pyplot as plt

Introducing xwrf

xWRF is a package designed to make the post-processing of WRF output data more Pythonic. Its aim is to smooth the rough edges around the unique, non-CF-compliant WRF output data format and make the data accessible to utilities like dask and the wider Pangeo universe.

It is built as an Accessor on top of xarray, providing a very simple user interface.
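
To make the accessor idea concrete, here is a minimal sketch of how an xarray accessor is registered. The accessor name `demo` and the class `DemoAccessor` are hypothetical and exist only for illustration; this is not xwrf's own code, just the general mechanism that lets a package expose methods under a namespace such as `ds.xwrf`.

```python
import numpy as np
import xarray as xr


# Hypothetical accessor, for illustration only; this is not xwrf's code.
@xr.register_dataset_accessor("demo")
class DemoAccessor:
    def __init__(self, xarray_obj):
        # xarray passes in the Dataset the accessor was reached from.
        self._ds = xarray_obj

    def variable_count(self):
        """Return the number of data variables in the wrapped Dataset."""
        return len(self._ds.data_vars)


# A tiny synthetic Dataset with a single made-up variable.
ds_demo = xr.Dataset({"T2": (("y", "x"), np.zeros((2, 3)))})
print(ds_demo.demo.variable_count())
```

Once a package that registers an accessor has been imported, every Dataset gains the extra namespace, which is why `import xwrf` alone is enough to make `ds.xwrf` available.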

Finding a Dataset

Our dataset of interest is a WRF dataset from the SAIL domain, near Gothic, Colorado, available as an ARM PI dataset - https://iop.archive.arm.gov/arm-iop-file/2021/guc/sail/xu-wrf/README.html.

More information on the dataset:

These are the Weather Research and Forecasting (WRF) regional climate model simulations for supporting the analysis of temperature, precipitation, and other hydroclimate variables and evaluating SAIL data. The WRF model has three nested domains centered at the SAIL location (East River, Colorado) for the SAIL period from Oct 01, 2021 to Dec 31, 2022. We used the BSU subgrid-scale physics schemes, CFSR meteorological forcing datasets, and the topographic shading radiation schemes in our WRF simulation. Detailed information on the model configuration can be found at https://doi.org/10.5194/egusphere-2022-437

Examining the data

When a raw WRF output file is opened with xarray's plain netCDF backend, the resulting dataset does not provide much useful information.

ds = xr.open_dataset("../data/sail/wrf/wrfout_d03_2023-03-10_00_00_00")
ds
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[2], line 1
----> 1 ds = xr.open_dataset("../data/sail/wrf/wrfout_d03_2023-03-10_00_00_00")
      2 ds

File ~/micromamba/envs/arm-field-site-cookbook-dev/lib/python3.11/site-packages/xarray/backends/api.py:696, in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, create_default_indexes, inline_array, chunked_array_type, from_array_kwargs, backend_kwargs, **kwargs)
    693     kwargs.update(backend_kwargs)
    695 if engine is None:
--> 696     engine = plugins.guess_engine(filename_or_obj)
    698 if from_array_kwargs is None:
    699     from_array_kwargs = {}

File ~/micromamba/envs/arm-field-site-cookbook-dev/lib/python3.11/site-packages/xarray/backends/plugins.py:194, in guess_engine(store_spec)
    186 else:
    187     error_msg = (
    188         "found the following matches with the input file in xarray's IO "
    189         f"backends: {compatible_engines}. But their dependencies may not be installed, see:\n"
    190         "https://docs.xarray.dev/en/stable/user-guide/io.html \n"
    191         "https://docs.xarray.dev/en/stable/getting-started-guide/installing.html"
    192     )
--> 194 raise ValueError(error_msg)

ValueError: did not find a match in any of xarray's currently installed IO backends ['netcdf4', 'h5netcdf', 'scipy', 'cfradial1', 'datamet', 'furuno', 'gamic', 'gini', 'hpl', 'iris', 'metek', 'nexradlevel2', 'odim', 'rainbow', 'uf']. Consider explicitly selecting one of the installed engines via the ``engine`` parameter, or installing additional IO dependencies, see:
https://docs.xarray.dev/en/stable/getting-started-guide/installing.html
https://docs.xarray.dev/en/stable/user-guide/io.html
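
The error message above suggests passing the `engine` parameter explicitly. The sketch below does that and also checks that the file exists first. The helper name `open_wrf_file` is hypothetical, and treating a missing file as a possible cause is an assumption, since the traceback itself does not state why no backend matched.

```python
from pathlib import Path

import xarray as xr


def open_wrf_file(path, engine="netcdf4"):
    """Open one WRF output file, failing early if the path is missing.

    Hypothetical helper for illustration; not part of xarray or xwrf.
    """
    path = Path(path)
    if not path.is_file():
        raise FileNotFoundError(f"WRF output file not found: {path}")
    # WRF output names have no .nc suffix, so xarray cannot infer the
    # engine from the file name alone; name it explicitly instead.
    return xr.open_dataset(path, engine=engine)
```

With the data in place, `open_wrf_file("../data/sail/wrf/wrfout_d03_2023-03-10_00_00_00")` would stand in for the `xr.open_dataset` call above, and a missing file produces a clearer error than the backend-matching message.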

Use .postprocess() to clean the dataset

While all variables are present, some information is not readily usable: for example, the projection information is stored only in the metadata, and some fields have units attributes that are not MetPy-compliant.

So let’s use the standard xwrf .postprocess() method to make this information usable.

ds = xr.open_dataset("../data/sail/wrf/wrfout_d03_2023-03-10_00_00_00").xwrf.postprocess()
ds

As you see, xWRF added some coordinate data, reassigned some dimensions and generally increased the amount of information available in the dataset.

Plot the Data!

Now that we have our data in an easier-to-analyze format, let’s plot one of the fields.

ds.SNOW.plot(x='XLONG',
             y='XLAT',
             cmap='Blues')

Investigate the change in snow water equivalent (SWE) over time

first_ds = xr.open_dataset("../data/sail/wrf/wrfout_d03_2023-03-10_00_00_00").xwrf.postprocess().squeeze()
last_ds = xr.open_dataset("../data/sail/wrf/wrfout_d03_2023-03-11_00_00_00").xwrf.postprocess().squeeze()

Calculate the Change in SWE

Both datasets share the same grid, so subtracting the two SNOW fields with Xarray gives the 24-hour change in SWE at each grid cell.

difference = last_ds["SNOW"] - first_ds["SNOW"]
last_ds["SNOW"]
difference.plot(x='XLONG',
                y='XLAT',
                cbar_kwargs={'label': "Change in Snow Water Equivalent (kg m$^{-2}$)"})
plt.title("24 Hour Difference in \n Snow Water Liquid Equivalent \n 10 March to 11 March 2023")
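
To see what the subtraction does without the WRF files at hand, here is a minimal sketch using synthetic arrays. The values and the variable names (`swe_first`, `swe_last`, `swe_change`) are made up for illustration and do not come from the SAIL dataset.

```python
import numpy as np
import xarray as xr

# Synthetic 2x2 "SWE" fields on the same grid; values are made up.
coords = {"y": [0, 1], "x": [0, 1]}
swe_first = xr.DataArray(
    np.array([[10.0, 20.0], [30.0, 40.0]]), coords=coords, dims=("y", "x")
)
swe_last = xr.DataArray(
    np.array([[12.0, 18.0], [30.0, 45.0]]), coords=coords, dims=("y", "x")
)

# Subtraction is element-wise after aligning on the shared coordinates.
swe_change = swe_last - swe_first

# Positive values mean accumulation, negative values mean loss.
gained = (swe_change > 0).sum().item()
lost = (swe_change < 0).sum().item()
print(swe_change.values)
print(f"cells gaining SWE: {gained}, losing SWE: {lost}")
```

Counting gaining and losing cells this way is one simple starting point for asking where snow accumulated and where it was lost.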

Challenge: Why is there more snow in some areas, and less in others?

  • Investigate other fields in the datasets
  • Look at other time steps - where are our precipitation fields?
  • Find possible scientific explanations here!

files = sorted(glob.glob("../data/sail/wrf/*"))
files
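
When looking at other time steps, it can help to recover each file's time from its name. The sketch below assumes the files follow the `wrfout_d03_YYYY-MM-DD_HH_MM_SS` pattern seen in the two paths used earlier. The helper `wrfout_time` is hypothetical and uses only the standard library.

```python
from datetime import datetime, timedelta
from pathlib import Path


def wrfout_time(path):
    """Parse the time from a name like wrfout_d03_2023-03-10_00_00_00."""
    # The name splits into 'wrfout', the domain, and the timestamp.
    _, _, stamp = Path(path).name.split("_", 2)
    return datetime.strptime(stamp, "%Y-%m-%d_%H_%M_%S")


# The two file paths opened earlier in this notebook.
example_files = [
    "../data/sail/wrf/wrfout_d03_2023-03-10_00_00_00",
    "../data/sail/wrf/wrfout_d03_2023-03-11_00_00_00",
]
times = [wrfout_time(f) for f in example_files]
span = times[-1] - times[0]
print(span == timedelta(hours=24))
```

Applying `wrfout_time` to every entry in `files` gives a quick overview of which time steps are available before any of them are opened.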

Summary

xwrf can be a helpful tool when working with WRF data in Python! In this tutorial, we investigated WRF data from the SAIL campaign, digging into the datasets and using visualization techniques to analyze our results.

What’s next?

How do we scale up this analysis beyond a 24-hour run, or to higher-resolution data? In future notebooks, we explore tools that help us analyze high-resolution datasets with xwrf.

Resources and references

This notebook was adapted from material in the xwrf documentation.

The dataset used here is an ARM PI dataset, courtesy of Zexuan Xu (zexuanxu@lbl.gov). If you use this dataset, please be sure to cite:

References
  1. Xu, Z. (2023). WRF East River simulation Oct 2021 - Dec 2022. Atmospheric Radiation Measurement (ARM) Archive, Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (US); ARM Data Center, Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). 10.5439/1971597
  2. Xu, Z., Siirila-Woodburn, E. R., Rhoades, A. M., & Feldman, D. (2022). Sensitivities of subgrid-scale physics schemes, meteorological forcing, and topographic radiation in atmosphere-through-bedrock integrated process models: A case study in the Upper Colorado River Basin. 10.5194/egusphere-2022-437