Introduction to xwrf¶

Overview¶

This notebook will introduce the basics of gridded, labeled data with Xarray. Since Xarray introduces additional abstractions on top of plain arrays of data, our goal is to show why these abstractions are useful and how they frequently lead to simpler, more robust code.

We’ll cover these topics:

Create a DataArray, one of the core object types in Xarray
Understand how to use named coordinates and metadata in a DataArray
Combine individual DataArrays into a Dataset, the other core object type in Xarray
Subset, slice, and interpolate the data using named coordinates
Open netCDF data using Xarray
Basic subsetting and aggregation of a Dataset
Brief introduction to plotting with Xarray

Prerequisites¶

Concepts	Importance	Notes
NumPy Basics	Necessary
Introduction to Xarray	Necessary	Understanding of data structures
Intermediate NumPy	Helpful	Familiarity with indexing and slicing arrays
Datetime	Helpful	Familiarity with time formats and the `timedelta` object
Understanding of NetCDF	Helpful	Familiarity with metadata structure

Time to learn: 30 minutes

Imports¶

Just as numpy is commonly imported as np and pandas as pd, you will often see xarray imported under the shortened namespace xr.

from datetime import timedelta

import cmweather
import xarray as xr
import xwrf
import glob

import matplotlib.pyplot as plt

Introducing xwrf¶

xWRF is a package designed to make the post-processing of WRF output data more Pythonic. Its aim is to smooth the rough edges around the unique, non-CF-compliant WRF output data format and make the data accessible to utilities like dask and the wider Pangeo universe.

It is built as an Accessor on top of xarray, providing a very simple user interface.
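To make the accessor idea concrete, here is a minimal sketch of how an xarray accessor is registered and used. The accessor name `demo`, the class `DemoAccessor`, and the method `add_celsius` are hypothetical names invented for this illustration; they are not part of xwrf, which registers its own accessor under the name `xwrf`.

```python
import numpy as np
import xarray as xr


# "demo" is a hypothetical accessor name chosen for this sketch;
# xwrf registers its own accessor under the name "xwrf".
@xr.register_dataset_accessor("demo")
class DemoAccessor:
    def __init__(self, xarray_obj):
        # xarray passes in the Dataset that the accessor is attached to
        self._ds = xarray_obj

    def add_celsius(self, name="T2"):
        """Return a new Dataset with a Celsius copy of a Kelvin variable."""
        out = self._ds.copy()
        celsius = self._ds[name] - 273.15
        celsius.attrs["units"] = "degC"
        out[name + "_C"] = celsius
        return out


# Toy dataset (made-up values, not WRF output)
toy = xr.Dataset({"T2": (("y", "x"), np.full((2, 3), 273.15))})
result = toy.demo.add_celsius()
print(list(result.data_vars))
```

Once registered, the accessor is available on every Dataset as an attribute, which is the same pattern that lets us write `ds.xwrf.postprocess()` later in this notebook.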

Finding a Dataset¶

Our dataset of interest is a WRF dataset from the SAIL domain, near Gothic, Colorado, available as an ARM PI dataset - https://iop.archive.arm.gov/arm-iop-file/2021/guc/sail/xu-wrf/README.html.

More information on the dataset:

These are the Weather Research and Forecasting (WRF) regional climate model simulations for supporting the analysis of temperature, precipitation, and other hydroclimate variables and evaluating SAIL data. The WRF model has three nested domains centered at the SAIL location (East River, Colorado) for the SAIL period from Oct 01, 2021 to Dec 31, 2022. We used the BSU subgrid-scale physics schemes, CFSR meteorological forcing datasets, and the topographic shading radiation schemes in our WRF simulation. Detailed information on the model configuration can be found at https://doi.org/10.5194/egusphere-2022-437

Examining the data¶

When opening up a normal WRF output file with the simple xarray netcdf backend, one can see that it does not provide a lot of useful information.

ds = xr.open_dataset("../data/sail/wrf/wrfout_d03_2023-03-10_00_00_00")
ds

Use `.postprocess()` to clean the dataset¶

While all variables are present, some of the information is not readily usable. For example, the projection information is still stored only in the metadata, and some fields have units attributes that are not MetPy-compliant.

So let’s use the standard xWRF `.xwrf.postprocess()` method to make this information usable.

ds = xr.open_dataset("../data/sail/wrf/wrfout_d03_2023-03-10_00_00_00").xwrf.postprocess()
ds

As you can see, xWRF added some coordinate data, reassigned some dimensions, and generally increased the amount of information available in the dataset.
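One way to check what changed is to compare the coordinate and variable names before and after post-processing. The sketch below uses a hypothetical helper, `summarize_changes`, and toy datasets that stand in for the raw and post-processed files; the names and values are assumptions for illustration only.

```python
import numpy as np
import xarray as xr


def summarize_changes(before, after):
    """Report coordinates and data variables that differ between two Datasets."""
    return {
        "added_coords": sorted(set(after.coords) - set(before.coords)),
        "added_vars": sorted(set(after.data_vars) - set(before.data_vars)),
        "removed_vars": sorted(set(before.data_vars) - set(after.data_vars)),
    }


# Toy stand-ins for a raw and a post-processed dataset (not real WRF output)
raw = xr.Dataset({"SNOW": (("y", "x"), np.zeros((2, 2)))})
processed = raw.assign_coords(x=[0.0, 1000.0], y=[0.0, 1000.0])

changes = summarize_changes(raw, processed)
print(changes)
```

With the real data, the same helper could be called on the dataset opened with `xr.open_dataset(...)` and on the result of `.xwrf.postprocess()`.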

Plot the Data!¶

Now that we have our data in an easier-to-analyze format, let’s plot one of the fields.

ds.SNOW.plot(x='XLONG',
             y='XLAT',
             cmap='Blues')

Investigate the change in SWE over time¶

first_ds = xr.open_dataset("../data/sail/wrf/wrfout_d03_2023-03-10_00_00_00").xwrf.postprocess().squeeze()
last_ds = xr.open_dataset("../data/sail/wrf/wrfout_d03_2023-03-11_00_00_00").xwrf.postprocess().squeeze()

Calculate the Change in SWE¶

Here, we subtract the first snapshot from the last one using Xarray, which operates on the whole labeled field at once.

difference = last_ds["SNOW"] - first_ds["SNOW"]

last_ds["SNOW"]
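The subtraction above can be sketched on toy data. This example assumes that each squeezed snapshot carries its own scalar `Time` coordinate; the field values are made up. When two arrays have scalar (non-index) coordinates with conflicting values, xarray drops that coordinate from the result instead of raising an error.

```python
import numpy as np
import xarray as xr

# Toy fields standing in for two squeezed SNOW snapshots (values are made up)
dims = ("south_north", "west_east")
first = xr.DataArray(
    np.array([[10.0, 20.0], [30.0, 40.0]]),
    dims=dims,
    coords={"Time": np.datetime64("2023-03-10", "ns")},
    name="SNOW",
)
last = xr.DataArray(
    np.array([[12.0, 18.0], [30.0, 45.0]]),
    dims=dims,
    coords={"Time": np.datetime64("2023-03-11", "ns")},
    name="SNOW",
)

# Element-wise difference; the conflicting scalar Time coordinate is dropped
change = last - first
print(change.values)
print("Time" in change.coords)
```

Positive values mark grid cells that gained snow water equivalent between the two snapshots, and negative values mark cells that lost it.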

difference.plot(x='XLONG',
                y='XLAT',
                cbar_kwargs={'label': "Change in Snow Water Equivalent (kg m$^{-2}$)"})
plt.title("24 Hour Difference in \n Snow Water Liquid Equivalent \n 10 March to 11 March 2023")

Challenge: Why is there more snow in some areas, and less in others?¶

Investigate other fields in the datasets
Look at other time steps - where are our precipitation fields?
Find possible scientific explanations here!

files = sorted(glob.glob("../data/sail/wrf/*"))
files
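When looking at other time steps, it can help to read the valid time directly from each file name. The sketch below assumes the file names follow the `wrfout_d03_2023-03-10_00_00_00` pattern used in this notebook; `wrfout_time` is a hypothetical helper written for this example, not an xwrf function.

```python
from datetime import datetime, timedelta
from pathlib import Path


def wrfout_time(path):
    """Parse the timestamp from a name like wrfout_d03_2023-03-10_00_00_00."""
    # Split off "wrfout" and the domain label, keep the timestamp part
    stamp = Path(path).name.split("_", 2)[2]
    return datetime.strptime(stamp, "%Y-%m-%d_%H_%M_%S")


names = [
    "../data/sail/wrf/wrfout_d03_2023-03-11_00_00_00",
    "../data/sail/wrf/wrfout_d03_2023-03-10_00_00_00",
]
times = sorted(wrfout_time(n) for n in names)
span = times[-1] - times[0]
print(span == timedelta(hours=24))
```

The same function could be mapped over the `files` list above to pick out the snapshots that bracket a period of interest.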

Summary¶

xwrf can be a helpful tool when working with WRF data in Python! In this tutorial, we investigated WRF data from the SAIL campaign, digging into the datasets and using visualization techniques to analyze our results.

What’s next?¶

How do we scale this analysis beyond a 24-hour run, or to higher-resolution data? In future notebooks, we explore tools that increase our ability to analyze high-resolution datasets with xwrf.

Resources and references¶

This notebook was adapted from material in the xwrf documentation.

The dataset used here is an ARM PI dataset, courtesy of Zexuan Xu (zexuanxu@lbl.gov). If you use this dataset, please be sure to cite:

Xu (2023)
More information can be found in the related publication - Xu et al. (2022)

References¶

Xu, Z. (2023). WRF East River simulation Oct 2021 - Dec 2022. Atmospheric Radiation Measurement (ARM) Archive, Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (US); ARM Data Center, Oak Ridge National Laboratory (ORNL), Oak Ridge, TN (United States). 10.5439/1971597
Xu, Z., Siirila-Woodburn, E. R., Rhoades, A. M., & Feldman, D. (2022). Sensitivities of subgrid-scale physics schemes, meteorological forcing, and topographic radiation in atmosphere-through-bedrock integrated process models: A case study in the Upper Colorado River Basin. 10.5194/egusphere-2022-437
