Introduction to Pyresample

https://pyresample.readthedocs.io/en/latest/

This package is more specialized and does not integrate as tightly with xarray as xESMF and Verde do. If you work with satellite or swath data, though, this is not one to miss! This package integrates with Satpy https://satpy.readthedocs.io/en/stable/ .

(more) Integration with xarray

This is on the to-do list: https://pyresample.readthedocs.io/en/latest/roadmap.html?highlight=xarray#xarray-and-geoxarray

Prerequisites

Knowing your way around xarray and numpy is beneficial; this is not designed to be an introduction to those packages. We recommend working through the xESMF notebook before this one!

Imports

import pandas as pd
import numpy as np
import xarray as xr

import matplotlib.pyplot as plt

import appdirs

import dask.array as da

import pyresample
from pyresample import image, geometry
from pyresample.bilinear import NumpyBilinearResampler

from xarray import DataArray
from pyresample.bilinear import XArrayBilinearResampler

import os
%load_ext watermark
%watermark --iversions
pandas    : 2.2.2
dask      : 2024.5.2
numpy     : 1.26.4
matplotlib: 3.8.4
sys       : 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:45:18) [GCC 12.3.0]
pyresample: 1.28.3
xarray    : 2024.5.0

Loading in one netCDF

file = '../data/onestorm.nc'

Let’s open this file with xarray:

ds = xr.open_dataset(file)
ds
<xarray.Dataset> Size: 39MB
Dimensions:            (x: 768, y: 768, t: 12, x2: 192, y2: 192, x3: 384,
                        y3: 384, x4: 48, y4: 48)
Dimensions without coordinates: x, y, t, x2, y2, x3, y3, x4, y4
Data variables:
    visible            (x, y, t) float32 28MB ...
    water_vapor        (x2, y2, t) float32 2MB ...
    clean_infrared     (x2, y2, t) float32 2MB ...
    vil                (x3, y3, t) float32 7MB ...
    lightning_flashes  (x4, y4, t) float32 111kB ...
Attributes:
    t:        time dimension of all images. These are 5-min time steps
    x:        x pixel dimension of the visible imagery
    y:        y pixel dimension of the visible imagery
    x2:       x pixel dimension of the water vapor and infrared imagery
    y2:       y pixel dimension the water vapor and infrared imagery
    x3:       x pixel dimension of the vertically integrated liquid imagery
    y3:       y pixel dimension the vertically integrated liquid imagery
    x4:       x pixel dimension of the lightning flashes
    y4:       y pixel dimension of the lightning flashes
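Notice that this dataset only has pixel dimensions: there are no latitude/longitude coordinates and no CF grid-mapping metadata, which will matter in a moment. If we just want a single 2-D field to experiment with, a minimal sketch using the variable names shown above:

# Hypothetical example: pull the first time step of the visible imagery as a 2-D DataArray
vis0 = ds['visible'].isel(t=0)
vis0.shape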

Trying to do the same thing with pyresample:

from pyresample.utils import load_cf_area
area_def, cf_info = load_cf_area('data/onestorm.nc', variable='visible', x='x', y='y')
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/xarray/backends/file_manager.py:211, in CachingFileManager._acquire_with_cache_info(self, needs_lock)
    210 try:
--> 211     file = self._cache[self._key]
    212 except KeyError:

File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/xarray/backends/lru_cache.py:56, in LRUCache.__getitem__(self, key)
     55 with self._lock:
---> 56     value = self._cache[key]
     57     self._cache.move_to_end(key)

KeyError: [<class 'netCDF4._netCDF4.Dataset'>, ('/home/runner/work/gridding-cookbook/gridding-cookbook/notebooks/data/onestorm.nc',), 'r', (('clobber', True), ('diskless', False), ('format', 'NETCDF4'), ('persist', False)), '4974837f-36ba-4276-85ad-b1cae336923b']

During handling of the above exception, another exception occurred:

FileNotFoundError                         Traceback (most recent call last)
Cell In[6], line 1
----> 1 area_def, cf_info = load_cf_area('data/onestorm.nc', variable='visible', x='x', y='y')

File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/pyresample/utils/cf.py:444, in load_cf_area(nc_file, variable, y, x)
    441 if (x is not None and y is None) or (x is None and y is not None):
    442     raise ValueError("You must specify both or neither of x= and y=")
--> 444 nc_handle = _open_nc_file(nc_file)
    445 if variable is None:
    446     # if the variable=None, we search through all variables
    447     area_def, cf_info = _load_cf_area_several_variables(nc_handle)

File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/pyresample/utils/cf.py:478, in _open_nc_file(nc_file)
    475 if isinstance(nc_file, xr.Dataset):
    476     return nc_file
--> 478 return xr.open_dataset(nc_file)

File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/xarray/backends/api.py:571, in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, inline_array, chunked_array_type, from_array_kwargs, backend_kwargs, **kwargs)
    559 decoders = _resolve_decoders_kwargs(
    560     decode_cf,
    561     open_backend_dataset_parameters=backend.open_dataset_parameters,
   (...)
    567     decode_coords=decode_coords,
    568 )
    570 overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 571 backend_ds = backend.open_dataset(
    572     filename_or_obj,
    573     drop_variables=drop_variables,
    574     **decoders,
    575     **kwargs,
    576 )
    577 ds = _dataset_from_backend_dataset(
    578     backend_ds,
    579     filename_or_obj,
   (...)
    589     **kwargs,
    590 )
    591 return ds

File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:646, in NetCDF4BackendEntrypoint.open_dataset(self, filename_or_obj, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta, group, mode, format, clobber, diskless, persist, lock, autoclose)
    625 def open_dataset(  # type: ignore[override]  # allow LSP violation, not supporting **kwargs
    626     self,
    627     filename_or_obj: str | os.PathLike[Any] | BufferedIOBase | AbstractDataStore,
   (...)
    643     autoclose=False,
    644 ) -> Dataset:
    645     filename_or_obj = _normalize_path(filename_or_obj)
--> 646     store = NetCDF4DataStore.open(
    647         filename_or_obj,
    648         mode=mode,
    649         format=format,
    650         group=group,
    651         clobber=clobber,
    652         diskless=diskless,
    653         persist=persist,
    654         lock=lock,
    655         autoclose=autoclose,
    656     )
    658     store_entrypoint = StoreBackendEntrypoint()
    659     with close_on_error(store):

File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:409, in NetCDF4DataStore.open(cls, filename, mode, format, group, clobber, diskless, persist, lock, lock_maker, autoclose)
    403 kwargs = dict(
    404     clobber=clobber, diskless=diskless, persist=persist, format=format
    405 )
    406 manager = CachingFileManager(
    407     netCDF4.Dataset, filename, mode=mode, kwargs=kwargs
    408 )
--> 409 return cls(manager, group=group, mode=mode, lock=lock, autoclose=autoclose)

File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:356, in NetCDF4DataStore.__init__(self, manager, group, mode, lock, autoclose)
    354 self._group = group
    355 self._mode = mode
--> 356 self.format = self.ds.data_model
    357 self._filename = self.ds.filepath()
    358 self.is_remote = is_remote_uri(self._filename)

File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:418, in NetCDF4DataStore.ds(self)
    416 @property
    417 def ds(self):
--> 418     return self._acquire()

File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:412, in NetCDF4DataStore._acquire(self, needs_lock)
    411 def _acquire(self, needs_lock=True):
--> 412     with self._manager.acquire_context(needs_lock) as root:
    413         ds = _nc4_require_group(root, self._group, self._mode)
    414     return ds

File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/contextlib.py:135, in _GeneratorContextManager.__enter__(self)
    133 del self.args, self.kwds, self.func
    134 try:
--> 135     return next(self.gen)
    136 except StopIteration:
    137     raise RuntimeError("generator didn't yield") from None

File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/xarray/backends/file_manager.py:199, in CachingFileManager.acquire_context(self, needs_lock)
    196 @contextlib.contextmanager
    197 def acquire_context(self, needs_lock=True):
    198     """Context manager for acquiring a file."""
--> 199     file, cached = self._acquire_with_cache_info(needs_lock)
    200     try:
    201         yield file

File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/xarray/backends/file_manager.py:217, in CachingFileManager._acquire_with_cache_info(self, needs_lock)
    215     kwargs = kwargs.copy()
    216     kwargs["mode"] = self._mode
--> 217 file = self._opener(*self._args, **kwargs)
    218 if self._mode == "w":
    219     # ensure file doesn't get overridden when opened again
    220     self._mode = "a"

File src/netCDF4/_netCDF4.pyx:2469, in netCDF4._netCDF4.Dataset.__init__()

File src/netCDF4/_netCDF4.pyx:2028, in netCDF4._netCDF4._ensure_nc_success()

FileNotFoundError: [Errno 2] No such file or directory: '/home/runner/work/gridding-cookbook/gridding-cookbook/notebooks/data/onestorm.nc'

This is supposed to fail. We will chat about the pros and cons in the summary.
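If the file had contained CF-compliant projection metadata, load_cf_area would return an AreaDefinition for us. Without it, the geometry has to be built by hand. A hedged sketch of what that might look like (the lon/lat arrays below are placeholders, since this dataset does not ship coordinate variables):

# Hypothetical example: build a SwathDefinition from lon/lat arrays supplied by the user
# (placeholder values; onestorm.nc does not include lat/lon variables)
lon_1d = np.linspace(-100.0, -95.0, 768)
lat_1d = np.linspace(30.0, 35.0, 768)
lon2d, lat2d = np.meshgrid(lon_1d, lat_1d)
swath_def = geometry.SwathDefinition(lons=lon2d, lats=lat2d)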

Resampling of gridded data using pyresample

Link to this tutorial is here: https://pyresample.readthedocs.io/en/latest/swath.html#pyresample-bilinear

We will be deconstructing it a bit to get into the details, but all of the code is from the above link.

target_def = geometry.AreaDefinition('areaD',
                                     'Europe (3km, HRV, VTC)',
                                     'areaD',
                                     {'a': '6378144.0', 'b': '6356759.0',
                                      'lat_0': '50.00', 'lat_ts': '50.00',
                                      'lon_0': '8.00', 'proj': 'stere'},
                                     800, 800,
                                     [-1370912.72, -909968.64,
                                      1029087.28, 1490031.36])
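An AreaDefinition bundles a projection (CRS), a grid shape, and an area extent in projection coordinates. To deconstruct it a little, we can print it and look at a few of its attributes (a small sketch; these attribute names come from pyresample's geometry API):

print(target_def)
print(target_def.shape)                                   # (rows, cols) of the target grid
print(target_def.pixel_size_x, target_def.pixel_size_y)   # grid spacing in projection units (metres)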

Unlike xESMF, pyresample's grid definitions do not depend on or work directly with xarray:

print('target def type', type(target_def))
target def type <class 'pyresample.geometry.AreaDefinition'>
# Synthetic source data: a 500x100 dask-backed DataArray with no coordinates
data = DataArray(da.from_array(np.fromfunction(lambda y, x: y*x, (500, 100))), dims=('y', 'x'))
type(data)
xarray.core.dataarray.DataArray
# Synthetic swath of lon/lat coordinates describing where the source data lies
lons = da.from_array(np.fromfunction(lambda y, x: 3 + x * 0.1, (500, 100)))
lats = da.from_array(np.fromfunction(lambda y, x: 75 - y * 0.1, (500, 100)))
source_def = geometry.SwathDefinition(lons=lons, lats=lats)
# Bilinear resampler from the swath to the target area, with a 30 km radius of influence
resampler = XArrayBilinearResampler(source_def, target_def, 30e3)
result = resampler.resample(data)
type(result)
/home/runner/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/pyproj/crs/crs.py:1293: UserWarning: You will likely lose important projection information when converting to a PROJ string from another format. See: https://proj.org/faq.html#what-is-the-best-format-for-describing-coordinate-reference-systems
  proj = self._crs.to_proj4(version=version)
xarray.core.dataarray.DataArray
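Bilinear interpolation is only one of pyresample's resampling options. For comparison, a nearest-neighbour version of the same operation can be done with the kd_tree module on plain numpy arrays, a sketch assuming the same source_def and target_def as above:

from pyresample import kd_tree

# Nearest-neighbour resampling of the raw values onto the target grid;
# radius_of_influence is in metres, fill_value=None returns a masked array
result_nn = kd_tree.resample_nearest(source_def, data.values, target_def,
                                     radius_of_influence=50000, fill_value=None)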

The result can be exported to an xarray Dataset:

result.to_dataset()
<xarray.Dataset> Size: 5MB
Dimensions:                                   (x: 800, y: 800)
Coordinates:
  * x                                         (x) float64 6kB -1.369e+06 ... ...
  * y                                         (y) float64 6kB 1.489e+06 ... -...
Data variables:
    reshape-c1dd57ad960628aa19ec19ddee8ffa66  (y, x) float64 5MB dask.array<chunksize=(800, 800), meta=np.ndarray>
data.to_dataset()
<xarray.Dataset> Size: 400kB
Dimensions:                                 (y: 500, x: 100)
Dimensions without coordinates: y, x
Data variables:
    array-76c109148983aaf79e69a6d8292615f3  (y, x) float64 400kB dask.array<chunksize=(500, 100), meta=np.ndarray>
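Because the result is a regular xarray DataArray with projection x/y coordinates, a quick look with matplotlib (already imported above) is straightforward. A minimal sketch:

# Quick-look plot of the resampled field on the target grid
result.plot()
plt.title('Bilinear resampling onto the target grid')
plt.show()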

Summary

Pyresample is a specialist package, with strong integration with Satpy. We would recommend it if swath or satellite image data is part of your normal workflow. For others, the requirement that data be CF compliant and the less xarray-centric API can be a hurdle.