Introduction to Pyresample
https://pyresample.readthedocs.io/en/latest/
This package seems a bit more speciallized, and does not have as tight of integration with xarray like xESMF and Verde do. If working with satellite or swath data, this is not one to miss! This package integrates with Satpy https://satpy.readthedocs.io/en/stable/ .
(more) Integration with xarray
This is on the to-do list: https://pyresample.readthedocs.io/en/latest/roadmap.html?highlight=xarray#xarray-and-geoxarray
Prerequisites
Knowing your way around xarray, numpy is beneficial. This is not deisgned to be an introduction to any of those packages. Would do this notebook after doing the xESMF one!
Imports
import pandas as pd
import numpy as np
import xarray as xr
import matplotlib.pyplot as plt
from appdirs import *
import dask.array as da
import pyresample
from pyresample import image, geometry
from pyresample.bilinear import NumpyBilinearResampler
from xarray import DataArray
from pyresample.bilinear import XArrayBilinearResampler
import os
%load_ext watermark
%watermark --iversions
pandas : 2.2.2
dask : 2024.5.2
numpy : 1.26.4
matplotlib: 3.8.4
sys : 3.10.14 | packaged by conda-forge | (main, Mar 20 2024, 12:45:18) [GCC 12.3.0]
pyresample: 1.28.3
xarray : 2024.5.0
Loading in one netCDF
file = '../data/onestorm.nc'
Let’s open this file with xarray:
ds = xr.open_dataset(file)
ds
<xarray.Dataset> Size: 39MB Dimensions: (x: 768, y: 768, t: 12, x2: 192, y2: 192, x3: 384, y3: 384, x4: 48, y4: 48) Dimensions without coordinates: x, y, t, x2, y2, x3, y3, x4, y4 Data variables: visible (x, y, t) float32 28MB ... water_vapor (x2, y2, t) float32 2MB ... clean_infrared (x2, y2, t) float32 2MB ... vil (x3, y3, t) float32 7MB ... lightning_flashes (x4, y4, t) float32 111kB ... Attributes: t: time dimension of all images. These are 5-min time steps x: x pixel dimension of the visible imagery y: y pixel dimension of the visible imagery x2: x pixel dimension of the water vapor and infrared imagery y2: y pixel dimension the water vapor and infrared imagery x3: x pixel dimension of the vertically integrated liquid imagery y3: y pixel dimension the vertically integrated liquid imagery x4: x pixel dimension of the lightning flashes y4: y pixel dimension of the lightning flashes
Trying to do the same thing with pyresample:
from pyresample.utils import load_cf_area
area_def, cf_info = load_cf_area('data/onestorm.nc', variable='visible', x='x', y='y')
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/xarray/backends/file_manager.py:211, in CachingFileManager._acquire_with_cache_info(self, needs_lock)
210 try:
--> 211 file = self._cache[self._key]
212 except KeyError:
File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/xarray/backends/lru_cache.py:56, in LRUCache.__getitem__(self, key)
55 with self._lock:
---> 56 value = self._cache[key]
57 self._cache.move_to_end(key)
KeyError: [<class 'netCDF4._netCDF4.Dataset'>, ('/home/runner/work/gridding-cookbook/gridding-cookbook/notebooks/data/onestorm.nc',), 'r', (('clobber', True), ('diskless', False), ('format', 'NETCDF4'), ('persist', False)), '4974837f-36ba-4276-85ad-b1cae336923b']
During handling of the above exception, another exception occurred:
FileNotFoundError Traceback (most recent call last)
Cell In[6], line 1
----> 1 area_def, cf_info = load_cf_area('data/onestorm.nc', variable='visible', x='x', y='y')
File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/pyresample/utils/cf.py:444, in load_cf_area(nc_file, variable, y, x)
441 if (x is not None and y is None) or (x is None and y is not None):
442 raise ValueError("You must specify both or neither of x= and y=")
--> 444 nc_handle = _open_nc_file(nc_file)
445 if variable is None:
446 # if the variable=None, we search through all variables
447 area_def, cf_info = _load_cf_area_several_variables(nc_handle)
File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/pyresample/utils/cf.py:478, in _open_nc_file(nc_file)
475 if isinstance(nc_file, xr.Dataset):
476 return nc_file
--> 478 return xr.open_dataset(nc_file)
File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/xarray/backends/api.py:571, in open_dataset(filename_or_obj, engine, chunks, cache, decode_cf, mask_and_scale, decode_times, decode_timedelta, use_cftime, concat_characters, decode_coords, drop_variables, inline_array, chunked_array_type, from_array_kwargs, backend_kwargs, **kwargs)
559 decoders = _resolve_decoders_kwargs(
560 decode_cf,
561 open_backend_dataset_parameters=backend.open_dataset_parameters,
(...)
567 decode_coords=decode_coords,
568 )
570 overwrite_encoded_chunks = kwargs.pop("overwrite_encoded_chunks", None)
--> 571 backend_ds = backend.open_dataset(
572 filename_or_obj,
573 drop_variables=drop_variables,
574 **decoders,
575 **kwargs,
576 )
577 ds = _dataset_from_backend_dataset(
578 backend_ds,
579 filename_or_obj,
(...)
589 **kwargs,
590 )
591 return ds
File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:646, in NetCDF4BackendEntrypoint.open_dataset(self, filename_or_obj, mask_and_scale, decode_times, concat_characters, decode_coords, drop_variables, use_cftime, decode_timedelta, group, mode, format, clobber, diskless, persist, lock, autoclose)
625 def open_dataset( # type: ignore[override] # allow LSP violation, not supporting **kwargs
626 self,
627 filename_or_obj: str | os.PathLike[Any] | BufferedIOBase | AbstractDataStore,
(...)
643 autoclose=False,
644 ) -> Dataset:
645 filename_or_obj = _normalize_path(filename_or_obj)
--> 646 store = NetCDF4DataStore.open(
647 filename_or_obj,
648 mode=mode,
649 format=format,
650 group=group,
651 clobber=clobber,
652 diskless=diskless,
653 persist=persist,
654 lock=lock,
655 autoclose=autoclose,
656 )
658 store_entrypoint = StoreBackendEntrypoint()
659 with close_on_error(store):
File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:409, in NetCDF4DataStore.open(cls, filename, mode, format, group, clobber, diskless, persist, lock, lock_maker, autoclose)
403 kwargs = dict(
404 clobber=clobber, diskless=diskless, persist=persist, format=format
405 )
406 manager = CachingFileManager(
407 netCDF4.Dataset, filename, mode=mode, kwargs=kwargs
408 )
--> 409 return cls(manager, group=group, mode=mode, lock=lock, autoclose=autoclose)
File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:356, in NetCDF4DataStore.__init__(self, manager, group, mode, lock, autoclose)
354 self._group = group
355 self._mode = mode
--> 356 self.format = self.ds.data_model
357 self._filename = self.ds.filepath()
358 self.is_remote = is_remote_uri(self._filename)
File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:418, in NetCDF4DataStore.ds(self)
416 @property
417 def ds(self):
--> 418 return self._acquire()
File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/xarray/backends/netCDF4_.py:412, in NetCDF4DataStore._acquire(self, needs_lock)
411 def _acquire(self, needs_lock=True):
--> 412 with self._manager.acquire_context(needs_lock) as root:
413 ds = _nc4_require_group(root, self._group, self._mode)
414 return ds
File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/contextlib.py:135, in _GeneratorContextManager.__enter__(self)
133 del self.args, self.kwds, self.func
134 try:
--> 135 return next(self.gen)
136 except StopIteration:
137 raise RuntimeError("generator didn't yield") from None
File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/xarray/backends/file_manager.py:199, in CachingFileManager.acquire_context(self, needs_lock)
196 @contextlib.contextmanager
197 def acquire_context(self, needs_lock=True):
198 """Context manager for acquiring a file."""
--> 199 file, cached = self._acquire_with_cache_info(needs_lock)
200 try:
201 yield file
File ~/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/xarray/backends/file_manager.py:217, in CachingFileManager._acquire_with_cache_info(self, needs_lock)
215 kwargs = kwargs.copy()
216 kwargs["mode"] = self._mode
--> 217 file = self._opener(*self._args, **kwargs)
218 if self._mode == "w":
219 # ensure file doesn't get overridden when opened again
220 self._mode = "a"
File src/netCDF4/_netCDF4.pyx:2469, in netCDF4._netCDF4.Dataset.__init__()
File src/netCDF4/_netCDF4.pyx:2028, in netCDF4._netCDF4._ensure_nc_success()
FileNotFoundError: [Errno 2] No such file or directory: '/home/runner/work/gridding-cookbook/gridding-cookbook/notebooks/data/onestorm.nc'
This is supposed to fail. Will chat about pro’s and con’s in the summary.
Resampling of gridded data using pyresample
Link to this turtorial is here: https://pyresample.readthedocs.io/en/latest/swath.html#pyresample-bilinear
We will be deconstructing it a bit to get into the details, but all of the code is from the above link.
target_def = geometry.AreaDefinition('areaD',
'Europe (3km, HRV, VTC)',
'areaD',
{'a': '6378144.0', 'b': '6356759.0',
'lat_0': '50.00', 'lat_ts': '50.00',
'lon_0': '8.00', 'proj': 'stere'},
800, 800,
[-1370912.72, -909968.64,
1029087.28, 1490031.36])
Unlike using xESMF, this does not depend or work with xarray:
print('target def type', type(target_def))
target def type <class 'pyresample.geometry.AreaDefinition'>
data = DataArray(da.from_array(np.fromfunction(lambda y, x: y*x, (500, 100))), dims=('y', 'x'))
type(data)
xarray.core.dataarray.DataArray
lons = da.from_array(np.fromfunction(lambda y, x: 3 + x * 0.1, (500, 100)))
lats = da.from_array(np.fromfunction(lambda y, x: 75 - y * 0.1, (500, 100)))
source_def = geometry.SwathDefinition(lons=lons, lats=lats)
resampler = XArrayBilinearResampler(source_def, target_def, 30e3)
result = resampler.resample(data)
type(result)
/home/runner/miniconda3/envs/gridding-cookbook-dev/lib/python3.10/site-packages/pyproj/crs/crs.py:1293: UserWarning: You will likely lose important projection information when converting to a PROJ string from another format. See: https://proj.org/faq.html#what-is-the-best-format-for-describing-coordinate-reference-systems
proj = self._crs.to_proj4(version=version)
xarray.core.dataarray.DataArray
Can export to xarray
result.to_dataset()
<xarray.Dataset> Size: 5MB Dimensions: (x: 800, y: 800) Coordinates: * x (x) float64 6kB -1.369e+06 ... ... * y (y) float64 6kB 1.489e+06 ... -... Data variables: reshape-c1dd57ad960628aa19ec19ddee8ffa66 (y, x) float64 5MB dask.array<chunksize=(800, 800), meta=np.ndarray>
data.to_dataset()
<xarray.Dataset> Size: 400kB Dimensions: (y: 500, x: 100) Dimensions without coordinates: y, x Data variables: array-76c109148983aaf79e69a6d8292615f3 (y, x) float64 400kB dask.array<chunksize=(500, 100), meta=np.ndarray>
Summary
Pyresample is a speciallist program, with strong functionality with satpy. Would reccomend if swath/sat image data is part of your normal workflow. For others, the requirement of the data being CF compliant and API is a hurdle.