Use xrefcoord to Generate Coordinates
When Kerchunk generates reference datasets for GeoTIFFs, only the dimensions are preserved; the coordinates are not. xrefcoord is a small utility that generates coordinates for these reference datasets from their geospatial metadata. Like other accessor add-on libraries for Xarray, such as rioxarray and xwrf, xrefcoord provides an accessor for an Xarray dataset. Importing xrefcoord makes the .xref accessor and its additional methods available.
In this tutorial we will use the generate_coords method to build coordinates for the Xarray dataset. xrefcoord is very experimental and makes assumptions about the underlying data, for example that every variable shares the same dimensions. Use with caution!
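To make the accessor idea concrete, here is a minimal sketch of how an add-on library can attach methods to an Xarray dataset using Xarray's `register_dataset_accessor` decorator. The accessor name `demo` and the method `add_index_coords` are hypothetical and exist only for illustration; xrefcoord's internals are not shown in this tutorial and may differ.

```python
import numpy as np
import xarray as xr


# Hypothetical accessor called "demo", registered the way add-on libraries
# typically register theirs. This is not xrefcoord's actual code.
@xr.register_dataset_accessor("demo")
class DemoAccessor:
    def __init__(self, xarray_obj):
        self._obj = xarray_obj

    def add_index_coords(self):
        """Return a copy with integer index coordinates on every dimension."""
        coords = {dim: np.arange(size) for dim, size in self._obj.sizes.items()}
        return self._obj.assign_coords(coords)


# A dataset with dimensions but no coordinates, similar in spirit to a
# GeoTIFF reference dataset.
ds_demo = xr.Dataset({"a": (("y", "x"), np.zeros((2, 3)))})
ds_with_coords = ds_demo.demo.add_index_coords()
```

Once the module that defines the accessor has been imported, every `xr.Dataset` exposes it as an attribute, which is why a bare `import xrefcoord` is enough to enable `.xref`.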
Overview
Within this notebook, we will cover:
- How to load a Kerchunk reference dataset created from a collection of GeoTIFFs
- How to use xrefcoord to generate coordinates from a GeoTIFF reference dataset
Prerequisites
| Concepts | Importance | Notes |
|---|---|---|
| Kerchunk Basics | Required | Core |
| Xarray Tutorial | Required | Core |
- Time to learn: 45 minutes
import xarray as xr
# Imported only for its side effect: it registers the .xref accessor
import xrefcoord  # noqa

# Options passed to fsspec. "skip_instance_cache" is left out because the
# Kerchunk backend forwards these options to refs_as_store(), which fails with
# "TypeError: refs_as_store() got an unexpected keyword argument 'skip_instance_cache'".
storage_options = {
    "remote_protocol": "s3",
    "remote_options": {"anon": True},
}
open_dataset_options = {"chunks": {}}  # options passed to xarray

ds = xr.open_dataset(
    "references/RADAR.json",
    engine="kerchunk",
    storage_options=storage_options,
    open_dataset_options=open_dataset_options,
)
# Generate coordinates from reference dataset
ref_ds = ds.xref.generate_coords(time_dim_name="time", x_dim_name="X", y_dim_name="Y")
# Rename variable "0" to rr24h (rain accumulation over a 24-hour period)
ref_ds = ref_ds.rename({"0": "rr24h"})
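For intuition about what "generating coordinates from geospatial metadata" can involve, the sketch below computes pixel-center x/y coordinates from a GDAL-style six-element geotransform. This is an illustration under stated assumptions (a north-up raster with no rotation), not xrefcoord's actual implementation; the function name `pixel_center_coords` and the example values are hypothetical.

```python
import numpy as np


def pixel_center_coords(geotransform, width, height):
    """Compute 1-D x/y pixel-center coordinates from a GDAL-style geotransform.

    The geotransform is ordered as
    (x_origin, pixel_width, row_rotation, y_origin, column_rotation, pixel_height),
    where the origin is the outer corner of the upper-left pixel.
    """
    x_origin, pixel_width, x_rot, y_origin, y_rot, pixel_height = geotransform
    if x_rot != 0 or y_rot != 0:
        # Rotated rasters cannot be described by two 1-D coordinate arrays.
        raise ValueError("rotated rasters need 2-D coordinates")
    # Shift by half a pixel to move from pixel corners to pixel centers.
    x = x_origin + (np.arange(width) + 0.5) * pixel_width
    y = y_origin + (np.arange(height) + 0.5) * pixel_height
    return x, y


# Hypothetical example: a 4 x 3 grid of 0.5-degree pixels whose upper-left
# corner is at (100.0, 10.0). pixel_height is negative for north-up rasters.
x, y = pixel_center_coords((100.0, 0.5, 0.0, 10.0, 0.0, -0.5), width=4, height=3)
```

Because each coordinate depends only on the grid shape and six numbers, it can be rebuilt without reading any pixel data, which suits reference datasets that store only chunk locations.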
Create a Map
Here we use Xarray to select a single time slice and create a map of 24-hour accumulated rainfall.
ref_ds["rr24h"].where(ref_ds.rr24h < 60000).isel(time=0).plot(robust=True)
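The plotting call chains a mask and a selection. The sketch below shows the same two steps on a small synthetic array so the effect of each is visible without remote data. The dimension names and the value 65535 are assumptions made for the example; the tutorial does not state why 60000 is used as the threshold, although hiding fill or no-data values is a plausible reason.

```python
import numpy as np
import xarray as xr

# Synthetic stand-in for the rainfall variable; 65535 plays the role of an
# implausibly large value that we want to hide (an assumption for this example).
data = np.array([[[1.0, 65535.0], [3.0, 4.0]]])
da = xr.DataArray(data, dims=("time", "Y", "X"), name="rr24h")

masked = da.where(da < 60000)      # values that fail the condition become NaN
first_slice = masked.isel(time=0)  # 2-D (Y, X) array for the first time step
```

Masked cells are NaN, so they are left blank in the map and ignored by reductions such as `sum` and `mean`.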
Create a Time-Series
Next, we plot accumulated rainfall as a function of time at a single grid point.
ref_ds["rr24h"][:, 700, 700].plot()
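Positional indexing such as `[:, 700, 700]` depends on the order of the dimensions. As a hedged alternative, the sketch below selects the same kind of point by dimension name with `isel`. The `(time, Y, X)` order and the dimension names are assumptions for this synthetic example; check `ref_ds["rr24h"].dims` before applying it to the real dataset.

```python
import numpy as np
import xarray as xr

# Synthetic data with an assumed (time, Y, X) dimension order.
da = xr.DataArray(
    np.arange(24.0).reshape(2, 3, 4),
    dims=("time", "Y", "X"),
    name="rr24h",
)

by_position = da[:, 1, 2]    # relies on the dimension order
by_name = da.isel(Y=1, X=2)  # works regardless of the dimension order
```

Both selections return a 1-D array along `time`; the named form keeps working if the dimensions are later transposed.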