Skip to article frontmatterSkip to article content

Use xrefcoord to Generate Coordinates

Use xrefcoord to Generate Coordinates

When using Kerchunk to generate reference datasets for GeoTIFF’s, only the dimensions are preserved. xrefcoord is a small utility that allows us to generate coordinates for these reference datasets using the geospatial metadata. Similar to other accessor add-on libraries for Xarray such as rioxarray and xwrf, xrefcord provides an accessor for an Xarray dataset. Importing xrefcoord allows us to use the .xref accessor to access additional methods.

In this tutorial we will use the generate_coords method to build coordinates for the Xarray dataset. xrefcoord is very experimental and makes assumptions about the underlying data, such as each variable shares the same dimensions etc. Use with caution!

Overview

Within this notebook, we will cover:

  1. How to load a Kerchunk reference dataset created from a collection of GeoTIFFs
  2. How to use xrefcoord to generate coordinates from a GeoTIFF reference dataset

Prerequisites

ConceptsImportanceNotes
Kerchunk BasicsRequiredCore
Xarray TutorialRequiredCore
  • Time to learn: 45 minutes

import xarray as xr
import xrefcoord  # noqa

storage_options = {
    "remote_protocol": "s3",
    "skip_instance_cache": True,
    "remote_options": {"anon": True}
}  # options passed to fsspec
open_dataset_options = {"chunks": {}}  # opens passed to xarray

ds = xr.open_dataset(
    "references/RADAR.json",
    engine="kerchunk",
    storage_options=storage_options,
    open_dataset_options=open_dataset_options,
)
# Generate coordinates from reference dataset
ref_ds = ds.xref.generate_coords(time_dim_name="time", x_dim_name="X", y_dim_name="Y")
# Rename to rain accumulation in 24 hour period
ref_ds = ref_ds.rename({"0": "rr24h"})
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[2], line 2
      1 # Generate coordinates from reference dataset
----> 2 ref_ds = ds.xref.generate_coords(time_dim_name="time", x_dim_name="X", y_dim_name="Y")
      3 # Rename to rain accumulation in 24 hour period
      4 ref_ds = ref_ds.rename({"0": "rr24h"})

File ~/micromamba/envs/kerchunk-cookbook/lib/python3.13/site-packages/xrefcoord/accessors.py:46, in XRefDatasetAccessor.generate_coords(self, time_dim_name, x_dim_name, y_dim_name)
     43 shape = _get_shape(self.xarray_obj)
     45 # Generate coords
---> 46 coord_dict = _generate_coords(self.xarray_obj.attrs, shape)
     48 # Drop extra multiscale dims
     49 ds = _drop_dims(
     50     ds=self.xarray_obj,
     51     time_dim_name=time_dim_name,
     52     x_dim_name=x_dim_name,
     53     y_dim_name=y_dim_name,
     54 )

File ~/micromamba/envs/kerchunk-cookbook/lib/python3.13/site-packages/xrefcoord/coords.py:37, in _generate_coords(attrs, shape)
     24 """Produce coordinate arrays for given variable
     25 
     26 Specific to GeoTIFF input attributes
   (...)     33     The array size in numpy (C) order
     34 """
     35 import numpy as np
---> 37 height, width = shape[-2:]
     38 xscale, yscale, zscale = attrs["ModelPixelScale"][:3]
     39 x0, y0, z0 = attrs["ModelTiepoint"][3:6]

ValueError: not enough values to unpack (expected 2, got 1)

Create a Map

Here we are using Xarray to select a single time slice and create a map of 24 hour accumulated rainfall.

ref_ds["rr24h"].where(ref_ds.rr24h < 60000).isel(time=0).plot(robust=True)

Create a Time-Series

Next we are plotting accumulated rain as a function of time for a specific point.

ref_ds["rr24h"][:, 700, 700].plot()