Load Kerchunked dataset with Xarray
Overview
Within this notebook, we will cover:
How to load a Kerchunk pre-generated reference file into Xarray as if it were a Zarr store.
Prerequisites
Concepts | Importance | Notes
---|---|---
 | Required | Core
 | Required | Core
Time to learn: 45 minutes
Opening Reference Dataset with Fsspec and Xarray
One way to use the reference dataset is to open it with Xarray. To do this, we create an fsspec reference filesystem from the Kerchunk output and pass its mapper to Xarray, which reads it through the Zarr engine.
import fsspec
import xarray as xr

# Create an fsspec reference filesystem from the Kerchunk output
fs = fsspec.filesystem(
    "reference",
    fo="references/ARG_combined.json",
    remote_protocol="s3",
    remote_options={"anon": True},
    skip_instance_cache=True,
)

# Expose the filesystem as a key-value mapper that the Zarr engine can read
m = fs.get_mapper("")
ds = xr.open_dataset(m, engine="zarr", backend_kwargs={"consolidated": False})
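To make "as if it were a Zarr store" more concrete, the sketch below builds a toy reference file and inspects it with the standard library only. It assumes the Kerchunk version 1 JSON layout, in which each Zarr key maps either to inline text (metadata) or to a `[url, offset, length]` triple pointing into an original file. The bucket name, file names, offsets, and the helper `summarize_references` are hypothetical and exist only for illustration; a real reference file such as `references/ARG_combined.json` will be far larger.

```python
import json
import tempfile
from pathlib import Path

# Toy reference set in the Kerchunk "version 1" layout.
# All keys, URLs, offsets, and lengths below are made up for illustration.
toy_refs = {
    "version": 1,
    "refs": {
        # Zarr metadata is stored inline as JSON text.
        ".zgroup": json.dumps({"zarr_format": 2}),
        "temp/.zarray": json.dumps(
            {
                "shape": [2, 4],
                "chunks": [1, 4],
                "dtype": "<f4",
                "compressor": None,
                "fill_value": None,
                "filters": None,
                "order": "C",
                "zarr_format": 2,
            }
        ),
        # Chunk data points into the original files: [url, offset, length].
        "temp/0.0": ["s3://example-bucket/file_a.nc", 4096, 16],
        "temp/1.0": ["s3://example-bucket/file_b.nc", 4096, 16],
    },
}


def summarize_references(path):
    """Split reference keys into inline entries and byte-range pointers."""
    refs = json.loads(Path(path).read_text())["refs"]
    inline = sorted(k for k, v in refs.items() if isinstance(v, str))
    pointers = {k: tuple(v) for k, v in refs.items() if isinstance(v, list)}
    return inline, pointers


with tempfile.TemporaryDirectory() as tmp:
    ref_path = Path(tmp) / "toy_combined.json"
    ref_path.write_text(json.dumps(toy_refs))
    inline_keys, chunk_pointers = summarize_references(ref_path)

print(inline_keys)
print(chunk_pointers["temp/0.0"])
```

The reference filesystem resolves each chunk key to a byte range in the remote file, which is why `remote_protocol` and `remote_options` are needed even though the reference file itself is local.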
Opening Reference Dataset with Xarray and the Kerchunk Engine
As of writing, the latest version of Kerchunk supports opening a reference dataset with Xarray without explicitly creating an fsspec filesystem. The result is the same as in the example above, with fewer lines of code.
storage_options = {
    "remote_protocol": "s3",
    "skip_instance_cache": True,
    "remote_options": {"anon": True},
}  # options passed to fsspec
open_dataset_options = {"chunks": {}}  # options passed to xarray

ds = xr.open_dataset(
    "references/ARG_combined.json",
    engine="kerchunk",
    storage_options=storage_options,
    open_dataset_options=open_dataset_options,
)
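The two approaches line up closely: the keys in `storage_options` are the same keyword arguments that the first example passed to `fsspec.filesystem`. The sketch below spells out that mapping. It assumes the Kerchunk engine forwards `storage_options` unchanged to the fsspec reference filesystem, which matches the behavior described above but is not verified here against a specific Kerchunk release. The helper name `to_fsspec_call` is hypothetical.

```python
def to_fsspec_call(reference_path, storage_options=None):
    """Return the protocol and keyword arguments for fsspec.filesystem().

    Hypothetical helper: assumes storage_options is forwarded as-is to the
    "reference" filesystem, with the reference file passed as `fo`.
    """
    kwargs = {"fo": reference_path}
    kwargs.update(storage_options or {})
    return "reference", kwargs


storage_options = {
    "remote_protocol": "s3",
    "skip_instance_cache": True,
    "remote_options": {"anon": True},
}

protocol, kwargs = to_fsspec_call("references/ARG_combined.json", storage_options)
print(protocol, sorted(kwargs))

# With fsspec installed, the equivalent explicit call would be:
# fs = fsspec.filesystem(protocol, **kwargs)
```

One difference between the two examples is unrelated to fsspec: the second passes `chunks={}` to Xarray, which returns Dask-backed arrays, while the first does not set `chunks`.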