Load CMIP6 Data with Intake-ESM

Overview

Intake-ESM is an experimental package that aims to provide a higher-level interface for searching and loading Earth System Model data archives, such as CMIP6. The package is under very active development, and features may be unstable. Please report any issues or suggestions on GitHub.

Prerequisites

Concepts                  Importance   Notes
Intro to Xarray           Necessary
Understanding of NetCDF   Helpful      Familiarity with metadata structure

Time to learn: 5 minutes

Imports

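import xarray as xr
import intake  # intake-esm must also be installed; it provides intake.open_esm_datastore

# Render Datasets as HTML in the notebook
xr.set_options(display_style='html')

# IPython magic: draw Matplotlib figures inline (notebook only)
%matplotlib inline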

Loading Data

Intake-ESM works by parsing an ESM Collection Spec and converting it to an Intake catalog. The collection spec is stored in a .json file, which we open here using Intake.

cat_url = "https://storage.googleapis.com/cmip6/pangeo-cmip6.json"
col = intake.open_esm_datastore(cat_url)
col
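For orientation, here is a minimal sketch of the kind of information an ESM collection spec carries. The field names follow the general layout of the ESM collection specification, but every value below (the `id`, the `catalog_file` path, and the column lists) is hypothetical and is not copied from the Pangeo catalog.

```python
import json

# Hypothetical, heavily trimmed collection spec; all values are illustrative only.
spec_text = """
{
  "esmcat_version": "0.1.0",
  "id": "example-cmip6",
  "description": "Toy collection for illustration",
  "catalog_file": "example-catalog.csv",
  "attributes": [
    {"column_name": "source_id"},
    {"column_name": "experiment_id"},
    {"column_name": "variable_id"}
  ],
  "assets": {"column_name": "zstore", "format": "zarr"},
  "aggregation_control": {
    "variable_column_name": "variable_id",
    "groupby_attrs": ["source_id", "experiment_id"],
    "aggregations": [
      {"type": "union", "attribute_name": "variable_id"}
    ]
  }
}
"""

spec = json.loads(spec_text)

# Columns of the catalog table that can be used as search keys.
searchable = [attr["column_name"] for attr in spec["attributes"]]

# Column that holds the path to each data asset, and the asset format.
asset_column = spec["assets"]["column_name"]
asset_format = spec["assets"]["format"]

# Columns whose values identify which assets belong to the same dataset.
groupby = spec["aggregation_control"]["groupby_attrs"]

print(searchable)
print(asset_column, asset_format)
print(groupby)
```

The spec itself holds no data: it points to a tabular catalog file and describes how its columns should be searched and combined.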

We can now use Intake methods to search the collection, and, if desired, export a Pandas dataframe.

cat = col.search(experiment_id=['historical', 'ssp585'], table_id='Oyr', variable_id='o2',
                 grid_label='gn')
cat.df
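Conceptually, a search like the one above filters the catalog table: criteria on different columns must all hold, and a list of values matches any of its entries. The plain-Python sketch below illustrates those semantics only. It is not Intake-ESM's implementation, and the `rows` list and the `search` helper are hypothetical.

```python
# Hypothetical catalog rows; real catalogs have many more columns and entries.
rows = [
    {"source_id": "CanESM5", "experiment_id": "historical",
     "table_id": "Oyr", "variable_id": "o2", "grid_label": "gn"},
    {"source_id": "CanESM5", "experiment_id": "ssp585",
     "table_id": "Oyr", "variable_id": "o2", "grid_label": "gn"},
    {"source_id": "CanESM5", "experiment_id": "ssp126",
     "table_id": "Oyr", "variable_id": "o2", "grid_label": "gn"},
    {"source_id": "CanESM5", "experiment_id": "historical",
     "table_id": "Omon", "variable_id": "tos", "grid_label": "gn"},
]


def search(rows, **query):
    """Keep rows where every queried column matches one of the allowed values."""
    def matches(row):
        for column, wanted in query.items():
            # Treat a single string as a one-element list of allowed values.
            allowed = [wanted] if isinstance(wanted, str) else list(wanted)
            if row.get(column) not in allowed:
                return False
        return True

    return [row for row in rows if matches(row)]


hits = search(rows, experiment_id=["historical", "ssp585"], table_id="Oyr",
              variable_id="o2", grid_label="gn")
for row in hits:
    print(row["source_id"], row["experiment_id"])
```

Only the first two rows survive: the third has a different experiment, and the fourth has a different table and variable.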

Intake knows how to open the Datasets automatically using Xarray. Furthermore, Intake-ESM contains special logic to concatenate and merge the individual results of our query into larger, higher-level aggregated Xarray Datasets.

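# Note: newer intake-esm releases replace zarr_kwargs with xarray_open_kwargs;
# if this call fails, try xarray_open_kwargs={'consolidated': True} instead.
dset_dict = cat.to_dataset_dict(zarr_kwargs={'consolidated': True})
list(dset_dict.keys())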

ds = dset_dict['CMIP.CCCma.CanESM5.historical.Oyr.gn']
ds
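The dictionary key used above can be read as a set of catalog attributes joined by dots. The sketch below mimics that grouping in plain Python. The grouping columns are an assumption inferred from the shape of the key 'CMIP.CCCma.CanESM5.historical.Oyr.gn', not read from the catalog, and the rows and `member_id` values are hypothetical.

```python
from collections import defaultdict

# Assumed grouping columns, inferred from the key
# 'CMIP.CCCma.CanESM5.historical.Oyr.gn'.
GROUPBY = ["activity_id", "institution_id", "source_id",
           "experiment_id", "table_id", "grid_label"]

# Hypothetical catalog rows, one per stored asset.
rows = [
    {"activity_id": "CMIP", "institution_id": "CCCma", "source_id": "CanESM5",
     "experiment_id": "historical", "table_id": "Oyr", "grid_label": "gn",
     "member_id": "r1i1p1f1"},
    {"activity_id": "CMIP", "institution_id": "CCCma", "source_id": "CanESM5",
     "experiment_id": "historical", "table_id": "Oyr", "grid_label": "gn",
     "member_id": "r2i1p1f1"},
    {"activity_id": "ScenarioMIP", "institution_id": "CCCma", "source_id": "CanESM5",
     "experiment_id": "ssp585", "table_id": "Oyr", "grid_label": "gn",
     "member_id": "r1i1p1f1"},
]

# Collect the members that share the same dotted key.
groups = defaultdict(list)
for row in rows:
    key = ".".join(row[col] for col in GROUPBY)
    groups[key].append(row["member_id"])

for key, members in groups.items():
    print(key, members)
```

Assets that share a key are the ones that would be concatenated or merged into a single Dataset, which is why one dictionary entry can cover several ensemble members.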

Summary

In this notebook, we used Intake-ESM to open an Xarray Dataset for one particular model and experiment.

What’s next?

We will see an example of downloading a dataset with fsspec and zarr.

Resources and references

Original notebook in the Pangeo Gallery by Henri Drake and Ryan Abernathey
