
Load CMIP6 Data with Intake-ESM


Overview

Intake-ESM is an experimental package that provides a higher-level interface for searching and loading Earth System Model data archives, such as CMIP6. The package is under active development, and features may be unstable. Please report any issues or suggestions on GitHub.

Prerequisites

| Concepts | Importance | Notes |
| --- | --- | --- |
| Intro to Xarray | Necessary | |
| Understanding of NetCDF | Helpful | Familiarity with metadata structure |

  • Time to learn: 5 minutes

Imports

import xarray as xr
xr.set_options(display_style='html')
import intake
%matplotlib inline

Loading Data

Intake-ESM works by parsing an ESM Collection Spec and converting it to an Intake catalog. The collection spec is stored in a .json file that describes the catalog and points to the table of data assets. Here we open it using Intake.

cat_url = "https://storage.googleapis.com/cmip6/pangeo-cmip6.json"
col = intake.open_esm_datastore(cat_url)
col
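To make the idea of a collection spec more concrete, here is a minimal sketch of what such a .json file can contain and how its main fields might be read. The field names follow the ESM Collection Spec, but every value below (the id, the description, the example.org URL) is made up for illustration and is not the content of the real Pangeo catalog.

```python
import json

# Hypothetical, trimmed-down collection spec for illustration only.
# Field names follow the ESM Collection Spec; the values are invented.
spec_text = """
{
  "esmcat_version": "0.1.0",
  "id": "example-cmip6",
  "description": "Illustrative catalog, not the real Pangeo file",
  "catalog_file": "https://example.org/example-cmip6.csv",
  "attributes": [
    {"column_name": "experiment_id"},
    {"column_name": "table_id"},
    {"column_name": "variable_id"},
    {"column_name": "grid_label"}
  ],
  "assets": {"column_name": "zstore", "format": "zarr"},
  "aggregation_control": {
    "variable_column_name": "variable_id",
    "groupby_attrs": ["activity_id", "institution_id", "source_id",
                      "experiment_id", "table_id", "grid_label"],
    "aggregations": [
      {"type": "union", "attribute_name": "variable_id"}
    ]
  }
}
"""

spec = json.loads(spec_text)

# Columns of the catalog table that describe each asset and can be searched on.
searchable = [attr["column_name"] for attr in spec["attributes"]]

# Column that holds the location of each data asset, and the asset format.
asset_column = spec["assets"]["column_name"]
asset_format = spec["assets"]["format"]

print(searchable)
print(asset_column, asset_format)
```

The spec itself holds no data. It names a table of assets (catalog_file), says which column locates each asset, and describes how assets may later be combined.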

We can now search the collection and, if desired, export the results as a Pandas DataFrame.

cat = col.search(experiment_id=['historical', 'ssp585'], table_id='Oyr', variable_id='o2',
                 grid_label='gn')
cat.df
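The search above combines its keyword arguments with a logical AND, while a list of values for one keyword matches any of the listed values. The following plain-Python sketch mimics that behaviour on a handful of invented catalog rows. The helper name search_rows and the rows themselves are hypothetical, and this is only a conceptual illustration, not how Intake-ESM implements its search.

```python
# Hypothetical catalog rows; the real catalog has many more rows and columns.
rows = [
    {"source_id": "CanESM5", "experiment_id": "historical",
     "table_id": "Oyr", "variable_id": "o2", "grid_label": "gn"},
    {"source_id": "CanESM5", "experiment_id": "ssp585",
     "table_id": "Oyr", "variable_id": "o2", "grid_label": "gn"},
    {"source_id": "CanESM5", "experiment_id": "ssp245",
     "table_id": "Oyr", "variable_id": "o2", "grid_label": "gn"},
    {"source_id": "CanESM5", "experiment_id": "historical",
     "table_id": "Omon", "variable_id": "tos", "grid_label": "gn"},
]


def search_rows(rows, **query):
    """Keep rows that match every keyword; a list value matches any of its items."""
    def matches(row):
        for column, wanted in query.items():
            allowed = wanted if isinstance(wanted, (list, tuple, set)) else [wanted]
            if row.get(column) not in allowed:
                return False
        return True

    return [row for row in rows if matches(row)]


selected = search_rows(
    rows,
    experiment_id=["historical", "ssp585"],
    table_id="Oyr",
    variable_id="o2",
    grid_label="gn",
)
print([row["experiment_id"] for row in selected])
```

Only the first two rows survive: the ssp245 row fails the experiment_id test, and the Omon row fails the table_id and variable_id tests.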

Intake knows how to automatically open the Datasets using Xarray. Furthermore, Intake-ESM contains special logic to concatenate and merge the individual results of our query into larger, more high-level aggregated Xarray Datasets.

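# Open every matching asset and aggregate the results into Xarray Datasets.
# Note: newer intake-esm releases deprecate zarr_kwargs in favor of xarray_open_kwargs.
dset_dict = cat.to_dataset_dict(zarr_kwargs={'consolidated': True})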
list(dset_dict.keys())
ds = dset_dict['CMIP.CCCma.CanESM5.historical.Oyr.gn']
ds
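The dictionary key used above, 'CMIP.CCCma.CanESM5.historical.Oyr.gn', is consistent with joining a fixed set of catalog columns with dots. As a rough sketch of that grouping step, the example below assumes the grouping columns are activity_id, institution_id, source_id, experiment_id, table_id and grid_label, and uses invented rows. The real aggregation in Intake-ESM also opens and combines the underlying data, which is not shown here.

```python
from collections import defaultdict

# Assumed grouping columns, inferred from the shape of the key used above.
GROUPBY_ATTRS = ["activity_id", "institution_id", "source_id",
                 "experiment_id", "table_id", "grid_label"]

# Hypothetical catalog rows; member_id distinguishes ensemble members.
rows = [
    {"activity_id": "CMIP", "institution_id": "CCCma", "source_id": "CanESM5",
     "experiment_id": "historical", "table_id": "Oyr", "grid_label": "gn",
     "member_id": "r1i1p1f1"},
    {"activity_id": "CMIP", "institution_id": "CCCma", "source_id": "CanESM5",
     "experiment_id": "historical", "table_id": "Oyr", "grid_label": "gn",
     "member_id": "r2i1p1f1"},
    {"activity_id": "ScenarioMIP", "institution_id": "CCCma", "source_id": "CanESM5",
     "experiment_id": "ssp585", "table_id": "Oyr", "grid_label": "gn",
     "member_id": "r1i1p1f1"},
]

# Rows that share the same values in all grouping columns end up under one key.
groups = defaultdict(list)
for row in rows:
    key = ".".join(row[attr] for attr in GROUPBY_ATTRS)
    groups[key].append(row["member_id"])

for key, members in groups.items():
    print(key, members)
```

In this sketch the two historical rows fall under a single key, which mirrors how several catalog entries can end up in one aggregated Dataset.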

Summary

In this notebook, we used Intake-ESM to open an Xarray Dataset for one particular model and experiment.

What’s next?

We will see an example of downloading a dataset with fsspec and zarr.

Resources and references