Overview¶
This notebook is an adpation of a workflow in the NCAR gallery of the Pangeo collection
This notebook illustrates how to compute surface ocean heat content using potential temperature data from CESM2 Large Ensemble Dataset (Community Earth System Model 2) hosted on NCAR’s GDEX.
This data is open access and is accessed via OSDF using the pelicanFS package and demonstrates how you can stream data from NCAR’s GDEX
Please refer to the first chapter of this cookbook to learn more about OSDF, pelican or pelicanFS
Prerequisites¶
| Concepts | Importance | Notes |
|---|---|---|
| Intro to Intake-ESM | Necessary | Used for searching CMIP6 data |
| Understanding of Zarr | Helpful | Familiarity with metadata structure |
| Matplotlib | Helpful | Package used for plotting |
| PelicanFS | Necessary | The python package used to stream data in this notebook |
| OSDF | Helpful | OSDF is used to stream data in this notebook |
Time to learn: 20 mins
Imports¶
import intake
import numpy as np
import pandas as pd
import xarray as xr
import seaborn as sns
import re
import matplotlib.pyplot as plt
import dask
from dask.distributed import LocalCluster
import pelicanfs
import cf_units as cf# Load Catalog URL
cat_url = 'https://osdata.gdex.ucar.edu/d010092/catalogs/d010092-osdf.json'Set up local dask cluster¶
Before we do any computation let us first set up a local cluster using dask
cluster = LocalCluster()
client = cluster.get_client()# Scale the cluster
n_workers = 5
cluster.scale(n_workers)
clusterData Loading¶
Load CESM2 LENS data from NCAR’s GDEX¶
Load CESM2 LENS zarr data from GDEx using an intake-ESM catalog that has OSDF links
For more details regarding the dataset. See, https://
gdex .ucar .edu /datasets /d010092 /#
col = intake.open_esm_datastore(cat_url)
col# Uncomment this line to see all the variables
# cesm_cat.df['variable'].valuescesm_temp = col.search(variable ='TEMP', frequency ='monthly')
cesm_tempcesm_temp.df['path'].valuesarray(['osdf:///ncar-gdex/d010092/ocn/monthly/cesm2LE-historical-cmip6-TEMP.zarr',
'osdf:///ncar-gdex/d010092/ocn/monthly/cesm2LE-ssp370-cmip6-TEMP.zarr',
'osdf:///ncar-gdex/d010092/ocn/monthly/cesm2LE-ssp370-smbb-TEMP.zarr'],
dtype=object)dsets_cesm = cesm_temp.to_dataset_dict()
--> The keys in the returned dictionary of datasets are constructed as follows:
'component.experiment.frequency.forcing_variant'
dsets_cesm.keys()dict_keys(['ocn.ssp370.monthly.smbb', 'ocn.ssp370.monthly.cmip6', 'ocn.historical.monthly.cmip6'])historical = dsets_cesm['ocn.historical.monthly.cmip6']
future_smbb = dsets_cesm['ocn.ssp370.monthly.smbb']
future_cmip6 = dsets_cesm['ocn.ssp370.monthly.cmip6']Change units¶
orig_units = cf.Unit(historical.z_t.attrs['units'])
orig_unitsUnit('centimeters')def change_units(ds, variable_str, variable_bounds_str, target_unit_str):
orig_units = cf.Unit(ds[variable_str].attrs['units'])
target_units = cf.Unit(target_unit_str)
variable_in_new_units = xr.apply_ufunc(orig_units.convert, ds[variable_bounds_str], target_units, dask='parallelized', output_dtypes=[ds[variable_bounds_str].dtype])
return variable_in_new_unitsdepth_levels_in_m = change_units(historical, 'z_t', 'z_t', 'm')
hist_temp_in_degK = change_units(historical, 'TEMP', 'TEMP', 'degK')
fut_cmip6_temp_in_degK = change_units(future_cmip6, 'TEMP', 'TEMP', 'degK')
fut_smbb_temp_in_degK = change_units(future_smbb, 'TEMP', 'TEMP', 'degK')
#
hist_temp_in_degK = hist_temp_in_degK.assign_coords(z_t=("z_t", depth_levels_in_m['z_t'].data))
hist_temp_in_degK["z_t"].attrs["units"] = "m"
hist_temp_in_degKdepth_levels_in_m.isel(z_t=slice(0, -1))Compute depth level deltas using the difference of z_t levels
depth_level_deltas = depth_levels_in_m.isel(z_t=slice(1, None)).values - depth_levels_in_m.isel(z_t=slice(0, -1)).values
# Optionally, if you want to keep it as an xarray DataArray, re-wrap the result
depth_level_deltas = xr.DataArray(depth_level_deltas, dims=["z_t"], coords={"z_t": depth_levels_in_m.z_t.isel(z_t=slice(0, -1))})
depth_level_deltas Compute Ocean Heat content for ocean surface¶
Ocean surface is considered to be the top 100m
The formula for this is:
Where H is ocean heat content, the value we are trying to calculate,
is the density of sea water, ,
is the specific heat of sea water, ,
is the depth limit of the calculation in meters,
and is the temperature at each depth in degrees Kelvin.
def calc_ocean_heat(delta_level, temperature):
rho = 1026 #kg/m^3
c_p = 3990 #J/(kg K)
weighted_temperature = delta_level * temperature
heat = weighted_temperature.sum(dim="z_t")*rho*c_p
return heat# Remember that the coordinate z_t still has values in cm
hist_temp_ocean_surface = hist_temp_in_degK.where(hist_temp_in_degK['z_t'] < 1e4,drop=True)
hist_temp_ocean_surfacedepth_level_deltas_surface = depth_level_deltas.where(depth_level_deltas['z_t'] <1e4, drop= True)
depth_level_deltas_surfacehist_ocean_heat = calc_ocean_heat(depth_level_deltas_surface,hist_temp_ocean_surface)
hist_ocean_heatPlot Ocean Heat¶
%%time
# Compute annual and ensemble mean
hist_oceanheat_ann_mean = hist_ocean_heat.mean('member_id').groupby('time.year').mean()
hist_oceanheat_ann_mean CPU times: user 348 ms, sys: 19 ms, total: 367 ms
Wall time: 352 ms
hist_oceanheat_ano = \
hist_oceanheat_ann_mean.sel(year=2014) - hist_oceanheat_ann_mean.sel(year=1850)%%time
hist_oceanheat_ano.plot()2026-06-04 04:16:49,276 - distributed.worker - ERROR - Compute Failed
Key: ('convert-convert_0-transpose-e6fb9c836fd2c42c20255265d80299c0', 47, 329, 0, 0, 0)
State: executing
Task: <Task ('convert-convert_0-transpose-e6fb9c836fd2c42c20255265d80299c0', 47, 329, 0, 0, 0) _execute_subgraph(...)>
Exception: 'ClientPayloadError("Response payload is not completed: <ContentLengthError: 400, message=\'Not enough data to satisfy content length header.\'>")'
Traceback: ' File "/home/runner/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/dask/array/core.py", line 140, in getter\n c = np.asarray(c)\n ^^^^^^^^^^^^^\n File "/home/runner/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/core/indexing.py", line 686, in __array__\n return np.asarray(self.get_duck_array(), dtype=dtype, copy=copy)\n ^^^^^^^^^^^^^^^^^^^^^\n File "/home/runner/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/core/indexing.py", line 691, in get_duck_array\n return self.array.get_duck_array()\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File "/home/runner/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/core/indexing.py", line 924, in get_duck_array\n return self.array.get_duck_array()\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File "/home/runner/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/coding/common.py", line 80, in get_duck_array\n return self.func(self.array.get_duck_array())\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File "/home/runner/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/core/indexing.py", line 764, in get_duck_array\n array = self.array[self.key]\n ~~~~~~~~~~^^^^^^^^^^\n File "/home/runner/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/backends/zarr.py", line 316, in __getitem__\n return indexing.explicit_indexing_adapter(\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File "/home/runner/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/core/indexing.py", line 1156, in explicit_indexing_adapter\n result = raw_indexing_method(raw_key.tuple)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File "/home/runner/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/backends/zarr.py", line 279, in _getitem\n return self._array[key]\n ~~~~~~~~~~~^^^^^\n File "/home/runner/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/zarr/core.py", line 798, in __getitem__\n result = self.get_orthogonal_selection(pure_selection, fields=fields)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File "/home/runner/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/zarr/core.py", line 1080, in get_orthogonal_selection\n return self._get_selection(indexer=indexer, out=out, fields=fields)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File "/home/runner/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/zarr/core.py", line 1343, in _get_selection\n self._chunk_getitems(\n File "/home/runner/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/zarr/core.py", line 2179, in _chunk_getitems\n cdatas = self.chunk_store.getitems(ckeys, contexts=contexts)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File "/home/runner/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/zarr/storage.py", line 1435, in getitems\n raise v\n File "/home/runner/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/fsspec/asyn.py", line 244, in _run_coro\n return await asyncio.wait_for(coro, timeout=timeout), i\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n File "/home/runner/micromamba/envs/osdf-cookbook/lib/python3.12/asyncio/tasks.py", line 520, in wait_for\n return await fut\n ^^^^^^^^^\n File "/home/runner/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/fsspec/implementations/http.py", line 249, in _cat_file\n out = await r.read()\n ^^^^^^^^^^^^^^\n File "/home/runner/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/aiohttp/client_reqrep.py", line 693, in read\n self._body = await self.content.read()\n ^^^^^^^^^^^^^^^^^^^^^^^^^\n File "/home/runner/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/aiohttp/streams.py", line 446, in read\n block = await self.readany()\n ^^^^^^^^^^^^^^^^^^^^\n File "/home/runner/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/aiohttp/streams.py", line 468, in readany\n await self._wait("readany")\n File "/home/runner/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/aiohttp/streams.py", line 372, in _wait\n await waiter\n'
CPU times: user 4.87 s, sys: 843 ms, total: 5.71 s
Wall time: 59.8 s
---------------------------------------------------------------------------
ContentLengthError Traceback (most recent call last)
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/aiohttp/client_proto.py:144, in connection_lost()
143 try:
--> 144 uncompleted = self._parser.feed_eof()
145 except Exception as underlying_exc:
File aiohttp/_http_parser.pyx:546
--> 546 def feed_eof(self):
547 cdef bytes desc
ContentLengthError: 400, message:
Not enough data to satisfy content length header.
The above exception was the direct cause of the following exception:
ClientPayloadError Traceback (most recent call last)
Cell In[23], line 1
----> 1 get_ipython().run_cell_magic('time', '', 'hist_oceanheat_ano.plot()\n')
File <timed eval>:1
----> 1 'Could not get source, probably due dynamically evaluated source code.'
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/plot/accessor.py:48, in DataArrayPlotAccessor.__call__(self, **kwargs)
46 @functools.wraps(dataarray_plot.plot, assigned=("__doc__", "__annotations__"))
47 def __call__(self, **kwargs) -> Any:
---> 48 return dataarray_plot.plot(self._da, **kwargs)
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/plot/dataarray_plot.py:279, in plot(darray, row, col, col_wrap, ax, hue, subplot_kws, **kwargs)
226 def plot(
227 darray: DataArray,
228 *,
(...) 235 **kwargs: Any,
236 ) -> Any:
237 """
238 Default plot of DataArray using :py:mod:`matplotlib:matplotlib.pyplot`.
239
(...) 275 xarray.DataArray.squeeze
276 """
277 darray = darray.squeeze(
278 d for d, s in darray.sizes.items() if s == 1 and d not in (row, col, hue)
--> 279 ).compute()
281 plot_dims = set(darray.dims)
282 plot_dims.discard(row)
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/core/dataarray.py:1247, in DataArray.compute(self, **kwargs)
1217 """Trigger loading data into memory and return a new dataarray.
1218
1219 Data will be computed and/or loaded from disk or a remote source.
(...) 1244 Variable.compute
1245 """
1246 new = self.copy(deep=False)
-> 1247 return new.load(**kwargs)
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/core/dataarray.py:1173, in DataArray.load(self, **kwargs)
1143 def load(self, **kwargs) -> Self:
1144 """Trigger loading data into memory and return this dataarray.
1145
1146 Data will be computed and/or loaded from disk or a remote source.
(...) 1171 Variable.load
1172 """
-> 1173 ds = self._to_temp_dataset().load(**kwargs)
1174 new = self._from_temp_dataset(ds)
1175 self._variable = new._variable
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/core/dataset.py:569, in Dataset.load(self, **kwargs)
565 if chunked_data:
566 chunkmanager = get_chunked_array_type(*chunked_data.values())
567
568 # evaluate all the chunked arrays simultaneously
--> 569 evaluated_data: tuple[np.ndarray[Any, Any], ...] = chunkmanager.compute(
570 *chunked_data.values(), **kwargs
571 )
572
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/namedarray/daskmanager.py:85, in DaskManager.compute(self, *data, **kwargs)
80 def compute(
81 self, *data: Any, **kwargs: Any
82 ) -> tuple[np.ndarray[Any, _DType_co], ...]:
83 from dask.array import compute
---> 85 return compute(*data, **kwargs)
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/dask/base.py:685, in compute(traverse, optimize_graph, scheduler, get, *args, **kwargs)
682 expr = expr.optimize()
683 keys = list(flatten(expr.__dask_keys__()))
--> 685 results = schedule(expr, keys, **kwargs)
687 return repack(results)
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/core/indexing.py:686, in __array__()
682 def __array__(
683 self, dtype: DTypeLike | None = None, /, *, copy: bool | None = None
684 ) -> np.ndarray:
685 if Version(np.__version__) >= Version("2.0.0"):
--> 686 return np.asarray(self.get_duck_array(), dtype=dtype, copy=copy)
687 else:
688 return np.asarray(self.get_duck_array(), dtype=dtype)
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/core/indexing.py:691, in get_duck_array()
690 def get_duck_array(self):
--> 691 return self.array.get_duck_array()
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/core/indexing.py:924, in get_duck_array()
923 def get_duck_array(self):
--> 924 return self.array.get_duck_array()
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/coding/common.py:80, in get_duck_array()
79 def get_duck_array(self):
---> 80 return self.func(self.array.get_duck_array())
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/core/indexing.py:764, in get_duck_array()
761 from xarray.backends.common import BackendArray
763 if isinstance(self.array, BackendArray):
--> 764 array = self.array[self.key]
765 else:
766 array = apply_indexer(self.array, self.key)
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/backends/zarr.py:316, in __getitem__()
314 elif isinstance(key, indexing.OuterIndexer):
315 method = self._oindex
--> 316 return indexing.explicit_indexing_adapter(
317 key, array.shape, indexing.IndexingSupport.VECTORIZED, method
318 )
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/core/indexing.py:1156, in explicit_indexing_adapter()
1134 """Support explicit indexing by delegating to a raw indexing method.
1135
1136 Outer and/or vectorized indexers are supported by indexing a second time
(...) 1153 Indexing result, in the form of a duck numpy-array.
1154 """
1155 raw_key, numpy_indices = decompose_indexer(key, shape, indexing_support)
-> 1156 result = raw_indexing_method(raw_key.tuple)
1157 if numpy_indices.tuple:
1158 # index the loaded duck array
1159 indexable = as_indexable(result)
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/xarray/backends/zarr.py:279, in _getitem()
278 def _getitem(self, key):
--> 279 return self._array[key]
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/zarr/core.py:798, in __getitem__()
796 result = self.vindex[selection]
797 elif is_pure_orthogonal_indexing(pure_selection, self.ndim):
--> 798 result = self.get_orthogonal_selection(pure_selection, fields=fields)
799 else:
800 result = self.get_basic_selection(pure_selection, fields=fields)
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/zarr/core.py:1080, in get_orthogonal_selection()
1077 # setup indexer
1078 indexer = OrthogonalIndexer(selection, self)
-> 1080 return self._get_selection(indexer=indexer, out=out, fields=fields)
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/zarr/core.py:1343, in _get_selection()
1340 if math.prod(out_shape) > 0:
1341 # allow storage to get multiple items at once
1342 lchunk_coords, lchunk_selection, lout_selection = zip(*indexer)
-> 1343 self._chunk_getitems(
1344 lchunk_coords,
1345 lchunk_selection,
1346 out,
1347 lout_selection,
1348 drop_axes=indexer.drop_axes,
1349 fields=fields,
1350 )
1351 if out.shape:
1352 return out
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/zarr/core.py:2179, in _chunk_getitems()
2177 if not isinstance(self._meta_array, np.ndarray):
2178 contexts = ConstantMap(ckeys, constant=Context(meta_array=self._meta_array))
-> 2179 cdatas = self.chunk_store.getitems(ckeys, contexts=contexts)
2181 for ckey, chunk_select, out_select in zip(ckeys, lchunk_selection, lout_selection):
2182 if ckey in cdatas:
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/zarr/storage.py:1435, in getitems()
1432 continue
1433 elif isinstance(v, Exception):
1434 # Raise any other exception
-> 1435 raise v
1436 else:
1437 # The function calling this method may not recognize the transformed
1438 # keys, so we send the values returned by self.map.getitems back into
1439 # the original key space.
1440 results[keys_transformed[k]] = v
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/fsspec/asyn.py:244, in _run_coro()
242 async def _run_coro(coro, i):
243 try:
--> 244 return await asyncio.wait_for(coro, timeout=timeout), i
245 except Exception as e:
246 if not return_exceptions:
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/asyncio/tasks.py:520, in wait_for()
517 raise TimeoutError from exc
519 async with timeouts.timeout(timeout):
--> 520 return await fut
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/fsspec/implementations/http.py:249, in _cat_file()
247 session = await self.set_session()
248 async with session.get(self.encode_url(url), **kw) as r:
--> 249 out = await r.read()
250 self._raise_not_found_for_status(r, url)
251 return out
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/aiohttp/client_reqrep.py:693, in read()
691 if self._body is None:
692 try:
--> 693 self._body = await self.content.read()
694 for trace in self._traces:
695 await trace.send_response_chunk_received(
696 self.method, self.url, self._body
697 )
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/aiohttp/streams.py:446, in read()
444 blocks = []
445 while True:
--> 446 block = await self.readany()
447 if not block:
448 break
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/aiohttp/streams.py:468, in readany()
464 # TODO: should be `if` instead of `while`
465 # because waiter maybe triggered on chunk end,
466 # without feeding any data
467 while not self._buffer and not self._eof:
--> 468 await self._wait("readany")
470 return self._read_nowait(-1)
File ~/micromamba/envs/osdf-cookbook/lib/python3.12/site-packages/aiohttp/streams.py:372, in _wait()
370 try:
371 with self._timer:
--> 372 await waiter
373 finally:
374 self._waiter = None
ClientPayloadError: Response payload is not completed: <ContentLengthError: 400, message='Not enough data to satisfy content length header.'>Indeed! The surface ocean is trapping heat as the globe warms!
Summary¶
In this notebook we used sea temperature data from the Community Earth System Model (CESM2) Large Ensemble dataset to compute surface ocean heat and convince ourselves that the surface ocean is trapping extra heat as the globe warms. We used an intake-ESM catalog backed by pelican links to stream data from NCAR’s Geoscience Data Exchange via NCAR’s OSDF origin!
What’s next?¶
In the next notebook, we will see how to load data from multiple OSDF origins into a workflow. We will stream CMIP6 model data from AWS and observational data from NCAR’s GDEX.
Resources and references¶
Original notebook on the pangeo NCAR gallery
CESM2 Large Ensemble Dataset (Community Earth System Model 2) hosted on NCAR’s GDEX.