Overview¶
- This notebook is an adaptation of a workflow from the NCAR gallery of the Pangeo collection.
- It illustrates how to compute surface ocean heat content using potential temperature data from the CESM2 (Community Earth System Model 2) Large Ensemble dataset hosted on NCAR’s RDA.
- The data is open access and is streamed over OSDF using the PelicanFS package, demonstrating how you can stream data directly from NCAR’s RDA.
- Please refer to the first chapter of this cookbook to learn more about OSDF, Pelican, and PelicanFS.
Prerequisites¶
| Concepts | Importance | Notes |
| --- | --- | --- |
| Intro to Intake-ESM | Necessary | Used for searching CMIP6 data |
| Understanding of Zarr | Helpful | Familiarity with metadata structure |
| Matplotlib | Helpful | Package used for plotting |
| PelicanFS | Necessary | The Python package used to stream data in this notebook |
| OSDF | Helpful | OSDF is used to stream data in this notebook |
- Time to learn: 20 mins
Imports¶
import intake
import numpy as np
import pandas as pd
import xarray as xr
import seaborn as sns
import re
import matplotlib.pyplot as plt
import dask
from dask.distributed import LocalCluster
import pelicanfs
import cf_units as cf
# Load Catalog URL
cat_url = 'https://stratus.rda.ucar.edu/d010092/catalogs/d010092-osdf-zarr-gdex.json'
Set up local dask cluster¶
Before we do any computation, let us first set up a local Dask cluster.
cluster = LocalCluster()
client = cluster.get_client()
# Scale the cluster
n_workers = 5
cluster.scale(n_workers)
cluster
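Dask's distributed client also exposes a diagnostic dashboard, which is handy for watching the computations later in this notebook; a minimal sketch:
# Print the dashboard URL so you can monitor tasks and memory usage
print(client.dashboard_link)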
Data Loading¶
Load CESM2 LENS data from NCAR’s RDA¶
- Load CESM2 LENS zarr data from RDA using an intake-ESM catalog that has OSDF links
- For more details regarding the dataset, see https://rda.ucar.edu/datasets/d010092/
col = intake.open_esm_datastore(cat_url)
col
# Uncomment the next line to see all the variables in the catalog
# col.df['variable'].values
cesm_temp = col.search(variable ='TEMP', frequency ='monthly')
cesm_temp
cesm_temp.df['path'].values
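Each path in the catalog is an OSDF link that fsspec resolves through PelicanFS. As a minimal sketch (assuming the paths are osdf:// URLs, as in this catalog), you could stream a single Zarr store directly with xarray; the to_dataset_dict call in the next cell does the equivalent for every matching asset.
# Open one Zarr store from the catalog directly; the osdf:// protocol is
# handled by PelicanFS once the pelicanfs package has been imported
single_path = cesm_temp.df['path'].values[0]
ds_single = xr.open_zarr(single_path)
ds_single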
dsets_cesm = cesm_temp.to_dataset_dict()
dsets_cesm.keys()
historical = dsets_cesm['ocn.historical.monthly.cmip6']
future_smbb = dsets_cesm['ocn.ssp370.monthly.smbb']
future_cmip6 = dsets_cesm['ocn.ssp370.monthly.cmip6']
Change units¶
orig_units = cf.Unit(historical.z_t.attrs['units'])
orig_units
def change_units(ds, variable_str, variable_bounds_str, target_unit_str):
    orig_units = cf.Unit(ds[variable_str].attrs['units'])
    target_units = cf.Unit(target_unit_str)
    variable_in_new_units = xr.apply_ufunc(
        orig_units.convert,
        ds[variable_bounds_str],
        target_units,
        dask='parallelized',
        output_dtypes=[ds[variable_bounds_str].dtype],
    )
    return variable_in_new_units
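Here is a minimal scalar sketch of the conversion that change_units performs under the hood (using centimeters, the native unit of z_t in this dataset, to meters):
# cf_units converts a value from one udunits-compatible unit to another;
# apply_ufunc in change_units simply applies this elementwise and lazily
cf.Unit('centimeters').convert(500.0, cf.Unit('m'))  # returns 5.0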
depth_levels_in_m = change_units(historical, 'z_t', 'z_t', 'm')
hist_temp_in_degK = change_units(historical, 'TEMP', 'TEMP', 'degK')
fut_cmip6_temp_in_degK = change_units(future_cmip6, 'TEMP', 'TEMP', 'degK')
fut_smbb_temp_in_degK = change_units(future_smbb, 'TEMP', 'TEMP', 'degK')
#
hist_temp_in_degK = hist_temp_in_degK.assign_coords(z_t=("z_t", depth_levels_in_m['z_t'].data))
hist_temp_in_degK["z_t"].attrs["units"] = "m"
hist_temp_in_degK
depth_levels_in_m.isel(z_t=slice(0, -1))
- Compute depth level deltas using the difference of z_t levels
depth_level_deltas = depth_levels_in_m.isel(z_t=slice(1, None)).values - depth_levels_in_m.isel(z_t=slice(0, -1)).values
# Optionally, if you want to keep it as an xarray DataArray, re-wrap the result
depth_level_deltas = xr.DataArray(depth_level_deltas, dims=["z_t"], coords={"z_t": depth_levels_in_m.z_t.isel(z_t=slice(0, -1))})
depth_level_deltas
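Alternatively, xarray's built-in diff method produces the same layer thicknesses in a single call; a minimal sketch (label='lower' keeps the shallower coordinate of each pair, matching the manual construction above):
# Same deltas via DataArray.diff; results are labeled with the shallower z_t level
depth_level_deltas_alt = depth_levels_in_m.diff('z_t', label='lower')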
Compute Ocean Heat content for ocean surface¶
- Ocean surface is considered to be the top 100m
- The formula for this is:
$$H = \rho \, c_p \int_{0}^{z_{max}} T(z)\, dz$$
where $H$ is the ocean heat content (the value we are trying to calculate), $\rho$ is the density of sea water ($1026\ \mathrm{kg\ m^{-3}}$), $c_p$ is the specific heat of sea water ($3990\ \mathrm{J\ kg^{-1}\ K^{-1}}$), $z_{max}$ is the depth limit of the calculation in meters, and $T(z)$ is the temperature at each depth in degrees Kelvin.
def calc_ocean_heat(delta_level, temperature):
    rho = 1026    # kg/m^3
    c_p = 3990    # J/(kg K)
    weighted_temperature = delta_level * temperature
    heat = weighted_temperature.sum(dim="z_t") * rho * c_p
    return heat
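As a quick sanity check of the formula, consider a hypothetical single column of ten 10 m layers at a uniform 300 K: H = 1026 × 3990 × 300 × 100 ≈ 1.23 × 10¹¹ J/m².
# Toy verification with synthetic data (not part of the CESM workflow)
toy_deltas = xr.DataArray(np.full(10, 10.0), dims='z_t')   # ten 10 m thick layers
toy_temp = xr.DataArray(np.full(10, 300.0), dims='z_t')    # uniform 300 K column
calc_ocean_heat(toy_deltas, toy_temp)                       # ≈ 1.23e11 J m^-2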
# Remember that the coordinate z_t still has values in cm
hist_temp_ocean_surface = hist_temp_in_degK.where(hist_temp_in_degK['z_t'] < 1e4, drop=True)
hist_temp_ocean_surface
depth_level_deltas_surface = depth_level_deltas.where(depth_level_deltas['z_t'] < 1e4, drop=True)
depth_level_deltas_surface
hist_ocean_heat = calc_ocean_heat(depth_level_deltas_surface, hist_temp_ocean_surface)
hist_ocean_heat
Plot Ocean Heat¶
%%time
# Compute annual and ensemble mean
hist_oceanheat_ann_mean = hist_ocean_heat.mean('member_id').groupby('time.year').mean()
hist_oceanheat_ann_mean
hist_oceanheat_ano = \
hist_oceanheat_ann_mean.sel(year=2014) - hist_oceanheat_ann_mean.sel(year=1850)
%%time
hist_oceanheat_ano.plot()
Indeed! The surface ocean is trapping heat as the globe warms!
Summary¶
In this notebook we used sea temperature data from the Community Earth System Model 2 (CESM2) Large Ensemble dataset to compute surface ocean heat content and convince ourselves that the surface ocean is trapping extra heat as the globe warms. We used an intake-ESM catalog backed by Pelican (OSDF) links to stream data from NCAR’s Research Data Archive via NCAR’s OSDF origin!
What’s next?¶
In the next notebook, we will see how to load data from multiple OSDF origins into a workflow. We will stream CMIP6 model data from AWS and observational data from RDA.
Resources and references¶
- Original notebook in the Pangeo NCAR gallery
- CESM2 Large Ensemble Dataset (Community Earth System Model 2) hosted on NCAR’s RDA.