Extreme SSTs¶
Overview¶
Extreme weather events, both atmospheric and oceanic, are increasing in frequency and intensity as a consequence of anthropogenic warming. The processes responsible for such events and their impacts on Caribbean lives remain to be well understood. Our Caribbeans for Climate community (a community of Caribbean-identified climate scientists, oceanographers, and practitioners) have created a cookbook analyzing Caribbean atmospheric and oceanic extreme weather variability from 1850-2015, using Coupled Model Intercomparison Project Phase 6 (CMIP6) data. In this notebook, we execute basic statistical analysis to investigate the linkages between extreme atmospheric and oceanic heat-related events and the possible causes behind them.
- Locate times when sea surface temperatures exceeded the 99th percentile threshold and persisted for 10 days (MHW events)
- Timeseries of ENSO 3.4 index overlaid on top of MHW events
- Run basic statistics to determine any correlation between the aforementioned metrics
Prerequisites¶
Some relavent knowledge on how to use certain packages (e.g. xarray
, matplotlib
) would be helpful to you in understanding this tutorial. If you need a refresher on how to employ these packages please refer to the Pythia Foundations page. Also, please refer to the Project Pythia’s CMIP6 cookbook page to familiarize yourself on how to ingest CMIP6 data.
Concepts | Importance | Notes |
---|---|---|
CMIP6 data structure | Necessary | Familiarity with query techniques |
Xarray | Necessary | Familiarity with manipulating data structure |
Matplotlib | Helpful | |
Project management | Helpful |
- Time to learn: estimate in minutes. For a rough idea, use 5 mins per subsection, 10 if longer; add these up for a total. Safer to round up and overestimate.
- System requirements:
- Populate with any system, version, or non-Python software requirements if necessary
- Otherwise use the concepts table above and the Imports section below to describe required packages as necessary
- If no extra requirements, remove the System requirements point altogether
Pip install regionmask
and xmip
packages into current environment¶
# !pip install git+https://github.com/mathause/regionmask.git
# !pip install git+https://github.com/jbusecke/xmip.git
We will need Dask to load in our data into memory to be able to run the rest of the notebook smoothly. Let’s initiate our Dask Client here.
To initiate your Client
, navigate to the Dask tab in the far-left sidebar and click on the
+New
button in the lower left of the sidebar window (you can Scale
to 8 workers to make the computation faster). Once the cluster is active, a <>
button will appear left of the Scale
button . Click on it and it will paste a cell with the Client code (like the one below) - run that and delete this old cell. Click on the Launch dashboard in JupyterLab
button and the Task Stream, Workers Memory and Progress windows will open. In order to get them to render you have to replace tcp://127.0.0.1:
with proxy
in the searchbar at the top of the sidebar window
import os
import sys
from dask_gateway import Gateway
from dask.distributed import Client
platform = sys.platform
if (platform == 'win32'):
import multiprocessing.popen_spawn_win32
else:
import multiprocessing.popen_spawn_posix
client = Client()
client
Imports¶
import numpy as np
import pandas as pd
import xarray as xr
import zarr
import fsspec
import regionmask
from xmip.preprocessing import combined_preprocessing
from xmip.regionmask import merged_mask
from xgcm import Grid
import cmocean as cm
from matplotlib import pyplot as plt
%matplotlib inline
plt.rcParams['figure.figsize'] = 12, 6
Marine Heat Waves¶
Marine heatwaves (MHWs) are characterized by extremely warm ocean temperatures that can span thousands of kilometers and last from days to months (Holbrook et al., 2019). Generally, they are defined as a MHW event when sea surface temperature (SST) exceed a certain threshold (typically, the 90th percentile threshold), and lasts for ≥5 consecutive days. These events can be devastating to marine life and the people that depend on maritime regions for sustenance and economy; such as fisheries and tribal communities. Despite their societal importance, however, a comprehensive understanding of the physical mechanisms driving MHWs in current and future climates does not yet exist.
In this section, we will look at SST data from 2000-present and identify times when SST anomalies are within the 99th percentile and last for ≥10 days. These will be marked as extreme MHW events and help us better identify any potential correlative relationship between extreme SSTs and hurricane incidences.
Holbrook, Neil J., Hillary A. Scannell, Alexander Sen Gupta, Jessica A. Benthuysen, Ming Feng, Eric C. J. Oliver, Lisa V. Alexander, et al. “A Global Assessment of Marine Heatwaves and Their Drivers.” Nature Communications 10, no. 1 (June 14, 2019): 2624. Holbrook et al. (2019).
Load in the data¶
df = pd.read_csv('https://storage.googleapis.com/cmip6/cmip6-zarr-consolidated-stores.csv')
df.head()
df_tos = df.query("table_id == 'Oday' & variable_id == 'tos' & experiment_id == 'historical'" +
"& institution_id == 'NOAA-GFDL' & source_id == 'GFDL-CM4'")
df_tos
#choose native grid
np.unique(df_tos['zstore'])
array(['gs://cmip6/CMIP6/CMIP/NOAA-GFDL/GFDL-CM4/historical/r1i1p1f1/Oday/tos/gn/v20180701/',
'gs://cmip6/CMIP6/CMIP/NOAA-GFDL/GFDL-CM4/historical/r1i1p1f1/Oday/tos/gr/v20180701/'],
dtype=object)
# get the path to a specific zarr store (the last row from the dataframe above)
zstore = df_tos['zstore'].values[-1]
print(zstore)
# create a mutable-mapping-style interface to the store
mapper = fsspec.get_mapper(zstore)
# open it using xarray and zarr
ds = xr.open_zarr(mapper, consolidated=True)
ds
Define the Caribbean Sea & Gulf of Mexico region¶
# load ocean basin data to use for masking purposes in the cells below
basins = regionmask.defined_regions.natural_earth_v4_1_0.ocean_basins_50
basins.plot(add_ocean=False, add_label=False)
/tmp/ipykernel_3921/2839286909.py:2: UserWarning: `natural_earth_v4_1_0.ocean_basins_50` does not quite extend to 180°E - it's recommended to use `natural_earth_v5_1_2.ocean_basins_50` instead. See https://github.com/regionmask/regionmask/issues/410.
basins = regionmask.defined_regions.natural_earth_v4_1_0.ocean_basins_50
<GeoAxes: >
/home/runner/micromamba/envs/cookbook-dev/lib/python3.13/site-packages/cartopy/io/__init__.py:241: DownloadWarning: Downloading: https://naturalearth.s3.amazonaws.com/110m_physical/ne_110m_coastline.zip
warnings.warn(f'Downloading: {url}', DownloadWarning)

basin_mask = merged_mask(basins, ds[['lat', 'lon']])
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[11], line 1
----> 1 basin_mask = merged_mask(basins, ds[['lat', 'lon']])
File ~/micromamba/envs/cookbook-dev/lib/python3.13/site-packages/xmip/regionmask.py:188, in merged_mask(basins, ds, lon_name, lat_name, merge_dict, verbose)
137 def merged_mask(
138 basins, ds, lon_name="lon", lat_name="lat", merge_dict=None, verbose=False
139 ):
140 """Combine geographical basins (from regionmask) to larger ocean basins.
141
142 Parameters
(...) 186
187 """
--> 188 mask = basins.mask(ds, lon_name=lon_name, lat_name=lat_name)
190 if merge_dict is None:
191 merge_dict = _default_merge_dict()
TypeError: Regions.mask() got an unexpected keyword argument 'lon_name'
basin_mask
basin_mask.plot(cmap='tab20')
#Mask the SST to keep only the Caribbean/Gulf of Mexico region -
#this excludes the Pacific ocean side of our boxed region
natl = 0
sst = ds.tos.where(basin_mask==natl)
carib = dict(x=slice(-98, -55), y=slice(8, 31), time=slice('1980', '2015'))
csst_unchunked = sst.sel(**carib)
#manipulated chunking to optimize run
csst = csst_unchunked.chunk({'time':-1, 'x':1,'y':1}).load()
csst
Warning
Shutdown cluster, you don't need it after loading in the data. Also, the kernel crashes in the cell computing the 99th quantile with an active cluster.client.close()
Plot timeseries¶
#Create a trend line to visualize the increase in SST over time
from scipy.stats import linregress
lr = linregress(np.arange(0,12775), csst.mean(['x', 'y']).fillna(0.))
trend = (lr[0]*np.arange(0, 12775) + lr[1])
csst.mean(['x', 'y']).plot()
plt.plot(csst.time, trend, color='k')
plt.ylim(22, 30)
plt.xlim(csst.time[0].values, csst.time[-1].values)
plt.xlabel('Time')
plt.ylabel('Temperature (˚C)')
plt.title('Sea Surface Temperature in the Caribbean region')
plt.grid();
Find the 99th percentile SST¶
sst99 = csst.quantile(0.99)
sst99.values
Meet the conditions: above 30.4 ˚C and >10 days¶
from scipy.ndimage import label
import flox
def _consecutive_mask(mask):
labels, _ = label(mask, structure=np.array([1, 1, 1]))
counts, _ = flox.groupby_reduce(labels, labels, func="count")
counts[0] = 0
return counts[labels]
def consecutive_mask(mask):
return xr.apply_ufunc(
_consecutive_mask,
mask,
input_core_dims=[["time"]],
output_core_dims=[["time"]],
vectorize=True,
dask="parallelized",
output_dtypes=["uint16"],
)
group_lengths = consecutive_mask(csst >= 30.4)
longer_than_10_days = group_lengths >= 10
mhws = csst.where(longer_than_10_days)
time_series = csst.where(longer_than_10_days).mean(["x", "y"])
mhws
time_series
Plot timeseries of extreme SSTs¶
ticks = np.append(mhws.time[0::5*365].values,mhws.time[-1].values)
tick_labels = ['1980', '1985', '1990', '1995', '2000', '2005', '2010', '2015']
time_series.plot()
plt.ylim(30.3, 31.65)
plt.xlim(csst.time[0].values, csst.time[-1].values)
plt.xticks(ticks, tick_labels)
plt.xlabel('Time')
plt.ylabel('Temperature (˚C)')
plt.title('MHW events in the Caribbean region')
plt.grid();
#create a land mask for plotting purposes
land_mask = ds.tos.sel(**carib).isel(time=0).isnull()
def plot_mhw(time):
land_mask.plot(cmap='Greys', add_colorbar=False)
mhws.sel(time=time).where(~land_mask).squeeze().plot.contourf(levels=np.arange(27, 31, .2),
cmap=cm.cm.thermal, vmin=25)
plt.xlabel('Longitude')
plt.ylabel('Latitude')
plt.title(f'Extreme SST on {time}')
plt.grid();
#play around with different dates to see spatial extent of some of these extreme ssts
time = '2012-07-15'
plot_mhw(time)
El Niño 3.4 index¶
This section is borrowed from one of Project Pythia’s Intro to Pandas notebook
# Install the ENSO data from Pythia's data library
! pip install pythia_datasets
import pandas as pd
from pythia_datasets import DATASETS
filepath = DATASETS.fetch('enso_data.csv')
df = pd.read_csv(filepath, index_col=0, parse_dates=True)
df
#Select the Nino3.4 data column
oni_time_unmatched = df['Nino34anom']
oni_pd = oni_time_unmatched.loc['1982':'2014']
oni = oni_pd.to_xarray()
mhws_cut = mhws.sel(time=slice('1982', '2015')).resample(time='MS').mean('time')
mhws_cut.shape
oni.shape
oni.plot(color='k')
plt.fill_between(oni.datetime, oni.where(oni>=0), color='xkcd:watermelon')
plt.fill_between(oni.datetime, oni.where(oni<=0), color='xkcd:light navy')
plt.hlines(0.5, xmin=oni.datetime[0], xmax=oni.datetime[-1], color='xkcd:deep red')
plt.hlines(-0.5, xmin=oni.datetime[0], xmax=oni.datetime[-1], color='xkcd:royal blue')
plt.xlim(oni.datetime[0], oni.datetime[-1])
plt.grid()
plt.ylabel('Anomalous Temperature (˚C)')
plt.title('Niño3.4 Index');
#match oni time dim/coord to mhws_cut obj
oni_nc = xr.DataArray(data=oni.values, coords={'time':mhws_cut.time}, name='Nino34anom')
fig, ax1 = plt.subplots()
ticks = mhws_cut.time[0::4*12].values
tick_labels = ['1982', '1986', '1990', '1994', '1998', '2002', '2006', '2010', '2014']
ax1.set_xlabel('Time')
ax1.tick_params(axis='y')
ax1.set_ylabel('Niño 3.3 Index', color='k') # we already handled the x-label with ax1
ax1.plot(oni_nc.time, oni.values, color='k')
ax1.tick_params(axis='y', labelcolor='k')
ax1.fill_between(oni_nc.time.values, oni_nc.where(oni_nc>=0), color='xkcd:watermelon')
ax1.fill_between(oni_nc.time.values, oni_nc.where(oni_nc<=0), color='xkcd:light navy')
ax1.hlines(0.5, xmin=oni_nc.time[0].values, xmax=oni_nc.time[-1].values, color='xkcd:deep red')
ax1.hlines(-0.5, xmin=oni_nc.time[0].values, xmax=oni_nc.time[-1].values, color='xkcd:royal blue')
ax1.grid()
ax1.set_xlim(mhws_cut.time[0].values, mhws_cut.time[-1].values)
ax1.set_xticks(ticks, tick_labels)
ax1.set_title('MHW events in the Caribbean region against Nino3.4 Index', fontsize=16)
ax2 = ax1.twinx() # instantiate a second Axes that shares the same x-axis
ax2.set_ylabel('Temperature (˚C)', color='xkcd:steel blue')#, rotation=270)
ax2.plot(mhws_cut.time, mhws_cut.mean(['x', 'y']), color='xkcd:steel blue')
fig.tight_layout() # otherwise the right y-label is slightly clipped
plt.show()
Summary¶
In this notebook we used CMIP6 NOAA/GFDL CM4 data to identify the 99th percentile heat extremes that persisted for 10 days. We also plotted the Nino3.4 index and overlaid that timeseries with the extreme SST timeseries. (Future work includes running basic correlation analysis to discern any relationship with ENSO and the extreme SSTs)
What’s next?¶
Next, we will look at environmental changes in the CMIP6 HighResMIP simulations and calculate changes in genesis potential index, an indicator of tropical cyclone activity for the Caribbean region.
- Holbrook, N. J., Scannell, H. A., Sen Gupta, A., Benthuysen, J. A., Feng, M., Oliver, E. C. J., Alexander, L. V., Burrows, M. T., Donat, M. G., Hobday, A. J., Moore, P. J., Perkins-Kirkpatrick, S. E., Smale, D. A., Straub, S. C., & Wernberg, T. (2019). A global assessment of marine heatwaves and their drivers. Nature Communications, 10(1). 10.1038/s41467-019-10206-z