
A coccolithophore, a type of phytoplankton. Art credit: Kristen Krumhardt
Overview¶
Phytoplankton are single-celled, photosynthesizing organisms found throughout the global ocean. Though there are many different species of phytoplankton, CESM-MARBL groups them into four categories called functional types: small phytoplankton, diatoms (which build silica-based shells), coccolithophores (which build calcium carbonate-based shells), and diazotrophs (which fix nitrogen). In this notebook, we evaluate the biomass and total production of these phytoplankton in different areas, as modeled by CESM-MARBL.
General setup
Subsetting
Taking a quick look
Processing - long-term mean
Mapping biomass at different depths
Mapping productivity
Compare NPP to satellite observations
Prerequisites¶
| Concepts | Importance | Notes |
|---|---|---|
| Matplotlib | Necessary | |
| Intro to Cartopy | Necessary | |
| Dask Cookbook | Helpful | |
| Intro to Xarray | Helpful |
Time to learn: 30 min
Imports¶
import xarray as xr
import glob
import numpy as np
import matplotlib.pyplot as plt
import cartopy
import cartopy.crs as ccrs
import pop_tools
from dask.distributed import LocalCluster
import s3fs
from datetime import datetime
from module import adjust_pop_grid/home/runner/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/pop_tools/__init__.py:4: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
from pkg_resources import DistributionNotFound, get_distribution
General setup (see intro notebooks for explanations)¶
Connect to cluster¶
cluster = LocalCluster()
client = cluster.get_client()/home/runner/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/distributed/node.py:187: UserWarning: Port 8787 is already in use.
Perhaps you already have a cluster running?
Hosting the HTTP server on port 42659 instead
warnings.warn(
Bring in POP grid utilities¶
ds_grid = pop_tools.get_grid('POP_gx1v7')
lons = ds_grid.TLONG
lats = ds_grid.TLAT
depths = ds_grid.z_t * 0.01Downloading file 'inputdata/ocn/pop/gx1v7/grid/horiz_grid_20010402.ieeer8' from 'https://svn-ccsm-inputdata.cgd.ucar.edu/trunk/inputdata/ocn/pop/gx1v7/grid/horiz_grid_20010402.ieeer8' to '/home/runner/.pop_tools'.
---------------------------------------------------------------------------
ConnectionRefusedError Traceback (most recent call last)
File ~/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/urllib3/connection.py:198, in HTTPConnection._new_conn(self)
197 try:
--> 198 sock = connection.create_connection(
199 (self._dns_host, self.port),
200 self.timeout,
201 source_address=self.source_address,
202 socket_options=self.socket_options,
203 )
204 except socket.gaierror as e:
File ~/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/urllib3/util/connection.py:85, in create_connection(address, timeout, source_address, socket_options)
84 try:
---> 85 raise err
86 finally:
87 # Break explicitly a reference cycle
File ~/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/urllib3/util/connection.py:73, in create_connection(address, timeout, source_address, socket_options)
72 sock.bind(source_address)
---> 73 sock.connect(sa)
74 # Break explicitly a reference cycle
ConnectionRefusedError: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
NewConnectionError Traceback (most recent call last)
File ~/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/urllib3/connectionpool.py:787, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, preload_content, decode_content, **response_kw)
786 # Make the request on the HTTPConnection object
--> 787 response = self._make_request(
788 conn,
789 method,
790 url,
791 timeout=timeout_obj,
792 body=body,
793 headers=headers,
794 chunked=chunked,
795 retries=retries,
796 response_conn=response_conn,
797 preload_content=preload_content,
798 decode_content=decode_content,
799 **response_kw,
800 )
802 # Everything went great!
File ~/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/urllib3/connectionpool.py:488, in HTTPConnectionPool._make_request(self, conn, method, url, body, headers, retries, timeout, chunked, response_conn, preload_content, decode_content, enforce_content_length)
487 new_e = _wrap_proxy_error(new_e, conn.proxy.scheme)
--> 488 raise new_e
490 # conn.request() calls http.client.*.request, not the method in
491 # urllib3.request. It also calls makefile (recv) on the socket.
File ~/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/urllib3/connectionpool.py:464, in HTTPConnectionPool._make_request(self, conn, method, url, body, headers, retries, timeout, chunked, response_conn, preload_content, decode_content, enforce_content_length)
463 try:
--> 464 self._validate_conn(conn)
465 except (SocketTimeout, BaseSSLError) as e:
File ~/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/urllib3/connectionpool.py:1093, in HTTPSConnectionPool._validate_conn(self, conn)
1092 if conn.is_closed:
-> 1093 conn.connect()
1095 # TODO revise this, see https://github.com/urllib3/urllib3/issues/2791
File ~/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/urllib3/connection.py:753, in HTTPSConnection.connect(self)
752 sock: socket.socket | ssl.SSLSocket
--> 753 self.sock = sock = self._new_conn()
754 server_hostname: str = self.host
File ~/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/urllib3/connection.py:213, in HTTPConnection._new_conn(self)
212 except OSError as e:
--> 213 raise NewConnectionError(
214 self, f"Failed to establish a new connection: {e}"
215 ) from e
217 sys.audit("http.client.connect", self, self.host, self.port)
NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7f655c774050>: Failed to establish a new connection: [Errno 111] Connection refused
The above exception was the direct cause of the following exception:
MaxRetryError Traceback (most recent call last)
File ~/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/requests/adapters.py:644, in HTTPAdapter.send(self, request, stream, timeout, verify, cert, proxies)
643 try:
--> 644 resp = conn.urlopen(
645 method=request.method,
646 url=url,
647 body=request.body,
648 headers=request.headers,
649 redirect=False,
650 assert_same_host=False,
651 preload_content=False,
652 decode_content=False,
653 retries=self.max_retries,
654 timeout=timeout,
655 chunked=chunked,
656 )
658 except (ProtocolError, OSError) as err:
File ~/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/urllib3/connectionpool.py:841, in HTTPConnectionPool.urlopen(self, method, url, body, headers, retries, redirect, assert_same_host, timeout, pool_timeout, release_conn, chunked, body_pos, preload_content, decode_content, **response_kw)
839 new_e = ProtocolError("Connection aborted.", new_e)
--> 841 retries = retries.increment(
842 method, url, error=new_e, _pool=self, _stacktrace=sys.exc_info()[2]
843 )
844 retries.sleep()
File ~/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/urllib3/util/retry.py:519, in Retry.increment(self, method, url, response, error, _pool, _stacktrace)
518 reason = error or ResponseError(cause)
--> 519 raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
521 log.debug("Incremented Retry for (url='%s'): %r", url, new_retry)
MaxRetryError: HTTPSConnectionPool(host='svn-ccsm-inputdata.cgd.ucar.edu', port=443): Max retries exceeded with url: /trunk/inputdata/ocn/pop/gx1v7/grid/horiz_grid_20010402.ieeer8 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f655c774050>: Failed to establish a new connection: [Errno 111] Connection refused'))
During handling of the above exception, another exception occurred:
ConnectionError Traceback (most recent call last)
Cell In[3], line 1
----> 1 ds_grid = pop_tools.get_grid('POP_gx1v7')
2 lons = ds_grid.TLONG
3 lats = ds_grid.TLAT
File ~/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/pop_tools/grid.py:137, in get_grid(grid_name, scrip)
134 nlon = grid_attrs['lateral_dims'][1]
136 # read horizontal grid
--> 137 horiz_grid_fname = INPUTDATA.fetch(grid_attrs['horiz_grid_fname'], downloader=downloader)
138 grid_file_data = np.fromfile(horiz_grid_fname, dtype='>f8', count=-1)
139 grid_file_data = grid_file_data.reshape((7, nlat, nlon))
File ~/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/pop_tools/grid.py:92, in fetch(self, fname, processor, downloader)
89 if downloader is None:
90 downloader = pooch.downloaders.choose_downloader(url)
---> 92 pooch.core.stream_download(url, full_path, known_hash, downloader, pooch=self)
94 if processor is not None:
95 return processor(str(full_path), action, self)
File ~/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/pooch/core.py:807, in stream_download(url, fname, known_hash, downloader, pooch, retry_if_failed)
803 try:
804 # Stream the file to a temporary so that we can safely check its
805 # hash before overwriting the original.
806 with temporary_file(path=str(fname.parent)) as tmp:
--> 807 downloader(url, tmp, pooch)
808 hash_matches(tmp, known_hash, strict=True, source=str(fname.name))
809 shutil.move(tmp, str(fname))
File ~/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/pooch/downloaders.py:220, in HTTPDownloader.__call__(self, url, output_file, pooch, check_only)
218 # pylint: enable=consider-using-with
219 try:
--> 220 response = requests.get(url, timeout=timeout, **kwargs)
221 response.raise_for_status()
222 content = response.iter_content(chunk_size=self.chunk_size)
File ~/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/requests/api.py:73, in get(url, params, **kwargs)
62 def get(url, params=None, **kwargs):
63 r"""Sends a GET request.
64
65 :param url: URL for the new :class:`Request` object.
(...) 70 :rtype: requests.Response
71 """
---> 73 return request("get", url, params=params, **kwargs)
File ~/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/requests/api.py:59, in request(method, url, **kwargs)
55 # By using the 'with' statement we are sure the session is closed, thus we
56 # avoid leaving sockets open which can trigger a ResourceWarning in some
57 # cases, and look like a memory leak in others.
58 with sessions.Session() as session:
---> 59 return session.request(method=method, url=url, **kwargs)
File ~/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/requests/sessions.py:589, in Session.request(self, method, url, params, data, headers, cookies, files, auth, timeout, allow_redirects, proxies, hooks, stream, verify, cert, json)
584 send_kwargs = {
585 "timeout": timeout,
586 "allow_redirects": allow_redirects,
587 }
588 send_kwargs.update(settings)
--> 589 resp = self.send(prep, **send_kwargs)
591 return resp
File ~/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/requests/sessions.py:703, in Session.send(self, request, **kwargs)
700 start = preferred_clock()
702 # Send the request
--> 703 r = adapter.send(request, **kwargs)
705 # Total elapsed time of the request (approximately)
706 elapsed = preferred_clock() - start
File ~/micromamba/envs/ocean-bgc-cookbook-dev/lib/python3.13/site-packages/requests/adapters.py:677, in HTTPAdapter.send(self, request, stream, timeout, verify, cert, proxies)
673 if isinstance(e.reason, _SSLError):
674 # This branch is for urllib3 v1.22 and later.
675 raise SSLError(e, request=request)
--> 677 raise ConnectionError(e, request=request)
679 except ClosedPoolError as e:
680 raise ConnectionError(e, request=request)
ConnectionError: HTTPSConnectionPool(host='svn-ccsm-inputdata.cgd.ucar.edu', port=443): Max retries exceeded with url: /trunk/inputdata/ocn/pop/gx1v7/grid/horiz_grid_20010402.ieeer8 (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f655c774050>: Failed to establish a new connection: [Errno 111] Connection refused'))Load the data¶
jetstream_url = 'https://js2.jetstream-cloud.org:8001/'
s3 = s3fs.S3FileSystem(anon=True, client_kwargs=dict(endpoint_url=jetstream_url))
# Generate a list of all files in CESM folder
s3path = 's3://pythia/ocean-bgc/cesm/g.e22.GOMIPECOIAF_JRA-1p4-2018.TL319_g17.4p2z.002branch/ocn/proc/tseries/month_1/*'
remote_files = s3.glob(s3path)
s3.invalidate_cache()
# Open all files from folder
fileset = [s3.open(file) for file in remote_files]
# Open with xarray
ds = xr.open_mfdataset(fileset, data_vars="minimal", coords='minimal', compat="override", parallel=True,
drop_variables=["transport_components", "transport_regions", 'moc_components'], decode_times=True)
dsSubsetting¶
variables =['diatC', 'coccoC','spC','diazC',
'photoC_TOT_zint',
'photoC_sp_zint','photoC_diat_zint',
'photoC_diaz_zint','photoC_cocco_zint']
keep_vars=['z_t','z_t_150m','dz','time_bound', 'time', 'TAREA','TLAT','TLONG'] + variables
ds = ds.drop_vars([v for v in ds.variables if v not in keep_vars])Taking a quick look¶
Let’s plot the biomass of coccolithophores as a first look. These plots show snapshots six months apart - note the difference between seasons! Also take a look at the increased concentrations of coccolithophores in the Southern Ocean during Southern-hemisphere summer; the increased concentrations of calcite caused by these plankton building calcite shells leads to this region being known as the Great Calcite Belt.
ds.coccoC.isel(time=0,z_t_150m=0).plot()ds.coccoC.isel(time=6,z_t_150m=0).plot()Processing - long-term mean¶
Pull in the function we defined in the nutrients notebook...
def year_mean(ds):
"""
Properly convert monthly data to annual means, taking into account month lengths.
Source: https://ncar.github.io/esds/posts/2021/yearly-averages-xarray/
"""
# Make a DataArray with the number of days in each month, size = len(time)
month_length = ds.time.dt.days_in_month
# Calculate the weights by grouping by 'time.year'
weights = (
month_length.groupby("time.year") / month_length.groupby("time.year").sum()
)
# Test that the sum of the year for each season is 1.0
np.testing.assert_allclose(weights.groupby("time.year").sum().values, np.ones((len(ds.groupby("time.year")), )))
# Calculate the weighted average
return (ds * weights).groupby("time.year").sum(dim="time")
Take the long-term mean of our data set. We process monthly to annual with our custom function, then use xarray’s built-in .mean() function to process from annual data to a single mean over time, since each year is the same length.
ds_ann = year_mean(ds)ds = ds_ann.mean("year")ds['spC'].isel(z_t_150m=0).plot()Mapping biomass at different depths¶
Note the different colorbar scales on each of these maps!
Phytoplankton biomass at the surface¶
######
fig = plt.figure(figsize=(8,10))
ax = fig.add_subplot(4,1,1, projection=ccrs.Robinson(central_longitude=305.0))
# spC stands for "small phytoplankton carbon"
ax.set_title('spC at surface', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats, ds.spC.isel(z_t_150m=0))
pc=ax.pcolormesh(lon, lat, field, cmap='Greens',vmin=0,vmax=1,transform=ccrs.PlateCarree())
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='spC (mmol m$^{-3}$)')
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)
ax = fig.add_subplot(4,1,2, projection=ccrs.Robinson(central_longitude=305.0))
# diatC stands for "diatom carbon"
ax.set_title('diatC at surface', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats, ds.diatC.isel(z_t_150m=0))
pc=ax.pcolormesh(lon, lat, field, cmap='Blues',vmin=0,vmax=4,transform=ccrs.PlateCarree())
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='diatC (mmol m$^{-3}$)')
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)
ax = fig.add_subplot(4,1,3, projection=ccrs.Robinson(central_longitude=305.0))
# coccoC stands for "coccolithophore carbon"
ax.set_title('coccoC at surface', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats, ds.coccoC.isel(z_t_150m=0))
pc=ax.pcolormesh(lon, lat, field, cmap='Reds',vmin=0,vmax=1,transform=ccrs.PlateCarree())
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='coccoC (mmol m$^{-3}$)')
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)
ax = fig.add_subplot(4,1,4, projection=ccrs.Robinson(central_longitude=305.0))
# diazC stands for "diazotroph carbon"
ax.set_title('diazC at surface', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats, ds.diazC.isel(z_t_150m=0))
pc=ax.pcolormesh(lon, lat, field, cmap='Oranges',vmin=0,vmax=0.1,transform=ccrs.PlateCarree())
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='diazC (mmol m$^{-3}$)')
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)Phytoplankton biomass at 100m¶
######
fig = plt.figure(figsize=(8,10))
ax = fig.add_subplot(4,1,1, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('spC at 100m', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats, ds.spC.isel(z_t_150m=9))
pc=ax.pcolormesh(lon, lat, field, cmap='Greens',vmin=0,vmax=0.4,transform=ccrs.PlateCarree())
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='spC (mmol m$^{-3}$)')
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)
ax = fig.add_subplot(4,1,2, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('diatC at 100m', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats, ds.diatC.isel(z_t_150m=9))
pc=ax.pcolormesh(lon, lat, field, cmap='Blues',vmin=0,vmax=0.4,transform=ccrs.PlateCarree())
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='diatC (mmol m$^{-3}$)')
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)
ax = fig.add_subplot(4,1,3, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('coccoC at 100m', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats, ds.coccoC.isel(z_t_150m=9))
pc=ax.pcolormesh(lon, lat, field, cmap='Reds',vmin=0,vmax=0.2,transform=ccrs.PlateCarree())
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='coccoC (mmol m$^{-3}$)')
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)
ax = fig.add_subplot(4,1,4, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('diazC at 100m', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats, ds.diazC.isel(z_t_150m=9))
pc=ax.pcolormesh(lon, lat, field, cmap='Oranges',vmin=0,vmax=0.2,transform=ccrs.PlateCarree())
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='diazC (mmol m$^{-3}$)')
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)Mapping productivity¶
fig = plt.figure(figsize=(8,10))
ax = fig.add_subplot(4,1,1, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('Small phytoplankton production', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats, ds.photoC_sp_zint * 864.)
pc=ax.pcolormesh(lon, lat, field, cmap='Greens',vmin=0,vmax=30,transform=ccrs.PlateCarree())
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='sp prod (mmol m$^{-2}$ d$^{-1}$)')
ax = fig.add_subplot(4,1,2, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('Diatom production', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats, ds.photoC_diat_zint * 864.)
pc=ax.pcolormesh(lon, lat, field, cmap='Blues',vmin=0,vmax=30,transform=ccrs.PlateCarree())
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='diat prod (mmol m$^{-2}$ d$^{-1}$)')
ax = fig.add_subplot(4,1,3, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('Diazotroph production', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats, ds.photoC_diaz_zint * 864.)
pc=ax.pcolormesh(lon, lat, field, cmap='Reds',vmin=0,vmax=5,transform=ccrs.PlateCarree())
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='diaz prod (mmol m$^{-2}$ d$^{-1}$)')
ax = fig.add_subplot(4,1,4, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('Coccolithophore production', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats, ds.photoC_cocco_zint * 864.)
pc=ax.pcolormesh(lon, lat, field, cmap='Oranges',vmin=0,vmax=5,transform=ccrs.PlateCarree())
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='cocco prod (mmol m$^{-2}$ d$^{-1}$)');fig = plt.figure(figsize=(12,5))
ax = fig.add_subplot(1,1,1, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('Total NPP', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats, ds.photoC_TOT_zint*864.)
pc=ax.pcolormesh(lon, lat, field, cmap='Greens',vmin=0,vmax=60,transform=ccrs.PlateCarree())
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='NPP (mmol m$^{-2}$ d$^{-1}$)');Globally integrated NPP¶
def global_mean(ds, ds_grid, compute_vars, normalize=True, include_ms=False):
"""
Compute the global mean on a POP dataset.
Return computed quantity in conventional units.
"""
other_vars = list(set(ds.variables) - set(compute_vars))
# note TAREA is in cm^2, which affects units
if include_ms: # marginal seas!
surface_mask = ds_grid.TAREA.where(ds_grid.KMT > 0).fillna(0.)
else:
surface_mask = ds_grid.TAREA.where(ds_grid.REGION_MASK > 0).fillna(0.)
masked_area = {
v: surface_mask.where(ds[v].notnull()).fillna(0.)
for v in compute_vars
}
with xr.set_options(keep_attrs=True):
dso = xr.Dataset({
v: (ds[v] * masked_area[v]).sum(['nlat', 'nlon'])
for v in compute_vars
})
if normalize:
dso = xr.Dataset({
v: dso[v] / masked_area[v].sum(['nlat', 'nlon'])
for v in compute_vars
})
return dsods_glb = global_mean(ds, ds_grid, variables,normalize=False).compute()
# convert from nmol C/s to Pg C/yr
nmols_to_PgCyr = 1e-9 * 12. * 1e-15 * 365. * 86400.
for v in variables:
ds_glb[v] = ds_glb[v] * nmols_to_PgCyr
ds_glb[v].attrs['units'] = 'Pg C yr$^{-1}$'
ds_glbComparing to NPP satellite data¶
We load in a satellite-derived estimate of NPP, calculated with the VGPM algorithm (Behrenfeld and Falkowski, 1997). This data can be found at this website; we’ve re-uploaded a portion of it for easier access. It was originally provided in the format of HDF4 files; we have converted these to NetCDF files to make reading in data from the cloud more straightforward, but some additional processing is still required to format the time and space coordinates correctly before we can work with the data.
s3path = 's3://pythia/ocean-bgc/obs/vgpm/*.nc'
remote_files = s3.glob(s3path)
s3.invalidate_cache()
# Open all files from bucket
fileset = [s3.open(file) for file in remote_files]
Let’s try reading in one of these files to see what the format looks like.
test_ds = xr.open_dataset(fileset[0])
test_dsall_single_ds = []
for file in fileset:
ds_singlefile = xr.open_dataset(file)
timestr = ds_singlefile["band_data"].attrs["Start Time String"]
format_data = "%m/%d/%Y %H:%M:%S"
ds_singlefile["time"] = datetime.strptime(timestr, format_data)
all_single_ds.append(ds_singlefile)
ds_sat = xr.concat(all_single_ds, dim="time")
ds_satNow we have a time dimension! Let’s try plotting the data to see what else we need to fix.
ds_sat.band_data.isel(time=0, band=0).plot()There are a few things going on here. The data is upside down from a more common map projection, and the spatial coordinates are a generic x and y rather than latitude and longitude. The color scale also doesn’t look right because areas like land that should be masked out are showing up as a low negative value, throwing off the positive values we actually want to see. We also have an extra band coordinate in the dataset - probably a holdover from the satellite data this product was generated from, but no longer giving us useful information. In the next block, we fix these problems.
Preliminary processing¶
# fix coords
ds_sat = ds_sat.rename(name_dict={"x": "lon", "y": "lat", "band_data": "NPP"})
ds_sat["lon"] = (ds_sat.lon/6 + 180) % 360
ds_sat = ds_sat.sortby(ds_sat.lon)
ds_sat["lat"] = (ds_sat.lat/6 - 90)[::-1]
# mask values
ds_sat = ds_sat.where(ds_sat.NPP != -9999.)
# get rid of extra dimensions
ds_sat = ds_sat.squeeze(dim="band", drop=True)
ds_sat = ds_sat.drop_vars("spatial_ref")
# make NPP units match previous dataset
ds_sat["NPP"] = ds_sat.NPP / 12.01
ds_sat["NPP"] = ds_sat.NPP.assign_attrs(
units="mmol m-2 day-1")
ds_satds_sat.NPP.isel(time=0).plot(vmin=0, vmax=60)ds_satfig = plt.figure(figsize=(12,5))
ax = fig.add_subplot(1,1,1, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('NPP in January 2010', fontsize=12)
pc=ax.pcolormesh(ds_sat.lon, ds_sat.lat, ds_sat.NPP.isel(time=0), cmap='Greens',vmin=0,vmax=60,transform=ccrs.PlateCarree())
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='NPP (mmol m$^{-2}$ d$^{-1}$)');Making a comparison map¶
Now let’s process in time. Use the monthly to annual function that we made before.
ds_sat_ann = year_mean(ds_sat)ds_sat_timemean = ds_sat_ann.mean("year")ds_sat_timemeanfig = plt.figure(figsize=(16,5))
fig.suptitle("NPP, mean over 2010-2019")
ax = fig.add_subplot(1,2,1, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('CESM (Model)', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats, ds.photoC_TOT_zint*864.)
pc=ax.pcolormesh(lon, lat, field, cmap='Greens',vmin=0,vmax=60,transform=ccrs.PlateCarree())
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)
ax = fig.add_subplot(1,2,2, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('VGPM (Satellite-based algorithm)', fontsize=12)
pc=ax.pcolormesh(ds_sat_timemean.lon, ds_sat_timemean.lat, ds_sat_timemean.NPP, cmap='Greens',vmin=0,vmax=60,transform=ccrs.PlateCarree())
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)
fig.subplots_adjust(right=0.8)
cbar_ax = fig.add_axes([0.85, 0.15, 0.02, 0.7])
fig.colorbar(pc, cax=cbar_ax, label='NPP (mmol m$^{-2}$ d$^{-1}$)')
plt.show()
And close the Dask cluster we spun up at the beginning.
cluster.close()Summary¶
You’ve learned how to take a look at a few quantities related to phytoplankton in CESM, as well as processing an observation-derived dataset in a different format.
Resources and references¶
Sarmiento and Gruber Chapter 4: Organic Matter Production (see Phytoplankton in Section 4.2)
- Behrenfeld, M. J., & Falkowski, P. G. (1997). Photosynthetic rates derived from satellite‐based chlorophyll concentration. Limnology and Oceanography, 42(1), 1–20. 10.4319/lo.1997.42.1.0001
- (2013). In Ocean Biogeochemical Dynamics (pp. 102–172). Princeton University Press. 10.2307/j.ctt3fgxqx.7