
Phytoplankton biomass

A coccolithophore, a type of phytoplankton. Art credit: Kristen Krumhardt


Overview

Phytoplankton are single-celled, photosynthesizing organisms found throughout the global ocean. Though there are many different species of phytoplankton, CESM-MARBL groups them into four categories called functional types: small phytoplankton, diatoms (which build silica-based shells), coccolithophores (which build calcium carbonate-based shells), and diazotrophs (which fix nitrogen). In this notebook, we evaluate the biomass and total production of these phytoplankton in different areas, as modeled by CESM-MARBL.

  1. General setup
  2. Subsetting
  3. Taking a quick look
  4. Processing - long-term mean
  5. Mapping biomass at different depths
  6. Mapping productivity
  7. Comparing to NPP satellite data

Prerequisites

Concepts            Importance    Notes
Matplotlib          Necessary
Intro to Cartopy    Necessary
Dask Cookbook       Helpful
Intro to Xarray     Helpful
  • Time to learn: 30 min

Imports

import xarray as xr
import glob
import numpy as np
import matplotlib.pyplot as plt
import cartopy
import cartopy.crs as ccrs
import pop_tools
from dask.distributed import LocalCluster
import s3fs
from datetime import datetime

from module import adjust_pop_grid

General setup (see intro notebooks for explanations)

Connect to cluster

cluster = LocalCluster()
client = cluster.get_client()

Bring in POP grid utilities

ds_grid = pop_tools.get_grid('POP_gx1v7')
lons = ds_grid.TLONG
lats = ds_grid.TLAT
depths = ds_grid.z_t * 0.01  # convert depth from cm to m

Load the data

jetstream_url = 'https://js2.jetstream-cloud.org:8001/'

s3 = s3fs.S3FileSystem(anon=True, client_kwargs=dict(endpoint_url=jetstream_url))

# Generate a list of all files in CESM folder
s3path = 's3://pythia/ocean-bgc/cesm/g.e22.GOMIPECOIAF_JRA-1p4-2018.TL319_g17.4p2z.002branch/ocn/proc/tseries/month_1/*'
remote_files = s3.glob(s3path)
s3.invalidate_cache()

# Open all files from folder
fileset = [s3.open(file) for file in remote_files]

# Open with xarray; data_vars="minimal", coords="minimal", and compat="override"
# keep the merge cheap by only concatenating variables that have a time dimension
ds = xr.open_mfdataset(fileset, data_vars="minimal", coords='minimal', compat="override", parallel=True,
                       drop_variables=["transport_components", "transport_regions", 'moc_components'], decode_times=True)

ds

Subsetting

variables = ['diatC', 'coccoC', 'spC', 'diazC',
             'photoC_TOT_zint',
             'photoC_sp_zint', 'photoC_diat_zint',
             'photoC_diaz_zint', 'photoC_cocco_zint']
keep_vars = ['z_t', 'z_t_150m', 'dz', 'time_bound', 'time', 'TAREA', 'TLAT', 'TLONG'] + variables
ds = ds.drop_vars([v for v in ds.variables if v not in keep_vars])
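
For reference, these variable names map onto the functional types described in the overview as follows (a convenience mapping for this notebook, not something defined by MARBL itself):

# Biomass variable for each phytoplankton functional type (mmol C m^-3)
phyto_vars = {
    "small phytoplankton": "spC",
    "diatoms": "diatC",
    "coccolithophores": "coccoC",
    "diazotrophs": "diazC",
}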

Taking a quick look

Let’s plot the biomass of coccolithophores as a first look. These plots show snapshots taken six months apart - note the difference between seasons! Also note the elevated coccolithophore concentrations in the Southern Ocean during Southern Hemisphere summer: the calcite these plankton produce when building their shells is why this region is known as the Great Calcite Belt.

ds.coccoC.isel(time=0,z_t_150m=0).plot()
ds.coccoC.isel(time=6,z_t_150m=0).plot()
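
To see the seasonal contrast more directly, we can difference the two snapshots (a quick sketch; with monthly output starting in January, time indices 0 and 6 correspond to January and July of the first model year):

# July minus January surface coccolithophore carbon; positive values
# indicate more biomass during Northern Hemisphere summer
cocco_seasonal_diff = ds.coccoC.isel(time=6, z_t_150m=0) - ds.coccoC.isel(time=0, z_t_150m=0)
cocco_seasonal_diff.plot(cmap='RdBu_r', vmin=-1, vmax=1)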

Processing - long-term mean

Pull in the function we defined in the nutrients notebook...

def year_mean(ds):
    """
    Properly convert monthly data to annual means, taking into account month lengths.
    Source: https://ncar.github.io/esds/posts/2021/yearly-averages-xarray/
    """
    
    # Make a DataArray with the number of days in each month, size = len(time)
    month_length = ds.time.dt.days_in_month

    # Calculate the weights by grouping by 'time.year'
    weights = (
        month_length.groupby("time.year") / month_length.groupby("time.year").sum()
    )

    # Test that the weights for each year sum to 1.0
    np.testing.assert_allclose(weights.groupby("time.year").sum().values, np.ones((len(ds.groupby("time.year")), )))

    # Calculate the weighted average
    return (ds * weights).groupby("time.year").sum(dim="time")
    

Take the long-term mean of our data set. We process monthly to annual with our custom function, then use xarray’s built-in .mean() to go from annual data to a single mean over time, since each model year has the same length.

ds_ann = year_mean(ds)
ds = ds_ann.mean("year")
ds['spC'].isel(z_t_150m=0).plot()

Mapping biomass at different depths

Note the different colorbar scales on each of these maps!

Phytoplankton biomass at the surface

# Surface biomass maps for the four phytoplankton functional types
fig = plt.figure(figsize=(8,10))

ax = fig.add_subplot(4,1,1, projection=ccrs.Robinson(central_longitude=305.0))
# spC stands for "small phytoplankton carbon"
ax.set_title('spC at surface', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats,  ds.spC.isel(z_t_150m=0))
pc=ax.pcolormesh(lon, lat, field, cmap='Greens',vmin=0,vmax=1,transform=ccrs.PlateCarree())
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='spC (mmol m$^{-3}$)')
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)


ax = fig.add_subplot(4,1,2, projection=ccrs.Robinson(central_longitude=305.0))
# diatC stands for "diatom carbon"
ax.set_title('diatC at surface', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats,  ds.diatC.isel(z_t_150m=0))
pc=ax.pcolormesh(lon, lat, field, cmap='Blues',vmin=0,vmax=4,transform=ccrs.PlateCarree())
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='diatC (mmol m$^{-3}$)')
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)


ax = fig.add_subplot(4,1,3, projection=ccrs.Robinson(central_longitude=305.0))
# coccoC stands for "coccolithophore carbon"
ax.set_title('coccoC at surface', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats,  ds.coccoC.isel(z_t_150m=0))
pc=ax.pcolormesh(lon, lat, field, cmap='Reds',vmin=0,vmax=1,transform=ccrs.PlateCarree())
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='coccoC (mmol m$^{-3}$)')
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)

ax = fig.add_subplot(4,1,4, projection=ccrs.Robinson(central_longitude=305.0))
# diazC stands for "diazotroph carbon"
ax.set_title('diazC at surface', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats,  ds.diazC.isel(z_t_150m=0))
pc=ax.pcolormesh(lon, lat, field, cmap='Oranges',vmin=0,vmax=0.1,transform=ccrs.PlateCarree())
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='diazC (mmol m$^{-3}$)')
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)

Phytoplankton biomass at 100m

# Biomass at ~100 m: z_t_150m index 9 is the model layer centered at 95 m
fig = plt.figure(figsize=(8,10))


ax = fig.add_subplot(4,1,1, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('spC at 100m', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats,  ds.spC.isel(z_t_150m=9))
pc=ax.pcolormesh(lon, lat, field, cmap='Greens',vmin=0,vmax=0.4,transform=ccrs.PlateCarree())
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='spC (mmol m$^{-3}$)')
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)

ax = fig.add_subplot(4,1,2, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('diatC at 100m', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats,  ds.diatC.isel(z_t_150m=9))
pc=ax.pcolormesh(lon, lat, field, cmap='Blues',vmin=0,vmax=0.4,transform=ccrs.PlateCarree())
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='diatC (mmol m$^{-3}$)')
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)

ax = fig.add_subplot(4,1,3, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('coccoC at 100m', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats,  ds.coccoC.isel(z_t_150m=9))
pc=ax.pcolormesh(lon, lat, field, cmap='Reds',vmin=0,vmax=0.2,transform=ccrs.PlateCarree())
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='coccoC (mmol m$^{-3}$)')
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)

ax = fig.add_subplot(4,1,4, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('diazC at 100m', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats,  ds.diazC.isel(z_t_150m=9))
pc=ax.pcolormesh(lon, lat, field, cmap='Oranges',vmin=0,vmax=0.2,transform=ccrs.PlateCarree())
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='diazC (mmol m$^{-3}$)')
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)

Mapping productivity

fig = plt.figure(figsize=(8,10))

ax = fig.add_subplot(4,1,1, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('Small phytoplankton production', fontsize=12)
# photoC_*_zint is a depth-integrated rate in mmol m^-3 cm s^-1; multiplying by
# 864 (= 0.01 m/cm * 86400 s/day) converts it to mmol m^-2 d^-1
lon, lat, field = adjust_pop_grid(lons, lats, ds.photoC_sp_zint * 864.)
pc=ax.pcolormesh(lon, lat, field, cmap='Greens',vmin=0,vmax=30,transform=ccrs.PlateCarree())
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)

cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='sp prod (mmol m$^{-2}$ d$^{-1}$)')

ax = fig.add_subplot(4,1,2, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('Diatom production', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats, ds.photoC_diat_zint * 864.)
pc=ax.pcolormesh(lon, lat, field, cmap='Blues',vmin=0,vmax=30,transform=ccrs.PlateCarree())
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)

cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='diat prod (mmol m$^{-2}$ d$^{-1}$)')

ax = fig.add_subplot(4,1,3, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('Diazotroph production', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats, ds.photoC_diaz_zint * 864.)
pc=ax.pcolormesh(lon, lat, field, cmap='Reds',vmin=0,vmax=5,transform=ccrs.PlateCarree())
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)

cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='diaz prod (mmol m$^{-2}$ d$^{-1}$)')

ax = fig.add_subplot(4,1,4, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('Coccolithophore production', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats, ds.photoC_cocco_zint * 864.)
pc=ax.pcolormesh(lon, lat, field, cmap='Oranges',vmin=0,vmax=5,transform=ccrs.PlateCarree())
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)

cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='cocco prod (mmol m$^{-2}$ d$^{-1}$)');

fig = plt.figure(figsize=(12,5))

ax = fig.add_subplot(1,1,1, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('Total NPP', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats,  ds.photoC_TOT_zint*864.)
pc=ax.pcolormesh(lon, lat, field, cmap='Greens',vmin=0,vmax=60,transform=ccrs.PlateCarree())
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='NPP (mmol m$^{-2}$ d$^{-1}$)');

Globally integrated NPP

def global_mean(ds, ds_grid, compute_vars, normalize=True, include_ms=False):
    """
    Compute the global mean on a POP dataset. 
    Return computed quantity in conventional units.
    """

    # note TAREA is in cm^2, which affects units

    if include_ms: # marginal seas!
        surface_mask = ds_grid.TAREA.where(ds_grid.KMT > 0).fillna(0.)
    else:
        surface_mask = ds_grid.TAREA.where(ds_grid.REGION_MASK > 0).fillna(0.)        
    
    masked_area = {
        v: surface_mask.where(ds[v].notnull()).fillna(0.) 
        for v in compute_vars
    }
    
    with xr.set_options(keep_attrs=True):
        
        dso = xr.Dataset({
            v: (ds[v] * masked_area[v]).sum(['nlat', 'nlon'])
            for v in compute_vars
        })
        
        if normalize:
            dso = xr.Dataset({
                v: dso[v] / masked_area[v].sum(['nlat', 'nlon'])
                for v in compute_vars
            })            
                
    return dso

ds_glb = global_mean(ds, ds_grid, variables, normalize=False).compute()

# convert from nmol C/s to Pg C/yr
nmols_to_PgCyr = 1e-9 * 12. * 1e-15 * 365. * 86400.

for v in variables:
    ds_glb[v] = ds_glb[v] * nmols_to_PgCyr        
    ds_glb[v].attrs['units'] = 'Pg C yr$^{-1}$'
    
ds_glb
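
To pull out just the depth-integrated production terms, we can loop over them and print the globally integrated totals (a small sketch; the exact numbers depend on the simulation):

# Print the globally integrated production terms in Pg C / yr
for v in ['photoC_TOT_zint', 'photoC_sp_zint', 'photoC_diat_zint',
          'photoC_diaz_zint', 'photoC_cocco_zint']:
    print(f"{v}: {float(ds_glb[v]):.2f} {ds_glb[v].attrs['units']}")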

Comparing to NPP satellite data

We load in a satellite-derived estimate of NPP, calculated with the VGPM algorithm (Behrenfeld and Falkowski, 1997). The original data is distributed through the Ocean Productivity website; we’ve re-uploaded a portion of it for easier access. It was originally provided as HDF4 files; we have converted these to NetCDF files to make reading in data from the cloud more straightforward, but some additional processing is still required to format the time and space coordinates correctly before we can work with the data.

s3path = 's3://pythia/ocean-bgc/obs/vgpm/*.nc'

remote_files = s3.glob(s3path)
s3.invalidate_cache()

# Open all files from bucket
fileset = [s3.open(file) for file in remote_files]

Let’s try reading in one of these files to see what the format looks like.

test_ds = xr.open_dataset(fileset[0])

test_ds

all_single_ds = []

for file in fileset:
    ds_singlefile = xr.open_dataset(file)
    timestr = ds_singlefile["band_data"].attrs["Start Time String"]
    format_data = "%m/%d/%Y %H:%M:%S"
    ds_singlefile["time"] = datetime.strptime(timestr, format_data)
    all_single_ds.append(ds_singlefile)

ds_sat = xr.concat(all_single_ds, dim="time")
    
ds_sat
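
Depending on the order in which glob listed the files, the concatenated time axis may not come out sorted; sorting by time is a cheap safeguard (an optional step, assuming the filenames encode consecutive months):

# Make sure the time axis is monotonically increasing
ds_sat = ds_sat.sortby("time")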

Now we have a time dimension! Let’s try plotting the data to see what else we need to fix.

ds_sat.band_data.isel(time=0, band=0).plot()

There are a few things going on here. The data is upside down relative to the usual map orientation, and the spatial coordinates are a generic x and y rather than latitude and longitude. The color scale also doesn’t look right: areas like land that should be masked out show up as a large negative fill value, throwing off the positive values we actually want to see. We also have an extra band coordinate in the dataset - probably a holdover from the satellite data this product was generated from, but no longer giving us useful information. In the next block, we fix these problems.

Preliminary processing

# fix coords
ds_sat = ds_sat.rename(name_dict={"x": "lon", "y": "lat", "band_data": "NPP"})
# the grid is 1/6 degree, so dividing the pixel index by 6 converts it to degrees;
# the rest shifts longitudes into a 0-360 range and flips latitude so north is up
ds_sat["lon"] = (ds_sat.lon/6 + 180) % 360
ds_sat = ds_sat.sortby(ds_sat.lon)
ds_sat["lat"] = (ds_sat.lat/6 - 90)[::-1]

# mask fill values (-9999. marks land and missing data)
ds_sat = ds_sat.where(ds_sat.NPP != -9999.)

# get rid of extra dimensions
ds_sat = ds_sat.squeeze(dim="band", drop=True)
ds_sat = ds_sat.drop_vars("spatial_ref")

# make NPP units match the model dataset: VGPM NPP is in mg C m^-2 day^-1,
# so dividing by the molar mass of carbon (12.01 g/mol) gives mmol C m^-2 day^-1
ds_sat["NPP"] = ds_sat.NPP / 12.01
ds_sat["NPP"] = ds_sat.NPP.assign_attrs(
    units="mmol m-2 day-1")
ds_sat

ds_sat.NPP.isel(time=0).plot(vmin=0, vmax=60)

fig = plt.figure(figsize=(12,5))

ax = fig.add_subplot(1,1,1, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('NPP in January 2010', fontsize=12)
pc=ax.pcolormesh(ds_sat.lon, ds_sat.lat, ds_sat.NPP.isel(time=0), cmap='Greens',vmin=0,vmax=60,transform=ccrs.PlateCarree())
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)
cbar1 = fig.colorbar(pc, ax=ax,extend='max',label='NPP (mmol m$^{-2}$ d$^{-1}$)');

Making a comparison map

Now let’s process in time. Use the monthly to annual function that we made before.

ds_sat_ann = year_mean(ds_sat)
ds_sat_timemean = ds_sat_ann.mean("year")
ds_sat_timemean

fig = plt.figure(figsize=(16,5))

fig.suptitle("NPP, mean over 2010-2019")

ax = fig.add_subplot(1,2,1, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('CESM (Model)', fontsize=12)
lon, lat, field = adjust_pop_grid(lons, lats,  ds.photoC_TOT_zint*864.)
pc=ax.pcolormesh(lon, lat, field, cmap='Greens',vmin=0,vmax=60,transform=ccrs.PlateCarree())
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)

ax = fig.add_subplot(1,2,2, projection=ccrs.Robinson(central_longitude=305.0))
ax.set_title('VGPM (Satellite-based algorithm)', fontsize=12)
pc=ax.pcolormesh(ds_sat_timemean.lon, ds_sat_timemean.lat, ds_sat_timemean.NPP, cmap='Greens',vmin=0,vmax=60,transform=ccrs.PlateCarree())
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)

fig.subplots_adjust(right=0.8)
cbar_ax = fig.add_axes([0.85, 0.15, 0.02, 0.7])
fig.colorbar(pc, cax=cbar_ax, label='NPP (mmol m$^{-2}$ d$^{-1}$)')
plt.show()
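
As a rough quantitative companion to the maps, we can compute an area-weighted global mean of the satellite NPP on its regular latitude-longitude grid (a sketch using cosine-of-latitude weights; the model analogue would use the global_mean function above with normalize=True):

# Area-weighted global mean of satellite NPP; on a regular grid, cell area is
# proportional to cos(latitude), and NaNs over land are skipped by default
weights = np.cos(np.deg2rad(ds_sat_timemean.lat))
npp_sat_global = ds_sat_timemean.NPP.weighted(weights).mean(("lat", "lon"))
print(f"Satellite-based global mean NPP: {float(npp_sat_global):.1f} mmol m-2 d-1")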

And close the Dask cluster we spun up at the beginning.

cluster.close()

Summary

You’ve learned how to examine several quantities related to phytoplankton in CESM output, and how to process an observation-derived dataset that arrives in a different format.

References
  1. Behrenfeld, M. J., & Falkowski, P. G. (1997). Photosynthetic rates derived from satellite-based chlorophyll concentration. Limnology and Oceanography, 42(1), 1–20. 10.4319/lo.1997.42.1.0001
  2. Sarmiento, J. L., & Gruber, N. (2013). In Ocean Biogeochemical Dynamics (pp. 102–172). Princeton University Press. 10.2307/j.ctt3fgxqx.7