Ocean iron concentration


Overview

Iron is one of the most important micronutrients in the ocean, limiting production in some areas. In this notebook, we make a map of iron concentration and compare it to observational data.

  1. General setup

  2. Subsetting

  3. Processing - long-term mean

  4. Comparing to observational data

Prerequisites

Concepts

Importance

Notes

Matplotlib

Necessary

Intro to Cartopy

Necessary

Dask Cookbook

Helpful

Intro to Xarray

Helpful

  • Time to learn: 15 min


Imports

import xarray as xr
import glob
import numpy as np
import matplotlib.pyplot as plt
import cartopy
import cartopy.crs as ccrs
import pop_tools
from dask.distributed import LocalCluster
import pandas as pd
import s3fs
import netCDF4

from module import adjust_pop_grid

General setup (see intro notebooks for explanations)

Connect to cluster

cluster = LocalCluster()
client = cluster.get_client()

Bring in POP grid utilities

ds_grid = pop_tools.get_grid('POP_gx1v7')
lons = ds_grid.TLONG
lats = ds_grid.TLAT
depths = ds_grid.z_t * 0.01

Load the data

jetstream_url = 'https://js2.jetstream-cloud.org:8001/'

s3 = s3fs.S3FileSystem(anon=True, client_kwargs=dict(endpoint_url=jetstream_url))

# Generate a list of all files in CESM folder
s3path = 's3://pythia/ocean-bgc/cesm/g.e22.GOMIPECOIAF_JRA-1p4-2018.TL319_g17.4p2z.002branch/ocn/proc/tseries/month_1/*'
remote_files = s3.glob(s3path)

# Open all files from folder
fileset = [s3.open(file) for file in remote_files]

# Open with xarray
ds = xr.open_mfdataset(fileset, data_vars="minimal", coords='minimal', compat="override", parallel=True,
                       drop_variables=["transport_components", "transport_regions", 'moc_components'], decode_times=True)

ds
<xarray.Dataset> Size: 28GB
Dimensions:                         (nlat: 384, nlon: 320, time: 120, z_t: 60,
                                     z_t_150m: 15)
Coordinates:
    TLAT                            (nlat, nlon) float64 983kB dask.array<chunksize=(384, 320), meta=np.ndarray>
    TLONG                           (nlat, nlon) float64 983kB dask.array<chunksize=(384, 320), meta=np.ndarray>
  * time                            (time) object 960B 2010-01-16 12:00:00 .....
  * z_t                             (z_t) float32 240B 500.0 ... 5.375e+05
  * z_t_150m                        (z_t_150m) float32 60B 500.0 ... 1.45e+04
Dimensions without coordinates: nlat, nlon
Data variables: (12/45)
    FG_CO2                          (time, nlat, nlon) float32 59MB dask.array<chunksize=(60, 192, 160), meta=np.ndarray>
    Fe                              (time, z_t, nlat, nlon) float32 4GB dask.array<chunksize=(30, 15, 96, 80), meta=np.ndarray>
    NO3                             (time, z_t, nlat, nlon) float32 4GB dask.array<chunksize=(30, 15, 96, 80), meta=np.ndarray>
    PO4                             (time, z_t, nlat, nlon) float32 4GB dask.array<chunksize=(30, 15, 96, 80), meta=np.ndarray>
    POC_FLUX_100m                   (time, nlat, nlon) float32 59MB dask.array<chunksize=(60, 192, 160), meta=np.ndarray>
    SALT                            (time, z_t, nlat, nlon) float32 4GB dask.array<chunksize=(30, 15, 96, 80), meta=np.ndarray>
    ...                              ...
    sp_Fe_lim_Cweight_avg_100m      (time, nlat, nlon) float32 59MB dask.array<chunksize=(60, 192, 160), meta=np.ndarray>
    sp_Fe_lim_surf                  (time, nlat, nlon) float32 59MB dask.array<chunksize=(60, 192, 160), meta=np.ndarray>
    sp_N_lim_Cweight_avg_100m       (time, nlat, nlon) float32 59MB dask.array<chunksize=(60, 192, 160), meta=np.ndarray>
    sp_N_lim_surf                   (time, nlat, nlon) float32 59MB dask.array<chunksize=(60, 192, 160), meta=np.ndarray>
    sp_P_lim_Cweight_avg_100m       (time, nlat, nlon) float32 59MB dask.array<chunksize=(60, 192, 160), meta=np.ndarray>
    sp_P_lim_surf                   (time, nlat, nlon) float32 59MB dask.array<chunksize=(60, 192, 160), meta=np.ndarray>

Subsetting

variables =['Fe']
keep_vars=['z_t','z_t_150m','dz','time_bound', 'time','TAREA','TLAT','TLONG'] + variables
ds = ds.drop_vars([v for v in ds.variables if v not in keep_vars])
ds.Fe.isel(time=0,z_t=0).plot()
/home/runner/miniconda3/envs/ocean-bgc-cookbook-dev/lib/python3.12/site-packages/distributed/client.py:3362: UserWarning: Sending large graph of size 10.01 MiB.
This may cause some slowdown.
Consider loading the data with Dask directly
 or using futures or delayed objects to embed the data into the graph without repetition.
See also https://docs.dask.org/en/stable/best-practices.html#load-data-with-dask for more information.
  warnings.warn(
<matplotlib.collections.QuadMesh at 0x7ff730864410>
../_images/3c8b071c7e142550f12dd170060670c94dbf1fe703148c2c9a087cabda88b82b.png

Processing - long-term mean

Pull in the function we defined in the nutrients notebook…

def year_mean(ds):
    """
    Properly convert monthly data to annual means, taking into account month lengths.
    Source: https://ncar.github.io/esds/posts/2021/yearly-averages-xarray/
    """
    
    # Make a DataArray with the number of days in each month, size = len(time)
    month_length = ds.time.dt.days_in_month

    # Calculate the weights by grouping by 'time.year'
    weights = (
        month_length.groupby("time.year") / month_length.groupby("time.year").sum()
    )

    # Test that the sum of the year for each season is 1.0
    np.testing.assert_allclose(weights.groupby("time.year").sum().values, np.ones((len(ds.groupby("time.year")), )))

    # Calculate the weighted average
    return (ds * weights).groupby("time.year").sum(dim="time")
    

Take the long-term mean of our data set. We process monthly to annual data with our custom function, then use xarray’s built-in .mean() function to process from annual data to a single mean over time, since each year is the same length.

ds = year_mean(ds).mean("year")
nmolcm3_to_nM = 1.e3
ds['Fe'] = ds['Fe'] * nmolcm3_to_nM
ds['Fe'].attrs['units'] = 'nM'

Compare to observational Fe database

This dataset includes observations from the following papers:

as collected by Long et al., 2021 – see this paper for details.

fe_obs_path = 's3://pythia/ocean-bgc/obs/dFe-database-2021-05-20.csv'

fe_obs = s3.open(fe_obs_path)

df = pd.read_csv(fe_obs, na_values=-999.).dropna(axis=0, how='all')

df
lon lat depth dFe_obs
0 210.010 -16.0018 20.0 0.540000
1 210.010 -16.0018 35.0 0.440000
2 210.010 -16.0019 50.0 0.480000
3 210.010 -16.0019 80.0 0.400000
4 210.010 -16.0020 100.0 0.390000
... ... ... ... ...
27777 160.051 47.0032 3929.6 0.825681
27778 160.051 47.0032 3929.8 0.902248
27779 160.051 47.0032 4900.4 0.555630
27780 160.051 47.0032 4900.9 0.621851
27781 160.051 47.0032 5210.1 0.573220

27782 rows × 4 columns

fig = plt.figure(figsize=(16,5))

fig.suptitle("Surface iron concentration comparison")

### CESM
ax = fig.add_subplot(1,2,1, projection=ccrs.Robinson(central_longitude=305.0))
lon, lat, field = adjust_pop_grid(lons, lats,  ds.Fe.isel(z_t=0))
pc=ax.pcolormesh(lon, lat, field, cmap='plasma',vmin=0,vmax=3,transform=ccrs.PlateCarree())
land = cartopy.feature.NaturalEarthFeature('physical', 'land', scale='110m', edgecolor='k', facecolor='white', linewidth=0.5)
ax.add_feature(land)
ax.set_title('CESM Fe at surface', fontsize=10)


### obs
ax = fig.add_subplot(1,2,2, projection=ccrs.Robinson(central_longitude=305.0))
ax.coastlines('10m',linewidth=0.5)
ax.set_title('Obs Fe at surface', fontsize=10)


df_sub = df.loc[(df.depth <= 2.5)]
sc = ax.scatter(df_sub.lon, df_sub.lat, c=df_sub.dFe_obs.values,
                cmap='plasma',
                vmin=0, vmax=3, 
                transform=ccrs.PlateCarree())

fig.subplots_adjust(right=0.8)
cbar_ax = fig.add_axes([0.85, 0.15, 0.02, 0.7])
fig.colorbar(pc, cax=cbar_ax, label='Fe (nM)')
<matplotlib.colorbar.Colorbar at 0x7ff740678680>
/home/runner/miniconda3/envs/ocean-bgc-cookbook-dev/lib/python3.12/site-packages/cartopy/io/__init__.py:241: DownloadWarning: Downloading: https://naturalearth.s3.amazonaws.com/10m_physical/ne_10m_coastline.zip
  warnings.warn(f'Downloading: {url}', DownloadWarning)
../_images/ff328a64e48fd6e1337b0c8d976dadfd8610e1786c25dc62fcf0dfcec93ce2cb.png

And close the Dask cluster we spun up at the beginning.

cluster.close()

Summary

You’ve learned how to make a map of ocean iron and compare it to observations.

Resources and references