Along Track Altimetry Analysis
Overview
Using CNES altimetry data
Visualizing data using hvplot
Using xhistogram to plot multidimensional data
Prerequisites
| Concepts | Importance | Notes |
|---|---|---|
| | Helpful | |
| | Helpful | Matplotlib knowledge also helpful |
| | Helpful | |
| | Helpful | |
Time to learn: 15 minutes
Imports
import fsspec
import xarray as xr
import numpy as np
import hvplot
import hvplot.dask
import hvplot.pandas
import hvplot.xarray
from xhistogram.xarray import histogram
from intake import open_catalog
Load Data
The analysis-ready along-track altimetry data were prepared by CNES. They are cataloged in the Pangeo Cloud Data Catalog: https://catalog.pangeo.io/browse/master/ocean/altimetry/
We will work with Jason-3, which appears in the catalog as j3.
cat = open_catalog("https://raw.githubusercontent.com/pangeo-data/pangeo-datastore/master/intake-catalogs/ocean/altimetry.yaml")
print(list(cat))
ds = cat['j3'].to_dask()
ds
['al', 'alg', 'c2', 'e1', 'e1g', 'e2', 'en', 'enn', 'g2', 'h2', 'j1', 'j1g', 'j1n', 'j2', 'j2g', 'j2n', 'j3', 's3a', 's3b', 'tp', 'tpn']
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[2], line 3
----> 3 ds = cat['j3'].to_dask()
TypeError: ZarrArraySource.__init__() got an unexpected keyword argument 'consolidated'
Load some data into memory:
# Select latitude, longitude, and sea level anomaly
ds_ll = ds[['latitude', 'longitude', 'sla_filtered']].reset_coords().astype('f4').load()
ds_ll
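The cast to 'f4' (32-bit floats) before .load() keeps the in-memory footprint down. The following is a minimal NumPy sketch on synthetic values, not the Jason-3 data, and it assumes the source variables would otherwise be held as 64-bit floats. It shows the memory saving and the coordinate resolution that survives the cast:

```python
import numpy as np

# Synthetic stand-in for a longitude column; not the real altimetry data
lon64 = np.linspace(0.0, 360.0, 1_000_000, dtype=np.float64)
lon32 = lon64.astype("f4")  # same cast as .astype('f4') in the cell above

print(lon64.nbytes / 1e6, "MB as float64")
print(lon32.nbytes / 1e6, "MB as float32")

# Gap between adjacent representable float32 values near 360 degrees
resolution_deg = float(np.spacing(np.float32(359.9)))
print(resolution_deg, "degrees")
```

The float32 spacing near 360 degrees is about 3e-5 degrees, a few meters at the equator, which is far finer than the 2-degree bins used later in this notebook.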
Convert to pandas dataframe:
df = ds_ll.to_dataframe()
df
Visualize with hvplot
df.hvplot.scatter(x='longitude', y='latitude', datashade=True)
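Passing datashade=True asks hvplot to hand rendering to Datashader, which aggregates the points onto a pixel grid instead of drawing one marker per row. The sketch below imitates only that aggregation idea in plain NumPy; the synthetic coordinates and the canvas size are assumptions for illustration, and this is not Datashader's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic track positions (assumed ranges, not the real Jason-3 sampling)
n_points = 1_000_000
lon = rng.uniform(0.0, 360.0, n_points)
lat = rng.uniform(-66.0, 66.0, n_points)

width, height = 400, 200  # hypothetical canvas size in pixels

# Map each coordinate to an integer pixel index, clipping the upper edge
ix = np.minimum((lon / 360.0 * width).astype(int), width - 1)
iy = np.minimum(((lat + 66.0) / 132.0 * height).astype(int), height - 1)

# Count points per pixel in one pass over the data
counts = np.bincount(iy * width + ix, minlength=width * height)
counts = counts.reshape(height, width)

print(counts.shape, counts.sum())
```

The size of the result depends on the canvas, not on the number of rows, which is why this style of plot stays responsive for very large dataframes.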
Bin using xhistogram
lon_bins = np.arange(0, 361, 2)
lat_bins = np.arange(-70, 71, 2)
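# rechunk along time into roughly 5 MB pieces so the histograms are computed block by block
ds_ll_chunked = ds_ll.chunk({'time': '5MB'})
# sum of squared anomalies in each lon/lat bin (missing values contribute zero)
sla_variance = histogram(ds_ll_chunked.longitude, ds_ll_chunked.latitude,
                         bins=[lon_bins, lat_bins],
                         weights=ds_ll_chunked.sla_filtered.fillna(0.)**2)
# number of along-track points in each bin
norm = histogram(ds_ll_chunked.longitude, ds_ll_chunked.latitude,
                 bins=[lon_bins, lat_bins])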
# let's get at least 200 points in a box for it to be unmasked
thresh = 200
sla_variance = sla_variance / norm.where(norm > thresh)
sla_variance
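The binning above divides a weighted histogram (the sum of squared anomalies per bin) by an unweighted one (the point count per bin), and masks bins with too few points. The same arithmetic can be sketched with np.histogram2d on synthetic data. The random values and the lower threshold here are assumptions chosen so the example runs quickly, and the result is a mean of squares, which equals the variance only when the mean anomaly in a bin is close to zero:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic along-track samples (made up; not the Jason-3 record)
n = 50_000
lon = rng.uniform(0.0, 360.0, n)
lat = rng.uniform(-70.0, 70.0, n)
sla = rng.normal(0.0, 0.1, n)          # anomalies with standard deviation 0.1
sla[rng.random(n) < 0.05] = np.nan     # sprinkle in some missing values

# Same bin edges as the notebook: 181 and 71 edges give 180 x 70 bins
lon_bins = np.arange(0, 361, 2)
lat_bins = np.arange(-70, 71, 2)

# Weighted histogram: sum of squared anomalies, with NaN treated as zero
weights = np.nan_to_num(sla, nan=0.0) ** 2
sum_sq, _, _ = np.histogram2d(lon, lat, bins=[lon_bins, lat_bins], weights=weights)

# Unweighted histogram: number of points in each bin
count, _, _ = np.histogram2d(lon, lat, bins=[lon_bins, lat_bins])

# Mask sparsely sampled bins (threshold lowered for this small sample)
thresh = 2
masked_count = np.where(count > thresh, count, np.nan)
mean_sq = sum_sq / masked_count

print(mean_sq.shape, float(np.nanmean(mean_sq)))
```

As in the notebook cell, missing anomalies add zero to the numerator but are still counted in the denominator, so bins with many gaps are biased slightly low.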
sla_variance.load()
# plot the sea level anomaly variance
sla_variance.plot(x='longitude_bin', figsize=(12, 6), vmax=0.2)
Summary
In this example we loaded Jason-3 along-track altimetry data and plotted the measurement locations with hvplot. Then we used xhistogram to bin the sea level anomaly onto a 2-degree grid and plot its variance.
What’s next?
Other examples will use other datasets to visualize sea surface temperatures, ocean depth, and currents.
Resources and references
This notebook is based on the Pangeo physical oceanography gallery example: https://gallery.pangeo.io/repos/pangeo-gallery/physical-oceanography/02_along_track.html