<img src="images/ProjectPythia_Logo_Final-01-Blue.svg" width=250 alt="Project Pythia Logo"></img>
<img src="images/logos/pangeo_simple_logo.svg" width=250 alt="Pangeo Logo"></img>

# Along Track Altimetry Analysis

---

## Overview

1. Using CNES altimetry data
1. Visualizing data using hvplot
1. Using xhistogram to plot multidimensional data

## Prerequisites

| Concepts | Importance | Notes |
| --- | --- | --- |
| [Intro to Pandas](https://foundations.projectpythia.org/core/pandas/pandas.html) | Helpful | |
| [Using hvplot](https://hvplot.holoviz.org) | Helpful | Matplotlib knowledge also helpful |
| [Dask](https://docs.dask.org/en/stable/) | Helpful | |
| [xhistogram](https://xhistogram.readthedocs.io/en/latest/) | Helpful | |

- **Time to learn**: 15 minutes

## Imports

---

In [10]:
import fsspec
import xarray as xr
import numpy as np
import hvplot
import hvplot.dask
import hvplot.pandas
import hvplot.xarray
from xhistogram.xarray import histogram
from intake import open_catalog

## Load Data

The analysis-ready along-track altimetry data were prepared by CNES. They are cataloged in the Pangeo Cloud Data Catalog: https://catalog.pangeo.io/browse/master/ocean/altimetry/

We will work with data from the Jason-3 mission (catalog entry `j3`).

In [11]:
cat = open_catalog("https://raw.githubusercontent.com/pangeo-data/pangeo-datastore/master/intake-catalogs/ocean/altimetry.yaml")
print(list(cat))
ds = cat['j3'].to_dask()
ds

['al', 'alg', 'c2', 'e1', 'e1g', 'e2', 'en', 'enn', 'g2', 'h2', 'j1', 'j1g', 'j1n', 'j2', 'j2g', 'j2n', 'j3', 's3a', 's3b', 'tp', 'tpn']


TypeError: Invalid variable type: value should be str, int or float, got None of type <class 'NoneType'>

Load some data into memory:

In [None]:
# Select latitude, longitude, and sea level anomaly
ds_ll = ds[['latitude', 'longitude', 'sla_filtered']].reset_coords().astype('f4').load()
ds_ll

Convert to pandas dataframe:

In [None]:
df = ds_ll.to_dataframe()
df
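
The selection and conversion steps above can be tried offline on a small synthetic dataset. The sketch below is illustrative only: the `toy` dataset and its values are made up, and it assumes (as in the Jason-3 data) that `latitude` and `longitude` are coordinates along `time` while `sla_filtered` is a data variable.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Hypothetical stand-in for the altimetry dataset: five samples along time
time = pd.date_range("2020-01-01", periods=5, freq="D")
toy = xr.Dataset(
    data_vars={
        "sla_filtered": ("time", np.array([0.1, -0.2, np.nan, 0.05, 0.3])),
    },
    coords={
        "time": time,
        "latitude": ("time", np.linspace(-10.0, 10.0, 5)),
        "longitude": ("time", np.linspace(100.0, 104.0, 5)),
    },
)

# Same chain as above: select the three variables, demote the non-index
# coordinates to data variables, and cast the data variables to 32-bit floats
toy_ll = toy[["latitude", "longitude", "sla_filtered"]].reset_coords().astype("f4")

# Each data variable becomes a column; the time coordinate becomes the index
toy_df = toy_ll.to_dataframe()
print(toy_df)
```

Calling `reset_coords()` before `to_dataframe()` is what turns `latitude` and `longitude` into ordinary columns that can be passed as `x` and `y` when plotting.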

## Visualize with hvplot

In [None]:
df.hvplot.scatter(x='longitude', y='latitude', datashade=True)

## Bin using xhistogram

In [None]:
lon_bins = np.arange(0, 361, 2)
lat_bins = np.arange(-70, 71, 2)

# helps with memory management
ds_ll_chunked = ds_ll.chunk({'time': '5MB'})

sla_variance = histogram(ds_ll_chunked.longitude, ds_ll_chunked.latitude,
                         bins=[lon_bins, lat_bins],
                         weights=ds_ll_chunked.sla_filtered.fillna(0.)**2)

norm = histogram(ds_ll_chunked.longitude, ds_ll_chunked.latitude,
                 bins=[lon_bins, lat_bins])

# require more than 200 points in a box for it to be unmasked
thresh = 200
sla_variance = sla_variance / norm.where(norm > thresh)
sla_variance

In [None]:
sla_variance.load()

In [None]:
# plot the sea level anomaly variance
sla_variance.plot(x='longitude_bin', figsize=(12, 6), vmax=0.2)
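
The binning above amounts to dividing a weighted histogram (the sum of squared anomalies per box) by an unweighted one (the number of points per box). The following is a minimal NumPy sketch of that idea on synthetic data, not the xhistogram implementation; the random points, the coarser 20-degree bins, and the lower threshold are assumptions chosen to keep the example small. As in the cell above, missing values are filled with zero but still counted in the normalization, and the result equals the variance only to the extent that the mean anomaly in each box is close to zero.

```python
import numpy as np

# Synthetic along-track samples (assumed values, for illustration only)
rng = np.random.default_rng(0)
n = 10_000
lon = rng.uniform(0.0, 360.0, n)
lat = rng.uniform(-70.0, 70.0, n)
sla = rng.normal(0.0, 0.1, n)
sla[::50] = np.nan  # pretend some measurements are missing

# Coarser bins than in the notebook so that every box holds enough points
lon_bins = np.arange(0, 361, 20)
lat_bins = np.arange(-70, 71, 20)

# Sum of squared anomalies per box (NaNs replaced by zero, as with fillna(0.))
sum_sq, _, _ = np.histogram2d(
    lon, lat, bins=[lon_bins, lat_bins], weights=np.nan_to_num(sla) ** 2
)

# Number of points per box
counts, _, _ = np.histogram2d(lon, lat, bins=[lon_bins, lat_bins])

# Mask boxes that do not hold more than `thresh` points, then normalize
thresh = 20
mean_sq = sum_sq / np.where(counts > thresh, counts, np.nan)

print(mean_sq.shape)
print(np.nanmean(mean_sq))
```

xhistogram performs the same kind of calculation lazily on chunked xarray objects, which is why the result in the notebook has to be loaded before it is plotted.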

## Summary

---

In this example, we visualized along-track altimetry data with hvplot. Then we used xhistogram to bin the sea level anomalies and to calculate and plot their variance.

### What's next?

Other examples will look at other datasets to visualize sea surface temperatures, ocean depth, and currents.

## Resources and references
 - This notebook is based on the Pangeo physical oceanography gallery example: https://gallery.pangeo.io/repos/pangeo-gallery/physical-oceanography/02_along_track.html