Interactive dashboard for Sentinel-2 satellite imagery

Overview

In this notebook, we will take a look at how to retrieve Sentinel-2 L2A satellite imagery from the Microsoft Planetary Computer Data Catalog (MSPC). We will go over how to interact with the Data Catalog, which exposes a SpatioTemporal Asset Catalog (STAC) interface for querying, searching and retrieving data. We will use the odc-stac package to load the data lazily, which means data is not actually read unless required (say, for plotting). Once loaded, we will process the data and make a simple interactive dashboard to look at the satellite imagery over a location for different seasons. We will use the HoloViz ecosystem for the interactive dashboard.

Prerequisites

Concepts	Importance	Notes
Xarray	Helpful	Background
Dask + Xarray	Necessary	Background
About the Microsoft Planetary Computer (MSPC)	Helpful	Background
Documentation of pystac-client	Helpful	Consult as needed
Landsat ML Cookbook	Helpful	Similar cookbook, illustrates accessing Landsat data from MSPC
About the HoloViz ecosystem	Helpful	How different HoloViz packages work with each other
Sentinel-2 L2A User Guide	Necessary	Background about the satellite data
Sentinel-2 L2A data definitions	Helpful	Section 2.3.10 has some useful information about the data we access in this cookbook

Time to learn: 15 minutes

Imports

import os
import pandas as pd
import numpy as np
import xarray as xr
import pystac_client
import planetary_computer
import panel as pn
import panel.widgets as pnw
import hvplot.xarray
import holoviews as hv
import geoviews as gv
from pystac.extensions.eo import EOExtension as eo
import datetime
from cartopy import crs
from dask.distributed import Client, LocalCluster
import odc.stac

xr.set_options(keep_attrs=True)
hv.extension('bokeh')
gv.extension('bokeh')

Since we will use dask to distribute our computation, we can create a dask cluster and connect to it via a dask client. For this recipe, we will create a LocalCluster with default values.

cluster = LocalCluster()
client = Client(cluster)
client

Client

Client-90df8e34-2371-11ef-89dc-6045bdff2074

Connection method: Cluster object	Cluster type: distributed.LocalCluster
Dashboard: http://127.0.0.1:8787/status

Cluster Info

LocalCluster

3b2c33a5

Dashboard: http://127.0.0.1:8787/status	Workers: 4
Total threads: 4	Total memory: 15.61 GiB
Status: running	Using processes: True

Scheduler Info

Scheduler

Scheduler-2770dd40-80c9-4c69-8550-4784be2ba3ca

Comm: tcp://127.0.0.1:45891	Workers: 4
Dashboard: http://127.0.0.1:8787/status	Total threads: 4
Started: Just now	Total memory: 15.61 GiB

Workers

Worker: 0

Comm: tcp://127.0.0.1:45773	Total threads: 1
Dashboard: http://127.0.0.1:42195/status	Memory: 3.90 GiB
Nanny: tcp://127.0.0.1:35503
Local directory: /tmp/dask-scratch-space/worker-qbqro139

Worker: 1

Comm: tcp://127.0.0.1:46195	Total threads: 1
Dashboard: http://127.0.0.1:35139/status	Memory: 3.90 GiB
Nanny: tcp://127.0.0.1:38051
Local directory: /tmp/dask-scratch-space/worker-vxy5x3pp

Worker: 2

Comm: tcp://127.0.0.1:42929	Total threads: 1
Dashboard: http://127.0.0.1:45771/status	Memory: 3.90 GiB
Nanny: tcp://127.0.0.1:39357
Local directory: /tmp/dask-scratch-space/worker-_zp6mbt3

Worker: 3

Comm: tcp://127.0.0.1:33245	Total threads: 1
Dashboard: http://127.0.0.1:45795/status	Memory: 3.90 GiB
Nanny: tcp://127.0.0.1:45547
Local directory: /tmp/dask-scratch-space/worker-s3wb_blm

Open the catalog

The root of Microsoft Planetary Computer’s STAC API endpoint is located at https://planetarycomputer.microsoft.com/api/stac/v1. We will load in the catalog using the pystac_client.Client.open method. Even though the STAC metadata in MSPC is publicly accessible, authentication is required to access the actual data. The modifier keyword can be used to explicitly “sign” an item, which basically means we can access the privately stored data (more information here).

stac_root = 'https://planetarycomputer.microsoft.com/api/stac/v1'
catalog = pystac_client.Client.open(
    stac_root,
    modifier=planetary_computer.sign_inplace
)
print(f"{catalog.title} - {catalog.description}")

Microsoft Planetary Computer STAC API - Searchable spatiotemporal metadata describing Earth science datasets hosted by the Microsoft Planetary Computer

Let’s search for any collections that have the substring “sentinel-2” in it to discover the sentinel-2 data.

sentinel2_collections = [collection for collection in catalog.get_collections() if "sentinel-2" in collection.id]
sentinel2_collections

[<CollectionClient id=sentinel-2-l2a>]

Looks like there is only one collection (sentinel-2-l2a) available in the catalog - which is the dataset we want to use.

Query, Filter and Load the collection

Now that we have the ID of the collection of interest, we can specify certain filters to narrow down to exactly the data we want to look at. The final visualization in the recipe will look at how the NCAR Mesa Lab, Boulder CO looks like throughout the year as seen from space. To narrow down our search, we will use the following criteria -

Bounding box: We will limit our spatial extent to the bounding box of the NCAR Mesa Lab region.
Date range: We will look at the year 2022
Collection: sentinel-2-l2a (from previous cell)
Cloud threshold: Since cloud can block the satellite from making an observation of the ground, we will limit our search to satellite images where the cloud cover is less than a certain threshold, 40% in this case.

Feel free to change these filtering paramters to suit your needs when running in an interactive session.

bbox = [-105.283263,39.972809,-105.266569,39.987640] # NCAR, boulder, CO. bbox from http://bboxfinder.com/
date_range = "2022-01-01/2022-12-31"
collection = "sentinel-2-l2a"                        # full id of collection
cloud_thresh = 30

search = catalog.search(
    collections = sentinel2_collections,
    bbox = bbox,
    datetime = date_range,
    query={"eo:cloud_cover": {"lt": cloud_thresh}}
)
items = search.item_collection()
print(f"Found {len(items)} items in the {collection}")

Found 64 items in the sentinel-2-l2a

We now have an ItemCollection with the data that we need. Let’s look at one of the items in the collection and explore what assets it has.

first_item = items.items[0]
all_bands = list(first_item.assets.keys())
print("Assets available:")
print(*all_bands, sep=', ')

Assets available:
AOT, B01, B02, B03, B04, B05, B06, B07, B08, B09, B11, B12, B8A, SCL, WVP, visual, preview, safe-manifest, granule-metadata, inspire-metadata, product-metadata, datastrip-metadata, tilejson, rendered_preview

Seems like there are a lot of assets associated with the item - you can read about them here. We are interested in the assets that start with a ‘B’, which are the bands associated with the different spectral bands in which the MultiSpectral Instrument (MSI) of the satellite captures observations. Specifically, the RGB – or Red, Green and Blue - bands that we need to create a “True color” image are as follows:

band	corresponds to
B04	Red
B03	Green
B02	Blue

We will use the odc.stac.stac_load function to load in the assets that start with the alphabet ‘B’. This function will return a lazily-loaded xr.DataSet (using dask). For plotting purposes it is better if we have the data as a xr.DataArray instead with the bands as a dimension. We can do that using .to_array(dim=<dim_name>) method of a dataset.

bands_of_interest = [b for b in all_bands if b.startswith('B')]

da = odc.stac.stac_load(
    items,
    bands=bands_of_interest,
    bbox=bbox,
    chunks={},  # <-- use Dask
).to_array(dim='band')
da

<xarray.DataArray (band: 12, time: 64, y: 166, x: 144)> Size: 73MB
dask.array<stack, shape=(12, 64, 166, 144), dtype=float32, chunksize=(1, 1, 166, 144), chunktype=numpy.ndarray>
Coordinates:
  * y            (y) float64 1kB 4.426e+06 4.426e+06 ... 4.425e+06 4.425e+06
  * x            (x) float64 1kB 4.758e+05 4.758e+05 ... 4.772e+05 4.772e+05
    spatial_ref  int32 4B 32613
  * time         (time) datetime64[ns] 512B 2022-01-26T17:56:21.024000 ... 20...
  * band         (band) object 96B 'B01' 'B02' 'B03' 'B04' ... 'B11' 'B12' 'B8A'

Prepare the data for visualization

On January 25th, 2022, the European Space Agency (data provider for Sentinel-2 satellite) made a change in their processing pipeline to address some issues that you can read about here if interested. For the purpose of this notebook, we will process the newer dataset such that it is harmonized with the old processing pipeline - in simple words, we will make sure that the data has the same statistical properties so that we can visualize them seamlessly.

# from https://planetarycomputer.microsoft.com/dataset/sentinel-2-l2a#Baseline-Change
def harmonize_to_old(data):  
    """
    Harmonize new Sentinel-2 data to the old baseline.

    Parameters
    ----------
    data: xarray.DataArray
        A DataArray with four dimensions: time, band, y, x

    Returns
    -------
    harmonized: xarray.DataArray
        A DataArray with all values harmonized to the old
        processing baseline.
    """
    cutoff = datetime.datetime(2022, 1, 25)
    offset = 1000
    bands = ["B01","B02","B03","B04","B05","B06","B07","B08","B8A","B09","B10","B11","B12"]

    old = data.sel(time=slice(cutoff))

    to_process = list(set(bands) & set(data.band.data.tolist()))
    new = data.sel(time=slice(cutoff, None)).drop_sel(band=to_process)

    new_harmonized = data.sel(time=slice(cutoff, None), band=to_process).clip(offset)
    new_harmonized -= offset

    new = xr.concat([new, new_harmonized], "band").sel(band=data.band.data.tolist())
    return xr.concat([old, new], dim="time")

da = harmonize_to_old(da)
da

<xarray.DataArray (band: 12, time: 64, y: 166, x: 144)> Size: 73MB
dask.array<getitem, shape=(12, 64, 166, 144), dtype=float32, chunksize=(1, 1, 166, 144), chunktype=numpy.ndarray>
Coordinates:
  * y            (y) float64 1kB 4.426e+06 4.426e+06 ... 4.425e+06 4.425e+06
  * x            (x) float64 1kB 4.758e+05 4.758e+05 ... 4.772e+05 4.772e+05
    spatial_ref  int32 4B 32613
  * time         (time) datetime64[ns] 512B 2022-01-26T17:56:21.024000 ... 20...
  * band         (band) object 96B 'B01' 'B02' 'B03' 'B04' ... 'B11' 'B12' 'B8A'

Now that we have a harmonized dataset, we still need to process the data as follows:

Sentinel-2 L2A provides the Surface Reflectance (SR) data, which usually ranges from 0 (no reflection) to 1.0 (complete reflection). However, the actual values in the loaded dataset ranges from 0 to ~10,000. These data values need to be scaled to 0.0-1.0 by dividing the data by 10,000. More details can be found in section 2.3.10 of this document.

We will then explicitly trigger the dask computation using the compute() method and load the result into memory. This is to reduce repeated calls to retrieve data from MSPC. By loading the processed This wouldn’t have been possible if the dataset was large.

da = da / 1e4   # Scale data values from 0:10000 to 0:1.0
da = da / da.max(dim='band')  # additionally scale from 0-max -> 0-1 for visual quality
da = da.compute()

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))
/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))
/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

/home/runner/miniconda3/envs/interactive-sentinel-2-cookbook-dev/lib/python3.12/site-packages/dask/core.py:127: RuntimeWarning: invalid value encountered in divide
  return func(*(_execute_task(a, cache) for a in args))

We have now processed the data so that we can visualize it!

Let’s look at the Blue, Green and Red bands.

def plot_band(band, cmap):
    return da.sel(band=band).isel(time=0).hvplot(
        x='x', y='y', data_aspect=1, 
        cmap=cmap, geo=True, tiles='ESRI', 
        crs=crs.epsg(items[0].properties['proj:epsg']), rasterize=True,
        title=f"band: {band}, cmap: {cmap}",
        clabel='surface reflectance [0.0-1.0]'
    ).opts(
        frame_width=300,
        xlabel='longitude',
        ylabel='latitude',
        
    )

(plot_band('B04', 'Blues') + plot_band('B03', 'Greens') + plot_band('B02', 'Reds')).cols(2)

Let us make a dashboard composed of 4 different interactive plots showing the RGB view of the satellite observation for four different seasons. We need a function that will take a time input and does the following tasks:

plot an interactive RGB image of the data and overlay it on a map of the world.
provide a date slider widget which can be used to interact with the plot.
only set the default value of the date slider to the time, but allow the user to slide through the length of the entire dataset.

Using this function, we will be able to compose the dashboard.

def rgb_during(time):
    season_names = {
        1: 'Winter',
        2: 'Spring',
        3: 'Summer',
        4: 'Fall'
    }
    da_rgb = da.sel(band=['B04', 'B03', 'B02'])
    start_date = pd.to_datetime(da_rgb['time'].min().data).to_pydatetime()
    end_date = pd.to_datetime(da_rgb['time'].max().data).to_pydatetime()
    closest_date = pd.to_datetime(da_rgb.sel(time=time, method='nearest').time.data).to_pydatetime()
    dt_slider = pnw.DateSlider(name='Date', start=start_date, end=end_date, value=closest_date)
    
    def get_obs_on(t):
        season_key = [month%12 // 3 + 1 for month in range(1, 13)][t.month-1]
        season = season_names[season_key]
        return da.sel(band=['B04', 'B03', 'B02']).sel(time=t, method='nearest').transpose('y', 'x', 'band').hvplot.rgb(
            x='x', y='y', bands='band', 
            geo=True, tiles='ESRI', crs=crs.epsg(items[0].properties['proj:epsg']), 
            rasterize=True, title=f"{season}: {t.strftime('%Y-%m-%d')}"
        ).opts(
            frame_width=300,
            xlabel='longitude',
            ylabel='latitude',
        )
        
    
    return pn.panel(pn.Column(
                pn.bind(get_obs_on, t=dt_slider), 
                pn.Row(
                    pn.Spacer(width=60),
                    dt_slider,
                )
            ))

rgb_during('2022-02-01')

Let’s now compose a dashboard using panel.

winter = '2022-01-15'
spring = '2022-04-30'
summer = '2022-08-01'
fall = '2022-09-15'

winter_plot = rgb_during(winter)
spring_plot = rgb_during(spring)
summer_plot = rgb_during(summer)
fall_plot = rgb_during(fall)

pn.Column(
    pn.Row(winter_plot, spring_plot),
    pn.Row(summer_plot, fall_plot)
)

Summary

In this recipe, we looked at how to access Sentinel-2 satellite data over a region of interest and create an interactive dashboard to visualize the data.

What’s next?

We plotted the RGB or True color image of our region of interest using a subset of all the bands available. We can further calculate useful indices, such as the Normalized Difference Snow Index (NDSI) or the Normalized Difference Vegetation Index.

Resources and references

Authored by Pritam Das (@pritamd47), June 2023 during Project Pythia cookoff 2023.
This notebook takes a lot of inspiration from the Landsat ML Cookbook by Demetris Roumis.
This notebook uses concepts and code illustrated in the Reading Data from the STAC API.