Spaghetti Plots

Overview

Spaghetti plots are a tool typically used to visualize movement. Essentially they are many line plots displayed on the same axes. By drawing the same path at different times or from different forecasts, we can see the patterns and chaos associated with the plotted variable.

Spaghetti Hurricane Plot
Spaghetti Contour Plot

Prerequisites

Concepts	Importance	Notes
Matplotlib	Necessary
Cartopy	Necessary

Time to learn: 10 minutes

import numpy as np
import xarray as xr
import datetime

import matplotlib as mpl
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import matplotlib.pylab as pl

import cartopy.crs as ccrs
import cartopy.feature as cfeature

import geocat.viz as gv
import geocat.datafiles as gdf

import tropycal.tracks as tracks

import warnings
warnings.filterwarnings('ignore')

Spaghetti Hurricane Plot

Visualizing the predicted path of an incoming hurricane is both complicated and important. There are many plots that meteorologists are trained to read, but when shared with the public can be confusing or alarming. There are strengths and weaknesses to each hurricane visualization approach. The cone of uncertainty, for example, is often misinterpreted to suggest the hurricane growth in time rather than widening of path possibilities. Spaghetti plots on the other hand, clearly show hurricane paths but show them as equal to each other.

In this example we will plot some forecasted paths from the 2012 North-Atlantic storm Hurricane Sandy. Each forecast is from the Global Ensemble Forecast System (GEFS) provided by the National Centers for Environmental Prediction at NOAA.

We’ll use the package tropycal to easily access HURDAT2 and IBTrACS reanalysis data and operational National Hurricane Center (NHC) Best Track data. tropycal has a lot of great features for real time hurricane visualization, but since this Cookbook is comparatively static we’re using a past hurricane and only using this package to access the data. Our plotting will be done with matplotlib and cartopy.

Read in Data

First, to grab our hurricane data from tropycal we need to specify a basin:

basin = tracks.TrackDataset(basin='north_atlantic')

--> Starting to read in HURDAT2 data

--> Completed reading in HURDAT2 data (2.02 seconds)

Find your storm by name and year:

storm = basin.get_storm(('sandy',2012))

sandy_ds = storm.to_xarray()
sandy_ds

<xarray.Dataset> Size: 5kB
Dimensions:    (time: 45)
Coordinates:
  * time       (time) datetime64[ns] 360B 2012-10-21T18:00:00 ... 2012-10-31T...
Data variables:
    extra_obs  (time) int64 360B 0 0 0 0 0 0 0 0 0 0 0 ... 0 0 1 1 0 0 0 0 0 0 0
    special    (time) <U1 180B '' '' '' '' '' '' '' '' ... '' '' '' '' '' '' ''
    type       (time) <U2 360B 'LO' 'LO' 'LO' 'TD' 'TS' ... 'EX' 'EX' 'EX' 'EX'
    lat        (time) float64 360B 14.3 13.9 13.5 13.1 ... 40.4 40.7 41.1 41.5
    lon        (time) float64 360B -77.4 -77.8 -78.2 -78.6 ... -79.8 -80.3 -80.7
    vmax       (time) int64 360B 25 25 25 30 35 40 40 ... 70 55 50 40 35 35 30
    mslp       (time) int64 360B 1006 1005 1003 1002 1000 ... 986 992 993 995
    wmo_basin  (time) <U14 3kB 'north_atlantic' ... 'north_atlantic'
Attributes:
    id:              AL182012
    operational_id:  AL182012
    name:            SANDY
    year:            2012
    season:          2012
    basin:           north_atlantic
    source_info:     NHC Hurricane Database
    source:          hurdat
    ace:             13.6675

xarray.Dataset

Dimensions:
- time: 45

Coordinates: (1)

time

(time)

datetime64[ns]

2012-10-21T18:00:00 ... 2012-10-...

array(['2012-10-21T18:00:00.000000000', '2012-10-22T00:00:00.000000000',
       '2012-10-22T06:00:00.000000000', '2012-10-22T12:00:00.000000000',
       '2012-10-22T18:00:00.000000000', '2012-10-23T00:00:00.000000000',
       '2012-10-23T06:00:00.000000000', '2012-10-23T12:00:00.000000000',
       '2012-10-23T18:00:00.000000000', '2012-10-24T00:00:00.000000000',
       '2012-10-24T06:00:00.000000000', '2012-10-24T12:00:00.000000000',
       '2012-10-24T18:00:00.000000000', '2012-10-24T19:00:00.000000000',
       '2012-10-25T00:00:00.000000000', '2012-10-25T05:25:00.000000000',
       '2012-10-25T06:00:00.000000000', '2012-10-25T09:00:00.000000000',
       '2012-10-25T12:00:00.000000000', '2012-10-25T18:00:00.000000000',
       '2012-10-26T00:00:00.000000000', '2012-10-26T06:00:00.000000000',
       '2012-10-26T12:00:00.000000000', '2012-10-26T18:00:00.000000000',
       '2012-10-27T00:00:00.000000000', '2012-10-27T06:00:00.000000000',
       '2012-10-27T12:00:00.000000000', '2012-10-27T18:00:00.000000000',
       '2012-10-28T00:00:00.000000000', '2012-10-28T06:00:00.000000000',
       '2012-10-28T12:00:00.000000000', '2012-10-28T18:00:00.000000000',
       '2012-10-29T00:00:00.000000000', '2012-10-29T06:00:00.000000000',
       '2012-10-29T12:00:00.000000000', '2012-10-29T18:00:00.000000000',
       '2012-10-29T21:00:00.000000000', '2012-10-29T23:30:00.000000000',
       '2012-10-30T00:00:00.000000000', '2012-10-30T06:00:00.000000000',
       '2012-10-30T12:00:00.000000000', '2012-10-30T18:00:00.000000000',
       '2012-10-31T00:00:00.000000000', '2012-10-31T06:00:00.000000000',
       '2012-10-31T12:00:00.000000000'], dtype='datetime64[ns]')

Data variables: (8)

extra_obs

(time)

int64

0 0 0 0 0 0 0 0 ... 1 0 0 0 0 0 0 0

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0,
       0])

special

(time)

<U1

'' '' '' '' '' ... '' '' '' '' ''

array(['', '', '', '', '', '', '', '', '', '', '', '', '', 'L', '', 'L',
       '', 'T', '', '', '', '', '', '', '', '', '', '', '', '', '', '',
       '', '', '', '', 'S', 'L', '', '', '', '', '', '', ''], dtype='<U1')

type

(time)

<U2

'LO' 'LO' 'LO' ... 'EX' 'EX' 'EX'

array(['LO', 'LO', 'LO', 'TD', 'TS', 'TS', 'TS', 'TS', 'TS', 'TS', 'TS',
       'HU', 'HU', 'HU', 'HU', 'HU', 'HU', 'HU', 'HU', 'HU', 'HU', 'HU',
       'HU', 'HU', 'TS', 'TS', 'HU', 'HU', 'HU', 'HU', 'HU', 'HU', 'HU',
       'HU', 'HU', 'HU', 'EX', 'EX', 'EX', 'EX', 'EX', 'EX', 'EX', 'EX',
       'EX'], dtype='<U2')

lat

(time)

float64

14.3 13.9 13.5 ... 40.7 41.1 41.5

array([14.3, 13.9, 13.5, 13.1, 12.7, 12.6, 12.9, 13.4, 14. , 14.7, 15.6,
       16.6, 17.7, 17.9, 18.9, 20. , 20.1, 20.9, 21.7, 23.3, 24.8, 25.7,
       26.4, 27. , 27.5, 28.1, 28.8, 29.7, 30.5, 31.3, 32. , 32.8, 33.9,
       35.3, 36.9, 38.3, 38.8, 39.4, 39.5, 39.9, 40.1, 40.4, 40.7, 41.1,
       41.5])

lon

(time)

float64

-77.4 -77.8 -78.2 ... -80.3 -80.7

array([-77.4, -77.8, -78.2, -78.6, -78.7, -78.4, -78.1, -77.9, -77.6,
       -77.3, -77.1, -76.9, -76.7, -76.6, -76.4, -76. , -76. , -75.7,
       -75.5, -75.3, -75.9, -76.4, -76.9, -77.2, -77.1, -76.9, -76.5,
       -75.6, -74.7, -73.9, -73. , -72. , -71. , -70.5, -71. , -73.2,
       -74. , -74.4, -74.5, -76.2, -77.8, -78.9, -79.8, -80.3, -80.7])

vmax

(time)

int64

25 25 25 30 35 ... 50 40 35 35 30

array([ 25,  25,  25,  30,  35,  40,  40,  40,  45,  55,  60,  65,  75,
        75,  85, 100, 100,  95,  95,  90,  75,  70,  65,  65,  60,  60,
        70,  70,  65,  65,  65,  65,  70,  80,  85,  80,  75,  70,  70,
        55,  50,  40,  35,  35,  30])

mslp

(time)

int64

1006 1005 1003 1002 ... 992 993 995

array([1006, 1005, 1003, 1002, 1000,  998,  998,  995,  993,  990,  987,
        981,  972,  971,  964,  954,  954,  960,  966,  963,  965,  968,
        970,  971,  969,  968,  956,  960,  960,  959,  954,  952,  950,
        947,  945,  940,  943,  945,  946,  960,  978,  986,  992,  993,
        995])

wmo_basin

(time)

<U14

'north_atlantic' ... 'north_atla...

array(['north_atlantic', 'north_atlantic', 'north_atlantic',
       'north_atlantic', 'north_atlantic', 'north_atlantic',
       'north_atlantic', 'north_atlantic', 'north_atlantic',
       'north_atlantic', 'north_atlantic', 'north_atlantic',
       'north_atlantic', 'north_atlantic', 'north_atlantic',
       'north_atlantic', 'north_atlantic', 'north_atlantic',
       'north_atlantic', 'north_atlantic', 'north_atlantic',
       'north_atlantic', 'north_atlantic', 'north_atlantic',
       'north_atlantic', 'north_atlantic', 'north_atlantic',
       'north_atlantic', 'north_atlantic', 'north_atlantic',
       'north_atlantic', 'north_atlantic', 'north_atlantic',
       'north_atlantic', 'north_atlantic', 'north_atlantic',
       'north_atlantic', 'north_atlantic', 'north_atlantic',
       'north_atlantic', 'north_atlantic', 'north_atlantic',
       'north_atlantic', 'north_atlantic', 'north_atlantic'], dtype='<U14')

Indexes: (1)

time

PandasIndex

PandasIndex(DatetimeIndex(['2012-10-21 18:00:00', '2012-10-22 00:00:00',
               '2012-10-22 06:00:00', '2012-10-22 12:00:00',
               '2012-10-22 18:00:00', '2012-10-23 00:00:00',
               '2012-10-23 06:00:00', '2012-10-23 12:00:00',
               '2012-10-23 18:00:00', '2012-10-24 00:00:00',
               '2012-10-24 06:00:00', '2012-10-24 12:00:00',
               '2012-10-24 18:00:00', '2012-10-24 19:00:00',
               '2012-10-25 00:00:00', '2012-10-25 05:25:00',
               '2012-10-25 06:00:00', '2012-10-25 09:00:00',
               '2012-10-25 12:00:00', '2012-10-25 18:00:00',
               '2012-10-26 00:00:00', '2012-10-26 06:00:00',
               '2012-10-26 12:00:00', '2012-10-26 18:00:00',
               '2012-10-27 00:00:00', '2012-10-27 06:00:00',
               '2012-10-27 12:00:00', '2012-10-27 18:00:00',
               '2012-10-28 00:00:00', '2012-10-28 06:00:00',
               '2012-10-28 12:00:00', '2012-10-28 18:00:00',
               '2012-10-29 00:00:00', '2012-10-29 06:00:00',
               '2012-10-29 12:00:00', '2012-10-29 18:00:00',
               '2012-10-29 21:00:00', '2012-10-29 23:30:00',
               '2012-10-30 00:00:00', '2012-10-30 06:00:00',
               '2012-10-30 12:00:00', '2012-10-30 18:00:00',
               '2012-10-31 00:00:00', '2012-10-31 06:00:00',
               '2012-10-31 12:00:00'],
              dtype='datetime64[ns]', name='time', freq=None))

Attributes: (9)
id :
AL182012
operational_id :
AL182012
name :
SANDY
year :
2012
season :
2012
basin :
north_atlantic
source_info :
NHC Hurricane Database
source :
hurdat
ace :
13.6675

And we can grab any of a number of forecasts:

forecasts = storm.get_operational_forecasts()
print(forecasts.keys())

dict_keys(['CARQ', 'CMC', 'NAM', 'NGX', 'UKX', 'AC00', 'AEM2', 'AEMN', 'AP01', 'AP02', 'AP03', 'AP04', 'AP05', 'AP06', 'AP07', 'AP08', 'AP09', 'AP10', 'AP12', 'AP13', 'AP14', 'AP15', 'AP16', 'AP17', 'AP18', 'AP20', 'AVN2', 'AVNO', 'BAMD', 'BAMM', 'BAMS', 'CEMN', 'CLIP', 'CLP5', 'CMC2', 'COT2', 'COTC', 'DSHP', 'FIM9', 'FM92', 'G012', 'GFD2', 'GFDE', 'GFDL', 'GFDT', 'GFT2', 'GHM2', 'GP01', 'GPM2', 'GPMN', 'HWE2', 'HWF2', 'HWFE', 'HWRF', 'ICON', 'IV15', 'IVCN', 'IVCR', 'LBAR', 'LGEM', 'NAM2', 'NGX2', 'OFCP', 'OFP2', 'SHF5', 'SHFR', 'SHIP', 'TCLP', 'TV15', 'TVCA', 'TVCC', 'TVCE', 'TVCN', 'UWN2', 'UWN8', 'XTRP', 'ZGFS', 'AEMI', 'AP11', 'AP19', 'AVNI', 'CMCI', 'COTI', 'FM9I', 'GFDI', 'GFTI', 'GHMI', 'GPMI', 'HWFI', 'NAMI', 'NGXI', 'OFPI', 'RI25', 'SPC3', 'UKXI', 'DRCL', 'GFE2', 'MRCL', 'MRFO', 'UKX2', 'UKM', 'AHW4', 'G01I', 'OFCL', 'OCD5', 'BCD5', 'OCS5', 'BCS5', 'OFCI', 'UKMI', 'AHW2', 'EGRR', 'FSSE', 'RI30', 'RI35', 'RYOC', 'UKM2', 'AHWI', 'EGRI', 'TCOA', 'EGR2', 'UWNI', 'HWEI', 'APSU', 'APSI', 'APS2', 'OFC2'])

Each key represents a forecast model, we’ll select the GFS AP01 forecast which has many initializations. These initializations are named by time in YYYYMMDDHH format:

forecasts_AP01 = forecasts['AP01']
print(forecasts_AP01.keys())

dict_keys(['2012102112', '2012102118', '2012102200', '2012102206', '2012102212', '2012102218', '2012102300', '2012102306', '2012102312', '2012102318', '2012102400', '2012102406', '2012102412', '2012102418', '2012102500', '2012102506', '2012102512', '2012102518', '2012102600', '2012102606', '2012102612', '2012102618', '2012102700', '2012102706', '2012102712', '2012102718', '2012102800', '2012102806', '2012102812', '2012102818', '2012102900', '2012102906', '2012102912', '2012102918', '2012103000', '2012103006', '2012103012', '2012103018', '2012103100'])

Spaghetti Plot of One Esemble Member

Let’s set up our Cartopy grid to plot one ensemble member of the hurricane model.

These steps might be familiar to you but we:

create our axes with a Plate Caree projection
add land features
add grid lines
edit our gridline labels to not duplicate on all four sides
edit our gridline label fontsize

# Set up Cartopy Projection with land features
ax = plt.axes(projection=ccrs.PlateCarree())
ax.add_feature(cfeature.LAND, facecolor='lightgray')

# Add Gridlines to right and bottom
gl = ax.gridlines(crs=ccrs.PlateCarree(), draw_labels=True,
                  linewidth=.25, color='gray', alpha=0.5, linestyle='--')
gl.xlabels_top = False
gl.ylabels_left = False
gl.xlabel_style = {'size': 8,}
gl.ylabel_style = {'size': 8,}

../_images/0d83e35d9be19cf540c17b9c96c17f1e0b0e3f19d1091b1c47bc52dfc0b9ff00.png

Looking at GFS Ensemble Member Forecast AP01, we can make a spaghetti plot of each of these initializations.

The crux of the visualization is that we loop through and plot each initialization (in for i in forecasts_AP01), plot the true hurricane path, and add a legend (plt.legend()).

# Set up Cartopy Projection with land features
ax = plt.axes(projection=ccrs.PlateCarree())
ax.add_feature(cfeature.LAND, facecolor='lightgray')

# Add Gridlines to right and bottom
gl = ax.gridlines(crs=ccrs.PlateCarree(), draw_labels=True,
                  linewidth=.25, color='gray', alpha=0.5, linestyle='--')
gl.xlabels_top = False
gl.ylabels_left = False
gl.xlabel_style = {'size': 8,}
gl.ylabel_style = {'size': 8,}

# Spaghetti Plot of AP01 forecasts
forecasts_AP01 = forecasts['AP01']
for i in forecasts_AP01:
    # We're naming this line even though it is over-written each loop,
    # so that we can reference the last line in the legend
    # (as they all share the same formatting)
    forecast_path = plt.plot(forecasts_AP01[i]['lon'],
                            forecasts_AP01[i]['lat'],
                            color='cornflowerblue',
                            linewidth=0.5)

# Plot the real storm path in a thicker black line
true_path = plt.plot(sandy_ds.lon,
    sandy_ds.lat,
    color='k',
    linewidth=1) # Make it thicker than the ensemble paths

# Add a legend with only one forecast_path and the true_path
plt.legend([true_path[0], forecast_path[0]], ['True Path', 'GFS AP01 Forecasts'])

plt.title('Hurricane Sandy (2012)');

../_images/4f39ed234999a40234d9407af3e40d691c0ce2d2f27823b454817c1c0f168c94.png

This plot is a great example of a spaghetti plot, but is it super useful? Is it confusing? Each line looks like it carries the same weight, when some of these possible paths are from hours before Sandy hit the Northeastern United States and others are from days before.

Maybe it is better to show the user some indication of how the forecast for this ensemble converged on the true path with later and later initialization times.

Spaghetti Plot of One Esemble Member with Temporal Colormapping

Some additions to look out for:

we grab the time information from the initialization name using datetime
normalize a colormap by the time information with cmap = mpl.colors.ListedColormap(plt.cm.autumn_r(normalized_times))
loop through the colormap as we loop through the time steps within the ensemble member
add a colorbar with plt.colorbar(plt.cm.ScalarMappable(cmap=cmap)), where ScalarMappable is used to map scalar data to RGBA.

# Set up Cartopy Projection with land features
ax = plt.axes(projection=ccrs.PlateCarree())
ax.add_feature(cfeature.LAND, facecolor='lightgray')

# Add Gridlines to right and bottom
gl = ax.gridlines(crs=ccrs.PlateCarree(), draw_labels=True,
                  linewidth=.25, color='gray', alpha=0.5, linestyle='--')
gl.xlabels_top = False
gl.ylabels_left = False
gl.xlabel_style = {'size': 8,}
gl.ylabel_style = {'size': 8,}

# Spaghetti Plot of AP01 forecasts
forecasts_AP01 = forecasts['AP01']

# Get time information from initialization name
format = '%Y%m%d%H'
times = [datetime.datetime.strptime(i, format) for i in list(forecasts_AP01.keys())]
normalized_times = [(i - times[0]) / (times[-1] - times[0]) for i in times]

# Create a color list for forecast iteration
cmap = mpl.colors.ListedColormap(plt.cm.autumn_r(normalized_times))

c = 0
for i in forecasts_AP01:
    plt.plot(forecasts_AP01[i]['lon'],
        forecasts_AP01[i]['lat'],
        color=cmap(c),
        linewidth=0.5)
    c += 1

# Plot the real storm path
true_path = plt.plot(sandy_ds.lon,
    sandy_ds.lat,
    color='red', # Selecting a color matching one of the cmap extremes
    linewidth=1,
    label='True Path') # The easiest way to add a plot to the legend is with the label kwarg

# Add a legend with only one the true_path
# Forecasted paths will be shown in a colorbar
plt.legend()

plt.title('Hurricane Sandy')

# Add colorbar
cbar = plt.colorbar(plt.cm.ScalarMappable(cmap=cmap), ax=ax, orientation='horizontal', shrink=0.8, pad=0.075)
cbar.set_label('GFS AP01 Forecasts', labelpad=6)

# Set tick locations and labels for every 4th tick
# i.e. once a day (a new initialiation every 6 hours)
tick_indices = range(0, len(times), 4)
cbar.set_ticks([normalized_times[i] for i in tick_indices])
cbar.set_ticklabels([times[i].strftime('%d') for i in tick_indices], fontsize=8)
cbar.ax.text(1.02, 0.5, 'OCT-2012', va='top', ha='left', transform=cbar.ax.transAxes);

../_images/4a459ba4c8a2db5e39916b186033685b2a14ba67ea986a717e209147c2554f0b.png

Now we can see that as the storm progressed, the AP01 GFS Forecast Ensemble Member converges on Sandy’s true path as the storm progresses through October, 2012.

Alternatively, we may want to plot the possible hurricane paths from multiple GFS Forecast Ensemble members from the same iteration timestamp as a spaghetti plot.

Spaghetti Plot of All Esemble Members at One Point in Time

First, we need to grab all of the relevant forecast keys to GFS models (the ones that are titled AP## from 0 to 20):

# List of valid AP## keys from 0 to 20
GFS_keys = ['AP' + str(i).zfill(2) for i in range(1, 21)]

# Arbitrarily selected midnight on October 27, 2012 to plot all forecasts at
time = '2012102700'

# Set up Cartopy Projection with land features
ax = plt.axes(projection=ccrs.PlateCarree())
ax.add_feature(cfeature.LAND, facecolor='lightgray')

# Add Gridlines to right and bottom
gl = ax.gridlines(crs=ccrs.PlateCarree(), draw_labels=True,
                  linewidth=.25, color='gray', alpha=0.5, linestyle='--')
gl.xlabels_top = False
gl.ylabels_left = False
gl.xlabel_style = {'size': 8,}
gl.ylabel_style = {'size': 8,}

# Spaghetti Plot of forecasts
for i in range(20):
    ap = forecasts[GFS_keys[i]]
    forecast_path = plt.plot(ap[time]['lon'],
        ap[time]['lat'],
        color='cornflowerblue',
        linewidth=0.5)

# Plot the real storm path in a thicker black line
true_path = plt.plot(sandy_ds.lon, sandy_ds.lat, color='k', linewidth=1)

# Add a legend with only one forecast_path and the true_path
plt.legend([true_path[0], forecast_path[0]],
    ['True Path', 'AP01 - AP20'],
    loc='lower right')

plt.title('Hurricane Sandy - GFS Forecasts from Oct-27-2012');

../_images/bf520ce00098942d5c0c47ce93db3b6b01a95914f734885c36932e1da6623fb4.png

Hurricane Sandy hit the Northeast on October 29, 2012. From this spaghetti plot we can see that by the 27th most ensemble members of the GFS forecast predicted a similar behavior for the storm.

There is more analysis that could be done on hurriane trajectories. We have covered some plotting customization that might be useful for your analysis and data visualization.

Spaghetti Contour Plot

In this example we will read in the geopotential height datafile HGT500_MON_1958-1997.nc from using geocat-datafiles. Then we will look at different timesteps of the HGT geopotential height variable at the 5500 gpm level, plotting this contour’s locations through time. This example is adapted from GeoCAT’s NCL_conOncon_5 script.

Read in data:

ds = xr.open_dataset(gdf.get("netcdf_files/HGT500_MON_1958-1997.nc"),
                     decode_times=False)

ds

Downloading file 'netcdf_files/HGT500_MON_1958-1997.nc' from 'https://github.com/NCAR/GeoCAT-datafiles/raw/main/netcdf_files/HGT500_MON_1958-1997.nc' to '/home/runner/.cache/geocat'.

<xarray.Dataset> Size: 20MB
Dimensions:  (time: 480, lat: 73, lon: 144)
Coordinates:
  * time     (time) int64 4kB 0 1 2 3 4 5 6 7 ... 473 474 475 476 477 478 479
  * lat      (lat) float32 292B -90.0 -87.5 -85.0 -82.5 ... 82.5 85.0 87.5 90.0
  * lon      (lon) float32 576B 0.0 2.5 5.0 7.5 10.0 ... 350.0 352.5 355.0 357.5
Data variables:
    yrmon    (time) float64 4kB ...
    HGT      (time, lat, lon) float32 20MB ...
Attributes:
    conventions:    None
    history:        NCEP/NCAR REANALYSIS MONTHLY MEAN SUBSETS\nftp://ncardata...
    source:         NCEP Reanalysis; ds090.2
    title:          500mb Geopotential Height: 1958-1997
    source_mss:     /SHEA/HVL/HGT_1958-1997.nc:  500 mb extracted
    creation_date:  creation date: Tue Aug  7 16:31:48 MDT 2001

xarray.Dataset

Dimensions:
- time: 480
- lat: 73
- lon: 144

Coordinates: (3)

time
(time)
int64
0 1 2 3 4 5 ... 475 476 477 478 479
long_name :
elapsed time
units :
months since 1958-1-1 00:00:00
```
array([  0,   1,   2, ..., 477, 478, 479], shape=(480,))
```

lat

(lat)

float32

-90.0 -87.5 -85.0 ... 87.5 90.0

short_name :: lat
long_name :: latitude
units :: degrees_north

array([-90. , -87.5, -85. , -82.5, -80. , -77.5, -75. , -72.5, -70. , -67.5,
       -65. , -62.5, -60. , -57.5, -55. , -52.5, -50. , -47.5, -45. , -42.5,
       -40. , -37.5, -35. , -32.5, -30. , -27.5, -25. , -22.5, -20. , -17.5,
       -15. , -12.5, -10. ,  -7.5,  -5. ,  -2.5,   0. ,   2.5,   5. ,   7.5,
        10. ,  12.5,  15. ,  17.5,  20. ,  22.5,  25. ,  27.5,  30. ,  32.5,
        35. ,  37.5,  40. ,  42.5,  45. ,  47.5,  50. ,  52.5,  55. ,  57.5,
        60. ,  62.5,  65. ,  67.5,  70. ,  72.5,  75. ,  77.5,  80. ,  82.5,
        85. ,  87.5,  90. ], dtype=float32)

lon

(lon)

float32

0.0 2.5 5.0 ... 352.5 355.0 357.5

short_name :: lon
long_name :: longitude
units :: degrees_east

array([  0. ,   2.5,   5. ,   7.5,  10. ,  12.5,  15. ,  17.5,  20. ,  22.5,
        25. ,  27.5,  30. ,  32.5,  35. ,  37.5,  40. ,  42.5,  45. ,  47.5,
        50. ,  52.5,  55. ,  57.5,  60. ,  62.5,  65. ,  67.5,  70. ,  72.5,
        75. ,  77.5,  80. ,  82.5,  85. ,  87.5,  90. ,  92.5,  95. ,  97.5,
       100. , 102.5, 105. , 107.5, 110. , 112.5, 115. , 117.5, 120. , 122.5,
       125. , 127.5, 130. , 132.5, 135. , 137.5, 140. , 142.5, 145. , 147.5,
       150. , 152.5, 155. , 157.5, 160. , 162.5, 165. , 167.5, 170. , 172.5,
       175. , 177.5, 180. , 182.5, 185. , 187.5, 190. , 192.5, 195. , 197.5,
       200. , 202.5, 205. , 207.5, 210. , 212.5, 215. , 217.5, 220. , 222.5,
       225. , 227.5, 230. , 232.5, 235. , 237.5, 240. , 242.5, 245. , 247.5,
       250. , 252.5, 255. , 257.5, 260. , 262.5, 265. , 267.5, 270. , 272.5,
       275. , 277.5, 280. , 282.5, 285. , 287.5, 290. , 292.5, 295. , 297.5,
       300. , 302.5, 305. , 307.5, 310. , 312.5, 315. , 317.5, 320. , 322.5,
       325. , 327.5, 330. , 332.5, 335. , 337.5, 340. , 342.5, 345. , 347.5,
       350. , 352.5, 355. , 357.5], dtype=float32)

Data variables: (2)
- yrmon
  (time)
  float64
  ...
  short_name :
  YYYYMM
  long_name :
  Year-Month
  units :
  current date as 6 digit integer (YYYYMM)
```
[480 values with dtype=float64]
```
- HGT
  (time, lat, lon)
  float32
  ...
  lev :
  500.0
  short_name :
  HGT
  long_name :
  Geopotential Height
  units :
  gpm
```
[5045760 values with dtype=float32]
```

Indexes: (3)

time

PandasIndex

PandasIndex(Index([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,
       ...
       470, 471, 472, 473, 474, 475, 476, 477, 478, 479],
      dtype='int64', name='time', length=480))

lat

PandasIndex

PandasIndex(Index([-90.0, -87.5, -85.0, -82.5, -80.0, -77.5, -75.0, -72.5, -70.0, -67.5,
       -65.0, -62.5, -60.0, -57.5, -55.0, -52.5, -50.0, -47.5, -45.0, -42.5,
       -40.0, -37.5, -35.0, -32.5, -30.0, -27.5, -25.0, -22.5, -20.0, -17.5,
       -15.0, -12.5, -10.0,  -7.5,  -5.0,  -2.5,   0.0,   2.5,   5.0,   7.5,
        10.0,  12.5,  15.0,  17.5,  20.0,  22.5,  25.0,  27.5,  30.0,  32.5,
        35.0,  37.5,  40.0,  42.5,  45.0,  47.5,  50.0,  52.5,  55.0,  57.5,
        60.0,  62.5,  65.0,  67.5,  70.0,  72.5,  75.0,  77.5,  80.0,  82.5,
        85.0,  87.5,  90.0],
      dtype='float32', name='lat'))

lon

PandasIndex

PandasIndex(Index([  0.0,   2.5,   5.0,   7.5,  10.0,  12.5,  15.0,  17.5,  20.0,  22.5,
       ...
       335.0, 337.5, 340.0, 342.5, 345.0, 347.5, 350.0, 352.5, 355.0, 357.5],
      dtype='float32', name='lon', length=144))

Attributes: (6)
conventions :
None
history :
NCEP/NCAR REANALYSIS MONTHLY MEAN SUBSETS ftp://ncardata.ucar.edu/pub/reanalysis/monthly/presa/
source :
NCEP Reanalysis; ds090.2
title :
500mb Geopotential Height: 1958-1997
source_mss :
/SHEA/HVL/HGT_1958-1997.nc: 500 mb extracted
creation_date :
creation date: Tue Aug 7 16:31:48 MDT 2001

Initial Spaghetti Plot on North Polar Stereographic Projection

Again, first let’s set up our Cartopy axes. This time setting our projection to NorthPolarStereo.

# Set up Cartopy Map Projection
fig = plt.figure(figsize=(8, 8))
ax = plt.axes(projection=ccrs.NorthPolarStereo())

gv.set_map_boundary(ax, [-180, 180], [0, 40], south_pad=1)
ax.add_feature(cfeature.LAND, facecolor='lightgray')

# Set draw_labels to False so that we can manually manipulate it
gl = ax.gridlines(ccrs.PlateCarree(),
                  draw_labels=False,
                  linestyle="--",
                  linewidth=1,
                  color='darkgray',
                  zorder=2)

../_images/edaf31b4e6085910de719fb76c7d8a53cd42a2835ec57cbb7aed5b875d5f7265.png

Then let’s add our data to this plot.

We will iterate through every 12th timestep

handling any artifacts of the global wrapping at 0 or 360 degrees with gv.xr_add_cyclic_longitudes(p, "lon")
and calling a contour plot on a single level.

# Set up Cartopy Map Projection
fig = plt.figure(figsize=(8, 8))
ax = plt.axes(projection=ccrs.NorthPolarStereo())

gv.set_map_boundary(ax, [-180, 180], [0, 40], south_pad=1)
ax.add_feature(cfeature.LAND, facecolor='lightgray')

# Set draw_labels to False so that we can manually manipulate it
gl = ax.gridlines(ccrs.PlateCarree(),
                  draw_labels=False,
                  linestyle="--",
                  linewidth=1,
                  color='darkgray',
                  zorder=2)

# Iterate through the 19 timesteps, plotting the data
n = 19
for x in range(n):

    # Get a slice of data at the 12*x timestep
    p = ds.HGT.isel(time=12*x)

    # Use geocat-viz utility function to handle the no-shown-data artifact
    # of 0 and 360-degree longitudes
    slon = gv.xr_add_cyclic_longitudes(p, "lon")

    # Plot contour data at pressure level 5500 for the 12*x timestep
    p = slon.plot.contour(ax=ax,
                          transform=ccrs.PlateCarree(),
                          linewidths=0.5,
                          levels=[5500],
                          colors='blue',
                          add_labels=False)

../_images/6201cad0ee5dedf722bb75b0a4b16e789b28177e43553ee576803504e4aa0d7c.png

Adding Directional Labels to Polar Stereographic Projection

Adding labels to a map projection that aren’t lat/lon coordinates is less than intuitive. In this example we manually add labels and select their locations so that you can see NESW labels.

For this we use mticker.FixedLocator() to manipulare gridline spacing, manipulate East and West tick labels separately, and specify tick locations with ax.text(transform=ccrs.Geodetic()).

# Generate a figure
fig = plt.figure(figsize=(8, 8))

# Create an axis with a polar stereographic projection
ax = plt.axes(projection=ccrs.NorthPolarStereo())

# Add land feature to map
ax.add_feature(cfeature.LAND, facecolor='lightgray')

# Set map boundary to include latitudes between 0 and 40 and longitudes
# between -180 and 180 only
gv.set_map_boundary(ax, [-180, 180], [0, 40], south_pad=1)

# Set draw_labels to False so that you can manually manipulate it later
gl = ax.gridlines(ccrs.PlateCarree(),
                  draw_labels=False,
                  linestyle="--",
                  linewidth=1,
                  color='darkgray',
                  zorder=2)

# Manipulate latitude and longitude gridline numbers and spacing
gl.ylocator = mticker.FixedLocator(np.arange(0, 90, 15))
gl.xlocator = mticker.FixedLocator(np.arange(-180, 180, 30))

# Manipulate longitude labels (0, 30 E, 60 E, ..., 30 W, etc.)
ticks = np.arange(0, 210, 30)
etick = ['0'] + [
    r'%dE' % tick for tick in ticks if (tick != 0) & (tick != 180)
] + ['180']
wtick = [r'%dW' % tick for tick in ticks if (tick != 0) & (tick != 180)]
labels = etick + wtick
xticks = np.arange(0, 360, 30)
yticks = np.full_like(xticks, -5)  # Latitude where the labels will be drawn
for xtick, ytick, label in zip(xticks, yticks, labels):
    if label == '180':
        ax.text(xtick,
                ytick,
                label,
                fontsize=12,
                horizontalalignment='center',
                verticalalignment='top',
                transform=ccrs.Geodetic())
    elif label == '0':
        ax.text(xtick,
                ytick,
                label,
                fontsize=12,
                horizontalalignment='center',
                verticalalignment='bottom',
                transform=ccrs.Geodetic())
    else:
        ax.text(xtick,
                ytick,
                label,
                fontsize=12,
                horizontalalignment='center',
                verticalalignment='center',
                transform=ccrs.Geodetic())

# Iterate through 18 different timesteps
for x in range(19):

    # Get a slice of data at the 12*x+1 timestep
    p = ds.HGT.isel(time=12 * x + 1)

    # Use geocat-viz utility function to handle the no-shown-data artifact
    # of 0 and 360-degree longitudes
    slon = gv.xr_add_cyclic_longitudes(p, "lon")

    # Plot contour data at pressure level 5500 for the 12*x+1 timestep
    p = slon.plot.contour(ax=ax,
                          transform=ccrs.PlateCarree(),
                          linewidths=0.5,
                          levels=[5500],
                          colors='blue',
                          add_labels=False)

# Use geocat.viz.util convenience function to add titles
gv.set_titles_and_labels(ax,
                         maintitle=r"$\bf{Spaghetti}$" + " " + r"$\bf{Plot}$",
                         lefttitle=slon.long_name,
                         righttitle='5500 '+slon.units)

# Make tight layout
plt.tight_layout()

../_images/4c784f7d6b2c7b6bab0267f4b24212df38501b7fad055daaffe585f24c8678db.png

Now in this example, there isn’t necessarily a temporal progression of geopotential height, but to be sure let’s add a colormap component to each of our loops.

This is also useful because for your data visualization application there might be, and the commands are slightly different for a contour plot as for a line plot in the above example.

Contour Spaghetti Plot Temporal Colorbar Manipulation

Let’s update add a discrete colorbar that has yearly ticklabels. One challenge addressed in this example is setting the ticklabels to be in the center of each discrete color box.

New code lines here are:

creating a discrete colormap with plt.get_cmap('winter_r', n) and color bounds with np.linspace()
specifying the color in each contour call with colors=[cmap(bounds)[x]]
adjusting the time unit for the colorbar ticks
adding a colorbar for the normalized colormap, calling the orientation, shrink, and pad keyword arguments to make it display well
setting colorbar tick location to be at color midpoints with cbar.set_ticks(), yet forcing their labels to be years (not year midpoints) with cbar.set_ticklabels()

# Set up Cartopy Map Projection
fig = plt.figure(figsize=(8, 8))
ax = plt.axes(projection=ccrs.NorthPolarStereo())

gv.set_map_boundary(ax, [-180, 180], [0, 40], south_pad=1)
ax.add_feature(cfeature.LAND, facecolor='lightgray')

# Set draw_labels to False so that we can manually manipulate it
gl = ax.gridlines(ccrs.PlateCarree(),
                  draw_labels=False,
                  linestyle="--",
                  linewidth=1,
                  color='darkgray',
                  zorder=2)

# Manipulate latitude and longitude gridline numbers and spacing
gl.ylocator = mticker.FixedLocator(np.arange(0, 90, 15))
gl.xlocator = mticker.FixedLocator(np.arange(-180, 180, 30))

# Manipulate longitude labels (0, 30 E, 60 E, ..., 30 W, etc.)
ticks = np.arange(0, 210, 30)
etick = ['0'] + [
    r'%dE' % tick for tick in ticks if (tick != 0) & (tick != 180)
] + ['180']
wtick = [r'%dW' % tick for tick in ticks if (tick != 0) & (tick != 180)]
labels = etick + wtick
xticks = np.arange(0, 360, 30)
yticks = np.full_like(xticks, -5)  # Latitude where the labels will be drawn
for xtick, ytick, label in zip(xticks, yticks, labels):
    if label == '180':
        ax.text(xtick,
                ytick,
                label,
                fontsize=12,
                horizontalalignment='center',
                verticalalignment='top',
                transform=ccrs.Geodetic())
    elif label == '0':
        ax.text(xtick,
                ytick,
                label,
                fontsize=12,
                horizontalalignment='center',
                verticalalignment='bottom',
                transform=ccrs.Geodetic())
    else:
        ax.text(xtick,
                ytick,
                label,
                fontsize=12,
                horizontalalignment='center',
                verticalalignment='center',
                transform=ccrs.Geodetic())

# Create a color list for each of the 19 contours
n = 19
cmap = plt.get_cmap('winter_r', n) # the `, n` makes the colormap display discretized
bounds = np.linspace(0, 1, n)

# Iterate through the timesteps
for x in range(n):

    # Get a slice of data at the 12*x timestep
    p = ds.HGT.isel(time=12*x)

    # Handle wrapping artifacts
    slon = gv.xr_add_cyclic_longitudes(p, "lon")

    # Plot contour data at pressure level 5500 for the 12*x timestep
    p = slon.plot.contour(ax=ax,
                          transform=ccrs.PlateCarree(),
                          linewidths=0.5,
                          levels=[5500],
                          colors=[cmap(bounds)[x]], # set colors to use our new cmap
                          add_labels=False)

# Add a colorbar
# The default time unit is in months since 1958, years is more intuitive
year_0 = 1958
year_n = (ds.time.isel(time=12*n) / 12).astype(int) + year_0

norm = plt.Normalize(vmin=year_0, vmax=year_n)
cbar = plt.colorbar(plt.cm.ScalarMappable(cmap=cmap, norm=norm),
    ax=ax,
    orientation='vertical',
    shrink=0.8, # Shrink to the approximate size of the map
    pad = 0.1) # Pad so colorbar doesn't overlap with directional label

cbar.set_ticks(np.arange(year_0+0.5, year_n)) # Set tick locations to be at color midpoints
cbar.set_ticklabels(np.arange(year_0, year_n)) # Set tick labels to be years (despite their location value being year + 0.5)
cbar.set_label('Time (years)')

# Use geocat.viz.util convenience function to add titles
gv.set_titles_and_labels(ax,
                         maintitle=r"$\bf{Spaghetti}$" + " " + r"$\bf{Plot}$",
                         lefttitle=slon.long_name,
                         righttitle='5500 '+slon.units)

# Make tight layout
plt.tight_layout();

../_images/85442bd4a912fecb69dcd0827795c5582f117a8068d637b5d1dc4ce29b782a8b.png

Contour Spaghetti Plot with Hand-Picked Colors

If you want your plot to be visually appealing it might be worth selecting different colors for each contour plot in the for-loop, however these do not have to be sequentially ordered or time-aware. It is actually simplest to hand-pick colors for each loop. In this iteration of the plot we hand pick colors in a colorlist and plot the first time step on its own to demonstrate plotting one loop unlike the others.

# Generate a figure
fig = plt.figure(figsize=(8, 8))

# Create an axis with a polar stereographic projection
ax = plt.axes(projection=ccrs.NorthPolarStereo())

# Add land feature to map
ax.add_feature(cfeature.LAND, facecolor='lightgray')

# Set map boundary to include latitudes between 0 and 40 and longitudes
# between -180 and 180 only
gv.set_map_boundary(ax, [-180, 180], [0, 40], south_pad=1)

# Set draw_labels to False so that you can manually manipulate it later
gl = ax.gridlines(ccrs.PlateCarree(),
                  draw_labels=False,
                  linestyle="--",
                  linewidth=1,
                  color='darkgray',
                  zorder=2)

# Manipulate latitude and longitude gridline numbers and spacing
gl.ylocator = mticker.FixedLocator(np.arange(0, 90, 15))
gl.xlocator = mticker.FixedLocator(np.arange(-180, 180, 30))

# Manipulate longitude labels (0, 30 E, 60 E, ..., 30 W, etc.)
ticks = np.arange(0, 210, 30)
etick = ['0'] + [
    r'%dE' % tick for tick in ticks if (tick != 0) & (tick != 180)
] + ['180']
wtick = [r'%dW' % tick for tick in ticks if (tick != 0) & (tick != 180)]
labels = etick + wtick
xticks = np.arange(0, 360, 30)
yticks = np.full_like(xticks, -5)  # Latitude where the labels will be drawn
for xtick, ytick, label in zip(xticks, yticks, labels):
    if label == '180':
        ax.text(xtick,
                ytick,
                label,
                fontsize=12,
                horizontalalignment='center',
                verticalalignment='top',
                transform=ccrs.Geodetic())
    elif label == '0':
        ax.text(xtick,
                ytick,
                label,
                fontsize=12,
                horizontalalignment='center',
                verticalalignment='bottom',
                transform=ccrs.Geodetic())
    else:
        ax.text(xtick,
                ytick,
                label,
                fontsize=12,
                horizontalalignment='center',
                verticalalignment='center',
                transform=ccrs.Geodetic())

# Get slice of data at the 0th timestep - plot this contour line separately
# because it will be thicker than the other contour lines
p = ds.HGT.isel(time=0)

# Use geocat-viz utility function to handle the no-shown-data
# artifact of 0 and 360-degree longitudes
slon = gv.xr_add_cyclic_longitudes(p, "lon")

# Plot contour data at pressure level 5500 at the first timestep
p = slon.plot.contour(ax=ax,
                      transform=ccrs.PlateCarree(),
                      linewidths=1.5,
                      levels=[5500],
                      colors='black',
                      add_labels=False)

# Create a color list for each of the next 18 contours
colorlist = [
    "crimson", "green", "blue", "yellow", "cyan", "hotpink", "crimson",
    "skyblue", "navy", "lightyellow", "mediumorchid", "orange", "slateblue",
    "palegreen", "magenta", "springgreen", "pink", "forestgreen", "violet"
]

# Iterate through 18 different timesteps
for x in range(18):

    # Get a slice of data at the 12*x+1 timestep
    p = ds.HGT.isel(time=12 * x + 1)

    # Use geocat-viz utility function to handle the no-shown-data artifact
    # of 0 and 360-degree longitudes
    slon = gv.xr_add_cyclic_longitudes(p, "lon")

    # Plot contour data at pressure level 5500 for the 12*x+1 timestep
    p = slon.plot.contour(ax=ax,
                          transform=ccrs.PlateCarree(),
                          linewidths=0.5,
                          levels=[5500],
                          colors=colorlist[x],
                          add_labels=False)

# Use geocat.viz.util convenience function to add titles
gv.set_titles_and_labels(ax,
                         maintitle=r"$\bf{Spaghetti}$" + " " + r"$\bf{Plot}$",
                         lefttitle=slon.long_name,
                         righttitle='5500 '+slon.units)

# Make tight layout
plt.tight_layout()

../_images/e8e0a93702d6dc1db7d1add062b3bfa39ea2be8f38a975024e2a4a48d09fcf69.png

Summary

Spaghetti plots are many lines drawn on the same figure. They have pros and cons. They are visually stunning but can be confusing, so it is important to make sure your data visualization conveys accurate information either by manipulating color or linewidth. Since the manipulation of spaghetti plots have their own considerations, this chapter shows several design choices that you can use to jumpstart your visualization needs.

What’s next?

Next up let’s discuss Animation.

Resources and references

Tropycal documentation
GeoCat-examples visualization gallery
GeoCAT-viz documentation