Taylor Diagrams

Overview

Taylor diagrams are a visual way of summarizing how one or more datasets compare statistically with a common reference dataset (typically climate observations). Taylor diagrams are radial plots. The distance from the origin is the normalized standard deviation of your dataset (its standard deviation divided by the standard deviation of the reference or observational dataset), and the angle is set by the correlation coefficient between your dataset and the reference: the cosine of the angle equals the correlation.

Taylor diagrams are popular for displaying climatological data because the normalization of variances helps account for the widely varying numerical values of geoscientific variables such as temperature or precipitation.
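
The geometry described above can be sketched in a few lines. This is a minimal illustration of the standard Taylor diagram layout, not of how geocat-viz draws it internally; the helper name taylor_point is hypothetical.

```python
import math


def taylor_point(norm_std, corr):
    """Return Cartesian (x, y) for a Taylor diagram point.

    The radius is the normalized standard deviation and the angle from
    the horizontal axis is arccos(correlation).
    """
    theta = math.acos(corr)  # angle in radians
    return norm_std * math.cos(theta), norm_std * math.sin(theta)


# A point with normalized standard deviation 0.6 and correlation 0.24
x, y = taylor_point(0.6, 0.24)
print(f"x = {x:.3f}, y = {y:.3f}")

# The reference dataset sits on the horizontal axis at radius 1
ref_x, ref_y = taylor_point(1.0, 1.0)
```

A dataset that matches the reference perfectly has a normalized standard deviation of 1 and a correlation of 1, so it lands on the reference point itself.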

This notebook explores how to create and customize Taylor diagrams using geocat-viz. For more information, see the documentation for geocat-viz.TaylorDiagram.

Creating a Simple Taylor Diagram
Necessary Statistical Analysis
Plotting Different Ensemble Members
Plotting Multiple Models
Plotting Multiple Variables
Plotting Bias
Variants

Prerequisites

Concepts	Importance	Notes
Matplotlib	Necessary

Time to learn: 10 minutes

import matplotlib.pyplot as plt
import numpy as np
import xarray as xr

import cftime

import geocat.viz as gv
import geocat.datafiles as gdf

Creating a Simple Taylor Diagram

Before getting into the data computation necessary to create a Taylor diagram, let’s demonstrate how to make the simplest Taylor Diagram plot. Here we are using sample data with a normalized standard deviation of 0.6 and a correlation coefficient of 0.24.

# Create figure and Taylor Diagram instance
fig = plt.figure(figsize=(12, 12))
taylor = gv.TaylorDiagram(fig=fig, label='REF')

# Draw diagonal dashed lines from origin to correlation values
# Also enforces proper X-Y ratio
taylor.add_xgrid(np.array([0.6, 0.9]))

# Add a model dataset of one point
taylor.add_model_set(stddev=[.6], corrcoef=[.24]);

plt.title("Simple Taylor Diagram", size=26, pad=45); # Need to move title up

[Figure: Taylor diagram titled "Simple Taylor Diagram"]

Necessary Statistical Analysis

To make a Taylor diagram more meaningful and intuitive, let’s use some real data. Here we use ERA5 reanalysis data as our observational dataset, and CMIP5 temperature data from various representative concentration pathways (RCPs) and ensemble members as our model data.

Because these datasets can be so large, some pre-processing has already been applied to the datasets used in this example.

ERA5 and CMIP5 data have been spatially averaged (removing latitudinal and longitudinal dimensions).
ERA5 and CMIP5 data have been indexed to only include the year 2022.
All ensembles from a given CMIP5 RCP model have been combined into one dataset.
Temperature and pressure variables from ERA5 have been combined into one dataset.

Perform the statistical calculations

Compute the standard deviation of both our ERA5 observed temperature data and our CMIP5 RCP8.5 modeled temperature.
Find the correlation coefficient between them.
Divide the model standard deviation by the observed standard deviation to normalize it, so that a value of 1 means the model’s variability matches the observations.

In the next cell we will perform this calculation for all ensemble members.

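# era5_temp (ERA5 temperature DataArray) and tas_rcp85 (CMIP5 RCP8.5 Dataset,
# one data variable per ensemble member) are assumed to be loaded already
temp_rcp85_std = []
temp_rcp85_corr = []

# Standard deviation of the observations, used to normalize each member
std_temp_obsv = float(era5_temp.std().values)

for em in list(tas_rcp85.data_vars): # for each ensemble member
    std = float(tas_rcp85[em].std().values)
    std_norm = std / std_temp_obsv

    corr = float(xr.corr(era5_temp, tas_rcp85[em]).values)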

    temp_rcp85_std.append(std_norm)
    temp_rcp85_corr.append(corr)
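
To see what these two statistics look like on data where the answer is known in advance, here is a minimal sketch using NumPy and synthetic monthly series. The arrays obs and model are made up for illustration and stand in for the observed and modeled temperatures; they are not ERA5 or CMIP5 values.

```python
import numpy as np

# Synthetic monthly series (illustrative values only)
months = np.arange(12)
obs = 10.0 * np.sin(2 * np.pi * months / 12)            # "observations"
model = 12.0 * np.sin(2 * np.pi * (months - 1) / 12)    # larger amplitude, one-month lag

# Normalized standard deviation: model variability relative to observations
std_norm = model.std() / obs.std()

# Pearson correlation coefficient between the two series
corr = np.corrcoef(obs, model)[0, 1]

print(f"normalized std = {std_norm:.3f}, correlation = {corr:.3f}")
```

Here the model amplitude is 20% larger than the observed one, so the normalized standard deviation is 1.2, and the one-month lag in a twelve-month cycle gives a correlation equal to the cosine of 30 degrees, roughly 0.866.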

Plotting Different Ensemble Members

One application of a Taylor diagram is to plot the same variable from different ensemble members of the same climate model.

This Taylor diagram differs from our simple example in that we’ve specified more keyword arguments in our taylor.add_model_set() call, specifying how we want our dots to be drawn. We’ve also added a legend of ensemble members with taylor.add_model_name().

# Create figure and Taylor Diagram instance
fig = plt.figure(figsize=(12, 12))
taylor = gv.TaylorDiagram(fig=fig, label='REF')
ax = plt.gca()

# Draw diagonal dashed lines from origin to correlation values
# Also enforces proper X-Y ratio
taylor.add_xgrid(np.array([0.6, 0.9]))

# Add the model set for the temperature ensemble members
taylor.add_model_set(
    temp_rcp85_std,
    temp_rcp85_corr,
    fontsize=20,  # specify font size
    xytext=(-5, 10),  # marker label location, in pixels
    color='red', # specify marker color
    marker='o', # specify marker shape
    facecolors='none', # specify marker fill
    s=100)  # marker size

# Add legend of ensemble names
namearr = list(tas_rcp85.data_vars)
taylor.add_model_name(namearr, fontsize=16)

# Add figure title
plt.title("RCP85 Temperature", size=26, pad=45);

[Figure: Taylor diagram titled "RCP85 Temperature"]

Plotting Multiple Models

Another potential use case for a Taylor diagram is to plot multiple models. Here we compare RCP2.6, RCP4.5, and RCP8.5 together.

Because it isn’t meaningful to compare ensemble members across model runs (the nature of the perturbations isn’t reliably similar across RCPs or labs), we will look at the first ensemble member, r1i1p1, for all models. For your own analysis, averaging across ensemble members may be more meaningful, but we’ll keep it simple for this plotting example.

Of course, you could still choose to display more information on one graph, but there is no real connection between the first ensemble member of one model and that of another.
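
Averaging across ensemble members, mentioned above as an alternative to picking one member, could look roughly like the following. This is a sketch on synthetic NumPy arrays; the member values are invented, and only the member-style names mirror the naming used in this notebook.

```python
import numpy as np

months = np.arange(12)
obs = 10.0 * np.sin(2 * np.pi * months / 12)  # synthetic reference series

# Hypothetical ensemble members (synthetic values, not CMIP5 output)
members = {
    "r1i1p1": obs + 0.5,
    "r2i1p1": obs - 0.5,
    "r3i1p1": 1.2 * obs,
}

# Average across members at each time step
ensemble_mean = np.mean(np.stack(list(members.values())), axis=0)

# Statistics of the ensemble mean against the reference
mean_std_norm = ensemble_mean.std() / obs.std()
mean_corr = np.corrcoef(obs, ensemble_mean)[0, 1]
print(f"normalized std = {mean_std_norm:.3f}, correlation = {mean_corr:.3f}")
```

The ensemble mean then contributes a single point per model, which sidesteps the question of which member to compare with which.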

In this example, we’ll also add another layer of complexity to our Taylor diagram with contour lines of constant centered root-mean-square (RMS) difference.

# Open RCP26 and RCP45 files
tas_rcp26 = xr.open_dataset(gdf.get('netcdf_files/tas_Amon_CanESM2_rcp26_2022_xyav.nc'))
tas_rcp26['time'] = tas_rcp26.indexes['time'].to_datetimeindex()

tas_rcp45 = xr.open_dataset(gdf.get('netcdf_files/tas_Amon_CanESM2_rcp45_2022_xyav.nc'))
tas_rcp45['time'] = tas_rcp45.indexes['time'].to_datetimeindex()

# Perform statistical analysis to create our standard deviation and correlation coefficient lists
temp_rcp26_std = float(tas_rcp26['r1i1p1'].std().values) 
temp_rcp26_std_norm = temp_rcp26_std / std_temp_obsv
temp_rcp26_corr = float(xr.corr(era5_temp, tas_rcp26['r1i1p1']).values)

temp_rcp45_std = float(tas_rcp45['r1i1p1'].std().values)
temp_rcp45_std_norm = temp_rcp45_std / std_temp_obsv
temp_rcp45_corr = float(xr.corr(era5_temp, tas_rcp45['r1i1p1']).values)

temp_std = [temp_rcp26_std_norm, temp_rcp45_std_norm, temp_rcp85_std[0]]
temp_corr = [temp_rcp26_corr, temp_rcp45_corr, temp_rcp85_corr[0]]

# Create figure and Taylor Diagram instance
fig = plt.figure(figsize=(12, 12))
taylor = gv.TaylorDiagram(fig=fig, label='REF')
ax = plt.gca()

# Draw diagonal dashed lines from origin to correlation values
# Also enforces proper X-Y ratio
taylor.add_xgrid(np.array([0.6, 0.9]))

# Add model set for temp dataset
taylor.add_model_set(
    temp_std,
    temp_corr,
    fontsize=20,
    xytext=(-5, 10),  # marker label location, in pixels
    color='red',
    marker='o',
    facecolors='none',
    s=100)  # marker size

#gv.util.set_axes_limits_and_ticks(ax, xlim=[0,2])

namearr = ['rcp26', 'rcp45', 'rcp85']
taylor.add_model_name(namearr, fontsize=16)

# Add figure title
plt.title("CMIP5 Temperature - First Ensemble Member", size=26, pad=45)

# Add constant centered RMS difference contours.
taylor.add_contours(levels=np.arange(0, 1.1, 0.25),
                 colors='lightgrey',
                 linewidths=0.5);

[Figure: Taylor diagram titled "CMIP5 Temperature - First Ensemble Member"]

Based on these three RCPs, RCP8.5 appears to correlate most strongly with the observations, while RCP2.6 has the standard deviation closest to the observed value. Scientific interpretations will vary with the data you select.
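
The contours in the plot combine both statistics into one distance from the reference point. On a diagram normalized by the reference standard deviation, the law of cosines gives the centered RMS difference as sqrt(1 + s^2 - 2*s*r), where s is the normalized standard deviation and r the correlation. Below is a minimal sketch of that relation; the function name and the two sample points are hypothetical and not taken from the data above.

```python
import math


def centered_rms_difference(norm_std, corr):
    """Normalized centered RMS difference from the reference point."""
    squared = 1.0 + norm_std**2 - 2.0 * norm_std * corr
    return math.sqrt(max(squared, 0.0))  # guard against tiny negative round-off


# Hypothetical points: (normalized standard deviation, correlation)
points = {"model_x": (0.8, 0.95), "model_y": (1.0, 0.80)}

for name, (s, r) in points.items():
    print(f"{name}: centered RMS difference = {centered_rms_difference(s, r):.3f}")
```

In this made-up case, model_x lies closer to the reference even though model_y matches the reference standard deviation exactly, which is the kind of trade-off the contours make visible.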

Plotting Multiple Variables

A Taylor diagram can support multiple model sets; you simply call taylor.add_model_set() once per set. By passing the label kwarg and calling taylor.add_legend(), you can add a legend that distinguishes between the sets.

Since we’ve already demonstrated the statistical analysis needed to create Taylor diagrams, the following example uses sample data.

Here we make sample data for 7 common climate model variables, for two different models.

# Create sample data

# Model A
a_sdev = [1.230, 0.988, 1.092, 1.172, 1.064, 0.966, 1.079]  # normalized standard deviation
a_ccorr = [0.958, 0.973, 0.740, 0.743, 0.922, 0.982, 0.952]  # correlation coefficient

# Model B
b_sdev = [1.129, 0.996, 1.016, 1.134, 1.023, 0.962, 1.048]  # normalized standard deviation
b_ccorr = [0.963, 0.975, 0.801, 0.814, 0.946, 0.984, 0.968]  # correlation coefficient

# Sample Variable List
var_list = ['Surface Pressure', '2m Temp', 'Dew Point Temp', 'U Wind', 'V Wind', 'Precip', 'Cloud Cov']

# Create figure and TaylorDiagram instance
fig = plt.figure(figsize=(10, 10))
taylor = gv.TaylorDiagram(fig=fig, label='REF')

# Draw diagonal dashed lines from origin to correlation values
# Also enforces proper X-Y ratio
taylor.add_xgrid(np.array([0.6, 0.9]))

# Add models to Taylor diagram
taylor.add_model_set(a_sdev,
                  a_ccorr,
                  color='red',
                  marker='o',
                  label='Model A', # add model set legend label
                  fontsize=16)

taylor.add_model_set(b_sdev,
                  b_ccorr,
                  color='blue',
                  marker='o',
                  label='Model B',
                  fontsize=16)

# Add model name
taylor.add_model_name(var_list, fontsize=16)

# Add figure legend
taylor.add_legend(fontsize=16)

# Add constant centered RMS difference contours.
taylor.add_contours(levels=np.arange(0, 1.1, 0.25),
                 colors='lightgrey',
                 linewidths=0.5);

[Figure: Taylor diagram comparing Model A and Model B across the seven sample variables]

Plotting Bias

We can add another layer of information to the Taylor Diagram by changing the marker size and shape depending on a third variable. Most commonly this is done to demonstrate bias, a statistical definition of the difference between the observed and estimated values.

We do this by passing percent_bias_on=True and a bias_array kwarg to the add_model_set() method. Doing so requires removing the marker specification, since it is overridden with up or down arrows of varying sizes. Bias values are in percentages.

Indicate the meaning of these new bias symbols with a third legend with the call add_bias_legend().
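
If you need to derive bias values from your own data, one common convention is the relative difference between the model mean and the observed mean, expressed as a percentage. The sketch below assumes that convention; it is not necessarily the definition used to produce the sample values in this notebook, and the function name and input values are hypothetical.

```python
import numpy as np


def percent_bias(model, obs):
    """Percent bias, assumed here to be the relative difference of the means."""
    return 100.0 * (np.mean(model) - np.mean(obs)) / np.mean(obs)


# Illustrative temperatures in kelvin (made-up values)
obs = np.array([280.0, 285.0, 290.0])
model = obs + 5.7  # model runs uniformly warm

bias = percent_bias(model, obs)
print(f"bias = {bias:.2f}%")
```

A positive value means the model overestimates the reference on average, and a negative value means it underestimates.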

# Sample corresponding bias data.

# Model A
a_bias = [2.7, -1.5, 17.31, -20.11, 12.5, 8.341, -4.7]  # bias (%)

# Model B
b_bias = [1.7, 2.5, -17.31, 20.11, 19.5, 7.341, 9.2]  # bias (%)

# Create figure and TaylorDiagram instance
fig = plt.figure(figsize=(10, 10))
taylor = gv.TaylorDiagram(fig=fig, label='REF')

# Draw diagonal dashed lines from origin to correlation values
# Also enforces proper X-Y ratio
taylor.add_xgrid(np.array([0.6, 0.9]))

# Add models to Taylor diagram
taylor.add_model_set(a_sdev,
                  a_ccorr,
                  percent_bias_on=True, # indicate marker and size to be plotted based on bias_array
                  bias_array=a_bias, # specify bias array
                  color='red',
                  label='Model A',
                  fontsize=16)

taylor.add_model_set(b_sdev,
                  b_ccorr,
                  percent_bias_on=True,
                  bias_array=b_bias,
                  color='blue',
                  label='Model B',
                  fontsize=16)

# Add model name
taylor.add_model_name(var_list, fontsize=16)

# Add figure legend
taylor.add_legend(fontsize=16)

# Add bias legend
taylor.add_bias_legend()

# Add constant centered RMS difference contours.
taylor.add_contours(levels=np.arange(0, 1.1, 0.25),
                 colors='lightgrey',
                 linewidths=0.5);

[Figure: Taylor diagram of Model A and Model B with bias markers and a bias legend]

Variants

Taylor diagrams come in several variants, not all of which are supported yet by GeoCAT-viz (please consider this feature request form). Coming soon:

Supporting display of negative correlations by extending the diagram into a second quadrant to the left.
Supporting automatic annotations connecting related points, say the same variable in two different models, to see how it moves towards truth.

Summary

Taylor Diagrams allow you to display and compare statistical information about several models, variables, ensembles, or other dataset categorizations on a single plot. They are commonly used in climate analysis. With these tools under your belt, you’re ready to include stronger data visualizations in your research.

What’s next?

Next, let’s look at a meteorology specialty plot: Skew T Diagrams.

Resources and references

Karl E. Taylor - “Summarizing multiple aspects of model performance in a single diagram”, AGU 2001
Plotting with GeoCAT Tutorial
NCL Graphics: Taylor Diagrams