Accessing Argo Data¶

Overview¶

Building upon the previous notebook, Introduction to Argo, we next explore how to access Argo data using various methods.

These methods are described in more detail on their respective websites, linked below. Our goal here is to provide a brief overview of some of the different tools available.

GO-BGC Toolbox
Argopy, a dedicated Python package
Argovis for API-based queries

After going through this notebook, you will be able to retrieve Argo data of interest within a certain time frame, geographical location, or by platform identifier. There are many other ways of working with Argo data, so we encourage users to explore what applications work best for their needs. Further information on Argo access can be found on the Argo website.

Prerequisites¶

Concepts	Importance	Notes
Intro to Numpy	Necessary
Intro to NetCDF	Necessary	Familiarity with metadata structure
Intro to Xarray	Necessary

Time to learn: 20 min

Imports¶

# Import packages
import sys
import os
import numpy as np
import pandas as pd
import scipy
import xarray as xr
from datetime import datetime, timedelta

import requests
import time
import urllib3
import shutil

import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
import seaborn as sns
from cmocean import cm as cmo

from argovisHelpers import helpers as avh

1. Downloading with the GO-BGC Toolbox¶

In the previous notebook, Introduction to Argo, we saw how Argo synthetic profile (‘Sprof’) data are stored in NetCDF4 format.

Using the GDAC function allows you to subset and download Sprof files for multiple floats. We recommend this tool for users who only need a few profiles in a specific area of interest. Considerations:

Easy to use and understand
Downloads float data as individual .nc files to your local machine (takes up storage space)
Must download all variables available (cannot subset only variables of interest)

The two major functions below are courtesy of the GO-BGC Toolbox (Ethan Campbell). A full tutorial is available in the Toolbox.

# # Base filepath. Needed for the Argo GDAC function.
# root = '/Users/sangminsong/Library/CloudStorage/OneDrive-UW/Code/2024_Pythia/'
# profile_dir = root + 'SOCCOM_GO-BGC_LoResQC_LIAR_28Aug2023_netcdf/'

# Base filepath. Needed for the Argo GDAC function.
root = '../data/'
profile_dir = root + 'bgc-argo/'
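
Because the download functions below build file paths by string concatenation (save_to + filename), each directory has to exist and end with a path separator. The following is a minimal sketch of one way to guarantee that; the helper name ensure_dir is hypothetical and is not part of the GO-BGC Toolbox, and the demo uses a temporary directory rather than ../data/.

```python
import os
import tempfile

def ensure_dir(path):
    """Create `path` if it does not exist and return it with a trailing separator."""
    os.makedirs(path, exist_ok=True)
    # Joining with an empty string appends the separator only when it is missing
    return os.path.join(path, '')

# Demonstrated on a temporary directory so nothing is written to ../data/
demo_root = ensure_dir(os.path.join(tempfile.mkdtemp(), 'data'))
demo_profile_dir = ensure_dir(os.path.join(demo_root, 'bgc-argo'))
print(demo_profile_dir)
```

In the notebook itself, the same helper could be applied to root and profile_dir before any download is attempted.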

1.0 GO-BGC Toolbox Functions¶

# Function to download a single file (From GO-BGC Toolbox)
def download_file(url_path,filename,save_to=None,overwrite=False,verbose=True):
    """ Downloads and saves a file from a given URL using HTTP protocol.

    Note: If '404 file not found' error returned, function will return without downloading anything.
    
    Arguments:
        url_path: root URL to download from including trailing slash ('/')
        filename: filename to download including suffix
        save_to: None (to download to the directory given by the global variable 'root')
                 or directory path, including trailing slash ('/')
        overwrite: False to leave existing files in place
                   or True to overwrite existing files
        verbose: True to announce progress
                 or False to stay silent
    
    """
    urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)

    if save_to is None:
      save_to = root #profile_dir  # EDITED HERE

    try:
      if filename in os.listdir(save_to):
          if not overwrite:
              if verbose: print('>>> File ' + filename + ' already exists. Leaving current version.')
              return
          else:
              if verbose: print('>>> File ' + filename + ' already exists. Overwriting with new version.')

      def get_func(url,stream=True):
          try:
              return requests.get(url,stream=stream,auth=None,verify=False)
          except requests.exceptions.ConnectionError as error_tag:
              print('Error connecting:',error_tag)
              time.sleep(1)
              return get_func(url,stream=stream)

      response = get_func(url_path + filename,stream=True)

      if response.status_code == 404:
          if verbose: print('>>> File ' + filename + ' returned 404 error during download.')
          return
      with open(save_to + filename,'wb') as out_file:
          shutil.copyfileobj(response.raw,out_file)
      del response
      if verbose: print('>>> Successfully downloaded ' + filename + '.')

    except Exception:
      if verbose: print('>>> An error occurred while trying to download ' + filename + '.')
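
The nested get_func above retries a failed connection by calling itself, with no upper limit on the number of attempts. As a hedged alternative, the sketch below shows a bounded retry loop. The names retry_call and flaky_request are hypothetical, and the simulated failure uses Python's built-in ConnectionError; to retry on errors raised by requests, one would pass retry_on=(requests.exceptions.ConnectionError,) instead.

```python
import time

def retry_call(func, max_attempts=3, wait_seconds=0.0, retry_on=(ConnectionError,)):
    """Call func() until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on as error:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            print('Attempt', attempt, 'failed:', error)
            time.sleep(wait_seconds)

# Stand-in for a flaky request: fails twice, then succeeds
calls = {'count': 0}

def flaky_request():
    calls['count'] += 1
    if calls['count'] < 3:
        raise ConnectionError('simulated connection failure')
    return 'ok'

result = retry_call(flaky_request, max_attempts=5)
print(result, 'after', calls['count'], 'calls')
```

Capping the attempts means a server that stays unreachable produces an error instead of an endless loop.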

# Function to download and parse GDAC synthetic profile index file (GO-BGC Toolbox)
def argo_gdac(lat_range=None,lon_range=None,start_date=None,end_date=None,sensors=None,floats=None,
              overwrite_index=False,overwrite_profiles=False,skip_download=False,
              download_individual_profs=False,save_to=None,verbose=True):
  """ Downloads GDAC Sprof index file, then selects float profiles based on criteria.
      Either returns information on profiles and floats (if skip_download=True) or downloads them (if False).

      Arguments:
          lat_range: None, to select all latitudes
                     or [lower, upper] within -90 to 90 (selection is inclusive)
          lon_range: None, to select all longitudes
                     or [lower, upper] within either -180 to 180 or 0 to 360 (selection is inclusive)
                     NOTE: longitude range is allowed to cross -180/180 or 0/360
          start_date: None or datetime object
          end_date:   None or datetime object
          sensors: None, to select profiles with any combination of sensors
                   or string or list of strings to specify required sensors
                   > note that common options include PRES, TEMP, PSAL, DOXY, CHLA, BBP700,
                                                      PH_IN_SITU_TOTAL, and NITRATE
          floats: None, to select any floats matching other criteria
                  or int or list of ints specifying floats' WMOID numbers
          overwrite_index: False to keep existing downloaded GDAC index file, or True to download new index
          overwrite_profiles: False to keep existing downloaded profile files, or True to download new files
          skip_download: True to skip download and return (wmoids, gdac_index_subset)
                         or False to download those profiles and return
                         (wmoids, gdac_index_subset, downloaded_filenames)
          download_individual_profs: False to download single Sprof file containing all profiles for each float
                                     or True to download individual profile files for each float
          save_to: None to download to the directory given by the global variable 'root'
                   or string to specify directory path for profile downloads
          verbose: True to announce progress, or False to stay silent

  """
  # Paths
  url_root = 'https://www.usgodae.org/ftp/outgoing/argo/'
  dac_url_root = url_root + 'dac/'
  index_filename = 'argo_synthetic-profile_index.txt'
  if save_to is None: save_to = root

  # Download GDAC synthetic profile index file
  download_file(url_root,index_filename,overwrite=overwrite_index)

  # Load index file into Pandas DataFrame
  gdac_index = pd.read_csv(root + index_filename,delimiter=',',header=8,parse_dates=['date','date_update'],
                          date_parser=lambda x: pd.to_datetime(x,format='%Y%m%d%H%M%S'))

  # Establish time and space criteria
  if lat_range is None:  lat_range = [-90.0,90.0]
  if lon_range is None:  lon_range = [-180.0,180.0]
  elif lon_range[0] > 180 or lon_range[1] > 180:
    if lon_range[0] > 180: lon_range[0] -= 360
    if lon_range[1] > 180: lon_range[1] -= 360
  if start_date is None: start_date = datetime(1900,1,1)
  if end_date is None:   end_date = datetime(2200,1,1)

  float_wmoid_regexp = r'[a-z]*/[0-9]*/profiles/[A-Z]*([0-9]*)_[0-9]*[A-Z]*.nc'
  gdac_index['wmoid'] = gdac_index['file'].str.extract(float_wmoid_regexp).astype(int)
  filepath_main_regexp = '([a-z]*/[0-9]*/)profiles/[A-Z]*[0-9]*_[0-9]*[A-Z]*.nc'
  gdac_index['filepath_main'] = gdac_index['file'].str.extract(filepath_main_regexp)
  filepath_regexp = '([a-z]*/[0-9]*/profiles/)[A-Z]*[0-9]*_[0-9]*[A-Z]*.nc'
  gdac_index['filepath'] = gdac_index['file'].str.extract(filepath_regexp)
  filename_regexp = '[a-z]*/[0-9]*/profiles/([A-Z]*[0-9]*_[0-9]*[A-Z]*.nc)'
  gdac_index['filename'] = gdac_index['file'].str.extract(filename_regexp)

  # Subset profiles based on time and space criteria
  gdac_index_subset = gdac_index.loc[np.logical_and.reduce([gdac_index['latitude'] >= lat_range[0],
                                                            gdac_index['latitude'] <= lat_range[1],
                                                            gdac_index['date'] >= start_date,
                                                            gdac_index['date'] <= end_date]),:]
  if lon_range[1] >= lon_range[0]:    # range does not cross -180/180 or 0/360
    gdac_index_subset = gdac_index_subset.loc[np.logical_and(gdac_index_subset['longitude'] >= lon_range[0],
                                                             gdac_index_subset['longitude'] <= lon_range[1])]
  elif lon_range[1] < lon_range[0]:   # range crosses -180/180 or 0/360
    gdac_index_subset = gdac_index_subset.loc[np.logical_or(gdac_index_subset['longitude'] >= lon_range[0],
                                                            gdac_index_subset['longitude'] <= lon_range[1])]

  # If requested, subset profiles using float WMOID criteria
  if floats is not None:
    if type(floats) is not list: floats = [floats]
    gdac_index_subset = gdac_index_subset.loc[gdac_index_subset['wmoid'].isin(floats),:]

  # If requested, subset profiles using sensor criteria
  if sensors is not None:
    if type(sensors) is not list: sensors = [sensors]
    for sensor in sensors:
      gdac_index_subset = gdac_index_subset.loc[gdac_index_subset['parameters'].str.contains(sensor),:]

  # Examine subsetted profiles
  wmoids = gdac_index_subset['wmoid'].unique()
  wmoid_filepaths = gdac_index_subset['filepath_main'].unique()

  # Just return list of floats and DataFrame with subset of index file, or download each profile
  if not skip_download:
    downloaded_filenames = []
    if download_individual_profs:
      for p_idx in gdac_index_subset.index:
        download_file(dac_url_root + gdac_index_subset.loc[p_idx]['filepath'],
                      gdac_index_subset.loc[p_idx]['filename'],
                      save_to=save_to,overwrite=overwrite_profiles,verbose=verbose)
        downloaded_filenames.append(gdac_index_subset.loc[p_idx]['filename'])
    else:
      for f_idx, wmoid_filepath in enumerate(wmoid_filepaths):
        download_file(dac_url_root + wmoid_filepath,str(wmoids[f_idx]) + '_Sprof.nc',
                      save_to=save_to,overwrite=overwrite_profiles,verbose=verbose)
        downloaded_filenames.append(str(wmoids[f_idx]) + '_Sprof.nc')
    return wmoids, gdac_index_subset, downloaded_filenames
  else:
    return wmoids, gdac_index_subset
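
The longitude handling in argo_gdac is the least obvious part of the selection: bounds above 180 are shifted into the -180 to 180 convention, and a range whose upper bound is smaller than its lower bound is treated as crossing the antimeridian. The sketch below restates that logic for a single longitude value in plain Python. The function names are hypothetical and this is an illustration of the idea, not the Toolbox code itself.

```python
def to_180(lon):
    """Shift a longitude given in the 0 to 360 convention into -180 to 180."""
    return lon - 360 if lon > 180 else lon

def lon_in_range(lon, lon_range):
    """Inclusive longitude test that also handles ranges crossing -180/180."""
    lower, upper = (to_180(bound) for bound in lon_range)
    if upper >= lower:
        # Ordinary range: the value must lie between the two bounds
        return lower <= lon <= upper
    # Wrapped range: the value may lie on either side of the antimeridian
    return lon >= lower or lon <= upper

print(lon_in_range(30, [10, 70]))       # ordinary range
print(lon_in_range(-175, [170, -170]))  # range crossing -180/180
print(lon_in_range(-175, [170, 190]))   # same range written in 0 to 360
print(lon_in_range(0, [170, -170]))     # outside the wrapped range
```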

1.1 Using GDAC function to access Argo subsets¶

# Get all floats from chosen period
lat_bounds = [-70,-45]  # used to be -70 to -30
lon_bounds = [10,70]    # used to be 10, 60

# Try using more time buffer, 2 years. 
start_yd = datetime(2017,4,20) # datetime(2019,4,30)  
end_yd = datetime(2021,7,30) # datetime(2019,7,19)  

# Don't download; just get the WMO IDs and the matching subset of the index
wmoids, gdac_index = argo_gdac(lat_range=lat_bounds,lon_range=lon_bounds,
                               start_date=start_yd,end_date=end_yd,
                               sensors=None,floats=None,
                               overwrite_index=True,overwrite_profiles=False,
                               skip_download=True,download_individual_profs=False,
                               save_to=profile_dir,verbose=True)

# download specific float #5906030 
# wmoids, gdac_index, downloaded_filenames \
#                    = argo_gdac(lat_range=None,lon_range=None,
#                                start_date=None,end_date=None,
#                                sensors=None,floats=5906030,
#                                overwrite_index=True,overwrite_profiles=False,
#                                skip_download=False,download_individual_profs=False,
#                                save_to=profile_dir,verbose=True)

>>> Successfully downloaded argo_synthetic-profile_index.txt.

/tmp/ipykernel_3764/1987350156.py:44: FutureWarning: The argument 'date_parser' is deprecated and will be removed in a future version. Please use 'date_format' instead, or read your data in as 'object' dtype and then call 'to_datetime'.
  gdac_index = pd.read_csv(root + index_filename,delimiter=',',header=8,parse_dates=['date','date_update'],

# DSdict = {}
# for filename in os.listdir(profile_dir):
#     if filename.endswith(".nc"):
#         fp = profile_dir + filename
#         single_dataset = xr.open_dataset(fp, decode_times=False)
#         DSdict[filename[0:7]] = single_dataset
# # DSdict['5906030']

2. Using the Argopy Python Package¶

argopy is a Python package that facilitates access to and manipulation of Argo data from all available data sources. The documentation is available here.

The package allows you to use Python to select subsets of Argo data, including data from:

a) All available data within a “box” (geospatial area and timeframe)
b) A specific float
c) A specific float profile

The code here is adapted from the argopy documentation and associated examples.

Imports¶

from argopy import DataFetcher  # This is the class to work with Argo data
from argopy import ArgoIndex  #  This is the class to work with Argo index
from argopy import ArgoNVSReferenceTables  # This is the class to retrieve data from Argo reference tables
from argopy import ArgoColors  # This is a class with useful pre-defined colors
from argopy.plot import scatter_map, scatter_plot  # These are functions to easily make maps and plots

# Make a fresh start
import argopy
argopy.reset_options()
argopy.clear_cache()
argopy.set_options(cachedir='cache_bgc')

#
import numpy as np
import matplotlib as mpl
from matplotlib import pyplot as plt
import cmocean
import xarray as xr
xr.set_options(display_expand_attrs = False)

<xarray.core.options.set_options at 0x7fc78a915400>

import logging
logging.getLogger("matplotlib").setLevel(logging.ERROR)
logging.getLogger("pyproj").setLevel(logging.ERROR)
logging.getLogger("fsspec").setLevel(logging.ERROR)
logging.getLogger("parso").setLevel(logging.ERROR)
logging.getLogger("asyncio").setLevel(logging.ERROR)
DEBUGFORMATTER = '%(asctime)s [%(levelname)s] [%(name)s] %(filename)s:%(lineno)d: %(message)s'
logging.basicConfig(
    level=logging.DEBUG,
    format=DEBUGFORMATTER,
    datefmt='%I:%M:%S %p',
    handlers=[logging.FileHandler("nb-docs.log", mode='w')]
)

a) Fetching data for all profiles within a geographic box¶

Define the geographic region you want to investigate within the BOX variable:

# Format: [lon_min, lon_max, lat_min, lat_max, pres_min, pres_max, datim_min, datim_max]
BOX = [-56, -45, 54, 60, 0, 2000, '2022-01', '2023-01']
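
Since the box is a plain list of eight values, it is easy to swap two entries by accident. The sketch below names each entry and runs a few sanity checks before the box is passed on. The ArgoBox tuple, the check_box helper, and the specific rules it enforces are assumptions made for illustration; they are not part of argopy, which performs its own validation.

```python
from collections import namedtuple

ArgoBox = namedtuple('ArgoBox', [
    'lon_min', 'lon_max', 'lat_min', 'lat_max',
    'pres_min', 'pres_max', 'datim_min', 'datim_max'])

def check_box(box):
    """Name the eight entries of a box and run basic sanity checks (assumed rules)."""
    if len(box) != 8:
        raise ValueError('expected 8 entries, got ' + str(len(box)))
    named = ArgoBox(*box)
    if not -90 <= named.lat_min <= named.lat_max <= 90:
        raise ValueError('latitudes must satisfy -90 <= lat_min <= lat_max <= 90')
    if not 0 <= named.pres_min <= named.pres_max:
        raise ValueError('pressures must satisfy 0 <= pres_min <= pres_max')
    # ISO-style date strings written in the same format sort chronologically
    if not named.datim_min <= named.datim_max:
        raise ValueError('datim_min must not be later than datim_max')
    return named

demo_box = check_box([-56, -45, 54, 60, 0, 2000, '2022-01', '2023-01'])
print(demo_box.lat_min, demo_box.lat_max)
```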

Retrieve the data:¶

argopy works by constructing a “Fetcher” object, named “f” here. When we define f, we specify the kinds of data we want, and also how we want to process it.

Input arguments:

ds: specifies which Argo dataset to retrieve
1. “phy”: physical Argo data (temperature, salinity, pressure)
2. “bgc”: biogeochemical data. Note that, as of 2024-06-13, BGC data can only be retrieved in expert mode (real-time, no QC)
mode: specifies the level of data QC you want
1. “expert”: returns all Argo data. This is raw data with no QC or postprocessing
2. “standard”: includes real-time data that has undergone automated QC and is probably of good quality, but has not been checked by a human
3. “research”: the most trustworthy data; includes only delayed-mode data that has undergone QC and been checked by a human expert
parallel: if True, parallelizes the data retrieval process to speed it up
progress: if True, displays a progress bar during data retrieval
cache: if True, stores fetched data in a local cache directory (set above with argopy.set_options(cachedir=...)) so that repeating the same request does not download the data again; it is set to False in the examples below
chunks_maxsize: specifies the maximum chunk size along each dimension when the data request is split into smaller domains (here, along the time dimension)

Once “f” is defined, we can specify f.region(BOX) and load our data.

Construct a fetcher object¶

%%time
# f = DataFetcher(ds='bgc', mode='expert', params='all', parallel=True, progress=True).region(BOX).load()  # Fetch everything !
f = DataFetcher(ds='phy', mode='research', params='all',
                parallel=True, progress=True, cache=False,
                chunks_maxsize={'time': 30},
               )
f = f.region(BOX).load()
f

Extract the data from the fetcher object¶

Once the data is loaded, we can extract our data as an xarray dataset. Using f.data, the default output is a 1D array of all measurements across all profiles.

ds_points = f.data

ds_points

Converting to a 2D array of profiles

Using the dataset.argo.point2profile() method, we can turn the 1D array into a 2D array, grouped by individual profiles.

Note that each N_PROF is unique, although the dataset does not include identifying metadata for the profiles, such as WMO number.

ds_profiles = ds_points.argo.point2profile();

ds_profiles
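
To make the points-to-profiles idea concrete, the sketch below regroups a flat list of measurements into one padded row per profile using only the standard library. The sample values and the function name points_to_profiles are hypothetical, and this only illustrates the concept; it is not how argopy implements point2profile().

```python
import math

# Hypothetical flat records: (profile id, pressure in dbar, temperature in deg C)
points = [
    (1, 5.0, 10.2), (1, 10.0, 10.0), (1, 20.0, 9.5),
    (2, 5.0, 11.0), (2, 10.0, 10.7),
]

def points_to_profiles(points):
    """Group flat measurements by profile and pad shorter profiles with NaN."""
    grouped = {}
    for prof_id, pres, temp in points:
        grouped.setdefault(prof_id, []).append((pres, temp))
    # Every profile gets the same number of levels, like a 2D (N_PROF, N_LEVELS) array
    n_levels = max(len(levels) for levels in grouped.values())
    pad = (math.nan, math.nan)
    return {prof_id: levels + [pad] * (n_levels - len(levels))
            for prof_id, levels in grouped.items()}

profiles = points_to_profiles(points)
for prof_id, levels in profiles.items():
    print(prof_id, levels)
```

The padding is why profile arrays typically contain missing values at the deepest levels: profiles with fewer measurements are filled out to the length of the longest one.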

Extract float metadata from the fetcher object¶

Float metadata, including the float’s unique WMO number and each profile’s cycle number, are retrieved as a pandas DataFrame using f.index.

f.index

Basic data visualization¶

argopy includes some built-in data visualization functions.

Here is the default map function, which plots each float’s trajectory, colored by float.

scatter_map(ds_profiles);

/home/runner/micromamba/envs/sklearn-argo-dev/lib/python3.9/site-packages/argopy/plot/plot.py:437: UserWarning: More than one N_LEVELS found in this dataset, scatter_map will use the first level only
  warnings.warn(

/home/runner/micromamba/envs/sklearn-argo-dev/lib/python3.9/site-packages/cartopy/io/__init__.py:241: DownloadWarning: Downloading: https://naturalearth.s3.amazonaws.com/50m_physical/ne_50m_land.zip
  warnings.warn(f'Downloading: {url}', DownloadWarning)

We can also zoom out to see where the data are located globally.

fig, ax = scatter_map(ds_profiles,
                   figsize=(10,6),
                   set_global=True,
                   markersize=2,
                   markeredgecolor=None,
                   legend_title='Floats WMO',
                   cmap='Set2')

/home/runner/micromamba/envs/sklearn-argo-dev/lib/python3.9/site-packages/argopy/plot/plot.py:437: UserWarning: More than one N_LEVELS found in this dataset, scatter_map will use the first level only
  warnings.warn(

b) Fetching data for a specific float(s)¶

This works much the same as the example above. However, instead of requesting a BOX with spatiotemporal bounds, we request a specific float by its unique WMO number. For multiple floats, simply pass a list of WMO numbers.

%%time
f = DataFetcher(ds='phy', mode='research', params='all',
                parallel=True, progress=True, cache=False,
                chunks_maxsize={'time': 30},
               )

# We use the f.float() method to fetch data from a specific float (using its WMO#, here 5904673)
# To request multiple floats, simply pass a list of multiple WMO numbers, e.g. [5904673, 5904672]
f = f.float(5904673).load();
f

Extracting and manipulating the data¶

As before, we extract the data as a 1D array of measurements in xarray and convert that into a 2D array of profiles using ds.argo.point2profile().

ds_points = f.data
ds_profiles = ds_points.argo.point2profile();

Visualizing the data¶

This float is from the Pacific sector of the Southern Ocean.

fig, ax = scatter_map(ds_profiles,
                   figsize=(10,6),
                   set_global=True,
                   markersize=2,
                   markeredgecolor=None,
                   legend_title='Floats WMO',
                   cmap='Set2')

/home/runner/micromamba/envs/sklearn-argo-dev/lib/python3.9/site-packages/argopy/plot/plot.py:437: UserWarning: More than one N_LEVELS found in this dataset, scatter_map will use the first level only
  warnings.warn(

Now let’s plot the float’s temperature profiles over its trajectory

# We use xarray's built-in plotting function on the temperature data array
# We transpose it so that the vertical dimension (N_LEVELS) is on the y-axis
ds_profiles.TEMP.transpose().plot() 
plt.gca().invert_yaxis() # Invert the y-axis so the ocean's surface is at the top

c) Fetching data for a specific float profile(s)¶

Let’s narrow it down even further and request a single profile using the fetcher’s f.profile() method, passing the float’s unique WMO number and the profile number. To request multiple profiles, simply pass a list of profile numbers.

%%time
f = DataFetcher(ds='phy', mode='research', params='all',
                parallel=True, progress=True, cache=False,
                chunks_maxsize={'time': 30},
               )

# We use the f.profile() method to fetch data from a specific profile using the float WMO number and profile number
# To request multiple profiles, simply pass a list of multiple profile numbers, e.g. (5904673,[1,2,3])
f = f.profile(5904673,30).load();
f

Extracting and manipulating the data¶

Once again, we extract the data as a 1D array of measurements in xarray and convert that into a 2D array of profiles using ds.argo.point2profile().

ds_points = f.data
ds_profiles = ds_points.argo.point2profile();

Visualizing the data¶

Let’s plot the vertical temperature and salinity profiles

plt.plot(ds_profiles.sel(N_PROF=0).TEMP.data,ds_profiles.sel(N_PROF=0).PRES.data) # Plot Temperature versus Pressure (i.e. depth)
plt.gca().invert_yaxis() # Invert the axis to put the surface at the top
plt.xlabel('Temperature (C)')
plt.ylabel('Pressure (dbar)')
plt.title('Temperature profile: Float 5904673, Profile 30')

plt.plot(ds_profiles.sel(N_PROF=0).PSAL.data,ds_profiles.sel(N_PROF=0).PRES.data) # Plot Salinity versus Pressure (i.e. depth)
plt.gca().invert_yaxis() # Invert the axis to put the surface at the top
plt.xlabel('Practical Salinity (psu)')
plt.ylabel('Pressure (dbar)')
plt.title('Salinity profile: Float 5904673, Profile 30')

3. Querying Data with Argovis¶

Argovis provides an API that allows us to interact with Argo data while only downloading the exact subsets of data needed for analysis. Our examples here are modified from the tutorial notebooks released by Argovis. We showcase only a few of the functionalities; more information can be found in those tutorials.

The introduction published by Argovis:

“Argovis is a REST API and web application for searching, downloading, co-locating and visualizing oceanographic data, including Argo array data, ship-based profile data, data from the Global Drifter Program, tropical cyclone data, and several gridded products. Our API is meant to be integrated into living documents like Jupyter notebooks and analyses intended to update their consumption of Argo data in near-real-time, and our web frontend is intended to make it easy for students and educators to explore data about Earth’s oceans at will.”

Argovis should be cited as:

Tucker, T., D. Giglio, M. Scanderbeg, and S. S. P. Shen, 2020: Argovis: A Web Application for Fast Delivery, Visualization, and Analysis of Argo Data. J. Atmos. Oceanic Technol., 37, 401–416, https://doi.org/10.1175/JTECH-D-19-0041.1

Getting started with `argovisHelpers`¶

From the Argovis tutorial:

In order to allocate Argovis’s limited computing resources fairly, users are encouraged to register and request a free API key. This works like a password that identifies your requests to Argovis. To do so:
Visit https://argovis-keygen.colorado.edu/
Fill out the form under New Account Registration
An API key will be emailed to you shortly.
Treat this API key like a password - don’t share it or leave it anywhere public. If you ever forget it or accidentally reveal it to a third party, see the same website above to change or deactivate your token.
Put your API key in the quotes in the variable below before moving on:

API_ROOT='https://argovis-api.colorado.edu/'
API_KEY=''  # paste your own API key between the quotes
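
Since the key should be treated like a password, one option is to keep it out of the notebook entirely and read it from an environment variable. The sketch below assumes a variable named ARGOVIS_API_KEY; both that name and the helper load_api_key are hypothetical choices, not something Argovis prescribes.

```python
import os

def load_api_key(var_name='ARGOVIS_API_KEY'):
    """Return the API key stored in an environment variable, or '' if it is unset."""
    key = os.environ.get(var_name, '').strip()
    if not key:
        print('No API key found in environment variable ' + var_name + '.')
    return key

api_key_from_env = load_api_key()
```

The returned string can then be passed as the apikey argument in the queries that follow, so the key never appears in a saved or shared copy of the notebook.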

Getting Argo data documents¶

Before actually getting Argo measurements, we can query information about the profile (including pointers to the metadata).

argoSearch = {
    'startDate': '2013-05-01T00:00:00Z',
    'endDate': '2023-05-01T00:00:00Z',
    'center': '-22.5,0',
    'radius': 100
}

argoProfiles = avh.query('argo', options=argoSearch, apikey=API_KEY, apiroot=API_ROOT)
argoProfiles[0]

{'_id': '1901820_256',
 'geolocation': {'type': 'Point', 'coordinates': [-22.75594, -0.2218]},
 'basin': 1,
 'timestamp': '2023-04-09T18:34:30.001Z',
 'date_updated_argovis': '2025-01-31T06:52:23.062Z',
 'source': [{'source': ['argo_core'],
   'url': 'ftp://ftp.ifremer.fr/ifremer/argo/dac/aoml/1901820/profiles/D1901820_256.nc',
   'date_updated': '2025-01-30T14:58:43.000Z'}],
 'cycle_number': 256,
 'geolocation_argoqc': 1,
 'profile_direction': 'A',
 'timestamp_argoqc': 1,
 'vertical_sampling_scheme': 'Primary sampling: averaged [nominal 2 dbar binned data sampled at 0.5 Hz from a SBE41CP]',
 'data_info': [['pressure',
   'pressure_argoqc',
   'salinity',
   'salinity_argoqc',
   'temperature',
   'temperature_argoqc'],
  ['units', 'data_keys_mode'],
  [['decibar', 'D'],
   [None, None],
   ['psu', 'D'],
   [None, None],
   ['degree_Celsius', 'D'],
   [None, None]]],
 'metadata': ['1901820_m0']}
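
Judging from the output above, data_info holds three parallel lists: the variable names, the names of the per-variable attributes, and one row of attribute values per variable. Under that reading, the sketch below turns it into a dictionary keyed by variable name. The helper describe_variables is hypothetical, and the input is copied from the profile document shown above.

```python
# Copied from the 'data_info' entry of the profile document shown above
data_info = [
    ['pressure', 'pressure_argoqc', 'salinity', 'salinity_argoqc',
     'temperature', 'temperature_argoqc'],
    ['units', 'data_keys_mode'],
    [['decibar', 'D'], [None, None], ['psu', 'D'],
     [None, None], ['degree_Celsius', 'D'], [None, None]],
]

def describe_variables(data_info):
    """Map each variable name to a dict of its attributes (e.g. units)."""
    names, columns, rows = data_info
    return {name: dict(zip(columns, row)) for name, row in zip(names, rows)}

variable_info = describe_variables(data_info)
print(variable_info['temperature'])
# prints {'units': 'degree_Celsius', 'data_keys_mode': 'D'}
```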

argoProfiles[0]['_id']

'1901820_256'

Note that the first object in argoProfiles is a single vertical Argo “profile”. In argoProfiles[0]['_id'], the digits before the underscore are the float’s unique WMO identification number, and the digits after it are the profile (cycle) number.

In the above example, we are looking at data from the 256th profile from float WMO #1901820.
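
The same split can be done in code. This is a small sketch with a hypothetical helper, split_profile_id, that assumes the identifier has the plain form shown here: a WMO number, an underscore, and a numeric profile number.

```python
def split_profile_id(profile_id):
    """Split an identifier like '1901820_256' into (WMO number, profile number)."""
    wmo, _, cycle = profile_id.partition('_')
    return wmo, int(cycle)

print(split_profile_id('1901820_256'))
# prints ('1901820', 256)
```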

We can get more information about this particular float by querying argo/meta.

metaOptions = {
    'id': argoProfiles[0]['metadata'][0]
}
argoMeta = avh.query('argo/meta', options=metaOptions, apikey=API_KEY, apiroot=API_ROOT)
argoMeta

[{'_id': '1901820_m0',
  'data_type': 'oceanicProfile',
  'data_center': 'AO',
  'instrument': 'profiling_float',
  'pi_name': ['BRECK OWENS', ' STEVEN JAYNE', ' P.E. ROBBINS'],
  'platform': '1901820',
  'platform_type': 'S2A',
  'fleetmonitoring': 'https://fleetmonitoring.euro-argo.eu/float/1901820',
  'oceanops': 'https://www.ocean-ops.org/board/wa/Platform?ref=1901820',
  'positioning_system': 'GPS',
  'wmo_inst_type': '854'}]

We can also request all of the profiles taken by the same float, WMO ID 1901820.

platformSearch = {
    'platform': argoMeta[0]['platform']
}

platformProfiles = avh.query('argo', options=platformSearch, apikey=API_KEY, apiroot=API_ROOT)
print(len(platformProfiles))

Making `data` queries¶

Now, we want to retrieve actual measurements. We can use any number of identifiers.

Below, we specify profile 003 of float WMO 4901283. The data option can be:

A comma separated list of variable names, e.g. 'temperature, doxy'
'all', meaning get all available variables.

dataQuery = {
    'id': '4901283_003',
    'data': 'all'
}
profile = avh.query('argo', options=dataQuery, apikey=API_KEY, apiroot=API_ROOT)
# avh.data_inflate(profile[0])[0:10]

We can query float profiles within larger bounds:

dataQuery = {
    'startDate': '2020-01-01T00:00:00Z',
    'endDate': '2024-01-01T00:00:00Z',
    'polygon': [[-150,-30],[-155,-30],[-155,-35],[-150,-35],[-150,-30]],
    'data': 'doxy'
}

profiles = avh.query('argo', options=dataQuery, apikey=API_KEY, apiroot=API_ROOT)

inflated_data = avh.data_inflate(profiles[0])
inflated_data[0:10]

[{'doxy': 241.86821, 'pressure': 2.11},
 {'doxy': 241.885345, 'pressure': 3.91},
 {'doxy': 241.832428, 'pressure': 5.91},
 {'doxy': 241.808228, 'pressure': 7.91},
 {'doxy': 241.847519, 'pressure': 9.91},
 {'doxy': 241.818069, 'pressure': 11.91},
 {'doxy': 241.78212, 'pressure': 13.91},
 {'doxy': 241.881287, 'pressure': 15.91},
 {'doxy': 241.853104, 'pressure': 17.91},
 {'doxy': 241.866272, 'pressure': 19.91}]
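
The inflated output is a list with one dictionary per depth level, whereas plotting and analysis usually need one sequence per variable. The sketch below shows that transposition in plain Python, using the first three levels printed above. The helper to_columns is hypothetical and assumes every level contains the same keys.

```python
# First three levels of the inflated profile shown above
levels = [
    {'doxy': 241.86821, 'pressure': 2.11},
    {'doxy': 241.885345, 'pressure': 3.91},
    {'doxy': 241.832428, 'pressure': 5.91},
]

def to_columns(levels):
    """Turn a list of per-level dicts into a dict of per-variable lists."""
    columns = {}
    for level in levels:
        for name, value in level.items():
            columns.setdefault(name, []).append(value)
    return columns

columns = to_columns(levels)
print(columns['pressure'])
# prints [2.11, 3.91, 5.91]
```

With pandas available, pd.DataFrame(inflated_data) gives the same column layout in a single call.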

Querying within geospatial bounds¶

qs = {
    'startDate': '2017-08-01T00:00:00Z',
    'endDate': '2017-09-01T00:00:00Z',
    'box': [[-20,70],[20,72]]
}

profiles = avh.query('argo', options=qs, apikey=API_KEY, apiroot=API_ROOT)
latitudes = [x['geolocation']['coordinates'][1] for x in profiles]
print(min(latitudes))
print(max(latitudes))

70.017
71.957
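
The printed latitudes fall between 70 and 72, which is consistent with reading the box as [[lon_min, lat_min], [lon_max, lat_max]]. Under that assumption, and ignoring boxes that cross the antimeridian, the sketch below checks whether a profile document lies inside a box. The helper profile_in_box is hypothetical, and the second test document is made up for illustration.

```python
def profile_in_box(profile, box):
    """Check whether a profile's [lon, lat] lies inside [[lon_min, lat_min], [lon_max, lat_max]]."""
    lon, lat = profile['geolocation']['coordinates']
    (lon_min, lat_min), (lon_max, lat_max) = box
    return lon_min <= lon <= lon_max and lat_min <= lat <= lat_max

box = [[-20, 70], [20, 72]]

# Coordinates of the equatorial Atlantic profile returned earlier in this notebook
atlantic_profile = {'geolocation': {'type': 'Point', 'coordinates': [-22.75594, -0.2218]}}
# Made-up document placed at the lowest latitude printed above
northern_profile = {'geolocation': {'type': 'Point', 'coordinates': [0.0, 70.017]}}

print(profile_in_box(atlantic_profile, box))
print(profile_in_box(northern_profile, box))
```

Applying such a check to every returned profile is a quick way to confirm that a query used the intended coordinate order.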

Summary¶

In this notebook, we looked at three ways of accessing Argo data: downloading synthetic profile (Sprof) files with functions from the GO-BGC Toolbox, fetching data for a region, float, or profile with the argopy package, and querying the Argovis API with argovisHelpers. With these tools, you can retrieve Argo data within a chosen time frame, geographic area, or by platform identifier.

References¶

Tucker, T., Giglio, D., Scanderbeg, M., & Shen, S. S. P. (2020). Argovis: A Web Application for Fast Delivery, Visualization, and Analysis of Argo Data. Journal of Atmospheric and Oceanic Technology, 37(3), 401–416. 10.1175/jtech-d-19-0041.1
