Introduction to xcmor#
This notebook is a brief introduction to xcmor’s current capabilities.
import xarray as xr
import xcmor
# For this notebook, it's nicer if we don't show the array values by default
xr.set_options(display_expand_data=False)
<xarray.core.options.set_options at 0x7fd5f0133f70>
xcmor works best when xarray keeps attributes by default.
xr.set_options(keep_attrs=True)
<xarray.core.options.set_options at 0x7fd5c0653490>
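To see what this option changes, here is a minimal sketch on a toy array. The array, its values, and its attributes are made up for illustration and are not part of xcmor. It uses `xr.set_options` as a context manager, so the global setting above is left untouched:

```python
import numpy as np
import xarray as xr

# Toy data with some metadata attached (illustrative values only).
da = xr.DataArray(
    np.arange(6.0).reshape(2, 3),
    dims=("x", "time"),
    attrs={"units": "K", "standard_name": "air_temperature"},
)

# With keep_attrs=False, a reduction returns a result without attributes.
with xr.set_options(keep_attrs=False):
    dropped = da.mean("time")

# With keep_attrs=True, the attributes are carried over to the result.
with xr.set_options(keep_attrs=True):
    kept = da.mean("time")

print(dropped.attrs)
print(kept.attrs)
```

Since cmorization relies on variable and coordinate attributes, losing them in an intermediate step would leave less metadata to work with.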
We use an example dataset with a 2D temperature field. Let’s load a regularly gridded dataset:
from xcmor.datasets import reg_ds
reg_ds
<xarray.Dataset> Size: 256B
Dimensions: (x: 2, y: 2, time: 3)
Coordinates:
lon (x) float64 16B 80.17 80.68
lat (y) float64 16B 42.25 42.21
* time (time) datetime64[ns] 24B 2014-09-06 2014-09-07 2014-09-08
reference_time datetime64[ns] 8B 2014-09-05
Dimensions without coordinates: x, y
Data variables:
temperature (x, y, time) float64 96B 29.11 18.2 22.83 ... 16.15 26.63
precipitation (x, y, time) float64 96B 5.68 9.256 0.7104 ... 4.615 7.805
Attributes:
description: Weather related data.
Also, let’s load some example CMOR tables:
from xcmor.tests.tables import coords, dataset, mip_amon
These tables are subsets of the original CMIP6 CMOR tables. We can now use them to rewrite variable attributes according to the CF conventions and the CMIP6 data request:
ds_cmor = xcmor.cmorize(
reg_ds.rename({"temperature": "tas"}).tas,
mip_table=mip_amon,
coords_table=coords,
dataset_table=dataset,
)
ds_cmor
Downloading data from 'https://raw.githubusercontent.com/cf-convention/cf-convention.github.io/master/Data/cf-standard-names/current/src/cf-standard-name-table.xml' to file '/home/docs/.cache/pooch/47dae451573e45041e8665efd76db5a9-cf-standard-name-table.xml'.
SHA256 hash of downloaded file: 3763238cb01f8aa60c78d8ae9716069dbdbbbb9e479cafb0cee5a9585c83fe14
Use this value as the 'known_hash' argument of 'pooch.retrieve' to ensure that the file hasn't changed if it is downloaded again in the future.
2024-03-08 14:34:59,642 - xcmor.rules - WARNING - converting tas from float64 to float32 (rules.py:22)
2024-03-08 14:34:59,716 - xcmor.xcmor - INFO - adding coordinate: height (xcmor.py:249)
2024-03-08 14:34:59,718 - xcmor.xcmor - DEBUG - added coordinates: ['lon', 'lat', 'time', 'height'] (xcmor.py:350)
2024-03-08 14:34:59,719 - xcmor.xcmor - DEBUG - dropping coordinates: ['reference_time'] (xcmor.py:354)
2024-03-08 14:35:00,114 - xcmor.xcmor - DEBUG - setting time units: days since 2014-09-06T00:00:00 (xcmor.py:45)
/home/docs/checkouts/readthedocs.org/user_builds/xcmor/conda/stable/lib/python3.10/site-packages/cf_xarray/accessor.py:671: FutureWarning: The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.
unused_keys = set(attribute.keys()) - set(inverted)
/home/docs/checkouts/readthedocs.org/user_builds/xcmor/conda/stable/lib/python3.10/site-packages/cf_xarray/accessor.py:672: FutureWarning: The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.
for key, value in attribute.items():
<xarray.Dataset> Size: 112B
Dimensions: (lon: 2, lat: 2, time: 3)
Coordinates:
height float64 8B 2.0
* lon (lon) float64 16B 80.17 80.68
* lat (lat) float64 16B 42.25 42.21
* time (time) datetime64[ns] 24B 2014-09-06 2014-09-07 2014-09-08
Data variables:
tas (time, lat, lon) float32 48B 29.11 22.6 32.93 ... 14.17 7.182 26.63
Attributes: (12/41)
activity_id: ISMIP6
branch_method: standard
branch_time_in_child: 59400.0
branch_time_in_parent: 59400.0
calendar: 360_day
comment:
... ...
source_type: AOGCM ISM AER
sub_experiment: none
sub_experiment_id: none
tracking_prefix: hdl:21.14100
variable_id: tas
version: 20240308
The Cmorizer class#
xcmor comes with some pre-configured table options through the Cmorizer class. A simple example for CMIP6 would be:
from xcmor import Cmorizer
cmor = Cmorizer(project="CMIP6")
ds_out = cmor.cmorize(
reg_ds.rename(temperature="tas").tas, "Amon", cmor.tables["input_example"]
)
ds_out
Downloading data from 'https://raw.githubusercontent.com/PCMDI/cmip6-cmor-tables/master/Tables/CMIP6_input_example.json' to file '/home/docs/.cache/pooch/b1e6a504651492874b7eb0c52a902532-CMIP6_input_example.json'.
SHA256 hash of downloaded file: fdaa955f10fe95bbe62d1c0cf08810338a97d92a553abcb73bb33456b098bc5b
Use this value as the 'known_hash' argument of 'pooch.retrieve' to ensure that the file hasn't changed if it is downloaded again in the future.
Downloading data from 'https://raw.githubusercontent.com/PCMDI/cmip6-cmor-tables/master/Tables/CMIP6_Amon.json' to file '/home/docs/.cache/pooch/69da11734d03d2070269b48e6322d094-CMIP6_Amon.json'.
SHA256 hash of downloaded file: d3ac72751f401e551ab60ae94a1c2a24441563ad8e5c0be84687775f5b75ca1d
Use this value as the 'known_hash' argument of 'pooch.retrieve' to ensure that the file hasn't changed if it is downloaded again in the future.
Downloading data from 'https://raw.githubusercontent.com/PCMDI/cmip6-cmor-tables/master/Tables/CMIP6_coordinate.json' to file '/home/docs/.cache/pooch/914887963ad537da427851a882fab071-CMIP6_coordinate.json'.
SHA256 hash of downloaded file: 0536448a0e85a4a57122839f2d0f58701beb89cf2b5245f2ff34456c348bf8f0
Use this value as the 'known_hash' argument of 'pooch.retrieve' to ensure that the file hasn't changed if it is downloaded again in the future.
Downloading data from 'https://raw.githubusercontent.com/PCMDI/cmip6-cmor-tables/master/Tables/CMIP6_CV.json' to file '/home/docs/.cache/pooch/1d0ce1bf51bbab9d775c345522ab3938-CMIP6_CV.json'.
SHA256 hash of downloaded file: f4881b452240ce2eb645fee42c6059ed236300abd8d22baea72d9e9483d6ffd3
Use this value as the 'known_hash' argument of 'pooch.retrieve' to ensure that the file hasn't changed if it is downloaded again in the future.
2024-03-08 14:35:00,745 - xcmor.rules - WARNING - converting tas from float64 to float32 (rules.py:22)
2024-03-08 14:35:00,816 - xcmor.xcmor - INFO - adding coordinate: height (xcmor.py:249)
2024-03-08 14:35:00,818 - xcmor.xcmor - DEBUG - added coordinates: ['lon', 'lat', 'time', 'height'] (xcmor.py:350)
2024-03-08 14:35:00,819 - xcmor.xcmor - DEBUG - dropping coordinates: ['reference_time'] (xcmor.py:354)
2024-03-08 14:35:01,170 - xcmor.xcmor - DEBUG - setting time units: days since 2014-09-06T00:00:00 (xcmor.py:45)
2024-03-08 14:35:01,172 - xcmor.xcmor - DEBUG - for attribute 'activity' --> add value 'Ice Sheet Model Intercomparison Project for CMIP6' (xcmor.py:422)
2024-03-08 14:35:01,173 - xcmor.xcmor - INFO - attribute 'experiment_id' has value 'piControl-withism' and requires attribute 'experiment' to be set to 'preindustrial control with interactive ice sheet' (xcmor.py:446)
2024-03-08 14:35:01,173 - xcmor.xcmor - WARNING - attribute 'experiment_id' has value 'piControl-withism' but attribute 'parent_activity_id' has value 'CMIP' which is not in the list of expected values: ['no parent'] (xcmor.py:441)
2024-03-08 14:35:01,174 - xcmor.xcmor - WARNING - attribute 'experiment_id' has value 'piControl-withism' but attribute 'parent_experiment_id' has value 'historical' which is not in the list of expected values: ['no parent'] (xcmor.py:441)
2024-03-08 14:35:01,175 - xcmor.xcmor - DEBUG - for attribute 'frequency_info' --> add value 'monthly mean samples' (xcmor.py:432)
2024-03-08 14:35:01,176 - xcmor.xcmor - DEBUG - for attribute 'grid_label_info' --> add value 'data reported on a model's native grid' (xcmor.py:432)
2024-03-08 14:35:01,177 - xcmor.xcmor - DEBUG - for attribute 'institution' --> add value 'Program for Climate Model Diagnosis and Intercomparison, Lawrence Livermore National Laboratory, Livermore, CA 94550, USA' (xcmor.py:422)
2024-03-08 14:35:01,178 - xcmor.xcmor - WARNING - attribute 'source_id' has value 'PCMDI-test-1-0' but attribute 'source' is set to 'PCMDI-test 1.0 (1989)' but CV requires 'PCMDI-test 1.0 (1989):
aerosol: none
atmos: Earth1.0-gettingHotter (360 x 180 longitude/latitude; 50 levels; top level 0.1 mb)
atmosChem: none
land: Earth1.0
landIce: none
ocean: BlueMarble1.0-warming (360 x 180 longitude/latitude; 50 levels; top grid cell 0-10 m)
ocnBgchem: none
seaIce: Declining1.0-warming (360 x 180 longitude/latitude)'! (xcmor.py:450)
2024-03-08 14:35:01,178 - xcmor.xcmor - DEBUG - for attribute 'sub_experiment' --> add value 'none' (xcmor.py:422)
2024-03-08 14:35:01,179 - xcmor.xcmor - WARNING - value 'CF-1.7 CMIP-6.2...' for 'Conventions' not in ['^CF-1.7 CMIP-6.[0-2]\\( UGRID-1.0\\)\\{0,\\}$'] (xcmor.py:391)
2024-03-08 14:35:01,180 - xcmor.xcmor - WARNING - creation_date not found (xcmor.py:389)
2024-03-08 14:35:01,181 - xcmor.xcmor - WARNING - value '01.00.33...' for 'data_specs_version' not in ['^[[:digit:]]\\{2,2\\}\\.[[:digit:]]\\{2,2\\}\\.[[:digit:]]\\{2,2\\}$'] (xcmor.py:391)
2024-03-08 14:35:01,182 - xcmor.xcmor - WARNING - value '1...' for 'forcing_index' not in ['^\\[\\{0,\\}[[:digit:]]\\{1,\\}\\]\\{0,\\}$'] (xcmor.py:391)
2024-03-08 14:35:01,182 - xcmor.xcmor - WARNING - further_info_url not found (xcmor.py:389)
2024-03-08 14:35:01,183 - xcmor.xcmor - WARNING - value '1...' for 'initialization_index' not in ['^\\[\\{0,\\}[[:digit:]]\\{1,\\}\\]\\{0,\\}$'] (xcmor.py:391)
2024-03-08 14:35:01,184 - xcmor.xcmor - WARNING - value 'CMIP6 model data produced by Lawrence Livermore PC...' for 'license' not in ['^CMIP6 model data produced by .* is licensed under a Creative Commons .* License (https://creativecommons\\.org/.*)\\. *Consult https://pcmdi\\.llnl\\.gov/CMIP6/TermsOfUse for terms of use governing CMIP6 output, including citation requirements and proper acknowledgment\\. *Further information about this data, including some limitations, can be found via the further_info_url (recorded as a global attribute in this file).*\\. *The data producers and data providers make no warranty, either express or implied, including, but not limited to, warranties of merchantability and fitness for a particular purpose\\. *All liabilities arising from the supply of the information (including any liability arising in negligence) are excluded to the fullest extent permitted by law\\.$'] (xcmor.py:391)
2024-03-08 14:35:01,185 - xcmor.xcmor - WARNING - value '1...' for 'physics_index' not in ['^\\[\\{0,\\}[[:digit:]]\\{1,\\}\\]\\{0,\\}$'] (xcmor.py:391)
2024-03-08 14:35:01,185 - xcmor.xcmor - WARNING - value '3...' for 'realization_index' not in ['^\\[\\{0,\\}[[:digit:]]\\{1,\\}\\]\\{0,\\}$'] (xcmor.py:391)
2024-03-08 14:35:01,186 - xcmor.xcmor - WARNING - value 'atmos atmosChem...' for 'realm' not in ['aerosol', 'atmos', 'atmosChem', 'land', 'landIce', 'ocean', 'ocnBgchem', 'seaIce'] (xcmor.py:391)
2024-03-08 14:35:01,187 - xcmor.xcmor - WARNING - value 'AOGCM ISM AER...' for 'source_type' not in ['AER', 'AGCM', 'AOGCM', 'BGC', 'CHEM', 'ISM', 'LAND', 'OGCM', 'RAD', 'SLAB'] (xcmor.py:391)
2024-03-08 14:35:01,187 - xcmor.xcmor - WARNING - tracking_id not found (xcmor.py:389)
2024-03-08 14:35:01,188 - xcmor.xcmor - WARNING - variant_label not found (xcmor.py:389)
/home/docs/checkouts/readthedocs.org/user_builds/xcmor/conda/stable/lib/python3.10/site-packages/cf_xarray/accessor.py:671: FutureWarning: The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.
unused_keys = set(attribute.keys()) - set(inverted)
/home/docs/checkouts/readthedocs.org/user_builds/xcmor/conda/stable/lib/python3.10/site-packages/cf_xarray/accessor.py:672: FutureWarning: The return type of `Dataset.dims` will be changed to return a set of dimension names in future, in order to be more consistent with `DataArray.dims`. To access a mapping from dimension names to lengths, please use `Dataset.sizes`.
for key, value in attribute.items():
<xarray.Dataset> Size: 112B
Dimensions: (lon: 2, lat: 2, time: 3)
Coordinates:
height float64 8B 2.0
* lon (lon) float64 16B 80.17 80.68
* lat (lat) float64 16B 42.25 42.21
* time (time) datetime64[ns] 24B 2014-09-06 2014-09-07 2014-09-08
Data variables:
tas (time, lat, lon) float32 48B 29.11 22.6 32.93 ... 14.17 7.182 26.63
Attributes: (12/49)
Conventions: CF-1.7 CMIP-6.2
activity: Ice Sheet Model Intercomparison Project for CMIP6
activity_id: ISMIP6
branch_method: standard
branch_time_in_child: 59400.0
branch_time_in_parent: 59400.0
... ...
sub_experiment: none
sub_experiment_id: none
table_id: Amon
tracking_prefix: hdl:21.14100
variable_id: tas
version: 20240308
Let’s write this to NetCDF and use the compliance checker to find issues:
ds_out.to_netcdf("tas.nc")
!compliance-checker -t cf:1.7 tas.nc
Running Compliance Checker on the datasets from: ['tas.nc']
--------------------------------------------------------------------------------
IOOS Compliance Checker Report
Version 5.1.0
Report generated 2024-03-08T14:35:02Z
cf:1.7
http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html
--------------------------------------------------------------------------------
Corrective Actions
tas.nc has 2 potential issues
Warnings
--------------------------------------------------------------------------------
§2.6 Attributes
* §2.6.2 global attribute title should exist and be a non-empty string
* §2.6.2 comment global attribute should be a non-empty string
§7.2 Cell Measures
* Cell measure variable areacella referred to by tas is not present in dataset or external variables
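The two §2.6.2 warnings concern global attributes, which xarray exposes as the dictionary-like `attrs` of a dataset. As a rough sketch of how one might spot such gaps before writing a file, the hypothetical helper `find_empty_attrs` below (not part of xcmor or the compliance checker) reports which of a list of required global attributes are missing or empty. A plain dictionary stands in for `ds_out.attrs`, and its contents are illustrative:

```python
def find_empty_attrs(attrs, required=("title", "comment")):
    """Return the names in `required` that are missing or blank in `attrs`."""
    problems = []
    for name in required:
        value = attrs.get(name)
        # Treat a missing key, None, or a whitespace-only string as empty.
        if value is None or not str(value).strip():
            problems.append(name)
    return problems


# Stand-in for ds_out.attrs: no title, and an empty comment.
attrs = {"comment": "", "variable_id": "tas"}
print(find_empty_attrs(attrs))

# Filling in the attributes clears the report (the texts are placeholders).
attrs.update(title="Example tas output", comment="Cmorized with xcmor")
print(find_empty_attrs(attrs))
```

On a real dataset the same update could be applied with `ds_out.attrs.update(...)` before calling `to_netcdf`. The cell-measure warning is a separate matter and is not addressed by this sketch.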