The recent General Assembly of EMODnet Chemistry was a good opportunity to present the ways in which the project pursues the principles of Open Science (OS). The University of Liège (ULiege) showed how the project implements Open Science through open source software. Charles Troupin from ULiege referred in particular to EMODnet Chemistry's use of the netCDF format and the NCDatasets.jl package, which is described in a recent EMODnet news article. If you are interested in more details, please read the 2024 scientific paper in the Journal of Open Source Software.
During the General Assembly (GA) of EMODnet Chemistry, which took place in Copenhagen on 23 and 24 September, the University of Liège (ULiege) provided information on the latest developments in open source software, which represents an important contribution of the EMODnet Chemistry partnership to Open Science.
According to the definition in the UNESCO Recommendation on Open Science, Open Science is a broad construct that brings together various movements and practices aimed at making multilingual scientific knowledge openly available, accessible and reusable for everyone. The recommendation also states that Open Science rests on five pillars: open scientific knowledge, open science infrastructures, open science communication, open engagement of societal actors and open dialogue with other knowledge systems.
In this context, ULiege, which has been contributing to EMODnet Chemistry since 2009, referred during the GA to the use of the netCDF format and the Julia package NCDatasets.jl by EMODnet Chemistry and other projects.
NetCDF stands for network Common Data Form (Rew and Davis, 1990). NetCDF files are self-describing, network-transparent and directly accessible. While netCDF is often known as a format, its primary definition is that of “an interface for scientific data access and a freely-distributed software library that provides an implementation of the interface”. That library defines a machine-independent format for representing scientific data. NetCDF was created and is maintained by Unidata, a community cyber-infrastructure facility focused on Earth Systems Sciences and founded in 1984. In ocean and atmospheric sciences, netCDF is one of the most widespread formats, since it allows users to store multidimensional data along with metadata stored as attributes (those that describe the file as a whole are called “global attributes”). NetCDF files are designed to store any type of data: points, time series, profiles, trajectories and gridded fields (2, 3 or even 4 dimensions). In the context of European oceanographic data, netCDF has become widely accepted. For example, EMODnet datasets from the field of chemistry, but also from other fields such as physics, biology and bathymetry, are made available to users as netCDF files. In addition, Copernicus Marine products are supplied in netCDF format as standard. Finally, it is worth mentioning that the SeaDataNet NetCDF format has been adopted as a UNESCO Ocean Best Practice.
To enable users to read, create and modify netCDF files, ULiege has developed the Julia package NCDatasets.jl. Julia is a high-level, dynamic programming language created in 2012. The idea of the developers was to have a language that is both easy and fast, as stated in this 2019 Nature article: “Julia: Come for the syntax, stay for the speed”. In recent years, Julia has been increasingly used by the scientific community in general, and by the Ocean Science community in particular. Examples include Oceananigans.jl, a package for simulating incompressible fluid dynamics in Cartesian and spherical coordinates; ArgoData.jl, for reading and processing data from the Argo profilers; and STAC.jl, a Julia client for SpatioTemporal Asset Catalogs (STAC).
The features of the Julia package NCDatasets.jl and further details on this topic can be found in the peer-reviewed article published in the Journal of Open Source Software in 2024. The source code is available via the GitHub repository NCDatasets.jl.
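To give a flavour of what reading and writing a netCDF file with NCDatasets.jl looks like, here is a minimal sketch. The file name, the variable name, the units and the values are made up for illustration and do not correspond to any EMODnet Chemistry product; the sketch only assumes that the NCDatasets.jl package is installed.

```julia
using NCDatasets

# Hypothetical file name and values, for illustration only
filename = joinpath(mktempdir(), "oxygen_example.nc")
oxygen = Float32[250 245 240; 248 243 238]   # 2 longitudes x 3 latitudes

# Create the file: two dimensions, one global attribute,
# and one variable carrying its own "units" attribute
NCDataset(filename, "c") do ds
    defDim(ds, "lon", 2)
    defDim(ds, "lat", 3)
    ds.attrib["title"] = "Illustrative oxygen field"
    v = defVar(ds, "oxygen", Float32, ("lon", "lat"))
    v.attrib["units"] = "umol/l"
    v[:, :] = oxygen
end

# Re-open the file (read-only by default) and read everything back
ds = NCDataset(filename)
title = ds.attrib["title"]            # global attribute
units = ds["oxygen"].attrib["units"]  # variable attribute
values = ds["oxygen"][:, :]           # load the whole array into memory
close(ds)

println(title, " [", units, "], size = ", size(values))
```

Because the dimensions and the attributes are stored in the file next to the data, anyone who opens it can recover the name, the units and the shape of the field without further documentation, which is what "self-describing" means in practice.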
“Reading and writing netCDF files is an essential step in handling oceanographic data and models. More than 60 registered Julia packages today have the NCDatasets.jl package as a direct or indirect dependency. The language solves the so-called ‘two-language problem’: when scientists need to solve a problem, they first create a solution in an easy-to-programme language such as Python, MATLAB or R. The code is then translated into a fast and less easy-to-programme language. Now both the prototype and the final implementation of the solution can be coded in Julia.” Charles Troupin of the University of Liège (ULiege)
In the context of EMODnet, Julia is also the main language of the interpolation tool DIVAnd (Barth et al., 2014), which is available via its dedicated GitHub repository. This tool replaced and improved on the previous version of the code, DIVA, which was written in Fortran (https://github.com/gher-uliege/DIVA). DIVAnd has been used in EMODnet Chemistry as well as in the EMODnet Biology, FAIR-EASE and Blue Cloud projects.
Visit the EMODnet Map Viewer to see how DIVA is regularly used to create maps based on the eutrophication data collected by the EMODnet Chemistry network. An example is provided in Figure 1.
Figure 1. DIVA interpolated field of oxygen concentration at 10 m depth in May; the field is masked when the relative error is larger than 30%. © EMODnet Chemistry