Biology

EMODnet Biology provides free access to interoperable data on the temporal and spatial distribution of marine species (angiosperms, benthos, birds, fish, macroalgae, mammals, phytoplankton, reptiles, zooplankton) and species traits from European regional seas, as defined by the EEA’s ‘Europe’s seas’ dataset (Arctic Ocean, (North) Atlantic Ocean, Baltic Sea, Black Sea, Mediterranean Sea and North Sea).
EMODnet Biology is built upon the World Register of Marine Species (WoRMS) and the European Ocean Biodiversity Information System (EurOBIS), with tools and services developed in collaboration with Lifewatch ERIC and Lifewatch Marine.
Because EMODnet Biology is OGC compliant, it enables access to metadata descriptions of more than 1200 thematic biological datasets.
Because they are INSPIRE compliant, these metadata records can also be found through the EU Open Data Portal.
Objectives
EMODnet Biology aims to provide a single access point to European marine biodiversity data and products. Through our interoperable products, created by assembling individual datasets from various sources, we contribute to assessments of the environmental state of ecosystems and sea basins.
The main objective is to contribute to EMODnet’s operational service by maintaining and enhancing services for the European biodiversity data and products.
EMODnet Biology’s specific objectives are defined in the following tasks:
- maintaining and improving a method of access to data held in repositories;
- constructing products from one or more data sources that provide users with information about the distribution and quality of parameters in time and space;
- maintaining and improving procedures for machine-to-machine connections for all data and data products;
- ensuring coherence with efforts of regional sea conventions and other relevant local actors;
- engaging with EU reporting mechanisms;
- exploiting opportunities for interoperability with data distributed by non-EU organisations;
- actively participating in the INSPIRE Directive and Digital Earth processes and ensuring compliance.
Background
Europe’s seas and oceans are home to a staggering abundance and diversity of life, from large charismatic species such as seals, whales and dolphins, to the microscopic marine algae that form the base of the marine food chain. More than 36,000 known species of marine plants and animals are found in Europe, and understanding their geographic distribution, abundance and seasonal, annual or decadal variation is key to detecting changes in the marine ecosystem and for assessing ecosystem health of maritime basins. Unfortunately, measuring or observing marine life on a large scale is difficult.
Marine biodiversity data are essential to measure and study the ecosystem health of maritime basins. These data are often collected with limited spatial and temporal scope and are scattered over different organisations in small datasets for a specific species group or habitat. In addition, as data are collected by multiple organisations, using different standards, technologies and conventions, it is challenging to combine them. Furthermore, a plethora of historical marine biodiversity datasets exist in the form of simple and unorganised printed documents or electronic files, on the hard disks or other media of electronic information storage of individual scientists and of marine institutes, research centres, academic departments, ministries, port authorities, public or private libraries. These types of data, which are not stored on a remote server (e.g. in the cloud or a research repository), are considered to be at permanent risk of being lost to future use. It is these datasets, however, which provide the historical context for present observations, facilitating the establishment of reference conditions for monitoring and management.
History of EMODnet Biology
The Maritime Policy Blue Book, welcomed by the European Council in 2007, announced that the European Commission would take steps to set up a European Marine Observation and Data Network to improve access to high quality marine data for private bodies, public authorities and researchers.
The consortium comprises 23 organisations, with four partners qualified as IODE Associated Data Units (ADU) and ten others as IODE National Oceanographic Data Centres (NODC). The partnership is geographically distributed and includes all European OBIS nodes (EurOBIS, MedOBIS, OBIS UK, OBIS Black Sea). Collaboration with other international initiatives is also ensured via the consortium with the inclusion of e.g., IODE and ICES.
This phase will see an expansion of our web services, the implementation of management practices for other types of data like -omics and images and the migration to the Central Portal. Several new data products will be created and made freely available and a closer engagement with our stakeholders will ensure that our work will better answer their requirements.
Partnership
The reach and breadth of the EMODnet Biology consortium represent a high level of connectivity at the national, regional and global scales. An overview and details of the Phase IV partnership can be found on the Partnership page.
As part of an exercise to better understand the linkages we have with key stakeholders and initiatives, we have undertaken to map the connections and touchpoints.
Following a targeted questionnaire to EMODnet Biology partners, we used the HighCharts application to visualise the key linkages with UN Decade Programmes, Regional Sea Conventions, ICES Working Groups and the other EMODnet thematic lots. Many other connections exist, but these are excluded for clarity. The intention is to periodically update and refine the network map so that it reflects current connectivity.
Work Packages (WP)
EMODnet Biology is currently in its fourth phase (2021-2023) and has divided its work into five work packages, as follows:
- WP1: Coordination (lead partner: VLIZ);
- WP2: Access to marine biological data (lead partner: VLIZ & HCMR);
- WP3: Data product creation (lead partner: U Sheffield);
- WP4: Uptake, outreach and communication (lead partner: Marine Biological Association - MBA);
- WP5: System architecture (lead partner: VLIZ).

WP1: Coordination
Partners
Lead: VLIZ
Coordination Board (WP leaders): HCMR, MBA, U Sheffield
Objectives
WP1 will focus on managing the project in order for all the deliverables and objectives to be met according to the timelines defined in the tender and also in the previous section. Clear communication and a structured approach are key aspects to the successful implementation of what is proposed in this document. The tasks addressed in this Work Package are:
- Task 5: Contributing content to dedicated spaces in Central Portal;
- Task 8: Monitor quality/performance and deal with user feedback.
Methodology & activities
The activities will be performed by the project leader, VLIZ, with the support of the Coordination Board (HCMR, University of Sheffield and MBA) and will be performed throughout the project’s life (24 months).
The main activities include:
- General project coordination and supervision;
- Budget Management;
- Responsibility to deliver the reporting deliverables to the Commission;
- Organisation of project meetings;
- Liaison with external organisations on behalf of the project consortium;
- Operation of the helpdesk.
Some of the activities mentioned above require a joint approach with the other Work Packages. In general terms, the project coordinator will:
- Act as the intermediary between the consortium and the European Commission;
- Ensure appropriate communication and, where viable, collaboration between the consortium and the remaining lots;
- Organise project meetings;
- Chair the Coordination Board meetings;
- Participate in the EMODnet Steering Committee and EMODnet Technical Coordination Group meetings;
- Participate, on behalf of the consortium, in events (workshops, webinars, discussion groups, etc.) relevant to the project;
- Liaise with the EMODnet Secretariat on behalf of the consortium.
Output (Deliverables)
- D1.1: Maintenance of operational Helpdesk service with contact details published online (M0-M6);
- D1.2: Quarterly progress reports (M3-M24);
- D1.3: Interim report (M12);
- D1.4: Final report (M24);
- D1.5: Attend the Steering Committee and Technical Working Group meetings (M0-M24).
WP2: Access to marine biological data
Partners
Lead: VLIZ & HCMR.
WP partners: MBA, IEO, IMR, Aarhus University, SYKE, MARIS, NIMRD, IPMA, SMHI, ICES, Deltares, IH Cantabria, Ifremer, NIOZ, OGS, ILVO, UkrSCES.
Objectives
The main objective for WP2 is covered in Task 1: Maintain and improve a common method of access to data held in repositories. The data covered by this proposal will primarily include the following groups: macroalgae, angiosperms, benthos, birds, fish, mammals, phytoplankton and zooplankton in European seas, more specifically defined in six regions: Arctic, Atlantic, Baltic Sea, Black Sea, Mediterranean Sea and North Sea, including their coastal and estuarine zones. Data from other regions are also covered within the proposal, even though it is not the main focus of the work proposed.
WP2 will continue to use the standards, vocabularies and data formats from the previous phases, thereby providing consistency and ensuring interoperability for providers and users:
- The OBIS-ENV Darwin Core (DwC) format, which allows not only the inclusion of presence/absence data on marine biodiversity, but also the storage of additional measurements or facts collected alongside the biological sampling;
- The World Register of Marine Species (WoRMS), the authoritative and comprehensive list of names of marine organisms worldwide;
- The Marine Regions Gazetteer, a standard list of marine georeferenced place names and areas;
- The BODC controlled vocabularies, lists of standardised terms that cover a broad spectrum of disciplines of relevance to the oceanographic and wider community.
All of the above allow interoperability with other systems - e.g. the Ocean Biodiversity Information System (OBIS) and the Global Biodiversity Information Facility (GBIF) - and remove ambiguities when interpreting the data.
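As a rough illustration of how these standards fit together, the sketch below models one sampling event, its occurrence records (including an absence) and one linked measurement in an OBIS-ENV style layout. The field names are standard Darwin Core terms and the LSID pattern is the one used by WoRMS; every value, including the AphiaIDs, the taxon names and the vocabulary URI, is a placeholder invented for illustration and not a real record.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Occurrence:
    occurrenceID: str
    eventID: str
    scientificName: str
    scientificNameID: str          # WoRMS LSID identifying the taxon
    occurrenceStatus: str          # "present" or "absent"
    basisOfRecord: str = "HumanObservation"


@dataclass
class ExtendedMeasurementOrFact:
    eventID: str
    occurrenceID: str
    measurementType: str
    measurementTypeID: str         # URI of a controlled vocabulary term
    measurementValue: str
    measurementUnit: str


@dataclass
class Event:
    eventID: str
    eventDate: str
    decimalLatitude: float
    decimalLongitude: float
    occurrences: List[Occurrence] = field(default_factory=list)
    measurements: List[ExtendedMeasurementOrFact] = field(default_factory=list)


def worms_lsid(aphia_id: int) -> str:
    """Build the LSID that links a record to a WoRMS taxon via its AphiaID."""
    return f"urn:lsid:marinespecies.org:taxname:{aphia_id}"


# Placeholder data: identifiers, names and coordinates are made up.
event = Event("cruise1:station7", "2021-06-15", 51.35, 2.70)
event.occurrences.append(Occurrence(
    "cruise1:station7:occ1", event.eventID,
    "Placeholder species A", worms_lsid(123456), "present"))
event.occurrences.append(Occurrence(
    "cruise1:station7:occ2", event.eventID,
    "Placeholder species B", worms_lsid(654321), "absent"))
event.measurements.append(ExtendedMeasurementOrFact(
    event.eventID, "cruise1:station7:occ1",
    "abundance", "https://example.org/vocab/placeholder-term",
    "12", "individuals per square metre"))

present = [o.scientificName for o in event.occurrences
           if o.occurrenceStatus == "present"]
print(present)
```

Keeping the measurement in its own record, linked by `eventID` and `occurrenceID`, is what lets additional facts travel with the biological data without widening the occurrence table.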
Methodology & activities
The data management activities will focus on further automating the data flow beyond what was accomplished in Phase III. This will suit all partners and sub-contractors that already generate DarwinCore files in a (semi-)automated way from their local structured and maintained databases. All partners/sub-contractors are required to adhere to the DarwinCore EventCore format, which was introduced in the previous phase. Where partners/sub-contractors have absence data available for datasets that were submitted in previous phases, they will be encouraged to update their data with this additional information, which has a high value for the creation of data products in relation to time-evolution and migration.
The online training course developed in the framework of the data training for data grant partners in Phase III will be kept available online through the OceanTeacher platform and – whenever relevant and needed – updates will be made. All partners and sub-contractors have access to this information and they are urged to take the necessary time to get familiar with the content of this course, especially when they are new to the project, or new staff members have joined. Another aim for this online course is to improve the data literacy of not only data providers but also users.
When providing new data or updates to existing data through IPT, each partner/sub-contractor will use the tools available online to quality check the data they provide and, to the greatest extent possible, implement the agreed standards, including the linkage with the BODC controlled vocabularies for the Extended Measurements or Facts (eMoF) data they provide. Partners and sub-contractors providing their data through web-services or other channels will be advised by the Data Management Team on best practices for quality control, standardisation and the use of controlled vocabularies.
Based on the results of the questionnaire to identify historical data within the Consortium, a selection of these resources will be made available to volunteers for digitisation. Within the feasibility study on recognising specific ecological traits and/or sampling devices/methodologies in text, dictionaries will drive the term recognition process. The dictionaries will be created semi-automatically, based on existing collections of terms (e.g. controlled vocabularies).
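A dictionary-driven recognition step of this kind could look roughly like the sketch below. It is only an illustration of the general technique, not the approach chosen for the feasibility study: the mini-dictionary, its categories and the function names are hypothetical, and a real dictionary would be derived from existing controlled vocabularies.

```python
import re
from typing import Dict, Iterable, List, Tuple

# Hypothetical mini-dictionary: lower-case surface form -> (category, preferred label).
TERM_DICTIONARY: Dict[str, Tuple[str, str]] = {
    "van veen grab": ("sampling device", "Van Veen grab"),
    "box corer": ("sampling device", "box corer"),
    "beam trawl": ("sampling device", "beam trawl"),
    "suspension feeder": ("ecological trait", "suspension feeder"),
    "deposit feeder": ("ecological trait", "deposit feeder"),
}


def build_pattern(terms: Iterable[str]) -> "re.Pattern[str]":
    """Compile one case-insensitive pattern matching any whole dictionary term."""
    # Longest terms first, so a longer term is preferred over a shorter one
    # that starts at the same position.
    ordered = sorted(terms, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def recognise_terms(text: str,
                    dictionary: Dict[str, Tuple[str, str]] = TERM_DICTIONARY
                    ) -> List[Tuple[int, str, str]]:
    """Return (character offset, preferred label, category) for each hit."""
    pattern = build_pattern(dictionary.keys())
    hits = []
    for match in pattern.finditer(text):
        category, label = dictionary[match.group(0).lower()]
        hits.append((match.start(), label, category))
    return hits


sample = ("Samples were taken with a Van Veen grab; "
          "the dominant species is a deposit feeder.")
for offset, label, category in recognise_terms(sample):
    print(offset, label, category)
```

Exact matching on word boundaries misses plurals and spelling variants (for example "deposit feeders"), which is one reason a feasibility study would need to assess how far a plain dictionary can go.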
Output (Deliverables)
- D2.1 Inventory of possible historical data resources within the consortium (M6);
- D2.2 Technical implementation of data flows for the new project partners/sub-contractors (M6);
- D2.3 Report on efforts undertaken in rescuing historical data through citizen science (M18);
- D2.4 At least 3 linkages with databases/initiatives outside of the original Consortium, resulting in extra data/information available via the Portal (M20);
- D2.5 Feasibility study for recognition of specific ecological traits and/or sampling devices/methodologies in text (M22);
- D2.6 Report on the standardisation and integration of the proposed new and updated datasets (M24).
WP3: Data product creation
Partners
Lead: U Sheffield.
WP partners: VLIZ, MBA, SMHI, ICES, IH Cantabria, University of Liège, CEFAS, Deltares, NIOZ.
Objectives
Data products take the data supplied to WP2 (Access to Marine Biological Data) and turn them into the outputs which address the needs and questions of end users identified in WP4 (Uptake, Outreach and Communication). This work is at the interface of academic and government science and research software engineering.
In previous phases of EMODnet Biology, a range of products have been developed which adhere to FAIR principles, and which illustrate both the potential and limitations of EMODnet data. These products serve scientific user communities and are also clearly linked to the needs of other users; for instance, their relevance to EOVs (Essential Ocean Variables) has been documented as part of the Product Stories. In addition, in the previous phases, libraries of code and instructions have been provided along with the products, facilitating their reuse. However, the technical bar to reusing and repurposing the products remains high.
The primary objective of WP3 in this next phase of EMODnet Biology will be to improve the software engineering of the product ecosystem to provide more complete and integrated tools to address the specific questions of the user community, for example in the form of self-contained and documented packages or browser-based apps. Alongside this, addressing scientific objectives will continue to be an important component of WP3, for instance developing methodologies to robustly visualise and assess changes in species abundance and extent over time and space.
We also recognise that answering user needs frequently requires integrating data from across EMODnet lots (e.g. EMODnet Chemistry, EMODnet Human Activities, EMODnet Physics and EMODnet Seabed Habitats), as well as from other relevant sources (human activities layers and datasets, helping to showcase localised effects).
In phase III the interoperability of different data types has been addressed by matching species occurrence data to environmental data from EMODnet and elsewhere, and by integrating data across EMODnet Biology and Seabed Habitats lots. A major focus of the new phase will be to increase the linkages between different data sources, providing products which fetch and process data required to address a set of specific user questions.
Methodology & activities
This work package will build on the methodology developed during the Phase III project. Noteworthy advances here included the application of Machine Learning methods to create gridded maps of species abundance in space and through time.
The zooplankton product has been adopted as the plankton Operational Oceanographic Products and Services (OOPS) by ICES as part of their Ecosystem Overviews which describe the trends in pressures and state of regional ecosystems. Considerable progress has also been made in creating robust maps of both species’ presence and absence, using EurOBIS data for a range of functional groups, including >1300 benthic taxa, produced and published following the Phase III workshop. Quantifying absence as well as presence is a crucial step in deriving indices of change which are robust to variable sampling effort.
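To see why absences matter for indices of change, consider the minimal sketch below. It assumes that every sampling event looked for the same list of target taxa and recorded everything it found, so that a taxon not recorded in an event can be treated as absent; that assumption, the function names and the toy data are illustrative and do not describe the method behind the published maps.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def presence_absence(events: Dict[str, Set[str]],
                     target_taxa: Set[str]) -> List[Tuple[str, str, str]]:
    """Expand presence-only records into (event, taxon, status) rows."""
    rows = []
    for event_id in sorted(events):
        recorded = events[event_id]
        for taxon in sorted(target_taxa):
            status = "present" if taxon in recorded else "absent"
            rows.append((event_id, taxon, status))
    return rows


def occupancy_by_period(rows: List[Tuple[str, str, str]],
                        event_period: Dict[str, str]) -> Dict[Tuple[str, str], float]:
    """Share of sampling events in each period where the taxon was present."""
    counts = defaultdict(lambda: [0, 0])  # (period, taxon) -> [present, total]
    for event_id, taxon, status in rows:
        key = (event_period[event_id], taxon)
        counts[key][1] += 1
        if status == "present":
            counts[key][0] += 1
    return {key: present / total for key, (present, total) in counts.items()}


# Toy data: two events in the 1990s, four in the 2010s.
events = {
    "e1": {"Taxon A", "Taxon B"}, "e2": {"Taxon A"},
    "e3": {"Taxon A"}, "e4": {"Taxon A", "Taxon B"},
    "e5": {"Taxon A"}, "e6": {"Taxon A"},
}
event_period = {"e1": "1990s", "e2": "1990s",
                "e3": "2010s", "e4": "2010s", "e5": "2010s", "e6": "2010s"}

rows = presence_absence(events, {"Taxon A", "Taxon B"})
occupancy = occupancy_by_period(rows, event_period)
print(occupancy[("1990s", "Taxon B")], occupancy[("2010s", "Taxon B")])
```

In this toy case Taxon B has one presence record in each period, so raw counts suggest no change, while the share of sampled events in which it was found halves once sampling effort doubles.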
Other products have acted as proof of concept for linking EMODnet Biology data to data from elsewhere in EMODnet (e.g. Seabed Habitats) and from external sources (e.g. environmental data such as sea temperature or pH, and species traits such as fish living modes). The linkage of data from different thematic EMODnet portals reveals new insights and creates added-value products. Some of this work has used High Performance Computing facilities, e.g. by developing parallelised workflows to efficiently run products for very large numbers of species (for example, matching millions of occurrence records for thousands of species to gridded sea temperature products).
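Matching occurrence records to a gridded environmental product comes down to finding the grid cell that contains each record. The sketch below shows that step for a regular latitude/longitude grid held in memory; the grid layout, the `sea_temperature` key and the values are assumptions made for illustration, and a production workflow would read the grid from a file and spread the records over many workers.

```python
import math
from typing import Dict, List, Optional


def grid_index(value: float, origin: float, resolution: float,
               size: int) -> Optional[int]:
    """Index of the cell containing `value`, or None if outside the grid."""
    idx = math.floor((value - origin) / resolution)
    return idx if 0 <= idx < size else None


def match_to_grid(records: List[Dict], grid: List[List[float]],
                  lat0: float, lon0: float, resolution: float) -> List[Dict]:
    """Attach the value of the enclosing grid cell to each occurrence record.

    `grid[i][j]` covers latitudes from lat0 + i * resolution and longitudes
    from lon0 + j * resolution (lower-left corner convention).
    """
    n_rows, n_cols = len(grid), len(grid[0])
    matched = []
    for rec in records:
        i = grid_index(rec["decimalLatitude"], lat0, resolution, n_rows)
        j = grid_index(rec["decimalLongitude"], lon0, resolution, n_cols)
        value = grid[i][j] if i is not None and j is not None else None
        matched.append({**rec, "sea_temperature": value})
    return matched


# Made-up 2 x 2 grid of sea temperatures with 1-degree cells.
grid = [[10.5, 11.0],
        [9.8, 10.1]]
records = [
    {"decimalLatitude": 50.4, "decimalLongitude": 1.2},
    {"decimalLatitude": 51.9, "decimalLongitude": 0.1},
    {"decimalLatitude": 55.0, "decimalLongitude": 0.5},  # outside the grid
]
matched = match_to_grid(records, grid, lat0=50.0, lon0=0.0, resolution=1.0)
print([m["sea_temperature"] for m in matched])
```

Because each record is matched independently, the list of records can be split into chunks and processed in parallel without changing the result.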
In Phase IV we will expand the taxonomic, geographic and temporal extent of presence-absence maps, refining Machine Learning interpolation and distribution modelling approaches, and extending and improving linkages to other EMODnet Lots and to other relevant external data sources (e.g. Copernicus environmental data) and products.
A key development will be to structure products around specific questions and evidence needs of the user community, collaborating closely with WP4, in particular using the outputs of the proposed WP4 questionnaire to capture specific user stories and requirements, as well as targeting priorities identified in the Phase III London Workshop from groups including the Regional Sea Conventions, the EEA (European Environment Agency), ICES (International Council for the Exploration of the Sea), MARS (Marine Research Institutes and Stations), MBON (Marine Biodiversity Observation Network) and the MSP (Marine Spatial Planning) community. Aligning products with relevant EBVs (Essential Biodiversity Variables) and EOVs (Essential Ocean Variables) will also be done in close cooperation with WP4.
Activities will include annual intensive workshops, using the productive model adopted in Phase III to progress product development on targeted themes (e.g. temporal trends, climate drivers, migration routes). Online collaboration will be facilitated with quarterly WP3 community focused calls including training and instruction (e.g. on effective use of GitHub and version control which will facilitate ongoing collaborative work on individual products) as well as discussions of product development.
Output (Deliverables)
- D3.1 Quarterly WP3 community calls; call leader or other nominated team member to produce summary report of each call for publication on EMODnet website (M3-M24);
- D3.2 Annual intensive workshops, in person with online participation options. Workshops involve collaborative product development on one or more targeted themes derived from WP4 user needs questionnaire (M12-?);
- D3.3 Publish R package to link EMODnet Biology data with data from other EMODnet sources (M24);
- D3.4 Develop method to use Phase III presence-absence maps to display time series of distribution change (M12);
- D3.5 Produce position paper outlining questions that can be addressed using EMODnet data, together with remaining gaps, and strategies for filling these (M24);
- D3.6 Add/update data product metadata in the EMODnet Biology catalogue (M24).
WP4: Uptake, outreach and communication
Partners
Lead: Marine Biological Association (MBA)
WP partners: VLIZ, University of Sheffield, SYKE, OGS, ILVO, ICES, CEFAS
Objectives
This work package aims to ensure that the highest level of integration and interoperability is achieved through consultation on community needs and the analysis of existing systems and infrastructures. The outcomes of this Work Package will inform the development of products within WP3 and guide future data collation and integration activities, including those outlined in WP2, thus covering:
- Task 6: Ensure the involvement of regional sea conventions;
- Task 7: Contribute to the implementation of EU legislation and broader initiatives for open data.
The landscape of initiatives, projects, legislative and statutory bodies is complex and evolving. To better understand how EMODnet can deliver essential and operational data, tools, products and services we need to understand our place and connections within this landscape. Through this Work Package we will illustrate our position and the nature and stability of the interconnections.
By engaging with key data consumers and liaison with WP3 and the system architecture (WP5), within WP4 we will provide the conduit to shape the delivery mechanism for products to support statutory reporting requirements. Through the capture and refinement of user ‘stories’ we will help steer the delivery of targeted data products to meet specific requirements. It is essential that this activity is undertaken with close collaboration with the other EMODnet thematic lots to ensure end-users receive the data and products in a low-friction manner.
Methodology & activities
Following the event to showcase EMODnet Biology data products to be held at the end of Phase III, we will begin close collaboration with the other EMODnet lots to define a targeted questionnaire to capture specific user stories and requirements. The key targets will include the Regional Sea Conventions (RSC), the European Environment Agency (EEA), International Council for the Exploration of the Sea (ICES), The European Network of Marine Stations (MARS), Marine Biodiversity Observation Network (MBON), industry and trade associations, educational and Ocean Literacy initiatives (including EMSEA, the European Marine Science Educators Association) and the Marine Spatial Planning community. The outcomes of this activity will contribute to WP3.
Working across the whole EMODnet Biology partnership we will map the connectivity, level of interaction, nature and stability of EMODnet Biology’s involvement in the wide range of regional, continental and global projects, partnerships and initiatives. This activity will allow the identification of areas where the EMODnet Biology community can best target engagement and promote interactions. Taking input from previous deliverables and the available network we will identify, map and visualise the interactions to inform future engagement priorities.
We will endeavour to increase our engagement with each of the RSC to more effectively capture the requirements, whilst also providing clear examples of what is currently available and how it can be integrated into RSC operational requirements. Key RSC working groups and committees will be identified and the potential for EMODnet partners to be embedded within them will be explored.
Building on D5.5: Transatlantic data integration and product development workshop and existing collaborations with the MBON community we will establish EMODnet Biology, consortium members and other key participants as the European Node of the MBON network. Through close consultation and inclusion of other key partners and stakeholders including the Marine Research Stations Network (MARS), relevant European Strategy Forum on Research Infrastructures (ESFRI) Research Infrastructures, EurOBIS and the EOOS community we will ensure a cohesive and inclusive network is in place.
We will actively engage with the wider stakeholder landscape based on those interactions mapped through D4.2. By contributing to the project, initiative and organisational newsletters and engagement channels we will increase the reach of EMODnet Biology to new areas and consolidate and inform on the new and innovative developments taking place.
Output (Deliverables)
- D4.1: Questionnaire to inform cross-lot product development (M3);
- D4.2: EMODnet Biology connectivity ‘map’ of projects, institutes, initiatives and networks to inform targeted engagement (M12);
- D4.3: EMODnet Biology participation in each of the RSCs to inform and advise of available data products and mechanisms to access and influence the development of data, products, tools and services (M12);
- D4.4: “Launch” of the European MBON node (M24);
- D4.5: Creation of engaging and informative use-cases for EMODnet Biology to illustrate uptake and utility of data products among a range of stakeholders across the quadruple helix of engagement (M6, M12, M18, M24).
WP5: System architecture
Partners
Lead: VLIZ
Objectives
The main objective for WP5 is defined in Task 3: to develop a complete and robust machine-to-machine (M2M) interface to transfer data and products in bulk, which is easily accessible for other machines and initiatives such as the EMODnet Central Portal, the European Open Science Cloud (EOSC), Digital Earth and other EU initiatives. This will make all the data, metadata and data products created and mobilised during previous phases, as well as Phase IV of the project, available in a way that follows the FAIR principles.
The EMODnet Biology M2M interface will be maintained throughout the duration of the project and its users, usability and effectiveness will be monitored.
We will build further upon the Open Geospatial Consortium (OGC) webservices that have been developed during the previous EMODnet Biology phases. A set of new standards and infrastructure technologies will be added in order to meet the requirements described in this tender and discussed in detail under the M2M interface components.
Finally, within Task 9 the maintenance of the existing thematic web portal for EMODnet Biology will be ensured for at least 6 months from the start of the contract.
Methodology & activities
To improve the technical infrastructure that allows an M2M client interface to find, download and subset metadata, data and data products in bulk, EMODnet Biology will build upon the webservices implemented in the previous phases of the project. Based on the Central Portal requirements (out of scope of this tender), a number of new technologies will be further analysed, tested and implemented, as summarised below.
The current technologies that will be evaluated and upgraded are:
- PostgreSQL (raw occurrence records and data products);
- Geoserver (OGC webservices);
- Geonetwork (ISO19115 metadata).
New server technologies that will be evaluated and, if feasible, implemented include (but are not limited to):
- Open-source Project for a Network Data Access Protocol (OPeNDAP);
- ERDDAP;
- THREDDS Data Server (TDS);
- Other technologies, to be defined.
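To give a feel for what bulk, subsettable M2M access could look like with one of the candidate technologies, the sketch below builds an ERDDAP tabledap request URL: a comma-separated list of variables followed by percent-encoded constraints. The server address and dataset identifier are hypothetical placeholders, since the text above only lists ERDDAP as a technology to be evaluated; no request is actually sent.

```python
from typing import Iterable, Tuple
from urllib.parse import quote


def tabledap_url(server: str, dataset_id: str, variables: Iterable[str],
                 constraints: Iterable[Tuple[str, str, object]],
                 file_type: str = "csv") -> str:
    """Build an ERDDAP tabledap URL from variables and (name, operator, value) constraints."""
    query = ",".join(variables)
    for name, operator, value in constraints:
        # Operators and values are percent-encoded so the URL stays valid.
        query += "&" + name + quote(operator, safe="") + quote(str(value), safe="")
    return f"{server.rstrip('/')}/tabledap/{dataset_id}.{file_type}?{query}"


# Hypothetical server and dataset; only the URL structure is meaningful here.
url = tabledap_url(
    "https://example.org/erddap",
    "hypothetical_dataset",
    ["time", "latitude", "longitude"],
    [("time", ">=", "2021-01-01T00:00:00Z"), ("latitude", ">=", 50)],
)
print(url)
```

Because the subset is fully described by the URL, another machine can request exactly the rows and columns it needs in a single call, which is the property the M2M interface is after.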
Output (Deliverables)
- D5.1: User portal operation and maintenance (M0-M6);
- D5.2: Webservices operation and maintenance (M0-M24);
- D5.3: Technology stack upgraded (M12);
- D5.4: Evaluation and implementation of bulk data transfer technologies (M24).
Key services
EMODnet Biology provides key services and products which allow users to search and visualise data and related data products:
- Occurrence data (species observations) as Web Feature Service (WFS);
- Additional measurements, linked to the occurrence data, as Web Feature Service (WFS);
- Retrieve data of the gridded abundance data products as WFS/WMS;
- Using the AphiaID to query by (biological) taxonomy;
- Using the IMISDasID to query by metadata dataset;
- Examples of EMODnet Biology Data applications written in R.
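As an illustration of the WFS access listed above, the sketch below assembles a standard WFS 2.0.0 GetFeature request that filters occurrences by taxon. The endpoint, the layer name `hypothetical:occurrences` and the `aphiaid` attribute are placeholders chosen for the example; the real layer and attribute names should be taken from the service's capabilities document. `CQL_FILTER` is a GeoServer vendor parameter rather than part of the WFS standard.

```python
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse


def wfs_getfeature_url(endpoint: str, type_name: str,
                       cql_filter: Optional[str] = None,
                       max_features: int = 100,
                       output_format: str = "application/json") -> str:
    """Build a WFS 2.0.0 GetFeature URL (key-value-pair encoding)."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,
        "count": max_features,
        "outputFormat": output_format,
    }
    if cql_filter:
        # Vendor extension understood by GeoServer; other servers may ignore it.
        params["CQL_FILTER"] = cql_filter
    return endpoint + "?" + urlencode(params)


# Placeholder endpoint, layer and AphiaID; no request is sent.
url = wfs_getfeature_url(
    "https://example.org/geoserver/wfs",
    "hypothetical:occurrences",
    cql_filter="aphiaid=123456",
    max_features=10,
)
parsed = parse_qs(urlparse(url).query)
print(url)
print(parsed["CQL_FILTER"])
```

Querying by dataset rather than by taxon would follow the same pattern with a filter on the dataset identifier attribute instead.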
The Integrated Publishing Toolkit (IPT), a freely available open source web application using the Darwin Core standard, makes it easy to share biodiversity-related data and information with the EMODnet portal.
The Quality Check tool was created to support data providers in assessing the quality of their datasets prior to submission. It can be used either through an Rshiny application or as an R package.
The training, originally set up for EMODnet Biology partners, has been open for self-enrolment since summer 2020. It gives scientists and data managers access to material for learning how to format, standardise and quality control their biological data for submission to EMODnet Biology.
You can enrol and follow the course on OTGA (Ocean Teacher Global Academy): classroom.oceanteacher.org/enrol/index.php?id=430
An R package, partially funded by EMODnet Biology, is designed to make all EMODnet thematic lots’ raster data layers easily accessible in R. The package allows users to query information on and download data from all available EMODnet Web Coverage Service (WCS) endpoints directly into their R working environment.
A second R package, created by EMODnet Biology partners, is designed to make all EMODnet thematic lots’ vector data layers easily accessible in R. The package allows users to query information on and download data from all available EMODnet Web Feature Service (WFS) endpoints directly into their R working environment.
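The R packages are the supported route. For readers curious about what "querying information on" an endpoint involves at the protocol level, the Python sketch below lists the layers advertised in a WFS 2.0 capabilities document. The embedded XML is a trimmed, hypothetical stand-in for the response to a GetCapabilities request, and the layer names in it are invented.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Namespace used by WFS 2.0 capabilities documents.
WFS_NS = {"wfs": "http://www.opengis.net/wfs/2.0"}

# Trimmed, made-up example of a GetCapabilities response.
SAMPLE_CAPABILITIES = """<wfs:WFS_Capabilities xmlns:wfs="http://www.opengis.net/wfs/2.0" version="2.0.0">
  <wfs:FeatureTypeList>
    <wfs:FeatureType>
      <wfs:Name>hypothetical:occurrences</wfs:Name>
      <wfs:Title>Hypothetical occurrence layer</wfs:Title>
    </wfs:FeatureType>
    <wfs:FeatureType>
      <wfs:Name>hypothetical:measurements</wfs:Name>
      <wfs:Title>Hypothetical measurement layer</wfs:Title>
    </wfs:FeatureType>
  </wfs:FeatureTypeList>
</wfs:WFS_Capabilities>"""


def list_feature_types(capabilities_xml: str) -> List[Tuple[str, str]]:
    """Return (name, title) for every feature type in a capabilities document."""
    root = ET.fromstring(capabilities_xml)
    layers = []
    for feature_type in root.findall("wfs:FeatureTypeList/wfs:FeatureType", WFS_NS):
        name = feature_type.findtext("wfs:Name", default="", namespaces=WFS_NS)
        title = feature_type.findtext("wfs:Title", default="", namespaces=WFS_NS)
        layers.append((name, title))
    return layers


layers = list_feature_types(SAMPLE_CAPABILITIES)
for name, title in layers:
    print(name, "-", title)
```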
BTrait was developed in the scope of EMODnet Biology to facilitate working with species density data, combined with species traits, in R. It allows users to query the linked datasets (species density and trait data) and visualise them with an interactive shiny application.
Data and Products
EMODnet Biology provides access to data from a wide range of sources and actively pursues inclusion of new and historical data sets to the inventory based on careful assessment of the ease of use and fitness for purpose of the data and associated databases.
The databases feeding into EMODnet Biology contain data from all regional and sub-regional seas of Europe, as specified by the Marine Strategy Framework Directive.
To ensure interoperability, EMODnet Biology implements (and if necessary adapts) common standards and vocabularies defined and used by SeaDataNet, WoRMS (World Register of Marine Species), OBIS (Ocean Biodiversity Information System), INSPIRE, GBIF (Global Biodiversity Information Facility), Marine Regions and the Lifewatch infrastructure.
Data sources
The main data contributors are:
- International biogeographic datasets from EurOBIS (European Ocean Biodiversity Information System);
- National monitoring programmes;
- International monitoring campaigns (databases storing data from multiple countries within the same regional European sea);
- International data aggregators;
- Data archaeology:
  - datasets recovered from scientists’ personal files;
  - Excel spreadsheets;
  - paper documents;
  - other formats that would otherwise be lost or inaccessible.
Data product development
EMODnet Biology’s gridded map layers of species abundance for different time windows using geospatial modelling are made available to all users. In addition, we also create spatially distributed data products specifically relevant for Marine Strategy Framework Directive Descriptor 2 (non-indigenous species).
EMODnet Biology is currently working on the development of the following data products:
- Implementation of a methodology to produce statistically optimized gridded map layers based on Data-Interpolating Variational Analysis (DIVA, ULg);
- Estimation of the accuracy of the gridding procedure by comparison with validated data;
- Complementation of the gridded map of averages with indications of the precision of the result based on the distribution of the basic data used to calculate the products;
- Production of spatial maps of quality indicators relevant for Marine Strategy Framework Directive.
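The idea of pairing a gridded map of averages with an indication of its precision can be shown with a deliberately simple sketch. It bins observations into regular cells and reports the mean, the number of observations and the standard error per cell; this plain binning is a stand-in chosen for brevity and is not the DIVA variational interpolation named above. The observations are made up.

```python
import math
from collections import defaultdict
from statistics import mean, stdev
from typing import Dict, Iterable, Tuple


def grid_cell(lat: float, lon: float, cell_size: float) -> Tuple[int, int]:
    """Integer (row, column) index of the cell containing a position."""
    return (math.floor(lat / cell_size), math.floor(lon / cell_size))


def gridded_summary(observations: Iterable[Tuple[float, float, float]],
                    cell_size: float = 1.0) -> Dict[Tuple[int, int], dict]:
    """Per-cell mean abundance with sample size and standard error."""
    cells = defaultdict(list)
    for lat, lon, abundance in observations:
        cells[grid_cell(lat, lon, cell_size)].append(abundance)
    summary = {}
    for cell, values in cells.items():
        n = len(values)
        # With a single observation the spread cannot be estimated.
        std_error = stdev(values) / math.sqrt(n) if n > 1 else None
        summary[cell] = {"mean": mean(values), "n": n, "std_error": std_error}
    return summary


# Made-up observations: (latitude, longitude, abundance).
observations = [(51.2, 2.3, 10.0), (51.7, 2.9, 14.0), (52.1, 3.4, 7.0)]
summary = gridded_summary(observations)
for cell in sorted(summary):
    print(cell, summary[cell])
```

Cells backed by a single observation get no standard error at all, which is exactly the kind of information a precision layer is meant to make visible next to the map of averages.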
Data infrastructure
EMODnet Biology’s data infrastructure and data flow are those of EurOBIS. Submitted data undergo a series of quality control procedures before being made available online, covering:
- Metadata;
- Required data fields, including Taxonomy, Position, Distance from land, Units.
The data infrastructure of EMODnet Biology is able to handle different data protocols and data standards for the exchange of marine biodiversity data:
- World Register of Marine Species (WoRMS) as taxonomic backbone;
- Darwin Core standard used by the Global Biodiversity Information Facility (GBIF) and the Ocean Biodiversity Information System (OBIS);
- Specific data format enabling National Oceanographic Data Centres (NODCs) to make biological data accessible using the SeaDataNet infrastructure;
- Several OGC web services that make geospatial data accessible:
  - Catalogue Service for the Web (CSW) for metadata resources;
  - Web Feature Service (WFS) to allow requests for geographic features across the web;
  - Web Coverage Service (WCS) to allow requests for gridded data across the web;
  - Web Map Service (WMS) to allow requests for maps across the web.
- In-house developed web services. In these cases, the available data are examined in detail and mapped to the Darwin Core schema so that as much data and information as possible is captured.
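OGC services such as WFS are queried by passing key-value parameters in the request URL. The following is a minimal sketch of building a WFS 2.0.0 GetFeature request. The endpoint URL and layer name are placeholders, not actual EMODnet Biology addresses, and the output formats a server accepts vary, so the default used here is an assumption.

```python
from urllib.parse import urlencode

# Placeholder values for illustration only; substitute the real
# service endpoint and layer name published by the data provider.
EXAMPLE_ENDPOINT = "https://example.org/geoserver/wfs"
EXAMPLE_LAYER = "demo:occurrences"


def build_wfs_getfeature_url(endpoint, type_name, count=None,
                             output_format="application/json"):
    """Return a WFS 2.0.0 GetFeature URL for one feature type.

    `count` limits the number of returned features (WFS 2.0.0 name
    for this parameter). `output_format` is server-dependent.
    """
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,
        "outputFormat": output_format,
    }
    if count is not None:
        params["count"] = count
    return f"{endpoint}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_wfs_getfeature_url(EXAMPLE_ENDPOINT, EXAMPLE_LAYER, count=10))
```

A GetCapabilities request against the same endpoint (request=GetCapabilities) lists the layer names and output formats the server actually offers.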
Submitted data undergo the following quality control procedures before being made available online:
- Metadata: the data management team checks whether the data and the supplied metadata match and whether all necessary fields are filled in correctly and as completely as possible. If important information is missing, a notification is sent to the data provider asking for its completion.
- Required data fields: if the required data fields are not properly filled, a notification will be sent to the data provider. The data will not be published until all required fields are complete.
- Taxonomy: all taxon names are linked to the World Register of Marine Species (WoRMS). Unmatched taxa are sent back to the data provider for a second check. Taxa with uncertain identifications are matched to the first suitable higher taxonomic level. Originally provided taxon names are stored in the database, so that the original information can be revisited. When no taxon match can be made, the name is added to an 'annotation list': this list keeps track of the editors' comments on why a taxon cannot be added to the World Register of Marine Species (see also the section on standards and quality control).
- Geography: all supplied coordinates are converted to the WGS84 coordinate system and expressed as decimal degrees. These coordinates are then checked for positioning errors, which can include sampling locations on land or in regions other than those stated in the supplied metadata. Such errors can result from accidentally swapping latitude and longitude or from incorrect use of the minus sign. Any such errors are communicated to the data provider, so the necessary corrections can be made.
- Depth: two checks are performed: (1) is the documented depth value possible when compared with the General Bathymetric Chart of the Oceans (GEBCO), and (2) is the documented depth value possible when compared with the known depth range of the species?
- Units: if abundance and/or biomass data are supplied, the presence of the relevant units is checked. The absence of units prevents data comparison between different datasets.
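Part of the geography check can be sketched in a few lines. The example below compares a position with the bounding box stated in the metadata and, when the point falls outside it, tests whether swapping latitude and longitude or flipping a sign would bring it back inside. The class and function names are hypothetical, and the on-land test is left out because it needs coastline data.

```python
from typing import NamedTuple


class BBox(NamedTuple):
    """Expected region from the metadata, in decimal degrees (WGS84)."""
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def contains(self, lat, lon):
        return (self.min_lat <= lat <= self.max_lat
                and self.min_lon <= lon <= self.max_lon)


def diagnose_position(lat, lon, expected):
    """Return a short diagnosis for a position relative to `expected`.

    The order of the tests matters: a point that is already inside the
    region is accepted before any correction is considered.
    """
    if expected.contains(lat, lon):
        return "ok"
    if expected.contains(lon, lat):
        return "possible latitude/longitude swap"
    if expected.contains(lat, -lon):
        return "possible sign error in longitude"
    if expected.contains(-lat, lon):
        return "possible sign error in latitude"
    return "outside expected region"
```

The diagnosis is only a hint for the message sent to the data provider; the correction itself still has to come from the provider.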
Data format
Three format options are presented to you upon downloading data:
- Basic Occurrence: provides the user with all data necessary to do temporal and spatial analysis of the different taxa.
- Full Occurrence: provides additional information which may help interpret the basic data.
- Full Occurrence and Parameters: provides the user with all quantitative data and facts associated with the occurrence or the sample. These parameters include abiotic measurements (e.g. temperature, grain size), environmental facts (e.g. habitat), biotic measurements (e.g. abundance, length), biotic descriptors (e.g. life stage, sex) and sampling descriptors (e.g. sampling instrument, surface area). The data are standardised: measurements have been recalculated to common units, and facts and descriptors have been standardised using controlled vocabularies.
For each of these options, several essential terms are delivered. The level of detail in each file depends on the chosen option.
The box 'Biology - Data format' below contains a list of all terms that are delivered for each option.
Term | Definition | TDWG URI | Basic Occurrence | Full Occurrence |
---|---|---|---|---|
datasetid | An identifier that refers to the metadata record of the dataset in the EMODnet Biology Catalog. | rs.tdwg.org/dwc/terms/datasetID | Yes | Yes |
datecollected | A date in full ISO format (YYYY-MM-DDTHH:MM:SS) generated by EMODnet Biology based on the data in the fields yearcollected, monthcollected, daycollected and timeofday. Unknown information is stored as the minimum value (e.g. YYYY-01-01T00:00:00 will be used if only the year is known). | | Yes | Yes |
decimallongitude | The geographic longitude (in decimal degrees, using WGS84 / EPSG:4326) of the geographic center of a Location. Positive values are east of the Greenwich Meridian, negative values are west of it. | rs.tdwg.org/dwc/terms/decimalLongitude | Yes | Yes |
decimallatitude | The geographic latitude (in decimal degrees, using WGS84 / EPSG:4326) of the geographic center of a Location. Positive values are north of the Equator, negative values are south of it. | rs.tdwg.org/dwc/terms/decimalLatitude | Yes | Yes |
coordinateuncertaintyinmeters | The horizontal distance (in meters) from the given decimallatitude and decimallongitude describing the smallest circle containing the whole of the Location. The value is left empty if the uncertainty is unknown, cannot be estimated, or is not applicable (because there are no coordinates). Zero is not a valid value for this term. | rs.tdwg.org/dwc/terms/coordinateUncertaintyInMeters | Yes | Yes |
scientificname | The full name of the lowest taxon level that the specimen(s) can be identified as a member of; includes genus, specific epithet, and subspecific epithet (zool.) or infraspecific rank abbreviation, and infraspecific epithet. | rs.tdwg.org/dwc/terms/scientificName | Yes | Yes |
aphiaid | A link generated by EMODnet Biology based on the scientificnameid that references the scientificname in the World Register of Marine Species. | | Yes | Yes |
scientificnameaccepted | The scientific name of the currently valid or accepted taxon, generated by EMODnet Biology by referencing the scientificnameid in the World Register of Marine Species. | | Yes | Yes |
modified | The most recent date-time on which the record was changed. | purl.org/dc/terms/modified | | Yes |
institutioncode | The name (or acronym) in use by the institution having custody of the object(s) or information referred to in the record. | rs.tdwg.org/dwc/terms/institutionCode | | Yes |
collectioncode | The name, acronym, code, or initialism identifying the collection or data set from which the record was derived. | rs.tdwg.org/dwc/terms/collectionCode | | Yes |
eventid | An identifier for the set of information associated with an Event (something that occurs at a place and time). May be a global unique identifier or an identifier specific to the data set. | rs.tdwg.org/dwc/terms/eventID | | Yes |
seasoncollected | The season (winter, spring, summer, autumn) generated by EMODnet Biology based on the data in the fields monthcollected and daycollected. | | | Yes |
yearcollected | The year during which the sample or observation occurred. | | | Yes |
startyearcollected | For samples or observations that were taken over a duration of time this term contains the start year of the collecting event. If the collection date is uncertain, this term can be used to store the minimum year of the interval for which there is certainty. | | | Yes |
endyearcollected | For samples or observations that were taken over a duration of time this term contains the end year of the collecting event. If the exact year is uncertain, this term can be used to store the maximum year of the interval for which there is certainty. | | | Yes |
monthcollected | The month of the year during which the sample or observation occurred. | | | Yes |
startmonthcollected | For samples or observations that were taken over a duration of time this term gives the start month of the collecting event. | | | Yes |
endmonthcollected | For samples or observations that were taken over a duration of time this term gives the end month of the collecting event. | | | Yes |
daycollected | The day of the year during which the sample or observation occurred. | | | Yes |
startdaycollected | For samples or observations that were taken over a duration of time this term gives the start day of the collecting event. | | | Yes |
enddaycollected | For samples or observations that were taken over a duration of time this term gives the end day of the collecting event. | | | Yes |
timeofday | The time of the day a specimen was collected expressed as decimal hours from midnight (e.g. 12.0 = mid day, 13.5 = 1:30pm). | | | Yes |
starttimeofday | For samples or observations that were taken over a duration of time this gives the start time of the day of the collecting event expressed as decimal hours from midnight (e.g. 12.0 = mid day, 13.5 = 1:30pm). | | | Yes |
endtimeofday | For samples/observations/record events that were taken over a duration of time this gives the end time of the day of the collecting event expressed as decimal hours from midnight (e.g. 12.0 = mid day, 13.5 = 1:30pm). | | | Yes |
timezone | Indicates the time zone for the timeofday measurement. An empty value indicates local time. | | | Yes |
waterbody | The name of the water body in which the sample or observation occurred. | rs.tdwg.org/dwc/terms/waterBody | | Yes |
country | The name of the country or major administrative unit in which the sample or observation occurred. | rs.tdwg.org/dwc/terms/country | | Yes |
stateprovince | The name of the next smaller administrative region than country (state, province, canton, department, region, etc.) in which the sample or observation occurred. | rs.tdwg.org/dwc/terms/stateProvince | | Yes |
county | The full, unabbreviated name of the next smaller administrative region than stateProvince (county, shire, department, etc.) in which the sample or observation occurred. | rs.tdwg.org/dwc/terms/county | | Yes |
recordnumber | An identifier given to the Occurrence at the time it was recorded. Often serves as a link between field notes and an Occurrence record, such as a specimen collector's number. | rs.tdwg.org/dwc/terms/recordNumber | | Yes |
fieldnumber | An identifier given to the event in the field. Often serves as a link between field notes and the Event. | rs.tdwg.org/dwc/terms/fieldNumber | | Yes |
startdecimallongitude | For samples or observations that are better represented as line features rather than point features (e.g. extended trawls or transects) this term indicates the starting longitude location from which the specimen was collected. | | | Yes |
enddecimallongitude | For samples or observations that are better represented as line features rather than point features (e.g. extended trawls or transects) this term indicates the end longitude location from which the specimen was collected. | | | Yes |
startdecimallatitude | For samples or observations that are better represented as line features rather than point features (e.g. extended trawls or transects) this term indicates the starting latitude location from which the specimen was collected. | | | Yes |
enddecimallatitude | For samples or observations that are better represented as line features rather than point features (e.g. extended trawls or transects) this term indicates the end latitude location from which the specimen was collected. | | | Yes |
georeferenceprotocol | A description or reference to the methods used to determine the spatial footprint, coordinates, and uncertainties. | rs.tdwg.org/dwc/terms/georeferenceProtocol | | Yes |
minimumdepthinmeters | The minimum distance in metres below the surface of the water at which the collection/record was made; all material collected was at least this deep. Positive below the surface, negative above (e.g. collecting above sea level in tidal areas). | rs.tdwg.org/dwc/terms/minimumDepthInMeters | | Yes |
maximumdepthinmeters | The maximum distance in metres below the surface of the water at which the collection/record was made; all material collected was at most this deep. Positive below the surface, negative above (e.g. collecting above sea level in tidal areas). | rs.tdwg.org/dwc/terms/maximumDepthInMeters | | Yes |
occurrenceid | An identifier for the Occurrence. In the absence of a persistent global unique identifier, one is constructed from a combination of identifiers in the record that will most closely make the occurrenceID globally unique. | rs.tdwg.org/dwc/terms/occurrenceID | | Yes |
scientificnameid | The LSID (life science identifier) which references the scientificname in the World Register of Marine Species. | rs.tdwg.org/dwc/terms/scientificNameID | | Yes |
taxonrank | The taxonomic rank of the scientificName, generated by EMODnet Biology by referencing the scientificnameid in the World Register of Marine Species. | rs.tdwg.org/dwc/terms/taxonRank | | Yes |
scientificnameauthorship | The authorship information for the scientificName of the currently valid or accepted taxon, formatted according to the conventions of the applicable nomenclaturalCode. | rs.tdwg.org/dwc/terms/scientificNameAuthorship | | Yes |
aphiaidaccepted | A link generated by EMODnet Biology based on the scientificnameid that references the currently valid or accepted name for the taxon in the World Register of Marine Species (http://www.marinespecies.org/). | | | Yes |
kingdom | The kingdom in which the scientificName of the currently valid or accepted taxon is classified, generated by EMODnet Biology by referencing the scientificnameid in the World Register of Marine Species. | rs.tdwg.org/dwc/terms/kingdom | | Yes |
phylum | The phylum in which the scientificName of the currently valid or accepted taxon is classified, generated by EMODnet Biology by referencing the scientificnameid in the World Register of Marine Species. | rs.tdwg.org/dwc/terms/phylum | | Yes |
class | The class in which the scientificName of the currently valid or accepted taxon is classified, generated by EMODnet Biology by referencing the scientificnameid in the World Register of Marine Species. | rs.tdwg.org/dwc/terms/class | | Yes |
order | The order in which the scientificName of the currently valid or accepted taxon is classified, generated by EMODnet Biology by referencing the scientificnameid in the World Register of Marine Species. | rs.tdwg.org/dwc/terms/order | | Yes |
family | The family in which the scientificName of the currently valid or accepted taxon is classified, generated by EMODnet Biology by referencing the scientificnameid in the World Register of Marine Species. | rs.tdwg.org/dwc/terms/family | | Yes |
genus | The genus in which the scientificName of the currently valid or accepted taxon is classified, generated by EMODnet Biology by referencing the scientificnameid in the World Register of Marine Species. | rs.tdwg.org/dwc/terms/genus | | Yes |
subgenus | The subgenus in which the scientificName of the currently valid or accepted taxon is classified, generated by EMODnet Biology by referencing the scientificnameid in the World Register of Marine Species. | rs.tdwg.org/dwc/terms/subgenus | | Yes |
specificepithet | The first or species epithet (species name) of the scientificName of the currently valid or accepted taxon, generated by EMODnet Biology by referencing the scientificnameid in the World Register of Marine Species. | rs.tdwg.org/dwc/terms/specificEpithet | | Yes |
infraspecificepithet | The lowest or terminal infraspecific epithet (subspecies name) of the scientificName of the currently valid or accepted taxon, generated by EMODnet Biology by referencing the scientificnameid in the World Register of Marine Species. | rs.tdwg.org/dwc/terms/infraspecificEpithet | | Yes |
occurrenceremarks | Comments or notes about the Occurrence. | rs.tdwg.org/dwc/terms/occurrenceRemarks | | Yes |
basisofrecord | The specific nature of the data record. | rs.tdwg.org/dwc/terms/basisOfRecord | | Yes |
typestatus | Indicates the kind of nomenclatural type that a specimen represents, for example holotype, syntype, paratype, lectotype, paralectotype, neotype, schizotype, allotype, hapantotype. | rs.tdwg.org/dwc/iri/typeStatus | | Yes |
catalognumber | An identifier (preferably unique) for the record within the data set or collection. | rs.tdwg.org/dwc/terms/catalogNumber | | Yes |
references | Gives the web address of the page where more information on this particular record (not on the whole dataset) can be found. | purl.org/dc/terms/references | | Yes |
recordedby | A list (concatenated and separated) of names of people, groups, or organizations responsible for recording the original Occurrence. The primary collector or observer, especially one who applies a personal identifier (recordNumber), should be listed first. | rs.tdwg.org/dwc/terms/recordedBy | | Yes |
identifiedby | A list (concatenated and separated) of names of people, groups, or organizations who assigned the Taxon to the subject. | rs.tdwg.org/dwc/terms/identifiedBy | | Yes |
yearidentified | The year on which the subject was identified as representing the Taxon. | | | Yes |
monthidentified | The month on which the subject was identified as representing the Taxon. | | | Yes |
dayidentified | The day on which the subject was identified as representing the Taxon. | | | Yes |
preparations | A preparation or preservation method for a specimen. | rs.tdwg.org/dwc/iri/preparations | | Yes |
samplingeffort | The amount of effort expended during an Event. | rs.tdwg.org/dwc/terms/samplingEffort | | Yes |
samplingprotocol | The method or protocol used during an Event. | rs.tdwg.org/dwc/iri/samplingProtocol | | Yes |
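
The rule given for datecollected (unknown parts stored as their minimum value, time taken from timeofday in decimal hours) can be sketched as follows. The function name is hypothetical and this is not EMODnet Biology's actual implementation; time zone handling is deliberately left out.

```python
from datetime import datetime, timedelta


def build_datecollected(yearcollected, monthcollected=None,
                        daycollected=None, timeofday=None):
    """Combine the separate date fields into one ISO 8601 string.

    Missing month or day default to 1 and a missing time defaults to
    midnight, matching the 'minimum value' rule in the table.
    `timeofday` is expressed as decimal hours from midnight.
    """
    moment = datetime(yearcollected, monthcollected or 1, daycollected or 1)
    if timeofday is not None:
        moment += timedelta(hours=timeofday)
    # timespec="seconds" keeps the YYYY-MM-DDTHH:MM:SS layout.
    return moment.isoformat(timespec="seconds")


if __name__ == "__main__":
    print(build_datecollected(1998))               # 1998-01-01T00:00:00
    print(build_datecollected(2010, 6, 15, 13.5))  # 2010-06-15T13:30:00
```

Because unknown parts collapse to the minimum value, a datecollected of 1 January at midnight may mean "only the year is known"; the separate yearcollected, monthcollected and daycollected fields in the Full Occurrence format show which parts were actually supplied.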