Data Ingestion Protocol provides quality assurance for marine evidence
Collecting high-quality evidence is essential to JNCC’s role as advisor to the UK government on nature conservation. Our evidence supports policy and programme decisions, so we must be able to give assurance that the data underpinning it meet a high standard of quality.
In the marine area, some of the evidence we need to support our work is gathered through offshore seabed surveys. A critical aspect of survey data management is quality checking the various datasets that have been acquired before they are used by our Marine teams and made publicly available.
The Data Ingestion Protocol (DIP) is a set of checks (for data completeness, accuracy, consistency and validity) that validates marine survey data products for use both within JNCC and by others. This vital stage ensures that high data quality standards are achieved and maintained across all the offshore survey data we process.
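As an illustration only, the sketch below shows how three of these check categories (completeness, validity and consistency) might be applied to tabular station records. It is not the DIP itself: the field names in REQUIRED_FIELDS, the function name check_records and the rules chosen are all assumptions made for the example, and Python is used purely for convenience. Accuracy checks are left out because they generally need a trusted reference to compare against.

```python
# Hypothetical required fields; the real DIP field list is not given in the text.
REQUIRED_FIELDS = ("station_id", "latitude", "longitude", "sample_date")


def check_records(records):
    """Return (row index, check name, message) tuples for records that fail a check."""
    problems = []
    seen_ids = set()
    for index, record in enumerate(records):
        # Completeness: every required field must be present and non-empty.
        missing = [name for name in REQUIRED_FIELDS if record.get(name) in (None, "")]
        if missing:
            problems.append((index, "completeness", "missing " + ", ".join(missing)))
            continue

        # Validity: coordinates must be numeric and within the legal ranges.
        try:
            lat = float(record["latitude"])
            lon = float(record["longitude"])
        except (TypeError, ValueError):
            problems.append((index, "validity", "non-numeric coordinates"))
            continue
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            problems.append((index, "validity", "coordinates out of range"))

        # Consistency: a station identifier should not appear more than once.
        if record["station_id"] in seen_ids:
            problems.append((index, "consistency", "duplicate station_id"))
        seen_ids.add(record["station_id"])
    return problems
```

Each returned tuple names the failed check, so problems can be grouped by category when they are reported back to data owners.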
Validating survey data
Data collected on offshore surveys include grab samples, acoustic data, seabed video and still images. These are collected using equipment including drop-cameras, Hamon grabs, multibeam echosounders (MBES) and sidescan sonar (SSS) systems. Each data type is run through its own specific process to check data quality.
The majority of the checks forming the DIP have been automated using an open access statistical programming environment. This provides an efficient means of processing high volumes of survey data. For example, an initial check returns error logs listing any problematic files, which can then be addressed by the relevant data owners. As survey datasets usually run to terabytes of data, automating this process greatly reduces the time the checks take.
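A minimal sketch of such an initial file check is given below, assuming two simple rules: files must not be empty and must have an expected extension. The article does not name the programming environment or the actual rules used, so the Python code, the EXPECTED_SUFFIXES set and the function name initial_file_check are hypothetical and serve only to show the shape of an automated check that returns an error log.

```python
from pathlib import Path

# Hypothetical set of accepted extensions; the real DIP rules are not given in the text.
EXPECTED_SUFFIXES = {".csv", ".jpg", ".mp4", ".xyz"}


def initial_file_check(survey_dir):
    """Walk a survey directory and return an error log of (file path, problem) entries."""
    error_log = []
    for path in sorted(Path(survey_dir).rglob("*")):
        if not path.is_file():
            continue  # skip directories
        if path.stat().st_size == 0:
            error_log.append((str(path), "empty file"))
        elif path.suffix.lower() not in EXPECTED_SUFFIXES:
            error_log.append((str(path), "unexpected file type"))
    return error_log
```

The returned list could be written to a file and passed to the relevant data owner; files that pass both rules do not appear in it.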
After the checks have been executed, all changes to the dataset are recorded in audit logs. This step is crucial when data are passed on to end-users or analysts, or incorporated into monitoring reports. Well-managed auditing means all changes can be tracked and accounted for.
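One simple way to keep such an audit trail is an append-only log with one row per change. The sketch below assumes a CSV file and a hypothetical set of columns (AUDIT_FIELDS); the article does not describe the real audit log format, so this is an illustration in Python, not JNCC's implementation.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical audit columns; the real log structure is not described in the text.
AUDIT_FIELDS = ["timestamp", "dataset", "record_id", "field",
                "old_value", "new_value", "reason"]


def record_change(audit_path, dataset, record_id, field, old_value, new_value, reason):
    """Append one change to a CSV audit log, writing the header if the file is new."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "dataset": dataset,
        "record_id": record_id,
        "field": field,
        "old_value": old_value,
        "new_value": new_value,
        "reason": reason,
    }
    is_new = not Path(audit_path).exists()
    # Open in append mode so earlier entries are never overwritten.
    with open(audit_path, "a", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=AUDIT_FIELDS)
        if is_new:
            writer.writeheader()
        writer.writerow(entry)
    return entry
```

Because both the old and the new value are stored with a reason, any correction made during the checks can later be traced or reversed.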
Once a dataset has been passed through the DIP and signed off, it is available to JNCC marine teams and external users.
Outputs
As well as ensuring the accuracy of our data, we are committed to making them available for others to use through centralised platforms and in standardised formats.
The final marine survey datasets can be accessed via the Marine Recorder benthic database. This database contains records of JNCC partnership surveys and surveys from the country agencies across the UK. It can be viewed through the Marine Recorder application or downloaded as an Access database which provides a regularly updated ‘snapshot’ of the data. Marine Recorder is fully compatible with the National Biodiversity Network (NBN) data model, enabling data to be contributed to the NBN Atlas.
The final stage of survey ingestion is the inclusion of the dataset in the archives of the Marine Environmental Data and Information Network (MEDIN), which provides secure long-term storage of, and access to, marine data. This ensures the data are appropriately archived and makes them more accessible to end-users.