About

The PacMAN project

PacMAN is a 3-year (2020-2023) project funded by the Flanders Government, through the Flanders UNESCO Trust Fund (FUST) for the support of UNESCO's activities in the field of Science, and co-funded by the Richard Lounsbery Foundation. The project aims to develop an invasive species monitoring system as well as an early-warning decision-support tool for Pacific Island States.

Data flow

The PacMAN decision support system is fed by two primary data sources: species occurrence data from the Ocean Biodiversity Information System (OBIS), and species distribution summaries produced with the speedy Python package. The risk analysis algorithm uses the species distribution summary to assign a risk score to each potential detection.

Occurrence data can be submitted to OBIS in many ways, but in the case of the PacMAN sampling campaigns, metadata is managed on the PlutoF platform, and the PacMAN bioinformatics pipeline is used to process raw sequence data into species observations for OBIS.

The species distribution summaries include known distributions from OBIS, GBIF, and WoRMS. WoRMS distribution records may also contain information on whether a species is native or alien to a region, and if any impact has been recorded. Thermal ranges for species are modelled using environmental layers from Bio-ORACLE.

flowchart TB; gbif_snapshot[GBIF snapshot] --> speciesgrids(speciesgrids); obis_snapshot[OBIS snapshot] --> speciesgrids; speciesgrids --> speedy(speedy); worms_distributions[WoRMS distributions] --> speedy; Bio-ORACLE --> speedy; speedy --> pacmandetections(pacmandetections); obis_api[OBIS API] ==> pacmandetections; pacmandetections ==> pacman_portal(PacMAN decision support); pacman_campaigns[PacMAN campaigns] --> plutof[PlutoF]; pacman_campaigns[PacMAN campaigns] --> bioinformatics[PacMAN bioinformatics pipeline]; plutof --> OBIS; bioinformatics --> OBIS; OBIS ==> obis_api

Risk analysis

The PacMAN decision support system calculates a risk score for every potential detection of a species listed in the World Register of Introduced Marine Species (WRiMS). Risk scores are based on whether the species is recorded as native or introduced in the area and, in the absence of such information, on whether the detection occurred within the known thermal range of the species. If the species is known to have an impact in parts of its introduced range, this is also taken into account.

flowchart LR; priority{On priority list?} --> |Yes| high0([High]); priority --> |No| native{Native?}; native --> |Yes| none([None]); native --> |No| alien{Alien?}; alien --> |Yes| invasive{Invasive or concern?}; invasive --> |Yes| high1([High]); invasive --> |No| impact1{Any impact?}; impact1 --> |Yes| high2([High]); impact1 --> |No| medium1([Medium]); alien --> |No| thermal{In thermal range?}; thermal --> |No| low1([Low]); thermal --> |Yes| impact2{Any impact?}; impact2 --> |Yes| medium2([Medium]); impact2 --> |No| low2([Low]); style low1 fill:#abc493b3; style low2 fill:#abc493b3; style medium1 fill:#eda745b3; style medium2 fill:#eda745b3; style high0 fill:#f5425db3; style high1 fill:#f5425db3; style high2 fill:#f5425db3;
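The decision tree in the flowchart can be read as a short chain of conditionals. The following is a minimal sketch of that logic, not the project's actual implementation: the `SpeciesSummary` class, its field names, and the `risk_level` function are hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SpeciesSummary:
    """Hypothetical container for the facts the flowchart branches on."""
    on_priority_list: bool = False
    native: bool = False
    alien: bool = False
    invasive_or_concern: bool = False
    any_impact: bool = False
    in_thermal_range: bool = False


def risk_level(s: SpeciesSummary) -> str:
    """Walk the flowchart from left to right and return the risk label."""
    if s.on_priority_list:
        return "High"
    if s.native:
        return "None"
    if s.alien:
        # Recorded as alien: invasive status or any recorded impact means High.
        if s.invasive_or_concern or s.any_impact:
            return "High"
        return "Medium"
    # Neither native nor alien is recorded: fall back to the thermal range.
    if not s.in_thermal_range:
        return "Low"
    return "Medium" if s.any_impact else "Low"


print(risk_level(SpeciesSummary(alien=True, any_impact=True)))
print(risk_level(SpeciesSummary(in_thermal_range=True)))
```

Note that the priority list check comes first, so a species on the list is rated High regardless of its native or alien status.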

Identification confidence

This system integrates species occurrence data from a variety of sources. In the case of visual observations, the system assigns a high confidence level to the detection. In the case of eDNA data, the confidence level is based on the molecular marker used and on the sequence similarity to the corresponding hit in the reference database. If a sequence-based detection has multiple plausible identifications, this also results in a lower confidence level.

flowchart LR; visualdna{Visual or eDNA?} --> |Visual| high1([High]); visualdna{Visual or eDNA?} --> |eDNA| edna; edna --> |No| medium1([Medium]); edna{< 10 reads or > 2 names or identity < 0.99?} --> |Yes| low1([Low]); style high1 fill:#abc493b3; style medium1 fill:#eda745b3; style low1 fill:#f5425db3;
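The confidence rules in the flowchart can be sketched in a few lines. This is an illustrative reading of the diagram only: the function name `confidence_level` and its parameters are assumptions, and the thresholds (10 reads, 2 names, 0.99 identity) are taken directly from the flowchart.

```python
def confidence_level(method: str, reads: int = 0, names: int = 1,
                     identity: float = 0.0) -> str:
    """Return the confidence label for a detection.

    method   -- "visual" or "edna" (hypothetical labels)
    reads    -- number of sequence reads supporting the detection
    names    -- number of plausible taxon names for the sequence
    identity -- similarity to the reference database hit, from 0 to 1
    """
    if method == "visual":
        return "High"
    if method != "edna":
        raise ValueError(f"unknown detection method: {method!r}")
    # Any single weak signal is enough to lower the confidence.
    if reads < 10 or names > 2 or identity < 0.99:
        return "Low"
    return "Medium"


print(confidence_level("visual"))
print(confidence_level("edna", reads=250, names=1, identity=0.995))
print(confidence_level("edna", reads=250, names=3, identity=0.995))
```

In this reading, an eDNA detection never reaches High: it is Medium at best, and drops to Low as soon as one of the three conditions holds.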

Data submission

Data can be submitted to the decision support system by publishing to OBIS; see the OBIS manual for more information. To allow the system to take into account multiple possible identifications and their corresponding identity scores, records need to encode this information in a structured way in the identificationRemarks field. Future iterations of the Darwin Core standard and the DNADerivedData extension to Darwin Core should address this issue.
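As a rough illustration of what "structured" could mean here, the sketch below serializes a list of candidate identifications to JSON and reads it back. The exact structure expected by the system is not specified in this document, so the key names (`identifications`, `scientificName`, `identity`) and both helper functions are purely hypothetical.

```python
import json


def encode_identifications(candidates):
    """Serialize candidate identifications to a JSON string.

    The layout is a hypothetical example, not the format required by
    the decision support system.
    """
    return json.dumps({"identifications": candidates}, sort_keys=True)


def decode_identifications(remarks):
    """Parse the JSON string back into a list of candidate dicts."""
    return json.loads(remarks)["identifications"]


candidates = [
    {"scientificName": "Didemnum perlucidum", "identity": 0.997},
    {"scientificName": "Didemnum vexillum", "identity": 0.991},
]

# This string would go into the identificationRemarks field of a record.
remarks = encode_identifications(candidates)
print(remarks)
print(len(decode_identifications(remarks)))
```

Keeping the candidates in a machine-readable string lets a consumer count the plausible names and read the identity scores without parsing free text.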