In collaboration with the Knight Cancer Institute at the Oregon Health and Science University, we are using Wings workflows to annotate patient sequence variants obtained through clinical DNA sequencing. The diagram below shows a workflow that annotates identified genomic variants with: 1) the potential protein consequence of the sequence variant, computed with Bioconductor, 2) previously known mutations found in COSMIC, 3) known sequence variants curated within Ensembl, and 4) a manually curated in-house database of variants from previous clinical samples. These biological data sources are continually updated with new information and with corrections that could affect a patient’s annotations. It is therefore vital that the version of each data source is captured in the provenance records for the annotations of each patient’s data. Reproducibility is also important to ensure consistency across patient annotations.
Semantic workflows are crucial for this application:
The diagram below shows the high-level workflow, followed by the executable workflow, which indicates the versions of all data sources and code that the system tracks automatically.
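To make the idea of version tracking concrete, here is a minimal Python sketch of a provenance record that stores the version of each data source and code component used in one annotation run. It is an illustration only and does not reflect how Wings represents provenance internally. The class names, the `fingerprint` helper, the sample identifiers, and the version strings are all hypothetical placeholders.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List
import hashlib
import json


@dataclass(frozen=True)
class SourceVersion:
    """One data source or code component and the version used in a run."""
    name: str
    version: str


@dataclass
class AnnotationProvenance:
    """Hypothetical provenance record for one patient's annotation run."""
    sample_id: str
    sources: List[SourceVersion]
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def fingerprint(self) -> str:
        """Hash the sorted (name, version) pairs.

        The sample identifier and timestamp are deliberately excluded, so
        two runs share a fingerprint exactly when they used the same
        versions of the same sources.
        """
        pairs = sorted((s.name, s.version) for s in self.sources)
        payload = json.dumps(pairs).encode("utf-8")
        return hashlib.sha256(payload).hexdigest()


def same_configuration(a: AnnotationProvenance, b: AnnotationProvenance) -> bool:
    """True if both runs used identical source and code versions."""
    return a.fingerprint() == b.fingerprint()


# Placeholder version strings, not real release numbers.
run_a = AnnotationProvenance(
    sample_id="sample-001",
    sources=[
        SourceVersion("COSMIC", "release-A"),
        SourceVersion("Ensembl", "release-B"),
        SourceVersion("Bioconductor", "release-C"),
        SourceVersion("in-house-variants", "snapshot-1"),
    ],
)
run_b = AnnotationProvenance(
    sample_id="sample-002",
    sources=[
        SourceVersion("Ensembl", "release-B"),
        SourceVersion("COSMIC", "release-A"),
        SourceVersion("in-house-variants", "snapshot-1"),
        SourceVersion("Bioconductor", "release-C"),
    ],
)
run_c = AnnotationProvenance(
    sample_id="sample-003",
    sources=[
        SourceVersion("COSMIC", "release-A2"),  # a newer COSMIC release
        SourceVersion("Ensembl", "release-B"),
        SourceVersion("Bioconductor", "release-C"),
        SourceVersion("in-house-variants", "snapshot-1"),
    ],
)

print(same_configuration(run_a, run_b))
print(same_configuration(run_a, run_c))
```

Comparing fingerprints is one simple way to check that two patients were annotated against the same versions of every source, or to flag earlier annotations for re-running once a source has been updated.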
For more details, see: