BIOINFORMATICS AND COMPUTATIONAL BIOLOGY

VIBRANT (Virus Identification By iteRative ANoTation) is an automated software tool for recovering and annotating bacterial and archaeal viruses, determining genome quality and completeness, and identifying metabolic genes. Because it highlights viral auxiliary metabolic genes (AMGs) and metabolic pathways, the software also serves as a platform for evaluating viral community function. VIBRANT combines neural network machine learning with protein similarity searches (against the KEGG, Pfam and VOG protein databases) to maximize identification of lytic viral genomes and integrated proviruses, including highly diverse viruses. It achieves high accuracy and recovery by using a newly described “v-score” metric that quantifies the virus association of each protein annotation. VIBRANT was designed for complex metagenomic samples but can also identify viruses from cultivated or simple systems.

 

VIBRANT was designed to be fast, accurate and user-friendly. At minimum, the only required input is a single file containing unknown sequences (genomes, MAGs or scaffolds). The outputs include the identified viruses in FASTA and GenBank formats, along with the following: simple visualizations of AMG pathways and viral metrics (number, quality and sizes), spreadsheet files of AMG details (names, counts and pathways), protein annotation information (full annotations per database and the best-hit annotation per protein), identified circular viruses, and a summary spreadsheet of all identification metrics per virus.

Check out VIBRANT on GitHub

Read our manuscript at Microbiome

METABOLIC (METabolic And BiogeOchemistry anaLyses In miCrobes) is a scalable software tool for studying the metabolic traits and biogeochemical functional profiles of a microbiome or microbial community based on microbial genomes. METABOLIC can help integrate genome-informed metabolism into metabolic and biogeochemical models.

METABOLIC annotates genomes and organizes metabolic characterization at the scale of individual genomes and the entire microbial community. Additional analyses can be conducted to study genome abundance, sequential metabolic transformations, metabolic energy flow patterns, and metabolic interactions and networks at community scales. User-friendly results are provided in the form of curated tables and diagrams. Finally, METABOLIC can enable visualization of microbial contributions to biogeochemical cycles.

Check out METABOLIC on GitHub

Read our manuscript at bioRxiv

PropagAtE (Prophage Activity Estimator) uses the genomic coordinates of integrated prophage sequences and short sequencing reads to estimate whether a given prophage was in the lysogenic (dormant) or lytic (active) stage of infection. Knowing the infection stage of a prophage is essential for drawing accurate conclusions about its role in affecting its host and the microbial community. Prophages are designated according to a genome/scaffold coordinate file, either generated manually by the user or taken directly from VIBRANT output (v1.2.1 or later). After read coverage processing (trimming scaffold ends, filtering aligned gaps/mismatches, removing outlier coverage values), the prophage:host read coverage ratio and the corresponding effect size are used to estimate whether the prophage was actively replicating its genome (significantly more prophage genome copies than host copies). PropagAtE accepts complete genomes or metagenomic scaffolds, together with either raw Illumina (short) reads or pre-aligned data files (SAM or BAM format). Threshold values are customizable, and PropagAtE outputs clear “active” versus “dormant” estimations of the given prophages with associated statistics.
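The coverage comparison described above can be illustrated with a minimal Python sketch. This is not PropagAtE's code: the function names, the use of Cohen's d as the effect-size measure, and the default thresholds (`min_ratio`, `min_effect`) are assumptions made for illustration, and the gap/mismatch filtering and outlier removal steps are omitted. It assumes per-base read coverage has already been computed for one scaffold and that each region has at least two positions.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d with a pooled variance (assumed effect-size measure)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    if pooled_var == 0:
        # No spread in either region: the effect is either absent or unbounded.
        return 0.0 if mean(a) == mean(b) else float("inf")
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def estimate_activity(scaffold_cov, start, end, trim=0,
                      min_ratio=2.0, min_effect=0.7):
    """Label one prophage as 'active' or 'dormant' from per-base coverage.

    scaffold_cov: list of per-base read depths for the whole scaffold.
    start, end:   0-based, end-exclusive prophage coordinates.
    trim:         positions dropped from each scaffold end (hypothetical default).
    min_ratio, min_effect: illustrative thresholds, not PropagAtE's defaults.
    """
    prophage = scaffold_cov[start:end]
    # Host region = everything outside the prophage, minus the trimmed ends.
    host = scaffold_cov[trim:start] + scaffold_cov[end:len(scaffold_cov) - trim]

    host_mean = mean(host)
    if host_mean == 0:
        ratio = float("inf") if mean(prophage) > 0 else 0.0
    else:
        ratio = mean(prophage) / host_mean
    effect = cohens_d(prophage, host)

    # Both criteria must pass for the prophage to be called active.
    status = "active" if ratio >= min_ratio and effect >= min_effect else "dormant"
    return {"ratio": ratio, "effect_size": effect, "status": status}
```

For example, a scaffold whose prophage region has roughly five times the host coverage would be labeled "active" under these assumed thresholds, while a prophage covered at the same depth as its host would be labeled "dormant".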

Check out PropagAtE on GitHub

Read our manuscript at bioRxiv