User Guide

HAVoC command line options

Instructions

HAVoC can be run to analyze the FASTQ files in a directory by typing the following in a terminal:

sh HAVoC.sh [FASTQ directory]

The target directory must contain matching FASTQ files for forward (R1) and reverse (R2) reads, which can be either gzipped (*.fastq.gz) or uncompressed (*.fastq).
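Before a run, it can help to confirm that every forward file has a reverse mate. The following is a minimal sketch, not part of HAVoC: check_pairs is a hypothetical helper, and it assumes Illumina-style file names in which "_R1" marks forward reads and "_R2" marks reverse reads, as in the example files of this guide.

```sh
#!/bin/sh
# Hypothetical helper (not part of HAVoC): list R1 files that lack an R2 mate.
# Assumption: forward and reverse file names differ only in "_R1" vs "_R2".
check_pairs() {
    dir=$1
    missing=0
    for r1 in "$dir"/*_R1*.fastq "$dir"/*_R1*.fastq.gz; do
        [ -e "$r1" ] || continue        # skip a pattern that matched nothing
        name=${r1##*/}                  # file name without the directory
        prefix=${name%%_R1*}            # text before the first "_R1"
        suffix=${name#*_R1}             # text after the first "_R1"
        if [ ! -e "$dir/${prefix}_R2${suffix}" ]; then
            echo "missing mate for $name"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]                # succeed only if every R1 has an R2
}
```

With the function defined in the current shell, `check_pairs Example_FASTQs/` prints one line per unpaired forward file and returns a non-zero status if any are found.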

HAVoC needs to be in the same directory as NexteraPE-PE.fa and ref.fa.
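This requirement can be checked up front. The sketch below is illustrative only: check_aux_files is a hypothetical helper, not something HAVoC provides, and it takes the directory holding the HAVoC script as its argument (defaulting to the current directory).

```sh
#!/bin/sh
# Hypothetical helper (not part of HAVoC): verify that the adapter file and
# the reference genome are present in the directory that holds the script.
check_aux_files() {
    script_dir=${1:-.}                  # default to the current directory
    status=0
    for f in NexteraPE-PE.fa ref.fa; do
        if [ ! -f "$script_dir/$f" ]; then
            echo "not found: $script_dir/$f"
            status=1
        fi
    done
    return "$status"
}
```

Calling `check_aux_files .` from the directory that holds the script prints nothing and returns zero when both files are present.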

The following options can be manually changed in the script file depending on your preferences:

[Image: havoc_options]

Example 1:

The repository (https://bitbucket.org/auto_cov_pipeline/havoc) contains all files needed to run and test HAVoC. Along with HAVoC.sh, the adapter file (NexteraPE-PE.fa), and the reference genome (ref.fa), there is a folder named "Example_FASTQs" containing two sets of files representing SARS-CoV-2 variants of concern:

  • S-Africa-variant-1_S123_L001_R1_001.fastq.gz
  • S-Africa-variant-1_S123_L001_R2_001.fastq.gz
  • UK-variant-1_S12_L001_R1_001.fastq.gz
  • UK-variant-1_S12_L001_R2_001.fastq.gz

These may be downloaded via the following command:

git clone https://bitbucket.org/auto_cov_pipeline/havoc.git

Note that downloading these may take time depending on your internet speed, as the FASTQ files are relatively large (200–400 MB).

Before starting HAVoC, make sure that all required bioinformatics tools are installed on your computer or server (refer to the Installation page). Then run the HAVoC pipeline with the following command, adjusting the paths to match where the files are located:

sh HAVoC.sh Example_FASTQs/

HAVoC should finish within a couple of minutes, after which the Example_FASTQs directory should contain two new folders:

  • S-Africa-variant-1
  • UK-variant-1

Each result folder should contain the consensus sequence (*_consensus.fa) and the lineage file (*_pangolin_lineage.csv), along with the original FASTQ files, several BAM files (produced while processing the reads), VCF files (from variant calling), a BED file (used for masking low-coverage regions), and fastp.html and fastp.json (reports of the fastp results).
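A quick way to confirm that a run produced its two main outputs is sketched below. check_results is a hypothetical helper, not part of HAVoC. It relies only on the *_consensus.fa and *_pangolin_lineage.csv suffixes given above and makes no assumption about the rest of the file names.

```sh
#!/bin/sh
# Hypothetical helper (not part of HAVoC): report whether a result folder
# holds a consensus sequence and a lineage file.
check_results() {
    folder=$1
    status=0
    for pattern in '*_consensus.fa' '*_pangolin_lineage.csv'; do
        # $pattern is left unquoted so the shell expands it inside the folder;
        # ls fails when the pattern matches no file.
        if ls "$folder"/$pattern >/dev/null 2>&1; then
            echo "found $pattern"
        else
            echo "missing $pattern"
            status=1
        fi
    done
    return "$status"
}
```

For the example data, this could be called as `check_results Example_FASTQs/UK-variant-1` once the pipeline has finished.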