What is ClusTRace?
ClusTRace is a bioinformatic pipeline for high-level handling and phylogenetic/cluster analysis of virus sequences.
ClusTRace was created as an aid for COVID-19 transmission chain tracing at the University of Helsinki, Finland. Its main features include:
- assigning lineages with Pangolin
- collecting sequences to multi-fasta files according to lineage
- filtering outlier sequences
- creating multiple sequence alignments (MSA)
- creating phylogenetic trees from MSAs
- cluster analysis
  - extracting sequence clusters from the obtained phylogenetic trees
  - extracting clusters at different mutation rates and with different methods supported in TreeCluster
  - visualizing clusters with different colors/labels in phylogenetic trees
  - summarizing clusters in spreadsheets that report cluster size, sequence composition, growth rate and support information
- variant calling
  - calling nucleotide variants for lineage and/or cluster MSA(s)
  - calling amino acid variants for lineage and/or cluster VCF(s)
  - summarizing nucleotide and amino acid variants in spreadsheets
  - visualizing lineage amino acid variants with interactive lollipop graphs (g3viz)
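
To illustrate the "collecting sequences to multi-fasta files according to lineage" step listed above, here is a minimal Python sketch. It is not ClusTRace's own code. It assumes the lineage assignments come as a CSV with `taxon` and `lineage` columns (as in Pangolin's `lineage_report.csv`) and that FASTA headers match the `taxon` values exactly. The function names and the `unassigned` bin are hypothetical and used only for illustration.

```python
import csv
import io
from collections import defaultdict


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def group_by_lineage(fasta_handle, report_handle):
    """Return {lineage: [(header, sequence), ...]}.

    Assumption: the report is a CSV with 'taxon' and 'lineage' columns.
    """
    lineage_of = {
        row["taxon"]: row["lineage"] for row in csv.DictReader(report_handle)
    }
    groups = defaultdict(list)
    for header, seq in read_fasta(fasta_handle):
        # Sequences missing from the report go to a hypothetical "unassigned" bin.
        groups[lineage_of.get(header, "unassigned")].append((header, seq))
    return dict(groups)


def to_multifasta(records):
    """Serialize (header, sequence) pairs as multi-FASTA text."""
    return "".join(f">{header}\n{seq}\n" for header, seq in records)


# Toy in-memory data; real input would be files on disk.
fasta = io.StringIO(">s1\nACGT\nAC\n>s2\nACGA\n>s3\nTTTT\n")
report = io.StringIO("taxon,lineage\ns1,B.1.1.7\ns2,B.1.1.7\ns3,AY.4\n")

groups = group_by_lineage(fasta, report)
for lineage, records in groups.items():
    print(lineage, len(records))
```

In practice, each group would be written to its own file (for example one multi-fasta per lineage name) before alignment and tree building.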
Plyusnin, I., Truong Nguyen, P.T., Sironen, T. et al. ClusTRace, a bioinformatic pipeline for analyzing clusters in virus phylogenies. BMC Bioinformatics 23, 196 (2022). https://doi.org/10.1186/s12859-022-04709-8
Publication pdf is available here.