Adriano de Bernardi Schneider (1) & Denis Jacob Machado (2)
(1) Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, USA;
(2) Laboratório de Anfíbios, Universidade de São Paulo, São Paulo, SP, Brazil.

Working phylogenetic hypothesis. Most relevant clades have sensitivity plots showing their prevalence among different scenarios.

Pipeline efficiency and gene prevalence after annotations.

Normalized matching split distances among selected cladograms. (*) Bayesian inference was performed only for complete annotated datasets using translation-based alignment.

Without gene annotation, aligning the entire polyprotein may produce spurious alignments in which nucleotides from different genes are placed in the same character (alignment column).
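The problem described above can be made concrete with a small sketch. Assuming gene boundaries are known for each sequence, one could flag the columns of a whole-polyprotein alignment that combine nucleotides from different genes. The function names, the gene labels (geneA, geneB), and the toy sequences are hypothetical and serve only as illustration; this is not part of the pipeline itself.

```python
def gene_at(position, genes):
    """Return the gene covering an ungapped, 0-based position, or None."""
    for gene, start, end in genes:  # half-open interval [start, end)
        if start <= position < end:
            return gene
    return None


def mixed_columns(aligned, boundaries):
    """List alignment columns whose nucleotides come from more than one gene.

    aligned:    dict of sequence name -> gapped sequence (all of equal length)
    boundaries: dict of sequence name -> list of (gene, start, end) tuples
                in ungapped coordinates
    """
    n_cols = len(next(iter(aligned.values())))
    genes_per_column = [set() for _ in range(n_cols)]
    for name, seq in aligned.items():
        pos = 0  # position in the ungapped sequence
        for col, char in enumerate(seq):
            if char == "-":
                continue
            genes_per_column[col].add(gene_at(pos, boundaries[name]))
            pos += 1
    return [col for col, genes in enumerate(genes_per_column) if len(genes) > 1]


# Toy data (hypothetical): geneA is 6 nt long in s1 but only 3 nt long in s2.
aligned = {
    "s1": "ATGAAACCC",
    "s2": "ATGCCC---",
}
boundaries = {
    "s1": [("geneA", 0, 6), ("geneB", 6, 9)],
    "s2": [("geneA", 0, 3), ("geneB", 3, 6)],
}
flagged = mixed_columns(aligned, boundaries)
```

In this toy case the flagged columns are those in which geneA of s1 is aligned against geneB of s2, which is the kind of character that per-gene annotation is meant to prevent.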

We present a new annotation pipeline for Flaviviridae genomes that includes: (i) prediction of putative protein-coding sequences with GeneWise using reference protein sequences from UniProt and NCBI; (ii) validation of the best matches using TransDecoder, BLASTP, and hmmscan; (iii) pooling of orthologous loci for translation alignment using PAM250; and (iv) removal of outliers with Tukey's method. We joined genes and aligned them with different algorithms (ClustalW, MAFFT, MUSCLE, and Geneious translation-based alignment).
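As an illustration of the outlier removal in step (iv), the following is a minimal sketch of Tukey's fences. It assumes that the filtered quantity is the length of each sequence within a pool of orthologous loci and that the conventional multiplier k = 1.5 is used; the text specifies neither, and the function names and example values are hypothetical.

```python
from statistics import quantiles


def tukey_fences(values, k=1.5):
    """Return the (lower, upper) Tukey fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(lengths, k=1.5):
    """Keep only the entries whose value lies within the Tukey fences.

    lengths: dict of sequence name -> sequence length (assumed metric)
    """
    low, high = tukey_fences(list(lengths.values()), k)
    return {name: n for name, n in lengths.items() if low <= n <= high}


# Hypothetical lengths for one pooled locus; "f" is far longer than the rest.
locus_lengths = {"a": 100, "b": 102, "c": 98, "d": 101, "e": 99, "f": 300}
kept = remove_outliers(locus_lengths)
```

With these values, the sequence labelled "f" falls above the upper fence and is dropped, while the other five are retained. A different metric (for example, pairwise distance to the reference) could be filtered the same way by passing other values.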