# Introduction

Avocado is a variant caller built on Apache Spark that enables rapid variant calling in cluster and cloud computing environments. It is built on ADAM’s APIs and achieves variant calling accuracy similar to that of state-of-the-art tools, while reducing variant calling latency to approximately 15 minutes on a 1,024-core cluster.

# Workflows Supported

Running the avocado-submit script without arguments lists the supported commands:

    ./bin/avocado-submit

    Using SPARK_SUBMIT=/usr/local/bin/spark-2.2.1-bin-hadoop2.7/bin/spark-submit

    Choose one of the following commands:

    biallelicGenotyper : Call variants under a biallelic model
              discover : Discover variants in reads
               jointer : Joint call and annotate variants.
       mergeDiscovered : Merge variants discovered from reads of multiple samples
            reassemble : Reassemble reads to canonicalize variants
          trioGenotyper : Call variants in a trio under a biallelic model


The avocado-submit script follows the same conventions as the adam-submit command line, which is described in the ADAM documentation. As a result, just like ADAM, Avocado can be deployed on a local machine, on AWS, on an in-house cluster running YARN or SLURM, or using Toil.
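
As a rough illustration of that convention, the sketch below assembles an avocado-submit command line and prints it instead of running it. It assumes the adam-submit layout: Spark options first, then a `--` separator, then the Avocado command and its arguments. The `--master` and `--executor-memory` flags are standard spark-submit options; the file names `reads.adam` and `genotypes.adam` and the input-then-output argument order are hypothetical placeholders, not taken from the Avocado documentation.

```shell
#!/bin/sh
# Dry-run sketch: build an avocado-submit invocation and print it.
# Assumption: Spark options come before "--", Avocado arguments after it,
# mirroring the adam-submit convention.

AVOCADO_SUBMIT="./bin/avocado-submit"              # script path as shown in the listing above
SPARK_ARGS="--master yarn --executor-memory 8g"    # standard spark-submit options
COMMAND="biallelicGenotyper"                       # one of the commands listed above
COMMAND_ARGS="reads.adam genotypes.adam"           # hypothetical input and output paths

CMD="$AVOCADO_SUBMIT $SPARK_ARGS -- $COMMAND $COMMAND_ARGS"

# Print the assembled command rather than executing it.
printf '%s\n' "$CMD"
```

Printing the command first makes it easy to check the Spark options against the target environment (local machine, YARN, and so on) before submitting a job. Consult each command's own usage output for its actual arguments.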