Building Avocado from Source
You will need to have Apache Maven version 3.1.1 or later installed in order to build Avocado.
Note: The default configuration is for Hadoop 2.7.3. If building against a different version of Hadoop, please pass -Dhadoop.version=<HADOOP_VERSION> to the Maven command.
git clone https://github.com/bigdatagenomics/avocado.git
cd avocado
export MAVEN_OPTS="-Xmx512m -XX:MaxPermSize=128m"
mvn clean package -DskipTests
On success, the build output ends with:
...
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 9.647s
[INFO] Finished at: Thu May 23 15:50:42 PDT 2013
[INFO] Final Memory: 19M/81M
[INFO] ------------------------------------------------------------------------
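If your cluster runs a Hadoop release other than the default, the version override from the note above can be appended to the same Maven invocation. The following is a minimal sketch, assuming Hadoop 2.6.0 purely as an illustrative target; the HADOOP_VERSION and MVN_ARGS shell variables are names chosen here for the example and are not part of Avocado.

```shell
# Hadoop version to build against; 2.6.0 is only an example value.
HADOOP_VERSION="2.6.0"

# Same Maven goals and flags as the build above, plus the version override.
MVN_ARGS="clean package -DskipTests -Dhadoop.version=${HADOOP_VERSION}"

# Print the full command for review. To run the build, execute
#   mvn ${MVN_ARGS}
# from inside the avocado checkout.
echo "mvn ${MVN_ARGS}"
```

Printing the command first makes it easy to confirm the version string before starting a build.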
You might want to take a peek at the scripts/jenkins-test script and give it a run. We use this script to test that Avocado is working correctly.
Running Avocado
Avocado is packaged as an überjar and includes all necessary dependencies, except for Apache Hadoop and Apache Spark.
You might want to add the following to your .bashrc to make running Avocado easier:
alias avocado-submit="${AVOCADO_HOME}/bin/avocado-submit"
$AVOCADO_HOME should be the path to where you have checked Avocado out on your local filesystem. The alias calls a script that wraps the spark-submit command to set up Avocado. You will need to have the Spark binaries on your system; prebuilt binaries can be downloaded from the Spark website. Our continuous integration setup builds Avocado against Spark 2.0.0, Scala versions 2.10 and 2.11, and Hadoop versions 2.3.0 and 2.6.0.
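Putting these pieces together, a .bashrc fragment might look like the sketch below. It assumes the repository was cloned to ${HOME}/avocado, which is only an example path. The check for spark-submit on the PATH is likewise an assumption about how Spark is installed locally (the wrapper script may locate Spark differently), and SPARK_SUBMIT_FOUND is a variable name invented for this example.

```shell
# Assumed checkout location; change this to wherever you cloned Avocado.
export AVOCADO_HOME="${HOME}/avocado"

# The alias described above, pointing at the wrapper script in the checkout.
alias avocado-submit="${AVOCADO_HOME}/bin/avocado-submit"

# avocado-submit wraps spark-submit, so note whether Spark is on the PATH.
if command -v spark-submit >/dev/null 2>&1; then
  SPARK_SUBMIT_FOUND="yes"
else
  SPARK_SUBMIT_FOUND="no"
  echo "warning: spark-submit was not found on PATH" >&2
fi
```

Open a new shell, or source your .bashrc, for the alias to take effect.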
Once this alias is in place, you can run Avocado by typing avocado-submit at the command line:
avocado-submit