Officially released versions of the Apache Spark libraries are published to the central maven repository http://search.maven.org/, so you can always add them as dependencies to your project and maven will download them. See how to create a maven project that uses the Spark libraries at avulanov.blogspot.com/2014/07/how-to-create-scala-project-for-apache.html. However, I want to use the latest build of Spark in my maven project, and moreover, my own custom build of Spark. There are at least two ways to do this. The first is to build Apache Spark with the `install` target (for the whole project or for a particular Spark module):
mvn -Dhadoop.version=1.2.1 -DskipTests clean install
The second option is to install the libraries you need into your local maven repository manually:
- Compile your local version of Apache Spark with
mvn -Dhadoop.version=1.2.1 -DskipTests clean package
- Add the newly built libraries to your local maven repository. Note that you need to specify both the jar and the pom.xml file; the latter will later pull in the dependencies for you. You can skip groupId, version and artifactId if you use maven-install-plugin v2.5 or later. See how to force maven to use v2.5 at http://stackoverflow.com/questions/25155639/how-do-i-force-maven-to-use-maven-install-plugin-version-2-5. The command below shows how to add spark-core:
mvn install:install-file -Dfile=/spark/core/target/spark-core_2.10-1.1.0-latest.jar -DpomFile=/spark/core/pom.xml -DgroupId=org.apache.spark -Dversion=1.1.0-latest -DartifactId=spark-core_2.10
- Reference the new version of this library (1.1.0-latest) in your pom.xml, as shown below.
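A minimal sketch of the corresponding dependency entry in your pom.xml, assuming the spark-core artifact installed with the command above (adjust the artifactId and version to whatever you installed):
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>1.1.0-latest</version>
</dependency>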
- There might be a problem with imports and their versions, so try to run mvn install (I run it in Idea IDE). In my case maven complained that the asm and lz4 dependencies didn't have versions specified. Specify them if needed.
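For example, here is a sketch of how such a version could be pinned in your pom.xml (the lz4 version below is illustrative; use whatever version your Spark build expects):
<dependency>
  <groupId>net.jpountz.lz4</groupId>
  <artifactId>lz4</artifactId>
  <version>1.2.0</version>
</dependency>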