Thursday, July 3, 2014

How to create a Scala project for Apache Spark

- Install Maven
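- You can verify the installation with

mvn -version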
- Generate a Maven project with
mvn archetype:generate -B \
  -DarchetypeGroupId=net.alchim31.maven \
  -DarchetypeArtifactId=scala-archetype-simple \
  -DarchetypeVersion=1.5 \
  -DgroupId=org.apache.spark \
  -DartifactId=spark-myNewProject \
  -Dversion=0.1-SNAPSHOT \
  -Dpackage=org.apache.spark
- or interactively (Maven will prompt you for the archetype and project coordinates) with
mvn archetype:generate
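- Either way, Maven generates a project directory named after the artifactId; to check that it builds (assuming the coordinates from the first command):

cd spark-myNewProject
mvn package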
- Add a new dependency to your pom.xml (check the latest version at http://search.maven.org/)
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_2.10</artifactId>
  <version>1.0.2</version>
</dependency>
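- With that dependency in place, here is a minimal sketch of a Spark program (the object name SparkSmokeTest and the local[2] master are placeholders of mine, not from this post):

import org.apache.spark.{SparkConf, SparkContext}

// Count the even numbers in a locally parallelized range
object SparkSmokeTest {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("SparkSmokeTest").setMaster("local[2]")
    val sc = new SparkContext(conf)
    val evens = sc.parallelize(1 to 100).filter(_ % 2 == 0).count()
    println(s"Even numbers: $evens")
    sc.stop()
  }
}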
- Install IntelliJ IDEA and its Scala plugin (alternatively, Eclipse with the Scala plugin, though that didn't work for me)
- Open the pom.xml file in IntelliJ IDEA
- Important:
- Be careful when adding libraries to your project: Spark already ships some of the same libraries in different versions, and a version clash will fail with java.lang.IncompatibleClassChangeError: Implementing class. One way around this is a Maven exclusion, as sketched after this list.
- If you want to create a SparkContext and perform RDD operations on Windows, there is a known bug involving the winutils binary from Hadoop. You need to obtain winutils.exe and set HADOOP_HOME to the directory containing it; see the second sketch below.
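- For the version clash, one sketch of a fix is excluding the conflicting transitive artifact from your dependency in pom.xml (the groupIds and artifactIds below are placeholders):

<dependency>
  <groupId>some.group</groupId>
  <artifactId>some-library</artifactId>
  <version>1.0</version>
  <exclusions>
    <exclusion>
      <groupId>conflicting.group</groupId>
      <artifactId>conflicting-artifact</artifactId>
    </exclusion>
  </exclusions>
</dependency>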
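- For the Windows issue, here is a sketch assuming winutils.exe has been placed in C:\hadoop\bin (the path is my own example):

// Point Hadoop at the directory containing bin\winutils.exe before
// creating the SparkContext; setting the HADOOP_HOME environment
// variable to C:\hadoop works as well.
System.setProperty("hadoop.home.dir", "C:\\hadoop")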