Thursday, July 3, 2014

How to create a Scala project for Apache Spark

  • Install Maven
  • Generate a Maven project non-interactively with
  • mvn archetype:generate -B \
      -DarchetypeGroupId=net.alchim31.maven -DarchetypeArtifactId=scala-archetype-simple -DarchetypeVersion=1.5 \
      -DgroupId=org.apache.spark -DartifactId=spark-myNewProject -Dversion=0.1-SNAPSHOT -Dpackage=org.apache.spark
  • or interactively (Maven will prompt you for the archetype and project coordinates) with
  • mvn archetype:generate
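  • Add the Spark core dependency to your pom.xml (check the latest version at http://search.maven.org/). The _2.10 suffix of the artifactId is the Scala binary version and must match the Scala version of your project.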
  •       <dependency>
              <groupId>org.apache.spark</groupId>
              <artifactId>spark-core_2.10</artifactId>
              <version>1.0.2</version>
          </dependency>
    
  • Install IntelliJ IDEA and its Scala plugin (alternatively, Eclipse with the Scala plugin, which didn't work for me)
  • Open the pom.xml file in IntelliJ IDEA
  • Important:
    • Be careful when adding libraries to your project: Spark may already bundle the same libraries in different versions, in which case you will get java.lang.IncompatibleClassChangeError: Implementing class
    • If you want to create a SparkContext and perform RDD operations on Windows, there is a known bug involving winutils from Hadoop. You need to obtain winutils.exe and point HADOOP_HOME at the directory whose bin subfolder contains it.
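
Once the project opens in the IDE, a small local job is a quick way to check that the dependency and the Windows setup work. The following is a minimal sketch, not a definitive setup: the object name SparkSmokeTest, the helper sumOfSquares, the local[2] master, and the C:\hadoop path are all assumptions made for illustration, so adjust them to your own project and machine.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical object name; put it in whatever package your project uses.
object SparkSmokeTest {

  // Builds an RDD from 1..n, squares each element, and sums the result.
  def sumOfSquares(sc: SparkContext, n: Int): Long =
    sc.parallelize(1 to n).map(i => i.toLong * i).reduce(_ + _)

  def main(args: Array[String]): Unit = {
    // Windows only: if HADOOP_HOME is not set, fall back to the
    // hadoop.home.dir system property. C:\hadoop is an assumed location
    // and must contain bin\winutils.exe.
    val onWindows = System.getProperty("os.name").toLowerCase.contains("windows")
    if (onWindows && sys.env.get("HADOOP_HOME").isEmpty) {
      System.setProperty("hadoop.home.dir", "C:\\hadoop")
    }

    // local[2] runs Spark inside this JVM with two worker threads,
    // so no cluster is needed for the check.
    val conf = new SparkConf()
      .setAppName("spark-myNewProject")
      .setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      println(s"Sum of squares 1..100 = ${sumOfSquares(sc, 100)}")
    } finally {
      sc.stop() // always release the context, even if the job fails
    }
  }
}
```

If this runs from the IDE without a winutils error or an IncompatibleClassChangeError, the project setup described above is probably sound.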
