- Download the latest version of Spark from here. I selected the 1.6.0 release, pre-built for Hadoop 2.6 and later, and did a direct download.
- Navigate to the download folder and run: tar -xvf spark-1.6.0-bin-hadoop2.6.tgz
- Navigate to the extracted folder and run: ./bin/spark-shell
I chose to write in Scala, but you can also use Python or Java. I had a version issue with sbt. I also had trouble connecting to the master (I used a cluster with one master and one worker), which I solved like this:
    cd spark-1.6.0-bin-hadoop2.6/conf
    pico spark-env.sh.template
    # and I appended SPARK_MASTER_IP=<your host IP> (mine is 192.168.1.2)
    source spark-env.sh.template
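Spark's launch scripts read conf/spark-env.sh (the template itself says to copy it under that name), so an alternative to sourcing the template is to copy it once and append the setting there. Below is a minimal sketch of that idea. The function name set_spark_master_ip and its two arguments are hypothetical helpers, not part of Spark, and the example values are the ones used above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Spark): copy the template to spark-env.sh
# if needed, then append SPARK_MASTER_IP unless it is already set there.
set_spark_master_ip() {
    conf_dir="$1"   # e.g. spark-1.6.0-bin-hadoop2.6/conf
    master_ip="$2"  # e.g. 192.168.1.2
    env_file="$conf_dir/spark-env.sh"

    # Start from the shipped template the first time around.
    if [ ! -f "$env_file" ]; then
        cp "$conf_dir/spark-env.sh.template" "$env_file"
    fi

    # Append the setting only once, so re-running is harmless.
    if ! grep -q '^SPARK_MASTER_IP=' "$env_file"; then
        echo "SPARK_MASTER_IP=$master_ip" >> "$env_file"
    fi
}
```

It could be called as: set_spark_master_ip spark-1.6.0-bin-hadoop2.6/conf 192.168.1.2. Keeping the template untouched also makes it easy to start over if the configuration goes wrong.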
Then I started the master and a worker, built the project with sbt, and submitted the job:

    ./sbin/start-master.sh
    # now open a browser and go to http://localhost:8080/
    ./sbin/start-slave.sh spark://gsamaras:7077
    sbt package
    bin/spark-submit --class "KMeans" --master spark://gsamaras:7077 target/scala-2.10/kmeans-project_2.10-1.0.jar
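The sbt package step needs a project directory with a build definition. The jar path above (target/scala-2.10/kmeans-project_2.10-1.0.jar) suggests a project named "KMeans Project", version 1.0, built with Scala 2.10. The sketch below sets up such a layout. The helper name init_kmeans_project, the Scala patch version 2.10.5, and the spark-mllib dependency are all assumptions that do not appear in the original, so adjust them to your setup.

```shell
#!/bin/sh
# Hypothetical helper: create the directory layout sbt expects and a
# build.sbt whose settings would produce kmeans-project_2.10-1.0.jar.
init_kmeans_project() {
    project_dir="$1"

    # sbt looks for Scala sources under src/main/scala by default.
    mkdir -p "$project_dir/src/main/scala"

    # Assumed settings: Scala 2.10.5 and the MLlib dependency are guesses
    # chosen to match Spark 1.6.0, not taken from the original project.
    cat > "$project_dir/build.sbt" <<'EOF'
name := "KMeans Project"

version := "1.0"

scalaVersion := "2.10.5"

libraryDependencies += "org.apache.spark" %% "spark-core" % "1.6.0"

libraryDependencies += "org.apache.spark" %% "spark-mllib" % "1.6.0"
EOF
}
```

For example, init_kmeans_project kmeans creates kmeans/src/main/scala, where the Scala source defining the KMeans object would go. Running sbt package from the kmeans directory should then write the jar under target/scala-2.10/.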
Have questions? Comments? Did you find a bug? Let me know! 😀
Page created by G. (George) Samaras (DIT)