1) Download Spark 1.4.0 from https://spark.apache.org/downloads.html (choose the source code package, since the steps below build it with Maven)
2) Check the dependencies (Scala 2.11, Maven 3.3.3) with the commands
scala -version && mvn -version
3) Now build Spark with Apache Maven. From the top-level directory of the Spark source, run
mvn -DskipTests clean package
4) Wait until the build succeeds. This takes around 45 minutes.
5) Check the Scala shell and the PySpark shell with the commands
./bin/spark-shell (Scala)
./bin/pyspark (Python)
6) If you need to use Spark 1.4 with IPython notebook
- Follow this link to set up Spark on IPython notebook ( http://advancedatascience.blogspot.com/2015/06/how-to-use-spark-on-ipython-notebook.html ) and change the Spark home directory path in the file \00-pyspark-setup.py
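
To illustrate step 6, here is a minimal sketch of what a startup file such as 00-pyspark-setup.py might contain. It is not taken from the linked post. The helper names pyspark_paths and configure_pyspark are made up for this example. The sketch assumes the Spark home directory is supplied through a SPARK_HOME environment variable and that the Py4J archive sits under python/lib inside the Spark directory. Adjust it to match the file the post actually gives you.

```python
import glob
import os
import sys


def pyspark_paths(spark_home):
    """Return the entries that must be on sys.path to import pyspark.

    Hypothetical helper for this sketch, not part of Spark.
    """
    python_dir = os.path.join(spark_home, "python")
    # The Py4J version in the archive name differs between Spark releases,
    # so match it with a wildcard instead of hard-coding it.
    py4j_zips = sorted(glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip")))
    return [python_dir] + py4j_zips


def configure_pyspark(spark_home):
    """Put Spark's Python sources on sys.path and return the path of pyspark/shell.py.

    Hypothetical helper for this sketch, not part of Spark.
    """
    if not spark_home or not os.path.isdir(spark_home):
        raise ValueError("SPARK_HOME does not point to a directory: %r" % (spark_home,))
    # Insert in reverse so that the python/ directory ends up first on sys.path.
    for path in reversed(pyspark_paths(spark_home)):
        if path not in sys.path:
            sys.path.insert(0, path)
    return os.path.join(spark_home, "python", "pyspark", "shell.py")


# Assumption: the Spark home directory comes from the SPARK_HOME environment
# variable. If it is unset, the file does nothing.
spark_home = os.environ.get("SPARK_HOME")
if spark_home and os.path.isdir(spark_home):
    shell_script = configure_pyspark(spark_home)
    if os.path.isfile(shell_script):
        # Run Spark's own interactive bootstrap script in this namespace.
        with open(shell_script) as f:
            exec(compile(f.read(), shell_script, "exec"))
```

With SPARK_HOME pointing at the directory where Spark was built, starting IPython with a file like this in the profile's startup directory should leave a SparkContext named sc available, since pyspark/shell.py creates one.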