-
Notifications
You must be signed in to change notification settings - Fork 0
Setup (local)
Harish Chandra Thuwal edited this page Oct 16, 2017
·
1 revision
Download:
-
Clone repo:
git clone https://github.com/dufferzafar/github-analytics -
Create virutalenv:
venv -p /usr/bin/python3.5 env -
Install dependencies:
env/bin/pip install -r requirements.txt
-
wget https://d3kbcqa49mib13.cloudfront.net/spark-2.2.0-bin-hadoop2.7.tgz -
tar xzf spark-2.2.0-bin-hadoop2.7.tgz -
This is optional
sudo mv spark-2.2.0-bin-hadoop2.7.tgz /usr/local/spark -
Set environment variables i.e add following lines to .bashrc
-
export SPARK_HOME=/usr/local/spark(or path to extracted spark folder) export PATH=$PATH:$SPARK_HOME/bin
-
-
Run the command in the terminal to open spark with python shell
pyspark -
Run the command in the terminal to open spark with scala shell
spark-shell -
After running any of the above two commands the spark GUI can be accesed using
-
Sample SparkPi Program (to calculate digits of pi)
run-example org.apache.spark.examples.SparkPi