IPython notebook is a powerful tool for learning Python programming. In this post, I demonstrate how to set up an IPython notebook to write Spark programs in Python.
- Install Spark
Suppose Spark is installed in the directory ~/spark; then execute:

    export SPARK_HOME="$HOME/spark"

- Install Anaconda at ~/anaconda

    cd ~/anaconda
    zip -r anaconda.zip .

This will compress all the Anaconda files into a zip file.
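Before starting the notebook, it can help to confirm that both steps worked. The following is only a minimal sketch and not part of the original setup: `check_setup` is a hypothetical helper, and it assumes that the Spark distribution keeps its launcher at `bin/pyspark` and that the archive was written to `~/anaconda/anaconda.zip`.

```python
import os
import zipfile


def check_setup(spark_home, anaconda_zip):
    """Return a list of problems found; an empty list means both checks passed.

    Hypothetical helper for illustration only.
    """
    problems = []

    # A "~" inside shell quotes is not expanded, so expand it here as well.
    spark_home = os.path.expanduser(spark_home)
    # Assumption: the Spark distribution ships its launcher at bin/pyspark.
    launcher = os.path.join(spark_home, "bin", "pyspark")
    if not os.path.isfile(launcher):
        problems.append("no bin/pyspark under " + spark_home)

    # is_zipfile returns False for a missing file as well as for a non-zip file.
    anaconda_zip = os.path.expanduser(anaconda_zip)
    if not zipfile.is_zipfile(anaconda_zip):
        problems.append(anaconda_zip + " is missing or not a zip archive")

    return problems
```

Calling `check_setup(os.environ.get("SPARK_HOME", "~/spark"), "~/anaconda/anaconda.zip")` returns an empty list when everything is in place, and otherwise one message per failed check.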
Run IPython notebook for PySpark in local mode
- Now you can start an IPython notebook server in local mode:

    IPYTHON_OPTS="notebook --notebook-dir=${WORKSPACE_DIR} --ip=* \
    --config=${CONFIG_FILE} \
    --port=${PORT}" pyspark --master local[2] \
    --driver-memory 3G \
    --jars /..../hcatalog-support.jar \
    --conf spark.authenticate.secret=password

CONFIG_FILE is the location of the jupyter_notebook_config file.
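The launch command mixes two kinds of options: those for the notebook server, which travel inside the `IPYTHON_OPTS` variable, and those for Spark, which are ordinary arguments to `pyspark`. The sketch below separates the two. `build_launch` and its parameter names are hypothetical, and the site-specific `--jars` and `--conf` options are left out; it only builds the pieces and does not start anything.

```python
import os


def build_launch(workspace_dir, config_file, port,
                 master="local[2]", driver_memory="3G"):
    """Return (env, argv) for the launch; hypothetical helper, starts nothing."""
    # Options for the notebook server go into the IPYTHON_OPTS variable.
    notebook_opts = [
        "notebook",
        "--notebook-dir=" + workspace_dir,
        "--ip=*",
        "--config=" + config_file,
        "--port=" + str(port),
    ]
    env = dict(os.environ, IPYTHON_OPTS=" ".join(notebook_opts))

    # Options for Spark itself are plain arguments to the pyspark launcher.
    argv = ["pyspark", "--master", master, "--driver-memory", driver_memory]
    return env, argv
```

Passing the result to `subprocess.call(argv, env=env)` would start the server, assuming `pyspark` is on `PATH`. Note that Spark 2.0 and later no longer accept `IPYTHON_OPTS`; there the same notebook options go in `PYSPARK_DRIVER_PYTHON_OPTS`, with `PYSPARK_DRIVER_PYTHON` set to the notebook executable (for example `jupyter`).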