forage-pyspark

forage

Getting local environment versions correct (jars etc)

spark version = 3.0.1 (/usr/local/Cellar/apache-spark/3.0.1/)
scala -version (= 2.12)
python --version (= 3.7.3)

brew install scala@2.12
If other Scala versions are installed, force-link this one:
    brew link scala@2.12 --force

Run a simple test script

JDK version = 1.8.0_212 (switch JDKs if a different one is active)
Version of spark-excel (maven) = com.crealytics:spark-excel_2.12:0.13.4
To run it:
    spark-submit --packages com.crealytics:spark-excel_2.12:0.13.4 just_a_test.py
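
The `_2.12` suffix in the package coordinate is the Scala binary version the artifact was built for, which is why the local Scala version above has to be 2.12. A minimal sketch of checking a coordinate against a local Scala version follows; `Coordinate`, `parse_coordinate` and `scala_matches` are hypothetical helper names and not part of this repo.

```python
from typing import NamedTuple


class Coordinate(NamedTuple):
    group: str
    artifact: str
    scala_binary: str
    version: str


def parse_coordinate(coord: str) -> Coordinate:
    """Split 'group:artifact_scalaVersion:version' into its parts."""
    group, artifact_full, version = coord.split(":")
    # The Scala binary version is whatever follows the last underscore.
    artifact, _, scala_binary = artifact_full.rpartition("_")
    if not artifact:
        raise ValueError(f"no Scala suffix in artifact name: {artifact_full!r}")
    return Coordinate(group, artifact, scala_binary, version)


def scala_matches(coord: str, local_scala: str) -> bool:
    """True if the coordinate's Scala suffix equals major.minor of local_scala."""
    local_binary = ".".join(local_scala.split(".")[:2])
    return parse_coordinate(coord).scala_binary == local_binary


coord = "com.crealytics:spark-excel_2.12:0.13.4"
print(parse_coordinate(coord))
print(scala_matches(coord, "2.12.10"))
```

If the check fails, pick the spark-excel artifact whose suffix matches the Scala version your Spark build uses.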

Run some tasks (spark-submit makes the pyspark Python module available at runtime)

    spark-submit --packages com.crealytics:spark-excel_2.12:0.13.4 big_data_tasks.py -f "path to excel" > output.log
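
The contents of `big_data_tasks.py` are not shown here. As a rough sketch only, a script invoked this way might parse the `-f` argument and load the workbook through spark-excel as below. The function names, the `--file` long option and the reader options (`header`, `inferSchema`, as I recall them from the spark-excel 0.13.x README) are assumptions, not the script's actual code.

```python
import argparse


def parse_args(argv=None):
    # "-f" matches the command line above; "--file" is an assumed long form.
    parser = argparse.ArgumentParser(description="Run tasks on an Excel file")
    parser.add_argument("-f", "--file", required=True, help="path to the Excel file")
    return parser.parse_args(argv)


def load_excel(path):
    # Imported lazily: spark-submit supplies the pyspark module at runtime.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("big_data_tasks").getOrCreate()
    return (
        spark.read.format("com.crealytics.spark.excel")
        .option("header", "true")       # assumed: first row holds column names
        .option("inferSchema", "true")  # assumed: let the reader guess types
        .load(path)
    )


def main(argv=None):
    args = parse_args(argv)
    df = load_excel(args.file)
    df.printSchema()
    df.show(5)
```

A real script would end with `main()` under the usual `if __name__ == "__main__":` guard. Anything it prints to stdout, such as the `df.show(5)` table, ends up in `output.log` because of the redirect.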