    f6592cdd
    Nischol Antao authored
    1) Ran the code for questions 1-3 on PySpark in cluster mode, across multiple nodes. Measured and recorded the difference in performance between running it on a single EC2 instance and running it on a cluster.
    
    2) Added some screenshots for the final report, to show the cluster configuration.
    
    3) Added IPython notebooks for performance metrics in local mode.
    
    4) Added JSON files for the Zeppelin notebooks.
    
    5) Created new source files for the PySpark code run in Zeppelin notebooks in cluster mode.
    
    6) Added test results for question 3 when using Hive to calculate the median.
    
    7) Added R code from Rob for the question 3 local exploration.
    
    8) Renamed some of the local exploration files.