    [SPARK-15417][SQL][PYTHON] PySpark shell always uses in-memory catalog · c32b1b16
    Andrew Or authored
    ## What changes were proposed in this pull request?
    
    There is no way to use the Hive catalog in `pyspark-shell`. This is because we used to create a `SparkContext` before calling `SparkSession.enableHiveSupport().getOrCreate()`, which just gets the existing `SparkContext` instead of creating a new one. As a result, `spark.sql.catalogImplementation` was never propagated.
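The ordering problem described above can be illustrated with a minimal, self-contained sketch (plain Python; the `Context` and `SessionBuilder` classes are hypothetical stand-ins for `SparkContext` and `SparkSession.builder`, not the real PySpark API):

```python
class Context:
    """Stand-in for SparkContext: a process-wide singleton holding config."""
    _active = None

    def __init__(self, conf):
        self.conf = dict(conf)

    @classmethod
    def get_or_create(cls, conf):
        # Like SparkContext.getOrCreate(): an existing context wins, and
        # any new config passed afterwards is silently dropped.
        if cls._active is None:
            cls._active = cls(conf)
        return cls._active


class SessionBuilder:
    """Stand-in for SparkSession.builder with enableHiveSupport()."""
    def __init__(self):
        self._conf = {}

    def enable_hive_support(self):
        self._conf["spark.sql.catalogImplementation"] = "hive"
        return self

    def get_or_create(self):
        return Context.get_or_create(self._conf)


# Buggy order (old pyspark shell): a context is created eagerly first,
# so enableHiveSupport()'s config never reaches it.
Context._active = None
Context.get_or_create({})
ctx = SessionBuilder().enable_hive_support().get_or_create()
print(ctx.conf.get("spark.sql.catalogImplementation", "in-memory"))  # in-memory

# Fixed order: build the session (and its config) before any context exists.
Context._active = None
ctx = SessionBuilder().enable_hive_support().get_or_create()
print(ctx.conf.get("spark.sql.catalogImplementation", "in-memory"))  # hive
```

The fix in this patch follows the same idea: construct the `SparkSession` (with Hive support, when configured) before any `SparkContext` exists, so the catalog setting is applied to the context actually created.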
    
    ## How was this patch tested?
    
    Manual.
    
    Author: Andrew Or <andrew@databricks.com>
    
    Closes #13203 from andrewor14/fix-pyspark-shell.