    [SPARK-23446][PYTHON] Explicitly check supported types in toPandas · c5857e49
    hyukjinkwon authored
    ## What changes were proposed in this pull request?
    
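    This PR explicitly specifies and checks the types supported in `toPandas`. Previously there was no such check, which left a hole: for example, binary type support is not yet finished on the Python side, but `toPandas` currently accepts a binary column and returns different results depending on whether Arrow is enabled: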
    
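    ```python
    # Without Arrow: the binary column comes back as a list of byte values.
    spark.conf.set("spark.sql.execution.arrow.enabled", "false")
    df = spark.createDataFrame([[bytearray(b"a")]])  # bytes literal works on Python 2 and 3
    df.toPandas()
    # With Arrow: the same column comes back as a string.
    spark.conf.set("spark.sql.execution.arrow.enabled", "true")
    df.toPandas()
    ```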
    
    ```
         _1
    0  [97]
      _1
    0  a
    ```
    
    This should be disallowed. I think the same applies to nested timestamps.
    
    I also added a clearer note about `spark.sql.execution.arrow.enabled` to the error message.
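
To make the idea of an explicit type check concrete, here is a minimal sketch in plain Python. It is not the code in this patch: the schema representation (a dict of column name to type name, with `("array", t)` and `("struct", {...})` for nesting), the helper names `find_unsupported` and `check_supported_types`, the two sets of rejected types, and the wording of the error message are all assumptions made for illustration.

```python
# Hypothetical stand-in for a schema: column name -> type descriptor.
# A descriptor is a type name ("int", "binary", "timestamp", ...) or a
# nested form: ("array", element_type) or ("struct", {field_name: type}).
UNSUPPORTED_TOP_LEVEL = {"binary"}             # assumed for this sketch
UNSUPPORTED_NESTED = {"binary", "timestamp"}   # assumed for this sketch


def find_unsupported(type_desc, nested=False):
    """Return a description of the first unsupported type, or None."""
    if isinstance(type_desc, tuple):
        kind, inner = type_desc
        if kind == "array":
            return find_unsupported(inner, nested=True)
        if kind == "struct":
            for field_type in inner.values():
                bad = find_unsupported(field_type, nested=True)
                if bad is not None:
                    return bad
            return None
        raise ValueError("unknown nested kind: %r" % (kind,))
    banned = UNSUPPORTED_NESTED if nested else UNSUPPORTED_TOP_LEVEL
    if type_desc in banned:
        return ("nested " if nested else "") + type_desc
    return None


def check_supported_types(schema):
    """Raise TypeError up front instead of silently converting."""
    for name, type_desc in schema.items():
        bad = find_unsupported(type_desc)
        if bad is not None:
            # Illustrative wording only; the real message may differ.
            raise TypeError(
                "Unsupported type in conversion to pandas: column %r has "
                "type %s. This check is related to the "
                "'spark.sql.execution.arrow.enabled' setting." % (name, bad)
            )


# A top-level timestamp passes; a binary column is rejected.
check_supported_types({"id": "int", "ts": "timestamp"})
try:
    check_supported_types({"_1": "binary"})
except TypeError as exc:
    print(exc)
```

The point of the sketch is the shape of the check: walk the schema once before conversion, fail fast on anything not known to be supported, and name the relevant configuration key in the error so the user knows where to look.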
    
    ## How was this patch tested?
    
    Manually tested, and tests were added in `python/pyspark/sql/tests.py`.
    
    Author: hyukjinkwon <gurwls223@gmail.com>
    
    Closes #20625 from HyukjinKwon/pandas_convertion_supported_type.