    c9e2ef52
    Davies Liu authored
    [SPARK-7902] [SPARK-6289] [SPARK-8685] [SQL] [PYSPARK] Refactor of serialization for Python DataFrame
    
    This PR fixes the long-standing issue of serialization between Python RDD and DataFrame. It switches to a customized Pickler for InternalRow to enable customized unpickling (type conversion, especially for UDT), so UDTs are now supported in UDFs. cc mengxr.
    
    There is no generated `Row` anymore.
    
    Author: Davies Liu <davies@databricks.com>
    
    Closes #7301 from davies/sql_ser and squashes the following commits:
    
    81bef71 [Davies Liu] address comments
    e9217bd [Davies Liu] add regression tests
    db34167 [Davies Liu] Refactor of serialization for Python DataFrame
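
The "customized unpickling" idea in the description above can be illustrated with a minimal, self-contained Python sketch. It is not the code from this PR: `InternalRow`, `_create_row`, `CONVERTERS`, and the schema layout are hypothetical names chosen for illustration, and the real change spans Spark's JVM and Python serialization code. The sketch only shows the general technique, assuming the standard `pickle` protocol: the pickled payload names a function to call at load time, and that function converts internal values into Python-facing types.

```python
import pickle
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)

# Hypothetical converters from an internal representation to Python objects.
# For example, a date stored internally as days since the Unix epoch.
CONVERTERS = {
    "int": int,
    "string": str,
    "date": lambda days: EPOCH + timedelta(days=days),
}


def _create_row(schema, values):
    """Called by pickle at load time; applies per-field type conversion."""
    return {
        name: CONVERTERS[type_name](value)
        for (name, type_name), value in zip(schema, values)
    }


class InternalRow:
    """Hypothetical stand-in for a row of internal values plus its schema."""

    def __init__(self, schema, values):
        self.schema = schema  # list of (field name, type name) pairs
        self.values = values  # internal values, in schema order

    def __reduce__(self):
        # Instead of rebuilding an InternalRow, ask pickle to call
        # _create_row(schema, values) when the payload is loaded.
        return (_create_row, (self.schema, self.values))


schema = [("id", "int"), ("born", "date")]
payload = pickle.dumps(InternalRow(schema, [1, 0]))
row = pickle.loads(payload)
```

After the round trip, `row` is a plain dictionary whose `born` field is a `datetime.date` rather than the internal day count. Because the conversion runs on the receiving side, the same hook could, under the same assumptions, rebuild user-defined types from their serialized form.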