[SPARK-7902] [SPARK-6289] [SPARK-8685] [SQL] [PYSPARK] Refactor of serialization for Python DataFrame

This PR fixes the long-standing issue of serialization between Python RDD and DataFrame. It switches to a customized Pickler for InternalRow to enable customized unpickling (type conversion, especially for UDT), so UDTs are now supported in UDFs. cc mengxr. There is no generated `Row` anymore.

Author: Davies Liu <davies@databricks.com>

Closes #7301 from davies/sql_ser and squashes the following commits:

81bef71 [Davies Liu] address comments
e9217bd [Davies Liu] add regression tests
db34167 [Davies Liu] Refactor of serialization for Python DataFrame
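The idea of "customized unpickling" can be illustrated with plain Python `pickle`. The sketch below is not the code from this commit: `InternalRowStub`, `restore_row`, `CONVERTERS`, and `Point` are hypothetical names, and the day-count encoding of dates is an assumption made for the example. It only shows the general technique, where the pickled payload names a reconstruction function that converts internal values (including a user-defined type) back into Python objects at load time.

```python
import pickle
from datetime import date, timedelta

EPOCH = date(1970, 1, 1)


class Point:
    """Hypothetical stand-in for a user-defined type (UDT)."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)


# Hypothetical converter table: type tag -> function that turns the
# internal (serialized) value back into a Python object.
CONVERTERS = {
    "date": lambda days: EPOCH + timedelta(days=days),  # assumed encoding
    "point": lambda xy: Point(*xy),
}


def restore_row(tags, values):
    """Invoked by pickle.loads; applies a per-field type conversion."""
    return tuple(
        CONVERTERS[tag](value) if tag in CONVERTERS else value
        for tag, value in zip(tags, values)
    )


class InternalRowStub:
    """Hypothetical row holding values in their internal representation."""

    def __init__(self, tags, values):
        self.tags = tags
        self.values = values

    def __reduce__(self):
        # Tell pickle to rebuild this object by calling restore_row(...)
        # instead of re-creating an InternalRowStub instance.
        return (restore_row, (self.tags, self.values))


internal = InternalRowStub(("int", "date", "point"), (1, 16617, (1.0, 2.0)))
row = pickle.loads(pickle.dumps(internal))
```

After the round trip, `row` is a plain tuple whose fields have already been converted, so no per-schema row class has to be generated on the receiving side.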
Showing
- python/pyspark/sql/context.py (2 additions, 3 deletions)
- python/pyspark/sql/dataframe.py (3 additions, 13 deletions)
- python/pyspark/sql/tests.py (22 additions, 6 deletions)
- python/pyspark/sql/types.py (147 additions, 272 deletions)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/rows.scala (12 additions, 0 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala (2 additions, 3 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/pythonUDFs.scala (104 additions, 18 deletions)