[SPARK-21583][SQL] Create a ColumnarBatch from ArrowColumnVectors
## What changes were proposed in this pull request?

This PR allows a `ColumnarBatch` to be created from `ReadOnlyColumnVectors`; previously a columnar batch could only allocate its vectors internally. This makes it possible to use `ArrowColumnVectors` in batch form for row-based iteration. It also adds `ArrowConverters.fromPayloadIterator`, which converts an `ArrowPayload` iterator to an `InternalRow` iterator and uses a `ColumnarBatch` internally.

## How was this patch tested?

Added a new unit test for creating a `ColumnarBatch` with `ReadOnlyColumnVectors`, and a test that verifies the roundtrip of rows -> ArrowPayload -> rows using `toPayloadIterator` and `fromPayloadIterator`.

Author: Bryan Cutler <cutlerb@gmail.com>

Closes #18787 from BryanCutler/arrow-ColumnarBatch-support-SPARK-21583.
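The idea of building a batch around externally supplied read-only vectors, then iterating over it row by row, can be illustrated with a minimal self-contained sketch. The names `ReadOnlyVector`, `ArrayBackedVector`, `SimpleBatch`, and `SimpleBatchDemo` are hypothetical stand-ins invented for this illustration; they are not Spark's classes and do not reflect the signatures of `ColumnarBatch` or `ArrowColumnVector`.

```scala
// Hypothetical stand-in for a read-only column vector (NOT Spark's API).
trait ReadOnlyVector {
  def getInt(rowId: Int): Int
}

// A vector that exposes existing data without copying or allowing writes.
final class ArrayBackedVector(data: Array[Int]) extends ReadOnlyVector {
  def getInt(rowId: Int): Int = data(rowId)
}

// A batch that wraps columns supplied by the caller instead of
// allocating its own, and offers row-based iteration over them.
final class SimpleBatch(columns: IndexedSeq[ReadOnlyVector], numRows: Int) {
  def rowIterator: Iterator[IndexedSeq[Int]] =
    Iterator.range(0, numRows).map(row => columns.map(_.getInt(row)))
}

object SimpleBatchDemo {
  // Builds a two-column batch and materializes its rows.
  def rows(): List[IndexedSeq[Int]] = {
    val columns = Vector[ReadOnlyVector](
      new ArrayBackedVector(Array(1, 2, 3)),
      new ArrayBackedVector(Array(10, 20, 30))
    )
    new SimpleBatch(columns, numRows = 3).rowIterator.toList
  }
}
```

In this sketch the batch only reads from the vectors it is given, which mirrors why read-only vectors are sufficient for row iteration.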
Showing 3 changed files:
- sql/core/src/main/scala/org/apache/spark/sql/execution/arrow/ArrowConverters.scala (75 additions, 1 deletion)
- sql/core/src/test/scala/org/apache/spark/sql/execution/arrow/ArrowConvertersSuite.scala (28 additions, 1 deletion)
- sql/core/src/test/scala/org/apache/spark/sql/execution/vectorized/ColumnarBatchSuite.scala (54 additions, 0 deletions)