-
- Downloads
[SPARK-16350][SQL] Fix support for incremental planning in wirteStream.foreach()
## What changes were proposed in this pull request? There are cases where `complete` output mode does not output updated aggregated value; for details please refer to [SPARK-16350](https://issues.apache.org/jira/browse/SPARK-16350). The cause is that, as we do `data.as[T].foreachPartition { iter => ... }` in `ForeachSink.addBatch()`, `foreachPartition()` does not support incremental planning for now. This patches makes `foreachPartition()` support incremental planning in `ForeachSink`, by making a special version of `Dataset` with its `rdd()` method supporting incremental planning. ## How was this patch tested? Added a unit test which failed before the change Author: Liwei Lin <lwlin7@gmail.com> Closes #14030 from lw-lin/fix-foreach-complete.
Showing
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ForeachSink.scala 38 additions, 2 deletions...rg/apache/spark/sql/execution/streaming/ForeachSink.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/IncrementalExecution.scala 2 additions, 2 deletions.../spark/sql/execution/streaming/IncrementalExecution.scala
- sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/ForeachSinkSuite.scala 77 additions, 9 deletions...ache/spark/sql/execution/streaming/ForeachSinkSuite.scala
Please register or sign in to comment