# [SPARK-15593][SQL] Add DataFrameWriter.foreach to allow the user consuming data in ContinuousQuery
## What changes were proposed in this pull request?

* Add `DataFrameWriter.foreach` to allow the user to consume data in a `ContinuousQuery`
* `ForeachWriter` is the interface for the user to consume partitions of data
* Add a type parameter `T` to `DataFrameWriter`

Usage:

```Scala
val ds = spark.read....stream().as[String]

ds.....write
  .queryName(...)
  .option("checkpointLocation", ...)
  .foreach(new ForeachWriter[String] {

    override def open(partitionId: Long, version: Long): Boolean = {
      // Prepare some resources for a partition.
      // Check `version` if possible and return `false` if this is duplicated
      // data, to skip the data processing.
    }

    override def process(value: String): Unit = {
      // Process data.
    }

    override def close(errorOrNull: Throwable): Unit = {
      // Release resources for a partition.
      // Check `errorOrNull` and handle the error if necessary.
    }
  })
```

## How was this patch tested?

New unit tests.

Author: Shixiong Zhu <shixiong@databricks.com>

Closes #13342 from zsxwing/foreach.
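To make the open/process/close contract concrete, here is a minimal, self-contained sketch of how a sink could drive a writer over one partition of a batch. `SimpleForeachWriter` and `runPartition` are illustrative names invented for this sketch, not Spark API, and error handling is simplified compared to the real `ForeachSink` (which re-throws processing errors):

```Scala
// Hedged sketch of the ForeachWriter lifecycle; names below are hypothetical.
abstract class SimpleForeachWriter[T] {
  def open(partitionId: Long, version: Long): Boolean
  def process(value: T): Unit
  def close(errorOrNull: Throwable): Unit
}

// Drive the writer over one partition: open once, process each row,
// always close. If open() returns false (e.g. a duplicated (partition,
// version) pair), the whole partition is skipped.
def runPartition[T](writer: SimpleForeachWriter[T],
                    partitionId: Long,
                    version: Long,
                    rows: Iterator[T]): Unit = {
  if (writer.open(partitionId, version)) {
    var error: Throwable = null
    try {
      rows.foreach(writer.process)
    } catch {
      case t: Throwable => error = t // simplified: real sink re-throws
    } finally {
      writer.close(error) // null means the partition succeeded
    }
  }
}

// Tiny demonstration: collect the values the writer sees.
val seen = scala.collection.mutable.ArrayBuffer[String]()
runPartition(new SimpleForeachWriter[String] {
  def open(partitionId: Long, version: Long): Boolean = true
  def process(value: String): Unit = seen += value
  def close(errorOrNull: Throwable): Unit = ()
}, partitionId = 0L, version = 0L, rows = Iterator("a", "b"))
```

Note that `close` is called even when `process` throws, which is why the contract asks the user to inspect `errorOrNull` before committing any side effects.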
Showing 6 changed files:

- `sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala` (110 additions, 40 deletions)
- `sql/core/src/main/scala/org/apache/spark/sql/Dataset.scala` (1 addition, 1 deletion)
- `sql/core/src/main/scala/org/apache/spark/sql/ForeachWriter.scala` (105 additions, 0 deletions)
- `sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ForeachSink.scala` (53 additions, 0 deletions)
- `sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/ForeachSinkSuite.scala` (141 additions, 0 deletions)
- `sql/hive/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala` (3 additions, 1 deletion)