[SPARK-18516][SQL] Split state and progress in streaming
This PR splits the status of a `StreamingQuery` into two APIs:
- `status` - describes the status of a `StreamingQuery` at this moment, including which phase of processing is currently running and whether data is available.
- `recentProgress` - an array of statistics about the most recently executed microbatches.

Each progress entry contains the following information:
```
{
  "id" : "2be8670a-fce1-4859-a530-748f29553bb6",
  "name" : "query-29",
  "timestamp" : 1479705392724,
  "inputRowsPerSecond" : 230.76923076923077,
  "processedRowsPerSecond" : 10.869565217391303,
  "durationMs" : {
    "triggerExecution" : 276,
    "queryPlanning" : 3,
    "getBatch" : 5,
    "getOffset" : 3,
    "addBatch" : 234,
    "walCommit" : 30
  },
  "currentWatermark" : 0,
  "stateOperators" : [ ],
  "sources" : [ {
    "description" : "KafkaSource[Subscribe[topic-14]]",
    "startOffset" : {
      "topic-14" : {
        "2" : 0,
        "4" : 1,
        "1" : 0,
        "3" : 0,
        "0" : 0
      }
    },
    "endOffset" : {
      "topic-14" : {
        "2" : 1,
        "4" : 2,
        "1" : 0,
        "3" : 0,
        "0" : 1
      }
    },
    "numRecords" : 3,
    "inputRowsPerSecond" : 230.76923076923077,
    "processedRowsPerSecond" : 10.869565217391303
  } ]
}
```

Additionally, to make it possible to correlate progress updates across restarts, we change the `id` field from an integer that is unique within the JVM to a `UUID` that is globally unique.

Author: Tathagata Das <tathagata.das1565@gmail.com>
Author: Michael Armbrust <michael@databricks.com>

Closes #15954 from marmbrus/queryProgress.
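As a rough illustration of how a progress entry might be consumed, the sketch below parses a trimmed copy of the JSON above using only the Python standard library. It does not call any Spark API. The helper names `records_per_partition` and `slowest_phase` are hypothetical, and the assumption that offsets are nested `topic -> partition -> offset` maps holds for the Kafka source shown here but not necessarily for other sources.

```python
import json

# Trimmed copy of the progress JSON from the commit message above.
PROGRESS_JSON = """
{
  "id": "2be8670a-fce1-4859-a530-748f29553bb6",
  "name": "query-29",
  "durationMs": {"triggerExecution": 276, "queryPlanning": 3, "getBatch": 5,
                 "getOffset": 3, "addBatch": 234, "walCommit": 30},
  "sources": [{
    "description": "KafkaSource[Subscribe[topic-14]]",
    "startOffset": {"topic-14": {"2": 0, "4": 1, "1": 0, "3": 0, "0": 0}},
    "endOffset": {"topic-14": {"2": 1, "4": 2, "1": 0, "3": 0, "0": 1}},
    "numRecords": 3
  }]
}
"""


def records_per_partition(source):
    """Hypothetical helper: end offset minus start offset per (topic, partition).

    Assumes Kafka-style offsets; a partition missing from startOffset is
    treated as starting at 0, which is an assumption of this sketch.
    """
    deltas = {}
    for topic, end_parts in source["endOffset"].items():
        start_parts = source["startOffset"].get(topic, {})
        for partition, end in end_parts.items():
            deltas[(topic, int(partition))] = end - start_parts.get(partition, 0)
    return deltas


def slowest_phase(progress):
    """Hypothetical helper: longest phase, excluding the overall trigger time."""
    phases = {name: ms for name, ms in progress["durationMs"].items()
              if name != "triggerExecution"}
    return max(phases, key=phases.get)


progress = json.loads(PROGRESS_JSON)
source = progress["sources"][0]
deltas = records_per_partition(source)
total = sum(deltas.values())

print(f"query {progress['name']} ({progress['id']})")
print(f"records by offset delta: {total}, reported numRecords: {source['numRecords']}")
print(f"slowest phase: {slowest_phase(progress)}")
```

For this sample, the per-partition offset deltas add up to the reported `numRecords`, and `addBatch` accounts for most of the trigger time. Keying any downstream bookkeeping on the `id` field rather than `name` follows the commit's stated goal of correlating updates across restarts.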
Changed files:
- external/kafka-0-10-sql/src/test/scala/org/apache/spark/sql/kafka010/KafkaSourceSuite.scala (4 additions, 3 deletions)
- project/MimaExcludes.scala (11 additions, 0 deletions)
- python/pyspark/sql/streaming.py (22 additions, 304 deletions)
- python/pyspark/sql/tests.py (22 additions, 0 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/MetricsReporter.scala (53 additions, 0 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/ProgressReporter.scala (234 additions, 0 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamExecution.scala (64 additions, 218 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/StreamMetrics.scala (0 additions, 243 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala (8 additions, 0 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/streaming/SourceStatus.scala (0 additions, 95 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQuery.scala (20 additions, 13 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryException.scala (1 addition, 1 deletion)
- sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryListener.scala (12 additions, 12 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryManager.scala (20 additions, 7 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/streaming/StreamingQueryStatus.scala (11 additions, 140 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/streaming/progress.scala (193 additions, 0 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/execution/streaming/StreamMetricsSuite.scala (0 additions, 213 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/streaming/FileStreamSourceSuite.scala (7 additions, 3 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamTest.scala (3 additions, 70 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/streaming/StreamingQueryListenerSuite.scala (143 additions, 124 deletions)