-
- Downloads
[SPARK-20452][SS][KAFKA] Fix a potential ConcurrentModificationException for batch Kafka DataFrame
## What changes were proposed in this pull request? Cancel a batch Kafka query but one of task cannot be cancelled, and rerun the same DataFrame may cause ConcurrentModificationException because it may launch two tasks sharing the same group id. This PR always create a new consumer when `reuseKafkaConsumer = false` to avoid ConcurrentModificationException. It also contains other minor fixes. ## How was this patch tested? Jenkins. Author: Shixiong Zhu <shixiong@databricks.com> Closes #17752 from zsxwing/kafka-fix.
Showing
- external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/CachedKafkaConsumer.scala 10 additions, 2 deletions...a/org/apache/spark/sql/kafka010/CachedKafkaConsumer.scala
- external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala 4 additions, 2 deletions...ala/org/apache/spark/sql/kafka010/KafkaOffsetReader.scala
- external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaRelation.scala 26 additions, 4 deletions...n/scala/org/apache/spark/sql/kafka010/KafkaRelation.scala
- external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala 69 additions, 78 deletions...a/org/apache/spark/sql/kafka010/KafkaSourceProvider.scala
- external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaSourceRDD.scala 9 additions, 10 deletions.../scala/org/apache/spark/sql/kafka010/KafkaSourceRDD.scala
- external/kafka-0-10/src/main/scala/org/apache/spark/streaming/kafka010/KafkaRDD.scala 1 addition, 1 deletion.../scala/org/apache/spark/streaming/kafka010/KafkaRDD.scala
Loading
Please register or sign in to comment