[SPARK-6079] Use index to speed up StatusTracker.getJobIdsForGroup()
`StatusTracker.getJobIdsForGroup()` is implemented via a linear scan over a HashMap rather than using an index, which might be an expensive operation if there are many (e.g. thousands) of retained jobs. This patch adds a new map to `JobProgressListener` in order to speed up these lookups.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #4830 from JoshRosen/statustracker-job-group-indexing and squashes the following commits:

e39c5c7 [Josh Rosen] Address review feedback
6709fb2 [Josh Rosen] Merge remote-tracking branch 'origin/master' into statustracker-job-group-indexing
2c49614 [Josh Rosen] getOrElse
97275a7 [Josh Rosen] Add jobGroup to jobId index to JobProgressListener
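The idea of replacing the linear scan with a job-group index can be illustrated with a minimal, self-contained sketch. This is not the code from the patch: `JobInfo`, `JobIndex`, `onJobStart`, `onJobEvicted`, and both lookup methods are hypothetical names chosen for illustration, and modelling the group as an `Option[String]` is an assumption. Thread safety is left out.

```scala
import scala.collection.mutable

// Hypothetical, simplified stand-in for the listener's per-job bookkeeping.
final case class JobInfo(jobId: Int, jobGroup: Option[String])

class JobIndex {
  // Primary store: job ID -> job data.
  private val jobsById = mutable.HashMap.empty[Int, JobInfo]
  // Secondary index: job group -> IDs of the retained jobs in that group.
  private val jobIdsByGroup = mutable.HashMap.empty[String, mutable.HashSet[Int]]

  def onJobStart(jobId: Int, jobGroup: Option[String]): Unit = {
    jobsById(jobId) = JobInfo(jobId, jobGroup)
    jobGroup.foreach { g =>
      jobIdsByGroup.getOrElseUpdate(g, mutable.HashSet.empty[Int]) += jobId
    }
  }

  // Keep the index consistent when a retained job is dropped.
  def onJobEvicted(jobId: Int): Unit = {
    jobsById.remove(jobId).flatMap(_.jobGroup).foreach { g =>
      jobIdsByGroup.get(g).foreach { ids =>
        ids -= jobId
        if (ids.isEmpty) jobIdsByGroup.remove(g) // avoid leaking empty groups
      }
    }
  }

  // Linear scan: inspects every retained job on each call.
  def jobIdsForGroupScan(group: String): Seq[Int] =
    jobsById.values.filter(_.jobGroup.contains(group)).map(_.jobId).toSeq.sorted

  // Indexed lookup: one hash lookup, then only the matching IDs are touched.
  def jobIdsForGroup(group: String): Seq[Int] =
    jobIdsByGroup.get(group).map(_.toSeq.sorted).getOrElse(Seq.empty)
}
```

The trade-off in a design like this is that every code path that adds or removes a job must also update the secondary map, so that the index never reports jobs that are no longer retained.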
Changed files:
- core/src/main/scala/org/apache/spark/SparkStatusTracker.scala (1 addition, 2 deletions)
- core/src/main/scala/org/apache/spark/ui/jobs/JobProgressListener.scala (21 additions, 2 deletions)
- core/src/test/scala/org/apache/spark/ui/jobs/JobProgressListenerSuite.scala (29 additions, 2 deletions)