-
- Downloads
[SPARK-4612] Reduce task latency and increase scheduling throughput by making...
[SPARK-4612] Reduce task latency and increase scheduling throughput by making configuration initialization lazy https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L337 creates a configuration object for every task that is launched, even if there is no new dependent file/JAR to update. This is a heavy-weight creation that should be avoided if there is no new file/JAR to update. This PR makes that creation lazy. Quick local test in spark-perf scheduling throughput tests gives the following numbers in a local standalone scheduler mode. 1 job with 10000 tasks: before 7.8395 seconds, after 2.6415 seconds = 3x increase in task scheduling throughput pwendell JoshRosen Author: Tathagata Das <tathagata.das1565@gmail.com> Closes #3463 from tdas/lazy-config and squashes the following commits: c791c1e [Tathagata Das] Reduce task latency by making configuration initialization lazy
Please register or sign in to comment