  Feb 08, 2016
    • [SPARK-10620][SPARK-13054] Minor addendum to #10835 · eeaf45b9
      Andrew Or authored
      Additional changes to #10835, mainly related to style and visibility. This patch also adds back a few deprecated methods for backward compatibility.
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #10958 from andrewor14/task-metrics-to-accums-followups.
    • [SPARK-13210][SQL] Catch OOM when allocating memory and expanding array · 37bc203c
      Davies Liu authored
      There is a bug when we try to grow the buffer: an OOM is wrongly ignored (the assert is also skipped by the JVM), so we try to grow the array again; this second attempt triggers spilling, which frees the current page, and the record we just inserted becomes invalid.

      The root cause is that the JVM has less free memory than the MemoryManager thinks, so allocating a page can OOM without triggering spilling. We should catch the OOM and acquire the memory again, which triggers spilling.

      Also, we no longer grow the array in `insertRecord` of `InMemorySorter` (it was there just for easy testing).
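      For illustration, a minimal self-contained sketch of the retry-after-spill idea described above (all names are hypothetical, not Spark's actual internals):

      ```scala
      object GrowArraySketch {
        private var page: Array[Long] = new Array[Long](1024)

        // Stub: in Spark this would write in-memory records to disk and free their pages.
        private def spill(): Unit = println("spilling to free memory")

        def grow(): Unit = {
          page = try {
            // May OOM: the JVM can have less free memory than the MemoryManager assumes.
            new Array[Long](page.length * 2)
          } catch {
            case _: OutOfMemoryError =>
              spill() // free memory first, then retry the allocation once
              new Array[Long](page.length * 2)
          }
        }
      }
      ```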
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #11095 from davies/fix_expand.
  Feb 05, 2016
    • [SPARK-13171][CORE] Replace future calls with Future · 6883a512
      Jakob Odersky authored
      A trivial search-and-replace to eliminate deprecation warnings when compiling with Scala 2.11. The change also works with 2.10.
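      The change amounts to the following minimal sketch: `scala.concurrent.future` is the helper deprecated in 2.11, and `Future.apply` is its drop-in replacement.

      ```scala
      import scala.concurrent.Future
      import scala.concurrent.ExecutionContext.Implicits.global

      object FutureVsFuture {
        val before = scala.concurrent.future { 1 + 1 } // deprecated in 2.11
        val after  = Future { 1 + 1 }                  // same semantics, no warning
      }
      ```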
      
      Author: Jakob Odersky <jakob@odersky.com>
      
      Closes #11085 from jodersky/SPARK-13171.
    • [SPARK-13002][MESOS] Send initial request of executors for dyn allocation · 0bb5b733
      Luc Bourlier authored
      Fix for [SPARK-13002](https://issues.apache.org/jira/browse/SPARK-13002) about the initial number of executors when running with dynamic allocation on Mesos.
      Instead of fixing it just for the Mesos case, the change is made in `ExecutorAllocationManager`, which already drives the number of executors running on Mesos, only not the initial value.
      
      `None` and `Some(0)` are internal details of how the resources to reserve are computed in the Mesos backend scheduler. `executorLimitOption` has to be initialized correctly, otherwise the Mesos backend scheduler will either create too many executors at launch, or not create any executors and be unable to recover from this state.
      
      Removed the 'special case' description in the doc. It was not totally accurate, and is not needed anymore.
      
      This doesn't fix the same problem visible with Spark standalone. There is no straightforward way to send the initial value in standalone mode.
      
      Somebody familiar with this part of the YARN support should review this change.
      
      Author: Luc Bourlier <luc.bourlier@typesafe.com>
      
      Closes #11047 from skyluc/issue/initial-dyn-alloc-2.
    • [SPARK-13208][CORE] Replace use of Pairs with Tuple2s · 352102ed
      Jakob Odersky authored
      Another trivial deprecation fix for Scala 2.11
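      A minimal sketch of the deprecated alias and its replacement (illustrative values only):

      ```scala
      object PairVsTuple2 {
        val before = Pair(1, "one")   // Predef.Pair, deprecated in 2.11
        val after  = Tuple2(1, "one") // or simply (1, "one")
      }
      ```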
      
      Author: Jakob Odersky <jakob@odersky.com>
      
      Closes #11089 from jodersky/SPARK-13208.
  Feb 04, 2016
    • [SPARK-13052] waitingApps metric doesn't show the number of apps currently in the WAITING state · 6dbfc407
      Raafat Akkad authored
      Author: Raafat Akkad <raafat.akkad@gmail.com>
      
      Closes #10959 from RaafatAkkad/master.
    • 7a4b37f0
      Andrew Or authored
    • [SPARK-12330][MESOS][HOTFIX] Rename timeout config · c756bda4
      Andrew Or authored
      The config already describes time and accepts a general format that is not restricted to ms. This commit renames the internal config to use a format that's consistent in Spark.
    • [SPARK-13053][TEST] Unignore tests in InternalAccumulatorSuite · 15205da8
      Andrew Or authored
      These were ignored because they are incorrectly written; they don't actually trigger stage retries, which is what the tests are testing. These tests are now rewritten to induce stage retries through fetch failures.
      
      Note: there were 2 tests before and now there's only 1. What happened? It turns out that the case where we only resubmit a subset of the original missing partitions is very difficult to simulate in tests without potentially introducing flakiness. This is because the `DAGScheduler` removes all map outputs associated with a given executor when this happens; we would need multiple executors to trigger this case, and sometimes the scheduler still removes map outputs from all executors.
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #10969 from andrewor14/unignore-accum-test.
    • [SPARK-13162] Standalone mode does not respect initial executors · 4120bcba
      Andrew Or authored
      Currently the Master would always set an application's initial executor limit to infinity. If the user specified `spark.dynamicAllocation.initialExecutors`, the config would not take effect. This is similar to #11047 but for standalone mode.
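      For illustration, a configuration that this fix makes effective in standalone mode (a minimal sketch; the count of 4 is just an example):

      ```scala
      import org.apache.spark.SparkConf

      object InitialExecutorsConf {
        val conf = new SparkConf()
          .set("spark.dynamicAllocation.enabled", "true")
          .set("spark.shuffle.service.enabled", "true") // required for dynamic allocation
          .set("spark.dynamicAllocation.initialExecutors", "4") // now honored by the Master
      }
      ```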
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #11054 from andrewor14/standalone-da-initial.
    • [SPARK-13164][CORE] Replace deprecated synchronized buffer in core · 62a7c283
      Holden Karau authored
      Building with Scala 2.11 results in the warning: "trait SynchronizedBuffer in package mutable is deprecated: Synchronization via traits is deprecated as it is inherently unreliable. Consider java.util.concurrent.ConcurrentLinkedQueue as an alternative." Investigation shows we are already using ConcurrentLinkedQueue in other locations, so this switches our uses of SynchronizedBuffer to ConcurrentLinkedQueue.
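      A minimal before/after sketch of the migration (illustrative names, not the actual call sites):

      ```scala
      import java.util.concurrent.ConcurrentLinkedQueue
      import scala.collection.mutable.{ArrayBuffer, SynchronizedBuffer}

      object BufferMigration {
        // Before: synchronization mixed in via a trait, deprecated in Scala 2.11.
        val before = new ArrayBuffer[String] with SynchronizedBuffer[String]

        // After: a lock-free queue from java.util.concurrent.
        val after = new ConcurrentLinkedQueue[String]()

        def record(event: String): Unit = { before += event; after.add(event) }
      }
      ```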
      
      Author: Holden Karau <holden@us.ibm.com>
      
      Closes #11059 from holdenk/SPARK-13164-replace-deprecated-synchronized-buffer-in-core.
    • [SPARK-12330][MESOS] Fix mesos coarse mode cleanup · 2eaeafe8
      Charles Allen authored
      In the current implementation the Mesos coarse scheduler does not wait for the Mesos tasks to complete before ending the driver. This causes a race where the task has to finish cleaning up before the Mesos driver terminates it with a SIGINT (and a SIGKILL after 3 seconds if the SIGINT doesn't work).

      This PR makes the Mesos coarse scheduler wait for the Mesos tasks to finish (with a timeout defined by `spark.mesos.coarse.shutdown.ms`).

      This PR also fixes a regression caused by [SPARK-10987] whereby submitting a shutdown causes a race between the local shutdown procedure and the notification of the scheduler driver disconnection. If the scheduler driver disconnection wins the race, the coarse executor incorrectly exits with status 1 (instead of the proper status 0).

      With this patch the Mesos coarse scheduler terminates properly, the executors clean up, and the tasks are reported as `FINISHED` in the Mesos console (as opposed to `KILLED` in < 1.6 or `FAILED` in 1.6 and later).
      
      Author: Charles Allen <charles@allen-net.com>
      
      Closes #10319 from drcrallen/SPARK-12330.
    • [SPARK-13113][CORE] Remove unnecessary bit operation when decoding page number · d3908714
      Liang-Chi Hsieh authored
      JIRA: https://issues.apache.org/jira/browse/SPARK-13113
      
      Since we shift the bits right, the bitwise AND operation looks unnecessary.
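      A minimal sketch of the simplification (the 13-bit page number / 51-bit offset layout is assumed here for illustration):

      ```scala
      object PageAddressSketch {
        private val OffsetBits = 51

        // Before: mask out the upper 13 bits, then shift them down.
        def decodeWithMask(addr: Long): Int =
          ((addr & (0x1FFFL << OffsetBits)) >>> OffsetBits).toInt

        // After: the unsigned right shift alone already isolates the upper
        // 13 bits, so the bitwise AND is redundant.
        def decode(addr: Long): Int = (addr >>> OffsetBits).toInt
      }
      ```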
      
      Author: Liang-Chi Hsieh <viirya@gmail.com>
      
      Closes #11002 from viirya/improve-decodepagenumber.
  Feb 03, 2016
    • [SPARK-13152][CORE] Fix task metrics deprecation warning · a8e2ba77
      Holden Karau authored
      Make an internal non-deprecated version of incBytesRead and incRecordsRead so we don't have unnecessary deprecation warnings in our build.
      
      Right now incBytesRead and incRecordsRead are marked as deprecated and for internal use only. We should make private[spark] versions which are not deprecated and switch to those internally so as to not clutter up the warning messages when building.
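      A hypothetical sketch of the shape of the change (names and the deprecation message are illustrative, not the actual metrics code):

      ```scala
      package org.apache.spark

      class InputMetricsSketch {
        private var _bytesRead = 0L

        // Non-deprecated variant, visible only within Spark, carries the logic.
        private[spark] def internalIncBytesRead(v: Long): Unit = _bytesRead += v

        // The deprecated public method simply delegates, so internal callers
        // can switch to the variant above and build without warnings.
        @deprecated("use the internal variant within Spark", "2.0.0")
        def incBytesRead(v: Long): Unit = internalIncBytesRead(v)

        def bytesRead: Long = _bytesRead
      }
      ```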
      
      cc andrewor14 who did the initial deprecation
      
      Author: Holden Karau <holden@us.ibm.com>
      
      Closes #11056 from holdenk/SPARK-13152-fix-task-metrics-deprecation-warnings.
    • [SPARK-13131][SQL] Use best and average time in benchmark · de091452
      Davies Liu authored
      Best time is more stable than average time; this also adds a column for nanoseconds per row (which can be used to estimate the contribution of each component in a query).

      Having best time and average time together gives more information (we can see a kind of variance).

      Rate, time per row, and relative are all calculated using best time.
      
      The result looks like this:
      ```
      Intel(R) Core(TM) i7-4558U CPU @ 2.80GHz
      rang/filter/sum:                    Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
      -------------------------------------------------------------------------------------------
      rang/filter/sum codegen=false          14332 / 16646         36.0          27.8       1.0X
      rang/filter/sum codegen=true              845 /  940        620.0           1.6      17.0X
      ```
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #11018 from davies/gen_bench.
    • [SPARK-3611][WEB UI] Show number of cores for each executor in application web UI · 3221eddb
      Alex Bozarth authored
      Added a Cores column in the Executors UI
      
      Author: Alex Bozarth <ajbozart@us.ibm.com>
      
      Closes #11039 from ajbozarth/spark3611.
  Feb 02, 2016
    • [SPARK-7997][CORE] Add rpcEnv.awaitTermination() back to SparkEnv · 335f10ed
      Shixiong Zhu authored
      `rpcEnv.awaitTermination()` was not added in #10854 because some Streaming Python tests hung forever.
      
      This patch fixes the hang and adds `rpcEnv.awaitTermination()` back to SparkEnv.
      
      Previously, the Streaming Kafka Python tests shut down the ZooKeeper server before stopping StreamingContext. Then, when stopping StreamingContext, KafkaReceiver may hang due to https://issues.apache.org/jira/browse/KAFKA-601; hence some threads of RpcEnv's Dispatcher cannot exit and rpcEnv.awaitTermination hangs. The patch just changes the shutdown order to fix it.
      
      Author: Shixiong Zhu <shixiong@databricks.com>
      
      Closes #11031 from zsxwing/awaitTermination.
    • [SPARK-13122] Fix race condition in MemoryStore.unrollSafely() · ff71261b
      Adam Budde authored
      https://issues.apache.org/jira/browse/SPARK-13122
      
      A race condition can occur in MemoryStore's unrollSafely() method if two threads that return the same value for currentTaskAttemptId() execute this method concurrently. This change makes the operation of reading the initial amount of unroll memory used, performing the unroll, and updating the associated memory maps atomic in order to avoid this race condition.
      
      The initial proposed fix wraps all of unrollSafely() in a memoryManager.synchronized { } block. A cleaner approach might be to introduce a mechanism that synchronizes based on task attempt ID. An alternative option might be to track unroll/pending-unroll memory based on block ID rather than task attempt ID.
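      A minimal sketch of that first approach (names are illustrative, not MemoryStore's actual fields):

      ```scala
      import scala.collection.mutable

      object UnrollSketch {
        private val memoryManager = new Object // stand-in for the shared MemoryManager lock
        private val unrollMemoryMap = mutable.Map.empty[Long, Long]

        // Read, unroll, and update happen under one lock, so two threads sharing
        // a task attempt ID can no longer interleave between the steps.
        def unrollSafely(taskAttemptId: Long, bytes: Long): Unit =
          memoryManager.synchronized {
            val current = unrollMemoryMap.getOrElse(taskAttemptId, 0L) // read
            // ... unroll the block here ...
            unrollMemoryMap(taskAttemptId) = current + bytes           // update, atomic with the read
          }
      }
      ```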
      
      Author: Adam Budde <budde@amazon.com>
      
      Closes #11012 from budde/master.
  Jan 30, 2016
    • [SPARK-6363][BUILD] Make Scala 2.11 the default Scala version · 289373b2
      Josh Rosen authored
      This patch changes Spark's build to make Scala 2.11 the default Scala version. To be clear, this does not mean that Spark will stop supporting Scala 2.10: users will still be able to compile Spark for Scala 2.10 by following the instructions on the "Building Spark" page; however, it does mean that Scala 2.11 will be the default Scala version used by our CI builds (including pull request builds).
      
      The Scala 2.11 compiler is faster than 2.10, so I think we'll be able to look forward to a slight speedup in our CI builds (it looks like it's about 2X faster for the Maven compile-only builds, for instance).
      
      After this patch is merged, I'll update Jenkins to add new compile-only jobs to ensure that Scala 2.10 compilation doesn't break.
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #10608 from JoshRosen/SPARK-6363.
  Jan 27, 2016
    • [HOTFIX] Fix Scala 2.11 compilation · d702f0c1
      Andrew Or authored
      by explicitly marking annotated parameters as vals (SI-8813).
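      An illustrative sketch of the shape of the workaround (hypothetical classes, not the actual call sites):

      ```scala
      // Under SI-8813, Scala 2.11 could reject an annotation placed on a plain
      // constructor parameter (e.g. with fatal warnings), so the fix marks the
      // parameter explicitly as a val.
      class Before(@transient rdd: Seq[Int])    // the pattern that tripped 2.11
      class After(@transient val rdd: Seq[Int]) // explicit val compiles cleanly
      ```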
      
      Caused by #10835.
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #10955 from andrewor14/fix-scala211.
    • [SPARK-13021][CORE] Fail fast when custom RDDs violate RDD.partition's API contract · 32f74111
      Josh Rosen authored
      Spark's `Partition` and `RDD.partitions` APIs have a contract which requires custom implementations of `RDD.partitions` to ensure that for all `x`, `rdd.partitions(x).index == x`; in other words, the `index` reported by a partition needs to match its position in the partitions array.
      
      If a custom RDD implementation violates this contract, then Spark has the potential to become stuck in an infinite recomputation loop when recomputing a subset of an RDD's partitions, since the tasks that are actually run will not correspond to the missing output partitions that triggered the recomputation. Here's a link to a notebook which demonstrates this problem: https://rawgit.com/JoshRosen/e520fb9a64c1c97ec985/raw/5e8a5aa8d2a18910a1607f0aa4190104adda3424/Violating%2520RDD.partitions%2520contract.html
      
      In order to guard against this infinite-loop behavior, this patch modifies Spark so that it fails fast and refuses to compute RDDs whose `partitions` violate the API contract.
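      A minimal sketch of such a fail-fast check (not Spark's exact code):

      ```scala
      import org.apache.spark.Partition

      object PartitionContract {
        // Verify rdd.partitions(x).index == x before the array is used.
        def checkPartitions(partitions: Array[Partition]): Array[Partition] = {
          partitions.zipWithIndex.foreach { case (part, i) =>
            require(part.index == i,
              s"partitions($i).index == ${part.index} violates the Partition API contract")
          }
          partitions
        }
      }
      ```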
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #10932 from JoshRosen/SPARK-13021.
    • [SPARK-12895][SPARK-12896] Migrate TaskMetrics to accumulators · 87abcf7d
      Andrew Or authored
      The high level idea is that instead of having the executors send both accumulator updates and TaskMetrics, we should have them send only accumulator updates. This eliminates the need to maintain both code paths since one can be implemented in terms of the other. This effort is split into two parts:
      
      **SPARK-12895: Implement TaskMetrics using accumulators.** TaskMetrics is basically just a bunch of accumulable fields. This patch makes TaskMetrics a syntactic wrapper around a collection of accumulators so we don't need to send TaskMetrics from the executors to the driver.
      
      **SPARK-12896: Send only accumulator updates to the driver.** Now that TaskMetrics are expressed in terms of accumulators, we can capture all TaskMetrics values if we just send accumulator updates from the executors to the driver. This completes the parent issue SPARK-10620.
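      A highly simplified sketch of the idea (hypothetical types, not the real TaskMetrics or Accumulator classes):

      ```scala
      import java.util.concurrent.atomic.AtomicLong

      // Each metric is backed by a named accumulator-like cell, so shipping the
      // task's accumulator updates to the driver carries every metric for free.
      class MetricCell(val name: String) {
        private val v = new AtomicLong(0L)
        def add(delta: Long): Unit = v.addAndGet(delta)
        def value: Long = v.get
      }

      class TaskMetricsSketch {
        val bytesRead   = new MetricCell("internal.metrics.input.bytesRead")
        val recordsRead = new MetricCell("internal.metrics.input.recordsRead")

        // The only payload sent to the driver: (accumulator name, value) pairs.
        def accumulatorUpdates: Seq[(String, Long)] =
          Seq(bytesRead, recordsRead).map(c => (c.name, c.value))
      }
      ```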
      
      While an effort has been made to preserve as much of the public API as possible, there were a few known breaking DeveloperApi changes that would be very awkward to maintain. I will gather the full list shortly and post it here.
      
      Note: This was once part of #10717. This patch is split out into its own patch from there to make it easier for others to review. Other smaller pieces have already been merged into master.
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #10835 from andrewor14/task-metrics-use-accums.