  1. Jan 03, 2014
  2. Jan 02, 2014
    • Merge pull request #323 from tgravescs/sparkconf_yarn_fix · 498a5f0a
      Patrick Wendell authored

      Fix Spark on YARN after the SparkConf changes.

      This fixes it so that Spark on YARN now compiles and works after the SparkConf changes.

      There are also other issues discovered along the way that are still broken:
      - Maven builds for YARN don't assemble correctly.
      - An unset SPARK_EXAMPLES_JAR is no longer handled properly.
      - spark.conf most likely doesn't actually work, since it's not distributed with YARN.

      Those things can be fixed in a separate PR unless others disagree.
    • Merge pull request #320 from kayousterhout/erroneous_failed_msg · 0475ca8f
      Reynold Xin authored

      Remove erroneous FAILED state for killed tasks.

      Currently, when tasks are killed, the Executor first sends a
      status update for the task with a "KILLED" state, and then
      sends a second status update with a "FAILED" state saying that
      the task failed due to an exception. The second FAILED state is
      misleading and unnecessary, and occurs because of the
      NonLocalReturnControl exception that gets thrown due to the way
      we kill tasks. This commit eliminates that problem.

      I'm not at all sure that this is the best way to fix this problem,
      so alternate suggestions are welcome. @rxin, guessing you're the
      right person to look at this.
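The failure mode described above can be sketched outside of Spark: when a kill is delivered by raising a control-flow exception inside the running task, a naive runner first reports the kill and then also reports the same exception as a failure. The names below (TaskKilledError, run_task, updates) are illustrative stand-ins, not Spark's actual classes.

```python
class TaskKilledError(Exception):
    """Control-flow exception used to interrupt a running task."""

def run_task(task, updates):
    """Run `task`, appending (state, reason) status updates to `updates`."""
    try:
        task()
    except TaskKilledError:
        updates.append(("KILLED", "task was killed"))
        # Without this early return, a generic failure handler would also
        # fire on the same control-flow exception and emit a misleading
        # second FAILED update -- the behavior the commit above removes.
        return
    except Exception as exc:
        updates.append(("FAILED", repr(exc)))
        return
    updates.append(("FINISHED", None))

def killed_task():
    raise TaskKilledError()

updates = []
run_task(killed_task, updates)
# A killed task yields exactly one KILLED update and no FAILED update.
```

The fix amounts to recognizing the kill-specific exception before the catch-all failure path, so a kill is reported once with the correct terminal state.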
    • fix yarn-client · fced7885
      Thomas Graves authored
    • Fix yarn build after sparkConf changes · c6de982b
      Thomas Graves authored
    • Merge pull request #297 from tdas/window-improvement · 588a1695
      Patrick Wendell authored

      Improvements to DStream window ops and refactoring of Spark's CheckpointSuite

      - Added a new RDD: PartitionerAwareUnionRDD. Using this RDD, one can take multiple RDDs partitioned by the same partitioner and unify them into a single RDD while preserving the partitioner. So m RDDs with p partitions each are unified into a single RDD with p partitions and the same partitioner. The preferred location for each partition of the unified RDD is the most common preferred location of the corresponding partitions of the parent RDDs. For example, partition 0 of the unified RDD will be located where most of the parent RDDs' partition 0 is located.
      - Improved the performance of DStream's reduceByKeyAndWindow and groupByKeyAndWindow. Both operations work by doing a per-batch reduceByKey/groupByKey and then using PartitionerAwareUnionRDD to union the RDDs across the window. This eliminates a shuffle related to the window operation, which can reduce batch processing time by 30-40% for simple workloads.
      - Fixed bugs in and simplified Spark's CheckpointSuite. Some of the tests were incorrect and unreliable. Added missing tests for ZippedRDD. I can go into greater detail if necessary.
      - Added a mapSideCombine option to combineByKeyAndWindow.
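The partition-preserving union described above can be illustrated with plain lists standing in for partitioned RDDs. This is a conceptual sketch of the idea, not Spark's PartitionerAwareUnionRDD implementation; the function name and data are made up for illustration.

```python
def partitioner_aware_union(rdds):
    """Union m "RDDs" (lists of partitions) that share a partitioner.

    Partition i of the result is the concatenation of partition i of every
    parent, so the result keeps the same p partitions (and, conceptually,
    the same partitioner) rather than ending up with m * p partitions."""
    p = len(rdds[0])
    assert all(len(r) == p for r in rdds), "parents must share the partitioner"
    return [[kv for r in rdds for kv in r[i]] for i in range(p)]

# Three parents, each hash-partitioned into 2 partitions by key % 2.
a = [[(0, "a0"), (2, "a2")], [(1, "a1")]]
b = [[(4, "b4")], [(3, "b3")]]
c = [[(6, "c6")], [(5, "c5")]]

u = partitioner_aware_union([a, b, c])
# Still 2 partitions: every even key stays in partition 0 and every odd key
# in partition 1, so a subsequent reduceByKey needs no shuffle -- the saving
# the window operations above exploit.
```

Because co-located keys stay co-located, the per-window reduce can run partition-locally, which is where the quoted 30-40% batch-time reduction comes from.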
    • Merge pull request #319 from kayousterhout/remove_error_method · 5e67cdc8
      Reynold Xin authored

      Removed redundant TaskSetManager.error() function.

      This function was left over from a while ago; it now just
      passes all calls through to the abort() function, so this
      commit deletes it.
    • Merge pull request #311 from tmyklebu/master · ca67909c
      Matei Zaharia authored

      SPARK-991: Report information gleaned from a Python stacktrace in the UI

      Scala:

      - Added setCallSite/clearCallSite to SparkContext and JavaSparkContext.
        These functions mutate a LocalProperty called "externalCallSite".
      - Added a wrapper, getCallSite, that checks for an externalCallSite and, if
        none is found, calls the usual Utils.formatSparkCallSite.
      - Changed everything that calls Utils.formatSparkCallSite to call
        getCallSite instead, except getCallSite itself.
      - Added setCallSite/clearCallSite wrappers to JavaSparkContext.

      Python:

      - Added a gruesome hack to rdd.py that inspects the traceback and guesses
        what you want to see in the UI.
      - Added a RAII wrapper around said gruesome hack that calls
        setCallSite/clearCallSite as appropriate.
      - Wired said RAII wrapper up around three calls into the Scala code.
        I'm not sure that I hit all the spots with the RAII wrapper. I'm also
        not sure that my gruesome hack does exactly what we want.

      One could also approach this change by refactoring
      runJob/submitJob/runApproximateJob to take a call site, then threading
      that parameter through everything that needs to know it.

      One might object to the pointless-looking wrappers in JavaSparkContext.
      Unfortunately, I can't directly access the SparkContext from
      Python (or, if I can, I don't know how), so I need to wrap everything
      that matters in JavaSparkContext.

      Conflicts:
      	core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala
    • Remove erroneous FAILED state for killed tasks. · a1b438d9
      Kay Ousterhout authored

      Currently, when tasks are killed, the Executor first sends a
      status update for the task with a "KILLED" state, and then
      sends a second status update with a "FAILED" state saying that
      the task failed due to an exception. The second FAILED state is
      misleading and unnecessary, and occurs because of the
      NonLocalReturnControl exception that gets thrown due to the way
      we kill tasks. This commit eliminates that problem.
    • Removed redundant TaskSetManager.error() function. · 5a3c00c9
      Kay Ousterhout authored

      This function was left over from a while ago; it now just
      passes all calls through to the abort() function, so this
      commit deletes it.
    • Ignoring tests for now; contrary to what I assumed, these tests make sense given what they are testing. · 436f3d28
      Prashant Sharma authored
    • 6be4c111
      Prashant Sharma authored
    • 8821c3a5
      Prashant Sharma authored
  3. Jan 01, 2014
  4. Dec 31, 2013