- Nov 11, 2013
Ankur Dave authored
Matei Zaharia authored
add tachyon module
- Nov 10, 2013
Haoyuan Li authored
Matei Zaharia authored
Three Kryo-related changes:
1. Call Kryo setReferences before calling the user-specified Kryo registrator, so that the user's registrator can override the default setting.
2. Register more internal classes (MapStatus, BlockManagerId).
3. Slightly refactored the internal class registration to allocate less memory.
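A minimal sketch of point 1 above, assuming a hypothetical `newKryo()` factory (the `Kryo` and `KryoRegistrator` APIs are real; the factory and the default value shown are illustrative): Spark applies its defaults and internal registrations first, then invokes the user's registrator, so the user can override them.

```scala
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.serializer.KryoRegistrator

// A user-supplied registrator can flip reference tracking back on because it
// runs *after* Spark's default call to setReferences.
class MyRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo) {
    kryo.setReferences(true)                // overrides the default set below
    kryo.register(classOf[Array[Double]])   // app-specific classes
  }
}

// Hypothetical factory illustrating the call order described in the commit.
def newKryo(userRegistrator: Option[KryoRegistrator]): Kryo = {
  val kryo = new Kryo()
  kryo.setReferences(false)                 // Spark-side default set first (illustrative value)
  // ... Spark-internal classes (MapStatus, BlockManagerId, ...) registered here ...
  userRegistrator.foreach(_.registerClasses(kryo))  // user registrator runs last
  kryo
}
```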
Reynold Xin authored
Moved the Spark internal class registration for Kryo into an object, and added more classes (e.g. MapStatus, BlockManagerId) to the registration.
Haoyuan Li authored
Reynold Xin authored
- Nov 09, 2013
Matei Zaharia authored
Add spark-tools assembly to spark-class's classpath. This commit adds an assembly for `spark-tools` and adds it to `spark-class`'s classpath, allowing the JavaAPICompletenessChecker to be run against Spark 0.8+ with `./spark-class org.apache.spark.tools.JavaAPICompletenessChecker`. Previously, this tool was run through the `run` script. I chose to add this to `run-example` because I didn't want to duplicate code in a `run-tool` script.
Matei Zaharia authored
Replace the thread inside ClusterScheduler.start() with an Akka scheduler. Threads are precious resources, so we shouldn't abuse them.
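A rough sketch of the pattern, not the actual ClusterScheduler code (the `schedule` call shown is the Akka 2.x variant; exact signatures vary by Akka version): the periodic work is handed to the existing ActorSystem's scheduler instead of a dedicated, mostly-sleeping thread.

```scala
import akka.actor.ActorSystem
import scala.concurrent.duration._

// Fire `check()` every `intervalMs` milliseconds using the ActorSystem's
// scheduler, rather than spawning a thread that loops over Thread.sleep.
def startPeriodicCheck(system: ActorSystem, intervalMs: Long)(check: () => Unit) {
  import system.dispatcher  // execution context for the scheduled task
  system.scheduler.schedule(intervalMs.milliseconds, intervalMs.milliseconds) {
    check()
  }
}
```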
Reynold Xin authored
Don't reset job group when a new job description is set.
Reynold Xin authored
Matei Zaharia authored
Fix secure HDFS access for Spark on YARN. https://github.com/apache/incubator-spark/pull/23 broke secure HDFS access. Not sure if it works with secure HDFS in standalone mode; fixing it at least for Spark on YARN. The broadcasting-of-jobconf change also broke secure HDFS access, as it didn't take into account code paths that call getPartitions before the SparkContext is initialized. The DAGScheduler does this when it tries to getShuffleMapStage.
Josh Rosen authored
This allows the JavaAPICompletenessChecker to be run with Spark 0.8+.
Matei Zaharia authored
Propagate SparkContext local properties from spark-repl caller thread to the repl execution thread.
soulmachine authored
Reynold Xin authored
Propagate the SparkContext local property from the thread that calls the spark-repl to the actual execution thread.
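A small illustrative sketch of the local-properties propagation described above, not the repl's actual wiring, using the public `getLocalProperty`/`setLocalProperty` API. SparkContext local properties are per-thread, so they must be copied explicitly from the caller thread to the thread that executes the interpreted line; the class name here is hypothetical.

```scala
import org.apache.spark.SparkContext

// Snapshot selected local properties on the calling thread, then re-apply them
// on whatever thread ends up running the work.
class LocalPropertyPropagator(sc: SparkContext, keys: Seq[String]) {
  // captured on the caller thread at construction time
  private val snapshot = keys.map(k => k -> sc.getLocalProperty(k))

  def applyOnCurrentThread() {
    snapshot.foreach { case (k, v) => if (v != null) sc.setLocalProperty(k, v) }
  }
}

// usage sketch: construct on the caller thread, then call applyOnCurrentThread()
// at the start of the repl execution thread before running the interpreted line
```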
- Nov 08, 2013
Aaron Davidson authored
tgravescs authored
tgravescs authored
tgravescs authored
- Nov 07, 2013
Reynold Xin authored
Include appId in executor cmd line args. Add the appId back into the executor cmd line args. I also made a pretty lame regression test, just to make sure it doesn't get dropped in the future. Not sure it will run on the build server, though, because `ExecutorRunner.buildCommandSeq()` expects to be able to run the scripts in `bin`.
Imran Rashid authored
Imran Rashid authored
Reynold Xin authored
Add Spark multi-user support for standalone mode and Mesos. This PR adds multi-user support for Spark in both standalone mode and Mesos (coarse- and fine-grained) mode: the user can specify the user name that submits the app through the environment variable `SPARK_USER`, or use the default one. The executor will communicate with Hadoop using the specified user name. I also fixed one bug in JobLogger that occurred when a different user wrote the job log to a specified folder without the right file permissions. I separated the previous [PR750](https://github.com/mesos/spark/pull/750) into two PRs; in this PR I only solve the multi-user support problem. I will try to solve the security auth problem in a subsequent PR, because security auth is a complicated problem, especially for a long-running app like Shark Server (both the Kerberos TGT and the HDFS delegation token should be renewed or re-created throughout the app's run time).
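A hedged sketch of the mechanism described above, not Spark's exact code path: Hadoop-facing work is wrapped in a `UserGroupInformation.doAs` for the user named by `SPARK_USER`, falling back to the current OS user. The helper name is made up for illustration.

```scala
import java.security.PrivilegedExceptionAction
import org.apache.hadoop.security.UserGroupInformation

// Run `body` as the user given by SPARK_USER so that HDFS and other Hadoop
// services see that identity instead of the daemon's OS user.
def runAsSparkUser[T](body: () => T): T = {
  val user = Option(System.getenv("SPARK_USER")).getOrElse(System.getProperty("user.name"))
  val ugi = UserGroupInformation.createRemoteUser(user)
  ugi.doAs(new PrivilegedExceptionAction[T] {
    override def run(): T = body()
  })
}
```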
Imran Rashid authored
- Nov 06, 2013
jerryshao authored
Reynold Xin authored
Removed unused return value in SparkContext.runJob. The return type of this `runJob` version is `Unit`:

    def runJob[T, U: ClassManifest](
        rdd: RDD[T],
        func: (TaskContext, Iterator[T]) => U,
        partitions: Seq[Int],
        allowLocal: Boolean,
        resultHandler: (Int, U) => Unit) { ... }

It's obviously unnecessary to "return" `result`.
Reynold Xin authored
Attempt to fix SparkListenerSuite breakage. Could not reproduce locally, but this test could've been flaky if the build machine was too fast, due to a typo. (Index 0 is intentionally slowed down to ensure the total time is >= 1 ms.) This should be merged into branch-0.8 as well.
Aaron Davidson authored
Could not reproduce locally, but this test could've been flaky if the build machine was too fast.
Lian, Cheng authored
Reynold Xin authored
Ignore a task status update if the executor doesn't exist anymore. Otherwise, if the scheduler receives a task update message after the executor has been removed, the scheduler would hang. It is pretty hard to add unit tests for this right now because it is hard to mock the cluster scheduler. We should do that once @kayousterhout finishes merging the local scheduler and the cluster scheduler.
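A toy sketch of the guard described above (the names are illustrative, not the cluster scheduler's actual internals): a status update for an executor that has already been removed is dropped instead of being processed against bookkeeping that no longer exists.

```scala
// Drop stale updates early; processing them would touch per-executor state
// that was cleaned up when the executor was removed.
def statusUpdate(executorId: String, taskId: Long,
                 activeExecutors: collection.Set[String]) {
  if (!activeExecutors.contains(executorId)) {
    return  // executor gone: ignore the update rather than hang the scheduler
  }
  // ... normal handling of the task state update ...
}
```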
- Nov 05, 2013
Reynold Xin authored
Reynold Xin authored
Use case class deep matching to simplify code in DAGScheduler.processEvent. Since all `XxxEvent`s pushed into `DAGScheduler.eventQueue` are case classes, deep pattern matching is a more convenient way to extract the components of an event object.
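A toy illustration of the pattern, with event types modeled loosely after (not copied from) the DAGScheduler's: matching on the case class constructor binds its fields directly, instead of extracting them with accessors after a type test.

```scala
// Deep matching: each case both checks the event type and extracts its fields.
sealed trait DAGEvent
case class JobSubmitted(jobId: Int, partitions: Seq[Int]) extends DAGEvent
case class TaskCompleted(jobId: Int, taskId: Long) extends DAGEvent

def processEvent(event: DAGEvent): String = event match {
  case JobSubmitted(jobId, partitions) =>
    "job " + jobId + " over " + partitions.size + " partitions"
  case TaskCompleted(jobId, taskId) =>
    "task " + taskId + " of job " + jobId + " finished"
}
```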
Lian, Cheng authored
- Nov 04, 2013
Reynold Xin authored
Never store shuffle blocks in BlockManager. After the BlockId refactor (PR #114), it became very clear that ShuffleBlocks are of no use within BlockManager (they had a no-arg constructor!). This patch completely eliminates them, saving us around 100-150 bytes per shuffle block. The total, system-wide overhead per shuffle block is now a flat 8 bytes, excluding state saved by the MapOutputTracker. Note: this should *not* be merged directly into 0.8.0 -- see #138.
Aaron Davidson authored
After the BlockId refactor (PR #114), it became very clear that ShuffleBlocks are of no use within BlockManager (they had a no-arg constructor!). This patch completely eliminates them, saving us around 100-150 bytes per shuffle block. The total, system-wide overhead per shuffle block is now a flat 8 bytes, excluding state saved by the MapOutputTracker.
Reynold Xin authored
Add javadoc to JobLogger, and some small fixes against SPARK-941: add javadoc to JobLogger, output more info for RDD, and modify recordStageDepGraph to avoid outputting duplicate stage dependency information. (cherry picked from commit 518cf22e) Signed-off-by: Reynold Xin <rxin@apache.org>
Reynold Xin authored
Memory-optimized shuffle file consolidation. Reduces the overhead of each shuffle block for consolidation from >300 bytes to 8 bytes (1 primitive Long). Verified via profiler testing with 1 million shuffle blocks; the net overhead was ~8,400,000 bytes. Despite the memory-optimized implementation incurring extra CPU overhead, the runtime of the shuffle phase in this test was only around 2% slower, while the reduce phase was 40% faster, compared to not using any shuffle file consolidation.

This is accomplished by replacing the map from ShuffleBlockId to FileSegment (i.e., block id to where it's located), which had high overhead due to being a gigantic, timestamped, concurrent map, with a more space-efficient structure. Namely, the following are introduced (I have omitted the word "Shuffle" from some names for clarity):

**ShuffleFile** - there is one ShuffleFile per consolidated shuffle file on disk. We store an array of offsets into the physical shuffle file for each ShuffleMapTask that wrote into the file. This is sufficient to reconstruct FileSegments for mappers that are in the file.

**FileGroup** - contains a set of ShuffleFiles, one per reducer, that a MapTask can use to write its output. There is one FileGroup created per _concurrent_ MapTask. The FileGroup contains an array of the mapIds that have been written to all files in the group. The positions of elements in this array map directly onto the positions in each ShuffleFile's offsets array.

In order to locate the FileSegment associated with a BlockId, we have another structure which maps each reducer to the set of ShuffleFiles that were created for it. (There will be as many ShuffleFiles per reducer as there are FileGroups.) To look up a given ShuffleBlockId (shuffleId, reducerId, mapId), we thus search through all ShuffleFiles associated with that reducer.

As a time optimization, we ensure that FileGroups are only reused for MapTasks with monotonically increasing mapIds. This allows us to perform a binary search to locate a mapId inside a group, and also enables a potential future optimization (based on the usual monotonic access order).
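A simplified model of the lookup path described above; the class and field names paraphrase the description rather than Spark's actual source. Each group records the mapIds it holds (in increasing order) and, for each reducer, the byte offsets at which each map task's output begins, which is enough to rebuild a FileSegment.

```scala
import java.util.Arrays

case class FileSegment(path: String, offset: Long, length: Long)

// One consolidated file per reducer: offsets(i) is where the i-th map task's
// output begins inside this file.
class ConsolidatedFile(val path: String, val offsets: Array[Long], val fileLength: Long) {
  def segmentAt(i: Int): FileSegment = {
    val end = if (i + 1 < offsets.length) offsets(i + 1) else fileLength
    FileSegment(path, offsets(i), end - offsets(i))
  }
}

// One group per concurrent map task slot: mapIds are appended in increasing
// order, and their positions line up with each file's offsets array.
class FileGroup(val mapIds: Array[Int], val filePerReducer: Array[ConsolidatedFile]) {
  def segmentFor(mapId: Int, reducerId: Int): Option[FileSegment] = {
    val i = Arrays.binarySearch(mapIds, mapId)   // valid because mapIds are sorted
    if (i >= 0) Some(filePerReducer(reducerId).segmentAt(i)) else None
  }
}

// Lookup for a shuffle block (reducerId, mapId): scan the groups created for
// this shuffle until one of them contains the mapId.
def locate(groups: Seq[FileGroup], mapId: Int, reducerId: Int): Option[FileSegment] =
  groups.iterator.map(_.segmentFor(mapId, reducerId)).collectFirst { case Some(seg) => seg }
```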
Aaron Davidson authored
Aaron Davidson authored
- ShuffleBlocks has been removed and replaced by ShuffleWriterGroup.
- ShuffleWriterGroup no longer contains a reference to a ShuffleFileGroup.
- ShuffleFile has been removed and its contents are now within ShuffleFileGroup.
- ShuffleBlockManager.forShuffle has been replaced by a more stateful forMapTask.