  1. Sep 26, 2014
    • SPARK-3639 | Removed settings master in examples · d6ed5abf
      aniketbhatnagar authored
      
      This patch removes the hard-coded local master in the Kinesis examples so that users can set it via spark-submit.
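      
      For illustration, a minimal sketch of the resulting pattern (example class name assumed, not taken from the patch):
      
      ```scala
      import org.apache.spark.SparkConf
      
      // No .setMaster("local[*]") any more: the master now comes from the
      // submission command, e.g. `spark-submit --master local[4] ...`.
      val sparkConf = new SparkConf().setAppName("KinesisWordCountASL")
      ```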
      
      Author: aniketbhatnagar <aniket.bhatnagar@gmail.com>
      
      Closes #2536 from aniketbhatnagar/Kinesis-Examples-Master-Unset and squashes the following commits:
      
      c9723ac [aniketbhatnagar] Merge remote-tracking branch 'origin/Kinesis-Examples-Master-Unset' into Kinesis-Examples-Master-Unset
      fec8ead [aniketbhatnagar] SPARK-3639 | Removed settings master in examples
      31cdc59 [aniketbhatnagar] SPARK-3639 | Removed settings master in examples
      
      (cherry picked from commit d16e161d)
      Signed-off-by: Andrew Or <andrewor14@gmail.com>
  2. Sep 23, 2014
    • [SPARK-1853] Show Streaming application code context (file, line number) in Spark Stages UI · 505ed6ba
      Mubarak Seyed authored
      This is a refactored version of the original PR https://github.com/apache/spark/pull/1723
      
      Please take a look andrewor14, mubarak
      
      Author: Mubarak Seyed <mubarak.seyed@gmail.com>
      Author: Tathagata Das <tathagata.das1565@gmail.com>
      
      Closes #2464 from tdas/streaming-callsite and squashes the following commits:
      
      dc54c71 [Tathagata Das] Made changes based on PR comments.
      390b45d [Tathagata Das] Fixed minor bugs.
      904cd92 [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into streaming-callsite
      7baa427 [Tathagata Das] Refactored getCallSite and setCallSite to make it simpler. Also added unit test for DStream creation site.
      b9ed945 [Mubarak Seyed] Adding streaming utils
      c461cf4 [Mubarak Seyed] Merge remote-tracking branch 'upstream/master'
      ceb43da [Mubarak Seyed] Changing default regex function name
      8c5d443 [Mubarak Seyed] Merge remote-tracking branch 'upstream/master'
      196121b [Mubarak Seyed] Merge remote-tracking branch 'upstream/master'
      491a1eb [Mubarak Seyed] Removing streaming visibility from getRDDCreationCallSite in DStream
      33a7295 [Mubarak Seyed] Fixing review comments: Merging both setCallSite methods
      c26d933 [Mubarak Seyed] Merge remote-tracking branch 'upstream/master'
      f51fd9f [Mubarak Seyed] Fixing scalastyle, Regex for Utils.getCallSite, and changing method names in DStream
      5051c58 [Mubarak Seyed] Getting return value of compute() into variable and call setCallSite(prevCallSite) only once. Adding return for other code paths (for None)
      a207eb7 [Mubarak Seyed] Fixing code review comments
      ccde038 [Mubarak Seyed] Removing Utils import from MappedDStream
      2a09ad6 [Mubarak Seyed] Changes in Utils.scala for SPARK-1853
      1d90cc3 [Mubarak Seyed] Changes for SPARK-1853
      5f3105a [Mubarak Seyed] Merge remote-tracking branch 'upstream/master'
      70f494f [Mubarak Seyed] Changes for SPARK-1853
      1500deb [Mubarak Seyed] Changes in Spark Streaming UI
      9d38d3c [Mubarak Seyed] [SPARK-1853] Show Streaming application code context (file, line number) in Spark Stages UI
      d466d75 [Mubarak Seyed] Changes for spark streaming UI
      
      (cherry picked from commit 729952a5)
      Signed-off-by: Andrew Or <andrewor14@gmail.com>
    • [SPARK-3653] Respect SPARK_*_MEMORY for cluster mode · 5bbc621f
      Andrew Or authored
      `SPARK_DRIVER_MEMORY` was only used to start the `SparkSubmit` JVM, which becomes the driver only in client mode but not cluster mode. In cluster mode, this property is simply not propagated to the worker nodes.
      
      `SPARK_EXECUTOR_MEMORY` is picked up from `SparkContext`, but in cluster mode the driver runs on one of the worker machines, where this environment variable may not be set.
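      
      A hedged sketch of the resolution order this implies (names and default are assumptions, not the actual SparkSubmit code):
      
      ```scala
      // Prefer an explicit CLI value, then the environment variable, then a default.
      def resolveDriverMemory(cliValue: Option[String]): String =
        cliValue                                      // e.g. from --driver-memory, if given
          .orElse(sys.env.get("SPARK_DRIVER_MEMORY")) // must now reach cluster-mode drivers too
          .getOrElse("512m")                          // assumed default
      ```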
      
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #2500 from andrewor14/memory-env-vars and squashes the following commits:
      
      6217b38 [Andrew Or] Respect SPARK_*_MEMORY for cluster mode
      
      Conflicts:
      	core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala
    • SPARK-3612. Executor shouldn't quit if heartbeat message fails to reach ... · ffd97be3
      Sandy Ryza authored
      
      ...the driver
      
      Author: Sandy Ryza <sandy@cloudera.com>
      
      Closes #2487 from sryza/sandy-spark-3612 and squashes the following commits:
      
      2b7353d [Sandy Ryza] SPARK-3612. Executor shouldn't quit if heartbeat message fails to reach the driver
      (cherry picked from commit d79238d0)
      
      Signed-off-by: Patrick Wendell <pwendell@gmail.com>
  3. Sep 19, 2014
    • [Docs] Fix outdated docs for standalone cluster · fd883532
      andrewor14 authored
      
      This is now supported!
      
      Author: andrewor14 <andrewor14@gmail.com>
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #2461 from andrewor14/document-standalone-cluster and squashes the following commits:
      
      85c8b9e [andrewor14] Wording change per Patrick
      35e30ee [Andrew Or] Fix outdated docs for standalone cluster
      
      (cherry picked from commit 8af23706)
      Signed-off-by: Andrew Or <andrewor14@gmail.com>
    • [SPARK-2062][GraphX] VertexRDD.apply does not use the mergeFunc · 1687d6ba
      Larry Xiao authored
      
      VertexRDD.apply had a bug where it ignored the merge function for
      duplicate vertices and instead used whichever vertex attribute occurred
      first. This commit fixes the bug by passing the merge function through
      to ShippableVertexPartition.apply, which merges any duplicates using the
      merge function and then fills in missing vertices using the specified
      default vertex attribute. This commit also adds a unit test for
      VertexRDD.apply.
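      
      A plain-Scala model of the merge semantics the fix restores (GraphX API elided; ordinary collections stand in for RDDs):
      
      ```scala
      val attrs = Seq((1L, "a"), (1L, "b"), (2L, "c"))
      val mergeFunc = (a: String, b: String) => a + b
      
      // After the fix, duplicate vertex IDs are combined with mergeFunc:
      val merged = attrs.groupBy(_._1).mapValues(_.map(_._2).reduce(mergeFunc))
      // merged == Map(1L -> "ab", 2L -> "c"); before the fix, vertex 1 silently
      // kept whichever attribute happened to come first.
      ```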
      
      Author: Larry Xiao <xiaodi@sjtu.edu.cn>
      Author: Blie Arkansol <xiaodi@sjtu.edu.cn>
      Author: Ankur Dave <ankurdave@gmail.com>
      
      Closes #1903 from larryxiao/2062 and squashes the following commits:
      
      625aa9d [Blie Arkansol] Merge pull request #1 from ankurdave/SPARK-2062
      476770b [Ankur Dave] ShippableVertexPartition.initFrom: Don't run mergeFunc on default values
      614059f [Larry Xiao] doc update: note about the default null value vertices construction
      dfdb3c9 [Larry Xiao] minor fix
      1c70366 [Larry Xiao] scalastyle check: wrap line, parameter list indent 4 spaces
      e4ca697 [Larry Xiao] [TEST] VertexRDD.apply mergeFunc
      6a35ea8 [Larry Xiao] [TEST] VertexRDD.apply mergeFunc
      4fbc29c [Blie Arkansol] undo unnecessary change
      efae765 [Larry Xiao] fix mistakes: should be able to call with or without mergeFunc
      b2422f9 [Larry Xiao] Merge branch '2062' of github.com:larryxiao/spark into 2062
      52dc7f7 [Larry Xiao] pass mergeFunc to VertexPartitionBase, where merge is handled
      581e9ee [Larry Xiao] TODO: VertexRDDSuite
      20d80a3 [Larry Xiao] [SPARK-2062][GraphX] VertexRDD.apply does not use the mergeFunc
      
      (cherry picked from commit 3bbbdd81)
      Signed-off-by: Ankur Dave <ankurdave@gmail.com>
  4. Sep 16, 2014
    • [SPARK-3490] Disable SparkUI for tests (backport into 1.1) · 937de93e
      Andrew Or authored
      Original PR: #2363
      
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #2415 from andrewor14/disable-ui-for-tests-1.1 and squashes the following commits:
      
      8d9df5a [Andrew Or] Oops, missed one.
      509507d [Andrew Or] Backport #2363 (SPARK-3490) into branch-1.1
    • [SPARK-3555] Fix UISuite race condition · 856156b4
      Andrew Or authored
      
      The test "jetty selects different port under contention" is flaky.
      
      If another process binds to 4040 before the test starts, the first server we start there will fail, and the servers we start after it may still manage to bind to 4040 if the port is released in between. Instead, we should just let Java find a random free port for us and hold onto it for the duration of the test.
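      
      The standard-library idiom for this, as a sketch of the approach (not the patch itself):
      
      ```scala
      import java.net.ServerSocket
      
      // Binding to port 0 makes the OS pick a free ephemeral port; holding the
      // socket for the test's duration avoids racing other processes for 4040.
      val socket = new ServerSocket(0)
      val reservedPort = socket.getLocalPort
      try {
        // ... exercise the port-contention logic against reservedPort ...
      } finally {
        socket.close()
      }
      ```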
      
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #2418 from andrewor14/fix-port-contention and squashes the following commits:
      
      0cd4974 [Andrew Or] Stop them servers
      a7071fe [Andrew Or] Pick random port instead of 4040
      
      (cherry picked from commit 0a7091e6)
      Signed-off-by: Andrew Or <andrewor14@gmail.com>
    • [SQL][DOCS] Improve section on thrift-server · 75158a7e
      Michael Armbrust authored
      
      Taken from liancheng's updates. Merged conflicts with #2316.
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #2384 from marmbrus/sqlDocUpdate and squashes the following commits:
      
      2db6319 [Michael Armbrust] @liancheng's updates
      
      (cherry picked from commit 84073eb1)
      Signed-off-by: Michael Armbrust <michael@databricks.com>
  5. Sep 14, 2014
    • SPARK-3039: Allow spark to be built using avro-mapred for hadoop2 · 78887f94
      Bertrand Bossy authored
      
      SPARK-3039: Adds the maven property "avro.mapred.classifier" to build spark-assembly with avro-mapred with support for the new Hadoop API. Sets this property to hadoop2 for Hadoop 2 profiles.
      
      I am not very familiar with maven, nor do I know whether this potentially breaks something in the hive part of spark. There might be a more elegant way of doing this.
      
      Author: Bertrand Bossy <bertrandbossy@gmail.com>
      
      Closes #1945 from bbossy/SPARK-3039 and squashes the following commits:
      
      c32ce59 [Bertrand Bossy] SPARK-3039: Allow spark to be built using avro-mapred for hadoop2
      
      (cherry picked from commit c243b21a)
      Signed-off-by: Patrick Wendell <pwendell@gmail.com>
  6. Sep 13, 2014
    • [SQL] [Docs] typo fixes · 70f93d5a
      Nicholas Chammas authored
      
      * Fixed random typo
      * Added in missing description for DecimalType
      
      Author: Nicholas Chammas <nicholas.chammas@gmail.com>
      
      Closes #2367 from nchammas/patch-1 and squashes the following commits:
      
      aa528be [Nicholas Chammas] doc fix for SQL DecimalType
      3247ac1 [Nicholas Chammas] [SQL] [Docs] typo fixes
      
      (cherry picked from commit a523ceaf)
      Signed-off-by: Michael Armbrust <michael@databricks.com>
  7. Sep 12, 2014
    • [SPARK-3515][SQL] Moves test suite setup code to beforeAll rather than in constructor · 44e534eb
      Cheng Lian authored
      
      Please refer to the JIRA ticket for details.
      
      **NOTE** We should check all test suites that perform similar initialization-like side effects in their constructors. This PR only fixes `ParquetMetastoreSuite` because it breaks our Jenkins Maven build.
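      
      The pattern, as a minimal ScalaTest sketch (suite name invented):
      
      ```scala
      import org.scalatest.{BeforeAndAfterAll, FunSuite}
      
      class ExampleSetupSuite extends FunSuite with BeforeAndAfterAll {
        // Previously: setup ran right here in the constructor body, i.e. whenever
        // the suite class was instantiated -- including during test discovery.
        override def beforeAll(): Unit = {
          // initialization-like side effects belong here instead
        }
      }
      ```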
      
      Author: Cheng Lian <lian.cs.zju@gmail.com>
      
      Closes #2375 from liancheng/say-no-to-constructor and squashes the following commits:
      
      0ceb75b [Cheng Lian] Moves test suite setup code to beforeAll rather than in constructor
      
      (cherry picked from commit 6d887db7)
      Signed-off-by: Michael Armbrust <michael@databricks.com>
    • [SPARK-3500] [SQL] use JavaSchemaRDD as SchemaRDD._jschema_rdd · 9c06c723
      Davies Liu authored
      
      Currently, `SchemaRDD._jschema_rdd` is a SchemaRDD, so the Scala API (`coalesce()`, `repartition()`) cannot easily be called from Python: there is no way to supply the implicit parameter `ord`. `_jrdd` is a JavaRDD, so `_jschema_rdd` should also be a JavaSchemaRDD.
      
      In this patch, `_jschema_rdd` is changed to a JavaSchemaRDD, and an assert is added for it. If a method is missing from JavaSchemaRDD, it is called via `_jschema_rdd.baseSchemaRDD().xxx()`.
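      
      A toy Scala sketch of the underlying friction (invented class names, simplified signatures):
      
      ```scala
      // Scala-style API: the implicit Ordering cannot be supplied through Py4J.
      class ScalaStyleRDD {
        def coalesce(numPartitions: Int)(implicit ord: Ordering[Any] = null): ScalaStyleRDD = this
      }
      
      // Java-style wrapper: a plain signature that Py4J can invoke directly.
      class JavaStyleRDD(val underlying: ScalaStyleRDD) {
        def coalesce(numPartitions: Int): JavaStyleRDD =
          new JavaStyleRDD(underlying.coalesce(numPartitions))
      }
      ```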
      
      BTW, Do we need JavaSQLContext?
      
      Author: Davies Liu <davies.liu@gmail.com>
      
      Closes #2369 from davies/fix_schemardd and squashes the following commits:
      
      abee159 [Davies Liu] use JavaSchemaRDD as SchemaRDD._jschema_rdd
      
      (cherry picked from commit 885d1621)
      Signed-off-by: Josh Rosen <joshrosen@apache.org>
      
      Conflicts:
      	python/pyspark/tests.py
    • [SPARK-3481] [SQL] Eliminate the error log in local Hive comparison test · 6cbf83c0
      Cheng Hao authored
      
      Logically, we should remove the Hive tables/databases first and then reset the Hive configuration, repointing to the new data warehouse directory, etc.
      Otherwise, exceptions like "Database does not exist: default" are raised during local testing.
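      
      In sketch form (helper names invented for illustration):
      
      ```scala
      def dropAllTablesAndDatabases(): Unit = () // stand-in for the real cleanup
      def resetHiveConfiguration(): Unit = ()    // stand-in for the real reset
      
      def resetTestHive(): Unit = {
        dropAllTablesAndDatabases() // 1. drop while the config still points at the old warehouse
        resetHiveConfiguration()    // 2. only then repoint the data warehouse directory etc.
      }
      ```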
      
      Author: Cheng Hao <hao.cheng@intel.com>
      
      Closes #2352 from chenghao-intel/test_hive and squashes the following commits:
      
      74fd76b [Cheng Hao] eliminate the error log
      
      (cherry picked from commit 8194fc66)
      Signed-off-by: Michael Armbrust <michael@databricks.com>
    • Revert "[Spark-3490] Disable SparkUI for tests" · f17b7957
      Andrew Or authored
      This reverts commit 2ffc7980.
  8. Sep 11, 2014
    • [SPARK-3465] fix task metrics aggregation in local mode · e69deb81
      Davies Liu authored
      Before overwriting `t.taskMetrics`, take a deep copy of it.
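      
      One generic way to take such a snapshot, as a stand-in sketch (the commit's actual copy mechanism may differ):
      
      ```scala
      import java.io._
      
      // Serialization round-trip: yields an object that shares no state with the
      // original, so later mutation of t.taskMetrics can't corrupt the snapshot.
      def deepCopy[T <: Serializable](value: T): T = {
        val buffer = new ByteArrayOutputStream()
        val out = new ObjectOutputStream(buffer)
        out.writeObject(value)
        out.close()
        new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray))
          .readObject().asInstanceOf[T]
      }
      ```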
      
      Author: Davies Liu <davies.liu@gmail.com>
      
      Closes #2338 from davies/fix_metric and squashes the following commits:
      
      a5cdb63 [Davies Liu] Merge branch 'master' into fix_metric
      7c879e0 [Davies Liu] add more comments
      754b5b8 [Davies Liu] copy taskMetrics only when isLocal is true
      5ca26dc [Davies Liu] fix task metrics aggregation in local mode
    • [SPARK-3429] Don't include the empty string "" as a defaultAclUser · 4245404e
      Andrew Ash authored
      
      Changes logging from
      
      ```
      14/09/05 02:01:08 INFO SecurityManager: Changing view acls to: aash,
      14/09/05 02:01:08 INFO SecurityManager: Changing modify acls to: aash,
      14/09/05 02:01:08 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(aash, ); users with modify permissions: Set(aash, )
      ```
      to
      ```
      14/09/05 02:28:28 INFO SecurityManager: Changing view acls to: aash
      14/09/05 02:28:28 INFO SecurityManager: Changing modify acls to: aash
      14/09/05 02:28:28 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(aash); users with modify permissions: Set(aash)
      ```
      
      Note that the first set of logs has a Set of size 2 containing "aash" and the empty string "".
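      
      The shape of the fix, sketched (variable names assumed):
      
      ```scala
      // Never let "" through into the ACL set.
      val currentUser = System.getProperty("user.name", "")
      val defaultAclUsers = Set(currentUser).filter(user => !user.isEmpty)
      // user "aash"  -> Set("aash")
      // empty string -> Set(), rather than Set("")
      ```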
      
      cc tgravescs
      
      Author: Andrew Ash <andrew@andrewash.com>
      
      Closes #2286 from ash211/empty-default-acl and squashes the following commits:
      
      18cc612 [Andrew Ash] Use .isEmpty instead of ==""
      cf973a1 [Andrew Ash] Don't include the empty string "" as a defaultAclUser
      
      (cherry picked from commit ce59725b)
      Signed-off-by: Andrew Or <andrewor14@gmail.com>
    • [Spark-3490] Disable SparkUI for tests · 2ffc7980
      Andrew Or authored
      
      We currently open many ephemeral ports during the tests, and as a result we occasionally can't bind to new ones. This has caused the `DriverSuite` and the `SparkSubmitSuite` to fail intermittently.
      
      By disabling the `SparkUI` when it's not needed, we already cut down on the number of ports opened significantly, on the order of the number of `SparkContexts` ever created. We must keep it enabled for a few tests for the UI itself, however.
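      
      A sketch of how a test can opt out, using the `spark.ui.enabled` switch this change is built around:
      
      ```scala
      import org.apache.spark.SparkConf
      
      // With the UI disabled, creating a SparkContext binds no UI port at all.
      val conf = new SparkConf()
        .setAppName("unit-test")
        .set("spark.ui.enabled", "false")
      ```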
      
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #2363 from andrewor14/disable-ui-for-tests and squashes the following commits:
      
      332a7d5 [Andrew Or] No need to set spark.ui.port to 0 anymore
      30c93a2 [Andrew Or] Simplify streaming UISuite
      a431b84 [Andrew Or] Fix streaming test failures
      8f5ae53 [Andrew Or] Fix no new line at the end
      29c9b5b [Andrew Or] Disable SparkUI for tests
      
      (cherry picked from commit 6324eb7b)
      Signed-off-by: Andrew Or <andrewor14@gmail.com>
      
      Conflicts:
      	pom.xml
      	yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala
      	yarn/common/src/main/scala/org/apache/spark/scheduler/cluster/YarnClientSchedulerBackend.scala
    • [SPARK-2140] Updating heap memory calculation for YARN stable and alpha. · 06fb2d05
      Chris Cope authored
      
      Updated pull request, reflecting YARN stable and alpha states. I am getting intermittent test failures on my own test infrastructure. Is that tracked anywhere yet?
      
      Author: Chris Cope <ccope@resilientscience.com>
      
      Closes #2253 from copester/master and squashes the following commits:
      
      5ad89da [Chris Cope] [SPARK-2140] Removing calculateAMMemory functions since they are no longer needed.
      52b4e45 [Chris Cope] [SPARK-2140] Updating heap memory calculation for YARN stable and alpha.
      
      (cherry picked from commit ed1980ff)
      Signed-off-by: Thomas Graves <tgraves@apache.org>
    • HOTFIX: Changing color on doc menu · e51ce9a5
      Patrick Wendell authored
  9. Sep 09, 2014
    • [SPARK-1919] Fix Windows spark-shell --jars · 359cd59d
      Andrew Or authored
      We were trying to add `file:/C:/path/to/my.jar` to the class path. We should add `C:/path/to/my.jar` instead. Tested on Windows 8.1.
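      
      A hedged sketch of the normalization (helper name invented):
      
      ```scala
      // "file:/C:/path/to/my.jar" -> "C:/path/to/my.jar"; other strings pass through.
      def toClasspathEntry(uri: String): String =
        uri.stripPrefix("file:/")
      ```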
      
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #2211 from andrewor14/windows-shell-jars and squashes the following commits:
      
      262c6a2 [Andrew Or] Oops... Add the new code to the correct place
      0d5a0c1 [Andrew Or] Format jar path only for adding to shell classpath
      42bd626 [Andrew Or] Remove unnecessary code
      0049f1b [Andrew Or] Remove embarrassing log messages
      b1755a0 [Andrew Or] Format jar paths properly before adding them to the classpath
    • [SPARK-3061] Fix Maven build under Windows · 23fd3e8b
      Josh Rosen authored
      The Maven build was failing on Windows because it tried to call the unix `unzip` utility to extract the Py4J files into core's build directory.  I've fixed this issue by using the `maven-antrun-plugin` to perform the unzipping.
      
      I also fixed an issue that prevented tests from running under Windows:
      
      In the Maven ScalaTest plugin, the filename listed in <filereports> is placed under the <reportsDirectory>; the current code places it in a subdirectory of reportsDirectory, e.g.
      
      ```
      ${project.build.directory}/surefire-reports/${project.build.directory}/SparkTestSuite.txt
      ```
      
      This caused problems under Windows because it would try to create a subdirectory named "c:\\".
      
      Note that the tests still fail under Windows (for other reasons); this PR just allows them to run and fail rather than crash when trying to create the test reports directory.
      
      Author: Josh Rosen <joshrosen@apache.org>
      Author: Josh Rosen <rosenville@gmail.com>
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #2165 from JoshRosen/windows-support and squashes the following commits:
      
      651d210 [Josh Rosen] Unzip to python/build instead of core/build
      fbf3e61 [Josh Rosen] 4 spaces -> 2 spaces
      e347668 [Josh Rosen] Fix Maven scalatest filereports path:
      4994af1 [Josh Rosen] [SPARK-3061] Use maven-antrun-plugin to unzip Py4J.
    • [SPARK-3345] Do correct parameters for ShuffleFileGroup · e5f77ae9
      Liang-Chi Hsieh authored
      In the method `newFileGroup` of class `FileShuffleBlockManager`, the parameters used to create a new `ShuffleFileGroup` object are in the wrong order.
      
      Because the parameters `shuffleId` and `fileId` are not used in the current code, this doesn't cause a problem yet. However, it should be corrected for readability and to avoid future problems.
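      
      A toy illustration of why this class of bug compiles silently (simplified fields):
      
      ```scala
      case class FileGroup(shuffleId: Int, fileId: Int)
      
      val shuffleId = 7
      val fileId = 42
      val wrong = FileGroup(fileId, shuffleId) // compiles; both are Ints, fields silently swapped
      val right = FileGroup(shuffleId, fileId)
      // Harmless only until someone reads wrong.shuffleId expecting 7.
      ```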
      
      Author: Liang-Chi Hsieh <viirya@gmail.com>
      
      Closes #2235 from viirya/correct_shufflefilegroup_params and squashes the following commits:
      
      fe72567 [Liang-Chi Hsieh] Do correct parameters for ShuffleFileGroup.
    • [SPARK-3193] output error info when Process exit code is not zero in test suite · 24262684
      scwf authored
      https://issues.apache.org/jira/browse/SPARK-3193
      I noticed that PR tests sometimes failed because the process exit code was non-zero; see for example:
      https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/18688/consoleFull
      https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/19118/consoleFull
      
      ```
      [info] SparkSubmitSuite:
      [info] - prints usage on empty input
      [info] - prints usage with only --help
      [info] - prints error with unrecognized options
      [info] - handle binary specified but not class
      [info] - handles arguments with --key=val
      [info] - handles arguments to user program
      [info] - handles arguments to user program with name collision
      [info] - handles YARN cluster mode
      [info] - handles YARN client mode
      [info] - handles standalone cluster mode
      [info] - handles standalone client mode
      [info] - handles mesos client mode
      [info] - handles confs with flag equivalents
      [info] - launch simple application with spark-submit *** FAILED ***
      [info]   org.apache.spark.SparkException: Process List(./bin/spark-submit, --class, org.apache.spark.deploy.SimpleApplicationTest, --name, testApp, --master, local, file:/tmp/1408854098404-0/testJar-1408854098404.jar) exited with code 1
      [info]   at org.apache.spark.util.Utils$.executeAndGetOutput(Utils.scala:872)
      [info]   at org.apache.spark.deploy.SparkSubmitSuite.runSparkSubmit(SparkSubmitSuite.scala:311)
      [info]   at org.apache.spark.deploy.SparkSubmitSuite$$anonfun$14.apply$mcV$sp(SparkSubmitSuite.scala:291)
      [info]   at org.apache.spark.deploy.SparkSubmitSuite$$anonfun$14.apply(SparkSubmitSuite.scala:284)
      [info]   at org.apac
      Spark assembly has been built with Hive, including Datanucleus jars on classpath
      ```
      
      This PR outputs the process error info when the process fails, which can be helpful for diagnosis.
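      
      A sketch of the idea (not the patch itself): capture stderr so a non-zero exit can report why.
      
      ```scala
      import scala.sys.process._
      
      def runAndCheck(cmd: Seq[String]): Unit = {
        val err = new StringBuilder
        val exitCode = cmd.!(ProcessLogger(_ => (), line => { err.append(line).append('\n'); () }))
        if (exitCode != 0) sys.error(s"Process $cmd exited with code $exitCode:\n$err")
      }
      ```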
      
      Author: scwf <wangfei1@huawei.com>
      
      Closes #2108 from scwf/output-test-error-info and squashes the following commits:
      
      0c48082 [scwf] minor fix according to comments
      563fde1 [scwf] output errer info when Process exitcode not zero
      
      (cherry picked from commit 26862337)
      Signed-off-by: Andrew Or <andrewor14@gmail.com>