- Oct 03, 2014
-
-
Masayoshi TSUZUKI authored
Modified some sentences of the error messages in bin\*.cmd. Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp> Closes #2640 from tsudukim/feature/SPARK-3775 and squashes the following commits: 3458afb [Masayoshi TSUZUKI] [SPARK-3775] Not suitable error message in spark-shell.cmd (cherry picked from commit 358d7ffd) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
Brenden Matthews authored
Author: Brenden Matthews <brenden@diddyinc.com> Closes #2401 from brndnmtthws/master and squashes the following commits: 4abaa5d [Brenden Matthews] [SPARK-3535][Mesos] Fix resource handling. (cherry picked from commit a8c52d53) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
WangTaoTheTonic authored
https://issues.apache.org/jira/browse/SPARK-3696 We check whether SPARK_CONF_DIR is already defined before assigning it. Author: WangTaoTheTonic <barneystinson@aliyun.com> Closes #2541 from WangTaoTheTonic/confdir and squashes the following commits: c3f31e0 [WangTaoTheTonic] Do not override the user-defined conf_dir (cherry picked from commit 9d320e22) Signed-off-by:
Andrew Or <andrewor14@gmail.com> Conflicts: sbin/spark-config.sh
-
EugenCepoi authored
Update of PR #997. With this PR, setting SPARK_CONF_DIR overrides SPARK_HOME/conf (not only spark-defaults.conf and spark-env). Author: EugenCepoi <cepoi.eugen@gmail.com> Closes #2481 from EugenCepoi/SPARK-2058 and squashes the following commits: 0bb32c2 [EugenCepoi] use orElse orNull and fixing trailing percent in compute-classpath.cmd 77f35d7 [EugenCepoi] SPARK-2058: Overriding SPARK_HOME/conf with SPARK_CONF_DIR (cherry picked from commit f0811f92) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
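The override rule described in the two entries above can be sketched as a simple lookup with a fallback. This is only an illustration in Python, not the code from either commit: `resolve_conf_dir` is a hypothetical helper, and treating an empty `SPARK_CONF_DIR` as unset is an assumption.

```python
import os


def resolve_conf_dir(env):
    """Return the configuration directory implied by an environment mapping.

    Hypothetical helper: SPARK_CONF_DIR wins when it is set and non-empty
    (assumption: an empty value counts as unset). Otherwise fall back to
    $SPARK_HOME/conf, or None if neither variable is defined.
    """
    conf_dir = env.get("SPARK_CONF_DIR")
    if conf_dir:
        return conf_dir
    spark_home = env.get("SPARK_HOME")
    if spark_home:
        return os.path.join(spark_home, "conf")
    return None
```

Passing `os.environ` as `env` applies the same rule to the current process environment.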
-
- Oct 02, 2014
-
-
Eric Eijkelenboom authored
SparkSubmitDriverBootstrapper.scala now returns the exit code of the driver process, instead of always returning 0. Author: Eric Eijkelenboom <ee@userreport.com> Closes #2628 from ericeijkelenboom/master and squashes the following commits: cc4a571 [Eric Eijkelenboom] Return the exit code of the driver process (cherry picked from commit 42d5077f) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
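The behaviour this entry describes, a launcher that returns the child's exit code instead of always returning 0, can be sketched in Python as follows. The function name is hypothetical and the sketch is not the Scala code from the commit; it only shows the general pattern.

```python
import subprocess
import sys


def run_and_forward_exit_code(command):
    """Run a child process and return its exit code (hypothetical helper).

    A launcher that ignored the return code would report success even when
    the child failed; returning it lets the caller pass it on unchanged.
    """
    completed = subprocess.run(command)
    return completed.returncode
```

A wrapper script would typically end with `sys.exit(run_and_forward_exit_code(cmd))` so that the calling shell sees the child's status.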
-
scwf authored
pwendell, ```tryPort``` in the last PR is not compatible with the old code; this change fixes it. After discussing with srowen, the title was renamed to "avoid trying privileged port when request a non-privileged port". Please refer to the discussion for details. Author: scwf <wangfei1@huawei.com> Closes #2623 from scwf/1-1024 and squashes the following commits: 10a4437 [scwf] add comment de3fd17 [scwf] do not try privileged port when request a non-privileged port 42cb0fa [scwf] make tryPort compatible with old code cb8cc76 [scwf] do not use port 1 - 1024 (cherry picked from commit 8081ce8b) Signed-off-by:
Andrew Or <andrewor14@gmail.com> Conflicts: core/src/main/scala/org/apache/spark/util/Utils.scala
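One way to retry ports without ever wrapping into the privileged range is to do the modular arithmetic over the non-privileged range only. The Python sketch below illustrates that idea under stated assumptions: `candidate_port`, the constants, and the decision to reject privileged start ports are all choices made for this example, not details taken from the commit.

```python
PRIVILEGED_LIMIT = 1024  # ports below this usually require root on Unix
PORT_SPACE = 65536       # valid TCP ports are 0..65535


def candidate_port(start_port, offset):
    """Return the port to try on the given retry (hypothetical helper).

    start_port == 0 means "let the OS choose" and is passed through.
    For a non-privileged start port, the result wraps around inside
    [1024, 65535] and never lands on a privileged port.
    """
    if start_port == 0:
        return 0
    if start_port < PRIVILEGED_LIMIT:
        # Assumption: requests for privileged ports are outside this sketch.
        raise ValueError("sketch only covers non-privileged start ports")
    span = PORT_SPACE - PRIVILEGED_LIMIT
    return (start_port + offset - PRIVILEGED_LIMIT) % span + PRIVILEGED_LIMIT
```

For example, retrying from port 65535 with offset 1 yields 1024 rather than port 0 or a port below 1024.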
-
Yin Huai authored
We have changed the output format of `printSchema`. This PR will update our SQL programming guide to show the updated format. Also, it fixes a typo (the value type of `StructType` in Java API). Author: Yin Huai <huai@cse.ohio-state.edu> Closes #2630 from yhuai/sqlDoc and squashes the following commits: 267d63e [Yin Huai] Update the output of printSchema and fix a typo. (cherry picked from commit 82a6a083) Signed-off-by:
Michael Armbrust <michael@databricks.com>
-
- Oct 01, 2014
-
-
aniketbhatnagar authored
This patch forces use of commons http client 4.2 in Kinesis-asl profile so that the AWS SDK does not run into dependency conflicts Author: aniketbhatnagar <aniket.bhatnagar@gmail.com> Closes #2535 from aniketbhatnagar/Kinesis-HttpClient-Dep-Fix and squashes the following commits: aa2079f [aniketbhatnagar] Merge branch 'Kinesis-HttpClient-Dep-Fix' of https://github.com/aniketbhatnagar/spark into Kinesis-HttpClient-Dep-Fix 73f55f6 [aniketbhatnagar] SPARK-3638 | Forced a compatible version of http client in kinesis-asl profile 70cc75b [aniketbhatnagar] deleted merge files 725dbc9 [aniketbhatnagar] Merge remote-tracking branch 'origin/Kinesis-HttpClient-Dep-Fix' into Kinesis-HttpClient-Dep-Fix 4ed61d8 [aniketbhatnagar] SPARK-3638 | Forced a compatible version of http client in kinesis-asl profile 9cd6103 [aniketbhatnagar] SPARK-3638 | Forced a compatible version of http client in kinesis-asl profile (cherry picked from commit 93861a5e) Signed-off-by:
Josh Rosen <joshrosen@apache.org>
-
Gaspar Munoz authored
topicpMap to topicMap Author: Gaspar Munoz <munozs.88@gmail.com> Closes #2614 from gasparms/patch-1 and squashes the following commits: 00aab2c [Gaspar Munoz] Typo error in KafkaWordCount example (cherry picked from commit b81ee0b4) Signed-off-by:
Tathagata Das <tathagata.das1565@gmail.com>
-
scwf authored
The Jetty server uses MultiException to handle exceptions when starting the server (see https://github.com/eclipse/jetty.project/blob/jetty-8.1.14.v20131031/jetty-server/src/main/java/org/eclipse/jetty/server/Server.java ), so this change adds logic to ```isBindCollision``` to cover MultiException. Author: scwf <wangfei1@huawei.com> Closes #2611 from scwf/fix-isBindCollision and squashes the following commits: 984cb12 [scwf] optimize the fix 3a6c849 [scwf] fix bug in isBindCollision (cherry picked from commit 2fedb5dd) Signed-off-by:
Patrick Wendell <pwendell@gmail.com> Conflicts: core/src/main/scala/org/apache/spark/util/Utils.scala
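The idea behind this fix, looking inside a container exception as well as along the cause chain when checking for a bind failure, can be sketched in Python. Everything here is illustrative: `MultiError` is a hypothetical stand-in for a container such as Jetty's MultiException, and `is_bind_collision` is not the Spark method of the similar name.

```python
import errno


class MultiError(Exception):
    """Hypothetical container exception holding several nested errors."""

    def __init__(self, errors):
        super().__init__(f"{len(errors)} nested error(s)")
        self.errors = list(errors)


def is_bind_collision(exc):
    """Return True if exc, a nested error, or a cause is 'address in use'."""
    if exc is None:
        return False
    if isinstance(exc, OSError) and exc.errno == errno.EADDRINUSE:
        return True
    if isinstance(exc, MultiError):
        # Without this branch a bind failure wrapped in the container is missed.
        return any(is_bind_collision(inner) for inner in exc.errors)
    return is_bind_collision(exc.__cause__)
```

A retry loop could call `is_bind_collision` on the caught exception and move on to the next port only when it returns True.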
-
Sean Owen authored
Call SparkContext.stop() in all examples (and touch up minor nearby code style issues while at it) Author: Sean Owen <sowen@cloudera.com> Closes #2575 from srowen/SPARK-2626 and squashes the following commits: 5b2baae [Sean Owen] Call SparkContext.stop() in all examples (and touch up minor nearby code style issues while at it) Conflicts: examples/src/main/python/parquet_inputformat.py
-
scwf authored
A non-root user who starts the Jetty server on a port in the range 1-1024 gets the exception "java.net.SocketException: Permission denied", so these ports are no longer used. Author: scwf <wangfei1@huawei.com> Closes #2610 from scwf/1-1024 and squashes the following commits: cb8cc76 [scwf] do not use port 1 - 1024 (cherry picked from commit 6390aae4) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
Reynold Xin authored
[SPARK-3747] TaskResultGetter could incorrectly abort a stage if it cannot get result for a specific task Author: Reynold Xin <rxin@apache.org> Closes #2599 from rxin/SPARK-3747 and squashes the following commits: a74c04d [Reynold Xin] Added a line of comment explaining NonFatal 0e8d44c [Reynold Xin] [SPARK-3747] TaskResultGetter could incorrectly abort a stage if it cannot get result for a specific task (cherry picked from commit eb43043f) Signed-off-by:
Reynold Xin <rxin@apache.org>
-
- Sep 30, 2014
-
-
shane knapp authored
for details, see: https://issues.apache.org/jira/browse/SPARK-3745 Author: shane knapp <incomplete@gmail.com> Closes #2596 from shaneknapp/SPARK-3745 and squashes the following commits: c95eea9 [shane knapp] SPARK-3745 - fix check-license to properly download and check jar (cherry picked from commit a01a3092) Signed-off-by:
Josh Rosen <joshrosen@apache.org> Conflicts: dev/check-license
-
Reynold Xin authored
[SPARK-3709] Executors don't always report broadcast block removal properly back to the driver (for branch-1.1) Author: Reynold Xin <rxin@apache.org> Closes #2591 from rxin/SPARK-3709-1.1 and squashes the following commits: ab99cc0 [Reynold Xin] [SPARK-3709] Executors don't always report broadcast block removal properly back to the driver
-
Josh Rosen authored
When using spark-submit in `cluster` mode to submit a job to a Spark Standalone cluster, if the JAVA_HOME environment variable was set on the submitting machine then DriverRunner would attempt to use the submitter's JAVA_HOME to launch the driver process (instead of the worker's JAVA_HOME), causing the driver to fail unless the submitter and worker had the same Java location. This commit fixes this by reading JAVA_HOME from sys.env instead of command.environment. Author: Josh Rosen <joshrosen@apache.org> Closes #2586 from JoshRosen/SPARK-3734 and squashes the following commits: e9513d9 [Josh Rosen] [SPARK-3734] DriverRunner should not read SPARK_HOME from submitter's environment. (cherry picked from commit b167a8c7) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
- Sep 29, 2014
-
-
oded authored
Author: oded <oded@HP-DV6.c4internal.c4-security.com> Closes #2486 from odedz/master and squashes the following commits: dd7890a [oded] Fixed the condition in StronglyConnectedComponents Issue: SPARK-3635 (cherry picked from commit dc30e450) Signed-off-by:
Ankur Dave <ankurdave@gmail.com>
-
yingjieMiao authored
When `numVertices > 50`, probability is set to 0. This would cause an infinite loop. Author: yingjieMiao <yingjie@42go.com> Closes #2553 from yingjieMiao/graphx and squashes the following commits: 6adf3c8 [yingjieMiao] [graphX] GraphOps: random pick vertex bug (cherry picked from commit 51229ff7) Signed-off-by:
Ankur Dave <ankurdave@gmail.com>
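A probability that collapses to 0 once the vertex count exceeds 50 is what integer division would produce. The commit message does not state the cause, so integer division is an assumption here; the Python sketch below only shows how such a collapse can happen and how floating-point division avoids it. Both function names are hypothetical.

```python
def sampling_probability_integer(num_vertices):
    """Integer division: any num_vertices above 50 yields 0 (assumed bug)."""
    return 50 // num_vertices


def sampling_probability_float(num_vertices):
    """Floating-point division keeps the probability positive."""
    return 50.0 / num_vertices
```

With a probability of exactly 0, a loop that keeps sampling until at least one vertex is selected can never terminate, which matches the infinite loop described above.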
-
jerryshao authored
The previous key comparison in `ExternalSorter` produces a wrong sort order or throws an exception when the comparison overflows; details are in [SPARK-3032](https://issues.apache.org/jira/browse/SPARK-3032). This change fixes the comparison and adds a unit test that demonstrates the problem. Author: jerryshao <saisai.shao@intel.com> Closes #2514 from jerryshao/SPARK-3032 and squashes the following commits: 6f3c302 [jerryshao] Improve the unit test according to comments 01911e6 [jerryshao] Change the test to show the contract violate exception 83acb38 [jerryshao] Minor changes according to comments fa2a08f [jerryshao] Fix key comparison integer overflow introduced sorting exception (cherry picked from commit dab1b0ae) Signed-off-by:
Matei Zaharia <matei@databricks.com>
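A common source of this kind of overflow is a comparator that returns the difference of two 32-bit integers. Whether the original code did exactly this is an assumption. Python integers do not overflow, so the sketch below emulates signed 32-bit wraparound with the hypothetical helper `to_int32` and contrasts the subtraction-based comparison with one that cannot overflow.

```python
INT_MIN = -2**31
INT_MAX = 2**31 - 1


def to_int32(value):
    """Wrap an integer to signed 32 bits, as fixed-width arithmetic would."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def compare_by_subtraction(a, b):
    """Overflow-prone comparator: the sign is wrong when a - b wraps."""
    return to_int32(a - b)


def compare_safely(a, b):
    """Comparator built from comparisons only: returns -1, 0 or 1."""
    return (a > b) - (a < b)
```

For `INT_MIN` and `INT_MAX` the subtraction wraps to a positive number, so the first comparator claims `INT_MIN` is the larger value. An inconsistent comparator like that can yield a wrong order or make a sort routine fail its contract check.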
-
Zhang, Liye authored
Author: Zhang, Liye <liye.zhang@intel.com> Closes #2572 from liyezhang556520/DAGLogErr and squashes the following commits: 5be2491 [Zhang, Liye] Bugfix: LogErr format in DAGScheduler.scala (cherry picked from commit 657bdff4) Signed-off-by:
Reynold Xin <rxin@apache.org>
-
- Sep 28, 2014
-
-
WangTaoTheTonic authored
https://issues.apache.org/jira/browse/SPARK-3715 Author: WangTaoTheTonic <barneystinson@aliyun.com> Closes #2567 from WangTaoTheTonic/minortypo and squashes the following commits: 9cc3f7a [WangTaoTheTonic] minor typo (cherry picked from commit 1f13a40c) Signed-off-by:
Michael Armbrust <michael@databricks.com>
-
- Sep 27, 2014
-
-
CrazyJvm authored
Author: CrazyJvm <crazyjvm@gmail.com> Closes #2540 from CrazyJvm/standalone-core and squashes the following commits: 66d9fc6 [CrazyJvm] use "--total-executor-cores" rather than "--cores" after spark-shell (cherry picked from commit 66107f46) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
- Sep 26, 2014
-
-
aniketbhatnagar authored
This patch removes setting of master as local in Kinesis examples so that users can set it using submit-job. Author: aniketbhatnagar <aniket.bhatnagar@gmail.com> Closes #2536 from aniketbhatnagar/Kinesis-Examples-Master-Unset and squashes the following commits: c9723ac [aniketbhatnagar] Merge remote-tracking branch 'origin/Kinesis-Examples-Master-Unset' into Kinesis-Examples-Master-Unset fec8ead [aniketbhatnagar] SPARK-3639 | Removed settings master in examples 31cdc59 [aniketbhatnagar] SPARK-3639 | Removed settings master in examples (cherry picked from commit d16e161d) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
- Sep 23, 2014
-
-
Mubarak Seyed authored
This is a refactored version of the original PR https://github.com/apache/spark/pull/1723 by mubarak. Please take a look, andrewor14, mubarak. Author: Mubarak Seyed <mubarak.seyed@gmail.com> Author: Tathagata Das <tathagata.das1565@gmail.com> Closes #2464 from tdas/streaming-callsite and squashes the following commits: dc54c71 [Tathagata Das] Made changes based on PR comments. 390b45d [Tathagata Das] Fixed minor bugs. 904cd92 [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into streaming-callsite 7baa427 [Tathagata Das] Refactored getCallSite and setCallSite to make it simpler. Also added unit test for DStream creation site. b9ed945 [Mubarak Seyed] Adding streaming utils c461cf4 [Mubarak Seyed] Merge remote-tracking branch 'upstream/master' ceb43da [Mubarak Seyed] Changing default regex function name 8c5d443 [Mubarak Seyed] Merge remote-tracking branch 'upstream/master' 196121b [Mubarak Seyed] Merge remote-tracking branch 'upstream/master' 491a1eb [Mubarak Seyed] Removing streaming visibility from getRDDCreationCallSite in DStream 33a7295 [Mubarak Seyed] Fixing review comments: Merging both setCallSite methods c26d933 [Mubarak Seyed] Merge remote-tracking branch 'upstream/master' f51fd9f [Mubarak Seyed] Fixing scalastyle, Regex for Utils.getCallSite, and changing method names in DStream 5051c58 [Mubarak Seyed] Getting return value of compute() into variable and call setCallSite(prevCallSite) only once. Adding return for other code paths (for None) a207eb7 [Mubarak Seyed] Fixing code review comments ccde038 [Mubarak Seyed] Removing Utils import from MappedDStream 2a09ad6 [Mubarak Seyed] Changes in Utils.scala for SPARK-1853 1d90cc3 [Mubarak Seyed] Changes for SPARK-1853 5f3105a [Mubarak Seyed] Merge remote-tracking branch 'upstream/master' 70f494f [Mubarak Seyed] Changes for SPARK-1853 1500deb [Mubarak Seyed] Changes in Spark Streaming UI 9d38d3c [Mubarak Seyed] [SPARK-1853] Show Streaming application code context (file, line number) in Spark Stages UI d466d75 [Mubarak Seyed] Changes for spark streaming UI (cherry picked from commit 729952a5) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
Andrew Or authored
`SPARK_DRIVER_MEMORY` was only used to start the `SparkSubmit` JVM, which becomes the driver only in client mode but not cluster mode. In cluster mode, this property is simply not propagated to the worker nodes. `SPARK_EXECUTOR_MEMORY` is picked up from `SparkContext`, but in cluster mode the driver runs on one of the worker machines, where this environment variable may not be set. Author: Andrew Or <andrewor14@gmail.com> Closes #2500 from andrewor14/memory-env-vars and squashes the following commits: 6217b38 [Andrew Or] Respect SPARK_*_MEMORY for cluster mode Conflicts: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala
-
Sandy Ryza authored
...the driver Author: Sandy Ryza <sandy@cloudera.com> Closes #2487 from sryza/sandy-spark-3612 and squashes the following commits: 2b7353d [Sandy Ryza] SPARK-3612. Executor shouldn't quit if heartbeat message fails to reach the driver (cherry picked from commit d79238d0) Signed-off-by:
Patrick Wendell <pwendell@gmail.com>
-
- Sep 22, 2014
-
-
Grega Kespret authored
Author: Grega Kespret <grega.kespret@gmail.com> Closes #2479 from gregakespret/patch-1 and squashes the following commits: dd6b90a [Grega Kespret] Update docs to use jsonRDD instead of wrong jsonRdd. (cherry picked from commit 56dae30c) Signed-off-by:
Michael Armbrust <michael@databricks.com>
-
RJ Nowling authored
Author: RJ Nowling <rnowling@gmail.com> Closes #2459 from rnowling/tfidf-fix and squashes the following commits: b370a91 [RJ Nowling] Fix variable name misspelling in MLLib Feature Extraction guide (cherry picked from commit fec92155) Signed-off-by:
Xiangrui Meng <meng@databricks.com>
-
- Sep 21, 2014
-
-
Patrick Wendell authored
This reverts commit 7a766577. [NOTE: After some thought I decided not to merge this into 1.1 quite yet]
-
Ian Hummel authored
Addresses the issue in https://issues.apache.org/jira/browse/SPARK-3595, namely saveAsHadoopFile hardcoding the OutputCommitter. This is not ideal when running Spark jobs that write to S3, especially when running them from an EMR cluster where the default OutputCommitter is a DirectOutputCommitter. Author: Ian Hummel <ian@themodernlife.net> Closes #2450 from themodernlife/spark-3595 and squashes the following commits: f37a0e5 [Ian Hummel] Update based on comments from pwendell a11d9f3 [Ian Hummel] Fix formatting 4359664 [Ian Hummel] Add an example showing usage 8b6be94 [Ian Hummel] Add ability to specify OutputCommitter, especially useful when writing to an S3 bucket from an EMR cluster
-
- Sep 19, 2014
-
-
andrewor14 authored
This is now supported! Author: andrewor14 <andrewor14@gmail.com> Author: Andrew Or <andrewor14@gmail.com> Closes #2461 from andrewor14/document-standalone-cluster and squashes the following commits: 85c8b9e [andrewor14] Wording change per Patrick 35e30ee [Andrew Or] Fix outdated docs for standalone cluster (cherry picked from commit 8af23706) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
Larry Xiao authored
VertexRDD.apply had a bug where it ignored the merge function for duplicate vertices and instead used whichever vertex attribute occurred first. This commit fixes the bug by passing the merge function through to ShippableVertexPartition.apply, which merges any duplicates using the merge function and then fills in missing vertices using the specified default vertex attribute. This commit also adds a unit test for VertexRDD.apply. Author: Larry Xiao <xiaodi@sjtu.edu.cn> Author: Blie Arkansol <xiaodi@sjtu.edu.cn> Author: Ankur Dave <ankurdave@gmail.com> Closes #1903 from larryxiao/2062 and squashes the following commits: 625aa9d [Blie Arkansol] Merge pull request #1 from ankurdave/SPARK-2062 476770b [Ankur Dave] ShippableVertexPartition.initFrom: Don't run mergeFunc on default values 614059f [Larry Xiao] doc update: note about the default null value vertices construction dfdb3c9 [Larry Xiao] minor fix 1c70366 [Larry Xiao] scalastyle check: wrap line, parameter list indent 4 spaces e4ca697 [Larry Xiao] [TEST] VertexRDD.apply mergeFunc 6a35ea8 [Larry Xiao] [TEST] VertexRDD.apply mergeFunc 4fbc29c [Blie Arkansol] undo unnecessary change efae765 [Larry Xiao] fix mistakes: should be able to call with or without mergeFunc b2422f9 [Larry Xiao] Merge branch '2062' of github.com:larryxiao/spark into 2062 52dc7f7 [Larry Xiao] pass mergeFunc to VertexPartitionBase, where merge is handled 581e9ee [Larry Xiao] TODO: VertexRDDSuite 20d80a3 [Larry Xiao] [SPARK-2062][GraphX] VertexRDD.apply does not use the mergeFunc (cherry picked from commit 3bbbdd81) Signed-off-by:
Ankur Dave <ankurdave@gmail.com>
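The intended semantics described in this entry (merge duplicate vertices with the merge function, then fill in missing vertices with a default attribute) can be sketched with a plain dictionary. This Python illustration is not the GraphX implementation; `build_vertices` and its parameters are hypothetical names.

```python
def build_vertices(pairs, merge_func, required_ids=(), default_attr=None):
    """Collapse (vertex_id, attr) pairs into one attribute per vertex.

    Duplicates are combined with merge_func instead of keeping whichever
    attribute came first. Vertices listed in required_ids but absent from
    pairs get default_attr, which is never passed through merge_func.
    """
    merged = {}
    for vertex_id, attr in pairs:
        if vertex_id in merged:
            merged[vertex_id] = merge_func(merged[vertex_id], attr)
        else:
            merged[vertex_id] = attr
    for vertex_id in required_ids:
        merged.setdefault(vertex_id, default_attr)
    return merged
```

With the pairs `[(1, 10), (2, 5), (1, 7)]` and addition as the merge function, vertex 1 ends up with 17, whereas a first-wins strategy would keep 10.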
-
- Sep 18, 2014
-
-
Andrew Or authored
This was introduced in #2449 Author: Andrew Or <andrewor14@gmail.com> Closes #2452 from andrewor14/standalone-hot-fix and squashes the following commits: d5190ca [Andrew Or] Put that line in the right place (cherry picked from commit 9306297d) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
Victsm authored
Author: Victsm <victor.nju@gmail.com> Author: Min Shen <mshen@linkedin.com> Closes #2449 from Victsm/SPARK-3560 and squashes the following commits: 918405a [Victsm] Removed the additional space 4502a2a [Min Shen] [SPARK-3560] Fixed setting spark.jars system property in yarn-cluster mode.
-
WangTaoTheTonic authored
https://issues.apache.org/jira/browse/SPARK-3589 "export CLASSPATH" in spark-class is redundant since the same variable is exported earlier. We can also reuse the already defined value "isYarnCluster" in SparkSubmit.scala. Author: WangTaoTheTonic <barneystinson@aliyun.com> Closes #2445 from WangTaoTheTonic/removeRedundant and squashes the following commits: 6fb6872 [WangTaoTheTonic] remove redundant code (cherry picked from commit 471e6a3a) Signed-off-by:
Patrick Wendell <pwendell@gmail.com>
-
- Sep 17, 2014
-
-
WangTaoTheTonic authored
https://issues.apache.org/jira/browse/SPARK-3565 "spark.ports.maxRetries" should be "spark.port.maxRetries". Make the configuration keys in the documentation and the code consistent. Author: WangTaoTheTonic <barneystinson@aliyun.com> Closes #2427 from WangTaoTheTonic/fixPortRetries and squashes the following commits: c178813 [WangTaoTheTonic] Use blank lines trigger Jenkins 646f3fe [WangTaoTheTonic] also in SparkBuild.scala 3700dba [WangTaoTheTonic] Fix configuration item not consistent with document (cherry picked from commit 3f169bfe) Signed-off-by:
Patrick Wendell <pwendell@gmail.com>
-
Kousuke Saruta authored
Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp> Closes #2424 from sarutak/display-appid-on-webui and squashes the following commits: 417fe90 [Kousuke Saruta] Added "App ID column" to HistoryPage (cherry picked from commit 6688a266) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
Andrew Ash authored
Makes the table of contents read better Author: Andrew Ash <andrew@andrewash.com> Closes #2402 from ash211/docs/better-indentation and squashes the following commits: ea0e130 [Andrew Ash] Move HA subsections to a deeper indentation level (cherry picked from commit b3830b28) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
Michael Armbrust authored
Author: Michael Armbrust <michael@databricks.com> Closes #2434 from marmbrus/patch-1 and squashes the following commits: 67215be [Michael Armbrust] [SQL][DOCS] Improve table caching section (cherry picked from commit cbf983bb) Signed-off-by:
Michael Armbrust <michael@databricks.com>
-
- Sep 16, 2014
-
-
Andrew Or authored
Original PR: #2363 Author: Andrew Or <andrewor14@gmail.com> Closes #2415 from andrewor14/disable-ui-for-tests-1.1 and squashes the following commits: 8d9df5a [Andrew Or] Oops, missed one. 509507d [Andrew Or] Backport #2363 (SPARK-3490) into branch-1.1
-