- Oct 08, 2014
-
-
Marcelo Vanzin authored
HA and viewfs use namespaces instead of host names, so trying to resolve them will fail. Be smarter about the comparison to avoid doing unnecessary work. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes #2650 from vanzin/SPARK-3788-1.1 and squashes the following commits: 174bf71 [Marcelo Vanzin] Update comment. 0e36be7 [Marcelo Vanzin] Use Objects.equal() instead of ==. 772aead [Marcelo Vanzin] [SPARK-3788] [yarn] Fix compareFs to do the right thing for HA, federation (1.1 version).
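The idea of comparing file systems without resolving names can be sketched as follows. This is only an illustration in Python, not the code from the patch: the function name `same_file_system` and the choice to compare scheme, authority name, and port are assumptions made for the example.

```python
from urllib.parse import urlparse


def same_file_system(src_uri: str, dst_uri: str) -> bool:
    """Return True when both URIs appear to name the same file system.

    Hypothetical helper: the authority part is compared as a plain string,
    so logical names (HA nameservices, viewfs mount tables) never go
    through a DNS lookup that would fail.
    """
    src, dst = urlparse(src_uri), urlparse(dst_uri)

    # A missing or different scheme means a different file system.
    if not src.scheme or src.scheme != dst.scheme:
        return False

    # urlparse lower-cases the hostname; None covers URIs with no authority.
    if src.hostname != dst.hostname:
        return False

    # None == None when neither URI carries an explicit port.
    return src.port == dst.port
```

Comparing `None` values directly plays the same role here as a null-safe equality helper: two URIs without an authority or without a port still compare as equal.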
-
- Oct 07, 2014
-
-
Kousuke Saruta authored
There is a Spark logo in the header of HistoryPage. There can be many history pages when 20+ applications have been run, so it is useful for the logo to link to page 1 of the HistoryPage. Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp> Closes #2690 from sarutak/SPARK-3829 and squashes the following commits: 908c109 [Kousuke Saruta] Removed extra space. 00bfbd7 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-3829 dd87480 [Kousuke Saruta] Made header Spark log image as a link to History Server's top page. (cherry picked from commit b69c9fb6) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
zsxwing authored
The Stage page currently displays only the "Executor" (host) for tasks. However, more than one executor may be running on the same host. When a task hangs, only the host of the faulty executor is known, so every executor on that host has to be checked. Adding "Executor ID" to the Tasks table would help locate the faulty executor. Here is the new page: [screenshot] Author: zsxwing <zsxwing@gmail.com> Closes #2642 from zsxwing/SPARK-3777 and squashes the following commits: 37945af [zsxwing] Put Executor ID and Host into one cell 4bbe2c7 [zsxwing] [SPARK-3777] Display "Executor ID" for Tasks in Stage page (cherry picked from commit 446063ec) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
Davies Liu authored
The parent.getOrCompute() of PythonRDD is executed in a separate thread, so that thread should release the memory reserved for shuffle and unrolling once it finishes. Author: Davies Liu <davies.liu@gmail.com> Closes #2668 from davies/leak and squashes the following commits: ae98be2 [Davies Liu] fix memory leak in PythonRDD (cherry picked from commit bc87cc41) Signed-off-by:
Josh Rosen <joshrosen@apache.org> Conflicts: core/src/main/scala/org/apache/spark/api/python/PythonRDD.scala
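The pattern behind this kind of fix, releasing per-thread reservations when a helper thread ends, can be sketched in Python. Everything here is hypothetical (`ReservationTracker`, `run_in_worker`); it only illustrates a `try`/`finally` release and does not reproduce the Scala change.

```python
import threading


class ReservationTracker:
    """Hypothetical bookkeeping of memory reserved per thread."""

    def __init__(self):
        self._lock = threading.Lock()
        self._reserved = {}

    def reserve(self, num_bytes):
        with self._lock:
            tid = threading.get_ident()
            self._reserved[tid] = self._reserved.get(tid, 0) + num_bytes

    def release_current_thread(self):
        with self._lock:
            self._reserved.pop(threading.get_ident(), None)

    def total_reserved(self):
        with self._lock:
            return sum(self._reserved.values())


def run_in_worker(tracker, work):
    """Run work(tracker) in a separate thread and always release its memory."""

    def target():
        try:
            work(tracker)
        except Exception:
            # A real system would report the failure; it is swallowed here
            # only to keep the sketch short.
            pass
        finally:
            # Runs on success and on failure, so nothing stays reserved.
            tracker.release_current_thread()

    thread = threading.Thread(target=target)
    thread.start()
    thread.join()
```

Without the `finally` clause, a reservation made by a worker that fails would stay in the tracker after the thread has gone, which is the shape of leak the commit message describes.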
-
Andrew Or authored
Before:
```
14/10/06 16:45:42 WARN CacheManager: Not enough space to cache partition rdd_0_2 in memory! Free memory is 481861527 bytes.
```
After:
```
14/10/07 11:08:24 WARN MemoryStore: Not enough space to cache rdd_2_0 in memory! (computed 68.8 MB so far)
14/10/07 11:08:24 INFO MemoryStore: Memory use = 1088.0 B (blocks) + 445.1 MB (scratch space shared across 8 thread(s)) = 445.1 MB. Storage limit = 459.5 MB.
```
Author: Andrew Or <andrewor14@gmail.com> Closes #2688 from andrewor14/cache-log-message and squashes the following commits: 28e33d6 [Andrew Or] Shy away from "unrolling" 5638c49 [Andrew Or] Grammar 39a0c28 [Andrew Or] Log more detail when unrolling a block fails (cherry picked from commit 553737c6) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
Masayoshi TSUZUKI authored
Fixed a syntax error in the *.cmd script. Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp> Closes #2669 from tsudukim/feature/SPARK-3808 and squashes the following commits: 7f804e6 [Masayoshi TSUZUKI] [SPARK-3808] PySpark fails to start in Windows (cherry picked from commit 12e2551e) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
Hossein authored
With Spark SQL we generate very long RDD names. These names are not properly rendered in the web UI. This PR fixes the rendering issue. [SPARK-3827] #comment Linking PR with JIRA Author: Hossein <hossein@databricks.com> Closes #2687 from falaki/sparkTableUI and squashes the following commits: fd06409 [Hossein] Limit width of cell when RDD name is too long (cherry picked from commit d65fd554) Signed-off-by:
Josh Rosen <joshrosen@apache.org>
-
- Oct 05, 2014
-
-
scwf authored
Do not use TestSQLContext in JavaHiveQLSuite, since that may lead to two SparkContexts in one JVM. Also enable JavaHiveQLSuite. Author: scwf <wangfei1@huawei.com> Closes #2652 from scwf/fix-JavaHiveQLSuite and squashes the following commits: be35c91 [scwf] enable JavaHiveQLSuite (cherry picked from commit 58f5361c) Signed-off-by:
Michael Armbrust <michael@databricks.com>
-
zsxwing authored
JIRA: https://issues.apache.org/jira/browse/SPARK-1656 Author: zsxwing <zsxwing@gmail.com> Closes #577 from zsxwing/SPARK-1656 and squashes the following commits: c431095 [zsxwing] Add a comment and fix the code style 2de96e5 [zsxwing] Make sure file will be deleted if exception happens 28b90dc [zsxwing] Update to follow the code style 4521d6e [zsxwing] Merge branch 'master' into SPARK-1656 afc3383 [zsxwing] Update to follow the code style 071fdd1 [zsxwing] SPARK-1656: Fix potential resource leaks (cherry picked from commit a7c73130) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
Brenden Matthews authored
The MesosSchedulerBackend did not previously implement `killTask`, resulting in an exception. Author: Brenden Matthews <brenden@diddyinc.com> Closes #2453 from brndnmtthws/implement-killtask and squashes the following commits: 23ddcdc [Brenden Matthews] [SPARK-3597][Mesos] Implement `killTask`. (cherry picked from commit 32fad423) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
- Oct 03, 2014
-
-
Masayoshi TSUZUKI authored
Modified the comment of bin/utils.sh. Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp> Closes #2639 from tsudukim/feature/SPARK-3774 and squashes the following commits: 707b779 [Masayoshi TSUZUKI] [SPARK-3774] typo comment in bin/utils.sh (cherry picked from commit e5566e05) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
Masayoshi TSUZUKI authored
Reworded part of the error message in bin\*.cmd. Author: Masayoshi TSUZUKI <tsudukim@oss.nttdata.co.jp> Closes #2640 from tsudukim/feature/SPARK-3775 and squashes the following commits: 3458afb [Masayoshi TSUZUKI] [SPARK-3775] Not suitable error message in spark-shell.cmd (cherry picked from commit 358d7ffd) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
Brenden Matthews authored
Author: Brenden Matthews <brenden@diddyinc.com> Closes #2401 from brndnmtthws/master and squashes the following commits: 4abaa5d [Brenden Matthews] [SPARK-3535][Mesos] Fix resource handling. (cherry picked from commit a8c52d53) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
WangTaoTheTonic authored
https://issues.apache.org/jira/browse/SPARK-3696 Check whether SPARK_CONF_DIR is already defined before assigning it. Author: WangTaoTheTonic <barneystinson@aliyun.com> Closes #2541 from WangTaoTheTonic/confdir and squashes the following commits: c3f31e0 [WangTaoTheTonic] Do not override the user-difined conf_dir (cherry picked from commit 9d320e22) Signed-off-by:
Andrew Or <andrewor14@gmail.com> Conflicts: sbin/spark-config.sh
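The check-before-assign behaviour can be sketched in Python. The real change lives in a shell script; the function name `resolve_conf_dir` and the fallback to a `conf` directory under the Spark home are assumptions used for illustration.

```python
import os


def resolve_conf_dir(env, spark_home):
    """Return the configuration directory, preferring a user-defined value.

    Hypothetical helper: `env` is any mapping of environment variables,
    for example os.environ.
    """
    user_defined = env.get("SPARK_CONF_DIR")
    if user_defined:
        # Keep whatever the user already set.
        return user_defined
    # Otherwise fall back to the default location under the Spark home.
    return os.path.join(spark_home, "conf")
```

Calling `resolve_conf_dir(os.environ, "/opt/spark")` would return the user's directory when `SPARK_CONF_DIR` is exported and the default otherwise; the path `/opt/spark` is only an example.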
-
EugenCepoi authored
Update of PR #997. With this PR, setting SPARK_CONF_DIR overrides SPARK_HOME/conf (not only spark-defaults.conf and spark-env). Author: EugenCepoi <cepoi.eugen@gmail.com> Closes #2481 from EugenCepoi/SPARK-2058 and squashes the following commits: 0bb32c2 [EugenCepoi] use orElse orNull and fixing trailing percent in compute-classpath.cmd 77f35d7 [EugenCepoi] SPARK-2058: Overriding SPARK_HOME/conf with SPARK_CONF_DIR (cherry picked from commit f0811f92) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
- Oct 02, 2014
-
-
Eric Eijkelenboom authored
SparkSubmitDriverBootstrapper.scala now returns the exit code of the driver process, instead of always returning 0. Author: Eric Eijkelenboom <ee@userreport.com> Closes #2628 from ericeijkelenboom/master and squashes the following commits: cc4a571 [Eric Eijkelenboom] Return the exit code of the driver process (cherry picked from commit 42d5077f) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
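Propagating a child's exit code instead of always reporting success can be sketched like this. It is a Python illustration under stated assumptions, not the Scala bootstrapper: `run_and_propagate` is a hypothetical name and the child command is a stand-in for the driver process.

```python
import subprocess
import sys


def run_and_propagate(command):
    """Run a child process and return its exit code rather than a fixed 0."""
    completed = subprocess.run(command)
    return completed.returncode


# Stand-in child that exits with status 3 (an assumption for the example).
demo_command = [sys.executable, "-c", "import sys; sys.exit(3)"]
```

A launcher would typically finish with `sys.exit(run_and_propagate(demo_command))`, so that callers such as shell scripts or schedulers see the child's failure.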
-
scwf authored
pwendell, `tryPort` is not compatible with the old code from the last PR; this fixes it. After discussing with srowen, the title was renamed to "avoid trying privileged port when request a non-privileged port". Please refer to the discussion for details. Author: scwf <wangfei1@huawei.com> Closes #2623 from scwf/1-1024 and squashes the following commits: 10a4437 [scwf] add comment de3fd17 [scwf] do not try privileged port when request a non-privileged port 42cb0fa [scwf] make tryPort compatible with old code cb8cc76 [scwf] do not use port 1 - 1024 (cherry picked from commit 8081ce8b) Signed-off-by:
Andrew Or <andrewor14@gmail.com> Conflicts: core/src/main/scala/org/apache/spark/util/Utils.scala
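One way to keep retried ports out of the privileged range is to wrap around within 1024-65535. The sketch below is a Python illustration, not the code in Utils.scala; `candidate_port` is a hypothetical name, and it assumes the requested port is either 0 or at least 1024.

```python
def candidate_port(start_port: int, offset: int) -> int:
    """Return the port to try on the given retry attempt.

    Assumes start_port is 0 (let the OS choose) or a non-privileged
    port, i.e. 1024 <= start_port <= 65535.
    """
    if start_port == 0:
        # Port 0 asks the operating system for any free port.
        return 0
    # Wrap around inside [1024, 65535] so retries never land on 1-1023.
    return (start_port + offset - 1024) % (65536 - 1024) + 1024
```

For example, a request for port 65535 with one retry wraps to 1024 rather than overflowing or falling into the privileged range.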
-
Yin Huai authored
We have changed the output format of `printSchema`. This PR will update our SQL programming guide to show the updated format. Also, it fixes a typo (the value type of `StructType` in Java API). Author: Yin Huai <huai@cse.ohio-state.edu> Closes #2630 from yhuai/sqlDoc and squashes the following commits: 267d63e [Yin Huai] Update the output of printSchema and fix a typo. (cherry picked from commit 82a6a083) Signed-off-by:
Michael Armbrust <michael@databricks.com>
-
- Oct 01, 2014
-
-
aniketbhatnagar authored
This patch forces use of commons http client 4.2 in Kinesis-asl profile so that the AWS SDK does not run into dependency conflicts. Author: aniketbhatnagar <aniket.bhatnagar@gmail.com> Closes #2535 from aniketbhatnagar/Kinesis-HttpClient-Dep-Fix and squashes the following commits: aa2079f [aniketbhatnagar] Merge branch 'Kinesis-HttpClient-Dep-Fix' of https://github.com/aniketbhatnagar/spark into Kinesis-HttpClient-Dep-Fix 73f55f6 [aniketbhatnagar] SPARK-3638 | Forced a compatible version of http client in kinesis-asl profile 70cc75b [aniketbhatnagar] deleted merge files 725dbc9 [aniketbhatnagar] Merge remote-tracking branch 'origin/Kinesis-HttpClient-Dep-Fix' into Kinesis-HttpClient-Dep-Fix 4ed61d8 [aniketbhatnagar] SPARK-3638 | Forced a compatible version of http client in kinesis-asl profile 9cd6103 [aniketbhatnagar] SPARK-3638 | Forced a compatible version of http client in kinesis-asl profile (cherry picked from commit 93861a5e) Signed-off-by:
Josh Rosen <joshrosen@apache.org>
-
Gaspar Munoz authored
Rename topicpMap to topicMap. Author: Gaspar Munoz <munozs.88@gmail.com> Closes #2614 from gasparms/patch-1 and squashes the following commits: 00aab2c [Gaspar Munoz] Typo error in KafkaWordCount example (cherry picked from commit b81ee0b4) Signed-off-by:
Tathagata Das <tathagata.das1565@gmail.com>
-
scwf authored
The Jetty server uses MultiException to handle exceptions when the server starts; see https://github.com/eclipse/jetty.project/blob/jetty-8.1.14.v20131031/jetty-server/src/main/java/org/eclipse/jetty/server/Server.java So add logic to `isBindCollision` to cover MultiException. Author: scwf <wangfei1@huawei.com> Closes #2611 from scwf/fix-isBindCollision and squashes the following commits: 984cb12 [scwf] optimize the fix 3a6c849 [scwf] fix bug in isBindCollision (cherry picked from commit 2fedb5dd) Signed-off-by:
Patrick Wendell <pwendell@gmail.com> Conflicts: core/src/main/scala/org/apache/spark/util/Utils.scala
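Checking for a bind collision inside an exception that bundles several others can be sketched recursively. This is a Python analogue rather than the Scala code: `MultiError` is a hypothetical stand-in for Jetty's MultiException, and "address already in use" is detected through `errno.EADDRINUSE`.

```python
import errno


class MultiError(Exception):
    """Hypothetical exception that bundles several underlying errors."""

    def __init__(self, errors):
        super().__init__("multiple errors")
        self.errors = list(errors)


def is_bind_collision(exc) -> bool:
    """Return True if exc, or anything nested in it, is a bind collision."""
    if exc is None:
        return False
    if isinstance(exc, OSError) and exc.errno == errno.EADDRINUSE:
        return True
    if isinstance(exc, MultiError):
        # Look inside the bundle: any nested collision counts.
        return any(is_bind_collision(inner) for inner in exc.errors)
    # Otherwise follow the explicit cause chain, if there is one.
    return is_bind_collision(exc.__cause__)
```

The bundled case matters because a server that collects start-up failures into one aggregate exception hides the original bind error from a check that inspects only the top-level exception type.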
-
Sean Owen authored
Call SparkContext.stop() in all examples (and touch up minor nearby code style issues while at it) Author: Sean Owen <sowen@cloudera.com> Closes #2575 from srowen/SPARK-2626 and squashes the following commits: 5b2baae [Sean Owen] Call SparkContext.stop() in all examples (and touch up minor nearby code style issues while at it) Conflicts: examples/src/main/python/parquet_inputformat.py
-
scwf authored
A non-root user that starts the Jetty server on a port in the range 1-1024 gets the exception "java.net.SocketException: Permission denied", so do not use these ports. Author: scwf <wangfei1@huawei.com> Closes #2610 from scwf/1-1024 and squashes the following commits: cb8cc76 [scwf] do not use port 1 - 1024 (cherry picked from commit 6390aae4) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
Reynold Xin authored
[SPARK-3747] TaskResultGetter could incorrectly abort a stage if it cannot get result for a specific task Author: Reynold Xin <rxin@apache.org> Closes #2599 from rxin/SPARK-3747 and squashes the following commits: a74c04d [Reynold Xin] Added a line of comment explaining NonFatal 0e8d44c [Reynold Xin] [SPARK-3747] TaskResultGetter could incorrectly abort a stage if it cannot get result for a specific task (cherry picked from commit eb43043f) Signed-off-by:
Reynold Xin <rxin@apache.org>
-
- Sep 30, 2014
-
-
shane knapp authored
for details, see: https://issues.apache.org/jira/browse/SPARK-3745 Author: shane knapp <incomplete@gmail.com> Closes #2596 from shaneknapp/SPARK-3745 and squashes the following commits: c95eea9 [shane knapp] SPARK-3745 - fix check-license to properly download and check jar (cherry picked from commit a01a3092) Signed-off-by:
Josh Rosen <joshrosen@apache.org> Conflicts: dev/check-license
-
Reynold Xin authored
[SPARK-3709] Executors don't always report broadcast block removal properly back to the driver (for branch-1.1) Author: Reynold Xin <rxin@apache.org> Closes #2591 from rxin/SPARK-3709-1.1 and squashes the following commits: ab99cc0 [Reynold Xin] [SPARK-3709] Executors don't always report broadcast block removal properly back to the driver
-
Josh Rosen authored
When using spark-submit in `cluster` mode to submit a job to a Spark Standalone cluster, if the JAVA_HOME environment variable was set on the submitting machine then DriverRunner would attempt to use the submitter's JAVA_HOME to launch the driver process (instead of the worker's JAVA_HOME), causing the driver to fail unless the submitter and worker had the same Java location. This commit fixes this by reading JAVA_HOME from sys.env instead of command.environment. Author: Josh Rosen <joshrosen@apache.org> Closes #2586 from JoshRosen/SPARK-3734 and squashes the following commits: e9513d9 [Josh Rosen] [SPARK-3734] DriverRunner should not read SPARK_HOME from submitter's environment. (cherry picked from commit b167a8c7) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
- Sep 29, 2014
-
-
oded authored
Author: oded <oded@HP-DV6.c4internal.c4-security.com> Closes #2486 from odedz/master and squashes the following commits: dd7890a [oded] Fixed the condition in StronglyConnectedComponents Issue: SPARK-3635 (cherry picked from commit dc30e450) Signed-off-by:
Ankur Dave <ankurdave@gmail.com>
-
yingjieMiao authored
When `numVertices > 50`, the probability is set to 0, which causes an infinite loop. Author: yingjieMiao <yingjie@42go.com> Closes #2553 from yingjieMiao/graphx and squashes the following commits: 6adf3c8 [yingjieMiao] [graphX] GraphOps: random pick vertex bug (cherry picked from commit 51229ff7) Signed-off-by:
Ankur Dave <ankurdave@gmail.com>
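A probability that becomes 0 exactly when the vertex count exceeds 50 is what integer division of 50 by the count would produce. The log does not show the code, so the sketch below is an assumption about the mechanism, with hypothetical function names, written in Python where `//` is integer division.

```python
def sampling_probability_truncating(num_vertices: int) -> int:
    """Integer division: yields 0 as soon as num_vertices exceeds 50."""
    return 50 // num_vertices


def sampling_probability(num_vertices: int) -> float:
    """Floating-point division keeps a small but non-zero probability."""
    return 50.0 / num_vertices
```

With a probability of 0, a loop that keeps sampling until at least one vertex is selected can never terminate, which matches the infinite loop described in the message.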
-
jerryshao authored
The previous key comparison in `ExternalSorter` produces a wrong sort order or an exception when the key comparison overflows; details can be found in [SPARK-3032](https://issues.apache.org/jira/browse/SPARK-3032). This fixes the comparison and adds a unit test to demonstrate it. Author: jerryshao <saisai.shao@intel.com> Closes #2514 from jerryshao/SPARK-3032 and squashes the following commits: 6f3c302 [jerryshao] Improve the unit test according to comments 01911e6 [jerryshao] Change the test to show the contract violate exception 83acb38 [jerryshao] Minor changes according to comments fa2a08f [jerryshao] Fix key comparison integer overflow introduced sorting exception (cherry picked from commit dab1b0ae) Signed-off-by:
Matei Zaharia <matei@databricks.com>
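Comparison overflow of this kind typically comes from comparing 32-bit integers by subtraction. The sketch below simulates that in Python, whose integers do not overflow, by wrapping results to 32 bits; the helper names are hypothetical and the code is an illustration, not the `ExternalSorter` implementation.

```python
def to_int32(value: int) -> int:
    """Wrap an integer to the signed 32-bit range, as a JVM int would."""
    return (value + 2**31) % 2**32 - 2**31


def compare_by_subtraction(a: int, b: int) -> int:
    """Overflow-prone comparison: the sign is wrong when a - b overflows."""
    return to_int32(a - b)


def compare_safely(a: int, b: int) -> int:
    """Return -1, 0, or 1 using ordering tests only, so nothing overflows."""
    return (a > b) - (a < b)
```

With `a = 2**31 - 1` and `b = -1`, the subtraction wraps to a negative number and reports `a < b`; an inconsistent comparator like that can give a wrong order or make a sort algorithm raise a contract-violation error.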
-
Zhang, Liye authored
Author: Zhang, Liye <liye.zhang@intel.com> Closes #2572 from liyezhang556520/DAGLogErr and squashes the following commits: 5be2491 [Zhang, Liye] Bugfix: LogErr format in DAGScheduler.scala (cherry picked from commit 657bdff4) Signed-off-by:
Reynold Xin <rxin@apache.org>
-
- Sep 28, 2014
-
-
WangTaoTheTonic authored
https://issues.apache.org/jira/browse/SPARK-3715 Author: WangTaoTheTonic <barneystinson@aliyun.com> Closes #2567 from WangTaoTheTonic/minortypo and squashes the following commits: 9cc3f7a [WangTaoTheTonic] minor typo (cherry picked from commit 1f13a40c) Signed-off-by:
Michael Armbrust <michael@databricks.com>
-
- Sep 27, 2014
-
-
CrazyJvm authored
Author: CrazyJvm <crazyjvm@gmail.com> Closes #2540 from CrazyJvm/standalone-core and squashes the following commits: 66d9fc6 [CrazyJvm] use "--total-executor-cores" rather than "--cores" after spark-shell (cherry picked from commit 66107f46) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
- Sep 26, 2014
-
-
aniketbhatnagar authored
This patch removes setting of master as local in Kinesis examples so that users can set it using submit-job. Author: aniketbhatnagar <aniket.bhatnagar@gmail.com> Closes #2536 from aniketbhatnagar/Kinesis-Examples-Master-Unset and squashes the following commits: c9723ac [aniketbhatnagar] Merge remote-tracking branch 'origin/Kinesis-Examples-Master-Unset' into Kinesis-Examples-Master-Unset fec8ead [aniketbhatnagar] SPARK-3639 | Removed settings master in examples 31cdc59 [aniketbhatnagar] SPARK-3639 | Removed settings master in examples (cherry picked from commit d16e161d) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
- Sep 23, 2014
-
-
Mubarak Seyed authored
This is a refactored version of the original PR https://github.com/apache/spark/pull/1723 by mubarak. Please take a look, andrewor14 and mubarak. Author: Mubarak Seyed <mubarak.seyed@gmail.com> Author: Tathagata Das <tathagata.das1565@gmail.com> Closes #2464 from tdas/streaming-callsite and squashes the following commits: dc54c71 [Tathagata Das] Made changes based on PR comments. 390b45d [Tathagata Das] Fixed minor bugs. 904cd92 [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into streaming-callsite 7baa427 [Tathagata Das] Refactored getCallSite and setCallSite to make it simpler. Also added unit test for DStream creation site. b9ed945 [Mubarak Seyed] Adding streaming utils c461cf4 [Mubarak Seyed] Merge remote-tracking branch 'upstream/master' ceb43da [Mubarak Seyed] Changing default regex function name 8c5d443 [Mubarak Seyed] Merge remote-tracking branch 'upstream/master' 196121b [Mubarak Seyed] Merge remote-tracking branch 'upstream/master' 491a1eb [Mubarak Seyed] Removing streaming visibility from getRDDCreationCallSite in DStream 33a7295 [Mubarak Seyed] Fixing review comments: Merging both setCallSite methods c26d933 [Mubarak Seyed] Merge remote-tracking branch 'upstream/master' f51fd9f [Mubarak Seyed] Fixing scalastyle, Regex for Utils.getCallSite, and changing method names in DStream 5051c58 [Mubarak Seyed] Getting return value of compute() into variable and call setCallSite(prevCallSite) only once. 
Adding return for other code paths (for None) a207eb7 [Mubarak Seyed] Fixing code review comments ccde038 [Mubarak Seyed] Removing Utils import from MappedDStream 2a09ad6 [Mubarak Seyed] Changes in Utils.scala for SPARK-1853 1d90cc3 [Mubarak Seyed] Changes for SPARK-1853 5f3105a [Mubarak Seyed] Merge remote-tracking branch 'upstream/master' 70f494f [Mubarak Seyed] Changes for SPARK-1853 1500deb [Mubarak Seyed] Changes in Spark Streaming UI 9d38d3c [Mubarak Seyed] [SPARK-1853] Show Streaming application code context (file, line number) in Spark Stages UI d466d75 [Mubarak Seyed] Changes for spark streaming UI (cherry picked from commit 729952a5) Signed-off-by:
Andrew Or <andrewor14@gmail.com>
-
Andrew Or authored
`SPARK_DRIVER_MEMORY` was only used to start the `SparkSubmit` JVM, which becomes the driver only in client mode but not cluster mode. In cluster mode, this property is simply not propagated to the worker nodes. `SPARK_EXECUTOR_MEMORY` is picked up from `SparkContext`, but in cluster mode the driver runs on one of the worker machines, where this environment variable may not be set. Author: Andrew Or <andrewor14@gmail.com> Closes #2500 from andrewor14/memory-env-vars and squashes the following commits: 6217b38 [Andrew Or] Respect SPARK_*_MEMORY for cluster mode Conflicts: core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala
-
Sandy Ryza authored
...the driver Author: Sandy Ryza <sandy@cloudera.com> Closes #2487 from sryza/sandy-spark-3612 and squashes the following commits: 2b7353d [Sandy Ryza] SPARK-3612. Executor shouldn't quit if heartbeat message fails to reach the driver (cherry picked from commit d79238d0) Signed-off-by:
Patrick Wendell <pwendell@gmail.com>
-
- Sep 22, 2014
-
-
Grega Kespret authored
Author: Grega Kespret <grega.kespret@gmail.com> Closes #2479 from gregakespret/patch-1 and squashes the following commits: dd6b90a [Grega Kespret] Update docs to use jsonRDD instead of wrong jsonRdd. (cherry picked from commit 56dae30c) Signed-off-by:
Michael Armbrust <michael@databricks.com>
-
RJ Nowling authored
Author: RJ Nowling <rnowling@gmail.com> Closes #2459 from rnowling/tfidf-fix and squashes the following commits: b370a91 [RJ Nowling] Fix variable name misspelling in MLLib Feature Extraction guide (cherry picked from commit fec92155) Signed-off-by:
Xiangrui Meng <meng@databricks.com>
-
- Sep 21, 2014
-
-
Patrick Wendell authored
This reverts commit 7a766577. [NOTE: After some thought I decided not to merge this into 1.1 quite yet]
-