  1. Apr 23, 2014
  2. Apr 22, 2014
    • SPARK-1562 Fix visibility / annotation of Spark SQL APIs · aa77f8a6
      Michael Armbrust authored
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #489 from marmbrus/sqlDocFixes and squashes the following commits:
      
      acee4f3 [Michael Armbrust] Fix visibility / annotation of Spark SQL APIs
    • [FIX: SPARK-1376] use --arg instead of --args in SparkSubmit to avoid warning messages · 662c860e
      Xiangrui Meng authored
      Even if users use `--arg`, `SparkSubmit` still uses `--args` for child args internally, which triggers a warning message that may confuse users:
      
      ~~~
      --args is deprecated. Use --arg instead.
      ~~~
      
      @sryza Does it look good to you?
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #485 from mengxr/submit-arg and squashes the following commits:
      
      5e1b9fe [Xiangrui Meng] update test
      cebbeb7 [Xiangrui Meng] use --arg instead of --args in SparkSubmit to avoid warning messages
    • [streaming][SPARK-1578] Removed requirement for TTL in StreamingContext. · f3d19a9f
      Tathagata Das authored
      Since shuffles and RDDs that are out of context are automatically cleaned by Spark core (using ContextCleaner) there is no need for setting the cleaner TTL while creating a StreamingContext.
      
      Author: Tathagata Das <tathagata.das1565@gmail.com>
      
      Closes #491 from tdas/ttl-fix and squashes the following commits:
      
      cf01dc7 [Tathagata Das] Removed requirement for TTL in StreamingContext.
    • [Spark-1538] Fix SparkUI incorrectly hiding persisted RDDs · 2de57387
      Andrew Or authored
      **Bug**: After the following command `sc.parallelize(1 to 1000).persist.map(_ + 1).count()` is run, the persisted RDD is missing from the storage tab of the SparkUI.
      
      **Cause**: The command creates two RDDs in one stage, a `ParallelCollectionRDD` and a `MappedRDD`. However, the existing StageInfo only keeps the RDDInfo of the last RDD associated with the stage (`MappedRDD`), and so all RDD information regarding the first RDD (`ParallelCollectionRDD`) is discarded. In this case, we persist the first RDD,  but the StorageTab doesn't know about this RDD because it is not encoded in the StageInfo.
      
      **Fix**: Record information of all RDDs in StageInfo, instead of just the last RDD (i.e. `stage.rdd`). Since stage boundaries are marked by shuffle dependencies, the solution is to traverse the last RDD's dependency tree, visiting only ancestor RDDs related through a sequence of narrow dependencies.
      
      ---
      
      This PR also moves RDDInfo to its own file, includes a few style fixes, and adds a unit test for constructing StageInfos.
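The traversal described in the fix can be sketched as follows. This is a hedged, illustrative Python stand-in for the Scala `getNarrowAncestors` helper; the class and field names are invented for the sketch, and the visited-set guard mirrors the cycle handling mentioned in the commit list:

```python
class RDD:
    def __init__(self, name, narrow_parents=(), shuffle_parents=()):
        self.name = name
        self.narrow_parents = list(narrow_parents)    # narrow dependencies
        self.shuffle_parents = list(shuffle_parents)  # shuffle deps mark stage boundaries

def narrow_ancestors(rdd):
    """Names of all ancestors reachable through narrow dependencies only,
    i.e. every RDD belonging to the same stage as `rdd`."""
    seen, stack = set(), list(rdd.narrow_parents)
    while stack:
        parent = stack.pop()
        if parent.name not in seen:        # guard against dependency cycles
            seen.add(parent.name)
            stack.extend(parent.narrow_parents)
    return seen

# sc.parallelize(...).persist.map(...): two RDDs in one stage
parallel = RDD("ParallelCollectionRDD")
mapped = RDD("MappedRDD", narrow_parents=[parallel])
print(narrow_ancestors(mapped))  # the persisted first RDD is now visible
```

With this, StageInfo can record both RDDs instead of only `stage.rdd`, so the StorageTab sees the persisted `ParallelCollectionRDD`.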
      
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #469 from andrewor14/storage-ui-fix and squashes the following commits:
      
      07fc7f0 [Andrew Or] Add back comment that was accidentally removed (minor)
      5d799fe [Andrew Or] Add comment to justify testing of getNarrowAncestors with cycles
      9d0e2b8 [Andrew Or] Hide details of getNarrowAncestors from outsiders
      d2bac8a [Andrew Or] Deal with cycles in RDD dependency graph + add extensive tests
      2acb177 [Andrew Or] Move getNarrowAncestors to RDD.scala
      bfe83f0 [Andrew Or] Backtrace RDD dependency tree to find all RDDs that belong to a Stage
    • Assorted clean-up for Spark-on-YARN. · 995fdc96
      Patrick Wendell authored
      In particular when the HADOOP_CONF_DIR is not specified.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #488 from pwendell/hadoop-cleanup and squashes the following commits:
      
      fe95f13 [Patrick Wendell] Changes based on Andrew's feeback
      18d09c1 [Patrick Wendell] Review comments from Andrew
      17929cc [Patrick Wendell] Assorted clean-up for Spark-on-YARN.
    • [SPARK-1570] Fix classloading in JavaSQLContext.applySchema · ea8cea82
      Kan Zhang authored
      I think I hit a class loading issue when running JavaSparkSQL example using spark-submit in local mode.
      
      Author: Kan Zhang <kzhang@apache.org>
      
      Closes #484 from kanzhang/SPARK-1570 and squashes the following commits:
      
      feaaeba [Kan Zhang] [SPARK-1570] Fix classloading in JavaSQLContext.applySchema
    • Fix compilation on Hadoop 2.4.x. · 0ea0b1a2
      Marcelo Vanzin authored
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #483 from vanzin/yarn-2.4 and squashes the following commits:
      
      0fc57d8 [Marcelo Vanzin] Fix compilation on Hadoop 2.4.x.
    • [Fix #204] Eliminate delay between binding and log checking · 745e496c
      Andrew Or authored
      **Bug**: In the existing history server, there is a `spark.history.updateInterval` seconds delay before application logs show up on the UI.
      
      **Cause**: This is because the following events happen in this order: (1) The background thread that checks for logs starts, but realizes the server has not yet bound and so waits for N seconds, (2) server binds, (3) N seconds later the background thread finds that the server has finally bound to a port, and so finally checks for application logs.
      
      **Fix**: This PR forces the log checking thread to start immediately after binding. It also documents two relevant environment variables that are currently missing.
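The ordering problem and its fix can be sketched with an event instead of polling. This is a generic illustration of the "signal the checker when binding completes" pattern, not the HistoryServer's actual code:

```python
import threading

bound = threading.Event()
checked = []

def check_for_logs():
    checked.append("scanned log directory")

def log_checker():
    # Before the fix: poll every `spark.history.updateInterval` seconds,
    # missing the moment the server binds by up to one full interval.
    # After: block until binding is signalled, then check immediately.
    bound.wait()
    check_for_logs()

t = threading.Thread(target=log_checker)
t.start()
bound.set()      # server binds; the checker wakes with no N-second delay
t.join()
print(checked)
```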
      
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #441 from andrewor14/history-server-fix and squashes the following commits:
      
      b2eb46e [Andrew Or] Document SPARK_PUBLIC_DNS and SPARK_HISTORY_OPTS for the history server
      e8d1fbc [Andrew Or] Eliminate delay between binding and checking for logs
    • [SPARK-1506][MLLIB] Documentation improvements for MLlib 1.0 · 26d35f3f
      Xiangrui Meng authored
      Preview: http://54.82.240.23:4000/mllib-guide.html
      
      Table of contents:
      
      * Basics
        * Data types
        * Summary statistics
      * Classification and regression
        * linear support vector machine (SVM)
        * logistic regression
        * linear least squares, Lasso, and ridge regression
        * decision tree
        * naive Bayes
      * Collaborative Filtering
        * alternating least squares (ALS)
      * Clustering
        * k-means
      * Dimensionality reduction
        * singular value decomposition (SVD)
        * principal component analysis (PCA)
      * Optimization
        * stochastic gradient descent
        * limited-memory BFGS (L-BFGS)
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #422 from mengxr/mllib-doc and squashes the following commits:
      
      944e3a9 [Xiangrui Meng] merge master
      f9fda28 [Xiangrui Meng] minor
      9474065 [Xiangrui Meng] add alpha to ALS examples
      928e630 [Xiangrui Meng] initialization_mode -> initializationMode
      5bbff49 [Xiangrui Meng] add imports to labeled point examples
      c17440d [Xiangrui Meng] fix python nb example
      28f40dc [Xiangrui Meng] remove localhost:4000
      369a4d3 [Xiangrui Meng] Merge branch 'master' into mllib-doc
      7dc95cc [Xiangrui Meng] update linear methods
      053ad8a [Xiangrui Meng] add links to go back to the main page
      abbbf7e [Xiangrui Meng] update ALS argument names
      648283e [Xiangrui Meng] level down statistics
      14e2287 [Xiangrui Meng] add sample libsvm data and use it in guide
      8cd2441 [Xiangrui Meng] minor updates
      186ab07 [Xiangrui Meng] update section names
      6568d65 [Xiangrui Meng] update toc, level up lr and svm
      162ee12 [Xiangrui Meng] rename section names
      5c1e1b1 [Xiangrui Meng] minor
      8aeaba1 [Xiangrui Meng] wrap long lines
      6ce6a6f [Xiangrui Meng] add summary statistics to toc
      5760045 [Xiangrui Meng] claim beta
      cc604bf [Xiangrui Meng] remove classification and regression
      92747b3 [Xiangrui Meng] make section titles consistent
      e605dd6 [Xiangrui Meng] add LIBSVM loader
      f639674 [Xiangrui Meng] add python section to migration guide
      c82ffb4 [Xiangrui Meng] clean optimization
      31660eb [Xiangrui Meng] update linear algebra and stat
      0a40837 [Xiangrui Meng] first pass over linear methods
      1fc8271 [Xiangrui Meng] update toc
      906ed0a [Xiangrui Meng] add a python example to naive bayes
      5f0a700 [Xiangrui Meng] update collaborative filtering
      656d416 [Xiangrui Meng] update mllib-clustering
      86e143a [Xiangrui Meng] remove data types section from main page
      8d1a128 [Xiangrui Meng] move part of linear algebra to data types and add Java/Python examples
      d1b5cbf [Xiangrui Meng] merge master
      72e4804 [Xiangrui Meng] one pass over tree guide
      64f8995 [Xiangrui Meng] move decision tree guide to a separate file
      9fca001 [Xiangrui Meng] add first version of linear algebra guide
      53c9552 [Xiangrui Meng] update dependencies
      f316ec2 [Xiangrui Meng] add migration guide
      f399f6c [Xiangrui Meng] move linear-algebra to dimensionality-reduction
      182460f [Xiangrui Meng] add guide for naive Bayes
      137fd1d [Xiangrui Meng] re-organize toc
      a61e434 [Xiangrui Meng] update mllib's toc
    • [SPARK-1281] Improve partitioning in ALS · bf9d49b6
      Tor Myklebust authored
      ALS was using HashPartitioner and explicit uses of `%` together.  Further, the naked use of `%` meant that, if the number of partitions corresponded with the stride of arithmetic progressions appearing in user and product ids, users and products could be mapped into buckets in an unfair or unwise way.
      
      This pull request:
      1) Makes the Partitioner an instance variable of ALS.
      2) Replaces the direct uses of `%` with calls to a Partitioner.
      3) Defines an anonymous Partitioner that scrambles the bits of the object's hashCode before reducing to the number of present buckets.
      
      This pull request does not make the partitioner user-configurable.
      
      I'm not all that happy about the way I did (1).  It introduces an icky lifetime issue and dances around it by nulling something.  However, I don't know a better way to make the partitioner visible everywhere it needs to be visible.
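The hazard of the naked `%` is easy to demonstrate: when the ids form an arithmetic progression whose stride shares a factor with the partition count, everything collapses into a few buckets. A minimal Python sketch follows; the multiply-and-xor scrambler is an illustrative stand-in, not the exact hash used in the PR:

```python
def naive_bucket(item_id, num_buckets):
    # Direct use of `%`: strided ids pile into few buckets.
    return item_id % num_buckets

def scrambled_bucket(item_id, num_buckets):
    # Mix the bits before reducing modulo the bucket count
    # (Fibonacci-hash-style multiplier; illustrative only).
    h = (item_id * 0x9E3779B97F4A7C15) & 0xFFFFFFFFFFFFFFFF
    h ^= h >> 31
    return h % num_buckets

ids = range(0, 800, 8)                 # user ids with stride 8, 8 partitions
naive = {naive_bucket(i, 8) for i in ids}
scrambled = {scrambled_bucket(i, 8) for i in ids}
print(len(naive))      # 1 -- every id lands in the same bucket
print(len(scrambled))  # several buckets are used
```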
      
      Author: Tor Myklebust <tmyklebu@gmail.com>
      
      Closes #407 from tmyklebu/master and squashes the following commits:
      
      dcf583a [Tor Myklebust] Remove the partitioner member variable; instead, thread that needle everywhere it needs to go.
      23d6f91 [Tor Myklebust] Stop making the partitioner configurable.
      495784f [Tor Myklebust] Merge branch 'master' of https://github.com/apache/spark
      674933a [Tor Myklebust] Fix style.
      40edc23 [Tor Myklebust] Fix missing space.
      f841345 [Tor Myklebust] Fix daft bug creating 'pairs', also for -> foreach.
      5ec9e6c [Tor Myklebust] Clean a couple of things up using 'map'.
      36a0f43 [Tor Myklebust] Make the partitioner private.
      d872b09 [Tor Myklebust] Add negative id ALS test.
      df27697 [Tor Myklebust] Support custom partitioners.  Currently we use the same partitioner for users and products.
      c90b6d8 [Tor Myklebust] Scramble user and product ids before bucketing.
      c774d7d [Tor Myklebust] Make the partitioner a member variable and use it instead of modding directly.
    • fix bugs of dot in python · c919798f
      Xusen Yin authored
      Without a `transpose()` on `self.theta`, a
      
      *ValueError: matrices are not aligned*
      
      occurs. The former test case simply missed this situation.
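The shape mismatch is easy to reproduce with plain NumPy. The shapes below are hypothetical stand-ins for the naive Bayes model (`theta` plays the role of `self.theta`, a class-by-feature matrix); the exact error wording varies across NumPy versions:

```python
import numpy as np

theta = np.matrix(np.zeros((2, 4)))   # 2 classes x 4 features (stand-in for self.theta)
x = np.matrix(np.ones((1, 4)))        # one sample as a row vector

try:
    x * theta                         # (1,4) x (2,4): inner dimensions disagree
    aligned = True
except ValueError:                    # "matrices are not aligned" on older NumPy
    aligned = False

scores = x * theta.T                  # (1,4) x (4,2) -> (1,2): aligned
print(aligned, scores.shape)
```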
      
      Author: Xusen Yin <yinxusen@gmail.com>
      
      Closes #463 from yinxusen/python-naive-bayes and squashes the following commits:
      
      fcbe3bc [Xusen Yin] fix bugs of dot in python
    • [SPARK-1560]: Updated Pyrolite Dependency to be Java 6 compatible · 0f87e6ad
      Ahir Reddy authored
      Changed the Pyrolite dependency to a build which targets Java 6.
      
      Author: Ahir Reddy <ahirreddy@gmail.com>
      
      Closes #479 from ahirreddy/java6-pyrolite and squashes the following commits:
      
      8ea25d3 [Ahir Reddy] Updated maven build to use java 6 compatible pyrolite
      dabc703 [Ahir Reddy] Updated Pyrolite dependency to be Java 6 compatible
    • [HOTFIX] SPARK-1399: remove outdated comments · 87de2908
      CodingCat authored
      As the original PR was merged before this mistake was found, the fix lands here.
      
      Sorry about that @pwendell, @andrewor14, I will be more careful next time
      
      Author: CodingCat <zhunansjtu@gmail.com>
      
      Closes #474 from CodingCat/hotfix_1399 and squashes the following commits:
      
      f3a8ba9 [CodingCat] move outdated comments
    • SPARK-1496: Have jarOfClass return Option[String] · 83084d3b
      Patrick Wendell authored
      A simple change, mostly had to change a bunch of example code.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #438 from pwendell/jar-of-class and squashes the following commits:
      
      aa010ff [Patrick Wendell] SPARK-1496: Have jarOfClass return Option[String]
    • [SPARK-1459] Use local path (and not complete URL) when opening local log file. · ac164b79
      Marcelo Vanzin authored
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #375 from vanzin/event-file and squashes the following commits:
      
      f673029 [Marcelo Vanzin] [SPARK-1459] Use local path (and not complete URL) when opening local log file.
    • [Fix #274] Document + fix annotation usages · b3e5366f
      Andrew Or authored
      ... so that we don't follow an unspoken set of forbidden rules for adding **@AlphaComponent**, **@DeveloperApi**, and **@Experimental** annotations in the code.
      
      In addition, this PR
      (1) removes unnecessary `:: * ::` tags,
      (2) adds missing `:: * ::` tags, and
      (3) removes annotations for internal APIs.
      
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #470 from andrewor14/annotations-fix and squashes the following commits:
      
      92a7f42 [Andrew Or] Document + fix annotation usages
  3. Apr 21, 2014
    • [SPARK-1439, SPARK-1440] Generate unified Scaladoc across projects and Javadocs · fc783847
      Matei Zaharia authored
      I used the sbt-unidoc plugin (https://github.com/sbt/sbt-unidoc) to create a unified Scaladoc of our public packages, and generate Javadocs as well. One limitation is that I haven't found an easy way to exclude packages in the Javadoc; there is a SBT task that identifies Java sources to run javadoc on, but it's been very difficult to modify it from outside to change what is set in the unidoc package. Some SBT-savvy people should help with this. The Javadoc site also lacks package-level descriptions and things like that, so we may want to look into that. We may decide not to post these right now if it's too limited compared to the Scala one.
      
      Example of the built doc site: http://people.csail.mit.edu/matei/spark-unified-docs/
      
      Author: Matei Zaharia <matei@databricks.com>
      
      This patch had conflicts when merged, resolved by
      Committer: Patrick Wendell <pwendell@gmail.com>
      
      Closes #457 from mateiz/better-docs and squashes the following commits:
      
      a63d4a3 [Matei Zaharia] Skip Java/Scala API docs for Python package
      5ea1f43 [Matei Zaharia] Fix links to Java classes in Java guide, fix some JS for scrolling to anchors on page load
      f05abc0 [Matei Zaharia] Don't include java.lang package names
      995e992 [Matei Zaharia] Skip internal packages and class names with $ in JavaDoc
      a14a93c [Matei Zaharia] typo
      76ce64d [Matei Zaharia] Add groups to Javadoc index page, and a first package-info.java
      ed6f994 [Matei Zaharia] Generate JavaDoc as well, add titles, update doc site to use unified docs
      acb993d [Matei Zaharia] Add Unidoc plugin for the projects we want Unidoced
    • [SPARK-1332] Improve Spark Streaming's Network Receiver and InputDStream API [WIP] · 04c37b6f
      Tathagata Das authored
      The current Network Receiver API makes it slightly complicated to write a new receiver, as one needs to create an instance of BlockGenerator as shown in SocketReceiver
      https://github.com/apache/spark/blob/master/streaming/src/main/scala/org/apache/spark/streaming/dstream/SocketInputDStream.scala#L51
      
      Exposing the BlockGenerator interface has made it harder to improve the receiving process. The API of NetworkReceiver (which was not a very stable API anyway) needs to be changed if we are to ensure future stability.
      
      Additionally, the functions like streamingContext.socketStream that create input streams, return DStream objects. That makes it hard to expose functionality (say, rate limits) unique to input dstreams. They should return InputDStream or NetworkInputDStream. This is still not yet implemented.
      
      This PR is blocked on the graceful shutdown PR #247
      
      Author: Tathagata Das <tathagata.das1565@gmail.com>
      
      Closes #300 from tdas/network-receiver-api and squashes the following commits:
      
      ea27b38 [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into network-receiver-api
      3a4777c [Tathagata Das] Renamed NetworkInputDStream to ReceiverInputDStream, and ActorReceiver related stuff.
      838dd39 [Tathagata Das] Added more events to the StreamingListener to report errors and stopped receivers.
      a75c7a6 [Tathagata Das] Address some PR comments and fixed other issues.
      91bfa72 [Tathagata Das] Fixed bugs.
      8533094 [Tathagata Das] Scala style fixes.
      028bde6 [Tathagata Das] Further refactored receiver to allow restarting of a receiver.
      43f5290 [Tathagata Das] Made functions that create input streams return InputDStream and NetworkInputDStream, for both Scala and Java.
      2c94579 [Tathagata Das] Fixed graceful shutdown by removing interrupts on receiving thread.
      9e37a0b [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into network-receiver-api
      3223e95 [Tathagata Das] Refactored the code that runs the NetworkReceiver into further classes and traits to make them more testable.
      a36cc48 [Tathagata Das] Refactored the NetworkReceiver API for future stability.
    • Dev script: include RC name in git tag · 5a5b3346
      Patrick Wendell authored
    • SPARK-1399: show stage failure reason in UI · 43e4a29d
      CodingCat authored
      https://issues.apache.org/jira/browse/SPARK-1399
      
      refactor StageTable a bit to support additional column for failed stage
      
      Author: CodingCat <zhunansjtu@gmail.com>
      Author: Nan Zhu <CodingCat@users.noreply.github.com>
      
      Closes #421 from CodingCat/SPARK-1399 and squashes the following commits:
      
      2caba36 [CodingCat] remove dummy tag
      77cf305 [CodingCat] create dummy element to wrap columns
      3989ce2 [CodingCat] address Aaron's comments
      18fc09f [Nan Zhu] fix compile error
      00ea30a [Nan Zhu] address Kay's comments
      16ac83d [CodingCat] set a default value of failureReason
      35df3df [CodingCat] address andrew's comments
      06d21a4 [CodingCat] address andrew's comments
      25a6db6 [CodingCat] style fix
      dc8856d [CodingCat] show stage failure reason in UI
    • SPARK-1539: RDDPage.scala contains RddPage class · b7df31eb
      Xiangrui Meng authored
      SPARK-1386 changed RDDPage to RddPage but didn't change the filename. I tried sbt/sbt publish-local. Inside the spark-core jar, the unit name is RDDPage.class and hence I got the following error:
      
      ~~~
      [error] (run-main) java.lang.NoClassDefFoundError: org/apache/spark/ui/storage/RddPage
      java.lang.NoClassDefFoundError: org/apache/spark/ui/storage/RddPage
      	at org.apache.spark.ui.SparkUI.initialize(SparkUI.scala:59)
      	at org.apache.spark.ui.SparkUI.<init>(SparkUI.scala:52)
      	at org.apache.spark.ui.SparkUI.<init>(SparkUI.scala:42)
      	at org.apache.spark.SparkContext.<init>(SparkContext.scala:215)
      	at MovieLensALS$.main(MovieLensALS.scala:38)
      	at MovieLensALS.main(MovieLensALS.scala)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      Caused by: java.lang.ClassNotFoundException: org.apache.spark.ui.storage.RddPage
      	at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
      	at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
      	at java.security.AccessController.doPrivileged(Native Method)
      	at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
      	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
      	at org.apache.spark.ui.SparkUI.initialize(SparkUI.scala:59)
      	at org.apache.spark.ui.SparkUI.<init>(SparkUI.scala:52)
      	at org.apache.spark.ui.SparkUI.<init>(SparkUI.scala:42)
      	at org.apache.spark.SparkContext.<init>(SparkContext.scala:215)
      	at MovieLensALS$.main(MovieLensALS.scala:38)
      	at MovieLensALS.main(MovieLensALS.scala)
      	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
      	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
      	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
      	at java.lang.reflect.Method.invoke(Method.java:606)
      ~~~
      
      This can be fixed after renaming RddPage to RDDPage, or renaming RDDPage.scala to RddPage.scala. I chose the former since the name `RDD` is common in Spark code.
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #454 from mengxr/rddpage-fix and squashes the following commits:
      
      f75e544 [Xiangrui Meng] rename RddPage to RDDPage
    • [Hot Fix] Ignore org.apache.spark.ui.UISuite tests · af46f1fd
      Andrew Or authored
      #446 faced a connection refused exception from these tests, causing them to time out and fail after a long time. For now, let's disable these tests.
      
      (We recently disabled the corresponding test in streaming in 7863ecca. These tests are very similar).
      
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #466 from andrewor14/ignore-ui-tests and squashes the following commits:
      
      6f5a362 [Andrew Or] Ignore org.apache.spark.ui.UISuite tests
    • Clean up and simplify Spark configuration · fb98488f
      Patrick Wendell authored
      Over time, as we've added more deployment modes, things have gotten a bit unwieldy with user-facing configuration options in Spark. Going forward we'll advise all users to run `spark-submit` to launch applications. This is a WIP patch but it makes the following improvements:
      
      1. Improved `spark-env.sh.template` which was missing a lot of things users now set in that file.
      2. Removes the shipping of SPARK_CLASSPATH, SPARK_JAVA_OPTS, and SPARK_LIBRARY_PATH to the executors on the cluster. This was an ugly hack. Instead it introduces config variables spark.executor.extraJavaOpts, spark.executor.extraLibraryPath, and spark.executor.extraClassPath.
      3. Adds ability to set these same variables for the driver using `spark-submit`.
      4. Allows you to load system properties from a `spark-defaults.conf` file when running `spark-submit`. This will allow setting both SparkConf options and other system properties utilized by `spark-submit`.
      5. Made `SPARK_LOCAL_IP` an environment variable rather than a SparkConf property. This is more consistent with it being set on each node.
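Item 4's behaviour, loading key-value defaults from a file and letting explicit command-line flags win, can be sketched as follows. The file format and precedence shown here are assumptions for illustration, not the exact `spark-submit` parser:

```python
def load_defaults(text):
    """Parse a properties-style defaults file: one `key value` (or
    `key=value`) pair per line; '#' starts a comment line."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.replace("=", " ", 1).partition(" ")
        props[key] = value.strip()
    return props

defaults = load_defaults("""
# spark-defaults.conf (illustrative contents)
spark.executor.extraJavaOptions  -XX:+PrintGCDetails
spark.executor.extraClassPath    /opt/libs/*
""")

# Options given on the spark-submit command line override file defaults.
cli = {"spark.executor.extraClassPath": "/custom/*"}
conf = {**defaults, **cli}
print(conf["spark.executor.extraClassPath"])  # /custom/*
```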
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #299 from pwendell/config-cleanup and squashes the following commits:
      
      127f301 [Patrick Wendell] Improvements to testing
      a006464 [Patrick Wendell] Moving properties file template.
      b4b496c [Patrick Wendell] spark-defaults.properties -> spark-defaults.conf
      0086939 [Patrick Wendell] Minor style fixes
      af09e3e [Patrick Wendell] Mention config file in docs and clean-up docs
      b16e6a2 [Patrick Wendell] Cleanup of spark-submit script and Scala quick start guide
      af0adf7 [Patrick Wendell] Automatically add user jar
      a56b125 [Patrick Wendell] Responses to Tom's review
      d50c388 [Patrick Wendell] Merge remote-tracking branch 'apache/master' into config-cleanup
      a762901 [Patrick Wendell] Fixing test failures
      ffa00fe [Patrick Wendell] Review feedback
      fda0301 [Patrick Wendell] Note
      308f1f6 [Patrick Wendell] Properly escape quotes and other clean-up for YARN
      e83cd8f [Patrick Wendell] Changes to allow re-use of test applications
      be42f35 [Patrick Wendell] Handle case where SPARK_HOME is not set
      c2a2909 [Patrick Wendell] Test compile fixes
      4ee6f9d [Patrick Wendell] Making YARN doc changes consistent
      afc9ed8 [Patrick Wendell] Cleaning up line limits and two compile errors.
      b08893b [Patrick Wendell] Additional improvements.
      ace4ead [Patrick Wendell] Responses to review feedback.
      b72d183 [Patrick Wendell] Review feedback for spark env file
      46555c1 [Patrick Wendell] Review feedback and import clean-ups
      437aed1 [Patrick Wendell] Small fix
      761ebcd [Patrick Wendell] Library path and classpath for drivers
      7cc70e4 [Patrick Wendell] Clean up terminology inside of spark-env script
      5b0ba8e [Patrick Wendell] Don't ship executor envs
      84cc5e5 [Patrick Wendell] Small clean-up
      1f75238 [Patrick Wendell] SPARK_JAVA_OPTS --> SPARK_MASTER_OPTS for master settings
      4982331 [Patrick Wendell] Remove SPARK_LIBRARY_PATH
      6eaf7d0 [Patrick Wendell] executorJavaOpts
      0faa3b6 [Patrick Wendell] Stash of adding config options in submit script and YARN
      ac2d65e [Patrick Wendell] Change spark.local.dir -> SPARK_LOCAL_DIRS
  4. Apr 19, 2014
    • REPL cleanup. · 3a390bfd
      Michael Armbrust authored
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #451 from marmbrus/replCleanup and squashes the following commits:
      
      088526a [Michael Armbrust] REPL cleanup.
    • [SPARK-1535] ALS: Avoid the garbage-creating ctor of DoubleMatrix · 25fc3188
      Tor Myklebust authored
      `new DoubleMatrix(double[])` creates a garbage `double[]` of the same length as its argument and immediately throws it away.  This pull request avoids that constructor in the ALS code.
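The same copy-vs-wrap distinction exists in NumPy, which makes a convenient illustration of why a wrapping helper matters in a hot loop. This is an analogy only, not the jblas API:

```python
import numpy as np

data = [1.0, 2.0, 3.0]
arr = np.array(data)

copied = np.array(arr)     # allocates and fills a fresh buffer
wrapped = np.asarray(arr)  # reuses the existing buffer: no throwaway allocation

print(copied is arr)       # False -- a new object backed by new storage
print(wrapped is arr)      # True  -- the very same array object
```

In the PR, a small helper that wraps an `Array[Double]` in a `DoubleMatrix` without copying plays the role of `np.asarray` here.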
      
      Author: Tor Myklebust <tmyklebu@gmail.com>
      
      Closes #442 from tmyklebu/foo2 and squashes the following commits:
      
      2784fc5 [Tor Myklebust] Mention that this is probably fixed as of jblas 1.2.4; repunctuate.
      a09904f [Tor Myklebust] Helper function for wrapping Array[Double]'s with DoubleMatrix's.
    • Add insertInto and saveAsTable to Python API. · 10d04213
      Michael Armbrust authored
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #447 from marmbrus/pythonInsert and squashes the following commits:
      
      c7ab692 [Michael Armbrust] Keep docstrings < 72 chars.
      ff62870 [Michael Armbrust] Add insertInto and saveAsTable to Python API.
    • Use scala deprecation instead of java. · 5d0f58b2
      Michael Armbrust authored
      This gets rid of a warning when compiling core (since we were depending on a deprecated interface with a non-deprecated function).  I also tested with javac, and this does the right thing when compiling java code.
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #452 from marmbrus/scalaDeprecation and squashes the following commits:
      
      f628b4d [Michael Armbrust] Use scala deprecation instead of java.
    • README update · 28238c81
      Reynold Xin authored
      Author: Reynold Xin <rxin@apache.org>
      
      Closes #443 from rxin/readme and squashes the following commits:
      
      16853de [Reynold Xin] Updated SBT and Scala instructions.
      3ac3ceb [Reynold Xin] README update
  5. Apr 18, 2014
  6. Apr 17, 2014