  1. Apr 01, 2014
    • [SPARK-1342] Scala 2.10.4 · 764353d2
      Mark Hamstra authored
      Just a Scala version increment
      
      Author: Mark Hamstra <markhamstra@gmail.com>
      
      Closes #259 from markhamstra/scala-2.10.4 and squashes the following commits:
      
      fbec547 [Mark Hamstra] [SPARK-1342] Bumped Scala version to 2.10.4
    • [SQL] SPARK-1372 Support for caching and uncaching tables in a SQLContext. · f5c418da
      Michael Armbrust authored
      This doesn't yet support different databases in Hive (though you can probably work around this by calling `USE <dbname>`). However, given the time constraints for 1.0, I think it's probably worth including this now and extending the functionality in the next release.
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #282 from marmbrus/cacheTables and squashes the following commits:
      
      83785db [Michael Armbrust] Support for caching and uncaching tables in a SQLContext.
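
      For context, a minimal sketch of the usage this adds, assuming a SparkContext `sc` and a table already registered under a hypothetical name:

      ```
      import org.apache.spark.sql.SQLContext

      val sqlContext = new SQLContext(sc)  // sc: an existing SparkContext
      // "people" is a hypothetical table name already registered with sqlContext.
      sqlContext.cacheTable("people")      // subsequent reads hit the cached data
      sqlContext.sql("SELECT COUNT(*) FROM people").collect()
      sqlContext.uncacheTable("people")    // release the cached table
      ```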
    • [Hot Fix #42] Persisted RDD disappears on storage page if re-used · ada310a9
      Andrew Or authored
      If a previously persisted RDD is re-used, its information disappears from the Storage page.
      
      This is because the tasks associated with re-using the RDD do not report the RDD's blocks as updated (which is correct). On stage submit, however, we overwrite any existing information for that RDD with a fresh entry, even when information for the RDD is already present.
      
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #281 from andrewor14/ui-storage-fix and squashes the following commits:
      
      408585a [Andrew Or] Fix storage UI bug
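
      The shape of the fix, as a hedged sketch (data structures are illustrative, not the actual UI code): keep existing RDD information instead of unconditionally replacing it on stage submit.

      ```
      // Illustrative only: RDD id -> storage info shown on the page.
      val rddInfos = scala.collection.mutable.Map.empty[Int, String]

      def onStageSubmitted(rddId: Int): Unit = {
        // Before: rddInfos(rddId) = freshInfo(rddId)   // clobbered re-used RDDs
        rddInfos.getOrElseUpdate(rddId, freshInfo(rddId))  // keep what we know
      }

      def freshInfo(rddId: Int): String = s"RDD $rddId: no cached blocks yet"
      ```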
  2. Mar 31, 2014
    • [SPARK-1377] Upgrade Jetty to 8.1.14v20131031 · 94fe7fd4
      Andrew Or authored
      The previous version was 7.6.8v20121106. The main difference between Jetty 7 and Jetty 8 is that the former uses Servlet API 2.5 while the latter uses Servlet API 3.0.
      
      Author: Andrew Or <andrewor14@gmail.com>
      
      Closes #280 from andrewor14/jetty-upgrade and squashes the following commits:
      
      dd57104 [Andrew Or] Merge github.com:apache/spark into jetty-upgrade
      e75fa85 [Andrew Or] Upgrade Jetty to 8.1.14v20131031
    • SPARK-1376. In the yarn-cluster submitter, rename "args" option to "arg" · 564f1c13
      Sandy Ryza authored
      Author: Sandy Ryza <sandy@cloudera.com>
      
      Closes #279 from sryza/sandy-spark-1376 and squashes the following commits:
      
      d8aebfa [Sandy Ryza] SPARK-1376. In the yarn-cluster submitter, rename "args" option to "arg"
    • SPARK-1365 [HOTFIX] Fix RateLimitedOutputStream test · 33b3c2a8
      Patrick Wendell authored
      This test needs to be fixed. It currently depends on Thread.sleep() having exact-timing semantics, which is not a valid assumption.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #277 from pwendell/rate-limited-stream and squashes the following commits:
      
      6c0ff81 [Patrick Wendell] SPARK-1365: Fix RateLimitedOutputStream test
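
      The timing pitfall, illustrated with a self-contained sketch: `Thread.sleep(t)` guarantees at least `t` milliseconds, never exactly `t`, so a test may only assert a lower bound on elapsed time.

      ```
      object SleepTimingDemo extends App {
        val start = System.nanoTime()
        Thread.sleep(100)
        val elapsedMs = (System.nanoTime() - start) / 1e6
        assert(elapsedMs >= 100)      // safe: sleep never returns early
        // assert(elapsedMs == 100)   // flaky: exact-timing semantics don't hold
        println(f"slept for $elapsedMs%.2f ms")
      }
      ```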
    • [SQL] Rewrite join implementation to allow streaming of one relation. · 5731af5b
      Michael Armbrust authored
      Before, we were materializing everything in memory. This also uses the projection interface, so it will be easier to plug in code gen (it's ported from that branch).
      
      @rxin @liancheng
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #250 from marmbrus/hashJoin and squashes the following commits:
      
      1ad873e [Michael Armbrust] Change hasNext logic back to the correct version.
      8e6f2a2 [Michael Armbrust] Review comments.
      1e9fb63 [Michael Armbrust] style
      bc0cb84 [Michael Armbrust] Rewrite join implementation to allow streaming of one relation.
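
      The underlying idea, as a hedged sketch (not the actual Catalyst operator): build a hash table from one relation, then stream the other through it row by row instead of materializing both sides.

      ```
      def hashJoin[K, A, B](build: Iterator[(K, A)],
                            stream: Iterator[(K, B)]): Iterator[(K, (A, B))] = {
        // Materialize only the build side.
        val table = scala.collection.mutable.HashMap.empty[K, List[A]]
        for ((k, a) <- build) table(k) = a :: table.getOrElse(k, Nil)
        // Stream the other side through the table lazily.
        for ((k, b) <- stream; a <- table.getOrElse(k, Nil).iterator) yield (k, (a, b))
      }
      ```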
    • SPARK-1352: Improve robustness of spark-submit script · 841721e0
      Patrick Wendell authored
      1. Better error messages when required arguments are missing.
      2. Support for unit testing cases where presented arguments are invalid.
      3. Bug fix: Only use environment variables when they are set (otherwise they cause an NPE).
      4. A verbose mode to aid debugging.
      5. Visibility of several variables is set to private.
      6. Deprecation warning for existing scripts.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #271 from pwendell/spark-submit and squashes the following commits:
      
      9146def [Patrick Wendell] SPARK-1352: Improve robustness of spark-submit script
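
      Point 3 above, sketched in Scala terms rather than in the script itself: read environment variables as options instead of assuming they are set.

      ```
      // Option(...) turns a missing variable into None instead of a null
      // that later blows up with an NPE.
      val sparkHome: Option[String] = Option(System.getenv("SPARK_HOME"))
      val home = sparkHome.getOrElse(sys.error("SPARK_HOME is not set"))
      ```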
  3. Mar 30, 2014
  4. Mar 29, 2014
    • [SPARK-1186] : Enrich the Spark Shell to support additional arguments. · fda86d8b
      Bernardo Gomez Palacio authored
      Enrich the Spark Shell functionality to support the following options.
      
      ```
      Usage: spark-shell [OPTIONS]
      
      OPTIONS:
          -h  --help              : Print this help information.
          -c  --cores             : The maximum number of cores to be used by the Spark Shell.
          -em --executor-memory   : The memory used by each executor of the Spark Shell, the number
                                    is followed by m for megabytes or g for gigabytes, e.g. "1g".
          -dm --driver-memory     : The memory used by the Spark Shell, the number is followed
                                    by m for megabytes or g for gigabytes, e.g. "1g".
          -m  --master            : A full string that describes the Spark Master, defaults to "local"
                                    e.g. "spark://localhost:7077".
          --log-conf              : Enables logging of the supplied SparkConf as INFO at start of the
                                    Spark Context.
      
      e.g.
          spark-shell -m spark://localhost:7077 -c 4 -dm 512m -em 2g
      ```
      
      **Note**: this commit reflects the changes applied to _master_ based on [5d98cfc1].
      
      [ticket: SPARK-1186] : Enrich the Spark Shell to support additional arguments.
                              https://spark-project.atlassian.net/browse/SPARK-1186
      
      Author      : bernardo.gomezpalacio@gmail.com
      
      Author: Bernardo Gomez Palacio <bernardo.gomezpalacio@gmail.com>
      
      Closes #116 from berngp/feature/enrich-spark-shell and squashes the following commits:
      
      c5f455f [Bernardo Gomez Palacio] [SPARK-1186] : Enrich the Spark Shell to support additional arguments.
    • Implement the RLike & Like in catalyst · af3746ce
      Cheng Hao authored
      This PR includes:
      1) Unify the unit test for expression evaluation
      2) Add implementation of RLike & Like
      
      Author: Cheng Hao <hao.cheng@intel.com>
      
      Closes #224 from chenghao-intel/string_expression and squashes the following commits:
      
      84f72e9 [Cheng Hao] fix bug in RLike/Like & Simplify the unit test
      aeeb1d7 [Cheng Hao] Simplify the implementation/unit test of RLike/Like
      319edb7 [Cheng Hao] change to spark code style
      91cfd33 [Cheng Hao] add implementation for rlike/like
      2c8929e [Cheng Hao] Update the unit test for expression evaluation
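
      SQL `LIKE` semantics, illustrated with a hedged standalone sketch (not Catalyst's implementation): `%` matches any sequence, `_` matches a single character, and everything else is taken literally.

      ```
      def sqlLike(input: String, pattern: String): Boolean = {
        val regex = pattern.flatMap {
          case '%' => ".*"
          case '_' => "."
          case c   => java.util.regex.Pattern.quote(c.toString)
        }
        input.matches(regex)
      }

      sqlLike("spark", "sp%")    // true
      sqlLike("spark", "sp_rk")  // true
      sqlLike("spark", "sp%k_")  // false
      ```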
    • SPARK-1126. spark-app preliminary · 16178160
      Sandy Ryza authored
      This is a starting version of the spark-app script for running compiled binaries against Spark.  It still needs tests and some polish.  The only testing I've done so far has been using it to launch jobs in yarn-standalone mode against a pseudo-distributed cluster.
      
      This leaves out the changes required for launching python scripts.  I think it might be best to save those for another JIRA/PR (while keeping to the design so that they won't require backwards-incompatible changes).
      
      Author: Sandy Ryza <sandy@cloudera.com>
      
      Closes #86 from sryza/sandy-spark-1126 and squashes the following commits:
      
      d428d85 [Sandy Ryza] Commenting, doc, and import fixes from Patrick's comments
      e7315c6 [Sandy Ryza] Fix failing tests
      34de899 [Sandy Ryza] Change --more-jars to --jars and fix docs
      299ddca [Sandy Ryza] Fix scalastyle
      a94c627 [Sandy Ryza] Add newline at end of SparkSubmit
      04bc4e2 [Sandy Ryza] SPARK-1126. spark-submit script
    • SPARK-1345 adding missing dependency on avro for hadoop 0.23 to the new sql pom files · 3738f244
      Thomas Graves authored
      
      Author: Thomas Graves <tgraves@apache.org>
      
      Closes #263 from tgravescs/SPARK-1345 and squashes the following commits:
      
      b43a2a0 [Thomas Graves] SPARK-1345 adding missing dependency on avro for hadoop 0.23 to the new sql pom files
  5. Mar 28, 2014
    • fix path for jar, make sed actually work on OSX · 75d46be5
      Nick Lanham authored
      Author: Nick Lanham <nick@afternight.org>
      
      Closes #264 from nicklan/make-distribution-fixes and squashes the following commits:
      
      172b981 [Nick Lanham] fix path for jar, make sed actually work on OSX
    • SPARK-1096, a space after comment start style checker. · 60abc252
      Prashant Sharma authored
      Author: Prashant Sharma <prashant.s@imaginea.com>
      
      Closes #124 from ScrapCodes/SPARK-1096/scalastyle-comment-check and squashes the following commits:
      
      214135a [Prashant Sharma] Review feedback.
      5eba88c [Prashant Sharma] Fixed style checks for ///+ comments.
      e54b2f8 [Prashant Sharma] improved message, work around.
      83e7144 [Prashant Sharma] removed dependency on scalastyle in plugin, since the scalastyle sbt plugin already depends on the right version. In case we update the plugin we will have to adjust our spark-style project to depend on the right scalastyle version.
      810a1d6 [Prashant Sharma] SPARK-1096, a space after comment style checker.
      ba33193 [Prashant Sharma] scala style as a project
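
      For reference, the style the checker enforces, illustrated (per the commits above, runs of three or more slashes are handled separately):

      ```
      //flagged: no space after the comment marker
      // accepted: a single space after "//"
      /// runs of slashes (///+) get their own handling
      ```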
    • Make sed do -i '' on OSX · 632c3220
      Nick Lanham authored
      I don't have access to an OSX machine, so if someone could test this that would be great.
      
      Author: Nick Lanham <nick@afternight.org>
      
      Closes #258 from nicklan/osx-sed-fix and squashes the following commits:
      
      a6f158f [Nick Lanham] Also make mktemp work on OSX
      558fd6e [Nick Lanham] Make sed do -i '' on OSX
    • [SPARK-1210] Prevent ContextClassLoader of Actor from becoming ClassLoader of Executor. · 3d89043b
      Takuya UESHIN authored
      
      The constructor of `org.apache.spark.executor.Executor` should not set the context class loader of the current thread, which is the backend Actor's thread.
      
      Run the following code in local-mode REPL.
      
      ```
      scala> case class Foo(i: Int)
      scala> val ret = sc.parallelize((1 to 100).map(Foo), 10).collect
      ```
      
      This causes errors as follows:
      
      ```
      ERROR actor.OneForOneStrategy: [L$line5.$read$$iwC$$iwC$$iwC$$iwC$Foo;
      java.lang.ArrayStoreException: [L$line5.$read$$iwC$$iwC$$iwC$$iwC$Foo;
           at scala.runtime.ScalaRunTime$.array_update(ScalaRunTime.scala:88)
           at org.apache.spark.SparkContext$$anonfun$runJob$3.apply(SparkContext.scala:870)
           at org.apache.spark.SparkContext$$anonfun$runJob$3.apply(SparkContext.scala:870)
           at org.apache.spark.scheduler.JobWaiter.taskSucceeded(JobWaiter.scala:56)
           at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:859)
           at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:616)
           at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:207)
           at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
           at akka.actor.ActorCell.invoke(ActorCell.scala:456)
           at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
           at akka.dispatch.Mailbox.run(Mailbox.scala:219)
           at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
           at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
           at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
           at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
           at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
      ```
      
      This is because the class loader used to deserialize the resulting `Foo` instances might be different from the backend Actor's, and the Actor's class loader should be the same as the Driver's.
      
      Author: Takuya UESHIN <ueshin@happy-camper.st>
      
      Closes #15 from ueshin/wip/wrongcontextclassloader and squashes the following commits:
      
      d79e8c0 [Takuya UESHIN] Change a parent class loader of ExecutorURLClassLoader.
      c6c09b6 [Takuya UESHIN] Add a test to collect objects of class defined in repl.
      43e0feb [Takuya UESHIN] Prevent ContextClassLoader of Actor from becoming ClassLoader of Executor.
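
      The general pitfall, as a hedged sketch (names are illustrative): setting the context class loader inside a constructor mutates whichever thread happened to construct the object — here the backend Actor's thread — rather than the thread that actually runs tasks.

      ```
      class Worker(taskLoader: ClassLoader) {
        // Bad: this runs on the *constructing* thread (e.g. an Actor's dispatcher):
        // Thread.currentThread.setContextClassLoader(taskLoader)

        // Better: set the loader on the thread that executes the tasks.
        val runner = new Thread {
          override def run(): Unit = {
            Thread.currentThread.setContextClassLoader(taskLoader)
            // ... deserialize and run tasks using taskLoader ...
          }
        }
      }
      ```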
  6. Mar 27, 2014
    • [SPARK-1268] Adding XOR and AND-NOT operations to spark.util.collection.BitSet · 6f986f0b
      Petko Nikolov authored
      Symmetric difference (xor) in particular is useful for computing some distance metrics (e.g. Hamming). Unit tests added.
      
      Author: Petko Nikolov <nikolov@soundcloud.com>
      
      Closes #172 from petko-nikolov/bitset-imprv and squashes the following commits:
      
      451f28b [Petko Nikolov] fixed style mistakes
      5beba18 [Petko Nikolov] rm outer loop in andNot test
      0e61035 [Petko Nikolov] conform to spark style; rm redundant asserts; more unit tests added; use arraycopy instead of loop
      d53cdb9 [Petko Nikolov] rm incidentally added space
      4e1df43 [Petko Nikolov] adding xor and and-not to BitSet; unit tests added
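
      What the two operations compute, as a hedged sketch over word-packed bits (the real `spark.util.collection.BitSet` stores bits in an `Array[Long]`; this is illustrative, not its code):

      ```
      // Symmetric difference: bits set in exactly one of a, b.
      def xor(a: Array[Long], b: Array[Long]): Array[Long] =
        Array.tabulate(math.max(a.length, b.length)) { i =>
          val wa = if (i < a.length) a(i) else 0L
          val wb = if (i < b.length) b(i) else 0L
          wa ^ wb
        }

      // Bits set in a but not in b.
      def andNot(a: Array[Long], b: Array[Long]): Array[Long] =
        Array.tabulate(a.length)(i => a(i) & ~(if (i < b.length) b(i) else 0L))

      // Hamming distance falls out of xor: count the set bits.
      def hamming(a: Array[Long], b: Array[Long]): Int =
        xor(a, b).map(java.lang.Long.bitCount).sum
      ```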
    • SPARK-1335. Also increase perm gen / code cache for scalatest when invoked via Maven build · 53953d09
      Sean Owen authored
      I am observing build failures when the Maven build reaches tests in the new SQL components (I'm on Java 7 / OSX 10.9). The failure is the usual complaint from Scala that it's out of permgen space or that the JIT is out of code cache space.
      
      I see that various build scripts increase these both for SBT. This change simply adds these settings to scalatest's arguments. Works for me and seems a bit more consistent.
      
      (I also snuck in cures for new build warnings from new scaladoc. Felt too trivial for a new PR, although it's separate. Just something I also saw while examining the build output.)
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #253 from srowen/SPARK-1335 and squashes the following commits:
      
      c0f2d31 [Sean Owen] Appease scalastyle with a newline at the end of the file
      a02679c [Sean Owen] Fix scaladoc errors due to missing links, which are generating build warnings, from some recent doc changes. We apparently can't generate links outside the module.
      b2c6a09 [Sean Owen] Add perm gen, code cache settings to scalatest, mirroring SBT settings elsewhere, which allows tests to complete in at least one environment where they are failing. (Also removed a duplicate -Xms setting elsewhere.)
    • SPARK-1330 removed extra echo from compute-classpath.sh · 426042ad
      Thomas Graves authored
      Remove the extra echo, which prevents spark-class from working. Note that I did not update the comment above it, which is also wrong, because I'm not sure what it should do.
      
      Should Hive only be included if explicitly built with `sbt hive/assembly`, or should `sbt assembly` build it?
      
      Author: Thomas Graves <tgraves@apache.org>
      
      Closes #241 from tgravescs/SPARK-1330 and squashes the following commits:
      
      b10d708 [Thomas Graves] SPARK-1330 removed extra echo from compute-classpath.sh
    • Cut down the granularity of travis tests. · 5b2d863e
      Michael Armbrust authored
      This PR amortizes the cost of downloading all the jars and compiling core across more test cases.  In one anecdotal run this change takes the cumulative time down from ~80 minutes to ~40 minutes.
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #255 from marmbrus/travis and squashes the following commits:
      
      506b22d [Michael Armbrust] Cut down the granularity of travis tests so we can amortize the cost of compilation.
  7. Mar 26, 2014
    • [SPARK-1327] GLM needs to check addIntercept for intercept and weights · d679843a
      Xiangrui Meng authored
      GLM needs to check addIntercept for intercept and weights. The current implementation always uses the first weight as the intercept. Added a test for training without adding an intercept.
      
      JIRA: https://spark-project.atlassian.net/browse/SPARK-1327
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #236 from mengxr/glm and squashes the following commits:
      
      bcac1ac [Xiangrui Meng] add two tests to ensure {Lasso, Ridge}.setIntercept will throw an exception
      a104072 [Xiangrui Meng] remove protected to be compatible with 0.9
      0e57aa4 [Xiangrui Meng] update Lasso and RidgeRegression to parse the weights correctly from GLM mark createModel protected mark predictPoint protected
      d7f629f [Xiangrui Meng] fix a bug in GLM when intercept is not used
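
      The bug, as a hedged simplification (not the actual `GeneralizedLinearAlgorithm` code): the optimizer's output was always split as head = intercept, tail = weights, even when no intercept column had been added.

      ```
      def createModel(raw: Array[Double], addIntercept: Boolean): (Double, Array[Double]) =
        if (addIntercept) (raw.head, raw.tail)  // first fitted value is the intercept
        else (0.0, raw)                         // old code split off the head anyway
      ```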
    • SPARK-1325. The maven build error for Spark Tools · 1fa48d94
      Sean Owen authored
      This is just a slight variation on https://github.com/apache/spark/pull/234 and an alternative suggestion for SPARK-1325. `scala-actors` is not necessary. `SparkBuild.scala` should be updated to reflect the direct dependency on `scala-reflect` and `scala-compiler`. And the `repl` build, which has the same dependencies, should also be consistent between Maven and SBT.
      
      Author: Sean Owen <sowen@cloudera.com>
      Author: witgo <witgo@qq.com>
      
      Closes #240 from srowen/SPARK-1325 and squashes the following commits:
      
      25bd7db [Sean Owen] Add necessary dependencies scala-reflect and scala-compiler to tools. Update repl dependencies, which are similar, to be consistent between Maven / SBT in this regard too.
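
      The kind of `SparkBuild.scala` change described, sketched in sbt terms (the module names come from the commit; the surrounding project settings are assumed):

      ```
      // tools (and repl) depend directly on the compiler and reflection APIs:
      libraryDependencies ++= Seq(
        "org.scala-lang" % "scala-reflect"  % scalaVersion.value,
        "org.scala-lang" % "scala-compiler" % scalaVersion.value
      )
      ```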
    • Spark 1095 : Adding explicit return types to all public methods · 3e63d98f
      NirmalReddy authored
      Excluded those that are self-evident and the cases discussed on the mailing list.
      
      Author: NirmalReddy <nirmal_reddy2000@yahoo.com>
      Author: NirmalReddy <nirmal.reddy@imaginea.com>
      
      Closes #168 from NirmalReddy/Spark-1095 and squashes the following commits:
      
      ac54b29 [NirmalReddy] import misplaced
      8c5ff3e [NirmalReddy] Changed syntax of unit returning methods
      02d0778 [NirmalReddy] fixed explicit types in all the other packages
      1c17773 [NirmalReddy] fixed explicit types in core package
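
      The convention being applied, illustrated with a hypothetical method:

      ```
      // Public API: state the return type explicitly.
      def defaultParallelism: Int = 8

      // Inferred types remain fine for private/local code:
      private def squared(x: Int) = x * x
      ```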
    • SPARK-1324: SparkUI Should Not Bind to SPARK_PUBLIC_DNS · be6d96c1
      Patrick Wendell authored
      /cc @aarondav and @andrewor14
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #231 from pwendell/ui-binding and squashes the following commits:
      
      e8025f8 [Patrick Wendell] SPARK-1324: SparkUI Should Not Bind to SPARK_PUBLIC_DNS
    • [SQL] Add a custom serializer for maps since they do not have a no-arg constructor. · e15e5741
      Michael Armbrust authored
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #243 from marmbrus/mapSer and squashes the following commits:
      
      54045f7 [Michael Armbrust] Add a custom serializer for maps since they do not have a no-arg constructor.
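
      Why a custom serializer is needed, as a hedged generic sketch (the PR targets Spark SQL's types; this illustration uses Kryo's public `Serializer` API): immutable Scala maps expose no no-arg constructor for reflective instantiation, so the serializer rebuilds the map from its entries.

      ```
      import com.esotericsoftware.kryo.{Kryo, Serializer}
      import com.esotericsoftware.kryo.io.{Input, Output}

      class StringMapSerializer extends Serializer[Map[String, String]] {
        override def write(kryo: Kryo, out: Output, m: Map[String, String]): Unit = {
          out.writeInt(m.size)
          for ((k, v) <- m) { out.writeString(k); out.writeString(v) }
        }
        override def read(kryo: Kryo, in: Input,
                          cls: Class[Map[String, String]]): Map[String, String] =
          (0 until in.readInt()).map(_ => in.readString() -> in.readString()).toMap
      }
      ```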
    • [SQL] Un-ignore a test that is now passing. · 32cbdfd2
      Michael Armbrust authored
      Add golden answer for aforementioned test.
      
      Also, fix golden test generation from sbt/sbt by setting the classpath correctly.
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #244 from marmbrus/partTest and squashes the following commits:
      
      37a33c9 [Michael Armbrust] Un-ignore a test that is now passing, add golden answer for aforementioned test.  Fix golden test generation from sbt/sbt.
    • Unified package definition format in Spark SQL · 345825d9
      Cheng Lian authored
      According to discussions in the comments of PR #208, this PR unifies the package definition format in Spark SQL.
      
      Some broken links in ScalaDoc and typos detected along the way are also fixed.
      
      Author: Cheng Lian <lian.cs.zju@gmail.com>
      
      Closes #225 from liancheng/packageDefinition and squashes the following commits:
      
      75c47b3 [Cheng Lian] Fixed file line length
      4f87968 [Cheng Lian] Unified package definition format in Spark SQL
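
      The two forms in question, illustrated (the commit message doesn't restate here which one was chosen):

      ```
      // Chained package clauses, which bring enclosing packages into scope:
      package org.apache.spark
      package sql

      // versus a single fully-qualified clause:
      package org.apache.spark.sql
      ```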
    • SPARK-1322, top in pyspark should sort result in descending order. · a0853a39
      Prashant Sharma authored
      Author: Prashant Sharma <prashant.s@imaginea.com>
      
      Closes #235 from ScrapCodes/SPARK-1322/top-rev-sort and squashes the following commits:
      
      f316266 [Prashant Sharma] Minor change in comment.
      58e58c6 [Prashant Sharma] SPARK-1322, top in pyspark should sort result in descending order.
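
      For reference, the Scala `RDD.top` behavior that pyspark's `top` is being aligned with: results come back largest-first.

      ```
      sc.parallelize(Seq(1, 5, 3, 4)).top(2)  // Array(5, 4): descending order
      ```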
    • SPARK-1321 Use Guava's top k implementation rather than our BoundedPriorityQueue based implementation · b859853b
      Reynold Xin authored
      
      Also updated the documentation for top and takeOrdered.
      
      On my simple test of sorting 100 million (Int, Int) tuples using Spark, Guava's top k implementation (in Ordering) is much faster than the BoundedPriorityQueue implementation for roughly sorted input (10 - 20X faster), and still faster for purely random input (2 - 5X).
      
      Author: Reynold Xin <rxin@apache.org>
      
      Closes #229 from rxin/takeOrdered and squashes the following commits:
      
      0d11844 [Reynold Xin] Use Guava's top k implementation rather than our BoundedPriorityQueue based implementation. Also updated the documentation for top and takeOrdered.
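
      The approach, as a hedged sketch of how Guava's `Ordering` can select the k greatest elements of an iterator without a full sort (not the exact Spark code):

      ```
      import scala.collection.JavaConverters._
      import com.google.common.collect.{Ordering => GuavaOrdering}

      def topK[T](it: Iterator[T], k: Int)(implicit ord: Ordering[T]): Seq[T] =
        GuavaOrdering.from(ord).greatestOf(it.asJava, k).asScala.toSeq
      ```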
  8. Mar 25, 2014
    • Initial experimentation with Travis CI configuration · 4f7d547b
      Michael Armbrust authored
      This is not intended to replace Jenkins immediately, and Jenkins will remain the CI of reference for merging pull requests in the near term.  Long term, it is possible that Travis will give us better integration with github, so we are investigating its use.
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #230 from marmbrus/travis and squashes the following commits:
      
      93f9a32 [Michael Armbrust] Add Apache license to .travis.yml
      d7c0e78 [Michael Armbrust] Initial experimentation with Travis CI configuration
    • Avoid Option while generating call site · 8237df80
      witgo authored
      This is an update on https://github.com/apache/spark/pull/180, which changes the solution from blacklisting "Option.scala" to avoiding the Option code path while generating the call site.
      
      Also includes a unit test to prevent this issue in the future, and some minor refactoring.
      
      Thanks @witgo for reporting this issue and working on the initial solution!
      
      Author: witgo <witgo@qq.com>
      Author: Aaron Davidson <aaron@databricks.com>
      
      Closes #222 from aarondav/180 and squashes the following commits:
      
      f74aad1 [Aaron Davidson] Avoid Option while generating call site & add unit tests
      d2b4980 [witgo] Modify the position of the filter
      1bc22d7 [witgo] Fix Stage.name return "apply at Option.scala:120"
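
      The underlying mechanism, as a hedged standalone sketch (package names illustrative): the call site is the first stack frame outside the framework's own code, so an accidental hop through `Option.apply` used to surface as "apply at Option.scala:120".

      ```
      def callSite(): String = {
        val frame = Thread.currentThread.getStackTrace.find { el =>
          !el.getClassName.startsWith("org.apache.spark.") &&
          !el.getClassName.startsWith("scala.")         // skips Option.apply, too
        }
        frame.map(f => s"${f.getMethodName} at ${f.getFileName}:${f.getLineNumber}")
             .getOrElse("<unknown>")
      }
      ```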
    • SPARK-1319: Fix scheduler to account for tasks using > 1 CPUs. · f8111eae
      Shivaram Venkataraman authored
      Move CPUS_PER_TASK to TaskSchedulerImpl as the value is a constant and use it in both Mesos and CoarseGrained scheduler backends.
      
      Thanks @kayousterhout for the design discussion
      
      Author: Shivaram Venkataraman <shivaram@eecs.berkeley.edu>
      
      Closes #219 from shivaram/multi-cpus and squashes the following commits:
      
      5c7d685 [Shivaram Venkataraman] Don't pass availableCpus to TaskSetManager
      260e4d5 [Shivaram Venkataraman] Add a check for non-zero CPUs in TaskSetManager
      73fcf6f [Shivaram Venkataraman] Add documentation for spark.task.cpus
      647bc45 [Shivaram Venkataraman] Fix scheduler to account for tasks using > 1 CPUs. Move CPUS_PER_TASK to TaskSchedulerImpl as the value is a constant and use it in both Mesos and CoarseGrained scheduler backends.
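
      The accounting change, as a hedged sketch (not the actual `TaskSchedulerImpl` code; `spark.task.cpus` is the setting this PR documents):

      ```
      class SchedulingSketch(cpusPerTask: Int) {  // from spark.task.cpus, default 1
        // Launch tasks only while an offer has enough free cores for a whole task.
        def launchable(availableCpus: Int, pendingTasks: Int): Int = {
          var free = availableCpus
          var launched = 0
          while (launched < pendingTasks && free >= cpusPerTask) {
            free -= cpusPerTask  // reserve cpusPerTask cores, not a flat 1
            launched += 1
          }
          launched
        }
      }
      ```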
    • SPARK-1316. Remove use of Commons IO · 71d4ed27
      Sean Owen authored
      (This follows from a side point on SPARK-1133, in discussion of the PR: https://github.com/apache/spark/pull/164 )
      
      Commons IO is barely used in the project, and can easily be replaced with equivalent calls to Guava or the existing Spark `Utils.scala` class.
      
      Removing a dependency feels good, and this one in particular can get a little problematic since Hadoop uses it too.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #226 from srowen/SPARK-1316 and squashes the following commits:
      
      21efef3 [Sean Owen] Remove use of Commons IO
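
      The kind of substitution involved, sketched with Guava calls (the file name is hypothetical):

      ```
      import com.google.common.base.Charsets
      import com.google.common.io.Files

      // Commons IO: FileUtils.readFileToString(file)
      // Guava equivalent:
      val contents = Files.toString(new java.io.File("example.txt"), Charsets.UTF_8)
      ```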
    • Add more hive compatibility tests to whitelist · 134ace7f
      Michael Armbrust authored
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #220 from marmbrus/moreTests and squashes the following commits:
      
      223ec35 [Michael Armbrust] Blacklist machine specific test
      9c966cc [Michael Armbrust] add more hive compatibility tests to whitelist
    • SPARK-1286: Make usage of spark-env.sh idempotent · 007a7334
      Aaron Davidson authored
      Various spark scripts load spark-env.sh. This can cause growth of any variables that may be appended to (SPARK_CLASSPATH, SPARK_REPL_OPTS) and it makes the precedence order for options specified in spark-env.sh less clear.
      
      One use-case for the latter is that we want to set options from the command-line of spark-shell, but these options will be overridden by subsequent loading of spark-env.sh. If we were to load the spark-env.sh first and then set our command-line options, we could guarantee correct precedence order.
      
      Note that we use SPARK_CONF_DIR if available to support the sbin/ scripts, which always set this variable from sbin/spark-config.sh. Otherwise, we default to the ../conf/ as usual.
      
      Author: Aaron Davidson <aaron@databricks.com>
      
      Closes #184 from aarondav/idem and squashes the following commits:
      
      e291f91 [Aaron Davidson] Use "private" variables in load-spark-env.sh
      8da8360 [Aaron Davidson] Add .sh extension to load-spark-env.sh
      93a2471 [Aaron Davidson] SPARK-1286: Make usage of spark-env.sh idempotent
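
      The idempotency idea, sketched in Scala terms rather than shell (the guard flag and appended entry are hypothetical): a one-time guard makes repeated loads no-ops, so append-style variables cannot keep growing.

      ```
      object EnvLoader {
        private var loaded = false  // one-time guard, like a sentinel variable in the script
        private val classpath = scala.collection.mutable.ListBuffer.empty[String]

        def load(): Unit = if (!loaded) {  // second and later calls do nothing
          loaded = true
          classpath += "/opt/extra.jar"    // hypothetical appended entry
        }
      }
      ```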