  1. Apr 25, 2017
  2. Apr 14, 2017
  3. Mar 28, 2017
  4. Mar 21, 2017
  5. Dec 15, 2016
  6. Dec 08, 2016
  7. Nov 28, 2016
  8. Jul 19, 2016
  9. Jul 11, 2016
    • [SPARK-16477] Bump master version to 2.1.0-SNAPSHOT · ffcb6e05
      Reynold Xin authored
      ## What changes were proposed in this pull request?
      After SPARK-16476 (committed earlier today as #14128), we can finally bump the version number.
      
      ## How was this patch tested?
      N/A
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #14130 from rxin/SPARK-16477.
  10. Apr 02, 2016
    • [MINOR][DOCS] Use multi-line JavaDoc comments in Scala code. · 4a6e78ab
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      This PR aims to convert all Scala-style multiline comments into Java-style multiline comments in Scala code.
      (All comment-only changes over 77 files: +786 lines, −747 lines)
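
      To illustrate the difference between the two comment styles, here is a minimal sketch. The object and method names are hypothetical and do not come from the patch.

      ```scala
      object CommentStyles {
        /** Scala-style: the first sentence starts on the opening line,
          * and continuation asterisks align under the second asterisk.
          */
        def scalaStyle: Int = 1

        /**
         * Java-style: the text starts on the line after the opening marker,
         * and continuation asterisks align under the first asterisk.
         */
        def javaStyle: Int = 2
      }
      ```

      Both forms are valid documentation comments; the change is purely a matter of layout.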
      
      ## How was this patch tested?
      
      Manual.
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #12130 from dongjoon-hyun/use_multiine_javadoc_comments.
  11. Mar 16, 2016
  12. Mar 11, 2016
    • [SPARK-13294][PROJECT INFRA] Remove MiMa's dependency on spark-class / Spark assembly · 6ca990fb
      Josh Rosen authored
      This patch removes the need to build a full Spark assembly before running the `dev/mima` script.
      
      - I modified the `tools` project to remove a direct dependency on Spark, so `sbt/sbt tools/fullClasspath` will now return the classpath for the `GenerateMIMAIgnore` class itself plus its own dependencies.
         - This required me to delete two classes full of dead code that we don't use anymore
      - `GenerateMIMAIgnore` now uses [ClassUtil](http://software.clapper.org/classutil/) to find all of the Spark classes rather than our homemade JAR traversal code. The problem in our own code was that it didn't handle folders of classes properly, which is necessary in order to generate excludes with an assembly-free Spark build.
      - `./dev/mima` no longer runs through `spark-class`, eliminating the need to reason about classpath ordering between `SPARK_CLASSPATH` and the assembly.
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #11178 from JoshRosen/remove-assembly-in-run-tests.
  13. Mar 03, 2016
  14. Mar 02, 2016
    • [SPARK-13627][SQL][YARN] Fix simple deprecation warnings. · 9c274ac4
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      This PR aims to fix the following deprecation warnings.
        * MethodSymbolApi.paramss -> paramLists
        * AnnotationApi.tpe -> tree.tpe
        * BufferLike.readOnly -> toList
        * StandardNames.nme -> termNames
        * scala.tools.nsc.interpreter.AbstractFileClassLoader -> scala.reflect.internal.util.AbstractFileClassLoader
        * TypeApi.declarations -> decls
      
      ## How was this patch tested?
      
      Check the compile build log and pass the tests.
      ```
      ./build/sbt
      ```
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #11479 from dongjoon-hyun/SPARK-13627.
  15. Jan 30, 2016
    • [SPARK-6363][BUILD] Make Scala 2.11 the default Scala version · 289373b2
      Josh Rosen authored
      This patch changes Spark's build to make Scala 2.11 the default Scala version. To be clear, this does not mean that Spark will stop supporting Scala 2.10: users will still be able to compile Spark for Scala 2.10 by following the instructions on the "Building Spark" page; however, it does mean that Scala 2.11 will be the default Scala version used by our CI builds (including pull request builds).
      
      The Scala 2.11 compiler is faster than 2.10, so I think we'll be able to look forward to a slight speedup in our CI builds (it looks like it's about 2X faster for the Maven compile-only builds, for instance).
      
      After this patch is merged, I'll update Jenkins to add new compile-only jobs to ensure that Scala 2.10 compilation doesn't break.
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #10608 from JoshRosen/SPARK-6363.
  16. Jan 08, 2016
  17. Dec 31, 2015
  18. Dec 19, 2015
  19. Nov 17, 2015
  20. Sep 15, 2015
  21. Aug 25, 2015
  22. Jul 16, 2015
    • [SPARK-9015] [BUILD] Clean project import in scala ide · b536d5dc
      Jan Prach authored
      Clean up the Maven build for a clean import in scala-ide / eclipse.
      
      * remove groovy plugin which is really not needed at all
      * add-source from build-helper-maven-plugin is not needed, as recent versions of scala-maven-plugin do it automatically
      * add lifecycle-mapping plugin to hide a few useless warnings from ide
      
      Author: Jan Prach <jendap@gmail.com>
      
      Closes #7375 from jendap/clean-project-import-in-scala-ide and squashes the following commits:
      
      c4b4c0f [Jan Prach] fix whitespaces
      5a83e07 [Jan Prach] Revert "remove java compiler warnings from java tests"
      312007e [Jan Prach] scala-maven-plugin itself add scala sources by default
      f47d856 [Jan Prach] remove spark-1.4-staging repository
      c8a54db [Jan Prach] remove java compiler warnings from java tests
      999a068 [Jan Prach] remove some maven warnings in scala ide
      80fbdc5 [Jan Prach] remove groovy and gmavenplus plugin
  23. Jul 14, 2015
    • [SPARK-8962] Add Scalastyle rule to ban direct use of Class.forName; fix existing uses · 11e5c372
      Josh Rosen authored
      This pull request adds a Scalastyle regex rule which fails the style check if `Class.forName` is used directly.  `Class.forName` always loads classes from the default / system classloader, but in a majority of cases, we should be using Spark's own `Utils.classForName` instead, which tries to load classes from the current thread's context classloader and falls back to the classloader which loaded Spark when the context classloader is not defined.
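
      The fallback described above can be sketched roughly as follows. This is a minimal illustration, not Spark's actual `Utils.classForName`; the object and method names are hypothetical.

      ```scala
      object ClassLoading {
        // Hypothetical helper: prefer the current thread's context classloader,
        // and fall back to the loader that loaded this object when none is set.
        def contextOrDefaultClassLoader: ClassLoader =
          Option(Thread.currentThread().getContextClassLoader)
            .getOrElse(getClass.getClassLoader)

        // Resolve (and initialize) the class through the chosen loader instead of
        // calling the single-argument Class.forName directly.
        def classForName(name: String): Class[_] =
          Class.forName(name, true, contextOrDefaultClassLoader)
      }
      ```

      With a helper of this shape, call sites use `ClassLoading.classForName("java.lang.String")` rather than `Class.forName("java.lang.String")`, so a style rule can flag the direct form.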
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #7350 from JoshRosen/ban-Class.forName and squashes the following commits:
      
      e3e96f7 [Josh Rosen] Merge remote-tracking branch 'origin/master' into ban-Class.forName
      c0b7885 [Josh Rosen] Hopefully fix the last two cases
      d707ba7 [Josh Rosen] Fix uses of Class.forName that I missed in my first cleanup pass
      046470d [Josh Rosen] Merge remote-tracking branch 'origin/master' into ban-Class.forName
      62882ee [Josh Rosen] Fix uses of Class.forName or add exclusion.
      d9abade [Josh Rosen] Add stylechecker rule to ban uses of Class.forName
  24. Jul 10, 2015
    • [SPARK-7977] [BUILD] Disallowing println · e14b545d
      Jonathan Alter authored
      Author: Jonathan Alter <jonalter@users.noreply.github.com>
      
      Closes #7093 from jonalter/SPARK-7977 and squashes the following commits:
      
      ccd44cc [Jonathan Alter] Changed println to log in ThreadingSuite
      7fcac3e [Jonathan Alter] Reverting to println in ThreadingSuite
      10724b6 [Jonathan Alter] Changing some printlns to logs in tests
      eeec1e7 [Jonathan Alter] Merge branch 'master' of github.com:apache/spark into SPARK-7977
      0b1dcb4 [Jonathan Alter] More println cleanup
      aedaf80 [Jonathan Alter] Merge branch 'master' of github.com:apache/spark into SPARK-7977
      925fd98 [Jonathan Alter] Merge branch 'master' of github.com:apache/spark into SPARK-7977
      0c16fa3 [Jonathan Alter] Replacing some printlns with logs
      45c7e05 [Jonathan Alter] Merge branch 'master' of github.com:apache/spark into SPARK-7977
      5c8e283 [Jonathan Alter] Allowing println in audit-release examples
      5b50da1 [Jonathan Alter] Allowing printlns in example files
      ca4b477 [Jonathan Alter] Merge branch 'master' of github.com:apache/spark into SPARK-7977
      83ab635 [Jonathan Alter] Fixing new printlns
      54b131f [Jonathan Alter] Merge branch 'master' of github.com:apache/spark into SPARK-7977
      1cd8a81 [Jonathan Alter] Removing some unnecessary comments and printlns
      b837c3a [Jonathan Alter] Disallowing println
  25. Jun 03, 2015
    • [SPARK-7801] [BUILD] Updating versions to SPARK 1.5.0 · 2c4d550e
      Patrick Wendell authored
      Author: Patrick Wendell <patrick@databricks.com>
      
      Closes #6328 from pwendell/spark-1.5-update and squashes the following commits:
      
      2f42d02 [Patrick Wendell] A few more excludes
      4bebcf0 [Patrick Wendell] Update to RC4
      61aaf46 [Patrick Wendell] Using new release candidate
      55f1610 [Patrick Wendell] Another exclude
      04b4f04 [Patrick Wendell] More issues with transient 1.4 changes
      36f549b [Patrick Wendell] [SPARK-7801] [BUILD] Updating versions to SPARK 1.5.0
  26. May 01, 2015
    • [SPARK-4550] In sort-based shuffle, store map outputs in serialized form · 0a2b15ce
      Sandy Ryza authored
      Refer to the JIRA for the design doc and some perf results.
      
      I wanted to call out some of the potentially more controversial changes up front:
      * Map outputs are only stored in serialized form when Kryo is in use.  I'm still unsure whether Java-serialized objects can be relocated.  At the very least, Java serialization writes out a stream header which causes problems with the current approach, so I decided to leave investigating this to future work.
      * The shuffle now explicitly operates on key-value pairs instead of any object.  Data is written to shuffle files in alternating keys and values instead of key-value tuples.  `BlockObjectWriter.write` now accepts a key argument and a value argument instead of any object.
      * The map output buffer can hold a max of Integer.MAX_VALUE bytes, though this wouldn't be terribly difficult to change.
      * When spilling occurs, the objects that are still in memory at merge time end up serialized and deserialized an extra time.
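
      As a rough illustration of writing alternating keys and values rather than key-value tuples, here is a minimal sketch. It uses plain `java.io` data streams with `String` keys and `Int` values as stand-ins; `PairWriter` and `roundTrip` are hypothetical names and are not the `BlockObjectWriter` API or Spark's serializers.

      ```scala
      import java.io.{ByteArrayInputStream, ByteArrayOutputStream, DataInputStream, DataOutputStream}

      object AlternatingKeyValue {
        // Hypothetical writer: takes the key and the value as separate arguments
        // and emits them back to back, with no tuple wrapper around the pair.
        final class PairWriter(out: DataOutputStream) {
          def write(key: String, value: Int): Unit = {
            out.writeUTF(key)
            out.writeInt(value)
          }
        }

        // Write the pairs as key, value, key, value, ... and read them back
        // in the same alternating order.
        def roundTrip(pairs: Seq[(String, Int)]): Seq[(String, Int)] = {
          val bytes = new ByteArrayOutputStream()
          val out = new DataOutputStream(bytes)
          val writer = new PairWriter(out)
          pairs.foreach { case (k, v) => writer.write(k, v) }
          out.flush()
          val in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray))
          pairs.indices.map(_ => (in.readUTF(), in.readInt()))
        }
      }
      ```

      Because each record is just a key followed by a value, a reader only needs to know the order in which the two are written.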
      
      Author: Sandy Ryza <sandy@cloudera.com>
      
      Closes #4450 from sryza/sandy-spark-4550 and squashes the following commits:
      
      8c70dd9 [Sandy Ryza] Fix serialization
      9c16fe6 [Sandy Ryza] Fix a couple tests and move getAutoReset to KryoSerializerInstance
      6c54e06 [Sandy Ryza] Fix scalastyle
      d8462d8 [Sandy Ryza] SPARK-4550
  27. Apr 03, 2015
    • [SPARK-6428] Turn on explicit type checking for public methods. · 82701ee2
      Reynold Xin authored
      This builds on my earlier pull requests and turns on the explicit type checking in scalastyle.
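
      For context, the kind of code such a check targets might look like the sketch below. The names are hypothetical and the example is not taken from the patch.

      ```scala
      object ExplicitTypes {
        // Return type is inferred: the public signature can change silently
        // whenever the body changes. A check like this would flag the method.
        def inferredDefault = 10

        // Return type is declared: the public signature is stable and visible
        // to readers without inspecting the body.
        def explicitDefault: Int = 10
      }
      ```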
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #5342 from rxin/SPARK-6428 and squashes the following commits:
      
      7b531ab [Reynold Xin] import ordering
      2d9a8a5 [Reynold Xin] jl
      e668b1c [Reynold Xin] override
      9b9e119 [Reynold Xin] Parenthesis.
      82e0cf5 [Reynold Xin] [SPARK-6428] Turn on explicit type checking for public methods.
  28. Apr 02, 2015
    • [SPARK-6627] Some clean-up in shuffle code. · 6562787b
      Patrick Wendell authored
      Before diving into review #4450 I took a look through the existing shuffle
      code to learn how it works. Unfortunately, there are some very
      confusing things in this code. This patch makes a few small changes
      to simplify things. It is not easy to describe the changes concisely
      because of how convoluted the issues were, but they are fairly small
      logically:
      
      1. There is a trait named `ShuffleBlockManager` that only deals with
         one logical function which is retrieving shuffle block data given shuffle
         block coordinates. This trait has two implementors FileShuffleBlockManager
         and IndexShuffleBlockManager. Confusingly the vast majority of those
         implementations have nothing to do with this particular functionality.
         So I've renamed the trait to ShuffleBlockResolver and documented it.
      2. The aforementioned trait had two almost identical methods, for no good
         reason. I removed one method (getBytes) and modified callers to use the
         other one. I think the behavior is preserved in all cases.
      3. The sort shuffle code uses an identifier "0" in the reduce slot of a
         BlockID as a placeholder. I made it into a constant since it needs to
         be consistent across multiple places.
      
      I think for (3) there is actually a better solution that would avoid the
      need to do this type of workaround/hack in the first place, but it's more
      complex so I'm punting it for now.
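
      The idea in (3) can be sketched as below. This is only an illustration: `NoopReduceId`, `ShuffleBlockId`, `dataBlock`, and the name format are hypothetical stand-ins, not the identifiers used in the patch.

      ```scala
      object ShuffleIds {
        // Hypothetical constant: a single named placeholder for the reduce slot,
        // so every call site agrees on the same value instead of a literal 0.
        val NoopReduceId: Int = 0

        // Hypothetical block identifier with an assumed name format.
        final case class ShuffleBlockId(shuffleId: Int, mapId: Int, reduceId: Int) {
          def name: String = s"shuffle_${shuffleId}_${mapId}_${reduceId}"
        }

        // Build the identifier for a map output using the shared placeholder.
        def dataBlock(shuffleId: Int, mapId: Int): ShuffleBlockId =
          ShuffleBlockId(shuffleId, mapId, NoopReduceId)
      }
      ```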
      
      Author: Patrick Wendell <patrick@databricks.com>
      
      Closes #5286 from pwendell/cleanup and squashes the following commits:
      
      c71fbc7 [Patrick Wendell] Open interface back up for testing
      f36edd5 [Patrick Wendell] Code review feedback
      d1c0494 [Patrick Wendell] Style fix
      a406079 [Patrick Wendell] [HOTFIX] Some clean-up in shuffle code.
  29. Mar 20, 2015
    • [SPARK-6371] [build] Update version to 1.4.0-SNAPSHOT. · a7456459
      Marcelo Vanzin authored
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #5056 from vanzin/SPARK-6371 and squashes the following commits:
      
      63220df [Marcelo Vanzin] Merge branch 'master' into SPARK-6371
      6506f75 [Marcelo Vanzin] Use more fine-grained exclusion.
      178ba71 [Marcelo Vanzin] Oops.
      75b2375 [Marcelo Vanzin] Exclude VertexRDD in MiMA.
      a45a62c [Marcelo Vanzin] Work around MIMA warning.
      1d8a670 [Marcelo Vanzin] Re-group jetty exclusion.
      0e8e909 [Marcelo Vanzin] Ignore ml, don't ignore graphx.
      cef4603 [Marcelo Vanzin] Indentation.
      296cf82 [Marcelo Vanzin] [SPARK-6371] [build] Update version to 1.4.0-SNAPSHOT.