  1. Sep 21, 2016
  2. Sep 19, 2016
      [SPARK-17163][ML] Unified LogisticRegression interface · 26145a5a
      sethah authored
      ## What changes were proposed in this pull request?
      
      Merge `MultinomialLogisticRegression` into `LogisticRegression` and remove `MultinomialLogisticRegression`.
      
      Marked as WIP because we should discuss the coefficients API in the model. See discussion below.
      
      JIRA: [SPARK-17163](https://issues.apache.org/jira/browse/SPARK-17163)
      
      ## How was this patch tested?
      
      Merged test suites and added some new unit tests.
      
      ## Design
      
      ### Switching between binomial and multinomial
      
      We default to automatically detecting whether we should run binomial or multinomial lor. We expose a new parameter called `family` which defaults to auto. When "auto" is used, we run normal binomial lor with pivoting if there are 1 or 2 label classes. Otherwise, we run multinomial. If the user explicitly sets the family, then we abide by that setting. In the case where "binomial" is set but multiclass lor is detected, we throw an error.
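
      For illustration, a minimal sketch of selecting the family (assuming the param is exposed through a `setFamily` setter):

      ```scala
      import org.apache.spark.ml.classification.LogisticRegression

      // "auto" (the default) picks binomial for <= 2 label classes, multinomial otherwise.
      val lrAuto = new LogisticRegression().setFamily("auto")

      // Explicitly requesting "binomial" on multiclass data raises an error, per the rule above.
      val lrBinomial = new LogisticRegression().setFamily("binomial")
      ```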
      
      ### coefficients/intercept model API (TODO)
      
      This is the biggest design point remaining, IMO. We need to decide how to store the coefficients and intercepts in the model, and in turn how to expose them via the API. Two important points:
      
      * We must maintain compatibility with the old API, i.e. we must expose `def coefficients: Vector` and `def intercept: Double`
      * There are two separate cases: binomial lr, where we have a single set of coefficients and a single intercept, and multinomial lr, where we have `numClasses` sets of coefficients and `numClasses` intercepts.
      
      Some options:
      
      1. **Store the binomial coefficients as a `2 x numFeatures` matrix.** This means that we would center the model coefficients before storing them in the model. The BLOR algorithm gives `1 * numFeatures` coefficients, but we would convert them to `2 x numFeatures` coefficients before storing them, effectively doubling the storage in the model. This has the advantage that we can make the code cleaner (i.e. less `if (isMultinomial) ... else ...`) and we don't have to reason about the different cases as much. It has the disadvantage that we double the storage space and we could see small regressions at prediction time since there are 2x the number of operations in the prediction algorithms. Additionally, we still have to produce the uncentered coefficients/intercept via the API, so we will have to either ALSO store the uncentered version, or compute it in `def coefficients: Vector` every time.
      
      2. **Store the binomial coefficients as a `1 x numFeatures` matrix.** We still store the coefficients as a matrix and the intercepts as a vector. When users call `coefficients` we return them a `Vector` that is backed by the same underlying array as the `coefficientMatrix`, so we don't duplicate any data. At prediction time, we use the old prediction methods that are specialized for binary LOR. The benefits here are that we don't store extra data, and we won't see any regressions in performance. The cost of this is that we have separate implementations for predict methods in the binary vs multiclass case. The duplicated code is really not very high, but it's still a bit messy.
      
      If we do decide to store the 2x coefficients, we would likely want to see some performance tests to understand the potential regressions.
      
      **Update:** We have chosen option 2
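
      To make option 2 concrete, a minimal sketch (illustrative values, not the model code itself) of a `coefficients` vector backed by the same array as the `coefficientMatrix`:

      ```scala
      import org.apache.spark.ml.linalg.{DenseMatrix, Vectors}

      // Binomial case: a 1 x numFeatures coefficientMatrix ...
      val coefficientMatrix = new DenseMatrix(1, 3, Array(0.5, -1.2, 2.0))
      // ... and a Vector view over the same underlying array, so no data is duplicated.
      val coefficients = Vectors.dense(coefficientMatrix.values)
      ```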
      
      ### Threshold/thresholds (TODO)
      
      Currently, when `threshold` is set we clear whatever value is in `thresholds` and when `thresholds` is set we clear whatever value is in `threshold`. [SPARK-11543](https://issues.apache.org/jira/browse/SPARK-11543) was created to prefer thresholds over threshold. We should decide if we should implement this behavior now or if we want to do it in a separate JIRA.
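
      For reference, a small sketch of the current behavior (using the existing setters):

      ```scala
      import org.apache.spark.ml.classification.LogisticRegression

      val lr = new LogisticRegression()
      lr.setThresholds(Array(0.3, 0.7))  // setting `thresholds` clears `threshold`
      lr.setThreshold(0.6)               // setting `threshold` clears `thresholds`
      ```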
      
      **Update:** Let's leave it for a follow up PR
      
      ## Follow up
      
      * Summary model for multiclass logistic regression [SPARK-17139](https://issues.apache.org/jira/browse/SPARK-17139)
      * Thresholds vs threshold [SPARK-11543](https://issues.apache.org/jira/browse/SPARK-11543)
      
      Author: sethah <seth.hendrickson16@gmail.com>
      
      Closes #14834 from sethah/SPARK-17163.
  3. Sep 15, 2016
  4. Sep 12, 2016
      [SPARK-14818] Post-2.0 MiMa exclusion and build changes · 7c51b99a
      Josh Rosen authored
      This patch makes a handful of post-Spark-2.0 MiMa exclusion and build updates. It should be merged to master and a subset of it should be picked into branch-2.0 in order to test Spark 2.0.1-SNAPSHOT.
      
      - Remove `sketch`, `mllibLocal`, and `streamingKafka010` from the list of excluded subprojects so that MiMa checks them.
      - Remove now-unnecessary special-case handling of the Kafka 0.8 artifact in `mimaSettings`.
      - Move the exclusion added in SPARK-14743 from `v20excludes` to `v21excludes`, since that patch was only merged into master and not branch-2.0.
      - Add exclusions for an API change introduced by SPARK-17096 / #14675.
      - Add missing exclusions for the `o.a.spark.internal` and `o.a.spark.sql.internal` packages.
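
      For context, entries in MimaExcludes have roughly this shape (the class names below are placeholders, not the exact rules added in this patch):

      ```scala
      import com.typesafe.tools.mima.core._

      // Illustrative only: exclude rules name the problem kind and the affected class/member.
      val exampleExcludes = Seq(
        ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.internal.SomePrivateClass"),
        ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.sql.internal.SomePrivateClass")
      )
      ```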
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #15061 from JoshRosen/post-2.0-mima-changes.
  5. Sep 04, 2016
      [SPARK-17308] Improved the spark core code by replacing all pattern match on... · e75c162e
      Shivansh authored
      [SPARK-17308] Improved the Spark core code by replacing all pattern matches on boolean values with if/else blocks.
      
      ## What changes were proposed in this pull request?
      Improved the code quality of Spark by replacing all pattern matches on boolean values with if/else blocks.
      
      ## How was this patch tested?
      
      By running the tests
      
      Author: Shivansh <shiv4nsh@gmail.com>
      
      Closes #14873 from shiv4nsh/SPARK-17308.
  6. Aug 26, 2016
      [SPARK-16967] move mesos to module · 8e5475be
      Michael Gummelt authored
      ## What changes were proposed in this pull request?
      
      Move Mesos code into a mvn module
      
      ## How was this patch tested?
      
      unit tests
      manually submitting a client mode and cluster mode job
      spark/mesos integration test suite
      
      Author: Michael Gummelt <mgummelt@mesosphere.io>
      
      Closes #14637 from mgummelt/mesos-module.
  7. Aug 10, 2016
      [SPARK-14743][YARN] Add a configurable credential manager for Spark running on YARN · ab648c00
      jerryshao authored
      ## What changes were proposed in this pull request?
      
      Add a configurable token manager for Spark running on YARN.
      
      ### Current Problems ###
      
      1. The supported token providers are hard-coded; currently only hdfs, hbase and hive are supported, and it is impossible for a user to add a new token provider without code changes.
      2. The same problem exists in the periodic token renewer and updater.
      
      ### Changes In This Proposal ###
      
      To address the problems mentioned above and make the current code cleaner and easier to understand, this proposal mainly makes 3 changes:
      
      1. Abstract a `ServiceTokenProvider` as well as a `ServiceTokenRenewable` interface for token providers. Each service that wants to communicate with Spark via tokens needs to implement this interface (a rough sketch follows below this list).
      2. Provide a `ConfigurableTokenManager` to manage all the registered token providers, as well as the token renewer and updater. This class also offers the API for other modules to obtain tokens, get the renewal interval, and so on.
      3. Implement 3 built-in token providers, `HDFSTokenProvider`, `HiveTokenProvider` and `HBaseTokenProvider`, to keep the same semantics as supported today. Whether to load these built-in token providers is controlled by the configuration "spark.yarn.security.tokens.${service}.enabled"; by default all the built-in token providers are loaded.
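
      A rough sketch of what these interfaces could look like (the signatures here are assumptions for illustration, not the merged API):

      ```scala
      import org.apache.hadoop.conf.Configuration
      import org.apache.hadoop.security.Credentials
      import org.apache.spark.SparkConf

      // Sketch only: each service implements a provider that knows how to obtain its tokens.
      trait ServiceTokenProvider {
        def serviceName: String
        def obtainTokens(sparkConf: SparkConf, hadoopConf: Configuration, creds: Credentials): Unit
      }

      // Sketch only: providers whose tokens can be renewed report a renewal interval (in ms).
      trait ServiceTokenRenewable {
        def getTokenRenewalInterval(sparkConf: SparkConf, hadoopConf: Configuration): Option[Long]
      }
      ```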
      
      ### Behavior Changes ###
      
      For the end user there is no behavior change; we still use the same configuration `spark.yarn.security.tokens.${service}.enabled` to decide which token provider is enabled (hbase or hive).
      
      A user-implemented token provider (assume the name of the token provider is "test") needs two configurations in order to be registered with this class (see the sketch below):
      
      1. `spark.yarn.security.tokens.test.enabled` set to `true`
      2. `spark.yarn.security.tokens.test.class` set to the fully qualified class name.
      
      So we still keep the same semantics as the current code while adding one new configuration.
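
      A minimal illustration of that registration (the provider class name below is hypothetical):

      ```scala
      import org.apache.spark.SparkConf

      val conf = new SparkConf()
        .set("spark.yarn.security.tokens.test.enabled", "true")
        .set("spark.yarn.security.tokens.test.class", "com.example.TestTokenProvider")
      ```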
      
      ### Current Status ###
      
      - [x] Token provider interface and management framework.
      - [x] Implement built-in token providers (hdfs, hbase, hive).
      - [x] Unit test coverage.
      - [x] Integration test with a secure cluster.
      
      ## How was this patch tested?
      
      Unit test and integrated test.
      
      Please suggest and review, any comment is greatly appreciated.
      
      Author: jerryshao <sshao@hortonworks.com>
      
      Closes #14065 from jerryshao/SPARK-16342.
  8. Aug 04, 2016
      [SPARK-16853][SQL] fixes encoder error in DataSet typed select · 9d7a4740
      Sean Zhong authored
      ## What changes were proposed in this pull request?
      
      For DataSet typed select:
      ```
      def select[U1: Encoder](c1: TypedColumn[T, U1]): Dataset[U1]
      ```
      If type T is a case class or a tuple class that is not atomic, the resulting logical plan's schema will not match the `Dataset[T]` encoder's schema, which causes an encoder error and throws an AnalysisException.
      
      ### Before change:
      ```
      scala> case class A(a: Int, b: Int)
      scala> Seq((0, A(1,2))).toDS.select($"_2".as[A])
      org.apache.spark.sql.AnalysisException: cannot resolve '`a`' given input columns: [_2];
      ..
      ```
      
      ### After change:
      ```
      scala> case class A(a: Int, b: Int)
      scala> Seq((0, A(1,2))).toDS.select($"_2".as[A]).show
      +---+---+
      |  a|  b|
      +---+---+
      |  1|  2|
      +---+---+
      ```
      
      ## How was this patch tested?
      
      Unit test.
      
      Author: Sean Zhong <seanzhong@databricks.com>
      
      Closes #14474 from clockfly/SPARK-16853.
  9. Jul 12, 2016
      [SPARK-16199][SQL] Add a method to list the referenced columns in data source Filter · c9a67621
      petermaxlee authored
      ## What changes were proposed in this pull request?
      It would be useful to support listing the columns that are referenced by a filter. This can help simplify data source planning, because with this we would be able to implement the unhandledFilters method in HadoopFsRelation.
      
      This is based on rxin's patch (#13901) and adds unit tests.
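
      A hedged sketch of the intended usage (the method name follows the description above; the exact signature in the merged patch may differ):

      ```scala
      import org.apache.spark.sql.sources.{And, EqualTo, Filter}

      // A data source can ask a filter which columns it references and use that
      // to decide whether it can handle the filter itself.
      val filter: Filter = And(EqualTo("a", 1), EqualTo("b", 2))
      val referenced: Array[String] = filter.references  // e.g. Array("a", "b")
      ```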
      
      ## How was this patch tested?
      Added a new suite FiltersSuite.
      
      Author: petermaxlee <petermaxlee@gmail.com>
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #14120 from petermaxlee/SPARK-16199.
  10. Jul 11, 2016
      [SPARK-16476] Restructure MimaExcludes for easier union excludes · 52b5bb0b
      Reynold Xin authored
      ## What changes were proposed in this pull request?
      It is currently fairly difficult to have proper mima excludes when we cut a version branch. I'm proposing a small change to take the exclude list out of the exclude function, and put it in a variable so we can easily union excludes.
      
      After this change, we can bump pom.xml version to 2.1.0-SNAPSHOT, without bumping the diff base version. Note that I also deleted all the exclude rules for version 1.x, to cut down the size of the file.
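
      Illustratively, the restructuring has roughly this shape (names are approximate, not the exact file contents):

      ```scala
      import com.typesafe.tools.mima.core._

      // Keep each version's excludes in a value so a later branch can union the previous list.
      lazy val v20excludes = Seq(
        ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.SomeRemovedClass")
      )

      lazy val v21excludes = v20excludes ++ Seq(
        ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.AnotherRemovedClass")
      )
      ```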
      
      ## How was this patch tested?
      N/A - this is a build infra change.
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #14128 from rxin/SPARK-16476.
  11. Jul 06, 2016
  12. Jul 05, 2016
      [SPARK-16359][STREAMING][KAFKA] unidoc skip kafka 0.10 · 1f0d0213
      cody koeninger authored
      ## What changes were proposed in this pull request?
      During the sbt unidoc task, skip the streamingKafka010 subproject and filter Kafka 0.10 classes from the classpath, so that at least the existing Kafka 0.8 doc can be included in unidoc without error.
      
      ## How was this patch tested?
      sbt spark/scalaunidoc:doc | grep -i error
      
      Author: cody koeninger <cody@koeninger.org>
      
      Closes #14041 from koeninger/SPARK-16359.
  13. Jul 04, 2016
  14. Jun 30, 2016
  15. Jun 27, 2016
  16. Jun 22, 2016
  17. Jun 15, 2016
      [SPARK-15851][BUILD] Fix the call of the bash script to enable proper run in Windows · 5a52ba0f
      Reynold Xin authored
      ## What changes were proposed in this pull request?
      The way the bash script `build/spark-build-info` is called from core/pom.xml prevents Spark from building on Windows. Instead of calling the script directly we call bash and pass the script as an argument. This enables running it on Windows with bash installed, which typically comes with Git.
      
      This brings https://github.com/apache/spark/pull/13612 up-to-date and also addresses comments from the code review.
      
      Closes #13612
      
      ## How was this patch tested?
      I built manually (on a Mac) to verify it didn't break Mac compilation.
      
      Author: Reynold Xin <rxin@databricks.com>
      Author: avulanov <nashb@yandex.ru>
      
      Closes #13691 from rxin/SPARK-15851.
  18. Jun 14, 2016
  19. Jun 11, 2016
      [SPARK-15881] Update microbenchmark results for WideSchemaBenchmark · 5bb4564c
      Eric Liang authored
      ## What changes were proposed in this pull request?
      
      These were not updated after performance improvements. To make updating them easier, I also moved the results from inline comments out into a file, which is auto-generated when the benchmark is re-run.
      
      Author: Eric Liang <ekl@databricks.com>
      
      Closes #13607 from ericl/sc-3538.
  20. Jun 09, 2016
      [SPARK-15827][BUILD] Publish Spark's forked sbt-pom-reader to Maven Central · f74b7771
      Josh Rosen authored
      Spark's SBT build currently uses a fork of the sbt-pom-reader plugin but depends on that fork via an SBT subproject which is cloned from https://github.com/scrapcodes/sbt-pom-reader/tree/ignore_artifact_id. This unnecessarily slows down the initial build on fresh machines and is also risky because the build could break if that GitHub repository ever changes or is deleted.
      
      In order to address these issues, I have published a pre-built binary of our forked sbt-pom-reader plugin to Maven Central under the `org.spark-project` namespace and have updated Spark's build to use that artifact. This published artifact was built from https://github.com/JoshRosen/sbt-pom-reader/tree/v1.0.0-spark, which contains the contents of ScrapCodes's branch plus an additional patch to configure the build for artifact publication.
      
      /cc srowen ScrapCodes for review.
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #13564 from JoshRosen/use-published-fork-of-pom-reader.
  21. Jun 06, 2016
  22. May 31, 2016
      [SPARK-15451][BUILD] Use jdk7's rt.jar when available. · 57adb77e
      Marcelo Vanzin authored
      This helps with preventing jdk8-specific calls being checked in,
      because PR builders are running the compiler with the wrong settings.
      
      If the JAVA_7_HOME env variable is set, assume it points at
      a jdk7 and use its rt.jar when invoking javac. For zinc, just run
      it with jdk7, and disable it when building jdk8-specific code.
      
      A big note for sbt usage: adding the bootstrap options forces sbt
      to fork the compiler, and that disables incremental compilation.
      That means that it's really not convenient to use for normal
      development, but should be ok for automated builds.
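
      As a rough build.sbt-style sketch of the idea (not the exact SparkBuild change):

      ```scala
      // When JAVA_7_HOME is set, compile Java sources against jdk7's rt.jar via -bootclasspath.
      javacOptions ++= sys.env.get("JAVA_7_HOME").toSeq.flatMap { jdk7 =>
        Seq("-bootclasspath", s"$jdk7/jre/lib/rt.jar")
      }
      ```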
      
      Tested with JAVA_HOME=jdk8 and JAVA_7_HOME=jdk7:
      - mvn + zinc
      - mvn sans zinc
      - sbt
      
      Verified that in all cases, jdk8-specific library calls fail to
      compile.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #13272 from vanzin/SPARK-15451.
  23. May 27, 2016
      [SPARK-15413][ML][MLLIB] Change `toBreeze` to `asBreeze` in Vector and Matrix · 21b2605d
      DB Tsai authored
      ## What changes were proposed in this pull request?
      
      We're using `asML` to convert the mllib vector/matrix to the ml vector/matrix now. Using `as` is more correct given that this conversion actually shares the same underlying data structure. As a result, in this PR, `toBreeze` will be changed to `asBreeze`. This is a private API, so it will not affect any user's application.
      
      ## How was this patch tested?
      
      unit tests
      
      Author: DB Tsai <dbt@netflix.com>
      
      Closes #13198 from dbtsai/minor.
  24. May 26, 2016
      [SPARK-15532][SQL] SQLContext/HiveContext's public constructors should use... · 3ac2363d
      Yin Huai authored
      [SPARK-15532][SQL] SQLContext/HiveContext's public constructors should use SparkSession.builder.getOrCreate
      
      ## What changes were proposed in this pull request?
      This PR changes SQLContext/HiveContext's public constructors to use SparkSession.builder.getOrCreate and removes isRootContext from SQLContext.
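
      A small sketch of the user-visible effect (illustrative only; app name and master are placeholders):

      ```scala
      import org.apache.spark.sql.{SQLContext, SparkSession}

      val spark = SparkSession.builder().master("local[2]").appName("demo").getOrCreate()
      // Constructing a SQLContext now routes through SparkSession, so both handles
      // share the same underlying session state.
      val sqlContext = new SQLContext(spark.sparkContext)
      val sameSession = SparkSession.builder().getOrCreate()  // reuses the existing session
      ```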
      
      ## How was this patch tested?
      Existing tests.
      
      Author: Yin Huai <yhuai@databricks.com>
      
      Closes #13310 from yhuai/SPARK-15532.
      [SPARK-15543][SQL] Rename DefaultSources to make them more self-describing · 361ebc28
      Reynold Xin authored
      ## What changes were proposed in this pull request?
      This patch renames various DefaultSources to make their names more self-describing. The choice of "DefaultSource" was from the days when we did not have a good way to specify short names.
      
      They are now named:
      - LibSVMFileFormat
      - CSVFileFormat
      - JdbcRelationProvider
      - JsonFileFormat
      - ParquetFileFormat
      - TextFileFormat
      
      Backward compatibility is maintained through aliasing.
      
      ## How was this patch tested?
      Updated relevant test cases too.
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #13311 from rxin/SPARK-15543.
  25. May 25, 2016
      [SPARK-15525][SQL][BUILD] Upgrade ANTLR4 SBT plugin · 527499b6
      Herman van Hovell authored
      ## What changes were proposed in this pull request?
      The ANTLR4 SBT plugin has been moved from its own repo to one on bintray. The version was also changed from `0.7.10` to `0.7.11`. The latter actually broke our build (ihji has fixed this by also adding `0.7.10` and others to the bintray repo).
      
      This PR upgrades the SBT-ANTLR4 plugin and ANTLR4 to their most recent versions (`0.7.11`/`4.5.3`). I have also removed a few obsolete build configurations.
      
      ## How was this patch tested?
      Manually running SBT/Maven builds.
      
      Author: Herman van Hovell <hvanhovell@questtec.nl>
      
      Closes #13299 from hvanhovell/SPARK-15525.
  26. May 21, 2016
      [SPARK-15424][SPARK-15437][SPARK-14807][SQL] Revert Create a hivecontext-compatibility module · 45b7557e
      Reynold Xin authored
      ## What changes were proposed in this pull request?
      I initially asked to create a hivecontext-compatibility module to put the HiveContext there. But we are so close to the Spark 2.0 release and there is only a single class in it. It seems overkill to have an entire module for a single class, and it makes things more inconvenient.
      
      ## How was this patch tested?
      Tests were moved.
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #13207 from rxin/SPARK-15424.
  27. May 18, 2016
      [SPARK-15357] Cooperative spilling should check consumer memory mode · 8fb1d1c7
      Davies Liu authored
      ## What changes were proposed in this pull request?
      
      Since we support forced spilling for Spillable, which only works in OnHeap mode, unlike other SQL operators (which could be OnHeap or OffHeap), we should consider the consumer's memory mode before triggering forced spilling.
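
      Conceptually the check looks like the sketch below (illustrative, not the actual TaskMemoryManager code):

      ```scala
      import org.apache.spark.memory.{MemoryConsumer, MemoryMode}

      // Only ask a consumer to spill when it holds memory in the mode we need to free.
      def maybeForceSpill(consumer: MemoryConsumer, required: Long, mode: MemoryMode): Long =
        if (consumer.getMode == mode) consumer.spill(required, consumer) else 0L
      ```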
      
      ## How was this patch tested?
      
      Add new test.
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #13151 from davies/fix_mode.
  28. May 17, 2016
      [SPARK-14615][ML] Use the new ML Vector and Matrix in the ML pipeline based algorithms · e2efe052
      DB Tsai authored
      ## What changes were proposed in this pull request?
      
      Once SPARK-14487 and SPARK-14549 are merged, we will migrate to using the new vector and matrix types in the new ml pipeline-based APIs.
      
      ## How was this patch tested?
      
      Unit tests
      
      Author: DB Tsai <dbt@netflix.com>
      Author: Liang-Chi Hsieh <simonh@tw.ibm.com>
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #12627 from dbtsai/SPARK-14615-NewML.
      [SPARK-15290][BUILD] Move annotations, like @Since / @DeveloperApi, into spark-tags · 122302cb
      Sean Owen authored
      ## What changes were proposed in this pull request?
      
      (See https://github.com/apache/spark/pull/12416 where most of this was already reviewed and committed; this is just the module structure and move part. This change does not move the annotations into test scope, which was apparently the problem last time.)
      
      Rename `spark-test-tags` -> `spark-tags`; move common annotations like `Since` to `spark-tags`
      
      ## How was this patch tested?
      
      Jenkins tests.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #13074 from srowen/SPARK-15290.
  29. May 11, 2016
      [SPARK-15085][STREAMING][KAFKA] Rename streaming-kafka artifact · 89e67d66
      cody koeninger authored
      ## What changes were proposed in this pull request?
      Renaming the streaming-kafka artifact to include the Kafka version, in anticipation of needing a different artifact for later Kafka versions.
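
      Illustratively, the dependency coordinates then look like this (the exact artifact name and version shown are an assumption about the published 2.0 artifacts):

      ```scala
      libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka-0-8" % "2.0.0"
      ```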
      
      ## How was this patch tested?
      Unit tests
      
      Author: cody koeninger <cody@koeninger.org>
      
      Closes #12946 from koeninger/SPARK-15085.
      [SPARK-15250][SQL] Remove deprecated json API in DataFrameReader · 3ff01205
      hyukjinkwon authored
      ## What changes were proposed in this pull request?
      
      This PR removes the old `json(path: String)` API which is covered by the new `json(paths: String*)`.
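
      A small usage sketch (paths are illustrative) showing that the varargs API covers the single-path case:

      ```scala
      import org.apache.spark.sql.SparkSession

      val spark = SparkSession.builder().master("local[2]").appName("json-demo").getOrCreate()
      val one  = spark.read.json("logs/2016-05-01.json")
      val many = spark.read.json("logs/2016-05-01.json", "logs/2016-05-02.json")
      ```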
      
      ## How was this patch tested?
      
      Jenkins tests (existing tests should cover this)
      
      Author: hyukjinkwon <gurwls223@gmail.com>
      Author: Hyukjin Kwon <gurwls223@gmail.com>
      
      Closes #13040 from HyukjinKwon/SPARK-15250.
  30. May 10, 2016
      [SPARK-14542][CORE] PipeRDD should allow configurable buffer size for… · a019e6ef
      Sital Kedia authored
      ## What changes were proposed in this pull request?
      
      Currently PipedRDD internally uses PrintWriter to write data to the stdin of the piped process, which by default uses a BufferedWriter with a buffer size of 8k. In our experiments, we have seen that an 8k buffer size is too small and the job spends a significant amount of CPU time in system calls to copy the data. We should have a way to configure the buffer size for the writer.
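
      Conceptually, the change amounts to something like the following (a sketch of the idea, not the PipedRDD code itself):

      ```scala
      import java.io.{BufferedWriter, OutputStreamWriter, PrintWriter}

      // Wrap the child process's stdin with an explicitly sized buffer instead of
      // relying on PrintWriter's default 8 KB BufferedWriter.
      def stdinWriter(proc: Process, bufferSizeBytes: Int): PrintWriter =
        new PrintWriter(new BufferedWriter(new OutputStreamWriter(proc.getOutputStream), bufferSizeBytes))
      ```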
      
      ## How was this patch tested?
      Ran PipedRDDSuite tests.
      
      Author: Sital Kedia <skedia@fb.com>
      
      Closes #12309 from sitalkedia/bufferedPipedRDD.
  31. May 09, 2016
      [SPARK-10653][CORE] Remove unnecessary things from SparkEnv · c3e23bc0
      Alex Bozarth authored
      ## What changes were proposed in this pull request?
      
      Removed blockTransferService and sparkFilesDir from SparkEnv since they're rarely used and don't need to be stored in the env. Edited their few usages to accommodate the change.
      
      ## How was this patch tested?
      
      ran dev/run-tests locally
      
      Author: Alex Bozarth <ajbozart@us.ibm.com>
      
      Closes #12970 from ajbozarth/spark10653.
  32. May 06, 2016
      [SPARK-14738][BUILD] Separate docker integration tests from main build · a03c5e68
      Luciano Resende authored
      ## What changes were proposed in this pull request?
      
      Create a maven profile for executing the docker integration tests using maven
      Remove docker integration tests from main sbt build
      Update documentation on how to run docker integration tests from sbt
      
      ## How was this patch tested?
      
      Manual test of the docker integration tests, as in:
      mvn -Pdocker-integration-tests -pl :spark-docker-integration-tests_2.11 compile test
      
      ## Other comments
      
      Note that the DB2 Docker tests are still disabled, as there is a kernel version issue on the AMPLab Jenkins slaves and we would need to get them to the right level before enabling those tests. They do run OK locally with the updates from PR #12348.
      
      Author: Luciano Resende <lresende@apache.org>
      
      Closes #12508 from lresende/docker.
  33. May 05, 2016
      [SPARK-14589][SQL] Enhance DB2 JDBC Dialect docker tests · 10443022
      Luciano Resende authored
      ## What changes were proposed in this pull request?
      
      Enhance the DB2 JDBC dialect docker tests, as they seemed to have had some issues in the previous merge that caused some tests to fail.
      
      ## How was this patch tested?
      
      By running the integration tests locally.
      
      Author: Luciano Resende <lresende@apache.org>
      
      Closes #12348 from lresende/SPARK-14589.
  34. Apr 30, 2016
      [SPARK-14952][CORE][ML] Remove methods that were deprecated in 1.6.0 · e5fb78ba
      Herman van Hovell authored
      #### What changes were proposed in this pull request?
      
      This PR removes three methods that were deprecated in 1.6.0:
      - `PortableDataStream.close()`
      - `LinearRegression.weights`
      - `LogisticRegression.weights`
      
      The rationale for doing this is that the impact is small and that Spark 2.0 is a major release.
      
      #### How was this patch tested?
      Compilation succeeded.
      
      Author: Herman van Hovell <hvanhovell@questtec.nl>
      
      Closes #12732 from hvanhovell/SPARK-14952.
  35. Apr 29, 2016
      [SPARK-14511][BUILD] Upgrade genjavadoc to latest upstream · 7226e190
      Jakob Odersky authored
      ## What changes were proposed in this pull request?
      In the past, genjavadoc had issues with package private members which led the spark project to use a forked version. This issue has been fixed upstream (typesafehub/genjavadoc#70) and a release is available for scala versions 2.10, 2.11 **and 2.12**, hence a forked version for spark is no longer necessary.
      This pull request updates the build configuration to use the newest upstream genjavadoc.
      
      ## How was this patch tested?
      The build was run with `sbt unidoc`. During the process, javadoc emits some errors on the generated Java stubs; however, these errors were also present before the upgrade. Furthermore, the produced HTML is fine.
      
      Author: Jakob Odersky <jakob@odersky.com>
      
      Closes #12707 from jodersky/SPARK-14511-genjavadoc.
  36. Apr 28, 2016