  1. Aug 10, 2016
    • jerryshao's avatar
      [SPARK-14743][YARN] Add a configurable credential manager for Spark running on YARN · ab648c00
      jerryshao authored
      ## What changes were proposed in this pull request?
      
      Add a configurable token manager for Spark running on YARN.
      
      ### Current Problems ###
      
      1. The supported token providers are hard-coded; currently only HDFS, HBase and Hive are supported, and it is impossible for users to add a new token provider without code changes.
      2. The same problem exists in the periodic token renewer and updater.
      
      ### Changes In This Proposal ###
      
      To address the problems mentioned above and make the current code cleaner and easier to understand, this proposal mainly contains 3 changes:
      
      1. Abstract a `ServiceTokenProvider` as well as a `ServiceTokenRenewable` interface for token providers. Each service that wants to communicate with Spark through tokens needs to implement this interface (see the sketch below).
      2. Provide a `ConfigurableTokenManager` to manage all the registered token providers, as well as the token renewer and updater. This class also offers APIs for other modules to obtain tokens, get renewal intervals and so on.
      3. Implement 3 built-in token providers `HDFSTokenProvider`, `HiveTokenProvider` and `HBaseTokenProvider` to keep the same semantics as supported today. Whether these built-in token providers are loaded is controlled by the configuration "spark.yarn.security.tokens.${service}.enabled"; by default all the built-in token providers are loaded.
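
      A hedged sketch of what implementing such a provider might look like; the method names and signatures below are assumptions for illustration, not the exact interface added by this PR:

      ```
      import org.apache.hadoop.conf.Configuration
      import org.apache.hadoop.security.Credentials

      // Assumed shape of the provider interface described above.
      trait ServiceTokenProvider {
        def serviceName: String
        def obtainTokens(hadoopConf: Configuration, creds: Credentials): Unit
      }

      // Hypothetical user-implemented provider for a service named "test".
      class TestTokenProvider extends ServiceTokenProvider {
        override def serviceName: String = "test"
        override def obtainTokens(hadoopConf: Configuration, creds: Credentials): Unit = {
          // fetch a delegation token from the external service and add it to `creds`
        }
      }
      ```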
      
      ### Behavior Changes ###
      
      For the end user there is no behavior change; we still use the same configuration `spark.yarn.security.tokens.${service}.enabled` to decide which token provider is enabled (hbase or hive).
      
      A user-implemented token provider (assume the provider's name is "test") that needs to be added to this manager should have two configurations:

      1. `spark.yarn.security.tokens.test.enabled` set to true
      2. `spark.yarn.security.tokens.test.class` set to the fully qualified class name.
      
      So we still keep the same semantics as the current code while adding one new configuration (see the sketch below).
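
      A minimal sketch of setting these two keys; the configuration keys come from this proposal, while `com.example.TestTokenProvider` is a hypothetical class name:

      ```
      import org.apache.spark.SparkConf

      // Enable a user-implemented provider named "test" and point Spark at its class.
      val conf = new SparkConf()
        .set("spark.yarn.security.tokens.test.enabled", "true")
        .set("spark.yarn.security.tokens.test.class", "com.example.TestTokenProvider")
      ```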
      
      ### Current Status ###
      
      - [x] token provider interface and management framework.
      - [x] implement built-in token providers (hdfs, hbase, hive).
      - [x] Coverage of unit test.
      - [x] Integrated test with security cluster.
      
      ## How was this patch tested?
      
      Unit test and integrated test.
      
      Please suggest and review, any comment is greatly appreciated.
      
      Author: jerryshao <sshao@hortonworks.com>
      
      Closes #14065 from jerryshao/SPARK-16342.
      ab648c00
  2. Aug 04, 2016
    • Sean Zhong's avatar
      [SPARK-16853][SQL] fixes encoder error in DataSet typed select · 9d7a4740
      Sean Zhong authored
      ## What changes were proposed in this pull request?
      
      For DataSet typed select:
      ```
      def select[U1: Encoder](c1: TypedColumn[T, U1]): Dataset[U1]
      ```
      If type T is a case class or a tuple class that is not atomic, the resulting logical plan's schema will not match the `Dataset[T]` encoder's schema, which causes an encoder error and throws an AnalysisException.
      
      ### Before change:
      ```
      scala> case class A(a: Int, b: Int)
      scala> Seq((0, A(1,2))).toDS.select($"_2".as[A])
      org.apache.spark.sql.AnalysisException: cannot resolve '`a`' given input columns: [_2];
      ..
      ```
      
      ### After change:
      ```
      scala> case class A(a: Int, b: Int)
      scala> Seq((0, A(1,2))).toDS.select($"_2".as[A]).show
      +---+---+
      |  a|  b|
      +---+---+
      |  1|  2|
      +---+---+
      ```
      
      ## How was this patch tested?
      
      Unit test.
      
      Author: Sean Zhong <seanzhong@databricks.com>
      
      Closes #14474 from clockfly/SPARK-16853.
      9d7a4740
  3. Jul 12, 2016
    • petermaxlee's avatar
      [SPARK-16199][SQL] Add a method to list the referenced columns in data source Filter · c9a67621
      petermaxlee authored
      ## What changes were proposed in this pull request?
      It would be useful to support listing the columns that are referenced by a filter. This can help simplify data source planning, because with it we would be able to implement the unhandledFilters method in HadoopFsRelation.
      
      This is based on rxin's patch (#13901) and adds unit tests.
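
      A small usage sketch, assuming the method is exposed as `references: Array[String]` on `Filter` (the filter values are illustrative):

      ```
      import org.apache.spark.sql.sources.{And, EqualTo, GreaterThan}

      // List which columns a data source Filter tree references.
      val filter = And(EqualTo("a", 1), GreaterThan("b", 10))
      val cols: Array[String] = filter.references   // Array("a", "b")
      ```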
      
      ## How was this patch tested?
      Added a new suite FiltersSuite.
      
      Author: petermaxlee <petermaxlee@gmail.com>
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #14120 from petermaxlee/SPARK-16199.
      c9a67621
  4. Jul 11, 2016
    • Reynold Xin's avatar
      [SPARK-16476] Restructure MimaExcludes for easier union excludes · 52b5bb0b
      Reynold Xin authored
      ## What changes were proposed in this pull request?
      It is currently fairly difficult to have proper mima excludes when we cut a version branch. I'm proposing a small change to take the exclude list out of the exclude function, and put it in a variable so we can easily union excludes.
      
      After this change, we can bump pom.xml version to 2.1.0-SNAPSHOT, without bumping the diff base version. Note that I also deleted all the exclude rules for version 1.x, to cut down the size of the file.
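
      An illustrative sketch of the shape only, not the actual contents of MimaExcludes.scala: keeping each version's excludes in a variable lets the next branch simply union the previous list.

      ```
      import com.typesafe.tools.mima.core._

      lazy val v20excludes = Seq(
        ProblemFilters.exclude[MissingMethodProblem]("org.apache.spark.SomeClass.someMethod")
      )

      // A future branch unions the previous excludes with its own additions.
      lazy val v21excludes = v20excludes ++ Seq(
        ProblemFilters.exclude[MissingClassProblem]("org.apache.spark.OtherClass")
      )
      ```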
      
      ## How was this patch tested?
      N/A - this is a build infra change.
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #14128 from rxin/SPARK-16476.
      52b5bb0b
  5. Jul 06, 2016
  6. Jul 05, 2016
    • cody koeninger's avatar
      [SPARK-16359][STREAMING][KAFKA] unidoc skip kafka 0.10 · 1f0d0213
      cody koeninger authored
      ## What changes were proposed in this pull request?
      During the sbt unidoc task, skip the streamingKafka010 subproject and filter Kafka 0.10 classes from the classpath, so that at least the existing Kafka 0.8 doc can be included in unidoc without errors.
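
      A hedged build sketch of the exclusion using sbt-unidoc's project filter; `streamingKafka010` is assumed here to be the project id used in SparkBuild:

      ```
      // SparkBuild-style fragment: drop the kafka-0-10 subproject from unidoc's input.
      unidocProjectFilter in (ScalaUnidoc, unidoc) :=
        inAnyProject -- inProjects(streamingKafka010)
      ```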
      
      ## How was this patch tested?
      sbt spark/scalaunidoc:doc | grep -i error
      
      Author: cody koeninger <cody@koeninger.org>
      
      Closes #14041 from koeninger/SPARK-16359.
      1f0d0213
  7. Jul 04, 2016
  8. Jun 30, 2016
  9. Jun 27, 2016
  10. Jun 22, 2016
  11. Jun 15, 2016
    • Reynold Xin's avatar
      [SPARK-15851][BUILD] Fix the call of the bash script to enable proper run in Windows · 5a52ba0f
      Reynold Xin authored
      ## What changes were proposed in this pull request?
      The way the bash script `build/spark-build-info` is called from core/pom.xml prevents Spark from building on Windows. Instead of calling the script directly, we call bash and pass the script as an argument. This enables running it on Windows with bash installed, which typically comes with Git.
      
      This brings https://github.com/apache/spark/pull/13612 up-to-date and also addresses comments from the code review.
      
      Closes #13612
      
      ## How was this patch tested?
      I built manually (on a Mac) to verify it didn't break Mac compilation.
      
      Author: Reynold Xin <rxin@databricks.com>
      Author: avulanov <nashb@yandex.ru>
      
      Closes #13691 from rxin/SPARK-15851.
      5a52ba0f
  12. Jun 14, 2016
  13. Jun 11, 2016
    • Eric Liang's avatar
      [SPARK-15881] Update microbenchmark results for WideSchemaBenchmark · 5bb4564c
      Eric Liang authored
      ## What changes were proposed in this pull request?
      
      These were not updated after performance improvements. To make updating them easier, I also moved the results from inline comments out into a file, which is auto-generated when the benchmark is re-run.
      
      Author: Eric Liang <ekl@databricks.com>
      
      Closes #13607 from ericl/sc-3538.
      5bb4564c
  14. Jun 09, 2016
    • Josh Rosen's avatar
      [SPARK-15827][BUILD] Publish Spark's forked sbt-pom-reader to Maven Central · f74b7771
      Josh Rosen authored
      Spark's SBT build currently uses a fork of the sbt-pom-reader plugin but depends on that fork via an SBT subproject which is cloned from https://github.com/scrapcodes/sbt-pom-reader/tree/ignore_artifact_id. This unnecessarily slows down the initial build on fresh machines and is also risky because the build would break if that GitHub repository ever changes or is deleted.
      
      In order to address these issues, I have published a pre-built binary of our forked sbt-pom-reader plugin to Maven Central under the `org.spark-project` namespace and have updated Spark's build to use that artifact. This published artifact was built from https://github.com/JoshRosen/sbt-pom-reader/tree/v1.0.0-spark, which contains the contents of ScrapCodes's branch plus an additional patch to configure the build for artifact publication.
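
      A hedged sketch of how such a published plugin is typically pulled in from project/plugins.sbt; the group id matches the namespace mentioned above, while the artifact name and version string are assumptions for illustration:

      ```
      // project/plugins.sbt fragment: use the published artifact instead of a cloned subproject.
      addSbtPlugin("org.spark-project" % "sbt-pom-reader" % "1.0.0-spark")
      ```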
      
      /cc srowen ScrapCodes for review.
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #13564 from JoshRosen/use-published-fork-of-pom-reader.
      f74b7771
  15. Jun 06, 2016
  16. May 31, 2016
    • Marcelo Vanzin's avatar
      [SPARK-15451][BUILD] Use jdk7's rt.jar when available. · 57adb77e
      Marcelo Vanzin authored
      This helps with preventing jdk8-specific calls being checked in,
      because PR builders are running the compiler with the wrong settings.
      
      If the JAVA_7_HOME env variable is set, assume it points at
      a jdk7 and use its rt.jar when invoking javac. For zinc, just run
      it with jdk7, and disable it when building jdk8-specific code.
      
      A big note for sbt usage: adding the bootstrap options forces sbt
      to fork the compiler, and that disables incremental compilation.
      That means that it's really not convenient to use for normal
      development, but should be ok for automated builds.
      
      Tested with JAVA_HOME=jdk8 and JAVA_7_HOME=jdk7:
      - mvn + zinc
      - mvn sans zinc
      - sbt
      
      Verified that in all cases, jdk8-specific library calls fail to
      compile.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #13272 from vanzin/SPARK-15451.
      57adb77e
  17. May 27, 2016
    • DB Tsai's avatar
      [SPARK-15413][ML][MLLIB] Change `toBreeze` to `asBreeze` in Vector and Matrix · 21b2605d
      DB Tsai authored
      ## What changes were proposed in this pull request?
      
      We're using `asML` to convert the mllib vector/matrix to the ml vector/matrix now. Using `as` is more correct given that this conversion actually shares the same underlying data structure. As a result, in this PR, `toBreeze` will be changed to `asBreeze`. This is a private API; as a result, it will not affect any user's application.
      
      ## How was this patch tested?
      
      unit tests
      
      Author: DB Tsai <dbt@netflix.com>
      
      Closes #13198 from dbtsai/minor.
      21b2605d
  18. May 26, 2016
    • Yin Huai's avatar
      [SPARK-15532][SQL] SQLContext/HiveContext's public constructors should use... · 3ac2363d
      Yin Huai authored
      [SPARK-15532][SQL] SQLContext/HiveContext's public constructors should use SparkSession.build.getOrCreate
      
      ## What changes were proposed in this pull request?
      This PR changes SQLContext/HiveContext's public constructors to use `SparkSession.builder.getOrCreate` and removes isRootContext from SQLContext.
      
      ## How was this patch tested?
      Existing tests.
      
      Author: Yin Huai <yhuai@databricks.com>
      
      Closes #13310 from yhuai/SPARK-15532.
      3ac2363d
    • Reynold Xin's avatar
      [SPARK-15543][SQL] Rename DefaultSources to make them more self-describing · 361ebc28
      Reynold Xin authored
      ## What changes were proposed in this pull request?
      This patch renames various DefaultSources to make their names more self-describing. The choice of "DefaultSource" was from the days when we did not have a good way to specify short names.
      
      They are now named:
      - LibSVMFileFormat
      - CSVFileFormat
      - JdbcRelationProvider
      - JsonFileFormat
      - ParquetFileFormat
      - TextFileFormat
      
      Backward compatibility is maintained through aliasing.
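
      For context, a small sketch of why typical user code is unaffected: sources are looked up by short name, and the old fully qualified names keep resolving through the aliases (file paths below are illustrative):

      ```
      import org.apache.spark.sql.SparkSession

      val spark = SparkSession.builder().master("local[*]").appName("formats").getOrCreate()

      // Short names map to the renamed classes, e.g. "csv" -> CSVFileFormat.
      val df = spark.read.format("csv").option("header", "true").load("people.csv")
      df.write.format("parquet").save("people.parquet")
      ```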
      
      ## How was this patch tested?
      Updated relevant test cases too.
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #13311 from rxin/SPARK-15543.
      361ebc28
  19. May 25, 2016
    • Herman van Hovell's avatar
      [SPARK-15525][SQL][BUILD] Upgrade ANTLR4 SBT plugin · 527499b6
      Herman van Hovell authored
      ## What changes were proposed in this pull request?
      The ANTLR4 SBT plugin has been moved from its own repo to one on bintray. The version was also changed from `0.7.10` to `0.7.11`. The latter actually broke our build (ihji has fixed this by also adding `0.7.10` and others to the bintray repo).
      
      This PR upgrades the SBT-ANTLR4 plugin and ANTLR4 to their most recent versions (`0.7.11`/`4.5.3`). I have also removed a few obsolete build configurations.
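
      A hedged sketch of the resulting plugin declaration; the sbt-antlr4 coordinates below are an assumption for illustration:

      ```
      // project/plugins.sbt fragment after the upgrade.
      addSbtPlugin("com.simplytyped" % "sbt-antlr4" % "0.7.11")
      ```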
      
      ## How was this patch tested?
      Manually running SBT/Maven builds.
      
      Author: Herman van Hovell <hvanhovell@questtec.nl>
      
      Closes #13299 from hvanhovell/SPARK-15525.
      527499b6
  20. May 21, 2016
    • Reynold Xin's avatar
      [SPARK-15424][SPARK-15437][SPARK-14807][SQL] Revert Create a hivecontext-compatibility module · 45b7557e
      Reynold Xin authored
      ## What changes were proposed in this pull request?
      I initially asked to create a hivecontext-compatibility module to put the HiveContext there. But we are so close to the Spark 2.0 release and there is only a single class in it. It seems overkill, and more inconvenient, to have an entire package for a single class.
      
      ## How was this patch tested?
      Tests were moved.
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #13207 from rxin/SPARK-15424.
      45b7557e
  21. May 18, 2016
    • Davies Liu's avatar
      [SPARK-15357] Cooperative spilling should check consumer memory mode · 8fb1d1c7
      Davies Liu authored
      ## What changes were proposed in this pull request?
      
      Since we support forced spilling for Spillable, which only works in OnHeap mode, unlike other SQL operators (which could be OnHeap or OffHeap), we should consider the memory mode of the consumer before triggering forced spilling.
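
      A self-contained toy sketch of the idea (not Spark's actual TaskMemoryManager code): before forcing another consumer to spill, check that it uses the same memory mode as the requester.

      ```
      sealed trait MemoryMode
      case object OnHeap extends MemoryMode
      case object OffHeap extends MemoryMode

      trait Consumer { def mode: MemoryMode; def spill(bytes: Long): Long }

      // Only consumers in the requester's memory mode are asked to spill.
      def forceSpill(consumers: Seq[Consumer], requester: Consumer, needed: Long): Long =
        consumers.filter(_.mode == requester.mode)
          .foldLeft(0L)((got, c) => if (got >= needed) got else got + c.spill(needed - got))
      ```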
      
      ## How was this patch tested?
      
      Add new test.
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #13151 from davies/fix_mode.
      8fb1d1c7
  22. May 17, 2016
    • DB Tsai's avatar
      [SPARK-14615][ML] Use the new ML Vector and Matrix in the ML pipeline based algorithms · e2efe052
      DB Tsai authored
      ## What changes were proposed in this pull request?
      
      Once SPARK-14487 and SPARK-14549 are merged, we will migrate to use the new vector and matrix types in the new ml pipeline based APIs.
      
      ## How was this patch tested?
      
      Unit tests
      
      Author: DB Tsai <dbt@netflix.com>
      Author: Liang-Chi Hsieh <simonh@tw.ibm.com>
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #12627 from dbtsai/SPARK-14615-NewML.
      e2efe052
    • Sean Owen's avatar
      [SPARK-15290][BUILD] Move annotations, like @Since / @DeveloperApi, into spark-tags · 122302cb
      Sean Owen authored
      ## What changes were proposed in this pull request?
      
      (See https://github.com/apache/spark/pull/12416 where most of this was already reviewed and committed; this is just the module structure and move part. This change does not move the annotations into test scope, which was apparently the problem last time.)
      
      Rename `spark-test-tags` -> `spark-tags`; move common annotations like `Since` to `spark-tags`
      
      ## How was this patch tested?
      
      Jenkins tests.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #13074 from srowen/SPARK-15290.
      122302cb
  23. May 11, 2016
    • cody koeninger's avatar
      [SPARK-15085][STREAMING][KAFKA] Rename streaming-kafka artifact · 89e67d66
      cody koeninger authored
      ## What changes were proposed in this pull request?
      Renaming the streaming-kafka artifact to include the Kafka version, in anticipation of needing a different artifact for later Kafka versions.
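
      A dependency sketch of what the rename looks like to downstream builds; the coordinates below are illustrative (shown for the Kafka 0.8 connector against a Spark 2.0.0 release):

      ```
      // build.sbt fragment: the artifact name now carries the Kafka version.
      libraryDependencies += "org.apache.spark" %% "spark-streaming-kafka-0-8" % "2.0.0"
      ```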
      
      ## How was this patch tested?
      Unit tests
      
      Author: cody koeninger <cody@koeninger.org>
      
      Closes #12946 from koeninger/SPARK-15085.
      89e67d66
    • hyukjinkwon's avatar
      [SPARK-15250][SQL] Remove deprecated json API in DataFrameReader · 3ff01205
      hyukjinkwon authored
      ## What changes were proposed in this pull request?
      
      This PR removes the old `json(path: String)` API which is covered by the new `json(paths: String*)`.
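
      A small sketch of why this is safe for callers: the varargs overload covers the single-path case (paths below are illustrative):

      ```
      import org.apache.spark.sql.SparkSession

      val spark = SparkSession.builder().master("local[*]").appName("json").getOrCreate()

      // Both calls go through json(paths: String*).
      val one  = spark.read.json("logs/2016-05-11.json")
      val many = spark.read.json("logs/2016-05-11.json", "logs/2016-05-12.json")
      ```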
      
      ## How was this patch tested?
      
      Jenkins tests (existing tests should cover this)
      
      Author: hyukjinkwon <gurwls223@gmail.com>
      Author: Hyukjin Kwon <gurwls223@gmail.com>
      
      Closes #13040 from HyukjinKwon/SPARK-15250.
      3ff01205
  24. May 10, 2016
    • Sital Kedia's avatar
      [SPARK-14542][CORE] PipeRDD should allow configurable buffer size for… · a019e6ef
      Sital Kedia authored
      ## What changes were proposed in this pull request?
      
      Currently PipedRDD internally uses PrintWriter to write data to the stdin of the piped process, which by default uses a BufferedWriter with a buffer size of 8k. In our experiments, we have seen that an 8k buffer size is too small and the job spends a significant amount of CPU time in system calls to copy the data. We should have a way to configure the buffer size for the writer.
      
      ## How was this patch tested?
      Ran PipedRDDSuite tests.
      
      Author: Sital Kedia <skedia@fb.com>
      
      Closes #12309 from sitalkedia/bufferedPipedRDD.
      a019e6ef
  25. May 09, 2016
    • Alex Bozarth's avatar
      [SPARK-10653][CORE] Remove unnecessary things from SparkEnv · c3e23bc0
      Alex Bozarth authored
      ## What changes were proposed in this pull request?
      
      Removed blockTransferService and sparkFilesDir from SparkEnv since they're rarely used and don't need to be stored in the env. Edited their few usages to accommodate the change.
      
      ## How was this patch tested?
      
      ran dev/run-tests locally
      
      Author: Alex Bozarth <ajbozart@us.ibm.com>
      
      Closes #12970 from ajbozarth/spark10653.
      c3e23bc0
  26. May 06, 2016
    • Luciano Resende's avatar
      [SPARK-14738][BUILD] Separate docker integration tests from main build · a03c5e68
      Luciano Resende authored
      ## What changes were proposed in this pull request?
      
      - Create a maven profile for executing the docker integration tests using maven
      - Remove docker integration tests from main sbt build
      - Update documentation on how to run docker integration tests from sbt
      
      ## How was this patch tested?
      
      Manual test of the docker integration tests as in :
      mvn -Pdocker-integration-tests -pl :spark-docker-integration-tests_2.11 compile test
      
      ## Other comments
      
      Note that the DB2 Docker tests are still disabled as there is a kernel version issue on the AMPLab Jenkins slaves and we would need to get them on the right level before enabling those tests. They do run OK locally with the updates from PR #12348.
      
      Author: Luciano Resende <lresende@apache.org>
      
      Closes #12508 from lresende/docker.
      a03c5e68
  27. May 05, 2016
    • Luciano Resende's avatar
      [SPARK-14589][SQL] Enhance DB2 JDBC Dialect docker tests · 10443022
      Luciano Resende authored
      ## What changes were proposed in this pull request?
      
      Enhance the DB2 JDBC Dialect docker tests, as they seemed to have had some issues in a previous merge that caused some tests to fail.
      
      ## How was this patch tested?
      
      By running the integration tests locally.
      
      Author: Luciano Resende <lresende@apache.org>
      
      Closes #12348 from lresende/SPARK-14589.
      10443022
  28. Apr 30, 2016
    • Herman van Hovell's avatar
      [SPARK-14952][CORE][ML] Remove methods that were deprecated in 1.6.0 · e5fb78ba
      Herman van Hovell authored
      #### What changes were proposed in this pull request?
      
      This PR removes three methods that were deprecated in 1.6.0:
      - `PortableDataStream.close()`
      - `LinearRegression.weights`
      - `LogisticRegression.weights`
      
      The rationale for doing this is that the impact is small and that Spark 2.0 is a major release.
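
      A hedged migration sketch for the ML methods; `coefficients` is the replacement the 1.6.0 deprecation pointed to, and the helper below is purely illustrative (model construction omitted):

      ```
      import org.apache.spark.ml.regression.LinearRegressionModel

      // Read coefficients where code previously read the removed `weights` member.
      def printCoefficients(model: LinearRegressionModel): Unit = {
        println(model.coefficients)
      }
      ```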
      
      #### How was this patch tested?
      Compilation succeeded.
      
      Author: Herman van Hovell <hvanhovell@questtec.nl>
      
      Closes #12732 from hvanhovell/SPARK-14952.
      e5fb78ba
  29. Apr 29, 2016
    • Jakob Odersky's avatar
      [SPARK-14511][BUILD] Upgrade genjavadoc to latest upstream · 7226e190
      Jakob Odersky authored
      ## What changes were proposed in this pull request?
      In the past, genjavadoc had issues with package private members which led the Spark project to use a forked version. This issue has been fixed upstream (typesafehub/genjavadoc#70) and a release is available for Scala versions 2.10, 2.11 **and 2.12**, hence a forked version for Spark is no longer necessary.
      This pull request updates the build configuration to use the newest upstream genjavadoc.
      
      ## How was this patch tested?
      The build was run with `sbt unidoc`. During the process javadoc emits some errors on the generated java stubs, however these errors were also present before the upgrade. Furthermore, the produced html is fine.
      
      Author: Jakob Odersky <jakob@odersky.com>
      
      Closes #12707 from jodersky/SPARK-14511-genjavadoc.
      7226e190
  30. Apr 28, 2016
    • Pravin Gadakh's avatar
      [SPARK-14613][ML] Add @Since into the matrix and vector classes in spark-mllib-local · dae538a4
      Pravin Gadakh authored
      ## What changes were proposed in this pull request?
      
      This PR adds the `Since` tag to the matrix and vector classes in spark-mllib-local.
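
      An illustrative sketch of the annotation style only; the `Since` class below is a toy stand-in for `org.apache.spark.annotation.Since` (which is intended for Spark's own sources), and `MyVector` is hypothetical:

      ```
      import scala.annotation.StaticAnnotation

      class Since(version: String) extends StaticAnnotation

      // Version tags on the class, the constructor and a constructor parameter.
      @Since("2.0.0")
      class MyVector @Since("2.0.0") (@Since("2.0.0") val values: Array[Double])
      ```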
      
      ## How was this patch tested?
      
      Scala-style checks passed.
      
      Author: Pravin Gadakh <prgadakh@in.ibm.com>
      
      Closes #12416 from pravingadakh/SPARK-14613.
      dae538a4
    • Wenchen Fan's avatar
      [SPARK-14654][CORE] New accumulator API · bf5496db
      Wenchen Fan authored
      ## What changes were proposed in this pull request?
      
      This PR introduces a new accumulator API which is much simpler than before:

      1. the type hierarchy is simplified; now we only have an `Accumulator` class
      2. Combine `initialValue` and `zeroValue` concepts into just one concept: `zeroValue`
      3. there is only one `register` method; the accumulator registration and cleanup registration are combined.
      4. the `id`, `name` and `countFailedValues` are combined into an `AccumulatorMetadata`, which is provided during registration.
      
      `SQLMetric` is a good example to show the simplicity of this new API.
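
      A usage sketch of how the simplified API surfaces to applications in Spark 2.0, using the built-in long accumulator (internal class names in this PR may differ):

      ```
      import org.apache.spark.sql.SparkSession

      val spark = SparkSession.builder().master("local[*]").appName("acc").getOrCreate()
      val sc = spark.sparkContext

      // Creation and registration happen in one step; no separate zero/initial value.
      val errorCount = sc.longAccumulator("errorCount")
      sc.parallelize(1 to 100).foreach { i => if (i % 10 == 0) errorCount.add(1) }
      println(errorCount.value)   // 10
      ```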
      
      What we break:
      
      1. no `setValue` anymore. In the new API, the intermediate type can be different from the result type, so it's very hard to implement a general `setValue`.
      2. accumulators can't be serialized before being registered.
      
      Problems need to be addressed in follow-ups:
      
      1. with this new API, `AccumulatorInfo` doesn't make a lot of sense; the partial output is no longer partial updates, so we need to expose the intermediate value.
      2. `ExceptionFailure` should not carry the accumulator updates. Why do users care about accumulator updates for failed cases? It looks like we only use this feature to update the internal metrics, so how about sending a heartbeat to update internal metrics after the failure event?
      3. the public event `SparkListenerTaskEnd` carries a `TaskMetrics`. Ideally this `TaskMetrics` doesn't need to carry external accumulators, as the only method of `TaskMetrics` that can access external accumulators is `private[spark]`. However, `SQLListener` uses it to retrieve sql metrics.
      
      ## How was this patch tested?
      
      existing tests
      
      Author: Wenchen Fan <wenchen@databricks.com>
      
      Closes #12612 from cloud-fan/acc.
      bf5496db
  31. Apr 27, 2016
  32. Apr 25, 2016
    • Andrew Or's avatar
      [SPARK-14861][SQL] Replace internal usages of SQLContext with SparkSession · 18c2c925
      Andrew Or authored
      ## What changes were proposed in this pull request?
      
      In Spark 2.0, `SparkSession` is the new thing. Internally we should stop using `SQLContext` everywhere, since it's no longer supposed to be the main user-facing API.
      
      In this patch I took care to not break any public APIs. The one place that's suspect is `o.a.s.ml.source.libsvm.DefaultSource`, but according to mengxr it's not supposed to be public so it's OK to change the underlying `FileFormat` trait.
      
      **Reviewers**: This is a big patch that may be difficult to review but the changes are actually really straightforward. If you prefer I can break it up into a few smaller patches, but it will delay the progress of this issue a little.
      
      ## How was this patch tested?
      
      No change in functionality intended.
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #12625 from andrewor14/spark-session-refactor.
      18c2c925
    • Eric Liang's avatar
      [SPARK-14790] Always run scalastyle on sbt compile and test · 761fc46c
      Eric Liang authored
      ## What changes were proposed in this pull request?
      
      Sbt compile and test should also run scalastyle. This makes it less likely that you forget to run scalastyle and fail in Jenkins. Scalastyle results are cached for efficiency.
      
      This patch was originally written by ahirreddy; I just fixed it up to work with scalastyle 0.8.0.
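
      A hedged build sketch of the general recipe for wiring scalastyle into compile (the standard scalastyle-sbt-plugin approach, not necessarily Spark's exact SparkBuild code):

      ```
      // build.sbt fragment: run the scalastyle check whenever compile runs.
      lazy val compileScalastyle = taskKey[Unit]("scalastyle on compile")

      compileScalastyle := scalastyle.in(Compile).toTask("").value

      (compile in Compile) := ((compile in Compile) dependsOn compileScalastyle).value
      ```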
      
      ## How was this patch tested?
      
      Tested manually with `build/sbt package`.
      
      Author: Eric Liang <ekl@databricks.com>
      
      Closes #12555 from ericl/scalastyle.
      761fc46c
  33. Apr 22, 2016
    • Yin Huai's avatar
      [SPARK-14807] Create a compatibility module · 7dde1da9
      Yin Huai authored
      ## What changes were proposed in this pull request?
      
      This PR creates a compatibility module in sql (called `hive-1-x-compatibility`), which will host HiveContext in Spark 2.0 (moving HiveContext to here will be done separately). This module is not included in assembly because only users who still want to access HiveContext need it.
      
      ## How was this patch tested?
      I manually tested `sbt/sbt -Phive package` and `mvn -Phive package -DskipTests`.
      
      Author: Yin Huai <yhuai@databricks.com>
      
      Closes #12580 from yhuai/compatibility.
      7dde1da9
    • Joan's avatar
      [SPARK-6429] Implement hashCode and equals together · bf95b8da
      Joan authored
      ## What changes were proposed in this pull request?
      
      Implement some `hashCode` and `equals` together in order to enable the scalastyle rule (see the illustration below).
      This is a first batch; I will continue to implement them, but I wanted to know your thoughts.
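
      A generic illustration of the rule being enforced, not code from this PR: a class that overrides `equals` also overrides `hashCode`, using the same fields:

      ```
      class Point(val x: Int, val y: Int) {
        override def equals(other: Any): Boolean = other match {
          case p: Point => p.x == x && p.y == y
          case _        => false
        }
        // Consistent with equals: built from the same fields.
        override def hashCode(): Int = 31 * x + y
      }
      ```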
      
      Author: Joan <joan@goyeau.com>
      
      Closes #12157 from joan38/SPARK-6429-HashCode-Equals.
      bf95b8da