  1. Jul 28, 2014
    • Cheng Lian's avatar
      [SPARK-2410][SQL] Merging Hive Thrift/JDBC server (with Maven profile fix) · a7a9d144
      Cheng Lian authored
      JIRA issue: [SPARK-2410](https://issues.apache.org/jira/browse/SPARK-2410)
      
      Another try for #1399 & #1600. Those two PRs broke Jenkins builds because we made a separate profile `hive-thriftserver` in sub-project `assembly`, but the `hive-thriftserver` module was defined outside the `hive-thriftserver` profile. Thus every pull request, even one that doesn't touch SQL code, also executed the test suites defined in `hive-thriftserver`, and those tests failed because the related .class files were not included in the assembly jar.
      
      In the most recent commit, module `hive-thriftserver` is moved into its own profile to fix this problem. All previous commits are squashed for clarity.
      
      Author: Cheng Lian <lian.cs.zju@gmail.com>
      
      Closes #1620 from liancheng/jdbc-with-maven-fix and squashes the following commits:
      
      629988e [Cheng Lian] Moved hive-thriftserver module definition into its own profile
      ec3c7a7 [Cheng Lian] Cherry picked the Hive Thrift server
      a7a9d144
  2. Jul 27, 2014
    • Rahul Singhal's avatar
      SPARK-2651: Add maven scalastyle plugin · d7eac4c3
      Rahul Singhal authored
      Can be run as: "mvn scalastyle:check"
      
      Author: Rahul Singhal <rahul.singhal@guavus.com>
      
      Closes #1550 from rahulsinghaliitd/SPARK-2651 and squashes the following commits:
      
      53748dd [Rahul Singhal] SPARK-2651: Add maven scalastyle plugin
      d7eac4c3
    • Patrick Wendell's avatar
      Revert "[SPARK-2410][SQL] Merging Hive Thrift/JDBC server" · e5bbce9a
      Patrick Wendell authored
      This reverts commit f6ff2a61.
      e5bbce9a
    • Cheng Lian's avatar
      [SPARK-2410][SQL] Merging Hive Thrift/JDBC server · f6ff2a61
      Cheng Lian authored
      (This is a replacement of #1399, trying to fix potential `HiveThriftServer2` port collision between parallel builds. Please refer to [these comments](https://github.com/apache/spark/pull/1399#issuecomment-50212572) for details.)
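
      A common way to avoid this kind of port collision between parallel test runs is to let the operating system hand out an unused port instead of hard-coding one. The following is a minimal Python sketch of that idea, not the code from this commit; the helper name `pick_free_port` is hypothetical.

      ```python
      import socket


      def pick_free_port(host="127.0.0.1"):
          """Ask the OS for a currently unused TCP port by binding to port 0.

          Hypothetical helper for illustration only.
          """
          with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
              # Port 0 tells the OS to choose any free ephemeral port.
              sock.bind((host, 0))
              return sock.getsockname()[1]


      if __name__ == "__main__":
          print(pick_free_port())
      ```

      Note that the port is released when the socket closes, so another process could still grab it before the server under test binds to it. This narrows the collision window rather than eliminating it.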
      
      JIRA issue: [SPARK-2410](https://issues.apache.org/jira/browse/SPARK-2410)
      
      Merging the Hive Thrift/JDBC server from [branch-1.0-jdbc](https://github.com/apache/spark/tree/branch-1.0-jdbc).
      
      Thanks chenghao-intel for his initial contribution of the Spark SQL CLI.
      
      Author: Cheng Lian <lian.cs.zju@gmail.com>
      
      Closes #1600 from liancheng/jdbc and squashes the following commits:
      
      ac4618b [Cheng Lian] Uses random port for HiveThriftServer2 to avoid collision with parallel builds
      090beea [Cheng Lian] Revert changes related to SPARK-2678, decided to move them to another PR
      21c6cf4 [Cheng Lian] Updated Spark SQL programming guide docs
      fe0af31 [Cheng Lian] Reordered spark-submit options in spark-shell[.cmd]
      199e3fb [Cheng Lian] Disabled MIMA for hive-thriftserver
      1083e9d [Cheng Lian] Fixed failed test suites
      7db82a1 [Cheng Lian] Fixed spark-submit application options handling logic
      9cc0f06 [Cheng Lian] Starts beeline with spark-submit
      cfcf461 [Cheng Lian] Updated documents and build scripts for the newly added hive-thriftserver profile
      061880f [Cheng Lian] Addressed all comments by @pwendell
      7755062 [Cheng Lian] Adapts test suites to spark-submit settings
      40bafef [Cheng Lian] Fixed more license header issues
      e214aab [Cheng Lian] Added missing license headers
      b8905ba [Cheng Lian] Fixed minor issues in spark-sql and start-thriftserver.sh
      f975d22 [Cheng Lian] Updated docs for Hive compatibility and Shark migration guide draft
      3ad4e75 [Cheng Lian] Starts spark-sql shell with spark-submit
      a5310d1 [Cheng Lian] Make HiveThriftServer2 play well with spark-submit
      61f39f4 [Cheng Lian] Starts Hive Thrift server via spark-submit
      2c4c539 [Cheng Lian] Cherry picked the Hive Thrift server
      f6ff2a61
  3. Jul 25, 2014
    • Michael Armbrust's avatar
      Revert "[SPARK-2410][SQL] Merging Hive Thrift/JDBC server" · afd757a2
      Michael Armbrust authored
      This reverts commit 06dc0d2c.
      
      #1399 is making Jenkins fail. We should investigate and put this back after it's passing tests.
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #1594 from marmbrus/revertJDBC and squashes the following commits:
      
      59748da [Michael Armbrust] Revert "[SPARK-2410][SQL] Merging Hive Thrift/JDBC server"
      afd757a2
    • Cheng Lian's avatar
      [SPARK-2410][SQL] Merging Hive Thrift/JDBC server · 06dc0d2c
      Cheng Lian authored
      JIRA issue:
      
      - Main: [SPARK-2410](https://issues.apache.org/jira/browse/SPARK-2410)
      - Related: [SPARK-2678](https://issues.apache.org/jira/browse/SPARK-2678)
      
      Cherry picked the Hive Thrift/JDBC server from [branch-1.0-jdbc](https://github.com/apache/spark/tree/branch-1.0-jdbc).
      
      (Thanks chenghao-intel for his initial contribution of the Spark SQL CLI.)
      
      TODO
      
      - [x] Use `spark-submit` to launch the server, the CLI and beeline
      - [x] Migration guideline draft for Shark users
      
      ----
      
      Hit by a bug in `SparkSubmitArguments` while working on this PR: all application options that are recognized by `SparkSubmitArguments` are stolen as `SparkSubmit` options. For example:
      
      ```bash
      $ spark-submit --class org.apache.hive.beeline.BeeLine spark-internal --help
      ```
      
      This actually shows usage information of `SparkSubmit` rather than `BeeLine`.
      
      ~~Fixed this bug here since the `spark-internal` related stuff also touches `SparkSubmitArguments` and I'd like to avoid conflict.~~
      
      **UPDATE** The bug mentioned above is now tracked by [SPARK-2678](https://issues.apache.org/jira/browse/SPARK-2678). Decided to revert the changes for this bug since it involves more subtle considerations and is worth a separate PR.
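
      To illustrate the behavior one would expect in the `BeeLine` example above, here is a minimal Python sketch of an argument splitter that stops interpreting options once it reaches the primary resource, so that everything after it is handed to the application untouched. This is an illustration under assumptions, not how `SparkSubmitArguments` is implemented and not the fix tracked in SPARK-2678; the function name `split_submit_args` and the small set of launcher flags are hypothetical.

      ```python
      def split_submit_args(argv, value_flags=("--class", "--master")):
          """Split argv into (launcher_opts, primary_resource, app_args).

          Hypothetical sketch: options are consumed only until the first
          positional argument (the primary resource). Everything after it
          belongs to the application and is never inspected.
          """
          launcher_opts = {}
          i = 0
          while i < len(argv):
              arg = argv[i]
              if arg in value_flags:
                  if i + 1 >= len(argv):
                      raise ValueError(f"missing value for {arg}")
                  launcher_opts[arg] = argv[i + 1]
                  i += 2
              elif arg == "--help":
                  # Only a --help seen BEFORE the primary resource is ours.
                  launcher_opts[arg] = True
                  i += 1
              elif arg.startswith("-"):
                  raise ValueError(f"unknown launcher option: {arg}")
              else:
                  # First positional argument: the rest goes to the app.
                  return launcher_opts, arg, argv[i + 1:]
          return launcher_opts, None, []


      if __name__ == "__main__":
          print(split_submit_args(
              ["--class", "org.apache.hive.beeline.BeeLine",
               "spark-internal", "--help"]))
      ```

      With this splitting rule, the `--help` in the example would reach `BeeLine` instead of being treated as a launcher option.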
      
      Author: Cheng Lian <lian.cs.zju@gmail.com>
      
      Closes #1399 from liancheng/thriftserver and squashes the following commits:
      
      090beea [Cheng Lian] Revert changes related to SPARK-2678, decided to move them to another PR
      21c6cf4 [Cheng Lian] Updated Spark SQL programming guide docs
      fe0af31 [Cheng Lian] Reordered spark-submit options in spark-shell[.cmd]
      199e3fb [Cheng Lian] Disabled MIMA for hive-thriftserver
      1083e9d [Cheng Lian] Fixed failed test suites
      7db82a1 [Cheng Lian] Fixed spark-submit application options handling logic
      9cc0f06 [Cheng Lian] Starts beeline with spark-submit
      cfcf461 [Cheng Lian] Updated documents and build scripts for the newly added hive-thriftserver profile
      061880f [Cheng Lian] Addressed all comments by @pwendell
      7755062 [Cheng Lian] Adapts test suites to spark-submit settings
      40bafef [Cheng Lian] Fixed more license header issues
      e214aab [Cheng Lian] Added missing license headers
      b8905ba [Cheng Lian] Fixed minor issues in spark-sql and start-thriftserver.sh
      f975d22 [Cheng Lian] Updated docs for Hive compatibility and Shark migration guide draft
      3ad4e75 [Cheng Lian] Starts spark-sql shell with spark-submit
      a5310d1 [Cheng Lian] Make HiveThriftServer2 play well with spark-submit
      61f39f4 [Cheng Lian] Starts Hive Thrift server via spark-submit
      2c4c539 [Cheng Lian] Cherry picked the Hive Thrift server
      06dc0d2c
  4. Jul 21, 2014
    • Cheng Lian's avatar
      [SPARK-2190][SQL] Specialized ColumnType for Timestamp · cd273a23
      Cheng Lian authored
      JIRA issue: [SPARK-2190](https://issues.apache.org/jira/browse/SPARK-2190)
      
      Added specialized in-memory column type for `Timestamp`. Whitelisted all timestamp related Hive tests except `timestamp_udf`, which is timezone sensitive.
      
      Author: Cheng Lian <lian.cs.zju@gmail.com>
      
      Closes #1440 from liancheng/timestamp-column-type and squashes the following commits:
      
      e682175 [Cheng Lian] Enabled more timezone sensitive Hive tests.
      53a358f [Cheng Lian] Fixed failed test suites
      01b592d [Cheng Lian] Fixed SimpleDateFormat thread safety issue
      2a59343 [Cheng Lian] Removed timezone sensitive Hive timestamp tests
      45dd05d [Cheng Lian] Added Timestamp specific in-memory columnar representation
      cd273a23
  5. Jun 11, 2014
    • Prashant Sharma's avatar
      [SPARK-2069] MIMA false positives · 5b754b45
      Prashant Sharma authored
      Fixes SPARK-2070 and SPARK-2071.
      
      Author: Prashant Sharma <prashant.s@imaginea.com>
      
      Closes #1021 from ScrapCodes/SPARK-2070/package-private-methods and squashes the following commits:
      
      7979a57 [Prashant Sharma] addressed code review comments
      558546d [Prashant Sharma] A little fancy error message.
      59275ab [Prashant Sharma] SPARK-2071 Mima ignores classes and its members from previous versions too.
      0c4ff2b [Prashant Sharma] SPARK-2070 Ignore methods along with annotated classes.
      5b754b45
  6. Jun 01, 2014
    • Patrick Wendell's avatar
      Better explanation for how to use MIMA excludes. · d17d2214
      Patrick Wendell authored
      This patch does a few things:
      1. We have a file MimaExcludes.scala exclusively for excludes.
      2. The test runner tells users about that file if a test fails.
      3. I've added back the excludes used from 0.9->1.0. We should keep
         these in the project as an official audit trail of times where
         we decided to make exceptions.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #937 from pwendell/mima and squashes the following commits:
      
      7ee0db2 [Patrick Wendell] Better explanation for how to use MIMA excludes.
      d17d2214
  7. May 08, 2014
    • Prashant Sharma's avatar
      SPARK-1565, update examples to be used with spark-submit script. · 44dd57fb
      Prashant Sharma authored
      Commit for initial feedback. Basically, I am curious whether we should prompt the user to provide args, especially when they are mandatory, and whether we can skip the prompt when they are not.
      
      Also, a few other things did not work, like:
      `bin/spark-submit examples/target/scala-2.10/spark-examples-1.0.0-SNAPSHOT-hadoop1.0.4.jar --class org.apache.spark.examples.SparkALS --arg 100 500 10 5 2`
      
      Not all the args get passed properly. Maybe I have messed something up; I will try to sort it out.
      
      Author: Prashant Sharma <prashant.s@imaginea.com>
      
      Closes #552 from ScrapCodes/SPARK-1565/update-examples and squashes the following commits:
      
      669dd23 [Prashant Sharma] Review comments
      2727e70 [Prashant Sharma] SPARK-1565, update examples to be used with spark-submit script.
      44dd57fb
  8. May 04, 2014
    • Michael Armbrust's avatar
      Whitelist Hive Tests · 92b2902c
      Michael Armbrust authored
      This is ready when Jenkins is.
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #596 from marmbrus/moreTests and squashes the following commits:
      
      85be703 [Michael Armbrust] Blacklist MR required tests.
      35bc311 [Michael Armbrust] Add hive golden answers.
      ede98fd [Michael Armbrust] More hive gitignore
      da096ea [Michael Armbrust] update whitelist
      92b2902c
  9. May 01, 2014
    • Michael Armbrust's avatar
      [SQL] SPARK-1661 - Fix regex_serde test · a43d9c14
      Michael Armbrust authored
      The JIRA in question is actually reporting a bug with Shark, but I wanted to make sure Spark SQL did not have similar problems.  This fixes a bug in our parsing code that was preventing the test from executing, but it looks like the RegexSerDe is working in Spark SQL.
      
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #595 from marmbrus/fixRegexSerdeTest and squashes the following commits:
      
      a4dc612 [Michael Armbrust] Add files created by hive to gitignore.
      efa6402 [Michael Armbrust] Fix Hive serde_regex test.
      a43d9c14
  10. Apr 25, 2014
    • Patrick Wendell's avatar
      SPARK-1619 Launch spark-shell with spark-submit · dc3b640a
      Patrick Wendell authored
      This simplifies the shell a bunch and passes all arguments through to spark-submit.
      
      There is a tiny incompatibility with 0.9.1: you can no longer use either `-c` or `--cores`, only `--cores`. However, spark-submit gives a good error message in this case, I don't think many people used `-c`, and it's a trivial change for users.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      
      Closes #542 from pwendell/spark-shell and squashes the following commits:
      
      9eb3e6f [Patrick Wendell] Updating Spark docs
      b552459 [Patrick Wendell] Andrew's feedback
      97720fa [Patrick Wendell] Review feedback
      aa2900b [Patrick Wendell] SPARK-1619 Launch spark-shell with spark-submit
      dc3b640a
  11. Mar 30, 2014
    • Prashant Sharma's avatar
      SPARK-1336 Reducing the output of run-tests script. · df1b9f7b
      Prashant Sharma authored
      Author: Prashant Sharma <prashant.s@imaginea.com>
      Author: Prashant Sharma <scrapcodes@gmail.com>
      
      Closes #262 from ScrapCodes/SPARK-1336/ReduceVerbosity and squashes the following commits:
      
      87dfa54 [Prashant Sharma] Further reduction in noise and made pyspark tests to fail fast.
      811170f [Prashant Sharma] Reducing the output of run-tests script.
      df1b9f7b
  12. Mar 24, 2014
    • Patrick Wendell's avatar
      SPARK-1094 Support MiMa for reporting binary compatibility across versions. · dc126f21
      Patrick Wendell authored
      This adds some changes on top of the initial work by @scrapcodes in #20:
      
      The goal here is to do automated checking of Spark commits to determine whether they break binary compatibility.
      
      1. Special case for inner classes of package-private objects.
      2. Made tools classes accessible when running `spark-class`.
      3. Made some declared types in MLlib more general.
      4. Various other improvements to exclude-generation script.
      5. In-code documentation.
      
      Author: Patrick Wendell <pwendell@gmail.com>
      Author: Prashant Sharma <prashant.s@imaginea.com>
      Author: Prashant Sharma <scrapcodes@gmail.com>
      
      Closes #207 from pwendell/mima and squashes the following commits:
      
      22ae267 [Patrick Wendell] New binary changes after upmerge
      6c2030d [Patrick Wendell] Merge remote-tracking branch 'apache/master' into mima
      3666cf1 [Patrick Wendell] Minor style change
      0e0f570 [Patrick Wendell] Small fix and removing directory listings
      647c547 [Patrick Wendell] Review feedback.
      c39f3b5 [Patrick Wendell] Some enhancements to binary checking.
      4c771e0 [Prashant Sharma] Added a tool to generate mima excludes and also adapted build to pick automatically.
      b551519 [Prashant Sharma] adding a new exclude after rebasing with master
      651844c [Prashant Sharma] Support MiMa for reporting binary compatibility across versions.
      dc126f21
    • Prashant Sharma's avatar
      SPARK-1144 Added license and RAT to check licenses. · 21109fba
      Prashant Sharma authored
      Author: Prashant Sharma <prashant.s@imaginea.com>
      
      Closes #125 from ScrapCodes/rat-integration and squashes the following commits:
      
      64f7c7d [Prashant Sharma] added license headers.
      fcf28b1 [Prashant Sharma] Review feedback.
      c0648db [Prashant Sharma] SPARK-1144 Added license and RAT to check licenses.
      21109fba
  13. Jan 20, 2014
  14. Jan 04, 2014
  15. Jan 03, 2014
  16. Nov 19, 2013
  17. Nov 13, 2013
  18. Aug 29, 2013
  19. Jul 15, 2013
  20. May 14, 2013
  21. Apr 30, 2013
  22. Mar 11, 2013
  23. Jan 21, 2013
  24. Dec 30, 2012
  25. Dec 18, 2012
  26. Oct 01, 2012
  27. Sep 16, 2012
    • Andy Konwinski's avatar
      - Add docs/api to .gitignore · 52c29071
      Andy Konwinski authored
      - Rework/expand the nav bar with more of the docs site
      - Removing parts of docs about EC2 and Mesos that differentiate between
        running 0.5 and before
          - Merged subheadings from running-on-amazon-ec2.html that are still relevant
            (i.e., "Using a newer version of Spark" and "Accessing Data in S3") into
            ec2-scripts.html and deleted running-on-amazon-ec2.html
      - Added some TODO comments to a few docs
      - Updated the blurb about AMP Camp
      - Renamed programming-guide to spark-programming-guide
      - Fixing typos/etc. in Standalone Spark doc
      52c29071
  28. Sep 12, 2012
  29. Aug 01, 2012
  30. Jun 29, 2012
  31. Feb 08, 2011
  32. Feb 02, 2011