  1. Dec 04, 2015
    • [SPARK-12112][BUILD] Upgrade to SBT 0.13.9 · b7204e1d
      Josh Rosen authored
      We should upgrade to SBT 0.13.9, since it is required in order to use SBT's new Maven-style resolution features (which will be enabled in a separate patch, because that work is blocked by some binary compatibility issues in the POM reader plugin).
      
      I also upgraded Scalastyle to version 0.8.0, which was necessary in order to fix a Scala 2.10.5 compatibility issue (see https://github.com/scalastyle/scalastyle/issues/156). The newer Scalastyle is slightly stricter about whitespace surrounding tokens, so I fixed the new style violations.
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #10112 from JoshRosen/upgrade-to-sbt-0.13.9.
      b7204e1d
    • [SPARK-6990][BUILD] Add Java linting script; fix minor warnings · d0d82227
      Dmitry Erastov authored
      This replaces https://github.com/apache/spark/pull/9696
      
      Invoke Checkstyle and print any errors to the console, failing the step.
      Use Google's style rules modified according to
      https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide
      Some important checks are disabled (see TODOs in `checkstyle.xml`) due to
      multiple violations being present in the codebase.
      
      I suggest fixing those TODOs in one or more separate PRs.
      
      More on Checkstyle can be found on the [official website](http://checkstyle.sourceforge.net/).
      
      Sample output (from [build 46345](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/46345/consoleFull)); the errors are duplicated because I ran the build twice with different profiles:
      
      > Checkstyle checks failed at following occurrences:
      > [ERROR] src/main/java/org/apache/spark/sql/execution/datasources/parquet/UnsafeRowParquetRecordReader.java:[217,7] (coding) MissingSwitchDefault: switch without "default" clause.
      > [ERROR] src/main/java/org/apache/spark/sql/execution/datasources/parquet/SpecificParquetRecordReaderBase.java:[198,10] (modifier) ModifierOrder: 'protected' modifier out of order with the JLS suggestions.
      > [ERROR] src/main/java/org/apache/spark/sql/execution/datasources/parquet/UnsafeRowParquetRecordReader.java:[217,7] (coding) MissingSwitchDefault: switch without "default" clause.
      > [ERROR] src/main/java/org/apache/spark/sql/execution/datasources/parquet/SpecificParquetRecordReaderBase.java:[198,10] (modifier) ModifierOrder: 'protected' modifier out of order with the JLS suggestions.
      > [error] running /home/jenkins/workspace/SparkPullRequestBuilder2/dev/lint-java ; received return code 1
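The sample output above follows a regular shape, so it can be post-processed mechanically. The following is a minimal sketch, not part of the `dev/lint-java` script itself: it assumes each violation line looks like `[ERROR] <path>:[<line>,<col>] (<category>) <Check>: <message>`, as in the sample, and it collapses the duplicates produced by running the build twice. The names `VIOLATION` and `parse_violations` are hypothetical.

```python
import re

# Assumed line shape, taken from the sample output above:
# [ERROR] <path>:[<line>,<col>] (<category>) <CheckName>: <message>
VIOLATION = re.compile(
    r"\[ERROR\]\s+(?P<path>\S+):\[(?P<line>\d+),(?P<col>\d+)\]\s+"
    r"\((?P<category>\w+)\)\s+(?P<check>\w+):\s+(?P<message>.*)"
)


def parse_violations(output):
    """Return the unique violations found in `output`, in first-seen order."""
    seen = set()
    unique = []
    for raw in output.splitlines():
        match = VIOLATION.search(raw)
        if match is None:
            continue  # not a violation line (e.g. the trailing "[error] running ..." line)
        key = (match["path"], int(match["line"]), int(match["col"]), match["check"])
        if key in seen:
            continue  # same violation reported again by the second build profile
        seen.add(key)
        record = match.groupdict()
        record["line"] = int(record["line"])
        record["col"] = int(record["col"])
        unique.append(record)
    return unique
```

Applied to the five sample lines, this keeps one entry per distinct file, position, and check, and ignores the final lowercase `[error]` line because it does not match the assumed shape.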
      
      Also fix some of the minor violations that didn't require sweeping changes.
      
      Apologies for the previous botched PRs - I finally figured out the issue.
      
      cr: JoshRosen, pwendell
      
      > I state that the contribution is my original work, and I license the work to the project under the project's open source license.
      
      Author: Dmitry Erastov <derastov@gmail.com>
      
      Closes #9867 from dskrvk/master.
      d0d82227
  2. Nov 23, 2015
    • [SPARK-4424] Remove spark.driver.allowMultipleContexts override in tests · 1b6e938b
      Josh Rosen authored
      This patch removes `spark.driver.allowMultipleContexts=true` from our test configuration. The multiple SparkContexts check was originally disabled because certain test suites in SQL needed to create multiple contexts. As far as I know, this configuration change is no longer necessary, so we should remove it in order to make it easier to find test cleanup bugs.
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #9865 from JoshRosen/SPARK-4424.
      1b6e938b
  3. Nov 18, 2015
  4. Nov 17, 2015
  5. Nov 16, 2015
  6. Nov 11, 2015
  7. Nov 10, 2015
    • [SPARK-9818] Re-enable Docker tests for JDBC data source · 1dde39d7
      Josh Rosen authored
      This patch re-enables tests for the Docker JDBC data source. These tests were reverted in #4872 due to transitive dependency conflicts introduced by the `docker-client` library. This patch should avoid those problems by using a version of `docker-client` which shades its transitive dependencies and by performing some build-magic to work around problems with that shaded JAR.
      
      In addition, I significantly refactored the tests to simplify the setup and teardown code and to fix several Docker networking issues which caused problems when running in `boot2docker`.
      
      Closes #8101.
      
      Author: Josh Rosen <joshrosen@databricks.com>
      Author: Yijie Shen <henry.yijieshen@gmail.com>
      
      Closes #9503 from JoshRosen/docker-jdbc-tests.
      1dde39d7
  8. Nov 09, 2015
    • [SPARK-11198][STREAMING][KINESIS] Support de-aggregation of records during recovery · 26062d22
      Burak Yavuz authored
      While the KCL handles de-aggregation during regular operation, during recovery we use the lower-level API and therefore need to de-aggregate the records ourselves.
      
      tdas: Testing is an issue; we need protobuf magic to produce the aggregated records. Maybe we could depend on KPL for tests?
      
      Author: Burak Yavuz <brkyvz@gmail.com>
      
      Closes #9403 from brkyvz/kinesis-deaggregation.
      26062d22
  9. Nov 04, 2015
    • [SPARK-11491] Update build to use Scala 2.10.5 · ce5e6a28
      Josh Rosen authored
      Spark should build against Scala 2.10.5, since that includes a fix for Scaladoc that will fix doc snapshot publishing: https://issues.scala-lang.org/browse/SI-8479
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #9450 from JoshRosen/upgrade-to-scala-2.10.5.
      ce5e6a28
    • [SPARK-10949] Update Snappy version to 1.1.2 · 701fb505
      Adam Roberts authored
      This is an updated version of #8995 by a-roberts. Original description follows:
      
      Snappy now supports concatenation of serialized streams. This patch changes the version number and turns the "does not support" test into a "supports" test.
      
      Snappy 1.1.2 changelog mentions:
      
      > snappy-java-1.1.2 (22 September 2015)
      > This is a backward compatible release for 1.1.x.
      > Add AIX (32-bit) support.
      > There is no upgrade for the native libraries of the other platforms.
      
      > A major change since 1.1.1 is a support for reading concatenated results of SnappyOutputStream(s)
      > snappy-java-1.1.2-RC2 (18 May 2015)
      > Fix #107: SnappyOutputStream.close() is not idempotent
      > snappy-java-1.1.2-RC1 (13 May 2015)
      > SnappyInputStream now supports reading concatenated compressed results of SnappyOutputStream
      > There has been no compressed format change since 1.0.5.x. So you can read the compressed results interchangeably between these versions.
      > Fixes a problem when java.io.tmpdir does not exist.
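To illustrate what "reading concatenated results" of several output streams means, here is a minimal sketch by analogy. It deliberately uses Python's standard-library `gzip` module rather than snappy-java, so it shows the general property (independently compressed streams appended back to back can be read as one) and says nothing about Snappy's own API or format. The helper names are hypothetical.

```python
import gzip


def concat_compressed(chunks):
    """Compress each chunk as its own independent stream, then append the results."""
    return b"".join(gzip.compress(chunk) for chunk in chunks)


def read_all(blob):
    """Decompress a blob made of one or more back-to-back compressed streams."""
    # gzip.decompress accepts multi-member data and returns the joined payload.
    return gzip.decompress(blob)
```

For example, `read_all(concat_compressed([b"part-1,", b"part-2"]))` yields the two payloads joined in order. This is the behaviour that the updated "supports" test checks for Snappy streams.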
      
      Closes #8995.
      
      Author: Adam Roberts <aroberts@uk.ibm.com>
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #9439 from JoshRosen/update-snappy.
      701fb505
  10. Nov 02, 2015
  11. Oct 25, 2015
    • [SPARK-11127][STREAMING] upgrade AWS SDK and Kinesis Client Library (KCL) · 87f82a5f
      Xiangrui Meng authored
      AWS SDK 1.9.40 is the latest 1.9.x release. KCL 1.5.1 is the latest release that uses AWS SDK 1.9.x. The main goal is to let the Kinesis consumer read messages generated by the Kinesis Producer Library (KPL). The API should be compatible with old versions.
      
      tdas brkyvz
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #9153 from mengxr/SPARK-11127.
      87f82a5f
  12. Oct 07, 2015
  13. Oct 04, 2015
  14. Sep 15, 2015
  15. Sep 13, 2015
  16. Aug 28, 2015
    • [SPARK-9284] [TESTS] Allow all tests to run without an assembly. · c53c902f
      Marcelo Vanzin authored
      This change aims at speeding up the dev cycle a little bit, by making
      sure that all tests behave the same w.r.t. where the code to be tested
      is loaded from. Namely, that means that tests don't rely on the assembly
      anymore, rather loading all needed classes from the build directories.
      
      The main change is to make sure all build directories (classes and test-classes)
      are added to the classpath of child processes when running tests.
      
      YarnClusterSuite required some custom code since the executors are run
      differently (i.e. not through the launcher library, like standalone and
      Mesos do).
      
      I also found a couple of tests that could leak a SparkContext on failure,
      and added code to handle those.
      
      With this patch, it's possible to run the following command from a clean
      source directory and have all tests pass:
      
        mvn -Pyarn -Phadoop-2.4 -Phive-thriftserver install
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #7629 from vanzin/SPARK-9284.
      c53c902f
  17. Aug 25, 2015
  18. Aug 21, 2015
    • [SPARK-9439] [YARN] External shuffle service robust to NM restarts using leveldb · 708036c1
      Imran Rashid authored
      https://issues.apache.org/jira/browse/SPARK-9439
      
      In general, Yarn apps should be robust to NodeManager restarts.  However, if you run Spark with the external shuffle service on, all shuffles fail after a NM restart, because the shuffle service has lost some state with info on each executor.  (Note that the shuffle data is perfectly fine on disk across a NM restart; the problem is that we've lost the small bit of state that lets us *find* those files.)
      
      The solution proposed here is that the external shuffle service can write out its state to leveldb (backed by a local file) every time an executor is added.  When running with yarn, that file is in the NM's local dir.  Whenever the service is started, it looks for that file, and if it exists, it reads the file and re-registers all executors there.
      
      Nothing is changed in non-yarn modes with this patch.  The service is not given a place to save the state to, so it operates the same as before.  This should make it easy to update other cluster managers as well, by just supplying the right file & the equivalent of yarn's `initializeApplication` -- I'm not familiar enough with those modes to know how to do that.
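The persist-then-recover idea described above can be sketched as follows. This is an illustration under stated assumptions, not the patch's implementation: it swaps leveldb for an append-only JSON-lines file so that it runs with the standard library alone, and the class name `ExecutorRegistry`, its fields, and the record layout are all hypothetical.

```python
import json
import os


class ExecutorRegistry:
    """Tracks registered executors, optionally persisting them to a local file."""

    def __init__(self, state_path=None):
        # state_path=None mimics the non-YARN case: nothing is saved or recovered.
        self.state_path = state_path
        self.executors = {}
        if state_path and os.path.exists(state_path):
            self._reload()

    def _reload(self):
        """Re-register every executor found in the state file after a restart."""
        with open(self.state_path, encoding="utf-8") as f:
            for line in f:
                try:
                    record = json.loads(line)
                    key = (record["app_id"], record["exec_id"])
                    self.executors[key] = record["local_dirs"]
                except (ValueError, KeyError, TypeError):
                    continue  # skip a corrupt record instead of failing startup

    def register(self, app_id, exec_id, local_dirs):
        """Save the executor to disk first, then register it in memory."""
        record = {"app_id": app_id, "exec_id": exec_id, "local_dirs": list(local_dirs)}
        if self.state_path:
            with open(self.state_path, "a", encoding="utf-8") as f:
                f.write(json.dumps(record) + "\n")
                f.flush()
                os.fsync(f.fileno())
        self.executors[(app_id, exec_id)] = record["local_dirs"]
```

Constructing a second `ExecutorRegistry` with the same `state_path` stands in for a service restart: it starts with the executors the first instance registered. Writing one record per executor and tolerating unreadable records loosely mirror two of the squashed commits listed below, but the details here are simplified.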
      
      Author: Imran Rashid <irashid@cloudera.com>
      
      Closes #7943 from squito/leveldb_external_shuffle_service_NM_restart and squashes the following commits:
      
      0d285d3 [Imran Rashid] review feedback
      70951d6 [Imran Rashid] Merge branch 'master' into leveldb_external_shuffle_service_NM_restart
      5c71c8c [Imran Rashid] save executor to db before registering; style
      2499c8c [Imran Rashid] explicit dependency on jackson-annotations
      795d28f [Imran Rashid] review feedback
      81f80e2 [Imran Rashid] Merge branch 'master' into leveldb_external_shuffle_service_NM_restart
      594d520 [Imran Rashid] use json to serialize application executor info
      1a7980b [Imran Rashid] version
      8267d2a [Imran Rashid] style
      e9f99e8 [Imran Rashid] cleanup the handling of bad dbs a little
      9378ba3 [Imran Rashid] fail gracefully on corrupt leveldb files
      acedb62 [Imran Rashid] switch to writing out one record per executor
      79922b7 [Imran Rashid] rely on yarn to call stopApplication; assorted cleanup
      12b6a35 [Imran Rashid] save registered executors when apps are removed; add tests
      c878fbe [Imran Rashid] better explanation of shuffle service port handling
      694934c [Imran Rashid] only open leveldb connection once per service
      d596410 [Imran Rashid] store executor data in leveldb
      59800b7 [Imran Rashid] Files.move in case renaming is unsupported
      32fe5ae [Imran Rashid] Merge branch 'master' into external_shuffle_service_NM_restart
      d7450f0 [Imran Rashid] style
      f729e2b [Imran Rashid] debugging
      4492835 [Imran Rashid] lol, dont use a PrintWriter b/c of scalastyle checks
      0a39b98 [Imran Rashid] Merge branch 'master' into external_shuffle_service_NM_restart
      55f49fc [Imran Rashid] make sure the service doesnt die if the registered executor file is corrupt; add tests
      245db19 [Imran Rashid] style
      62586a6 [Imran Rashid] just serialize the whole executors map
      bdbbf0d [Imran Rashid] comments, remove some unnecessary changes
      857331a [Imran Rashid] better tests & comments
      bb9d1e6 [Imran Rashid] formatting
      bdc4b32 [Imran Rashid] rename
      86e0cb9 [Imran Rashid] for tests, shuffle service finds an open port
      23994ff [Imran Rashid] style
      7504de8 [Imran Rashid] style
      a36729c [Imran Rashid] cleanup
      efb6195 [Imran Rashid] proper unit test, and no longer leak if apps stop during NM restart
      dd93dc0 [Imran Rashid] test for shuffle service w/ NM restarts
      d596969 [Imran Rashid] cleanup imports
      0e9d69b [Imran Rashid] better names
      9eae119 [Imran Rashid] cleanup lots of duplication
      1136f44 [Imran Rashid] test needs to have an actual shuffle
      0b588bd [Imran Rashid] more fixes ...
      ad122ef [Imran Rashid] more fixes
      5e5a7c3 [Imran Rashid] fix build
      c69f46b [Imran Rashid] maybe working version, needs tests & cleanup ...
      bb3ba49 [Imran Rashid] minor cleanup
      36127d3 [Imran Rashid] wip
      b9d2ced [Imran Rashid] incomplete setup for external shuffle service tests
      708036c1
  19. Aug 18, 2015
  20. Aug 17, 2015
    • [SPARK-9974] [BUILD] [SQL] Makes sure com.twitter:parquet-hadoop-bundle:1.6.0... · 52ae9525
      Cheng Lian authored
      [SPARK-9974] [BUILD] [SQL] Makes sure com.twitter:parquet-hadoop-bundle:1.6.0 is in SBT assembly jar
      
      PR #7967 enables Spark SQL to persist Parquet tables in Hive compatible format when possible. One consequence is that we have to set input/output classes to `MapredParquetInputFormat`/`MapredParquetOutputFormat`, which rely on com.twitter:parquet-hadoop:1.6.0 bundled with Hive 1.2.1.
      
      When loading such a table in Spark SQL, `o.a.h.h.ql.metadata.Table` first loads these input/output format classes, and thus classes in com.twitter:parquet-hadoop:1.6.0.  However, the scope of this dependency is defined as "runtime", so it is not packaged into the Spark assembly jar.  This results in a `ClassNotFoundException`.
      
      This issue can be worked around by asking users to add parquet-hadoop 1.6.0 via the `--driver-class-path` option.  However, considering that the Maven build is immune to this problem, I feel the workaround can be confusing and inconvenient for users.
      
      So this PR fixes this issue by changing scope of parquet-hadoop 1.6.0 to "compile".
      
      Author: Cheng Lian <lian@databricks.com>
      
      Closes #8198 from liancheng/spark-9974/bundle-parquet-1.6.0.
      52ae9525
  21. Aug 11, 2015
    • [SPARK-9649] Fix flaky test MasterSuite again - disable REST · ca8f70e9
      Andrew Or authored
      The REST server is not actually used in most tests and so we can disable it. It is a source of flakiness because it tries to bind to a specific port in vain. There was also some code that avoided the shuffle service in tests. This is actually not necessary because the shuffle service is already off by default.
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #8084 from andrewor14/fix-master-suite-again.
      ca8f70e9
  22. Aug 10, 2015
    • [SPARK-5155] [PYSPARK] [STREAMING] Mqtt streaming support in Python · 853809e9
      Prabeesh K authored
      This PR is based on #4229, thanks prabeesh.
      
      Closes #4229
      
      Author: Prabeesh K <prabsmails@gmail.com>
      Author: zsxwing <zsxwing@gmail.com>
      Author: prabs <prabsmails@gmail.com>
      Author: Prabeesh K <prabeesh.k@namshi.com>
      
      Closes #7833 from zsxwing/pr4229 and squashes the following commits:
      
      9570bec [zsxwing] Fix the variable name and check null in finally
      4a9c79e [zsxwing] Fix pom.xml indentation
      abf5f18 [zsxwing] Merge branch 'master' into pr4229
      935615c [zsxwing] Fix the flaky MQTT tests
      47278c5 [zsxwing] Include the project class files
      478f844 [zsxwing] Add unpack
      5f8a1d4 [zsxwing] Make the maven build generate the test jar for Python MQTT tests
      734db99 [zsxwing] Merge branch 'master' into pr4229
      126608a [Prabeesh K] address the comments
      b90b709 [Prabeesh K] Merge pull request #1 from zsxwing/pr4229
      d07f454 [zsxwing] Register StreamingListerner before starting StreamingContext; Revert unncessary changes; fix the python unit test
      a6747cb [Prabeesh K] wait for starting the receiver before publishing data
      87fc677 [Prabeesh K] address the comments:
      97244ec [zsxwing] Make sbt build the assembly test jar for streaming mqtt
      80474d1 [Prabeesh K] fix
      1f0cfe9 [Prabeesh K] python style fix
      e1ee016 [Prabeesh K] scala style fix
      a5a8f9f [Prabeesh K] added Python test
      9767d82 [Prabeesh K] implemented Python-friendly class
      a11968b [Prabeesh K] fixed python style
      795ec27 [Prabeesh K] address comments
      ee387ae [Prabeesh K] Fix assembly jar location of mqtt-assembly
      3f4df12 [Prabeesh K] updated version
      b34c3c1 [prabs] adress comments
      3aa7fff [prabs] Added Python streaming mqtt word count example
      b7d42ff [prabs] Mqtt streaming support in Python
      853809e9
  23. Aug 04, 2015
    • [SPARK-8064] [BUILD] Follow-up. Undo change from SPARK-9507 that was accidentally reverted · b211cbc7
      tedyu authored
      This PR removes the dependency reduced POM hack brought back by #7191
      
      Author: tedyu <yuzhihong@gmail.com>
      
      Closes #7919 from tedyu/master and squashes the following commits:
      
      1bfbd7b [tedyu] [BUILD] Remove dependency reduced POM hack
      b211cbc7
    • [SPARK-9534] [BUILD] Enable javac lint for scalac parity; fix a lot of build... · 76d74090
      Sean Owen authored
      [SPARK-9534] [BUILD] Enable javac lint for scalac parity; fix a lot of build warnings, 1.5.0 edition
      
      Enable most javac lint warnings; fix a lot of build warnings. In a few cases, touch up surrounding code in the process.
      
      I'll explain several of the changes inline in comments.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #7862 from srowen/SPARK-9534 and squashes the following commits:
      
      ea51618 [Sean Owen] Enable most javac lint warnings; fix a lot of build warnings. In a few cases, touch up surrounding code in the process.
      76d74090
  24. Aug 03, 2015
    • [SPARK-8064] [SQL] Build against Hive 1.2.1 · a2409d1c
      Steve Loughran authored
      Cherry-picked the parts of the initial SPARK-8064 WiP branch needed to get sql/hive to compile against Hive 1.2.1. That's the ASF release packaged under org.apache.hive, not any fork.
      
      Tests not run yet: that's what the machines are for.
      
      Author: Steve Loughran <stevel@hortonworks.com>
      Author: Cheng Lian <lian@databricks.com>
      Author: Michael Armbrust <michael@databricks.com>
      Author: Patrick Wendell <patrick@databricks.com>
      
      Closes #7191 from steveloughran/stevel/feature/SPARK-8064-hive-1.2-002 and squashes the following commits:
      
      7556d85 [Cheng Lian] Updates .q files and corresponding golden files
      ef4af62 [Steve Loughran] Merge commit '6a92bb09f46a04d6cd8c41bdba3ecb727ebb9030' into stevel/feature/SPARK-8064-hive-1.2-002
      6a92bb0 [Cheng Lian] Overrides HiveConf time vars
      dcbb391 [Cheng Lian] Adds com.twitter:parquet-hadoop-bundle:1.6.0 for Hive Parquet SerDe
      0bbe475 [Steve Loughran] SPARK-8064 scalastyle rejects the standard Hadoop ASF license header...
      fdf759b [Steve Loughran] SPARK-8064 classpath dependency suite to be in sync with shading in final (?) hive-exec spark
      7a6c727 [Steve Loughran] SPARK-8064 switch to second staging repo of the spark-hive artifacts. This one has the protobuf-shaded hive-exec jar
      376c003 [Steve Loughran] SPARK-8064 purge duplicate protobuf declaration
      2c74697 [Steve Loughran] SPARK-8064 switch to the protobuf shaded hive-exec jar with tests to chase it down
      cc44020 [Steve Loughran] SPARK-8064 remove hadoop.version from runtest.py, as profile will fix that automatically.
      6901fa9 [Steve Loughran] SPARK-8064 explicit protobuf import
      da310dc [Michael Armbrust] Fixes for Hive tests.
      a775a75 [Steve Loughran] SPARK-8064 cherry-pick-incomplete
      7404f34 [Patrick Wendell] Add spark-hive staging repo
      832c164 [Steve Loughran] SPARK-8064 try to supress compiler warnings on Complex.java pasted-thrift-code
      312c0d4 [Steve Loughran] SPARK-8064  maven/ivy dependency purge; calcite declaration needed
      fa5ae7b [Steve Loughran] HIVE-8064 fix up hive-thriftserver dependencies and cut back on evicted references in the hive- packages; this keeps mvn and ivy resolution compatible, as the reconciliation policy is "by hand"
      c188048 [Steve Loughran] SPARK-8064 manage the Hive depencencies to that -things that aren't needed are excluded -sql/hive built with ivy is in sync with the maven reconciliation policy, rather than latest-first
      4c8be8d [Cheng Lian] WIP: Partial fix for Thrift server and CLI tests
      314eb3c [Steve Loughran] SPARK-8064 deprecation warning  noise in one of the tests
      17b0341 [Steve Loughran] SPARK-8064 IDE-hinted cleanups of Complex.java to reduce compiler warnings. It's all autogenerated code, so still ugly.
      d029b92 [Steve Loughran] SPARK-8064 rely on unescaping to have already taken place, so go straight to map of serde options
      23eca7e [Steve Loughran] HIVE-8064 handle raw and escaped property tokens
      54d9b06 [Steve Loughran] SPARK-8064 fix compilation regression surfacing from rebase
      0b12d5f [Steve Loughran] HIVE-8064 use subset of hive complex type whose types deserialize
      fce73b6 [Steve Loughran] SPARK-8064 poms rely implicitly on the version of kryo chill provides
      fd3aa5d [Steve Loughran] SPARK-8064 version of hive to d/l from ivy is 1.2.1
      dc73ece [Steve Loughran] SPARK-8064 revert to master's determinstic pushdown strategy
      d3c1e4a [Steve Loughran] SPARK-8064 purge UnionType
      051cc21 [Steve Loughran] SPARK-8064 switch to an unshaded version of hive-exec-core, which must have been built with Kryo 2.21. This currently looks for a (locally built) version 1.2.1.spark
      6684c60 [Steve Loughran] SPARK-8064 ignore RTE raised in blocking process.exitValue() call
      e6121e5 [Steve Loughran] SPARK-8064 address review comments
      aa43dc6 [Steve Loughran] SPARK-8064  more robust teardown on JavaMetastoreDatasourcesSuite
      f2bff01 [Steve Loughran] SPARK-8064 better takeup of asynchronously caught error text
      8b1ef38 [Steve Loughran] SPARK-8064: on failures executing spark-submit in HiveSparkSubmitSuite, print command line and all logged output.
      5a9ce6b [Steve Loughran] SPARK-8064 add explicit reason for kv split failure, rather than array OOB. *does not address the issue*
      642b63a [Steve Loughran] SPARK-8064 reinstate something cut briefly during rebasing
      97194dc [Steve Loughran] SPARK-8064 add extra logging to the YarnClusterSuite classpath test. There should be no reason why this is failing on jenkins, but as it is (and presumably its CP-related), improve the logging including any exception raised.
      335357f [Steve Loughran] SPARK-8064 fail fast on thrive process spawning tests on exit codes and/or error string patterns seen in log.
      3ed872f [Steve Loughran] SPARK-8064 rename field double to  dbl
      bca55e5 [Steve Loughran] SPARK-8064 missed one of the `date` escapes
      41d6479 [Steve Loughran] SPARK-8064 wrap tests with withTable() calls to avoid table-exists exceptions
      2bc29a4 [Steve Loughran] SPARK-8064 ParquetSuites to escape `date` field name
      1ab9bc4 [Steve Loughran] SPARK-8064 TestHive to use sered2.thrift.test.Complex
      bf3a249 [Steve Loughran] SPARK-8064: more resubmit than fix; tighten startup timeout to 60s. Still no obvious reason why jersey server code in spark-assembly isn't being picked up -it hasn't been shaded
      c829b8f [Steve Loughran] SPARK-8064: reinstate yarn-rm-server dependencies to hive-exec to ensure that jersey server is on classpath on hadoop versions < 2.6
      0b0f738 [Steve Loughran] SPARK-8064: thrift server startup to fail fast on any exception in the main thread
      13abaf1 [Steve Loughran] SPARK-8064 Hive compatibilty tests sin sync with explain/show output from Hive 1.2.1
      d14d5ea [Steve Loughran] SPARK-8064: DATE is now a predicate; you can't use it as a field in select ops
      26eef1c [Steve Loughran] SPARK-8064: HIVE-9039 renamed TOK_UNION => TOK_UNIONALL while adding TOK_UNIONDISTINCT
      3d64523 [Steve Loughran] SPARK-8064 improve diagns on uknown token; fix scalastyle failure
      d0360f6 [Steve Loughran] SPARK-8064: delicate merge in of the branch vanzin/hive-1.1
      1126e5a [Steve Loughran] SPARK-8064: name of unrecognized file format wasn't appearing in error text
      8cb09c4 [Steve Loughran] SPARK-8064: test resilience/assertion improvements. Independent of the rest of the work; can be backported to earlier versions
      dec12cb [Steve Loughran] SPARK-8064: when a CLI suite test fails include the full output text in the raised exception; this ensures that the stdout/stderr is included in jenkins reports, so it becomes possible to diagnose the cause.
      463a670 [Steve Loughran] SPARK-8064 run-tests.py adds a hadoop-2.6 profile, and changes info messages to say "w/Hive 1.2.1" in console output
      2531099 [Steve Loughran] SPARK-8064 successful attempt to get rid of pentaho as a transitive dependency of hive-exec
      1d59100 [Steve Loughran] SPARK-8064 (unsuccessful) attempt to get rid of pentaho as a transitive dependency of hive-exec
      75733fc [Steve Loughran] SPARK-8064 change thrift binary startup message to "Starting ThriftBinaryCLIService on port"
      3ebc279 [Steve Loughran] SPARK-8064 move strings used to check for http/bin thrift services up into constants
      c80979d [Steve Loughran] SPARK-8064: SparkSQLCLIDriver drops remote mode support. CLISuite Tests pass instead of timing out: undetected regression?
      27e8370 [Steve Loughran] SPARK-8064 fix some style & IDE warnings
      00e50d6 [Steve Loughran] SPARK-8064 stop excluding hive shims from dependency (commented out , for now)
      cb4f142 [Steve Loughran] SPARK-8054 cut pentaho dependency from calcite
      f7aa9cb [Steve Loughran] SPARK-8064 everything compiles with some commenting and moving of classes into a hive package
      6c310b4 [Steve Loughran] SPARK-8064 subclass  Hive ServerOptionsProcessor to make it public again
      f61a675 [Steve Loughran] SPARK-8064 thrift server switched to Hive 1.2.1, though it doesn't compile everywhere
      4890b9d [Steve Loughran] SPARK-8064, build against Hive 1.2.1
      a2409d1c
  25. Aug 02, 2015
    • [SPARK-9521] [BUILD] Require Maven 3.3.3+ in the build · 9d1c0252
      Sean Owen authored
      Enforce Maven 3.3.3+ in the build. (Also update the scala compiler plugin while we're at it.)
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #7852 from srowen/SPARK-9521 and squashes the following commits:
      
      3093039 [Sean Owen] Enforce Maven 3.3.3+ in the build. (Also update the scala compiler plugin while we're at it.)
      9d1c0252
  26. Jul 31, 2015
    • [SPARK-9507] [BUILD] Remove dependency reduced POM hack now that shade plugin is updated · 6e5fd613
      Sean Owen authored
      Update to shade plugin 2.4.1, which removes the need for the dependency-reduced-POM workaround and the 'release' profile. Fix management of shade plugin version so children inherit it; bump assembly plugin version while here
      
      See https://issues.apache.org/jira/browse/SPARK-8819
      
      I verified that `mvn clean package -DskipTests` works with Maven 3.3.3.
      
      pwendell are you up for trying this for the 1.5.0 release?
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #7826 from srowen/SPARK-9507 and squashes the following commits:
      
      e0b0fd2 [Sean Owen] Update to shade plugin 2.4.1, which removes the need for the dependency-reduced-POM workaround and the 'release' profile. Fix management of shade plugin version so children inherit it; bump assembly plugin version while here
      6e5fd613
    • [SPARK-8564] [STREAMING] Add the Python API for Kinesis · 3afc1de8
      zsxwing authored
      This PR adds the Python API for Kinesis, including a Python example and a simple unit test.
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #6955 from zsxwing/kinesis-python and squashes the following commits:
      
      e42e471 [zsxwing] Merge branch 'master' into kinesis-python
      455f7ea [zsxwing] Remove streaming_kinesis_asl_assembly module and simply add the source folder to streaming_kinesis_asl module
      32e6451 [zsxwing] Merge remote-tracking branch 'origin/master' into kinesis-python
      5082d28 [zsxwing] Fix the syntax error for Python 2.6
      fca416b [zsxwing] Fix wrong comparison
      96670ff [zsxwing] Fix the compilation error after merging master
      756a128 [zsxwing] Merge branch 'master' into kinesis-python
      6c37395 [zsxwing] Print stack trace for debug
      7c5cfb0 [zsxwing] RUN_KINESIS_TESTS -> ENABLE_KINESIS_TESTS
      cc9d071 [zsxwing] Fix the python test errors
      466b425 [zsxwing] Add python tests for Kinesis
      e33d505 [zsxwing] Merge remote-tracking branch 'origin/master' into kinesis-python
      3da2601 [zsxwing] Fix the kinesis folder
      687446b [zsxwing] Fix the error message and the maven output path
      add2beb [zsxwing] Merge branch 'master' into kinesis-python
      4957c0b [zsxwing] Add the Python API for Kinesis
      3afc1de8
  27. Jul 23, 2015
  28. Jul 21, 2015
    • [SPARK-8401] [BUILD] Scala version switching build enhancements · f5b6dc5e
      Michael Allman authored
      These commits address a few minor issues in the Scala cross-version support in the build:
      
        1. Correct two missing `${scala.binary.version}` pom file substitutions.
        2. Don't update `scala.binary.version` in parent POM. This property is set through profiles.
        3. Update the source of the generated scaladocs in `docs/_plugins/copy_api_dirs.rb`.
        4. Factor common code out of `dev/change-version-to-*.sh` and add some validation. We also test `sed` to see if it's GNU sed and try `gsed` as an alternative if not. This prevents the script from running with a non-GNU sed.
      
      This is my original work and I license this work to the Spark project under the Apache License.
      
      Author: Michael Allman <michael@videoamp.com>
      
      Closes #6832 from mallman/scala-versions and squashes the following commits:
      
      cde2f17 [Michael Allman] Delete dev/change-version-to-*.sh, replacing them with single dev/change-scala-version.sh script that takes a version as argument
      02296f2 [Michael Allman] Make the scala version change scripts cross-platform by restricting ourselves to POSIX sed syntax instead of looking for GNU sed
      ad9b40a [Michael Allman] Factor change-scala-version.sh out of change-version-to-*.sh, adding command line argument validation and testing for GNU sed
      bdd20bf [Michael Allman] Update source of scaladocs when changing Scala version
      475088e [Michael Allman] Replace jackson-module-scala_2.10 with jackson-module-scala_${scala.binary.version}
  29. Jul 19, 2015
  30. Jul 16, 2015
    • [SPARK-9015] [BUILD] Clean project import in scala ide · b536d5dc
      Jan Prach authored
      Clean up the Maven build for a clean import in scala-ide / Eclipse.
      
      * remove the groovy plugin, which is not needed at all
      * drop add-source from build-helper-maven-plugin, since recent versions of scala-maven-plugin do it automatically
      * add the lifecycle-mapping plugin to hide a few useless warnings from the IDE
      
      Author: Jan Prach <jendap@gmail.com>
      
      Closes #7375 from jendap/clean-project-import-in-scala-ide and squashes the following commits:
      
      c4b4c0f [Jan Prach] fix whitespaces
      5a83e07 [Jan Prach] Revert "remove java compiler warnings from java tests"
      312007e [Jan Prach] scala-maven-plugin itself add scala sources by default
      f47d856 [Jan Prach] remove spark-1.4-staging repository
      c8a54db [Jan Prach] remove java compiler warnings from java tests
      999a068 [Jan Prach] remove some maven warnings in scala ide
      80fbdc5 [Jan Prach] remove groovy and gmavenplus plugin
  31. Jul 15, 2015
    • [SPARK-6602][Core] Replace Akka Serialization with Spark Serializer · b9a922e2
      zsxwing authored
      Replace Akka Serialization with Spark Serializer and add unit tests.
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #7159 from zsxwing/remove-akka-serialization and squashes the following commits:
      
      fc0fca3 [zsxwing] Merge branch 'master' into remove-akka-serialization
      cf81a58 [zsxwing] Fix the code style
      73251c6 [zsxwing] Add test scope
      9ef4af9 [zsxwing] Add AkkaRpcEndpointRef.hashCode
      433115c [zsxwing] Remove final
      be3edb0 [zsxwing] Support deserializing RpcEndpointRef
      ecec410 [zsxwing] Replace Akka Serialization with Spark Serializer
  32. Jul 13, 2015
    • [SPARK-8533] [STREAMING] Upgrade Flume to 1.6.0 · 0aed38e4
      Hari Shreedharan authored
      Author: Hari Shreedharan <hshreedharan@apache.org>
      
      Closes #6939 from harishreedharan/upgrade-flume-1.6.0 and squashes the following commits:
      
      94b80ae [Hari Shreedharan] [SPARK-8533][Streaming] Upgrade Flume to 1.6.0
  33. Jul 10, 2015
    • [SPARK-7944] [SPARK-8013] Remove most of the Spark REPL fork for Scala 2.11 · 11e22b74
      Iulian Dragos authored
      This PR removes most of the code in the Spark REPL for Scala 2.11 and leaves just a couple of overridden methods in `SparkILoop` in order to:
      
      - change welcome message
      - restrict available commands (like `:power`)
      - initialize Spark context
      
      The two codebases have diverged and it's extremely hard to backport fixes from the upstream REPL. This somewhat radical step is absolutely necessary in order to fix other REPL tickets (like SPARK-8013 - Hive Thrift server for 2.11). BTW, the Scala REPL has fixed the serialization-unfriendly wrappers thanks to ScrapCodes's work in [#4522](https://github.com/scala/scala/pull/4522)
      
      All tests pass. I also tried `spark-shell` on our Mesos cluster with some simple jobs (including with additional jars), and everything looked good.
      
      As soon as Scala 2.11.7 is out we need to upgrade and get a shaded `jline` dependency, clearing the way for SPARK-8013.
      
      /cc pwendell
      
      Author: Iulian Dragos <jaguarul@gmail.com>
      
      Closes #6903 from dragos/issue/no-spark-repl-fork and squashes the following commits:
      
      c596c6f [Iulian Dragos] Merge branch 'master' into issue/no-spark-repl-fork
      2b1a305 [Iulian Dragos] Removed spaces around multiple imports.
      0ce67a6 [Iulian Dragos] Remove -verbose flag for java compiler (added by mistake in an earlier commit).
      10edaf9 [Iulian Dragos] Keep the jline dependency only in the 2.10 build.
      529293b [Iulian Dragos] Add back Spark REPL files to rat-excludes, since they are part of the 2.10 REPL.
      d85370d [Iulian Dragos] Remove jline dependency from the Spark REPL.
      b541930 [Iulian Dragos] Merge branch 'master' into issue/no-spark-repl-fork
      2b15962 [Iulian Dragos] Change jline dependency and bump Scala version.
      b300183 [Iulian Dragos] Rename package and add license on top of the file, remove files from rat-excludes and removed `-Yrepl-sync` per reviewer’s request.
      9d46d85 [Iulian Dragos] Fix SPARK-7944.
      abcc7cb [Iulian Dragos] Remove the REPL forked code.