  1. Jun 03, 2015
    • [SPARK-7801] [BUILD] Updating versions to SPARK 1.5.0 · 2c4d550e
      Patrick Wendell authored
      Author: Patrick Wendell <patrick@databricks.com>
      
      Closes #6328 from pwendell/spark-1.5-update and squashes the following commits:
      
      2f42d02 [Patrick Wendell] A few more excludes
      4bebcf0 [Patrick Wendell] Update to RC4
      61aaf46 [Patrick Wendell] Using new release candidate
      55f1610 [Patrick Wendell] Another exclude
      04b4f04 [Patrick Wendell] More issues with transient 1.4 changes
      36f549b [Patrick Wendell] [SPARK-7801] [BUILD] Updating versions to SPARK 1.5.0
      2c4d550e
  2. Jun 02, 2015
    • [SPARK-8015] [FLUME] Remove Guava dependency from flume-sink. · 0071bd8d
      Marcelo Vanzin authored
      The minimal change would be to disable shading of Guava in the module,
      and rely on the transitive dependency from other libraries instead. But
      since Guava's use is so localized, I think it's better to just not use
      it instead, so I replaced that code and removed all traces of Guava from
      the module's build.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #6555 from vanzin/SPARK-8015 and squashes the following commits:
      
      c0ceea8 [Marcelo Vanzin] Add comments about dependency management.
      c38228d [Marcelo Vanzin] Add guava dep in test scope.
      b7a0349 [Marcelo Vanzin] Add libthrift exclusion.
      6e0942d [Marcelo Vanzin] Add comment in pom.
      2d79260 [Marcelo Vanzin] [SPARK-8015] [flume] Remove Guava dependency from flume-sink.
      0071bd8d
  3. May 31, 2015
    • [SPARK-3850] Trim trailing spaces for examples/streaming/yarn. · 564bc11e
      Reynold Xin authored
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #6530 from rxin/trim-whitespace-1 and squashes the following commits:
      
      7b7b3a0 [Reynold Xin] Reset again.
      dc14597 [Reynold Xin] Reset scalastyle.
      cd556c4 [Reynold Xin] YARN, Kinesis, Flume.
      4223fe1 [Reynold Xin] [SPARK-3850] Trim trailing spaces for examples/streaming.
      564bc11e
  4. May 30, 2015
    • [SPARK-7558] Guard against direct uses of FunSuite / FunSuiteLike · 609c4923
      Andrew Or authored
      This is a follow-up patch to #6441.
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #6510 from andrewor14/extends-funsuite-check and squashes the following commits:
      
      6618b46 [Andrew Or] Exempt SparkSinkSuite from the FunSuite check
      99d02ac [Andrew Or] Merge branch 'master' of github.com:apache/spark into extends-funsuite-check
      48874dd [Andrew Or] Guard against direct uses of FunSuite / FunSuiteLike
      609c4923
  5. May 29, 2015
    • [HOT FIX] [BUILD] Fix maven build failures · a4f24123
      Andrew Or authored
      This patch fixes a build break in maven caused by #6441.
      
      Note that this patch reverts the changes in flume-sink because
      this module does not currently depend on Spark core, but the
      tests require it. There is not an easy way to make this work
      because mvn test dependencies are not transitive (MNG-1378).
      
      For now, we will leave the one test suite in flume-sink out
      until we figure out a better solution. This patch is mainly
      intended to unbreak the maven build.
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #6511 from andrewor14/fix-build-mvn and squashes the following commits:
      
      3d53643 [Andrew Or] [HOT FIX #6441] Fix maven build failures
      a4f24123
    • [SPARK-7558] Demarcate tests in unit-tests.log · 9eb222c1
      Andrew Or authored
      Right now `unit-tests.log` is not of much value because we can't easily tell where the test boundaries are. This patch adds log statements before and after each test to mark the test boundaries, e.g.:
      
      ```
      ===== TEST OUTPUT FOR o.a.s.serializer.KryoSerializerSuite: 'kryo with parallelize for primitive arrays' =====
      
      15/05/27 12:36:39.596 pool-1-thread-1-ScalaTest-running-KryoSerializerSuite INFO SparkContext: Starting job: count at KryoSerializerSuite.scala:230
      15/05/27 12:36:39.596 dag-scheduler-event-loop INFO DAGScheduler: Got job 3 (count at KryoSerializerSuite.scala:230) with 4 output partitions (allowLocal=false)
      15/05/27 12:36:39.596 dag-scheduler-event-loop INFO DAGScheduler: Final stage: ResultStage 3(count at KryoSerializerSuite.scala:230)
      15/05/27 12:36:39.596 dag-scheduler-event-loop INFO DAGScheduler: Parents of final stage: List()
      15/05/27 12:36:39.597 dag-scheduler-event-loop INFO DAGScheduler: Missing parents: List()
      15/05/27 12:36:39.597 dag-scheduler-event-loop INFO DAGScheduler: Submitting ResultStage 3 (ParallelCollectionRDD[5] at parallelize at KryoSerializerSuite.scala:230), which has no missing parents
      
      ...
      
      15/05/27 12:36:39.624 pool-1-thread-1-ScalaTest-running-KryoSerializerSuite INFO DAGScheduler: Job 3 finished: count at KryoSerializerSuite.scala:230, took 0.028563 s
      15/05/27 12:36:39.625 pool-1-thread-1-ScalaTest-running-KryoSerializerSuite INFO KryoSerializerSuite:
      
      ***** FINISHED o.a.s.serializer.KryoSerializerSuite: 'kryo with parallelize for primitive arrays' *****
      
      ...
      ```
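
      The idea of wrapping each test body in a header and a footer can be sketched without any test framework. This is only an illustration: `TestBoundarySketch`, `shortName` and `withBoundaries` are hypothetical names, and the real patch hooks into the shared test base class rather than a standalone helper. The header and footer text follows the sample log above.

      ```scala
      object TestBoundarySketch {
        /** Abbreviate the package prefix the way the sample log does (assumed rule). */
        def shortName(suiteClassName: String): String =
          suiteClassName.replaceAll("^org\\.apache\\.spark", "o.a.s")

        /**
         * Run `body`, sending a header to `log` before it and a footer after it.
         * The footer is emitted from `finally`, so a failing test is still closed off.
         */
        def withBoundaries[T](suiteClassName: String, testName: String, log: String => Unit)(body: => T): T = {
          val name = shortName(suiteClassName)
          log(s"===== TEST OUTPUT FOR $name: '$testName' =====")
          try body
          finally log(s"***** FINISHED $name: '$testName' *****")
        }
      }
      ```

      Passing the logger in as a function keeps the sketch independent of any particular logging library.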
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #6441 from andrewor14/demarcate-tests and squashes the following commits:
      
      879b060 [Andrew Or] Fix compile after rebase
      d622af7 [Andrew Or] Merge branch 'master' of github.com:apache/spark into demarcate-tests
      017c8ba [Andrew Or] Merge branch 'master' of github.com:apache/spark into demarcate-tests
      7790b6c [Andrew Or] Fix tests after logical merge conflict
      c7460c0 [Andrew Or] Merge branch 'master' of github.com:apache/spark into demarcate-tests
      c43ffc4 [Andrew Or] Fix tests?
      8882581 [Andrew Or] Fix tests
      ee22cda [Andrew Or] Fix log message
      fa9450e [Andrew Or] Merge branch 'master' of github.com:apache/spark into demarcate-tests
      12d1e1b [Andrew Or] Various whitespace changes (minor)
      69cbb24 [Andrew Or] Make all test suites extend SparkFunSuite instead of FunSuite
      bbce12e [Andrew Or] Fix manual things that cannot be covered through automation
      da0b12f [Andrew Or] Add core tests as dependencies in all modules
      f7d29ce [Andrew Or] Introduce base abstract class for all test suites
      9eb222c1
    • [SPARK-7929] Turn whitespace checker on for more token types. · 97a60cf7
      Reynold Xin authored
      This is the last batch of changes to complete SPARK-7929.
      
      Previous related PRs:
      https://github.com/apache/spark/pull/6480
      https://github.com/apache/spark/pull/6478
      https://github.com/apache/spark/pull/6477
      https://github.com/apache/spark/pull/6476
      https://github.com/apache/spark/pull/6475
      https://github.com/apache/spark/pull/6474
      https://github.com/apache/spark/pull/6473
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #6487 from rxin/whitespace-lint and squashes the following commits:
      
      b33d43d [Reynold Xin] [SPARK-7929] Turn whitespace checker on for more token types.
      97a60cf7
  6. May 18, 2015
    • [SPARK-7621] [STREAMING] Report Kafka errors to StreamingListeners · 0a7a94ea
      jerluc authored
      PR per [SPARK-7621](https://issues.apache.org/jira/browse/SPARK-7621), which makes both `KafkaReceiver` and `ReliableKafkaReceiver` report their errors to the `ReceiverTracker`, which in turn adds the events to the bus to fire off any registered `StreamingListener`s.
      
      Author: jerluc <jeremyalucas@gmail.com>
      
      Closes #6204 from jerluc/master and squashes the following commits:
      
      82439a5 [jerluc] [SPARK-7621] [STREAMING] Report Kafka errors to StreamingListeners
      0a7a94ea
    • [SPARK-7501] [STREAMING] DAG visualization: show DStream operations · b93c97d7
      Andrew Or authored
      This is similar to #5999, but for streaming. Roughly 200 lines are tests.
      
      One thing to note here is that we already do some kind of scoping thing for call sites, so this patch adds the new RDD operation scoping logic in the same place. Also, this patch adds a `try finally` block to set the relevant variables in a safer way.
      
      tdas zsxwing
      
      ------------------------
      **Before**
      <img src="https://cloud.githubusercontent.com/assets/2133137/7625996/d88211b8-f9b4-11e4-90b9-e11baa52d6d7.png" width="450px"/>
      
      --------------------------
      **After**
      <img src="https://cloud.githubusercontent.com/assets/2133137/7625997/e0878f8c-f9b4-11e4-8df3-7dd611b13c87.png" width="650px"/>
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #6034 from andrewor14/dag-viz-streaming and squashes the following commits:
      
      932a64a [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-streaming
      e685df9 [Andrew Or] Rename createRDDWith
      84d0656 [Andrew Or] Review feedback
      697c086 [Andrew Or] Fix tests
      53b9936 [Andrew Or] Set scopes for foreachRDD properly
      1881802 [Andrew Or] Refactor DStream scope names again
      af4ba8d [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-streaming
      fd07d22 [Andrew Or] Make MQTT lower case
      f6de871 [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-streaming
      0ca1801 [Andrew Or] Remove a few unnecessary withScopes on aliases
      fa4e5fb [Andrew Or] Pass in input stream name rather than defining it from within
      1af0b0e [Andrew Or] Fix style
      074c00b [Andrew Or] Review comments
      d25a324 [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-streaming
      e4a93ac [Andrew Or] Fix tests?
      25416dc [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-streaming
      9113183 [Andrew Or] Add tests for DStream scopes
      b3806ab [Andrew Or] Fix test
      bb80bbb [Andrew Or] Fix MIMA?
      5c30360 [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-streaming
      5703939 [Andrew Or] Rename operations that create InputDStreams
      7c4513d [Andrew Or] Group RDDs by DStream operations and batches
      bf0ab6e [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-streaming
      05c2676 [Andrew Or] Wrap many more methods in withScope
      c121047 [Andrew Or] Merge branch 'master' of github.com:apache/spark into dag-viz-streaming
      65ef3e9 [Andrew Or] Fix NPE
      a0d3263 [Andrew Or] Scope streaming operations instead of RDD operations
      b93c97d7
  7. May 13, 2015
    • [SPARK-7356] [STREAMING] Fix flakey tests in FlumePollingStreamSuite using... · 61d1e87c
      Hari Shreedharan authored
      [SPARK-7356] [STREAMING] Fix flakey tests in FlumePollingStreamSuite using SparkSink's batch CountDownLatch.
      
      This is meant to make the FlumePollingStreamSuite deterministic. We now count the number of batches that have completed and then verify the results, rather than sleeping for random periods of time.
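
      The difference between sleeping and counting completions can be sketched with a plain `java.util.concurrent.CountDownLatch`. This is a minimal illustration only: `LatchSketch` and `awaitBatches` are hypothetical names, and the threads stand in for whatever completes a batch in the real sink.

      ```scala
      import java.util.concurrent.{CountDownLatch, TimeUnit}

      object LatchSketch {
        /**
         * Start `batches` workers that each count the latch down once when done,
         * then block until all of them have finished or the timeout expires.
         * Returns true only if every batch completed in time.
         */
        def awaitBatches(batches: Int, timeoutMillis: Long): Boolean = {
          val latch = new CountDownLatch(batches)
          val workers = (1 to batches).map { _ =>
            new Thread(new Runnable {
              // Stand-in for "a batch has been processed".
              override def run(): Unit = latch.countDown()
            })
          }
          workers.foreach(_.start())
          latch.await(timeoutMillis, TimeUnit.MILLISECONDS)
        }
      }
      ```

      The timeout here is only an upper bound; the call returns as soon as the last batch counts down, so the test does not depend on guessing a sleep duration.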
      
      Author: Hari Shreedharan <hshreedharan@apache.org>
      
      Closes #5918 from harishreedharan/flume-test-fix and squashes the following commits:
      
      93f24f3 [Hari Shreedharan] Add an eventually block to ensure that all received data is processed. Refactor the dstream creation and remove redundant code.
      1108804 [Hari Shreedharan] [SPARK-7356][STREAMING] Fix flakey tests in FlumePollingStreamSuite using SparkSink's batch CountDownLatch.
      61d1e87c
  8. May 05, 2015
  9. May 01, 2015
    • [SPARK-2808][Streaming][Kafka] update kafka to 0.8.2 · 47864840
      cody koeninger authored
      i don't think this should be merged until after 1.3.0 is final
      
      Author: cody koeninger <cody@koeninger.org>
      Author: Helena Edelson <helena.edelson@datastax.com>
      
      Closes #4537 from koeninger/wip-2808-kafka-0.8.2-upgrade and squashes the following commits:
      
      803aa2c [cody koeninger] [SPARK-2808][Streaming][Kafka] code cleanup per TD
      e6dfaf6 [cody koeninger] [SPARK-2808][Streaming][Kafka] pointless whitespace change to trigger jenkins again
      1770abc [cody koeninger] [SPARK-2808][Streaming][Kafka] make waitUntilLeaderOffset easier to call, call it from python tests as well
      d4267e9 [cody koeninger] [SPARK-2808][Streaming][Kafka] fix stderr redirect in python test script
      30d991d [cody koeninger] [SPARK-2808][Streaming][Kafka] remove stderr prints since it breaks python 3 syntax
      1d896e2 [cody koeninger] [SPARK-2808][Streaming][Kafka] add even even more logging to python test
      4c4557f [cody koeninger] [SPARK-2808][Streaming][Kafka] add even more logging to python test
      115aeee [cody koeninger] Merge branch 'master' into wip-2808-kafka-0.8.2-upgrade
      2712649 [cody koeninger] [SPARK-2808][Streaming][Kafka] add more logging to python test, see why its timing out in jenkins
      2b92d3f [cody koeninger] [SPARK-2808][Streaming][Kafka] wait for leader offsets in the java test as well
      3824ce3 [cody koeninger] [SPARK-2808][Streaming][Kafka] naming / comments per tdas
      61b3464 [cody koeninger] [SPARK-2808][Streaming][Kafka] delay for second send in boundary condition test
      af6f3ec [cody koeninger] [SPARK-2808][Streaming][Kafka] delay test until latest leader offset matches expected value
      9edab4c [cody koeninger] [SPARK-2808][Streaming][Kafka] more shots in the dark on jenkins failing test
      c70ee43 [cody koeninger] [SPARK-2808][Streaming][Kafka] add more asserts to test, try to figure out why it fails on jenkins but not locally
      1d10751 [cody koeninger] Merge branch 'master' into wip-2808-kafka-0.8.2-upgrade
      ed02d2c [cody koeninger] [SPARK-2808][Streaming][Kafka] move default argument for api version to overloaded method, for binary compat
      407382e [cody koeninger] [SPARK-2808][Streaming][Kafka] update kafka to 0.8.2.1
      77de6c2 [cody koeninger] Merge branch 'master' into wip-2808-kafka-0.8.2-upgrade
      6953429 [cody koeninger] [SPARK-2808][Streaming][Kafka] update kafka to 0.8.2
      2e67c66 [Helena Edelson] #SPARK-2808 Update to Kafka 0.8.2.0 GA from beta.
      d9dc2bc [Helena Edelson] Merge remote-tracking branch 'upstream/master' into wip-2808-kafka-0.8.2-upgrade
      e768164 [Helena Edelson] #2808 update kafka to version 0.8.2
      47864840
  10. Apr 29, 2015
    • [SPARK-7056] [STREAMING] Make the Write Ahead Log pluggable · 1868bd40
      Tathagata Das authored
      Users may want the WAL data to be written to non-HDFS data storage systems. To allow that, we have to make the WAL pluggable. The following design doc outlines the plan.
      
      https://docs.google.com/a/databricks.com/document/d/1A2XaOLRFzvIZSi18i_luNw5Rmm9j2j4AigktXxIYxmY/edit?usp=sharing
      
      Things to add.
      * Unit tests for WriteAheadLogUtils
      
      Author: Tathagata Das <tathagata.das1565@gmail.com>
      
      Closes #5645 from tdas/wal-pluggable and squashes the following commits:
      
      2c431fd [Tathagata Das] Minor fixes.
      c2bc7384 [Tathagata Das] More changes based on PR comments.
      569a416 [Tathagata Das] fixed long line
      bde26b1 [Tathagata Das] Renamed segment to record handle everywhere
      b65e155 [Tathagata Das] More changes based on PR comments.
      d7cd15b [Tathagata Das] Fixed test
      1a32a4b [Tathagata Das] Fixed test
      e0d19fb [Tathagata Das] Fixed defaults
      9310cbf [Tathagata Das] style fix.
      86abcb1 [Tathagata Das] Refactored WriteAheadLogUtils, and consolidated all WAL related configuration into it.
      84ce469 [Tathagata Das] Added unit test and fixed compilation error.
      bce5e75 [Tathagata Das] Fixed long lines.
      837c4f5 [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into wal-pluggable
      754fbf8 [Tathagata Das] Added license and docs.
      09bc6fe [Tathagata Das] Merge remote-tracking branch 'apache-github/master' into wal-pluggable
      7dd2d4b [Tathagata Das] Added pluggable WriteAheadLog interface, and refactored all code along with it
      1868bd40
  11. Apr 28, 2015
    • [SPARK-5946] [STREAMING] Add Python API for direct Kafka stream · 9e4e82b7
      jerryshao authored
      Currently only the `createDirectStream` API is added. I'm not sure whether `createRDD` is also needed, since some Java objects need to be wrapped in Python. Please help review, thanks a lot.
      
      Author: jerryshao <saisai.shao@intel.com>
      Author: Saisai Shao <saisai.shao@intel.com>
      
      Closes #4723 from jerryshao/direct-kafka-python-api and squashes the following commits:
      
      a1fe97c [jerryshao] Fix rebase issue
      eebf333 [jerryshao] Address the comments
      da40f4e [jerryshao] Fix Python 2.6 Syntax error issue
      5c0ee85 [jerryshao] Style fix
      4aeac18 [jerryshao] Fix bug in example code
      7146d86 [jerryshao] Add unit test
      bf3bdd6 [jerryshao] Add more APIs and address the comments
      f5b3801 [jerryshao] Small style fix
      8641835 [Saisai Shao] Rebase and update the code
      589c05b [Saisai Shao] Fix the style
      d6fcb6a [Saisai Shao] Address the comments
      dfda902 [Saisai Shao] Style fix
      0f7d168 [Saisai Shao] Add the doc and fix some style issues
      67e6880 [Saisai Shao] Fix test bug
      917b0db [Saisai Shao] Add Python createRDD API for Kakfa direct stream
      c3fc11d [jerryshao] Modify the docs
      2c00936 [Saisai Shao] address the comments
      3360f44 [jerryshao] Fix code style
      e0e0f0d [jerryshao] Code clean and bug fix
      338c41f [Saisai Shao] Add python API and example for direct kafka stream
      9e4e82b7
  12. Apr 27, 2015
    • [SPARK-7145] [CORE] commons-lang (2.x) classes used instead of commons-lang3... · ab5adb7a
      Sean Owen authored
      [SPARK-7145] [CORE] commons-lang (2.x) classes used instead of commons-lang3 (3.x); commons-io used without dependency
      
      Remove use of commons-lang in favor of commons-lang3 classes; remove commons-io use in favor of Guava
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #5703 from srowen/SPARK-7145 and squashes the following commits:
      
      21fbe03 [Sean Owen] Remove use of commons-lang in favor of commons-lang3 classes; remove commons-io use in favor of Guava
      ab5adb7a
  13. Apr 22, 2015
  14. Apr 16, 2015
  15. Apr 12, 2015
    • [SPARK-6431][Streaming][Kafka] Error message for partition metadata requ... · 6ac8eea2
      cody koeninger authored
      ...ests
      
      The original reported problem was misdiagnosed; the topic just didn't exist yet.  Agreed upon solution was to improve error handling / message
      
      Author: cody koeninger <cody@koeninger.org>
      
      Closes #5454 from koeninger/spark-6431-master and squashes the following commits:
      
      44300f8 [cody koeninger] [SPARK-6431][Streaming][Kafka] Error message for partition metadata requests
      6ac8eea2
  16. Apr 10, 2015
    • [SPARK-6211][Streaming] Add Python Kafka API unit test · 3290d2d1
      jerryshao authored
      Refactor the Kafka unit test and add Python API support. CC tdas davies please help to review, thanks a lot.
      
      Author: jerryshao <saisai.shao@intel.com>
      Author: Saisai Shao <saisai.shao@intel.com>
      
      Closes #4961 from jerryshao/SPARK-6211 and squashes the following commits:
      
      ee4b919 [jerryshao] Fixed newly merged issue
      82c756e [jerryshao] Address the comments
      92912d1 [jerryshao] Address the commits
      0708bb1 [jerryshao] Fix rebase issue
      40b47a3 [Saisai Shao] Style fix
      f889657 [Saisai Shao] Update the code according
      8a2f3e2 [jerryshao] Address the issues
      0f1b7ce [jerryshao] Still fix the bug
      61a04f0 [jerryshao] Fix bugs and address the issues
      64d9877 [jerryshao] Fix rebase bugs
      8ad442f [jerryshao] Add kafka-assembly in run-tests
      6020b00 [jerryshao] Add more debug info in Shell
      8102d6e [jerryshao] Fix bug in Jenkins test
      fde1213 [jerryshao] Code style changes
      5536f95 [jerryshao] Refactor the Kafka unit test and add Python Kafka unittest support
      3290d2d1
  17. Apr 09, 2015
  18. Apr 08, 2015
    • [SPARK-6765] Fix test code style for streaming. · 15e0d2bd
      Reynold Xin authored
      So we can turn style checker on for test code.
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #5409 from rxin/test-style-streaming and squashes the following commits:
      
      7aea69b [Reynold Xin] [SPARK-6765] Fix test code style for streaming.
      15e0d2bd
  19. Apr 06, 2015
  20. Apr 03, 2015
    • [SPARK-6428] Turn on explicit type checking for public methods. · 82701ee2
      Reynold Xin authored
      This builds on my earlier pull requests and turns on the explicit type checking in scalastyle.
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #5342 from rxin/SPARK-6428 and squashes the following commits:
      
      7b531ab [Reynold Xin] import ordering
      2d9a8a5 [Reynold Xin] jl
      e668b1c [Reynold Xin] override
      9b9e119 [Reynold Xin] Parenthesis.
      82e0cf5 [Reynold Xin] [SPARK-6428] Turn on explicit type checking for public methods.
      82701ee2
  21. Mar 24, 2015
    • [SPARK-5559] [Streaming] [Test] Remove oppotunity we met flakiness when running FlumeStreamSuite · 85cf0636
      Kousuke Saruta authored
      When we run FlumeStreamSuite on Jenkins, we sometimes get an error like the following.
      
          sbt.ForkMain$ForkError: The code passed to eventually never returned normally. Attempted 52 times over 10.094849836 seconds. Last failure message: Error connecting to localhost/127.0.0.1:23456.
              at org.scalatest.concurrent.Eventually$class.tryTryAgain$1(Eventually.scala:420)
              at org.scalatest.concurrent.Eventually$class.eventually(Eventually.scala:438)
              at org.scalatest.concurrent.Eventually$.eventually(Eventually.scala:478)
              at org.scalatest.concurrent.Eventually$class.eventually(Eventually.scala:307)
              at org.scalatest.concurrent.Eventually$.eventually(Eventually.scala:478)
              at org.apache.spark.streaming.flume.FlumeStreamSuite.writeAndVerify(FlumeStreamSuite.scala:116)
              at org.apache.spark.streaming.flume.FlumeStreamSuite.org$apache$spark$streaming$flume$FlumeStreamSuite$$testFlumeStream(FlumeStreamSuite.scala:74)
              at org.apache.spark.streaming.flume.FlumeStreamSuite$$anonfun$3.apply$mcV$sp(FlumeStreamSuite.scala:66)
              at org.apache.spark.streaming.flume.FlumeStreamSuite$$anonfun$3.apply(FlumeStreamSuite.scala:66)
              at org.apache.spark.streaming.flume.FlumeStreamSuite$$anonfun$3.apply(FlumeStreamSuite.scala:66)
              at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
              at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
              at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
              at org.scalatest.Transformer.apply(Transformer.scala:22)
              at org.scalatest.Transformer.apply(Transformer.scala:20)
              at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
              at org.scalatest.Suite$class.withFixture(Suite.scala:1122)
              at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555)
              at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163)
              at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
              at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175)
              at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306)
              at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175)
      
      This error is caused by the check-then-act logic used to find a free port.
      
            /** Find a free port */
            private def findFreePort(): Int = {
              Utils.startServiceOnPort(23456, (trialPort: Int) => {
                val socket = new ServerSocket(trialPort)
                socket.close()
                (null, trialPort)
              }, conf)._2
            }
      
      Removing the check-then-act logic is not easy, but we can reduce the chance of hitting the error by choosing a random value for the initial port instead of 23456.
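
      A rough sketch of the random-initial-port idea, using only `java.net.ServerSocket`. The names `FreePortSketch`, `randomInitialPort`, `isBindable` and the 1024 to 65535 range are assumptions made for this illustration, not the code from the patch. Note that the probe is still check-then-act: another process can take the port between `close()` and the real bind, so this only makes collisions between concurrent test runs less likely.

      ```scala
      import java.io.IOException
      import java.net.ServerSocket
      import scala.util.Random

      object FreePortSketch {
        private val MinPort = 1024
        private val MaxPort = 65535
        private val Span = MaxPort - MinPort + 1

        /** Pick a random starting port in [MinPort, MaxPort] instead of a fixed one. */
        def randomInitialPort(rng: Random = new Random()): Int =
          MinPort + rng.nextInt(Span)

        /** Probe a port by binding and immediately releasing it. */
        def isBindable(port: Int): Boolean =
          try {
            new ServerSocket(port).close()
            true
          } catch {
            case _: IOException => false
          }

        /** Try up to `maxTries` consecutive ports, wrapping around at MaxPort. */
        def findFreePort(maxTries: Int = 16, rng: Random = new Random()): Int = {
          val start = randomInitialPort(rng)
          (0 until maxTries).iterator
            .map(i => MinPort + (start - MinPort + i) % Span)
            .find(isBindable)
            .getOrElse(throw new IOException(s"No free port found after $maxTries tries"))
        }
      }
      ```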
      
      Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
      
      Closes #4337 from sarutak/SPARK-5559 and squashes the following commits:
      
      16f109f [Kousuke Saruta] Added `require` to Utils#startServiceOnPort
      c39d8b6 [Kousuke Saruta] Merge branch 'SPARK-5559' of github.com:sarutak/spark into SPARK-5559
      1610ba2 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5559
      33357e3 [Kousuke Saruta] Changed "findFreePort" method in MQTTStreamSuite and FlumeStreamSuite so that it can choose valid random port
      a9029fe [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5559
      9489ef9 [Kousuke Saruta] Merge branch 'master' of git://git.apache.org/spark into SPARK-5559
      8212e42 [Kousuke Saruta] Modified default port used in FlumeStreamSuite from 23456 to random value
      85cf0636
  22. Mar 20, 2015
    • [SPARK-6371] [build] Update version to 1.4.0-SNAPSHOT. · a7456459
      Marcelo Vanzin authored
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #5056 from vanzin/SPARK-6371 and squashes the following commits:
      
      63220df [Marcelo Vanzin] Merge branch 'master' into SPARK-6371
      6506f75 [Marcelo Vanzin] Use more fine-grained exclusion.
      178ba71 [Marcelo Vanzin] Oops.
      75b2375 [Marcelo Vanzin] Exclude VertexRDD in MiMA.
      a45a62c [Marcelo Vanzin] Work around MIMA warning.
      1d8a670 [Marcelo Vanzin] Re-group jetty exclusion.
      0e8e909 [Marcelo Vanzin] Ignore ml, don't ignore graphx.
      cef4603 [Marcelo Vanzin] Indentation.
      296cf82 [Marcelo Vanzin] [SPARK-6371] [build] Update version to 1.4.0-SNAPSHOT.
      a7456459
    • SPARK-6338 [CORE] Use standard temp dir mechanisms in tests to avoid orphaned temp files · 6f80c3e8
      Sean Owen authored
      Use `Utils.createTempDir()` to replace other temp file mechanisms used in some tests, to further ensure they are cleaned up, and simplify
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #5029 from srowen/SPARK-6338 and squashes the following commits:
      
      27b740a [Sean Owen] Fix hive-thriftserver tests that don't expect an existing dir
      4a212fa [Sean Owen] Standardize a bit more temp dir management
      9004081 [Sean Owen] Revert some added recursive-delete calls
      57609e4 [Sean Owen] Use Utils.createTempDir() to replace other temp file mechanisms used in some tests, to further ensure they are cleaned up, and simplify
      6f80c3e8
  23. Mar 11, 2015
    • SPARK-6225 [CORE] [SQL] [STREAMING] Resolve most build warnings, 1.3.0 edition · 6e94c4ea
      Sean Owen authored
      Resolve javac, scalac warnings of various types -- deprecations, Scala lang, unchecked cast, etc.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #4950 from srowen/SPARK-6225 and squashes the following commits:
      
      3080972 [Sean Owen] Ordered imports: Java, Scala, 3rd party, Spark
      c67985b [Sean Owen] Resolve javac, scalac warnings of various types -- deprecations, Scala lang, unchecked cast, etc.
      6e94c4ea
    • [SPARK-6279][Streaming]In KafkaRDD.scala, Miss expressions flag "s" at logging string · ec30c178
      zzcclp authored
      In KafkaRDD.scala, the string interpolator prefix "s" is missing from a logging string.
      As a result, the log file prints `Beginning offset ${part.fromOffset} is the same as ending offset ` instead of `Beginning offset 111 is the same as ending offset `.
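
      The effect of the missing prefix is easy to reproduce in isolation. In this small sketch `Part` is a hypothetical stand-in for the partition object, and 111 is the example offset from the description above.

      ```scala
      object InterpolatorSketch {
        // Hypothetical stand-in for the Kafka partition object used in the log call.
        final case class Part(fromOffset: Long)
        val part: Part = Part(111L)

        // Without the `s` prefix this is an ordinary literal: the placeholder is printed verbatim.
        val withoutPrefix: String = "Beginning offset ${part.fromOffset} is the same as ending offset"

        // With the `s` prefix the expression inside ${...} is evaluated and spliced in.
        val withPrefix: String = s"Beginning offset ${part.fromOffset} is the same as ending offset"
      }
      ```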
      
      Author: zzcclp <xm_zzc@sina.com>
      
      Closes #4979 from zzcclp/SPARK-6279 and squashes the following commits:
      
      768f88e [zzcclp] Miss expressions flag "s"
      ec30c178
  24. Mar 05, 2015
  25. Feb 27, 2015
  26. Feb 26, 2015
    • [SPARK-6027][SPARK-5546] Fixed --jar and --packages not working for KafkaUtils... · aa63f633
      Tathagata Das authored
      [SPARK-6027][SPARK-5546] Fixed --jar and --packages not working for KafkaUtils and improved error message
      
      The problem with SPARK-6027, in short, is that JARs like kafka-assembly.jar do not work in Python because the added JAR is not visible in the class loader used by Py4J. Py4J uses Class.forName(), which does not use the system class loader, but the JARs are only visible in the thread's context class loader. So this change uses the context class loader to create the KafkaUtils dstream object. This works both when the Kafka libraries are added with --jars spark-streaming-kafka-assembly.jar and when they are added with --packages spark-streaming-kafka
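
      Looking a class up through the thread's context class loader can be sketched as follows. `ContextLoaderSketch` and `loadClass` are hypothetical names for this illustration; the fallback to the sketch's own loader when no context loader is set is an assumption, not something taken from the patch.

      ```scala
      object ContextLoaderSketch {
        /**
         * Resolve `className` against the current thread's context class loader.
         * The one-argument Class.forName(name) would instead use the loader of the calling class.
         */
        def loadClass(className: String): Class[_] = {
          val contextLoader = Thread.currentThread().getContextClassLoader
          // Assumed fallback: use this object's own loader if the thread has no context loader.
          val loader = if (contextLoader != null) contextLoader else getClass.getClassLoader
          Class.forName(className, true, loader)
        }
      }
      ```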
      
      Also improves the error message.
      
      davies
      
      Author: Tathagata Das <tathagata.das1565@gmail.com>
      
      Closes #4779 from tdas/kafka-python-fix and squashes the following commits:
      
      fb16b04 [Tathagata Das] Removed import
      c1fdf35 [Tathagata Das] Fixed long line and improved documentation
      7b88be8 [Tathagata Das] Fixed --jar not working for KafkaUtils and improved error message
      aa63f633
  27. Feb 25, 2015
    • [SPARK-5666][streaming][MQTT streaming] some trivial fixes · d51ed263
      prabs authored
      Modified to adhere to accepted coding standards, as pointed out by tdas in PR #3844.
      
      Author: prabs <prabsmails@gmail.com>
      Author: Prabeesh K <prabsmails@gmail.com>
      
      Closes #4178 from prabeesh/master and squashes the following commits:
      
      bd2cb49 [Prabeesh K] adress the comment
      ccc0765 [prabs] adress the comment
      46f9619 [prabs] adress the comment
      c035bdc [prabs] adress the comment
      22dd7f7 [prabs] address the comments
      0cc67bd [prabs] adress the comment
      838c38e [prabs] adress the comment
      cd57029 [prabs] address the comments
      66919a3 [Prabeesh K] changed MqttDefaultFilePersistence to MemoryPersistence
      5857989 [prabs] modified to adhere to accepted coding standards
      d51ed263
  28. Feb 24, 2015
    • [SPARK-5993][Streaming][Build] Fix assembly jar location of kafka-assembly · 922b43b3
      Tathagata Das authored
      The published kafka-assembly JAR was empty in 1.3.0-RC1.
      This is because the maven build generated two JARs:
      1. an empty JAR file (since kafka-assembly has no code of its own)
      2. an assembly JAR file containing everything, in a different location from 1
      The maven publishing plugin uploaded 1 and not 2.
      If 2 is instead not configured to be generated in a different location, there is only one JAR containing everything, which gets published.
      
      Author: Tathagata Das <tathagata.das1565@gmail.com>
      
      Closes #4753 from tdas/SPARK-5993 and squashes the following commits:
      
      c390db8 [Tathagata Das] Fix assembly jar location of kafka-assembly
      922b43b3
  29. Feb 19, 2015
    • SPARK-4682 [CORE] Consolidate various 'Clock' classes · 34b7c353
      Sean Owen authored
      Another one from JoshRosen's wish list. The first commit is much smaller and removes 2 of the 4 Clock classes. The second is much larger and is necessary for consolidating the streaming one. I put together implementations in the way that seemed simplest. Almost all of the change is standardizing class and method names.
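
      The shape of a consolidated clock abstraction can be sketched as below. The names `ManualClock` and `getTimeMillis()` appear in the squashed commits of this change; everything else (`ClockSketch`, `SystemClock` as written here, `advance`, `hasTimedOut`) is assumed for illustration and is not the actual Spark code.

      ```scala
      import java.util.concurrent.atomic.AtomicLong

      object ClockSketch {
        /** Single abstraction for reading the current time in milliseconds. */
        trait Clock {
          def getTimeMillis(): Long
        }

        /** Reads the wall clock. */
        class SystemClock extends Clock {
          override def getTimeMillis(): Long = System.currentTimeMillis()
        }

        /** Only moves when a test advances it explicitly. */
        class ManualClock(start: Long = 0L) extends Clock {
          private val now = new AtomicLong(start)
          override def getTimeMillis(): Long = now.get()
          def advance(deltaMillis: Long): Unit = { now.addAndGet(deltaMillis); () }
        }

        /** Example consumer: true once `timeoutMillis` has elapsed since `startedAt`. */
        def hasTimedOut(clock: Clock, startedAt: Long, timeoutMillis: Long): Boolean =
          clock.getTimeMillis() - startedAt >= timeoutMillis
      }
      ```

      Code that depends only on `Clock` can be driven by `ManualClock` in tests, which avoids real waiting.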
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #4514 from srowen/SPARK-4682 and squashes the following commits:
      
      5ed3a03 [Sean Owen] Javadoc Clock classes; make ManualClock private[spark]
      169dd13 [Sean Owen] Add support for legacy org.apache.spark.streaming clock class names
      277785a [Sean Owen] Reduce the net change in this patch by reversing some unnecessary syntax changes along the way
      b5e53df [Sean Owen] FakeClock -> ManualClock; getTime() -> getTimeMillis()
      160863a [Sean Owen] Consolidate Streaming Clock class into common util Clock
      7c956b2 [Sean Owen] Consolidate Clocks except for Streaming Clock
      34b7c353
  30. Feb 18, 2015
    • [SPARK-5731][Streaming][Test] Fix incorrect test in DirectKafkaStreamSuite · 3912d332
      Tathagata Das authored
      The test was incorrect. Instead of counting the number of records, it counted the number of partitions of the RDDs generated by the DStream, which was not the intention. I will be testing this patch multiple times to understand its flakiness.
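
      The two counts are easy to confuse because both are a "size" of the same object. A minimal sketch with plain collections, where each inner `Seq` is a hypothetical stand-in for one partition of a batch's RDD (no Spark API is used here):

      ```scala
      object CountSketch {
        // Hypothetical stand-in for one batch: two partitions holding five records in total.
        val batch: Seq[Seq[String]] = Seq(Seq("a", "b", "c"), Seq("d", "e"))

        /** What the incorrect test effectively measured. */
        def partitionCount(rdd: Seq[Seq[String]]): Int = rdd.size

        /** What the test was meant to measure. */
        def recordCount(rdd: Seq[Seq[String]]): Int = rdd.map(_.size).sum
      }
      ```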
      
      PS: This was caused by my refactoring in https://github.com/apache/spark/pull/4384/
      
      koeninger check it out.
      
      Author: Tathagata Das <tathagata.das1565@gmail.com>
      
      Closes #4597 from tdas/kafka-flaky-test and squashes the following commits:
      
      d236235 [Tathagata Das] Unignored last test.
      e9a1820 [Tathagata Das] fix test
      3912d332
  31. Feb 16, 2015
  32. Feb 13, 2015
  33. Feb 11, 2015
    • SPARK-5728 [STREAMING] MQTTStreamSuite leaves behind ActiveMQ database files · da89720b
      Sean Owen authored
      Use temp dir for ActiveMQ database
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #4517 from srowen/SPARK-5728 and squashes the following commits:
      
      1d3aeb8 [Sean Owen] Use temp dir for ActiveMQ database
      da89720b
    • [SPARK-4964] [Streaming] refactor createRDD to take leaders via map instead of array · 658687b2
      cody koeninger authored
      Author: cody koeninger <cody@koeninger.org>
      
      Closes #4511 from koeninger/kafkaRdd-leader-to-broker and squashes the following commits:
      
      f7151d4 [cody koeninger] [SPARK-4964] test refactoring
      6f8680b [cody koeninger] [SPARK-4964] add test of the scala api for KafkaUtils.createRDD
      f81e016 [cody koeninger] [SPARK-4964] leave KafkaStreamSuite host and port as private
      5173f3f [cody koeninger] [SPARK-4964] test the Java variations of createRDD
      e9cece4 [cody koeninger] [SPARK-4964] pass leaders as a map to ensure 1 leader per TopicPartition
      658687b2