  1. Nov 11, 2014
• SPARK-2269 Refactor mesos scheduler resourceOffers and add unit test · a878660d
      Timothy Chen authored
      Author: Timothy Chen <tnachen@gmail.com>
      
      Closes #1487 from tnachen/resource_offer_refactor and squashes the following commits:
      
      4ea5dec [Timothy Chen] Rebase from master and address comments
      9ccab09 [Timothy Chen] Address review comments
      e6494dc [Timothy Chen] Refactor class loading
      8207428 [Timothy Chen] Refactor mesos scheduler resourceOffers and add unit test
• [SPARK-4282][YARN] Stopping flag in YarnClientSchedulerBackend should be volatile · 7f371884
      Kousuke Saruta authored
In YarnClientSchedulerBackend, the variable `stopping` is used as a flag and is accessed by multiple threads, so it should be volatile.
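
As a minimal illustration of the pattern (a simplified sketch, not the actual backend code):

```scala
object StoppingFlagSketch {
  // Written by the thread calling stop() and read by the monitor thread.
  // @volatile guarantees the monitor thread sees the update promptly.
  @volatile private var stopping = false

  def monitorLoop(): Unit = {
    while (!stopping) {
      Thread.sleep(100) // poll the application state
    }
  }

  def stop(): Unit = {
    stopping = true
  }
}
```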
      
      Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
      
      Closes #3143 from sarutak/stopping-flag-volatile and squashes the following commits:
      
      58fdcc9 [Kousuke Saruta] Marked stoppig flag as volatile
• SPARK-4305 [BUILD] yarn-alpha profile won't build due to network/yarn module · f820b563
      Sean Owen authored
      SPARK-3797 introduced the `network/yarn` module, but its YARN code depends on YARN APIs not present in older versions covered by the `yarn-alpha` profile. As a result builds like `mvn -Pyarn-alpha -Phadoop-0.23 -Dhadoop.version=0.23.7 -DskipTests clean package` fail.
      
      The solution is just to not build `network/yarn` with profile `yarn-alpha`.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #3167 from srowen/SPARK-4305 and squashes the following commits:
      
      88938cb [Sean Owen] Don't build network/yarn in yarn-alpha profile as it won't compile
• SPARK-1830 Deploy failover, Make Persistence engine and LeaderAgent Pluggable · deefd9d7
      Prashant Sharma authored
      Author: Prashant Sharma <prashant.s@imaginea.com>
      
      Closes #771 from ScrapCodes/deploy-failover-pluggable and squashes the following commits:
      
      29ba440 [Prashant Sharma] fixed a compilation error
      fef35ec [Prashant Sharma] Code review
      57ee6f0 [Prashant Sharma] SPARK-1830 Deploy failover, Make Persistence engine and LeaderAgent Pluggable.
• [Streaming][Minor]Replace some 'if-else' in Clock · 6e03de30
      huangzhaowei authored
Replace some 'if-else' statements with math.min and math.max in Clock.scala.
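
For illustration, the kind of rewrite involved (hypothetical values, not the exact Clock.scala diff):

```scala
object ClockSketch {
  def main(args: Array[String]): Unit = {
    val nowTime    = System.currentTimeMillis()
    val targetTime = nowTime + 1000L

    // Before: an explicit if-else to pick the later of the two times.
    val waitUntil = if (targetTime > nowTime) targetTime else nowTime

    // After: the same result expressed with math.max.
    val waitUntilMax = math.max(targetTime, nowTime)

    assert(waitUntil == waitUntilMax)
  }
}
```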
      
      Author: huangzhaowei <carlmartinmax@gmail.com>
      
      Closes #3088 from SaintBacchus/StreamingClock and squashes the following commits:
      
      7b7f8e7 [huangzhaowei] [Streaming][Minor]Replace some 'if-else' in Clock
• [SPARK-2492][Streaming] kafkaReceiver minor changes to align with Kafka 0.8 · c8850a3d
      jerryshao authored
      Update the KafkaReceiver's behavior when auto.offset.reset is set.
      
In Kafka 0.8, `auto.offset.reset` is a hint that tells the consumer to seek to the beginning or end of the partition only when the requested offset is out of range. The previous code treated `auto.offset.reset` as an instruction to seek to the beginning or end immediately, which differs from Kafka 0.8's defined behavior.

Also, deleting existing ZK metadata in the receiver when multiple consumers are launched will introduce the issue mentioned in [SPARK-2383](https://issues.apache.org/jira/browse/SPARK-2383).

So here we change to offering the user an API to explicitly reset offsets before creating the Kafka stream, while keeping the same behavior as Kafka 0.8 for the `auto.offset.reset` parameter.
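
For illustration, a sketch of a Kafka 0.8 consumer configuration; under the 0.8 semantics described above, `auto.offset.reset` only takes effect when the requested offset is out of range (the connection settings here are hypothetical):

```scala
import java.util.Properties
import kafka.consumer.ConsumerConfig

object KafkaConfigSketch {
  def build(): ConsumerConfig = {
    val props = new Properties()
    props.put("zookeeper.connect", "localhost:2181")   // hypothetical
    props.put("group.id", "spark-streaming-consumer")  // hypothetical
    // Kafka 0.8: a fallback for out-of-range offsets ("smallest" or "largest"),
    // not an instruction to seek to the beginning or end on every startup.
    props.put("auto.offset.reset", "smallest")
    new ConsumerConfig(props)
  }
}
```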
      
      @tdas, would you please review this PR? Thanks a lot.
      
      Author: jerryshao <saisai.shao@intel.com>
      
      Closes #1420 from jerryshao/kafka-fix and squashes the following commits:
      
      d6ae94d [jerryshao] Address the comment to remove the resetOffset() function
      de3a4c8 [jerryshao] Fix compile error
      4a1c3f9 [jerryshao] Doc changes
      b2c1430 [jerryshao] Move offset reset to a helper function to let user explicitly delete ZK metadata by calling this API
      fac8fd6 [jerryshao] Changes to align with Kafka 0.8
• [SPARK-4295][External]Fix exception in SparkSinkSuite · f8811a56
      maji2014 authored
Handle the exception in SparkSinkSuite; please refer to [SPARK-4295].
      
      Author: maji2014 <maji3@asiainfo.com>
      
      Closes #3177 from maji2014/spark-4295 and squashes the following commits:
      
      312620a [maji2014] change a new statement for spark-4295
      24c3d21 [maji2014] add log4j.properties for SparkSinkSuite and spark-4295
      c807bf6 [maji2014] Fix exception in SparkSinkSuite
• [SPARK-4307] Initialize FileDescriptor lazily in FileRegion. · ef29a9a9
      Reynold Xin authored
Netty's DefaultFileRegion requires a FileDescriptor in its constructor, which means we need an open file handle. In super large workloads, this could lead to too many open files due to the way these file descriptors are cleaned. This pull request creates a new LazyFileRegion that initializes the FileDescriptor only when we are sending data for the first time.
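
A minimal sketch of the lazy-initialization idea (not the actual LazyFileRegion implementation; error handling omitted):

```scala
import java.io.{File, FileInputStream}
import java.nio.channels.{FileChannel, WritableByteChannel}

// Defer opening the file — and thus allocating a file descriptor —
// until the first time data is actually transferred.
class LazyFileHandle(file: File) {
  private var channel: FileChannel = _

  def transferTo(target: WritableByteChannel, position: Long, count: Long): Long = {
    if (channel == null) {
      channel = new FileInputStream(file).getChannel // FD created on first use
    }
    channel.transferTo(position, count, target)
  }
}
```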
      
      Author: Reynold Xin <rxin@databricks.com>
      Author: Reynold Xin <rxin@apache.org>
      
      Closes #3172 from rxin/lazyFD and squashes the following commits:
      
      0bdcdc6 [Reynold Xin] Added reference to Netty's DefaultFileRegion
      d4564ae [Reynold Xin] Added SparkConf to the ctor argument of IndexShuffleBlockManager.
      6ed369e [Reynold Xin] Code review feedback.
      04cddc8 [Reynold Xin] [SPARK-4307] Initialize FileDescriptor lazily in FileRegion.
• [SPARK-4324] [PySpark] [MLlib] support numpy.array for all MLlib API · 65083e93
      Davies Liu authored
This PR checks all of the existing Python MLlib APIs to make sure that numpy.array is supported as a Vector (and RDDs of numpy.array).

It also improves some docstrings and doctests.
      
      cc mateiz mengxr
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #3189 from davies/numpy and squashes the following commits:
      
      d5057c4 [Davies Liu] fix tests
      6987611 [Davies Liu] support numpy.array for all MLlib API
• [SPARK-4330][Doc] Link to proper URL for YARN overview · 3c07b8f0
      Kousuke Saruta authored
In running-on-yarn.md, there is a link to the YARN overview, but the URL points to the YARN alpha docs. It should point to the stable docs.
      
      Author: Kousuke Saruta <sarutak@oss.nttdata.co.jp>
      
      Closes #3196 from sarutak/SPARK-4330 and squashes the following commits:
      
      30baa21 [Kousuke Saruta] Fixed running-on-yarn.md to point proper URL for YARN
  2. Nov 10, 2014
• [SPARK-3649] Remove GraphX custom serializers · 300887bd
      Ankur Dave authored
      As [reported][1] on the mailing list, GraphX throws
      
      ```
      java.lang.ClassCastException: java.lang.Long cannot be cast to scala.Tuple2
              at org.apache.spark.graphx.impl.RoutingTableMessageSerializer$$anon$1$$anon$2.writeObject(Serializers.scala:39)
              at org.apache.spark.storage.DiskBlockObjectWriter.write(BlockObjectWriter.scala:195)
              at org.apache.spark.util.collection.ExternalSorter.spillToMergeableFile(ExternalSorter.scala:329)
      ```
      
      when sort-based shuffle attempts to spill to disk. This is because GraphX defines custom serializers for shuffling pair RDDs that assume Spark will always serialize the entire pair object rather than breaking it up into its components. However, the spill code path in sort-based shuffle [violates this assumption][2].
      
      GraphX uses the custom serializers to compress vertex ID keys using variable-length integer encoding. However, since the serializer can no longer rely on the key and value being serialized and deserialized together, performing such encoding would either require writing a tag byte (costly) or maintaining state in the serializer and assuming that serialization calls will alternate between key and value (fragile).
      
      Instead, this PR simply removes the custom serializers. This causes a **10% slowdown** (494 s to 543 s) and **16% increase in per-iteration communication** (2176 MB to 2518 MB) for PageRank (averages across 3 trials, 10 iterations per trial, uk-2007-05 graph, 16 r3.2xlarge nodes).
      
      [1]: http://apache-spark-user-list.1001560.n3.nabble.com/java-lang-ClassCastException-java-lang-Long-cannot-be-cast-to-scala-Tuple2-td13926.html#a14501
      [2]: https://github.com/apache/spark/blob/f9d6220c792b779be385f3022d146911a22c2130/core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala#L329
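
To make the broken assumption concrete, here is an illustrative snippet (hypothetical values, plain Java serialization rather than GraphX's serializers) contrasting the call pattern the custom serializers expected with the one the spill path produces:

```scala
import java.io.{ByteArrayOutputStream, ObjectOutputStream}

object PairAssumptionSketch {
  def main(args: Array[String]): Unit = {
    val out = new ObjectOutputStream(new ByteArrayOutputStream())

    // What the GraphX serializers assumed: one call per (key, value) pair.
    out.writeObject((42L, "message"))

    // What the sort-based shuffle spill path actually does: key and value
    // are written in separate calls, so a serializer whose writeObject
    // casts its argument to Tuple2 receives a bare Long and throws the
    // ClassCastException shown above.
    out.writeObject(42L)
    out.writeObject("message")
    out.close()
  }
}
```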
      
      Author: Ankur Dave <ankurdave@gmail.com>
      
      Closes #2503 from ankurdave/SPARK-3649 and squashes the following commits:
      
      a49c2ad [Ankur Dave] [SPARK-3649] Remove GraphX custom serializers
• [SPARK-4274] [SQL] Fix NPE in printing the details of the query plan · c764d0ac
      Cheng Hao authored
      Author: Cheng Hao <hao.cheng@intel.com>
      
      Closes #3139 from chenghao-intel/comparison_test and squashes the following commits:
      
      f5d7146 [Cheng Hao] avoid exception in printing the codegen enabled
• [SPARK-3954][Streaming] Optimization to FileInputDStream · ce6ed2ab
      surq authored
When converting files to RDDs, the Spark source loops over the file sequence three times:
1. files.map(...)
2. files.zip(fileRDDs)
3. files-size.foreach
This is very time-consuming when there are lots of files, so this change collapses the three loops over the file sequence into a single one (see the sketch below).
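
A sketch of the consolidated loop (hypothetical helper names; the real code builds one RDD per file and unions them):

```scala
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

object FilesToRDDSketch {
  // Before: files.map(...), then files.zip(fileRDDs), then another foreach —
  // three passes over the file list. After: one pass does all per-file work.
  def filesToRDD(files: Seq[String], sc: SparkContext): RDD[String] = {
    val fileRDDs = files.map { file =>
      val rdd = sc.textFile(file)
      println(s"Created RDD for file $file") // logging folded into the same loop
      rdd
    }
    sc.union(fileRDDs)
  }
}
```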
      
      Author: surq <surq@asiainfo.com>
      
      Closes #2811 from surq/SPARK-3954 and squashes the following commits:
      
      321bbe8 [surq]  updated the code style.The style from [for...yield]to [files.map(file=>{})]
      88a2c20 [surq] Merge branch 'master' of https://github.com/apache/spark into SPARK-3954
      178066f [surq] modify code's style. [Exceeds 100 columns]
      626ef97 [surq] remove redundant import(ArrayBuffer)
      739341f [surq] promote the speed of convert files to RDDS
• [SPARK-4149][SQL] ISO 8601 support for json date time strings · a1fc059b
      Daoyuan Wang authored
This implements the feature davies mentioned in https://github.com/apache/spark/pull/2901#discussion-diff-19313312
      
      Author: Daoyuan Wang <daoyuan.wang@intel.com>
      
      Closes #3012 from adrian-wang/iso8601 and squashes the following commits:
      
      50df6e7 [Daoyuan Wang] json data timestamp ISO8601 support
• [SPARK-4250] [SQL] Fix bug of constant null value mapping to ConstantObjectInspector · fa777833
      Cheng Hao authored
      Author: Cheng Hao <hao.cheng@intel.com>
      
      Closes #3114 from chenghao-intel/constant_null_oi and squashes the following commits:
      
      e603bda [Cheng Hao] fix the bug of null value for primitive types
      50a13ba [Cheng Hao] fix the timezone issue
      f54f369 [Cheng Hao] fix bug of constant null value for ObjectInspector
• [SQL] remove a decimal case branch that has no effect at runtime · d793d80c
      Xiangrui Meng authored
It generates warnings at compile time. marmbrus
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #3192 from mengxr/dtc-decimal and squashes the following commits:
      
      955e9fb [Xiangrui Meng] remove a decimal case branch that has no effect
• [SPARK-4308][SQL] Sets SQL operation state to ERROR when exception is thrown · acb55aed
      Cheng Lian authored
      In `HiveThriftServer2`, when an exception is thrown during a SQL execution, the SQL operation state should be set to `ERROR`, but now it remains `RUNNING`. This affects the result of the `GetOperationStatus` Thrift API.
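
Schematically, the fix amounts to recording the failure in the operation state before propagating it (a simplified sketch, not the actual HiveThriftServer2 code):

```scala
object OperationStateSketch {
  object OperationState extends Enumeration {
    val RUNNING, FINISHED, ERROR = Value
  }

  var state = OperationState.RUNNING

  def runStatement(execute: () => Unit): Unit =
    try {
      execute()
      state = OperationState.FINISHED
    } catch {
      case e: Exception =>
        // Previously the state stayed RUNNING here, so GetOperationStatus
        // reported a failed query as still running.
        state = OperationState.ERROR
        throw e
    }
}
```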
      
      Author: Cheng Lian <lian@databricks.com>
      
      Closes #3175 from liancheng/fix-op-state and squashes the following commits:
      
      6d4c1fe [Cheng Lian] Sets SQL operation state to ERROR when exception is thrown
• [SPARK-4000][Build] Uploads HiveCompatibilitySuite logs · 534b2314
      Cheng Lian authored
This is a follow-up to #2845. In addition to the unit-tests.log files, this also uploads the failure output files generated by `HiveCompatibilitySuite` to the Jenkins master. These files can be very helpful for debugging Hive compatibility test failures.
      
      /cc pwendell marmbrus
      
      Author: Cheng Lian <lian@databricks.com>
      
      Closes #2993 from liancheng/upload-hive-compat-logs and squashes the following commits:
      
      8e6247f [Cheng Lian] Uploads HiveCompatibilitySuite logs
• [SPARK-4319][SQL] Enable an ignored test "null count". · dbf10588
      Takuya UESHIN authored
      Author: Takuya UESHIN <ueshin@happy-camper.st>
      
      Closes #3185 from ueshin/issues/SPARK-4319 and squashes the following commits:
      
      a44a38e [Takuya UESHIN] Enable an ignored test "null count".
• Revert "[SPARK-2703][Core]Make Tachyon related unit tests execute without deploying a Tachyon system locally." · 6e7a309b
      Patrick Wendell authored
      Revert "[SPARK-2703][Core]Make Tachyon related unit tests execute without deploying a Tachyon system locally."
      
      This reverts commit bd86cb17.
• [SPARK-4047] - Generate runtime warnings for example implementation of PageRank · 974d334c
      Varadharajan Mukundan authored
      Based on SPARK-2434, this PR generates runtime warnings for example implementations (Python, Scala) of PageRank.
      
      Author: Varadharajan Mukundan <srinathsmn@gmail.com>
      
      Closes #2894 from varadharajan/SPARK-4047 and squashes the following commits:
      
      5f9406b [Varadharajan Mukundan] [SPARK-4047] - Point users to LogisticRegressionWithSGD and LogisticRegressionWithLBFGS instead of LogisticRegressionModel
      252f595 [Varadharajan Mukundan] a. Generate runtime warnings for
      05a018b [Varadharajan Mukundan] Fix PageRank implementation's package reference
      5c2bf54 [Varadharajan Mukundan] [SPARK-4047] - Generate runtime warnings for example implementation of PageRank
• SPARK-1297 Upgrade HBase dependency to 0.98 · b32734e1
      tedyu authored
      pwendell rxin
      Please take a look
      
      Author: tedyu <yuzhihong@gmail.com>
      
      Closes #3115 from tedyu/master and squashes the following commits:
      
      2b079c8 [tedyu] SPARK-1297 Upgrade HBase dependency to 0.98
• SPARK-4230. Doc for spark.default.parallelism is incorrect · c6f4e704
      Sandy Ryza authored
      Author: Sandy Ryza <sandy@cloudera.com>
      
      Closes #3107 from sryza/sandy-spark-4230 and squashes the following commits:
      
      37a1d19 [Sandy Ryza] Clear up a couple things
      34d53de [Sandy Ryza] SPARK-4230. Doc for spark.default.parallelism is incorrect
• [SPARK-4312] bash doesn't have "die" · c5db8e2c
      Jey Kottalam authored
sbt-launch-lib.bash invokes a `die` command, but no such command exists on Linux, Mac OS X, or Windows.
      
      Closes #2898
      
      Author: Jey Kottalam <jey@kottalam.net>
      
      Closes #3182 from sarutak/SPARK-4312 and squashes the following commits:
      
      24c6677 [Jey Kottalam] bash doesn't have "die"
• Update RecoverableNetworkWordCount.scala · 0340c56a
      comcmipi authored
While trying this example, I missed the moment when the checkpoint was initiated.
      
      Author: comcmipi <pitonak@fns.uniba.sk>
      
      Closes #2735 from comcmipi/patch-1 and squashes the following commits:
      
      b6d8001 [comcmipi] Update RecoverableNetworkWordCount.scala
      96fe274 [comcmipi] Update RecoverableNetworkWordCount.scala
• SPARK-2548 [STREAMING] JavaRecoverableWordCount is missing · 3a02d416
      Sean Owen authored
Here's my attempt to re-port `RecoverableNetworkWordCount` to Java, following the example of its Scala and Java siblings. I believe I fixed a few minor doc/formatting issues along the way.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #2564 from srowen/SPARK-2548 and squashes the following commits:
      
      0d0bf29 [Sean Owen] Update checkpoint call as in https://github.com/apache/spark/pull/2735
      35f23e3 [Sean Owen] Remove old comment about running in standalone mode
      179b3c2 [Sean Owen] Re-port RecoverableNetworkWordCount to Java example, and touch up doc / formatting in related examples
• [SPARK-4169] [Core] Accommodate non-English Locales in unit tests · ed8bf1ea
      Niklas Wilcke authored
For me the core tests failed because there are two locale-dependent parts in the code. See the JIRA ticket for details; a generic illustration follows below.

Why is it necessary to check the exception message in isBindCollision in
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/util/Utils.scala#L1686 ?
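
A generic illustration of the failure mode (not the actual failing test): number formatting is locale-dependent unless the locale is pinned.

```scala
import java.util.Locale

object LocaleSketch {
  def main(args: Array[String]): Unit = {
    val x = 1234.5
    // "1234.5" under Locale.US, but "1234,5" under Locale.GERMANY —
    // a test asserting on the string must pin the locale explicitly.
    println(String.format(Locale.US, "%.1f", Double.box(x)))
    println(String.format(Locale.GERMANY, "%.1f", Double.box(x)))
  }
}
```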
      
      Author: Niklas Wilcke <1wilcke@informatik.uni-hamburg.de>
      
      Closes #3036 from numbnut/core-test-fix and squashes the following commits:
      
      1fb0d04 [Niklas Wilcke] Fixing locale dependend code and tests
• [SQL] support udt to hive types conversion (hive->udt is not supported) · 894a7245
      Xiangrui Meng authored
      marmbrus
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #3164 from mengxr/hive-udt and squashes the following commits:
      
      57c7519 [Xiangrui Meng] support udt->hive types (hive->udt is not supported)
• [SPARK-2703][Core]Make Tachyon related unit tests execute without deploying a Tachyon system locally. · bd86cb17
      RongGu authored
      
      Author: RongGu <gurongwalker@gmail.com>
      
      Closes #3030 from RongGu/SPARK-2703 and squashes the following commits:
      
      ad08827 [RongGu] Make Tachyon related unit tests execute without deploying a Tachyon system locally
• MAINTENANCE: Automated closing of pull requests. · 227488d8
      Patrick Wendell authored
      This commit exists to close the following pull requests on Github:
      
      Closes #2898 (close requested by 'pwendell')
      Closes #2212 (close requested by 'pwendell')
      Closes #2102 (close requested by 'pwendell')
• SPARK-3179. Add task OutputMetrics. · 3c2cff4b
      Sandy Ryza authored
      Author: Sandy Ryza <sandy@cloudera.com>
      
      This patch had conflicts when merged, resolved by
      Committer: Kay Ousterhout <kayousterhout@gmail.com>
      
      Closes #2968 from sryza/sandy-spark-3179 and squashes the following commits:
      
      dce4784 [Sandy Ryza] More review feedback
      8d350d1 [Sandy Ryza] Fix test against Hadoop 2.5+
      e7c74d0 [Sandy Ryza] More review feedback
      6cff9c4 [Sandy Ryza] Review feedback
      fb2dde0 [Sandy Ryza] SPARK-3179
• SPARK-1209 [CORE] (Take 2) SparkHadoop{MapRed,MapReduce}Util should not use package org.apache.hadoop · f8e57323
      Sean Owen authored
      
      andrewor14 Another try at SPARK-1209, to address https://github.com/apache/spark/pull/2814#issuecomment-61197619
      
I successfully tested with `mvn -Dhadoop.version=1.0.4 -DskipTests clean package; mvn -Dhadoop.version=1.0.4 test`. I assume that is what failed Jenkins last time. I also tried `-Dhadoop.version=1.2.1` and `-Phadoop-2.4 -Pyarn -Phive` for more coverage.
      
      So this is why the class was put in `org.apache.hadoop` to begin with, I assume. One option is to leave this as-is for now and move it only when Hadoop 1.0.x support goes away.
      
      This is the other option, which adds a call to force the constructor to be public at run-time. It's probably less surprising than putting Spark code in `org.apache.hadoop`, but, does involve reflection. A `SecurityManager` might forbid this, but it would forbid a lot of stuff Spark does. This would also only affect Hadoop 1.0.x it seems.
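
The run-time trick described here looks roughly like this (a generic reflection sketch under the assumption of a non-public Hadoop 1.0.x constructor; a SecurityManager may veto setAccessible):

```scala
object ReflectionSketch {
  // Look up a constructor that is not public and force it to be callable.
  def newInstanceViaReflection[T](clazz: Class[T], args: AnyRef*): T = {
    val ctor = clazz.getDeclaredConstructor(args.map(_.getClass): _*)
    ctor.setAccessible(true) // may throw SecurityException under a SecurityManager
    ctor.newInstance(args: _*)
  }
}
```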
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #3048 from srowen/SPARK-1209 and squashes the following commits:
      
      0d48f4b [Sean Owen] For Hadoop 1.0.x, make certain constructors public, which were public in later versions
      466e179 [Sean Owen] Disable MIMA warnings resulting from moving the class -- this was also part of the PairRDDFunctions type hierarchy though?
      eb61820 [Sean Owen] Move SparkHadoopMapRedUtil / SparkHadoopMapReduceUtil from org.apache.hadoop to org.apache.spark
  3. Nov 09, 2014
• MAINTENANCE: Automated closing of pull requests. · f73b56f5
      Patrick Wendell authored
      This commit exists to close the following pull requests on Github:
      
      Closes #464 (close requested by 'JoshRosen')
      Closes #283 (close requested by 'pwendell')
      Closes #449 (close requested by 'pwendell')
      Closes #907 (close requested by 'pwendell')
      Closes #2478 (close requested by 'JoshRosen')
      Closes #2192 (close requested by 'tdas')
      Closes #918 (close requested by 'pwendell')
      Closes #1465 (close requested by 'pwendell')
      Closes #3135 (close requested by 'JoshRosen')
      Closes #1693 (close requested by 'tdas')
      Closes #1279 (close requested by 'pwendell')
• SPARK-1344 [DOCS] Scala API docs for top methods · d1362659
      Sean Owen authored
      Use "k" in javadoc of top and takeOrdered to avoid confusion with type K in pair RDDs. I think this resolves the discussion in SPARK-1344.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #3168 from srowen/SPARK-1344 and squashes the following commits:
      
      6963fcc [Sean Owen] Use "k" in javadoc of top and takeOrdered to avoid confusion with type K in pair RDDs
• SPARK-971 [DOCS] Link to Confluence wiki from project website / documentation · 8c99a47a
      Sean Owen authored
      This is a trivial change to add links to the wiki from `README.md` and the main docs page. It is already linked to from spark.apache.org.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #3169 from srowen/SPARK-971 and squashes the following commits:
      
      dcb84d0 [Sean Owen] Add link to wiki from README, docs home page
  4. Nov 08, 2014
• [SPARK-4301] StreamingContext should not allow start() to be called after calling stop() · 7b41b17f
      Josh Rosen authored
In Spark 1.0.0+, calling `stop()` on a StreamingContext that has not been started is a no-op with no side-effects. This allows users to call `stop()` on a fresh StreamingContext followed by `start()`. I believe this almost always indicates an error and is not behavior that we should support. Since we don't allow `start() stop() start()`, I don't think it makes sense to allow `stop() start()` either.
      
      The current behavior can lead to resource leaks when StreamingContext constructs its own SparkContext: if I call `stop(stopSparkContext=True)`, then I expect StreamingContext's underlying SparkContext to be stopped irrespective of whether the StreamingContext has been started. This is useful when writing unit test fixtures.
      
      Prior discussions:
      - https://github.com/apache/spark/pull/3053#discussion-diff-19710333R490
      - https://github.com/apache/spark/pull/3121#issuecomment-61927353
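
A simplified sketch of the lifecycle rule being proposed (a hypothetical state machine, not the actual patch):

```scala
object LifecycleSketch {
  object State extends Enumeration {
    val Initialized, Started, Stopped = Value
  }
  import State._

  private var state = Initialized

  def start(): Unit = state match {
    case Initialized => state = Started // ... start the scheduler ...
    case Started     => throw new IllegalStateException("already started")
    case Stopped     => throw new IllegalStateException("start() after stop() is not allowed")
  }

  def stop(): Unit = {
    // Idempotent, and stops the underlying SparkContext even if start()
    // was never called — useful for unit test fixtures.
    state = Stopped
  }
}
```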
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #3160 from JoshRosen/SPARK-4301 and squashes the following commits:
      
      dbcc929 [Josh Rosen] Address more review comments
      bdbe5da [Josh Rosen] Stop SparkContext after stopping scheduler, not before.
      03e9c40 [Josh Rosen] Always stop SparkContext, even if stop(false) has already been called.
      832a7f4 [Josh Rosen] Address review comment
      5142517 [Josh Rosen] Add tests; improve Scaladoc.
      813e471 [Josh Rosen] Revert workaround added in https://github.com/apache/spark/pull/3053/files#diff-e144dbee130ed84f9465853ddce65f8eR49
      5558e70 [Josh Rosen] StreamingContext.stop() should stop SparkContext even if StreamingContext has not been started yet.
• [Minor] [Core] Don't NPE on closeQuietly(null) · 4af5c7e2
      Aaron Davidson authored
      Author: Aaron Davidson <aaron@databricks.com>
      
      Closes #3166 from aarondav/closeQuietlyer and squashes the following commits:
      
      78096b5 [Aaron Davidson] Don't NPE on closeQuietly(null)
• [SPARK-4291][Build] Rename network module projects · 7afc8564
      Andrew Or authored
      The names of the recently introduced network modules are inconsistent with those of the other modules in the project. We should just drop the "Code" suffix since it doesn't sacrifice any meaning, especially before they get into an official release.
      
      ```
      [INFO] Reactor Build Order:
      [INFO]
      [INFO] Spark Project Parent POM
      [INFO] Spark Project Common Network Code
      [INFO] Spark Project Shuffle Streaming Service Code
      [INFO] Spark Project Core
      [INFO] Spark Project Bagel
      [INFO] Spark Project GraphX
      [INFO] Spark Project Streaming
      [INFO] Spark Project Catalyst
      [INFO] Spark Project SQL
      [INFO] Spark Project ML Library
      [INFO] Spark Project Tools
      [INFO] Spark Project Hive
      [INFO] Spark Project REPL
      [INFO] Spark Project YARN Parent POM
      [INFO] Spark Project YARN Stable API
      [INFO] Spark Project Assembly
      [INFO] Spark Project External Twitter
      [INFO] Spark Project External Kafka
      [INFO] Spark Project External Flume Sink
      [INFO] Spark Project External Flume
      [INFO] Spark Project External ZeroMQ
      [INFO] Spark Project External MQTT
      [INFO] Spark Project Examples
      [INFO] Spark Project Yarn Shuffle Service Code
      ```
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #3148 from andrewor14/build-drop-code and squashes the following commits:
      
      eac839b [Andrew Or] Network -> Networking
      d01ad47 [Andrew Or] Rename network module project names
• [MLLIB] [PYTHON] SPARK-4221: Expose nonnegative ALS in the python API · 7e9d9756
      Michelangelo D'Agostino authored
SPARK-1553 added alternating nonnegative least squares to MLlib; however, it's not possible to access it via the Python API. This pull request resolves that.
      
      Author: Michelangelo D'Agostino <mdagostino@civisanalytics.com>
      
      Closes #3095 from mdagost/python_nmf and squashes the following commits:
      
      a6743ad [Michelangelo D'Agostino] Use setters instead of static methods in PythonMLLibAPI.  Remove the new static methods I added.  Set seed in tests.  Change ratings to ratingsRDD in both train and trainImplicit for consistency.
      7cffd39 [Michelangelo D'Agostino] Swapped nonnegative and seed in a few more places.
      3fdc851 [Michelangelo D'Agostino] Moved seed to the end of the python parameter list.
      bdcc154 [Michelangelo D'Agostino] Change seed type to java.lang.Long so that it can handle null.
      cedf043 [Michelangelo D'Agostino] Added in ability to set the seed from python and made that play nice with the nonnegative changes.  Also made the python ALS tests more exact.
      a72fdc9 [Michelangelo D'Agostino] Expose nonnegative ALS in the python API.