  1. May 31, 2015
  2. May 30, 2015
    • Cheng Lian's avatar
      [SQL] [MINOR] Fixes a minor comment mistake in IsolatedClientLoader · f7fe9e47
      Cheng Lian authored
      Author: Cheng Lian <lian@databricks.com>
      
      Closes #6521 from liancheng/classloader-comment-fix and squashes the following commits:
      
      fc09606 [Cheng Lian] Addresses @srowen's comment
      59945c5 [Cheng Lian] Fixes a minor comment mistake in IsolatedClientLoader
      f7fe9e47
    • Reynold Xin's avatar
      Update documentation for the new DataFrame reader/writer interface. · 00a71379
      Reynold Xin authored
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #6522 from rxin/sql-doc-1.4 and squashes the following commits:
      
      c227be7 [Reynold Xin] Updated link.
      040b6d7 [Reynold Xin] Update documentation for the new DataFrame reader/writer interface.
      00a71379
    • Reynold Xin's avatar
      [SPARK-7971] Add JavaDoc style deprecation for deprecated DataFrame methods · c63e1a74
      Reynold Xin authored
      Scala deprecated annotation actually doesn't show up in JavaDoc.
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #6523 from rxin/df-deprecated-javadoc and squashes the following commits:
      
      26da2b2 [Reynold Xin] [SPARK-7971] Add JavaDoc style deprecation for deprecated DataFrame methods.
      c63e1a74
    • Reynold Xin's avatar
      [SQL] Tighten up visibility for JavaDoc. · 14b314dc
      Reynold Xin authored
      I went through all the JavaDocs and tightened up visibility.
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #6526 from rxin/sql-1.4-visibility-for-docs and squashes the following commits:
      
      bc37d1e [Reynold Xin] Tighten up visibility for JavaDoc.
      14b314dc
    • Xiangrui Meng's avatar
      [SPARK-5610] [DOC] update genjavadocSettings to use the patched version of genjavadoc · 2b258e1c
      Xiangrui Meng authored
      This PR updates `genjavadocSettings` to use a patched version of `genjavadoc-plugin` that hides package private classes/methods/interfaces in the generated Java API doc. The patch can be found at: https://github.com/typesafehub/genjavadoc/compare/master...mengxr:spark-1.4.
      
      It wasn't merged into the main repo because there exist corner cases where a package private Scala class has to be a Java public class in order to compile. This doesn't seem to apply to the Spark codebase. So we release a patched version under `org.spark-project` and use it in the Spark build. brkyvz is publishing the artifacts to Maven Central.
      
      Need more people to audit the generated APIs and make sure we don't have false negatives.
      
      Current listed classes under `org.apache.spark.rdd`:
      ![screen shot 2015-05-29 at 12 48 52 pm](https://cloud.githubusercontent.com/assets/829644/7891396/28fb9daa-0601-11e5-8ed8-4e9522d25a71.png)
      
      After this PR:
      ![screen shot 2015-05-29 at 12 48 23 pm](https://cloud.githubusercontent.com/assets/829644/7891408/408e210e-0601-11e5-975c-ff0a02eb5c91.png)
      
      cc: pwendell rxin srowen
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #6506 from mengxr/SPARK-5610 and squashes the following commits:
      
      489c785 [Xiangrui Meng] update genjavadocSettings to use the patched version of genjavadoc
      2b258e1c
    • Josh Rosen's avatar
      [HOTFIX] Replace FunSuite with SparkFunSuite. · 66a53a69
      Josh Rosen authored
      This fixes a build break introduced by merging a6430028,
      which fails the new style checks that ensure that we use SparkFunSuite instead
      of FunSuite.
      66a53a69
    • Mike Dusenberry's avatar
      [SPARK-7920] [MLLIB] Make MLlib ChiSqSelector Serializable (& Fix Related Documentation Example). · 1281a351
      Mike Dusenberry authored
      The MLlib ChiSqSelector class is not serializable, and so the example in the ChiSqSelector documentation fails. Also, that example is missing the import of ChiSqSelector.
      
      This PR makes ChiSqSelector extend Serializable in MLlib, and adds the ChiSqSelector import statement to the associated example in the documentation.
      
      Author: Mike Dusenberry <dusenberrymw@gmail.com>
      
      Closes #6462 from dusenberrymw/Make_ChiSqSelector_Serializable_and_Fix_Related_Docs_Example and squashes the following commits:
      
      9cb2f94 [Mike Dusenberry] Make MLlib ChiSqSelector Serializable.
      d9003bf [Mike Dusenberry] Add missing import in MLlib ChiSqSelector Docs Scala example.
      1281a351
    • Yanbo Liang's avatar
      [SPARK-7918] [MLLIB] MLlib Python doc parity check for evaluation and feature · 1617363f
      Yanbo Liang authored
      Check the MLlib Python evaluation and feature docs and make them as complete as the Scala docs.
      
      Author: Yanbo Liang <ybliang8@gmail.com>
      
      Closes #6461 from yanboliang/spark-7918 and squashes the following commits:
      
      940e3f1 [Yanbo Liang] truncate too long line and remove extra sparse
      a80ae58 [Yanbo Liang] MLlib Python doc parity check for evaluation and feature
      1617363f
    • Josh Rosen's avatar
      [SPARK-7855] Move bypassMergeSort-handling from ExternalSorter to own component · a6430028
      Josh Rosen authored
      Spark's `ExternalSorter` writes shuffle output files during sort-based shuffle. Sort-shuffle contains a configuration, `spark.shuffle.sort.bypassMergeThreshold`, which causes ExternalSorter to skip sorting and merging and simply write separate files per partition, which are then concatenated together to form the final map output file.
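
A minimal sketch of the bypass path described above, written in Python rather than Spark's Scala/Java and not taken from the patch. Each record is appended to a per-partition file with no sorting, and the files are then concatenated into one output file. The names `write_bypass` and `partition_of` are hypothetical.

```python
import os
import shutil
import tempfile


def write_bypass(records, num_partitions, partition_of, output_path):
    """Write records to one file per partition, then concatenate the files.

    Returns the byte length of each partition's segment in the output file.
    """
    tmp_dir = tempfile.mkdtemp()
    paths = [os.path.join(tmp_dir, f"part-{i}") for i in range(num_partitions)]
    try:
        # One writer per partition; records are appended in arrival order,
        # so no sorting or merging is involved.
        writers = [open(p, "wb") for p in paths]
        try:
            for record in records:
                writers[partition_of(record)].write(record)
        finally:
            for w in writers:
                w.close()

        # Concatenate the per-partition files into the final output file.
        lengths = []
        with open(output_path, "wb") as out:
            for p in paths:
                with open(p, "rb") as part:
                    shutil.copyfileobj(part, out)
                lengths.append(os.path.getsize(p))
        return lengths
    finally:
        shutil.rmtree(tmp_dir, ignore_errors=True)
```

The returned lengths are what a reader would need to locate each partition's byte range inside the single output file.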
      
      The code paths used during this bypass are almost completely separate from ExternalSorter's other code paths, so refactoring them into a separate file can significantly simplify the code.
      
      In addition to re-arranging code, this patch deletes a bunch of dead code.  The main entry point into ExternalSorter is `insertAll()` and in SPARK-4479 / #3422 this method was modified to completely bypass in-memory buffering of records when `bypassMergeSort` takes effect. As a result, some of the spilling and merging code paths will no longer be called when `bypassMergeSort` is used, so we should be able to safely remove that code.
      
      There's an open JIRA ([SPARK-6026](https://issues.apache.org/jira/browse/SPARK-6026)) for removing the `bypassMergeThreshold` parameter and code paths; I have not done that here, but the changes in this patch will make removing that parameter significantly easier if we ever decide to do that.
      
      This patch also makes several improvements to shuffle-related tests and adds more defensive checks to certain shuffle classes:
      
      - DiskBlockObjectWriter now throws an exception if `fileSegment()` is called before `commitAndClose()` has been called.
      - DiskBlockObjectWriter's close methods are now idempotent, so calling any of the close methods twice in a row will no longer result in incorrect shuffle write metrics changes.  Calling `revertPartialWritesAndClose()` on a closed DiskBlockObjectWriter now has no effect (before, it might mess up the metrics).
      - The end-to-end shuffle record count metrics tests have been moved from InputOutputMetricsSuite to ShuffleSuite.  This means that these tests will now be run against all shuffle implementations rather than just the default shuffle configuration.
      - The end-to-end metrics tests now include a test of a job which performs aggregation in the shuffle.
      - Our tests now check that `shuffleBytesWritten == totalShuffleBytesRead`.
      - FileSegment now throws IllegalArgumentException if it is constructed with a negative length or offset.
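
The defensive checks listed above can be sketched roughly as follows. This is an illustrative Python analogue, not the Scala code from the patch: `SegmentWriter` and its method names are hypothetical stand-ins for DiskBlockObjectWriter, and `ValueError` / `RuntimeError` stand in for the JVM exceptions.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class FileSegment:
    path: str
    offset: int
    length: int

    def __post_init__(self):
        # Reject segments that cannot describe a real byte range.
        if self.offset < 0:
            raise ValueError(f"offset must be non-negative, got {self.offset}")
        if self.length < 0:
            raise ValueError(f"length must be non-negative, got {self.length}")


class SegmentWriter:
    """Appends bytes to a file; both close methods are idempotent."""

    def __init__(self, path):
        self.path = path
        self._start = os.path.getsize(path) if os.path.exists(path) else 0
        self._file = open(path, "ab")
        self._pending = 0
        self._closed = False
        self.bytes_committed = None  # set exactly once by commit_and_close()

    def write(self, data):
        if self._closed:
            raise ValueError("writer is already closed")
        self._file.write(data)
        self._pending += len(data)

    def commit_and_close(self):
        if self._closed:
            return  # second call is a no-op, so metrics are not double counted
        self._file.close()
        self._closed = True
        self.bytes_committed = self._pending

    def revert_partial_writes_and_close(self):
        if self._closed:
            return  # no effect on an already closed writer
        self._file.flush()
        self._file.truncate(self._start)  # drop everything this writer wrote
        self._file.close()
        self._closed = True

    def file_segment(self):
        if self.bytes_committed is None:
            raise RuntimeError("file_segment() called before commit_and_close()")
        return FileSegment(self.path, self._start, self.bytes_committed)
```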
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #6397 from JoshRosen/external-sorter-bypass-cleanup and squashes the following commits:
      
      bf3f3f6 [Josh Rosen] Merge remote-tracking branch 'origin/master' into external-sorter-bypass-cleanup
      8b216c4 [Josh Rosen] Guard against negative offsets and lengths in FileSegment
      03f35a4 [Josh Rosen] Minor fix to cleanup logic.
      b5cc35b [Josh Rosen] Move shuffle metrics tests to ShuffleSuite.
      8b8fb9e [Josh Rosen] Add more tests + defensive programming to DiskBlockObjectWriter.
      16564eb [Josh Rosen] Guard against calling fileSegment() before commitAndClose() has been called.
      96811b4 [Josh Rosen] Remove confusing taskMetrics.shuffleWriteMetrics() optional call
      8522b6a [Josh Rosen] Do not perform a map-side sort unless we're also doing map-side aggregation
      08e40f3 [Josh Rosen] Remove excessively clever (and wrong) implementation of newBuffer()
      d7f9938 [Josh Rosen] Add missing overrides; fix compilation
      71d76ff [Josh Rosen] Update Javadoc
      bf0d98f [Josh Rosen] Add comment to clarify confusing factory code
      5197f73 [Josh Rosen] Add missing private[this]
      30ef2c8 [Josh Rosen] Convert BypassMergeSortShuffleWriter to Java
      bc1a820 [Josh Rosen] Fix bug when aggregator is used but map-side combine is disabled
      0d3dcc0 [Josh Rosen] Remove unnecessary overloaded methods
      25b964f [Josh Rosen] Rename SortShuffleSorter to SortShuffleFileWriter
      0d9848c [Josh Rosen] Make it more clear that curWriteMetrics is now only used for spill metrics
      7af7aea [Josh Rosen] Combine spill() and spillToMergeableFile()
      6320112 [Josh Rosen] Add missing negation in deletion success check.
      d267e0d [Josh Rosen] Fix style issue
      7f15f7b [Josh Rosen] Back out extra cleanup-handling code, since this is already covered in stop()
      25aa3bd [Josh Rosen] Make sure to delete outputFile after errors.
      931ca68 [Josh Rosen] Refactor tests.
      6a35716 [Josh Rosen] Refactor logic for deciding when to bypass
      4b03539 [Josh Rosen] Move conf prior to first use
      1265b25 [Josh Rosen] Fix some style errors and comments.
      02355ef [Josh Rosen] More simplification
      d4cb536 [Josh Rosen] Delete more unused code
      bb96678 [Josh Rosen] Add missing interface file
      b6cc1eb [Josh Rosen] Realize that bypass never buffers; proceed to delete tons of code
      6185ee2 [Josh Rosen] WIP towards moving bypass code into own file.
      8d0678c [Josh Rosen] Move diskBytesSpilled getter next to variable
      19bccd6 [Josh Rosen] Remove duplicated buffer creation code.
      18959bb [Josh Rosen] Move comparator methods closer together.
      a6430028
    • Cheng Lian's avatar
      [SPARK-7849] [SQL] [Docs] Updates SQL programming guide for 1.4 · 6e3f0c78
      Cheng Lian authored
      Author: Cheng Lian <lian@databricks.com>
      
      Closes #6520 from liancheng/spark-7849 and squashes the following commits:
      
      705264b [Cheng Lian] Updates SQL programming guide for 1.4
      6e3f0c78
    • Reynold Xin's avatar
      Closes #4685 · d34b43bd
      Reynold Xin authored
      d34b43bd
    • Taka Shinagawa's avatar
      [DOCS] [MINOR] Update for the Hadoop versions table with hadoop-2.6 · 3ab71eb9
      Taka Shinagawa authored
      Updated the doc for the hadoop-2.6 profile, which is new to Spark 1.4
      
      Author: Taka Shinagawa <taka.epsilon@gmail.com>
      
      Closes #6450 from mrt/docfix2 and squashes the following commits:
      
      db1c43b [Taka Shinagawa] Updated the hadoop versions for hadoop-2.6 profile
      323710e [Taka Shinagawa] The hadoop-2.6 profile is added to the Hadoop versions table
      3ab71eb9
    • zhichao.li's avatar
      [SPARK-7717] [WEBUI] Only showing total memory and cores for alive workers · 2b35c99c
      zhichao.li authored
      Author: zhichao.li <zhichao.li@intel.com>
      
      Closes #6317 from zhichao-li/workers and squashes the following commits:
      
      d68bf11 [zhichao.li] change prefix
      99b6768 [zhichao.li] remove extra space and add 'Alive' prefix
      1e8eb06 [zhichao.li] only showing alive workers
      2b35c99c
    • WangTaoTheTonic's avatar
      [SPARK-7945] [CORE] Do trim to values in properties file · 9d8aadb7
      WangTaoTheTonic authored
      https://issues.apache.org/jira/browse/SPARK-7945
      
      Currently, applications submitted by org.apache.spark.launcher.Main read the properties file without trimming the values in it.
      If a user leaves a space after a value (say, for spark.driver.extraClassPath), it can break behavior (for example, a jar might not be included in the classpath), so we should trim values the same way Utils.getPropertiesFromFile does.
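
To illustrate why trimming matters, here is a small Python sketch (not the launcher's Java code). It parses a simplified properties format, where a key is separated from its value by `=`, `:`, or whitespace, and strips surrounding whitespace from each value. The function name `load_properties` and the sample path are assumptions made for the example.

```python
import re

# key, then a single separator (=, :, or whitespace), then the raw value
_LINE = re.compile(r"\s*([^\s=:]+)\s*[=:\s]\s*(.*)$")


def load_properties(text):
    """Parse simple key/value lines and trim whitespace around each value."""
    props = {}
    for line in text.splitlines():
        stripped = line.lstrip()
        if not stripped or stripped.startswith(("#", "!")):
            continue  # skip blank lines and comments
        match = _LINE.match(line)
        if match:
            key, value = match.groups()
            # Without strip(), a trailing space would become part of the value.
            props[key] = value.strip()
    return props
```

With this, a line such as `spark.driver.extraClassPath=/opt/lib/a.jar   ` yields the value `/opt/lib/a.jar` with no trailing spaces.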
      
      Author: WangTaoTheTonic <wangtao111@huawei.com>
      Author: Tao Wang <wangtao111@huawei.com>
      
      Closes #6496 from WangTaoTheTonic/SPARK-7945 and squashes the following commits:
      
      bb41b4b [Tao Wang] indent 4 to 2
      6dd1cf2 [WangTaoTheTonic] use a simpler way
      2c053a1 [WangTaoTheTonic] Do trim to values in properties file
      9d8aadb7
    • Sean Owen's avatar
      [SPARK-7890] [DOCS] Document that Spark 2.11 now supports Kafka · 8c8de3ed
      Sean Owen authored
      Remove caveat about Kafka / JDBC not being supported for Scala 2.11
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #6470 from srowen/SPARK-7890 and squashes the following commits:
      
      4652634 [Sean Owen] One more rewording
      7b7f3c8 [Sean Owen] Restore note about JDBC component
      126744d [Sean Owen] Remove caveat about Kafka / JDBC not being supported for Scala 2.11
      8c8de3ed
    • Wenchen Fan's avatar
      [SPARK-7964][SQL] remove unnecessary type coercion rule · 0978aec9
      Wenchen Fan authored
      This logic is already defined in `Cast`, so I think we should remove this rule.
      
      Author: Wenchen Fan <cloud0fan@outlook.com>
      
      Closes #6516 from cloud-fan/tmp2 and squashes the following commits:
      
      d5035a4 [Wenchen Fan] remove useless rule
      0978aec9
    • Octavian Geagla's avatar
      [SPARK-7459] [MLLIB] ElementwiseProduct Java example · e3a43748
      Octavian Geagla authored
      Author: Octavian Geagla <ogeagla@gmail.com>
      
      Closes #6008 from ogeagla/elementwise-prod-doc and squashes the following commits:
      
      72e6dc0 [Octavian Geagla] [SPARK-7459] [MLLIB] Java example import.
      cf2afbd [Octavian Geagla] [SPARK-7459] [MLLIB] Update description of example.
      b66431b [Octavian Geagla] [SPARK-7459] [MLLIB] Add override annotation to java example, make scala example use same data as java.
      6b26b03 [Octavian Geagla] [SPARK-7459] [MLLIB] Fix line which is too long.
      79af020 [Octavian Geagla] [SPARK-7459] [MLLIB] Actually don't use Java 8.
      9d5b31a [Octavian Geagla] [SPARK-7459] [MLLIB] Don't use Java 8
      4f0c92f [Octavian Geagla] [SPARK-7459] [MLLIB] ElementwiseProduct Java example.
      e3a43748
    • Timothy Chen's avatar
      [SPARK-7962] [MESOS] Fix master url parsing in rest submission client. · 78657d53
      Timothy Chen authored
      Only parse standalone master url when master url starts with spark://
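
A rough Python sketch of the guard described above, not the actual client code: the standalone parsing step runs only when the URL has the `spark://` prefix, and any other scheme is left alone. The function name `standalone_masters` is hypothetical, and returning `None` for non-standalone URLs is an assumption made for the example.

```python
def standalone_masters(master_url):
    """Split a standalone master URL into one URL per master.

    Returns None for URLs that do not use the spark:// scheme, so callers
    never apply standalone parsing to, for example, a Mesos master URL.
    """
    prefix = "spark://"
    if not master_url.startswith(prefix):
        return None
    # Standalone URLs may list several masters: spark://host1:port1,host2:port2
    return [prefix + host_port for host_port in master_url[len(prefix):].split(",")]
```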
      
      Author: Timothy Chen <tnachen@gmail.com>
      
      Closes #6517 from tnachen/fix_mesos_client and squashes the following commits:
      
      61a1198 [Timothy Chen] Fix master url parsing in rest submission client.
      78657d53
    • Octavian Geagla's avatar
      [SPARK-7576] [MLLIB] Add spark.ml user guide doc/example for ElementwiseProduct · da2112ae
      Octavian Geagla authored
      Author: Octavian Geagla <ogeagla@gmail.com>
      
      Closes #6501 from ogeagla/ml-guide-elemwiseprod and squashes the following commits:
      
      4ad93d5 [Octavian Geagla] [SPARK-7576] [MLLIB] Incorporate code review feedback.
      f7be7ad [Octavian Geagla] [SPARK-7576] [MLLIB] Add spark.ml user guide doc/example for ElementwiseProduct.
      da2112ae
    • Andrew Or's avatar
      [TRIVIAL] Typo fix for last commit · 193dba01
      Andrew Or authored
      193dba01
    • Andrew Or's avatar
      [SPARK-7558] Guard against direct uses of FunSuite / FunSuiteLike · 609c4923
      Andrew Or authored
      This is a follow-up patch to #6441.
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #6510 from andrewor14/extends-funsuite-check and squashes the following commits:
      
      6618b46 [Andrew Or] Exempt SparkSinkSuite from the FunSuite check
      99d02ac [Andrew Or] Merge branch 'master' of github.com:apache/spark into extends-funsuite-check
      48874dd [Andrew Or] Guard against direct uses of FunSuite / FunSuiteLike
      609c4923
    • Burak Yavuz's avatar
      [SPARK-7957] Preserve partitioning when using randomSplit · 7ed06c39
      Burak Yavuz authored
      cc JoshRosen
      Thanks for noticing this!
      
      Author: Burak Yavuz <brkyvz@gmail.com>
      
      Closes #6509 from brkyvz/sample-perf-reg and squashes the following commits:
      
      497465d [Burak Yavuz] addressed code review
      293f95f [Burak Yavuz] [SPARK-7957] Preserve partitioning when using randomSplit
      7ed06c39
  3. May 29, 2015
    • Taka Shinagawa's avatar
      [DOCS][Tiny] Added a missing dash(-) in docs/configuration.md · 3792d258
      Taka Shinagawa authored
      The first line had only two dashes (--) instead of three (---). Because of this missing dash (-), the 'jekyll build' command was not converting configuration.md to _site/configuration.html
      
      Author: Taka Shinagawa <taka.epsilon@gmail.com>
      
      Closes #6513 from mrt/docfix3 and squashes the following commits:
      
      c470e2c [Taka Shinagawa] Added a missing dash(-) preventing jekyll from converting configuration.md to html format
      3792d258
    • Andrew Or's avatar
      [HOT FIX] [BUILD] Fix maven build failures · a4f24123
      Andrew Or authored
      This patch fixes a build break in maven caused by #6441.
      
      Note that this patch reverts the changes in flume-sink because
      this module does not currently depend on Spark core, but the
      tests require it. There is not an easy way to make this work
      because mvn test dependencies are not transitive (MNG-1378).
      
      For now, we will leave the one test suite in flume-sink out
      until we figure out a better solution. This patch is mainly
      intended to unbreak the maven build.
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #6511 from andrewor14/fix-build-mvn and squashes the following commits:
      
      3d53643 [Andrew Or] [HOT FIX #6441] Fix maven build failures
      a4f24123
    • Andrew Or's avatar
      [HOTFIX] [SQL] Maven test compilation issue · 8c997933
      Andrew Or authored
      Tests compile in SBT but not Maven.
      8c997933
    • Ram Sriharsha's avatar
      [SPARK-6013] [ML] Add more Python ML examples for spark.ml · dbf8ff38
      Ram Sriharsha authored
      Author: Ram Sriharsha <rsriharsha@hw11853.local>
      
      Closes #6443 from harsha2010/SPARK-6013 and squashes the following commits:
      
      732506e [Ram Sriharsha] Code Review Feedback
      121c211 [Ram Sriharsha] python style fix
      5f9b8c3 [Ram Sriharsha] python style fixes
      925ca86 [Ram Sriharsha] Simple Params Example
      8b372b1 [Ram Sriharsha] GBT Example
      965ec14 [Ram Sriharsha] Random Forest Example
      dbf8ff38
    • Shivaram Venkataraman's avatar
      [SPARK-7954] [SPARKR] Create SparkContext in sparkRSQL init · 5fb97dca
      Shivaram Venkataraman authored
      cc davies
      
      Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
      
      Closes #6507 from shivaram/sparkr-init and squashes the following commits:
      
      6fdd169 [Shivaram Venkataraman] Create SparkContext in sparkRSQL init
      5fb97dca
    • Holden Karau's avatar
      [SPARK-7910] [TINY] [JAVAAPI] expose partitioner information in javardd · 82a396c2
      Holden Karau authored
      Author: Holden Karau <holden@pigscanfly.ca>
      
      Closes #6464 from holdenk/SPARK-7910-expose-partitioner-information-in-javardd and squashes the following commits:
      
      de1e644 [Holden Karau] Fix the test to get the partitioner
      bdb31cc [Holden Karau] Add Mima exclude for the new method
      347ef4c [Holden Karau] Add a quick little test for the partitioner JavaAPI
      f49dca9 [Holden Karau] Add partitoner information to JavaRDDLike and fix some whitespace
      82a396c2
    • Michael Nazario's avatar
      [SPARK-7899] [PYSPARK] Fix Python 3 pyspark/sql/types module conflict · 1c5b1982
      Michael Nazario authored
      This PR makes the types module in `pyspark/sql/types` work with pylint static analysis by removing the dynamic naming of the `pyspark/sql/_types` module to `pyspark/sql/types`.
      
      Tests are now loaded using `$PYSPARK_DRIVER_PYTHON -m module` rather than `$PYSPARK_DRIVER_PYTHON module.py`. The old method adds the location of `module.py` to `sys.path`, so this change prevents accidental use of relative paths in Python.
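
The difference between the two invocation styles can be observed with a short experiment. The sketch below is not part of the PR: it creates a throwaway package (the names `pkg` and `mod` are made up), runs the same file once as a script and once with `-m`, and reports the first `sys.path` entry each time.

```python
import os
import subprocess
import sys
import tempfile


def first_sys_path_entries():
    """Run one file as a script and as a module; return sys.path[0] for each."""
    root = os.path.realpath(tempfile.mkdtemp())
    pkg = os.path.join(root, "pkg")
    os.mkdir(pkg)
    open(os.path.join(pkg, "__init__.py"), "w").close()
    with open(os.path.join(pkg, "mod.py"), "w") as f:
        f.write("import os, sys\nprint(os.path.realpath(sys.path[0]))\n")

    def run(args):
        result = subprocess.run(
            [sys.executable] + args,
            cwd=root,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.strip()

    as_script = run([os.path.join("pkg", "mod.py")])  # python pkg/mod.py
    as_module = run(["-m", "pkg.mod"])                # python -m pkg.mod
    return root, pkg, as_script, as_module
```

Run as a script, the file's own directory comes first on `sys.path`, so sibling files there can shadow standard-library modules of the same name (such as `types`). Run with `-m`, the first entry is the working directory instead.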
      
      Author: Michael Nazario <mnazario@palantir.com>
      
      Closes #6439 from mnazario/feature/SPARK-7899 and squashes the following commits:
      
      366ef30 [Michael Nazario] Remove hack on random.py
      bb8b04d [Michael Nazario] Make doctests consistent with other tests
      6ee4f75 [Michael Nazario] Change test scripts to use "-m"
      673528f [Michael Nazario] Move _types back to types
      1c5b1982
    • Shivaram Venkataraman's avatar
      [SPARK-6806] [SPARKR] [DOCS] Add a new SparkR programming guide · 5f48e5c3
      Shivaram Venkataraman authored
      This PR adds a new SparkR programming guide at the top-level. This will be useful for R users as our APIs don't directly match the Scala/Python APIs and as we need to explain SparkR without using RDDs as examples etc.
      
      cc rxin davies pwendell
      
      cc cafreeman -- Would be great if you could also take a look at this !
      
      Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
      
      Closes #6490 from shivaram/sparkr-guide and squashes the following commits:
      
      d5ff360 [Shivaram Venkataraman] Add a section on HiveContext, HQL queries
      408dce5 [Shivaram Venkataraman] Fix link
      dbb86e3 [Shivaram Venkataraman] Fix minor typo
      9aff5e0 [Shivaram Venkataraman] Address comments, use dplyr-like syntax in example
      d09703c [Shivaram Venkataraman] Fix default argument in read.df
      ea816a1 [Shivaram Venkataraman] Add a new SparkR programming guide Also update write.df, read.df to handle defaults better
      5f48e5c3
    • Andrew Or's avatar
      [SPARK-7558] Demarcate tests in unit-tests.log · 9eb222c1
      Andrew Or authored
      Right now `unit-tests.log` is not of much value because we can't easily tell where the test boundaries are. This patch adds log statements before and after each test to outline the test boundaries, e.g.:
      
      ```
      ===== TEST OUTPUT FOR o.a.s.serializer.KryoSerializerSuite: 'kryo with parallelize for primitive arrays' =====
      
      15/05/27 12:36:39.596 pool-1-thread-1-ScalaTest-running-KryoSerializerSuite INFO SparkContext: Starting job: count at KryoSerializerSuite.scala:230
      15/05/27 12:36:39.596 dag-scheduler-event-loop INFO DAGScheduler: Got job 3 (count at KryoSerializerSuite.scala:230) with 4 output partitions (allowLocal=false)
      15/05/27 12:36:39.596 dag-scheduler-event-loop INFO DAGScheduler: Final stage: ResultStage 3(count at KryoSerializerSuite.scala:230)
      15/05/27 12:36:39.596 dag-scheduler-event-loop INFO DAGScheduler: Parents of final stage: List()
      15/05/27 12:36:39.597 dag-scheduler-event-loop INFO DAGScheduler: Missing parents: List()
      15/05/27 12:36:39.597 dag-scheduler-event-loop INFO DAGScheduler: Submitting ResultStage 3 (ParallelCollectionRDD[5] at parallelize at KryoSerializerSuite.scala:230), which has no missing parents
      
      ...
      
      15/05/27 12:36:39.624 pool-1-thread-1-ScalaTest-running-KryoSerializerSuite INFO DAGScheduler: Job 3 finished: count at KryoSerializerSuite.scala:230, took 0.028563 s
      15/05/27 12:36:39.625 pool-1-thread-1-ScalaTest-running-KryoSerializerSuite INFO KryoSerializerSuite:
      
      ***** FINISHED o.a.s.serializer.KryoSerializerSuite: 'kryo with parallelize for primitive arrays' *****
      
      ...
      ```
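
As a loose Python analogue of this idea (the patch itself introduces a Scala base class for all test suites), a shared `unittest` base class can log a banner before and after each test. The class name `DemarcatedTestCase` and the logger name are assumptions made for the sketch.

```python
import logging
import unittest

log = logging.getLogger("unit-tests")


class DemarcatedTestCase(unittest.TestCase):
    """Base class that logs a banner before and after every test method."""

    def run(self, result=None):
        name = f"{type(self).__module__}.{type(self).__qualname__}: '{self._testMethodName}'"
        log.info("===== TEST OUTPUT FOR %s =====", name)
        try:
            return super().run(result)
        finally:
            # Logged even when the test fails, so boundaries stay intact.
            log.info("***** FINISHED %s *****", name)
```

Test suites would extend this base class instead of `unittest.TestCase`, so the banners appear without changes to individual tests.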
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #6441 from andrewor14/demarcate-tests and squashes the following commits:
      
      879b060 [Andrew Or] Fix compile after rebase
      d622af7 [Andrew Or] Merge branch 'master' of github.com:apache/spark into demarcate-tests
      017c8ba [Andrew Or] Merge branch 'master' of github.com:apache/spark into demarcate-tests
      7790b6c [Andrew Or] Fix tests after logical merge conflict
      c7460c0 [Andrew Or] Merge branch 'master' of github.com:apache/spark into demarcate-tests
      c43ffc4 [Andrew Or] Fix tests?
      8882581 [Andrew Or] Fix tests
      ee22cda [Andrew Or] Fix log message
      fa9450e [Andrew Or] Merge branch 'master' of github.com:apache/spark into demarcate-tests
      12d1e1b [Andrew Or] Various whitespace changes (minor)
      69cbb24 [Andrew Or] Make all test suites extend SparkFunSuite instead of FunSuite
      bbce12e [Andrew Or] Fix manual things that cannot be covered through automation
      da0b12f [Andrew Or] Add core tests as dependencies in all modules
      f7d29ce [Andrew Or] Introduce base abstract class for all test suites
      9eb222c1
    • Reynold Xin's avatar
      [SPARK-7940] Enforce whitespace checking for DO, TRY, CATCH, FINALLY, MATCH,... · 94f62a49
      Reynold Xin authored
      [SPARK-7940] Enforce whitespace checking for DO, TRY, CATCH, FINALLY, MATCH, LARROW, RARROW in style checker.
      
      …
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #6491 from rxin/more-whitespace and squashes the following commits:
      
      f6e63dc [Reynold Xin] [SPARK-7940] Enforce whitespace checking for DO, TRY, CATCH, FINALLY, MATCH, LARROW, RARROW in style checker.
      94f62a49
    • MechCoder's avatar
      [SPARK-7946] [MLLIB] DecayFactor wrongly set in StreamingKMeans · 6181937f
      MechCoder authored
      Author: MechCoder <manojkumarsivaraj334@gmail.com>
      
      Closes #6497 from MechCoder/spark-7946 and squashes the following commits:
      
      2fdd0a3 [MechCoder] Add non-regression test
      8c988c6 [MechCoder] [SPARK-7946] DecayFactor wrongly set in StreamingKMeans
      6181937f
    • Cheng Lian's avatar
      [SQL] [TEST] [MINOR] Uses a temporary log4j.properties in... · 4782e130
      Cheng Lian authored
      [SQL] [TEST] [MINOR] Uses a temporary log4j.properties in HiveThriftServer2Test to ensure expected logging behavior
      
      The `HiveThriftServer2Test` relies on proper logging behavior to assert whether the Thrift server daemon process is started successfully. However, some other jar files listed in the classpath may potentially contain an unexpected Log4J configuration file which overrides the logging behavior.
      
      This PR writes a temporary `log4j.properties` and prepends it to the driver classpath before starting the testing Thrift server process to ensure proper logging behavior.
      
      cc andrewor14 yhuai
      
      Author: Cheng Lian <lian@databricks.com>
      
      Closes #6493 from liancheng/override-log4j and squashes the following commits:
      
      c489e0e [Cheng Lian] Fixes minor Scala styling issue
      b46ef0d [Cheng Lian] Uses a temporary log4j.properties in HiveThriftServer2Test to ensure expected logging behavior
      4782e130
    • Cheng Lian's avatar
      [SPARK-7950] [SQL] Sets spark.sql.hive.version in HiveThriftServer2.startWithContext() · e7b61775
      Cheng Lian authored
      When starting `HiveThriftServer2` via `startWithContext`, the property `spark.sql.hive.version` isn't set. This causes Simba ODBC driver 1.0.8.1006 to behave differently and fail simple queries.
      
      The Hive2 JDBC driver works fine in this case. Also, when starting the server with `start-thriftserver.sh`, both the Hive2 JDBC driver and the Simba ODBC driver work fine.
      
      Please refer to [SPARK-7950] [1] for details.
      
      [1]: https://issues.apache.org/jira/browse/SPARK-7950
      
      Author: Cheng Lian <lian@databricks.com>
      
      Closes #6500 from liancheng/odbc-bugfix and squashes the following commits:
      
      051e3a3 [Cheng Lian] Fixes import order
      3a97376 [Cheng Lian] Sets spark.sql.hive.version in HiveThriftServer2.startWithContext()
      e7b61775
    • WangTaoTheTonic's avatar
      [SPARK-7524] [SPARK-7846] add configs for keytab and principal, pass these two... · a51b133d
      WangTaoTheTonic authored
      [SPARK-7524] [SPARK-7846] add configs for keytab and principal, pass these two configs with different way in different modes
      
      * Spark now supports long-running services by updating tokens for the namenode, but it only accepts parameters passed in the "--k=v" format, which is not very convenient. This patch adds spark.* configs in the properties file and system properties.
      
      * The --principal and --keytab options are passed to the client, but when we start the thrift server or spark-shell these two are also passed into the Main class (org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 and org.apache.spark.repl.Main).
      In these two main classes, the arguments passed in are processed by third-party libraries, which leads to errors such as "Invalid option: --principal" or "Unrecgnised option: --principal".
      We should pass these command-line arguments in a different form, such as system properties.
      
      Author: WangTaoTheTonic <wangtao111@huawei.com>
      
      Closes #6051 from WangTaoTheTonic/SPARK-7524 and squashes the following commits:
      
      e65699a [WangTaoTheTonic] change logic to loadEnvironments
      ebd9ea0 [WangTaoTheTonic] merge master
      ecfe43a [WangTaoTheTonic] pass keytab and principal seperately in different mode
      33a7f40 [WangTaoTheTonic] expand the use of the current configs
      08bb4e8 [WangTaoTheTonic] fix wrong cite
      73afa64 [WangTaoTheTonic] add configs for keytab and principal, move originals to internal
      a51b133d
    • zsxwing's avatar
      [SPARK-7863] [CORE] Create SimpleDateFormat for every SimpleDateParam instance... · 8db40f67
      zsxwing authored
      [SPARK-7863] [CORE] Create SimpleDateFormat for every SimpleDateParam instance because it's not thread-safe
      
      SimpleDateFormat is not thread-safe. This PR creates a new `SimpleDateFormat` for each `SimpleDateParam` instance.
      
      Author: zsxwing <zsxwing@gmail.com>
      
      Closes #6406 from zsxwing/SPARK-7863 and squashes the following commits:
      
      aeed4c1 [zsxwing] Rewrite SimpleDateParam
      8cdd986 [zsxwing] Inline formats
      9680a15 [zsxwing] Create SimpleDateFormat for each SimpleDateParam instance because it's not thread-safe
      8db40f67