  1. Feb 15, 2015
    • Sean Owen's avatar
      SPARK-5815 [MLLIB] Deprecate SVDPlusPlus APIs that expose DoubleMatrix from JBLAS · acf2558d
      Sean Owen authored
      Deprecate SVDPlusPlus.run and introduce SVDPlusPlus.runSVDPlusPlus with return type that doesn't include DoubleMatrix
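The deprecate-and-redirect pattern described here can be sketched in pure Python (a hedged illustration; the function names mirror the Scala API, everything else, including the return value, is invented for the example):

```python
import functools
import warnings

def deprecated(replacement):
    """Mark an old entry point as deprecated, pointing callers at the new one."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            warnings.warn("%s is deprecated; use %s instead"
                          % (func.__name__, replacement),
                          DeprecationWarning, stacklevel=2)
            return func(*args, **kwargs)
        return wrapper
    return decorate

def runSVDPlusPlus(edges):
    # new API: returns plain values instead of exposing jblas DoubleMatrix
    return ("graph", len(edges))

@deprecated("runSVDPlusPlus")
def run(edges):
    # old API kept for compatibility; delegates to the new entry point
    return runSVDPlusPlus(edges)
```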
      
      CC mengxr
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #4614 from srowen/SPARK-5815 and squashes the following commits:
      
      288cb05 [Sean Owen] Clarify deprecation plans in scaladoc
      497458e [Sean Owen] Deprecate SVDPlusPlus.run and introduce SVDPlusPlus.runSVDPlusPlus with return type that doesn't include DoubleMatrix
      acf2558d
    • Xiangrui Meng's avatar
      [SPARK-5769] Set params in constructors and in setParams in Python ML pipelines · cd4a1536
      Xiangrui Meng authored
      This PR allows Python users to set params in constructors and in setParams, where we use the decorator `keyword_only` to force keyword arguments. The trade-off is discussed in the design doc of SPARK-4586.
      
      Generated doc:
      ![screen shot 2015-02-12 at 3 06 58 am](https://cloud.githubusercontent.com/assets/829644/6166491/9cfcd06a-b265-11e4-99ea-473d866634fc.png)
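A minimal sketch of a `keyword_only` decorator (assumed to be close in spirit to the one in this PR, with illustrative class and parameter names): reject positional arguments and stash the keyword arguments so `setParams` can reuse them.

```python
import functools

def keyword_only(func):
    """Force callers to pass every argument by keyword."""
    @functools.wraps(func)
    def wrapper(self, *args, **kwargs):
        if args:
            raise TypeError("Method %s only takes keyword arguments."
                            % func.__name__)
        self._input_kwargs = kwargs  # saved so setParams can reuse them
        return func(self, **kwargs)
    return wrapper

class Estimator:  # illustrative stand-in for a pipeline stage
    @keyword_only
    def __init__(self, maxIter=100, regParam=0.1):
        self.maxIter = maxIter
        self.regParam = regParam
```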
      
      CC: davies rxin
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #4564 from mengxr/py-pipeline-kw and squashes the following commits:
      
      fedf720 [Xiangrui Meng] use toDF
      d565f2c [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into py-pipeline-kw
      cbc15d3 [Xiangrui Meng] fix style
      5032097 [Xiangrui Meng] update pipeline signature
      950774e [Xiangrui Meng] simplify keyword_only and update constructor/setParams signatures
      fdde5fc [Xiangrui Meng] fix style
      c9384b8 [Xiangrui Meng] fix sphinx doc
      8e59180 [Xiangrui Meng] add setParams and make constructors take params, where we force keyword args
      cd4a1536
    • Sean Owen's avatar
      SPARK-5669 [BUILD] Spark assembly includes incompatibly licensed libgfortran, libgcc code via JBLAS · 836577b3
      Sean Owen authored
      Exclude libgfortran, libgcc bundled by JBLAS for Windows. This much is simple, and solves the essential license issue. But the more important question is whether MLlib works on Windows then.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #4453 from srowen/SPARK-5669 and squashes the following commits:
      
      734dd86 [Sean Owen] Exclude libgfortran, libgcc bundled by JBLAS, affecting Windows / OS X / Linux 32-bit (not Linux 64-bit)
      836577b3
    • martinzapletal's avatar
      [MLLIB][SPARK-5502] User guide for isotonic regression · 61eb1267
      martinzapletal authored
      User guide for isotonic regression added to docs/mllib-regression.md including code examples for Scala and Java.
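For readers without the Scala/Java examples at hand, here is a hedged pure-Python sketch of the pool-adjacent-violators idea behind isotonic regression (unweighted and nondecreasing; MLlib's actual implementation also handles weights and is not this function):

```python
def isotonic_fit(ys):
    """Pool Adjacent Violators: least-squares nondecreasing fit."""
    blocks = []  # each block is [sum, count]; its fitted value is sum / count
    for y in ys:
        blocks.append([float(y), 1])
        # merge while the last two block means violate monotonicity
        while len(blocks) > 1 and \
                blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted
```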
      
      Author: martinzapletal <zapletal-martin@email.cz>
      
      Closes #4536 from zapletal-martin/SPARK-5502 and squashes the following commits:
      
      67fe773 [martinzapletal] SPARK-5502 reworded model prediction rules to use more general language rather than the code/implementation specific terms
      80bd4c3 [martinzapletal] SPARK-5502 created docs page for isotonic regression, added links to the page, updated data and examples
      7d8136e [martinzapletal] SPARK-5502 Added documentation for Isotonic regression including examples for Scala and Java
      504b5c3 [martinzapletal] SPARK-5502 Added documentation for Isotonic regression including examples for Scala and Java
      61eb1267
    • Takeshi Yamamuro's avatar
      [SPARK-5827][SQL] Add missing import in the example of SqlContext · c771e475
      Takeshi Yamamuro authored
      If one tries the example via copy & paste, it throws an exception due to a missing import.
      
      Author: Takeshi Yamamuro <linguin.m.s@gmail.com>
      
      Closes #4615 from maropu/AddMissingImportInSqlContext and squashes the following commits:
      
      ab21b66 [Takeshi Yamamuro] Add missing import in the example of SqlContext
      c771e475
  2. Feb 14, 2015
    • gli's avatar
      SPARK-5822 [BUILD] cannot import src/main/scala & src/test/scala into eclipse as source folder · ed5f4bb7
      gli authored
         When importing the whole project into Eclipse as a Maven project, src/main/scala
         & src/test/scala cannot be set as source folders by default, so this adds an
         "add-source" goal to scala-maven-plugin to make it work.
      
      Author: gli <gli@redhat.com>
      
      Closes #4531 from ligangty/addsource and squashes the following commits:
      
      4e4db4c [gli] [IDE] cannot import src/main/scala & src/test/scala into eclipse as source folder
      ed5f4bb7
    • Sean Owen's avatar
      Revise formatting of previous commit f80e2629 · 15a2ab5f
      Sean Owen authored
      15a2ab5f
    • gasparms's avatar
      [SPARK-5800] Streaming Docs. Change linked files according the selected language · f80e2629
      gasparms authored
      Currently, the Spark Streaming Programming Guide, after the updateStateByKey explanation, links to the file stateful_network_wordcount.py and notes "For the complete Scala code ..." no matter which language tab is selected. This is incoherent.
      
      I've changed the guide so each tab links to its pertinent example file. The JavaStatefulNetworkWordCount.java example did not exist, so I added it in this commit.
      
      Author: gasparms <gmunoz@stratio.com>
      
      Closes #4589 from gasparms/feature/streaming-guide and squashes the following commits:
      
      7f37f89 [gasparms] More style changes
      ec202b0 [gasparms] Follow spark style guide
      f527328 [gasparms] Improve example to look like scala example
      4d8785c [gasparms] Remove throw exception
      e92e6b8 [gasparms] Fix incoherence
      92db405 [gasparms] Fix Streaming Programming Guide. Change files according the selected language
      f80e2629
    • Reynold Xin's avatar
      [SPARK-5752][SQL] Don't implicitly convert RDDs directly to DataFrames · e98dfe62
      Reynold Xin authored
      - The old implicit would convert RDDs directly to DataFrames, and that added too many methods.
      - toDataFrame -> toDF
      - Dsl -> functions
      - implicits moved into SQLContext.implicits
      - addColumn -> withColumn
      - renameColumn -> withColumnRenamed
      
      Python changes:
      - toDataFrame -> toDF
      - Dsl -> functions package
      - addColumn -> withColumn
      - renameColumn -> withColumnRenamed
      - add toDF functions to RDD on SQLContext init
      - add flatMap to DataFrame
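The "add toDF functions to RDD on SQLContext init" item above can be sketched in pure Python (class names are stand-ins for illustration, not the real PySpark classes): creating the context attaches a `toDF` method to the RDD class.

```python
class RDD:
    def __init__(self, rows):
        self.rows = rows

class DataFrame:
    def __init__(self, rows):
        self.rows = rows

class SQLContext:
    def __init__(self):
        # attach toDF to every RDD once a SQLContext exists,
        # mirroring the pattern described above
        ctx = self
        RDD.toDF = lambda rdd: ctx.createDataFrame(rdd)

    def createDataFrame(self, rdd):
        return DataFrame(rdd.rows)
```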
      
      Author: Reynold Xin <rxin@databricks.com>
      Author: Davies Liu <davies@databricks.com>
      
      Closes #4556 from rxin/SPARK-5752 and squashes the following commits:
      
      5ef9910 [Reynold Xin] More fix
      61d3fca [Reynold Xin] Merge branch 'df5' of github.com:davies/spark into SPARK-5752
      ff5832c [Reynold Xin] Fix python
      749c675 [Reynold Xin] count(*) fixes.
      5806df0 [Reynold Xin] Fix build break again.
      d941f3d [Reynold Xin] Fixed explode compilation break.
      fe1267a [Davies Liu] flatMap
      c4afb8e [Reynold Xin] style
      d9de47f [Davies Liu] add comment
      b783994 [Davies Liu] add comment for toDF
      e2154e5 [Davies Liu] schema() -> schema
      3a1004f [Davies Liu] Dsl -> functions, toDF()
      fb256af [Reynold Xin] - toDataFrame -> toDF - Dsl -> functions - implicits moved into SQLContext.implicits - addColumn -> withColumn - renameColumn -> withColumnRenamed
      0dd74eb [Reynold Xin] [SPARK-5752][SQL] Don't implicitly convert RDDs directly to DataFrames
      97dd47c [Davies Liu] fix mistake
      6168f74 [Davies Liu] fix test
      1fc0199 [Davies Liu] fix test
      a075cd5 [Davies Liu] clean up, toPandas
      663d314 [Davies Liu] add test for agg('*')
      9e214d5 [Reynold Xin] count(*) fixes.
      1ed7136 [Reynold Xin] Fix build break again.
      921b2e3 [Reynold Xin] Fixed explode compilation break.
      14698d4 [Davies Liu] flatMap
      ba3e12d [Reynold Xin] style
      d08c92d [Davies Liu] add comment
      5c8b524 [Davies Liu] add comment for toDF
      a4e5e66 [Davies Liu] schema() -> schema
      d377fc9 [Davies Liu] Dsl -> functions, toDF()
      6b3086c [Reynold Xin] - toDataFrame -> toDF - Dsl -> functions - implicits moved into SQLContext.implicits - addColumn -> withColumn - renameColumn -> withColumnRenamed
      807e8b1 [Reynold Xin] [SPARK-5752][SQL] Don't implicitly convert RDDs directly to DataFrames
      e98dfe62
  3. Feb 13, 2015
    • Sean Owen's avatar
      SPARK-3290 [GRAPHX] No unpersist calls in SVDPlusPlus · 0ce4e430
      Sean Owen authored
      This just unpersist()s each RDD in this code that was cache()ed.
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #4234 from srowen/SPARK-3290 and squashes the following commits:
      
      66c1e11 [Sean Owen] unpersist() each RDD that was cache()ed
      0ce4e430
    • Josh Rosen's avatar
      [SPARK-5227] [SPARK-5679] Disable FileSystem cache in WholeTextFileRecordReaderSuite · d06d5ee9
      Josh Rosen authored
      This patch fixes two difficult-to-reproduce Jenkins test failures in InputOutputMetricsSuite (SPARK-5227 and SPARK-5679).  The problem was that WholeTextFileRecordReaderSuite modifies the `fs.local.block.size` Hadoop configuration and this change was affecting subsequent test suites due to Hadoop's caching of FileSystem instances (see HADOOP-8490 for more details).
      
      The fix implemented here is to disable FileSystem caching in WholeTextFileRecordReaderSuite.
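The failure mode can be illustrated with a toy cache in Python (the names and the config key are invented for the example; the real mechanism is Hadoop's FileSystem cache described in HADOOP-8490): the cache key ignores configuration, so the first caller's settings leak into later callers.

```python
_cache = {}

class FileSystem:
    def __init__(self, conf):
        self.conf = dict(conf)

def get_filesystem(scheme, conf):
    # the suite's fix amounts to flipping a "disable cache" switch
    if conf.get("fs.local.impl.disable.cache"):
        return FileSystem(conf)  # bypass the cache: fresh instance, fresh conf
    if scheme not in _cache:
        _cache[scheme] = FileSystem(conf)
    return _cache[scheme]        # cached instance keeps the FIRST caller's conf
```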
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #4599 from JoshRosen/inputoutputsuite-fix and squashes the following commits:
      
      47dc447 [Josh Rosen] [SPARK-5227] [SPARK-5679] Disable FileSystem cache in WholeTextFileRecordReaderSuite
      d06d5ee9
    • Xiangrui Meng's avatar
      [SPARK-5730][ML] add doc groups to spark.ml components · 4f4c6d5a
      Xiangrui Meng authored
      This PR adds three groups to the ScalaDoc: `param`, `setParam`, and `getParam`. Params will show up in the generated Scala API doc as the top group. Setters/getters will be at the bottom.
      
      Preview:
      
      ![screen shot 2015-02-13 at 2 47 49 pm](https://cloud.githubusercontent.com/assets/829644/6196657/5740c240-b38f-11e4-94bb-bd8ef5a796c5.png)
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #4600 from mengxr/SPARK-5730 and squashes the following commits:
      
      febed9a [Xiangrui Meng] add doc groups to spark.ml components
      4f4c6d5a
    • Xiangrui Meng's avatar
      [SPARK-5803][MLLIB] use ArrayBuilder to build primitive arrays · d50a91d5
      Xiangrui Meng authored
      because ArrayBuffer is not specialized.
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #4594 from mengxr/SPARK-5803 and squashes the following commits:
      
      1261bd5 [Xiangrui Meng] merge master
      a4ea872 [Xiangrui Meng] use ArrayBuilder to build primitive arrays
      d50a91d5
    • Xiangrui Meng's avatar
      [SPARK-5806] re-organize sections in mllib-clustering.md · cc56c872
      Xiangrui Meng authored
      Put example code close to the algorithm description.
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #4598 from mengxr/SPARK-5806 and squashes the following commits:
      
      a137872 [Xiangrui Meng] re-organize sections in mllib-clustering.md
      cc56c872
    • Yin Huai's avatar
      [SPARK-5789][SQL] Throw a better error message if JsonRDD.parseJson encounters... · 2e0c0845
      Yin Huai authored
      [SPARK-5789][SQL]Throw a better error message if JsonRDD.parseJson encounters unrecoverable parsing errors.
      
      Author: Yin Huai <yhuai@databricks.com>
      
      Closes #4582 from yhuai/jsonErrorMessage and squashes the following commits:
      
      152dbd4 [Yin Huai] Update error message.
      1466256 [Yin Huai] Throw a better error message when a JSON object in the input dataset span multiple records (lines for files or strings for an RDD of strings).
      2e0c0845
    • Daoyuan Wang's avatar
      [SPARK-5642] [SQL] Apply column pruning on unused aggregation fields · 2cbb3e43
      Daoyuan Wang authored
      select k from (select key k, max(value) v from src group by k) t
      
      Author: Daoyuan Wang <daoyuan.wang@intel.com>
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #4415 from adrian-wang/groupprune and squashes the following commits:
      
      5d2d8a3 [Daoyuan Wang] address Michael's comments
      61f8ef7 [Daoyuan Wang] add a unit test
      80ddcc6 [Daoyuan Wang] keep project
      b69d385 [Daoyuan Wang] add a prune rule for grouping set
      2cbb3e43
    • Andrew Or's avatar
      5d3cc6b3
    • Reynold Xin's avatar
      [HOTFIX] Ignore DirectKafkaStreamSuite. · 378c7eb0
      Reynold Xin authored
      378c7eb0
    • Emre Sevinç's avatar
      SPARK-5805 Fixed the type error in documentation. · 9f31db06
      Emre Sevinç authored
      Fixes SPARK-5805 : Fix the type error in the final example given in MLlib - Clustering documentation.
      
      Author: Emre Sevinç <emre.sevinc@gmail.com>
      
      Closes #4596 from emres/SPARK-5805 and squashes the following commits:
      
      1029f66 [Emre Sevinç] SPARK-5805 Fixed the type error in documentation.
      9f31db06
    • Josh Rosen's avatar
      [SPARK-5735] Replace uses of EasyMock with Mockito · 077eec2d
      Josh Rosen authored
      This patch replaces all uses of EasyMock with Mockito.  There are two motivations for this:
      
      1. We should use a single mocking framework in our tests in order to keep things consistent.
      2. EasyMock may be responsible for non-deterministic unit test failures due to its Objensis dependency (see SPARK-5626 for more details).
      
      Most of these changes are fairly mechanical translations of EasyMock code to Mockito, although I made a small change that strengthens the assertions in one test in KinesisReceiverSuite.
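Mockito's stub-run-verify flow (as opposed to EasyMock's record-replay-verify flow) has a close analogue in Python's `unittest.mock`, shown here as a hedged illustration (the `cache_manager` object and its `get` method are invented):

```python
from unittest import mock

cache_manager = mock.Mock()
cache_manager.get.return_value = None          # stub up front; no replay() step
assert cache_manager.get("block-1") is None    # exercise the code under test
cache_manager.get.assert_called_once_with("block-1")  # verify after the fact
```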
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #4578 from JoshRosen/SPARK-5735-remove-easymock and squashes the following commits:
      
      0ab192b [Josh Rosen] Import sorting plus two minor changes to more closely match old semantics.
      977565b [Josh Rosen] Remove EasyMock from build.
      fae1d8f [Josh Rosen] Remove EasyMock usage in KinesisReceiverSuite.
      7cca486 [Josh Rosen] Remove EasyMock usage in MesosSchedulerBackendSuite
      fc5e94d [Josh Rosen] Remove EasyMock in CacheManagerSuite
      077eec2d
    • Ryan Williams's avatar
      [SPARK-5783] Better eventlog-parsing error messages · fc6d3e79
      Ryan Williams authored
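A hedged sketch of the improvement in this commit series (function name and message format are illustrative): include the file name and line number when event-log parsing fails.

```python
import json

def parse_event_log(lines, filename="<unknown>"):
    """Parse one JSON event per line, failing with file/line context."""
    events = []
    for lineno, line in enumerate(lines, start=1):
        try:
            events.append(json.loads(line))
        except ValueError:
            # surface WHERE the parse failed, not just that it failed
            raise ValueError("Malformed line #%d of %s: %r"
                             % (lineno, filename, line.strip()))
    return events
```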
      Author: Ryan Williams <ryan.blake.williams@gmail.com>
      
      Closes #4573 from ryan-williams/history and squashes the following commits:
      
      a8647ec [Ryan Williams] fix test calls to .replay()
      98aa3fe [Ryan Williams] include filename in history-parsing error message
      8deecf0 [Ryan Williams] add line number to history-parsing error message
      b668b52 [Ryan Williams] add log info line to history-eventlog parsing
      fc6d3e79
    • sboeschhuawei's avatar
      [SPARK-5503][MLLIB] Example code for Power Iteration Clustering · e1a1ff81
      sboeschhuawei authored
      Author: sboeschhuawei <stephen.boesch@huawei.com>
      
      Closes #4495 from javadba/picexamples and squashes the following commits:
      
      3c84b14 [sboeschhuawei] PIC Examples updates from Xiangrui's comments round 5
      2878675 [sboeschhuawei] Fourth round with xiangrui on PICExample
      d7ac350 [sboeschhuawei] Updates to PICExample from Xiangrui's comments round 3
      d7f0cba [sboeschhuawei] Updates to PICExample from Xiangrui's comments round 3
      cef28f4 [sboeschhuawei] Further updates to PICExample from Xiangrui's comments
      f7ff43d [sboeschhuawei] Update to PICExample from Xiangrui's comments
      efeec45 [sboeschhuawei] Update to PICExample from Xiangrui's comments
      03e8de4 [sboeschhuawei] Added PICExample
      c509130 [sboeschhuawei] placeholder for pic examples
      5864d4a [sboeschhuawei] placeholder for pic examples
      e1a1ff81
    • uncleGen's avatar
      [SPARK-5732][CORE]:Add an option to print the spark version in spark script. · c0ccd256
      uncleGen authored
      We may want to add an option to print the Spark version in the spark scripts; this is pretty common in command-line tools.
      ![9](https://cloud.githubusercontent.com/assets/7402327/6183331/cab1b74e-b38e-11e4-9daa-e26e6015cff3.JPG)
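The actual spark scripts are shell, but the shape of a version flag can be sketched in Python's argparse (the program name and version string below are invented for the example):

```python
import argparse

parser = argparse.ArgumentParser(prog="spark-shell")
# action="version" prints the string and exits with status 0
parser.add_argument("--version", action="version",
                    version="Welcome to Spark version 1.3.0-SNAPSHOT")
```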
      
      Author: uncleGen <hustyugm@gmail.com>
      Author: genmao.ygm <genmao.ygm@alibaba-inc.com>
      
      Closes #4522 from uncleGen/master-clean-150211 and squashes the following commits:
      
      9f2127c [genmao.ygm] revert the behavior of "-v"
      015ddee [uncleGen] minor changes
      463f02c [uncleGen] minor changes
      c0ccd256
    • WangTaoTheTonic's avatar
      [SPARK-4832][Deploy]some other processes might take the daemon pid · 1768bd51
      WangTaoTheTonic authored
      Some other process might be using the PID saved in the PID file. In that case we should ignore it and launch the daemons anyway.
      
      JIRA is down for maintenance. I will file an issue once it returns.
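The real fix lives in the bash daemon scripts; as a hedged illustration, the underlying liveness check is the POSIX `kill -0` idiom, here in Python:

```python
import os

def pid_is_live(pid):
    """Return True if some process currently holds this PID (POSIX only).
    If it is live but is not our daemon, the scripts should ignore the
    stale PID file and start a new daemon anyway."""
    try:
        os.kill(pid, 0)          # signal 0 checks existence, delivers nothing
    except ProcessLookupError:
        return False
    except PermissionError:
        return True              # exists, but owned by another user
    return True
```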
      
      Author: WangTaoTheTonic <barneystinson@aliyun.com>
      Author: WangTaoTheTonic <wangtao111@huawei.com>
      
      Closes #3683 from WangTaoTheTonic/otherproc and squashes the following commits:
      
      daa86a1 [WangTaoTheTonic] some bash style fix
      8befee7 [WangTaoTheTonic] handle the mistake scenario
      cf4ecc6 [WangTaoTheTonic] remove redundant condition
      f36cfb4 [WangTaoTheTonic] some other processes might take the pid
      1768bd51
    • tianyi's avatar
      [SPARK-3365][SQL]Wrong schema generated for List type · 1c8633f3
      tianyi authored
      This PR fixes SPARK-3365. The cause is that Spark generated the wrong schema for the type `List` in `ScalaReflection.scala`.
      for example:
      
      the generated schema for type `Seq[String]` is:
      ```
      {"name":"x","type":{"type":"array","elementType":"string","containsNull":true},"nullable":true,"metadata":{}}
      ```
      
      the generated schema for type `List[String]` is:
      ```
      {"name":"x","type":{"type":"struct","fields":[]},"nullable":true,"metadata":{}}
      ```
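The fix reorders the type-matching rules so the collection case is tried before the generic fallback. A toy string-based illustration in Python (not the real ScalaReflection logic): `List[T]` is a `Seq[T]`, so it must hit the sequence rule and never fall through to the empty-struct case.

```python
def schema_for(scala_type):
    # check the more specific collection rule before the generic fallback
    if scala_type.startswith(("Seq[", "List[")):
        element = scala_type[scala_type.index("[") + 1:-1]
        return {"type": "array", "elementType": element.lower(),
                "containsNull": True}
    # generic fallback: an unrecognized type becomes an empty struct,
    # which is exactly the wrong schema the bug report shows for List
    return {"type": "struct", "fields": []}
```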
      
      Author: tianyi <tianyi.asiainfo@gmail.com>
      
      Closes #4581 from tianyi/SPARK-3365 and squashes the following commits:
      
      a097e86 [tianyi] change the order of resolution in ScalaReflection.scala
      1c8633f3
  4. Feb 12, 2015
    • Yin Huai's avatar
      [SQL] Fix docs of SQLContext.tables · 2aea892e
      Yin Huai authored
      Author: Yin Huai <yhuai@databricks.com>
      
      Closes #4579 from yhuai/tablesDoc and squashes the following commits:
      
      7f8964c [Yin Huai] Fix doc.
      2aea892e
    • Yin Huai's avatar
      [SPARK-3299][SQL]Public API in SQLContext to list tables · 1d0596a1
      Yin Huai authored
      https://issues.apache.org/jira/browse/SPARK-3299
      
      Author: Yin Huai <yhuai@databricks.com>
      
      Closes #4547 from yhuai/tables and squashes the following commits:
      
      6c8f92e [Yin Huai] Add tableNames.
      acbb281 [Yin Huai] Update Python test.
      7793dcb [Yin Huai] Fix scala test.
      572870d [Yin Huai] Address comments.
      aba2e88 [Yin Huai] Format.
      12c86df [Yin Huai] Add tables() to SQLContext to return a DataFrame containing existing tables.
      1d0596a1
    • Yin Huai's avatar
      [SQL] Move SaveMode to SQL package. · c025a468
      Yin Huai authored
      Author: Yin Huai <yhuai@databricks.com>
      
      Closes #4542 from yhuai/moveSaveMode and squashes the following commits:
      
      65a4425 [Yin Huai] Move SaveMode to sql package.
      c025a468
    • Vladimir Grigor's avatar
      [SPARK-5335] Fix deletion of security groups within a VPC · ada993e9
      Vladimir Grigor authored
      Please see https://issues.apache.org/jira/browse/SPARK-5335.
      
      The fix itself is in commit e58a8b01a8bedcbfbbc6d04b1c1489255865cf87. The two earlier commits fix another VPC-related bug and are waiting to be merged. I should have created the former bug fix in its own branch; then this fix would not carry those earlier commits. :(
      
      This code is released under the project's license.
      
      Author: Vladimir Grigor <vladimir@kiosked.com>
      Author: Vladimir Grigor <vladimir@voukka.com>
      
      Closes #4122 from voukka/SPARK-5335_delete_sg_vpc and squashes the following commits:
      
      090dca9 [Vladimir Grigor] fixes as per review: removed printing of group_id and added comment
      730ec05 [Vladimir Grigor] fix for SPARK-5335: Destroying cluster in VPC with "--delete-groups" fails to remove security groups
      ada993e9
    • Daoyuan Wang's avatar
      [SPARK-5755] [SQL] remove unnecessary Add · d5fc5149
      Daoyuan Wang authored
          explain extended select +key from src;
      before:
      == Parsed Logical Plan ==
      'Project [(0 + 'key) AS _c0#8]
       'UnresolvedRelation [src], None
      
      == Analyzed Logical Plan ==
      Project [(0 + key#10) AS _c0#8]
       MetastoreRelation test, src, None
      
      == Optimized Logical Plan ==
      Project [(0 + key#10) AS _c0#8]
       MetastoreRelation test, src, None
      
      == Physical Plan ==
      Project [(0 + key#10) AS _c0#8]
       HiveTableScan [key#10], (MetastoreRelation test, src, None), None
      
      after this patch:
      == Parsed Logical Plan ==
      'Project ['key]
       'UnresolvedRelation [src], None
      
      == Analyzed Logical Plan ==
      Project [key#10]
       MetastoreRelation test, src, None
      
      == Optimized Logical Plan ==
      Project [key#10]
       MetastoreRelation test, src, None
      
      == Physical Plan ==
      HiveTableScan [key#10], (MetastoreRelation test, src, None), None
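The plans above show that the optimizer rule amounts to rewriting the additive identity away (`0 + 'key` becomes `'key`). A hedged toy version over tuple-based expression trees:

```python
def remove_unary_positive(expr):
    """Rewrite ("+", 0, x) to x, recursively (toy expression trees)."""
    if isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "+":
        left = remove_unary_positive(expr[1])
        right = remove_unary_positive(expr[2])
        if left == 0:            # additive identity: 0 + x == x
            return right
        return ("+", left, right)
    return expr
```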
      
      Author: Daoyuan Wang <daoyuan.wang@intel.com>
      
      Closes #4551 from adrian-wang/positive and squashes the following commits:
      
      0821ae4 [Daoyuan Wang] remove unnecessary Add
      d5fc5149
    • Michael Armbrust's avatar
      [SPARK-5573][SQL] Add explode to dataframes · ee04a8b1
      Michael Armbrust authored
      Author: Michael Armbrust <michael@databricks.com>
      
      Closes #4546 from marmbrus/explode and squashes the following commits:
      
      eefd33a [Michael Armbrust] whitespace
      a8d496c [Michael Armbrust] Merge remote-tracking branch 'apache/master' into explode
      4af740e [Michael Armbrust] Merge remote-tracking branch 'origin/master' into explode
      dc86a5c [Michael Armbrust] simple version
      d633d01 [Michael Armbrust] add scala specific
      950707a [Michael Armbrust] fix comments
      ba8854c [Michael Armbrust] [SPARK-5573][SQL] Add explode to dataframes
      ee04a8b1
    • Yin Huai's avatar
      [SPARK-5758][SQL] Use LongType as the default type for integers in JSON schema inference. · c352ffbd
      Yin Huai authored
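The rationale can be sketched as follows (type names after Spark SQL's types; the dispatch logic is an illustration, not the real inference code): integers default to LongType so values larger than 2**31 - 1 don't overflow an IntegerType column.

```python
def infer_json_type(value):
    if isinstance(value, bool):      # bool is an int subclass; check it first
        return "BooleanType"
    if isinstance(value, int):
        return "LongType"            # default wide enough for any JSON integer
    if isinstance(value, float):
        return "DoubleType"
    if value is None:
        return "NullType"
    return "StringType"
```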
      Author: Yin Huai <yhuai@databricks.com>
      
      Closes #4544 from yhuai/jsonUseLongTypeByDefault and squashes the following commits:
      
      6e2ffc2 [Yin Huai] Use LongType as the default type for integers in JSON schema inference.
      c352ffbd
    • Davies Liu's avatar
      [SPARK-5780] [PySpark] Mute the logging during unit tests · 0bf03158
      Davies Liu authored
      There is a bunch of logging coming from the driver and workers; it is noisy and scary, and full of exceptions, so people are confused about whether the tests are failing or not.
      
      This PR mutes the logging during tests, showing it only if a test fails.
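A hedged sketch of the buffering idea (the real change is in PySpark's test harness; the function below is invented): capture log output while a test runs and surface it only on failure.

```python
import io
import logging

def run_muted(test_fn):
    """Run test_fn with logging buffered; return (passed, captured_logs)."""
    buf = io.StringIO()
    root = logging.getLogger()
    old_handlers = root.handlers[:]
    root.handlers = [logging.StreamHandler(buf)]
    try:
        test_fn()
        return True, ""                # passed: discard the noise
    except Exception:
        return False, buf.getvalue()   # failed: show what was logged
    finally:
        root.handlers = old_handlers   # always restore the real handlers
```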
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #4572 from davies/mute and squashes the following commits:
      
      1e9069c [Davies Liu] mute the logging during python tests
      0bf03158
    • David Y. Ross's avatar
      SPARK-5747: Fix wordsplitting bugs in make-distribution.sh · 26c816e7
      David Y. Ross authored
      The `$MVN` command variable may have spaces, so when referring to it, must wrap in quotes.
      
      Author: David Y. Ross <dyross@gmail.com>
      
      Closes #4540 from dyross/dyr-fix-make-distribution2 and squashes the following commits:
      
      5a41596 [David Y. Ross] SPARK-5747: Fix wordsplitting bugs in make-distribution.sh
      26c816e7
    • lianhuiwang's avatar
      [SPARK-5759][Yarn]ExecutorRunnable should catch YarnException while NMClient start contain... · 947b8bd8
      lianhuiwang authored
      Sometimes, for various reasons, NMClient throws an exception while starting containers. For example, if spark_shuffle is not configured on some machines, it throws:
      java.lang.Error: org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:spark_shuffle does not exist.
      Because YarnAllocator uses a ThreadPoolExecutor to start containers, we cannot tell which container or hostname threw the exception. We should catch YarnException in ExecutorRunnable when starting a container; then, when one fails, we know the container ID and hostname of the failed container.
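The wrapping can be sketched like this (RuntimeError stands in for YarnException/SparkException; all names are illustrative): catch the launch failure where the container identity is known and re-raise with that context attached.

```python
def start_container(container_id, host, launch):
    """Launch a container, annotating any failure with its id and host."""
    try:
        launch()
    except RuntimeError as exc:
        # without this wrapper, the thread pool swallows the context
        raise RuntimeError("Failed to start container %s on host %s: %s"
                           % (container_id, host, exc))
```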
      
      Author: lianhuiwang <lianhuiwang09@gmail.com>
      
      Closes #4554 from lianhuiwang/SPARK-5759 and squashes the following commits:
      
      caf5a99 [lianhuiwang] use SparkException to warp exception
      c02140f [lianhuiwang] ExecutorRunnable should catch YarnException while NMClient start container
      947b8bd8
    • Andrew Or's avatar
      [SPARK-5760][SPARK-5761] Fix standalone rest protocol corner cases + revamp tests · 1d5663e9
      Andrew Or authored
      The changes are summarized in the commit message. Test or test-related code accounts for 90% of the lines changed.
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #4557 from andrewor14/rest-tests and squashes the following commits:
      
      b4dc980 [Andrew Or] Merge branch 'master' of github.com:apache/spark into rest-tests
      b55e40f [Andrew Or] Add test for unknown fields
      cc96993 [Andrew Or] private[spark] -> private[rest]
      578cf45 [Andrew Or] Clean up test code a little
      d82d971 [Andrew Or] v1 -> serverVersion
      ea48f65 [Andrew Or] Merge branch 'master' of github.com:apache/spark into rest-tests
      00999a8 [Andrew Or] Revamp tests + fix a few corner cases
      1d5663e9
    • Kay Ousterhout's avatar
      [SPARK-5762] Fix shuffle write time for sort-based shuffle · 47c73d41
      Kay Ousterhout authored
      mateiz, was excluding the time to write this final file from the shuffle write time intentional?
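The shape of the fix, judging from the commit list ("Use nano time", "Moved metrics to finally block"), can be sketched in Python (names invented): time the write with a monotonic nanosecond clock and record it in a finally block so it counts even if the write raises.

```python
import time

def timed_write(write_fn, metrics):
    """Run write_fn, accumulating its duration into the write-time metric."""
    start = time.perf_counter_ns()
    try:
        return write_fn()
    finally:
        metrics["shuffle_write_time_ns"] = (
            metrics.get("shuffle_write_time_ns", 0)
            + time.perf_counter_ns() - start)
```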
      
      Author: Kay Ousterhout <kayousterhout@gmail.com>
      
      Closes #4559 from kayousterhout/SPARK-5762 and squashes the following commits:
      
      5c6f3d9 [Kay Ousterhout] Use foreach
      94e4237 [Kay Ousterhout] Removed open time metrics added inadvertently
      ace156c [Kay Ousterhout] Moved metrics to finally block
      d773276 [Kay Ousterhout] Use nano time
      5a59906 [Kay Ousterhout] [SPARK-5762] Fix shuffle write time for sort-based shuffle
      47c73d41
    • Venkata Ramana Gollamudi's avatar
      [SPARK-5765][Examples]Fixed word split problem in run-example and compute-classpath · 629d0143
      Venkata Ramana Gollamudi authored
      Author: Venkata Ramana G <ramana.gollamudi@huawei.com>
      
      Author: Venkata Ramana Gollamudi <ramana.gollamudi@huawei.com>
      
      Closes #4561 from gvramana/word_split and squashes the following commits:
      
      285c8d4 [Venkata Ramana Gollamudi] Fixed word split problem in run-example and compute-classpath
      629d0143
    • Katsunori Kanda's avatar
      [EC2] Update default Spark version to 1.2.1 · 9c807650
      Katsunori Kanda authored
      Author: Katsunori Kanda <potix2@gmail.com>
      
      Closes #4566 from potix2/ec2-update-version-1-2-1 and squashes the following commits:
      
      77e7840 [Katsunori Kanda] [EC2] Update default Spark version to 1.2.1
      9c807650
    • Kay Ousterhout's avatar
      [SPARK-5645] Added local read bytes/time to task metrics · 893d6fd7
      Kay Ousterhout authored
      ksakellis I stumbled on your JIRA for this yesterday; I know it's assigned to you but I'd already done this for my own uses a while ago so thought I could help save you the work of doing it!  Hopefully this doesn't duplicate any work you've already done.
      
      Here's a screenshot of what the UI looks like:
      ![image](https://cloud.githubusercontent.com/assets/1108612/6135352/c03e7276-b11c-11e4-8f11-c6aefe1f35b9.png)
      Based on a discussion with pwendell, I put the data read remotely in as an additional metric rather than showing it in brackets as you'd suggested, Kostas.  The assumption here is that the average user doesn't care about the differentiation between local / remote data, so it's better not to pollute the UI.
      
      I also added data about the local read time, which I've found very helpful for debugging, but I didn't put it in the UI because I think it's probably something not a ton of people will need to use.
      
      With this change, the total read time and total write time shown in the UI will be equal, fixing a long-term source of user confusion:
      ![image](https://cloud.githubusercontent.com/assets/1108612/6135399/25f14490-b11d-11e4-8086-20be5f4002e6.png)
      
      Author: Kay Ousterhout <kayousterhout@gmail.com>
      
      Closes #4510 from kayousterhout/SPARK-5645 and squashes the following commits:
      
      4a0182c [Kay Ousterhout] oops
      5f5da1b [Kay Ousterhout] Small style fix
      5da04cf [Kay Ousterhout] Addressed more comments from Kostas
      ba05149 [Kay Ousterhout] Remove parens
      a9dc685 [Kay Ousterhout] Kostas comment, test fix
      33d2e2d [Kay Ousterhout] Merge remote-tracking branch 'upstream/master' into SPARK-5645
      347e2cd [Kay Ousterhout] [SPARK-5645] Added local read bytes/time to task metrics
      893d6fd7