  Dec 16, 2015
    • [SPARK-12390] Clean up unused serializer parameter in BlockManager · 97678ede
      Andrew Or authored
      No change in functionality is intended; this only changes an internal API.
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #10343 from andrewor14/clean-bm-serializer.
    • [SPARK-12386][CORE] Fix NPE when spark.executor.port is set. · d1508dd9
      Marcelo Vanzin authored
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #10339 from vanzin/SPARK-12386.
    • [SPARK-12186][WEB UI] Send the complete request URI including the query string when redirecting. · fdb38227
      Rohit Agarwal authored
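
      A sketch of the shape of the fix (illustrative servlet code, not the actual patch):

      ```scala
      import javax.servlet.http.{HttpServletRequest, HttpServletResponse}

      object RedirectSketch {
        // Build the redirect target from the path *and* the query string,
        // instead of redirecting to the bare path and dropping the query.
        def redirect(req: HttpServletRequest, resp: HttpServletResponse): Unit = {
          val query = Option(req.getQueryString).map("?" + _).getOrElse("")
          resp.sendRedirect(req.getRequestURI + query)
        }
      }
      ```
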
      Author: Rohit Agarwal <rohita@qubole.com>
      
      Closes #10180 from mindprince/SPARK-12186.
    • [SPARK-12365][CORE] Use ShutdownHookManager where Runtime.getRuntime.addShutdownHook() is called · f590178d
      tedyu authored
      SPARK-9886 fixed ExternalBlockStore.scala.
      
      This PR fixes the remaining references to Runtime.getRuntime.addShutdownHook().
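
      For illustration, the before/after pattern (a sketch; `cleanup` is a placeholder, and `ShutdownHookManager` is Spark's internal `org.apache.spark.util.ShutdownHookManager`):

      ```scala
      import org.apache.spark.util.ShutdownHookManager

      object HookSketch {
        def cleanup(): Unit = println("cleaning up")

        // Before: a raw JVM hook that Spark cannot order or cancel.
        Runtime.getRuntime.addShutdownHook(new Thread() {
          override def run(): Unit = cleanup()
        })

        // After: register through Spark's ShutdownHookManager, which runs
        // hooks in priority order during shutdown.
        ShutdownHookManager.addShutdownHook(() => cleanup())
      }
      ```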
      
      Author: tedyu <yuzhihong@gmail.com>
      
      Closes #10325 from ted-yu/master.
    • [SPARK-10248][CORE] track exceptions in dagscheduler event loop in tests · 38d9795a
      Imran Rashid authored
      `DAGSchedulerEventLoop` normally only logs errors (so it can continue to process events from other jobs).  However, this is not desirable in the tests: the tests should be able to easily detect any exception, and they shouldn't silently succeed when one occurs.
      
      This was suggested by mateiz on https://github.com/apache/spark/pull/7699.  It may have already turned up an issue in "zero split job".
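
      A minimal sketch of the idea (names are illustrative, not the actual test code): the test subclass overrides the error handler so exceptions are recorded for assertions instead of merely logged.

      ```scala
      import scala.collection.mutable

      // Production-style loop: log the error and keep processing, so one bad
      // job cannot take down the scheduler's event loop.
      class LoggingEventLoop(handle: String => Unit) {
        protected def onError(e: Throwable): Unit = println(s"dropped: $e")
        def post(event: String): Unit =
          try handle(event) catch { case e: Throwable => onError(e) }
      }

      // Test loop: record every exception so the suite can assert none occurred.
      class TestEventLoop(handle: String => Unit) extends LoggingEventLoop(handle) {
        val failures = mutable.Buffer.empty[Throwable]
        override protected def onError(e: Throwable): Unit = failures += e
      }
      ```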
      
      Author: Imran Rashid <irashid@cloudera.com>
      
      Closes #8466 from squito/SPARK-10248.
    • MAINTENANCE: Automated closing of pull requests. · ce5fd400
      Andrew Or authored
      This commit exists to close the following pull requests on Github:
      
      Closes #1217 (requested by ankurdave, srowen)
      Closes #4650 (requested by andrewor14)
      Closes #5307 (requested by vanzin)
      Closes #5664 (requested by andrewor14)
      Closes #5713 (requested by marmbrus)
      Closes #5722 (requested by andrewor14)
      Closes #6685 (requested by srowen)
      Closes #7074 (requested by srowen)
      Closes #7119 (requested by andrewor14)
      Closes #7997 (requested by jkbradley)
      Closes #8292 (requested by srowen)
      Closes #8975 (requested by andrewor14, vanzin)
      Closes #8980 (requested by andrewor14, davies)
    • [MINOR] Add missing interpolation in NettyRPCEnv · 861549ac
      Andrew Or authored
      ```
      Exception in thread "main" org.apache.spark.rpc.RpcTimeoutException:
      Cannot receive any reply in ${timeout.duration}. This timeout is controlled by spark.rpc.askTimeout
      	at org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
      	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
      	at org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
      	at scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
      ```
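
      The root cause is a string that references `${timeout.duration}` without the `s` interpolator, so the placeholder is emitted literally. For illustration (a minimal sketch, not the NettyRPCEnv source):

      ```scala
      object InterpolationSketch extends App {
        case class Timeout(duration: String)
        val timeout = Timeout("120 seconds")

        // Without the `s` prefix the placeholder is printed literally (the bug):
        println("Cannot receive any reply in ${timeout.duration}.")
        // With it, the value is substituted (the one-character fix):
        println(s"Cannot receive any reply in ${timeout.duration}.")
      }
      ```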
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #10334 from andrewor14/rpc-typo.
    • [SPARK-12380] [PYSPARK] use SQLContext.getOrCreate in mllib · 27b98e99
      Davies Liu authored
      MLlib should use SQLContext.getOrCreate() instead of creating a new SQLContext.
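      
      The PR applies the Python equivalent of this pattern in `pyspark.mllib`; the Scala-side shape, for reference:
      
      ```scala
      import org.apache.spark.SparkContext
      import org.apache.spark.sql.SQLContext

      object ContextSketch {
        def dataFrameContext(sc: SparkContext): SQLContext = {
          // Before: val sqlContext = new SQLContext(sc)  -- a fresh context per call.
          // After: reuse the active context if one already exists.
          SQLContext.getOrCreate(sc)
        }
      }
      ```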
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #10338 from davies/create_context.
    • [SPARK-9690][ML][PYTHON] pyspark CrossValidator random seed · 3a44aebd
      Martin Menestret authored
      Extend CrossValidator with HasSeed in PySpark.
      
      This PR replaces https://github.com/apache/spark/pull/7997.
      
      CC: yanboliang thunterdb mmenestret  Would one of you mind taking a look?  Thanks!
      
      Author: Joseph K. Bradley <joseph@databricks.com>
      Author: Martin MENESTRET <mmenestret@ippon.fr>
      
      Closes #10268 from jkbradley/pyspark-cv-seed.
    • [SPARK-11677][SQL] ORC filter tests all pass if filters are actually not pushed down. · 9657ee87
      hyukjinkwon authored
      Currently the ORC filters are not tested properly: all the tests pass even if the filters are not pushed down or are disabled. In this PR, I add logic for this.
      Since ORC does not fully filter record by record, the tests check the count of the results and whether they contain the expected values.
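      
      A hypothetical sketch of the stricter check (assuming `df` is a DataFrame read back from an ORC file with an integer column `a`; the real suite differs):
      
      ```scala
      // Because ORC push-down prunes at stripe granularity rather than strictly
      // record by record, assert on the collected results, not just on success.
      val rows = df.filter(df("a") === 1).collect()
      assert(rows.length == 1, "exactly one row should survive the filter")
      assert(rows.head.getInt(0) == 1, "and it should carry the expected value")
      ```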
      
      Author: hyukjinkwon <gurwls223@gmail.com>
      
      Closes #9687 from HyukjinKwon/SPARK-11677.
    • [SPARK-12164][SQL] Decode the encoded values and then display · edf65cd9
      gatorsmile authored
      Based on the suggestions from marmbrus and cloud-fan in https://github.com/apache/spark/pull/10165, this PR prints the decoded values (user objects) in `Dataset.show`:
      ```scala
          implicit val kryoEncoder = Encoders.kryo[KryoClassData]
          val ds = Seq(KryoClassData("a", 1), KryoClassData("b", 2), KryoClassData("c", 3)).toDS()
          ds.show(20, false);
      ```
      The current output is like
      ```
      +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      |value                                                                                                                                                                                 |
      +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      |[1, 0, 111, 114, 103, 46, 97, 112, 97, 99, 104, 101, 46, 115, 112, 97, 114, 107, 46, 115, 113, 108, 46, 75, 114, 121, 111, 67, 108, 97, 115, 115, 68, 97, 116, -31, 1, 1, -126, 97, 2]|
      |[1, 0, 111, 114, 103, 46, 97, 112, 97, 99, 104, 101, 46, 115, 112, 97, 114, 107, 46, 115, 113, 108, 46, 75, 114, 121, 111, 67, 108, 97, 115, 115, 68, 97, 116, -31, 1, 1, -126, 98, 4]|
      |[1, 0, 111, 114, 103, 46, 97, 112, 97, 99, 104, 101, 46, 115, 112, 97, 114, 107, 46, 115, 113, 108, 46, 75, 114, 121, 111, 67, 108, 97, 115, 115, 68, 97, 116, -31, 1, 1, -126, 99, 6]|
      +--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
      ```
      After the fix, the output will look like the following, provided the user overrides the `toString` method in the class `KryoClassData`:
      ```scala
      override def toString: String = s"KryoClassData($a, $b)"
      ```
      ```
      +-------------------+
      |value              |
      +-------------------+
      |KryoClassData(a, 1)|
      |KryoClassData(b, 2)|
      |KryoClassData(c, 3)|
      +-------------------+
      ```
      
      If users do not override the `toString` method, the results will look like this:
      ```
      +---------------------------------------+
      |value                                  |
      +---------------------------------------+
      |org.apache.spark.sql.KryoClassData@68ef|
      |org.apache.spark.sql.KryoClassData@6915|
      |org.apache.spark.sql.KryoClassData@693b|
      +---------------------------------------+
      ```
      
      Question: should we add another optional parameter to `show` that decides whether it displays the hex values or the decoded object values?
      
      Author: gatorsmile <gatorsmile@gmail.com>
      
      Closes #10215 from gatorsmile/showDecodedValue.
    • [SPARK-12320][SQL] throw exception if the number of fields does not line up for Tuple encoder · a783a8ed
      Wenchen Fan authored
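
      For illustration, the kind of mismatch that should now fail fast (a hypothetical snippet, assuming the usual `sqlContext.implicits._` are in scope):

      ```scala
      import sqlContext.implicits._

      // Three columns cannot line up with Tuple2's two fields, so this should
      // now throw an informative exception instead of failing obscurely later.
      val ds = Seq(("a", 1, 2.0)).toDF("x", "y", "z").as[(String, Int)]
      ```
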
      Author: Wenchen Fan <wenchen@databricks.com>
      
      Closes #10293 from cloud-fan/err-msg.
    • [SPARK-12364][ML][SPARKR] Add ML example for SparkR · 1a8b2a17
      Yanbo Liang authored
      We have a DataFrame example for SparkR; we also need to add an ML example under ```examples/src/main/r```.
      
      cc mengxr jkbradley shivaram
      
      Author: Yanbo Liang <ybliang8@gmail.com>
      
      Closes #10324 from yanboliang/spark-12364.
    • [SPARK-11608][MLLIB][DOC] Added migration guide for MLlib 1.6 · 8148cc7a
      Joseph K. Bradley authored
      No known breaking changes, but some deprecations and changes of behavior.
      
      CC: mengxr
      
      Author: Joseph K. Bradley <joseph@databricks.com>
      
      Closes #10235 from jkbradley/mllib-guide-update-1.6.
    • [SPARK-12361][PYSPARK][TESTS] Should set PYSPARK_DRIVER_PYTHON before Python tests · 6a880afa
      Jeff Zhang authored
      Although this patch still doesn't solve the issue of why the return code is 0 (see the JIRA description), it resolves the Python version mismatch.
      
      Author: Jeff Zhang <zjffdu@apache.org>
      
      Closes #10322 from zjffdu/SPARK-12361.
    • [SPARK-12309][ML] Use sqlContext from MLlibTestSparkContext for spark.ml test suites · d252b2d5
      Yanbo Liang authored
      Use ```sqlContext``` from ```MLlibTestSparkContext``` rather than creating a new one in each spark.ml test suite. I have checked thoroughly and found four test cases that need to be updated.
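      
      The intended suite shape, as a sketch (the suite and column names are illustrative; `SparkFunSuite` and `MLlibTestSparkContext` are Spark's internal test helpers):
      
      ```scala
      // Mix in MLlibTestSparkContext and use the sqlContext it provides,
      // instead of constructing `new SQLContext(sc)` locally.
      class MyEstimatorSuite extends SparkFunSuite with MLlibTestSparkContext {
        test("fit produces a model") {
          val df = sqlContext.createDataFrame(Seq((1.0, 2.0))).toDF("label", "feature")
          // ... fit the estimator on df and assert on the model ...
        }
      }
      ```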
      
      cc mengxr jkbradley
      
      Author: Yanbo Liang <ybliang8@gmail.com>
      
      Closes #10279 from yanboliang/spark-12309.
    • [SPARK-9694][ML] Add random seed Param to Scala CrossValidator · 860dc7f2
      Yanbo Liang authored
      Add random seed Param to Scala CrossValidator
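      
      A usage sketch of the new Param (estimator, evaluator, and param-grid setup elided):
      
      ```scala
      import org.apache.spark.ml.tuning.CrossValidator

      // With a fixed seed, the random assignment of rows to folds is reproducible.
      val cv = new CrossValidator()
        .setNumFolds(3)
        .setSeed(42L)
      ```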
      
      Author: Yanbo Liang <ybliang8@gmail.com>
      
      Closes #9108 from yanboliang/spark-9694.
    • [SPARK-6518][MLLIB][EXAMPLE][DOC] Add example code and user guide for bisecting k-means · 7b6dc29d
      Yu ISHIKAWA authored
      This PR includes only the example code, in order to finish it quickly.
      I'll send another PR for the docs soon.
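      
      A condensed sketch of what the example demonstrates (assuming an existing SparkContext `sc`):
      
      ```scala
      import org.apache.spark.mllib.clustering.BisectingKMeans
      import org.apache.spark.mllib.linalg.Vectors

      // Two well-separated groups of one-dimensional points.
      val data = sc.parallelize(Seq(
        Vectors.dense(0.1), Vectors.dense(0.2), Vectors.dense(9.8), Vectors.dense(9.9)))
      val model = new BisectingKMeans().setK(2).run(data)
      model.clusterCenters.foreach(println)
      ```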
      
      Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
      
      Closes #9952 from yu-iskw/SPARK-6518.
    • [SPARK-12345][MESOS] Filter SPARK_HOME when submitting Spark jobs with Mesos cluster mode. · ad8c1f0b
      Timothy Chen authored
      SPARK_HOME is now causing a problem in Mesos cluster mode, since the spark-submit script was recently changed so that spark-class scripts take precedence in looking at SPARK_HOME when it is defined.
      
      We should skip passing SPARK_HOME from the Spark client in cluster mode with Mesos, since Mesos shouldn't use this configuration but should use spark.executor.home instead.
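      
      Conceptually (a sketch, not the actual scheduler code):
      
      ```scala
      // Filter SPARK_HOME out of the environment forwarded with the submission,
      // so executors resolve their Spark home from configuration rather than
      // from the client machine's filesystem layout.
      val forwardedEnv = sys.env.filterKeys(_ != "SPARK_HOME")
      ```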
      
      Author: Timothy Chen <tnachen@gmail.com>
      
      Closes #10332 from tnachen/scheduler_ui.
    • [SPARK-12215][ML][DOC] User guide section for KMeans in spark.ml · 26d70bd2
      Yu ISHIKAWA authored
      cc jkbradley
      
      Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
      
      Closes #10244 from yu-iskw/SPARK-12215.
    • [SPARK-12310][SPARKR] Add write.json and write.parquet for SparkR · 22f6cd86
      Yanbo Liang authored
      Add ```write.json``` and ```write.parquet``` for SparkR, and deprecate ```saveAsParquetFile```.
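      
      For reference, the Scala `DataFrameWriter` calls that the new SparkR functions mirror (a sketch, assuming an existing DataFrame `df`):
      
      ```scala
      df.write.json("/tmp/out.json")
      df.write.parquet("/tmp/out.parquet")
      ```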
      
      Author: Yanbo Liang <ybliang8@gmail.com>
      
      Closes #10281 from yanboliang/spark-12310.
    • [SPARK-12318][SPARKR] Save mode in SparkR should be error by default · 2eb5af5f
      Jeff Zhang authored
      shivaram, please help review.
      
      Author: Jeff Zhang <zjffdu@apache.org>
      
      Closes #10290 from zjffdu/SPARK-12318.
    • [SPARK-8745] [SQL] remove GenerateProjection · 54c512ba
      Davies Liu authored
      cc rxin
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #10316 from davies/remove_generate_projection.