  1. Aug 28, 2015
    • Marcelo Vanzin's avatar
      [SPARK-9284] [TESTS] Allow all tests to run without an assembly. · c53c902f
      Marcelo Vanzin authored
      This change aims at speeding up the dev cycle a little bit, by making
      sure that all tests behave the same w.r.t. where the code to be tested
      is loaded from. Namely, that means that tests don't rely on the assembly
      anymore, rather loading all needed classes from the build directories.
      
      The main change is to make sure all build directories (classes and test-classes)
      are added to the classpath of child processes when running tests.
      
      YarnClusterSuite required some custom code since the executors are run
      differently (i.e. not through the launcher library, like standalone and
      Mesos do).
      
      I also found a couple of tests that could leak a SparkContext on failure,
      and added code to handle those.
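
One common way to keep a test from leaking a context when it fails is a try/finally wrapper. The sketch below illustrates that general pattern only; `FakeContext`, `run_test`, and `failing_body` are hypothetical names, and this is not the code from the patch.

```python
class FakeContext:
    """Hypothetical stand-in for an expensive resource such as a SparkContext."""

    active = 0  # number of contexts that were started and not yet stopped

    def __init__(self):
        FakeContext.active += 1

    def stop(self):
        FakeContext.active -= 1


def run_test(body):
    """Run a test body against a fresh context and always stop the context."""
    ctx = FakeContext()
    try:
        body(ctx)
    finally:
        # Runs whether the body passes or raises, so the context never leaks.
        ctx.stop()


def failing_body(ctx):
    raise AssertionError("simulated test failure")


try:
    run_test(failing_body)
except AssertionError:
    pass  # the failure still propagates; only the cleanup is guaranteed
```

After the simulated failure, `FakeContext.active` is back to zero, which is the property a leak fix of this kind is after.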
      
      With this patch, it's possible to run the following command from a clean
      source directory and have all tests pass:
      
        mvn -Pyarn -Phadoop-2.4 -Phive-thriftserver install
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #7629 from vanzin/SPARK-9284.
    • Josh Rosen's avatar
      [SPARK-10325] Override hashCode() for public Row · d3f87dc3
      Josh Rosen authored
      This commit fixes an issue where the public SQL `Row` class did not override `hashCode`, causing it to violate the hashCode() + equals() contract. To fix this, I simply ported the `hashCode` implementation from the 1.4.x version of `Row`.
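
The contract at stake is that two objects that compare equal must also hash equal. The sketch below shows the same idea in Python, with `__eq__` and `__hash__` playing the roles of `equals()` and `hashCode()`; the `Row` class here is a hypothetical stand-in, not Spark's `Row` and not the implementation ported in this commit.

```python
class Row:
    """Hypothetical stand-in for a row of values; not Spark's Row class."""

    def __init__(self, *values):
        self._values = tuple(values)

    def __eq__(self, other):
        if not isinstance(other, Row):
            return NotImplemented
        return self._values == other._values

    def __hash__(self):
        # Derive the hash from the same fields that __eq__ compares,
        # so equal rows always produce equal hashes.
        return hash(self._values)


a = Row(1, "x")
b = Row(1, "x")
assert a == b and hash(a) == hash(b)

# Equal rows collapse to a single entry in hash-based collections.
unique_rows = {a, b}
```

In Java, a class that overrides `equals()` but not `hashCode()` keeps the identity-based hash, so equal objects can silently land in different hash buckets. Python instead makes such a class unhashable, so the analogy is about the contract rather than the failure mode.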
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #8500 from JoshRosen/SPARK-10325 and squashes the following commits:
      
      51ffea1 [Josh Rosen] Override hashCode() for public Row.
    • Luciano Resende's avatar
      [SPARK-8952] [SPARKR] - Wrap normalizePath calls with suppressWarnings · 499e8e15
      Luciano Resende authored
      This is based on davies' comment on SPARK-8952, which suggests calling normalizePath() only when the path starts with '~'.
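
The suggestion in the message, normalizing only paths that start with '~', can be sketched in Python as follows. This is an analogy, not the R change itself: `normalize_if_home_relative` is a hypothetical helper, and `os.path.expanduser` plus `os.path.abspath` stand in for R's `normalizePath()`.

```python
import os


def normalize_if_home_relative(path):
    """Expand the path only when it starts with '~'; otherwise return it unchanged."""
    if path.startswith("~"):
        return os.path.abspath(os.path.expanduser(path))
    # Remote URIs and other paths are passed through untouched,
    # so no local file-system lookup (and no warning) is triggered for them.
    return path


remote = normalize_if_home_relative("hdfs://host/data")
local = normalize_if_home_relative("~/data")
```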
      
      Author: Luciano Resende <lresende@apache.org>
      
      Closes #8343 from lresende/SPARK-8952.
    • Yuhao Yang's avatar
      [SPARK-9890] [DOC] [ML] User guide for CountVectorizer · e2a84309
      Yuhao Yang authored
      jira: https://issues.apache.org/jira/browse/SPARK-9890
      
      Document with Scala and Java examples.
      
      Author: Yuhao Yang <hhbyyh@gmail.com>
      
      Closes #8487 from hhbyyh/cvDoc.
    • jerryshao's avatar
      [YARN] [MINOR] Avoid hard code port number in YarnShuffleService test · 1502a0f6
      jerryshao authored
      The port number is currently fixed at the default (7337) in the test, which can cause a port contention exception; it is better to use a random port in the unit test.
      
      squito, it seems you're the author of this unit test; mind taking a look at this fix? Thanks a lot.
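
A related way to avoid a fixed port in tests is to let the operating system pick an unused one by binding to port 0. The sketch below shows that technique in Python; it is a swapped-in illustration rather than the approach taken in the patch, which the message describes as using a random number. `find_free_port` is a hypothetical helper name.

```python
import socket


def find_free_port():
    """Ask the OS for a currently unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind(("127.0.0.1", 0))
        # getsockname() returns (host, port); the port is the one the OS chose.
        return sock.getsockname()[1]


port = find_free_port()
```

There is a small window between closing the probe socket and the service binding the port, so another process could still take it; passing port 0 directly to the service under test avoids that window where the service supports it.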
      
      ```
      [info] - executor state kept across NM restart *** FAILED *** (597 milliseconds)
      [info]   org.apache.hadoop.service.ServiceStateException: java.net.BindException: Address already in use
      [info]   at org.apache.hadoop.service.ServiceStateException.convert(ServiceStateException.java:59)
      [info]   at org.apache.hadoop.service.AbstractService.init(AbstractService.java:172)
      [info]   at org.apache.spark.network.yarn.YarnShuffleServiceSuite$$anonfun$1.apply$mcV$sp(YarnShuffleServiceSuite.scala:72)
      [info]   at org.apache.spark.network.yarn.YarnShuffleServiceSuite$$anonfun$1.apply(YarnShuffleServiceSuite.scala:70)
      [info]   at org.apache.spark.network.yarn.YarnShuffleServiceSuite$$anonfun$1.apply(YarnShuffleServiceSuite.scala:70)
      [info]   at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22)
      [info]   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
      [info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
      [info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
      [info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
      [info]   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166)
      [info]   at org.apache.spark.SparkFunSuite.withFixture(SparkFunSuite.scala:42)
      ...
      ```
      
      Author: jerryshao <sshao@hortonworks.com>
      
      Closes #8502 from jerryshao/avoid-hardcode-port.
    • Dharmesh Kakadia's avatar
      typo in comment · 71a077f6
      Dharmesh Kakadia authored
      Author: Dharmesh Kakadia <dharmeshkakadia@users.noreply.github.com>
      
      Closes #8497 from dharmeshkakadia/patch-2.
    • Keiji Yoshida's avatar
      Fix DynamodDB/DynamoDB typo in Kinesis Integration doc · 18294cd8
      Keiji Yoshida authored
      Fix DynamodDB/DynamoDB typo in Kinesis Integration doc
      
      Author: Keiji Yoshida <yoshida.keiji.84@gmail.com>
      
      Closes #8501 from yosssi/patch-1.
    • Sean Owen's avatar
      [SPARK-10295] [CORE] Dynamic allocation in Mesos does not release when RDDs are cached · cc398030
      Sean Owen authored
      Remove obsolete warning about dynamic allocation not working with cached RDDs
      
      See discussion in https://issues.apache.org/jira/browse/SPARK-10295
      
      Author: Sean Owen <sowen@cloudera.com>
      
      Closes #8489 from srowen/SPARK-10295.
    • Yu ISHIKAWA's avatar
      [SPARK-10260] [ML] Add @Since annotation to ml.clustering · 4eeda8d4
      Yu ISHIKAWA authored
      ### JIRA
      [[SPARK-10260] Add Since annotation to ml.clustering - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-10260)
      
      Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
      
      Closes #8455 from yu-iskw/SPARK-10260.
    • Shivaram Venkataraman's avatar
      [SPARK-10328] [SPARKR] Fix generic for na.omit · 2f99c372
      Shivaram Venkataraman authored
      S3 function is at https://stat.ethz.ch/R-manual/R-patched/library/stats/html/na.fail.html
      
      Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
      Author: Shivaram Venkataraman <shivaram.venkataraman@gmail.com>
      Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
      
      Closes #8495 from shivaram/na-omit-fix.
    • noelsmith's avatar
      [SPARK-10188] [PYSPARK] Pyspark CrossValidator with RMSE selects incorrect model · 7583681e
      noelsmith authored
      * Added isLargerBetter() method to the PySpark Evaluator to match the Scala version.
      * JavaEvaluator delegates isLargerBetter() to the underlying Scala object.
      * Added a check for isLargerBetter() in CrossValidator to determine whether to use argmin or argmax.
      * Added test cases for metrics where smaller is better (RMSE) and where larger is better (R-squared).
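
The argmin/argmax choice described in the list above can be sketched in plain Python. This is a minimal illustration of the selection logic only, not the CrossValidator code; `select_best_index` and the metric values are made up for the example.

```python
def select_best_index(metrics, is_larger_better):
    """Return the index of the best metric, honoring the metric's direction."""
    indices = range(len(metrics))
    if is_larger_better:
        # argmax: metrics such as R-squared improve as they grow.
        return max(indices, key=lambda i: metrics[i])
    # argmin: metrics such as RMSE improve as they shrink.
    return min(indices, key=lambda i: metrics[i])


rmse_per_model = [0.9, 0.4, 0.7]  # illustrative values, smaller is better
r2_per_model = [0.2, 0.8, 0.5]    # illustrative values, larger is better

best_by_rmse = select_best_index(rmse_per_model, is_larger_better=False)
best_by_r2 = select_best_index(r2_per_model, is_larger_better=True)
```

Always taking the argmax would pick the model with the highest RMSE, that is, the worst one, which is the kind of incorrect selection the commit title refers to.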
      
      (This contribution is my original work, and I license the work to the project under Spark's open source license.)
      
      Author: noelsmith <mail@noelsmith.com>
      
      Closes #8399 from noel-smith/pyspark-rmse-xval-fix.
    • Cheng Lian's avatar
      [SPARK-SQL] [MINOR] Fixes some typos in HiveContext · 89b94343
      Cheng Lian authored
      Author: Cheng Lian <lian@databricks.com>
      
      Closes #8481 from liancheng/hive-context-typo.
  2. Aug 27, 2015
  3. Aug 26, 2015
    • Cheng Lian's avatar
      [SPARK-9424] [SQL] Parquet programming guide updates for 1.5 · 0fac144f
      Cheng Lian authored
      Author: Cheng Lian <lian@databricks.com>
      
      Closes #8467 from liancheng/spark-9424/parquet-docs-for-1.5.
    • Yu ISHIKAWA's avatar
      [MINOR] [SPARKR] Fix some validation problems in SparkR · 773ca037
      Yu ISHIKAWA authored
      Getting rid of some validation problems in SparkR
      https://github.com/apache/spark/pull/7883
      
      cc shivaram
      
      ```
      inst/tests/test_Serde.R:26:1: style: Trailing whitespace is superfluous.
      
      ^~
      inst/tests/test_Serde.R:34:1: style: Trailing whitespace is superfluous.
      
      ^~
      inst/tests/test_Serde.R:37:38: style: Trailing whitespace is superfluous.
        expect_equal(class(x), "character")
                                           ^~
      inst/tests/test_Serde.R:50:1: style: Trailing whitespace is superfluous.
      
      ^~
      inst/tests/test_Serde.R:55:1: style: Trailing whitespace is superfluous.
      
      ^~
      inst/tests/test_Serde.R:60:1: style: Trailing whitespace is superfluous.
      
      ^~
      inst/tests/test_sparkSQL.R:611:1: style: Trailing whitespace is superfluous.
      
      ^~
      R/DataFrame.R:664:1: style: Trailing whitespace is superfluous.
      
      ^~~~~~~~~~~~~~
      R/DataFrame.R:670:55: style: Trailing whitespace is superfluous.
                      df <- data.frame(row.names = 1 : nrow)
                                                            ^~~~~~~~~~~~~~~~
      R/DataFrame.R:672:1: style: Trailing whitespace is superfluous.
      
      ^~~~~~~~~~~~~~
      R/DataFrame.R:686:49: style: Trailing whitespace is superfluous.
                          df[[names[colIndex]]] <- vec
                                                      ^~~~~~~~~~~~~~~~~~
      ```
      
      Author: Yu ISHIKAWA <yuu.ishikawa@gmail.com>
      
      Closes #8474 from yu-iskw/minor-fix-sparkr.
    • Shivaram Venkataraman's avatar
      [SPARK-10308] [SPARKR] Add %in% to the exported namespace · ad7f0f16
      Shivaram Venkataraman authored
      I also checked all the other functions defined in column.R, functions.R and DataFrame.R and everything else looked fine.
      
      cc yu-iskw
      
      Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
      
      Closes #8473 from shivaram/in-namespace.
    • Davies Liu's avatar
      [SPARK-10305] [SQL] fix create DataFrame from Python class · d41d6c48
      Davies Liu authored
      cc jkbradley
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #8470 from davies/fix_create_df.
    • Xiangrui Meng's avatar
      [SPARK-10241] [MLLIB] update since versions in mllib.recommendation · 086d4681
      Xiangrui Meng authored
      Same as #8421 but for `mllib.recommendation`.
      
      cc srowen coderxiang
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #8432 from mengxr/SPARK-10241.
    • Patrick Wendell's avatar
      HOTFIX: Increase PRB timeout · de7209c2
      Patrick Wendell authored
    • Xiangrui Meng's avatar
      [SPARK-9665] [MLLIB] audit MLlib API annotations · 6519fd06
      Xiangrui Meng authored
      I only found `ml.NaiveBayes` missing the `Experimental` annotation. This PR doesn't cover Python APIs.
      
      cc jkbradley
      
      Author: Xiangrui Meng <meng@databricks.com>
      
      Closes #8452 from mengxr/SPARK-9665.
    • Reynold Xin's avatar
      Closes #8443 · bb164052
      Reynold Xin authored