  1. Jun 24, 2016
    • [SPARK-16125][YARN] Fix YarnClusterSuite to test yarn cluster mode correctly · f4fd7432
      peng.zhang authored
      ## What changes were proposed in this pull request?
      
      Since SPARK-13220 (Deprecate "yarn-client" and "yarn-cluster"), YarnClusterSuite has not been testing "yarn cluster" mode correctly. This pull request fixes that.
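      
      For context, a minimal sketch of the distinction SPARK-13220 introduced (illustrative only; the suite's actual setup differs):
      ```scala
      import org.apache.spark.SparkConf
      
      // Deprecated since SPARK-13220: "yarn-cluster" as a master URL no
      // longer selects cluster mode by itself.
      val before = new SparkConf().setMaster("yarn-cluster")
      
      // Current form: master "yarn" plus an explicit deploy mode.
      val after = new SparkConf()
        .setMaster("yarn")
        .set("spark.submit.deployMode", "cluster")
      ```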
      
      ## How was this patch tested?
      Unit test
      
      Author: peng.zhang <peng.zhang@xiaomi.com>
      
      Closes #13836 from renozhang/SPARK-16125-test-yarn-cluster-mode.
  2. Jun 19, 2016
    • [SPARK-15942][REPL] Unblock `:reset` command in REPL. · 1b3a9b96
      Prashant Sharma authored
      ## What changes were proposed in this pull request?
      (Paste from the JIRA issue.)
      As a follow-up to SPARK-15697, I propose the following semantics for the `:reset` command.
      On `:reset` we forget everything the user has done, but not the initialization of spark. To avoid confusion, we show a message that `spark` and `sc` are not erased; in fact, they are in the same state that the user's previous operations left them in.
      While doing the above, I felt that this is not usually what reset means. But an accidental shutdown of a cluster can be very costly, so in that sense this behavior is less surprising and still useful.
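      
      For illustration, a hypothetical session under these semantics (the message text is paraphrased, not the literal output):
      ```
      scala> val x = 42
      x: Int = 42
      
      scala> :reset
      (user definitions are forgotten; a message notes that `spark` and `sc`
       are not erased and remain in the state previous operations left them in)
      
      scala> x
      <console>:12: error: not found: value x
      
      scala> sc.isStopped
      res0: Boolean = false
      ```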
      
      ## How was this patch tested?
      
      Manually, by calling the `:reset` command after both altering the state of the SparkContext and creating some local variables.
      
      Author: Prashant Sharma <prashant@apache.org>
      Author: Prashant Sharma <prashsh1@in.ibm.com>
      
      Closes #13661 from ScrapCodes/repl-reset-command.
  3. Jun 16, 2016
    • [SPARK-15782][YARN] Fix spark.jars and spark.yarn.dist.jars handling · 63470afc
      Nezih Yigitbasi authored
      When `--packages` is specified with spark-shell, the classes from those packages cannot be found, which I think is due to some of the changes in SPARK-12343.
      
      Tested manually with both scala 2.10 and 2.11 repls.
      
      vanzin davies can you guys please review?
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      Author: Nezih Yigitbasi <nyigitbasi@netflix.com>
      
      Closes #13709 from nezihyigitbasi/SPARK-15782.
  4. Jun 15, 2016
    • Davies Liu · a153e41c
    • [SPARK-15782][YARN] Set spark.jars system property in client mode · 4df8df5c
      Nezih Yigitbasi authored
      ## What changes were proposed in this pull request?
      
      When `--packages` is specified with `spark-shell`, the classes from those packages cannot be found, which I think is due to some of the changes in `SPARK-12343`. In particular, `SPARK-12343` removed a line that sets the `spark.jars` system property in client mode, which the repl main class uses to set up its classpath.
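      
      A minimal sketch of the restored behavior (variable names here are assumptions, not the literal patch):
      ```scala
      // Assumed inputs for illustration.
      val jars: Seq[String] = Seq("/tmp/a.jar", "/tmp/b.jar")
      val isClientMode: Boolean = true
      
      // In client mode, expose the resolved jar list through the spark.jars
      // system property so the repl can read it back for its classpath.
      if (isClientMode && jars.nonEmpty) {
        sys.props("spark.jars") = jars.mkString(",")
      }
      
      // Repl side: recover the entries when building the interpreter classpath.
      val replJars = sys.props.get("spark.jars").toSeq.flatMap(_.split(","))
      ```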
      
      ## How was this patch tested?
      
      Tested manually.
      
      This system property is used by the repl to populate its classpath. If it is not set properly, the classes from external packages cannot be found.
      
      tgravescs vanzin as you may be familiar with this part of the code.
      
      Author: Nezih Yigitbasi <nyigitbasi@netflix.com>
      
      Closes #13527 from nezihyigitbasi/repl-fix.
  5. Jun 13, 2016
    • [SPARK-15697][REPL] Unblock some of the useful repl commands. · 4134653e
      Prashant Sharma authored
      ## What changes were proposed in this pull request?
      
      Unblock some of the useful repl commands: "implicits", "javap", "power", "type", and "kind". They are useful, fully functional, and part of the scala/scala project, so I see no harm in having them.
      
      Verbatim paste from the JIRA description:
      The "implicits", "javap", "power", "type", and "kind" commands in the repl are blocked, yet they work fine in every case I have tried. We nominally don't support them because they belong to the scala/scala repl project, but what is the harm in unblocking them, given how useful they are?
      In previous versions of Spark we disabled these commands because the scala repl code base was ported into, and maintained under, the Spark source, which made the commands difficult to support without customization and the associated maintenance. That is no longer the situation, and one can benefit from these commands in the Spark REPL as much as in the scala repl.
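      
      For illustration, a hypothetical spark-shell session with these commands unblocked (output abbreviated and partly paraphrased; the exact text varies by Scala version):
      ```
      scala> :type 1 + 1
      Int
      
      scala> :kind Option
      (reports Option's kind, F[+A])
      
      scala> :javap -p scala.Option
      (prints the disassembled bytecode of scala.Option)
      ```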
      
      ## How was this patch tested?
      Existing tests, plus manual testing by trying out all of the above commands.
      
      P.S. Semantics of `:reset` are to be discussed in a separate issue.
      
      Author: Prashant Sharma <prashsh1@in.ibm.com>
      
      Closes #13437 from ScrapCodes/SPARK-15697/repl-unblock-commands.
  6. Jun 09, 2016
    • [SPARK-15841][Tests] REPLSuite has incorrect env set for a couple of tests. · 83070cd1
      Prashant Sharma authored
      Description from JIRA:
      In ReplSuite, a test that can be verified locally should not have to start a local-cluster. Conversely, a test that fixes a problem specific to distributed runs is not sufficiently exercised if it only runs in a local environment.
      
      Existing tests.
      
      Author: Prashant Sharma <prashsh1@in.ibm.com>
      
      Closes #13574 from ScrapCodes/SPARK-15841/repl-suite-fix.
  7. Jun 02, 2016
    • [SPARK-15322][SQL][FOLLOWUP] Use the new long accumulator for old int accumulators. · 252417fa
      hyukjinkwon authored
      ## What changes were proposed in this pull request?
      
      This PR corrects the remaining cases that use old accumulators (a minimal migration sketch follows the list below).
      
      This does not change some old accumulator usages below:
      
      - `ImplicitSuite.scala` - Tests dedicated to old accumulator, for implicits with `AccumulatorParam`
      
      - `AccumulatorSuite.scala` -  Tests dedicated to old accumulator
      
      - `JavaSparkContext.scala` - For supporting old accumulators for Java API.
      
      - `debug.package.scala` - Usage with `HashSet[String]`. Currently, there seems to be no implementation for this. I might be able to write an anonymous class for it, but I didn't, because it is not worth writing a lot of code only for this case.
      
      - `SQLMetricsSuite.scala` - This uses the old accumulator to check type boxing. The new accumulator does not seem to require type boxing for this case, whereas the old one does (due to its use of generics).
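      
      A minimal sketch of the migration this PR performs elsewhere (illustrative, not taken from the diff):
      ```scala
      import org.apache.spark.SparkContext
      
      def example(sc: SparkContext): Unit = {
        // Old API, deprecated in 2.0 (shown only for contrast): an Int
        // accumulator obtained through an implicit AccumulatorParam.
        val oldAcc = sc.accumulator(0)
      
        // New API: a dedicated long accumulator, no boxing through generics.
        val newAcc = sc.longAccumulator("records")
        sc.parallelize(1 to 100).foreach(_ => newAcc.add(1))
        println(newAcc.value) // 100
      }
      ```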
      
      ## How was this patch tested?
      
      Existing tests cover this.
      
      Author: hyukjinkwon <gurwls223@gmail.com>
      
      Closes #13434 from HyukjinKwon/accum.
  8. May 31, 2016
  9. May 17, 2016
  10. May 04, 2016
  11. May 03, 2016
  12. Apr 28, 2016
  13. Apr 25, 2016
    • [SPARK-14828][SQL] Start SparkSession in REPL instead of SQLContext · 34336b62
      Andrew Or authored
      ## What changes were proposed in this pull request?
      
      ```
      Spark context available as 'sc' (master = local[*], app id = local-1461283768192).
      Spark session available as 'spark'.
      Welcome to
            ____              __
           / __/__  ___ _____/ /__
          _\ \/ _ \/ _ `/ __/  '_/
         /___/ .__/\_,_/_/ /_/\_\   version 2.0.0-SNAPSHOT
            /_/
      
      Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_51)
      Type in expressions to have them evaluated.
      Type :help for more information.
      
      scala> sql("SHOW TABLES").collect()
      16/04/21 17:09:39 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 1.2.0
      16/04/21 17:09:39 WARN ObjectStore: Failed to get database default, returning NoSuchObjectException
      res0: Array[org.apache.spark.sql.Row] = Array([src,false])
      
      scala> sql("SHOW TABLES").collect()
      res1: Array[org.apache.spark.sql.Row] = Array([src,false])
      
      scala> spark.createDataFrame(Seq((1, 1), (2, 2), (3, 3)))
      res2: org.apache.spark.sql.DataFrame = [_1: int, _2: int]
      ```
      
      Hive things are loaded lazily.
      
      ## How was this patch tested?
      
      Manual.
      
      Author: Andrew Or <andrew@databricks.com>
      
      Closes #12589 from andrewor14/spark-session-repl.
  14. Apr 22, 2016
    • [SPARK-10001] Consolidate Signaling and SignalLogger. · c089c6f4
      Reynold Xin authored
      ## What changes were proposed in this pull request?
      This is a follow-up to #12557, with the following changes:
      
      1. Fixes some of the style issues.
      2. Merges Signaling and SignalLogger into a new class called SignalUtils. It was pretty confusing to have Signaling and Signal in one file, and also confusing to have two classes named Signaling, one of which called the other.
      3. Made logging registration idempotent (sketched below).
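      
      A sketch of what idempotent registration means here (names and structure are assumptions, not the actual SignalUtils internals):
      ```scala
      object SignalUtilsSketch {
        private var registered = false
      
        // Idempotent: only the first call installs the handlers; later calls
        // are no-ops, so initialization code may invoke this freely.
        def registerLogger(log: org.slf4j.Logger): Unit = synchronized {
          if (!registered) {
            registered = true
            for (sig <- Seq("TERM", "HUP", "INT")) {
              installHandler(sig, () => log.error(s"RECEIVED SIGNAL $sig"))
            }
          }
        }
      
        // Placeholder for the platform-specific installation (assumption).
        private def installHandler(sig: String, action: () => Unit): Unit = ()
      }
      ```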
      
      ## How was this patch tested?
      N/A.
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #12605 from rxin/SPARK-10001.
    • [SPARK-10001] [CORE] Interrupt tasks in repl with Ctrl+C · 80127935
      Jakob Odersky authored
      ## What changes were proposed in this pull request?
      
      Improve signal handling to allow interrupting running tasks from the REPL (with Ctrl+C).
      If no tasks are running or Ctrl+C is pressed twice, the signal is forwarded to the default handler, resulting in the usual termination of the application.
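      
      A best-effort sketch of the approach (an assumed shape, not the actual Signaling code; it relies on `sun.misc.Signal`, hence the soft-fail):
      ```scala
      import org.apache.spark.SparkContext
      import sun.misc.{Signal, SignalHandler}
      
      object InterruptSketch {
        def install(sc: SparkContext): Unit =
          try {
            var previous: SignalHandler = null
            previous = Signal.handle(new Signal("INT"), new SignalHandler {
              override def handle(sig: Signal): Unit =
                if (sc.statusTracker.getActiveJobIds().nonEmpty) {
                  sc.cancelAllJobs()   // first Ctrl+C: interrupt running tasks
                } else {
                  previous.handle(sig) // no active jobs: default handling (exit)
                }
            })
          } catch {
            // Signals unavailable on this OS/JVM: soft-fail, keep the repl usable.
            case _: Throwable =>
          }
      }
      ```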
      
      This PR is a rewrite of -- and therefore closes #8216 -- as per piaozhexiu's request
      
      ## How was this patch tested?
      Signal handling is not easily testable, therefore no unit tests were added. Nevertheless, the new functionality is implemented in a best-effort manner, soft-failing in case signals aren't available on a specific OS.
      
      Author: Jakob Odersky <jakob@odersky.com>
      
      Closes #12557 from jodersky/SPARK-10001-sigint.
  15. Apr 20, 2016
    • [SPARK-14725][CORE] Remove HttpServer class · 90cbc82f
      jerryshao authored
      ## What changes were proposed in this pull request?
      
      This proposal removes the `HttpServer` class. With internal file/jar/class transmission moved to the RPC layer, no code uses `HttpServer` anymore, so this PR removes it.
      
      ## How was this patch tested?
      
      Unit tests verified locally.
      
      Author: jerryshao <sshao@hortonworks.com>
      
      Closes #12526 from jerryshao/SPARK-14725.
  16. Apr 14, 2016
    • [SPARK-14558][CORE] In ClosureCleaner, clean the outer pointer if it's a REPL line object · 1d04c86f
      Wenchen Fan authored
      ## What changes were proposed in this pull request?
      
      When we clean a closure, if its outermost parent is not a closure, we won't clone and clean it, as cloning a user's objects is dangerous. However, if it's a REPL line object, which may carry a lot of unnecessary references (like the hadoop conf, spark conf, etc.), we should clean it, as it's not a user object.
      
      This PR improves the check for user objects to exclude REPL line objects.
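      
      A sketch of the kind of check involved (the exact pattern is an assumption):
      ```scala
      object ReplLineCheckSketch {
        // Scala REPL line wrappers get synthetic names such as
        // "$line12.$read$$iw$$iw", which distinguishes them from user classes.
        def isReplLineObject(clazz: Class[_]): Boolean = {
          val name = clazz.getName
          name.startsWith("$line") && name.contains("$read")
        }
      }
      ```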
      
      ## How was this patch tested?
      
      Existing tests.
      
      Author: Wenchen Fan <wenchen@databricks.com>
      
      Closes #12327 from cloud-fan/closure.
  17. Apr 12, 2016
  18. Apr 09, 2016
    • [SPARK-14451][SQL] Move encoder definition into Aggregator interface · 520dde48
      Reynold Xin authored
      ## What changes were proposed in this pull request?
      When we first introduced Aggregators, we required the user of an Aggregator to (implicitly) specify the encoders. It actually makes more sense to have the encoders specified by the implementation of the Aggregator, since each implementation knows best how to encode its own data type.
      
      Note that this simplifies the Java API because Java users no longer need to explicitly specify encoders for aggregators.
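      
      To make the new shape concrete, a small Aggregator written against the revised interface (the example itself is ours; the method names are Spark's):
      ```scala
      import org.apache.spark.sql.{Encoder, Encoders}
      import org.apache.spark.sql.expressions.Aggregator
      
      // The implementation now supplies its own encoders via bufferEncoder
      // and outputEncoder; callers no longer pass encoders explicitly, e.g.
      // ds.select(SumOfInts.toColumn) needs no encoder at the call site.
      object SumOfInts extends Aggregator[Int, Long, Long] {
        def zero: Long = 0L
        def reduce(buf: Long, a: Int): Long = buf + a
        def merge(b1: Long, b2: Long): Long = b1 + b2
        def finish(buf: Long): Long = buf
        def bufferEncoder: Encoder[Long] = Encoders.scalaLong
        def outputEncoder: Encoder[Long] = Encoders.scalaLong
      }
      ```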
      
      ## How was this patch tested?
      Updated unit tests.
      
      Author: Reynold Xin <rxin@databricks.com>
      
      Closes #12231 from rxin/SPARK-14451.
  19. Apr 06, 2016
    • [SPARK-14134][CORE] Change the package name used for shading classes. · 21d5ca12
      Marcelo Vanzin authored
      The current package name uses a dash, which is a little weird but seemed
      to work. That is, until a new test tried to mock a class that references
      one of those shaded types, and then things started failing.
      
      Most changes are just noise to fix the logging configs.
      
      For reference, SPARK-8815 also raised this issue, although at the time it
      did not cause any issues in Spark, so it was not addressed.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #11941 from vanzin/SPARK-14134.
    • [SPARK-14446][TESTS] Fix ReplSuite for Scala 2.10. · 4901086f
      Marcelo Vanzin authored
      Just use the same test code as the 2.11 version, which seems to pass.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #12223 from vanzin/SPARK-14446.
  20. Apr 02, 2016
    • [MINOR][DOCS] Use multi-line JavaDoc comments in Scala code. · 4a6e78ab
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      This PR converts all Scala-style multiline comments into Java-style multiline comments in the Scala code.
      (All comment-only changes over 77 files: +786 lines, −747 lines)
      
      ## How was this patch tested?
      
      Manual.
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #12130 from dongjoon-hyun/use_multiine_javadoc_comments.
  21. Mar 28, 2016
    • [SPARK-14102][CORE] Block `reset` command in SparkShell · b66aa900
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      Spark Shell provides an easy way to use Spark in a Scala environment. This PR adds the `reset` command to the blocked list and also cleans the code up according to the Scala coding style. Without the block, `:reset` erases `sc`:
      ```scala
      scala> sc
      res0: org.apache.spark.SparkContext = org.apache.spark.SparkContext@718fad24
      scala> :reset
      scala> sc
      <console>:11: error: not found: value sc
             sc
             ^
      ```
      If we block `reset`, Spark Shell works as follows.
      ```scala
      scala> :reset
      reset: no such command.  Type :help for help.
      scala> :re
      re is ambiguous: did you mean :replay or :require?
      ```
      
      ## How was this patch tested?
      
      Manual. Run `bin/spark-shell` and type `:reset`.
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #11920 from dongjoon-hyun/SPARK-14102.
  22. Mar 25, 2016
  23. Mar 21, 2016
    • [SPARK-13456][SQL] fix creating encoders for case classes defined in Spark shell · 43ebf7a9
      Wenchen Fan authored
      ## What changes were proposed in this pull request?
      
      Case classes defined in the REPL are wrapped by line classes, and we have a trick for the scala 2.10 REPL to automatically register the wrapper classes with `OuterScope` so that we can use them when creating encoders.
      However, this trick stopped working after the upgrade to scala 2.11, and unfortunately the tests exist only for scala 2.10, which kept this bug hidden until now.
      
      This PR moves the encoder tests to the scala 2.11 `ReplSuite`, and fixes this bug with another approach (the previous trick can't be ported to the scala 2.11 REPL): make `OuterScope` smart enough to detect classes defined in the REPL and load the singleton of the line wrapper classes automatically.
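      
      The kind of shell session this fixes (output paraphrased):
      ```
      scala> case class Point(x: Int, y: Int)
      defined class Point
      
      scala> Seq(Point(1, 2), Point(3, 4)).toDS()
      res0: org.apache.spark.sql.Dataset[Point] = [x: int, y: int]
      ```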
      
      ## How was this patch tested?
      
      The migrated encoder tests in `ReplSuite`.
      
      Author: Wenchen Fan <wenchen@databricks.com>
      
      Closes #11410 from cloud-fan/repl.
  24. Mar 17, 2016
  25. Mar 14, 2016
    • [SPARK-13626][CORE] Avoid duplicate config deprecation warnings. · 8301fadd
      Marcelo Vanzin authored
      Three different things were needed to get rid of spurious warnings:
      - silence deprecation warnings when cloning configuration
      - change the way SparkHadoopUtil instantiates SparkConf to silence
        warnings
      - avoid creating new SparkConf instances where it's not needed.
      
      On top of that, I changed the way that Logging.scala detects the repl;
      now it uses a method that is overridden in the repl's Main class, and
      the hack in Utils.scala is not needed anymore. This makes the 2.11 repl
      behave like the 2.10 one and set the default log level to WARN, which
      is a lot better. Previously, this wasn't working because the 2.11 repl
      triggers log initialization earlier than the 2.10 one.
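      
      The override pattern described above, sketched with assumed names (the real hook lives in Logging.scala and the repl's Main):
      ```scala
      trait LoggingSketch {
        // Default for ordinary applications; the repl overrides this so the
        // default log level can be set to WARN without classpath sniffing.
        protected def isInterpreter: Boolean = false
      }
      
      object ReplMainSketch extends LoggingSketch {
        override protected def isInterpreter: Boolean = true
      }
      ```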
      
      I also removed and simplified some other code in the 2.11 repl's Main
      to avoid replicating logic that already exists elsewhere in Spark.
      
      Tested the 2.11 repl in local and yarn modes.
      
      Author: Marcelo Vanzin <vanzin@cloudera.com>
      
      Closes #11510 from vanzin/SPARK-13626.
  26. Mar 10, 2016
    • [SPARK-3854][BUILD] Scala style: require spaces before `{`. · 91fed8e9
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      Since the opening curly brace, '{', has many usages, as discussed in [SPARK-3854](https://issues.apache.org/jira/browse/SPARK-3854), this PR adds a ScalaStyle rule that rejects the '){' pattern (the majority case) and fixes the code accordingly. If we enforce this in ScalaStyle from now on, it will improve Scala code quality and reduce review time.
      ```
      // Correct:
      if (true) {
        println("Wow!")
      }
      
      // Incorrect:
      if (true){
         println("Wow!")
      }
      ```
      IntelliJ also shows new warnings based on this.
      
      ## How was this patch tested?
      
      Pass the Jenkins ScalaStyle test.
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #11637 from dongjoon-hyun/SPARK-3854.
  27. Mar 03, 2016
    • [MINOR] Fix typos in comments and test case names · 941b270b
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      This PR fixes typos in comments and in test case names.
      
      ## How was this patch tested?
      
      Manual.
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #11481 from dongjoon-hyun/minor_fix_typos_in_code.
    • [SPARK-13583][CORE][STREAMING] Remove unused imports and add checkstyle rule · b5f02d67
      Dongjoon Hyun authored
      ## What changes were proposed in this pull request?
      
      After SPARK-6990, `dev/lint-java` keeps the Java code healthy and helps PR review by saving a lot of time.
      This issue aims to remove unused imports from the Java/Scala code and to add an `UnusedImports` checkstyle rule to help developers.
      
      ## How was this patch tested?
      ```
      ./dev/lint-java
      ./build/sbt compile
      ```
      
      Author: Dongjoon Hyun <dongjoon@apache.org>
      
      Closes #11438 from dongjoon-hyun/SPARK-13583.
  28. Feb 09, 2016
    • [SPARK-13086][SHELL] Use the Scala REPL settings, to enable things like `-i file`. · e30121af
      Iulian Dragos authored
      Now:
      
      ```
      $ bin/spark-shell -i test.scala
      NOTE: SPARK_PREPEND_CLASSES is set, placing locally compiled Spark classes ahead of assembly.
      Setting default log level to "WARN".
      To adjust logging level use sc.setLogLevel(newLevel).
      16/01/29 17:37:38 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
      16/01/29 17:37:39 INFO Main: Created spark context..
      Spark context available as sc (master = local[*], app id = local-1454085459000).
      16/01/29 17:37:39 INFO Main: Created sql context..
      SQL context available as sqlContext.
      Loading test.scala...
      hello
      
      Welcome to
            ____              __
           / __/__  ___ _____/ /__
          _\ \/ _ \/ _ `/ __/  '_/
         /___/ .__/\_,_/_/ /_/\_\   version 2.0.0-SNAPSHOT
            /_/
      
      Using Scala version 2.11.7 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_45)
      Type in expressions to have them evaluated.
      Type :help for more information.
      ```
      
      Author: Iulian Dragos <jaguarul@gmail.com>
      
      Closes #10984 from dragos/issue/repl-eval-file.
  29. Jan 30, 2016
    • [SPARK-6363][BUILD] Make Scala 2.11 the default Scala version · 289373b2
      Josh Rosen authored
      This patch changes Spark's build to make Scala 2.11 the default Scala version. To be clear, this does not mean that Spark will stop supporting Scala 2.10: users will still be able to compile Spark for Scala 2.10 by following the instructions on the "Building Spark" page; however, it does mean that Scala 2.11 will be the default Scala version used by our CI builds (including pull request builds).
      
      The Scala 2.11 compiler is faster than 2.10, so I think we'll be able to look forward to a slight speedup in our CI builds (it looks like it's about 2X faster for the Maven compile-only builds, for instance).
      
      After this patch is merged, I'll update Jenkins to add new compile-only jobs to ensure that Scala 2.10 compilation doesn't break.
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #10608 from JoshRosen/SPARK-6363.
  30. Jan 13, 2016
  31. Jan 05, 2016
  32. Dec 31, 2015
  33. Dec 24, 2015
    • [SPARK-12311][CORE] Restore previous value of "os.arch" property in test... · 39204661
      Kazuaki Ishizaki authored
      [SPARK-12311][CORE] Restore previous value of "os.arch" property in test suites after forcing to set specific value to "os.arch" property
      
      Restore the original value of os.arch property after each test
      
      Since some tests force a specific value for the os.arch property, we need to restore the original value afterwards.
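      
      A sketch of the save-and-restore pattern (an assumed shape, not the exact patch):
      ```scala
      object OsArchSketch {
        def withOsArch[T](value: String)(body: => T): T = {
          val saved = sys.props.get("os.arch")
          sys.props("os.arch") = value
          try body finally {
            saved match {
              case Some(v) => sys.props("os.arch") = v // restore original value
              case None    => sys.props -= "os.arch"   // was unset before the test
            }
          }
        }
      }
      ```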
      
      Author: Kazuaki Ishizaki <ishizaki@jp.ibm.com>
      
      Closes #10289 from kiszk/SPARK-12311.