  3. Dec 14, 2015
    •
      [SPARK-12288] [SQL] Support UnsafeRow in Coalesce/Except/Intersect. · 606f99b9
      gatorsmile authored
      Support UnsafeRow in the Coalesce, Except, and Intersect operators.
      
      Could you review if my code changes are ok? davies Thank you!
      
      Author: gatorsmile <gatorsmile@gmail.com>
      
      Closes #10285 from gatorsmile/unsafeSupportCIE.
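The idea behind this change can be illustrated with a toy model (pure Python, not Spark's actual API; all class and flag names here are invented): physical operators advertise whether they produce and consume the binary UnsafeRow format, and a planner inserts a row-format conversion only when a parent cannot consume what its child emits. Teaching set-like operators to handle UnsafeRow directly removes unnecessary conversions.

```python
# Toy model (invented names, not Spark's real API) of operators advertising
# UnsafeRow support; a conversion is inserted only on a format mismatch.

class Operator:
    outputs_unsafe = False      # does this operator emit UnsafeRow?
    can_process_unsafe = False  # can it consume UnsafeRow from children?

class Coalesce(Operator):
    # After SPARK-12288, set-like operators pass UnsafeRow through directly.
    outputs_unsafe = True
    can_process_unsafe = True

class ConvertToSafe(Operator):
    """Bridges an UnsafeRow-producing child to a parent that can't read it."""
    can_process_unsafe = True

def ensure_format(parent, child):
    # Wrap the child in a conversion only if the parent can't read its rows.
    if child.outputs_unsafe and not parent.can_process_unsafe:
        return ConvertToSafe()
    return child

print(type(ensure_format(Coalesce(), Coalesce())).__name__)  # Coalesce
```

With the new flags set, `ensure_format` leaves the pipeline untouched; an operator without UnsafeRow support would still get a conversion inserted.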
    •
      [SPARK-12188][SQL][FOLLOW-UP] Code refactoring and comment correction in Dataset APIs · d13ff82c
      gatorsmile authored
      marmbrus This PR is to address your comment. Thanks for your review!
      
      Author: gatorsmile <gatorsmile@gmail.com>
      
      Closes #10214 from gatorsmile/followup12188.
    •
      [SPARK-12274][SQL] WrapOption should not have type constraint for child · 9ea1a8ef
      Wenchen Fan authored
      I think it was a mistake, and we had not caught it until https://github.com/apache/spark/pull/10260, which began to check whether the `fromRowExpression` is resolved.
      
      Author: Wenchen Fan <wenchen@databricks.com>
      
      Closes #10263 from cloud-fan/encoder.
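Why an over-eager type constraint blocks resolution can be sketched with a hypothetical example (names invented here, not Spark's real classes): if a wrapping expression demands a specific child type while the child is still unresolved, the enclosing expression can never become resolved, which is exactly what the resolution check in #10260 would surface.

```python
# Hypothetical sketch (invented names, not Spark's real classes) of why a
# type constraint on an unresolved child prevents the parent from resolving.

class Expr:
    def __init__(self, resolved=False, dtype=None):
        self.resolved = resolved
        self.dtype = dtype

def wrap_option_resolved(child, require_type=None):
    if require_type is not None:
        # Old behavior: the type check can only pass once the child's type
        # is known, so an unresolved child keeps the parent unresolved.
        return child.resolved and child.dtype == require_type
    # New behavior: no type constraint; resolution simply follows the child.
    return child.resolved

print(wrap_option_resolved(Expr(resolved=False), require_type="ObjectType"))  # False
print(wrap_option_resolved(Expr(resolved=True)))  # True
```

Dropping the constraint lets the parent resolve as soon as its child does, instead of depending on a type that may not be computable yet.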
    •
      [SPARK-12327] Disable commented code lintr temporarily · fb3778de
      Shivaram Venkataraman authored
      cc yhuai felixcheung shaneknapp
      
      Author: Shivaram Venkataraman <shivaram@cs.berkeley.edu>
      
      Closes #10300 from shivaram/comment-lintr-disable.
    •
      [SPARK-12016] [MLLIB] [PYSPARK] Wrap Word2VecModel when loading it in pyspark · b51a4cdf
      Liang-Chi Hsieh authored
      JIRA: https://issues.apache.org/jira/browse/SPARK-12016
      
      We should not use Word2VecModel directly in pyspark; we need to wrap it in a Word2VecModelWrapper when loading it.
      
      Author: Liang-Chi Hsieh <viirya@appier.com>
      
      Closes #10100 from viirya/fix-load-py-wordvecmodel.
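The wrapper pattern in question can be sketched as follows (a minimal simulation with invented class names, not PySpark's real implementation): the JVM-side model returned by a load call is never handed back to Python code directly; instead it is wrapped in an object whose methods convert JVM results into plain Python types.

```python
# Minimal sketch (invented names, not PySpark's real classes) of wrapping a
# JVM-side model on load so Python callers only see Python-friendly types.

class FakeJavaWord2VecModel:
    """Stand-in for the JVM-side model a loader would return."""
    def transform(self, word):
        return (0.1, 0.2)  # pretend JVM vector

class Word2VecModelWrapperSketch:
    def __init__(self, java_model):
        self._java_model = java_model

    def transform(self, word):
        # Convert the JVM-side result into a plain Python list.
        return list(self._java_model.transform(word))

def load_sketch(path):
    java_model = FakeJavaWord2VecModel()  # real code would call the JVM loader
    return Word2VecModelWrapperSketch(java_model)  # wrap, never return raw model

model = load_sketch("/tmp/model")
print(model.transform("hello"))  # [0.1, 0.2]
```

The bug being fixed is exactly the missing wrap step in `load_sketch`: returning the raw JVM model exposes an object pyspark code cannot use correctly.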
    •
      [MINOR][DOC] Fix broken word2vec link · e25f1fe4
      BenFradet authored
      Follow-up of [SPARK-12199](https://issues.apache.org/jira/browse/SPARK-12199) and #10193, where a broken link was left as is.
      
      Author: BenFradet <benjamin.fradet@gmail.com>
      
      Closes #10282 from BenFradet/SPARK-12199.
    •
      [SPARK-12275][SQL] No plan for BroadcastHint in some condition · ed87f6d3
      yucai authored
      When SparkStrategies.BasicOperators's "case BroadcastHint(child) => apply(child)" is hit, it only recursively invokes BasicOperators.apply on the child. This gives the other strategies no chance to process the plan, which can lead to a "No plan" error, so we use planLater to go through all strategies.
      
      https://issues.apache.org/jira/browse/SPARK-12275
      
      Author: yucai <yucai.yu@intel.com>
      
      Closes #10265 from yucai/broadcast_hint.
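The difference between recursing into a single strategy and using a planLater-style callback can be shown with a toy planner (pure Python, invented shapes, not Spark's real API): each strategy handles only the node kinds it knows, and the fix is to re-enter the loop over *all* strategies when planning a child.

```python
# Toy planner (invented shapes, not Spark's real API): the plan_later
# callback re-runs every strategy on the child, whereas calling one
# strategy's own apply would miss cases the other strategies handle.

def broadcast_strategy(plan, plan_later):
    if plan[0] == "BroadcastHint":
        # Fixed behavior: plan the child via plan_later (all strategies),
        # not via this strategy alone, which has no case for "Join".
        return ("Hinted", plan_later(plan[1]))
    return None

def join_strategy(plan, plan_later):
    if plan[0] == "Join":
        return ("HashJoin",)
    return None

STRATEGIES = [broadcast_strategy, join_strategy]

def plan_all(plan):
    for strategy in STRATEGIES:
        result = strategy(plan, plan_all)
        if result is not None:
            return result
    raise RuntimeError("No plan for %r" % (plan,))

print(plan_all(("BroadcastHint", ("Join",))))  # ('Hinted', ('HashJoin',))
```

If `broadcast_strategy` instead planned the child only through its own cases, the `("Join",)` child would match nothing and the planner would raise the "No plan" error described above.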
    •
      [SPARK-12213][SQL] use multiple partitions for single distinct query · 834e7148
      Davies Liu authored
      Currently, we can generate two different plans for a query with a single distinct aggregation (depending on spark.sql.specializeSingleDistinctAggPlanning): one works better on low-cardinality columns, the other (the default) works better on high-cardinality columns.
      
      This PR changes it to generate a single plan (three aggregations and two exchanges) that works well in both cases, so we can safely remove the flag `spark.sql.specializeSingleDistinctAggPlanning` (introduced in 1.6).
      
      A query like `SELECT COUNT(DISTINCT a) FROM table` will be planned as:
      ```
      AGG-4 (count distinct)
        Shuffle to a single reducer
          Partial-AGG-3 (count distinct, no grouping)
            Partial-AGG-2 (grouping on a)
              Shuffle by a
                Partial-AGG-1 (grouping on a)
      ```
      
      This PR also includes a large refactoring of the aggregation code (removing 500+ lines).
      
      cc yhuai nongli marmbrus
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #10228 from davies/single_distinct.
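The staged plan above can be simulated in plain Python (a sketch of the dataflow only, not Spark code): dedupe within each partition, shuffle values to reducers by hash so each distinct value lands in exactly one place, count distinct values per reducer, then sum those partial counts on a single reducer.

```python
# Pure-Python simulation (not Spark code) of the staged COUNT(DISTINCT a)
# plan: local dedupe, shuffle by value, per-reducer distinct count, final sum.

def count_distinct(partitions, num_reducers=2):
    # Partial-AGG-1: group by a within each partition (local dedupe).
    deduped = [set(p) for p in partitions]
    # Shuffle by a: route each value to one reducer by hash, so every
    # distinct value is counted exactly once across reducers.
    shuffled = [set() for _ in range(num_reducers)]
    for part in deduped:
        for value in part:
            shuffled[hash(value) % num_reducers].add(value)
    # Partial-AGG-2/3: each reducer counts its distinct values (no grouping).
    partial_counts = [len(bucket) for bucket in shuffled]
    # Shuffle to a single reducer, AGG-4: sum the partial counts.
    return sum(partial_counts)

print(count_distinct([[1, 2, 2], [2, 3], [3, 3, 4]]))  # 4
```

Because the expensive dedupe work is spread over many reducers and only small partial counts reach the final single reducer, the same plan behaves reasonably for both low- and high-cardinality columns.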
    •
      [SPARK-12281][CORE] Fix a race condition when reporting ExecutorState in the shutdown hook · 2aecda28
      Shixiong Zhu authored
      1. Make sure workers and masters exit so that no worker or master is still running when the shutdown hook is triggered.
      2. Set ExecutorState to FAILED if it's still RUNNING when executing the shutdown hook.
      
      This should fix potential exceptions like the following when exiting a local cluster:
      ```
      java.lang.AssertionError: assertion failed: executor 4 state transfer from RUNNING to RUNNING is illegal
      	at scala.Predef$.assert(Predef.scala:179)
      	at org.apache.spark.deploy.master.Master$$anonfun$receive$1.applyOrElse(Master.scala:260)
      	at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:116)
      	at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:204)
      	at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
      	at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:215)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
      
      java.lang.IllegalStateException: Shutdown hooks cannot be modified during shutdown.
      	at org.apache.spark.util.SparkShutdownHookManager.add(ShutdownHookManager.scala:246)
      	at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:191)
      	at org.apache.spark.util.ShutdownHookManager$.addShutdownHook(ShutdownHookManager.scala:180)
      	at org.apache.spark.deploy.worker.ExecutorRunner.start(ExecutorRunner.scala:73)
      	at org.apache.spark.deploy.worker.Worker$$anonfun$receive$1.applyOrElse(Worker.scala:474)
      	at org.apache.spark.rpc.netty.Inbox$$anonfun$process$1.apply$mcV$sp(Inbox.scala:116)
      	at org.apache.spark.rpc.netty.Inbox.safelyCall(Inbox.scala:204)
      	at org.apache.spark.rpc.netty.Inbox.process(Inbox.scala:100)
      	at org.apache.spark.rpc.netty.Dispatcher$MessageLoop.run(Dispatcher.scala:215)
      	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
      	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
      	at java.lang.Thread.run(Thread.java:745)
      ```
      
      Author: Shixiong Zhu <shixiong@databricks.com>
      
      Closes #10269 from zsxwing/executor-state.
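Fix 2 can be sketched in a few lines (a simplified simulation with an invented shape, not Spark's real code): when the shutdown hook runs, any executor still marked RUNNING is downgraded to FAILED before being reported, so the master never receives the illegal RUNNING -> RUNNING state transition shown in the first stack trace above.

```python
# Simplified sketch (invented shape, not Spark's real code): the shutdown
# hook downgrades still-RUNNING executors to FAILED before reporting them.

RUNNING, FAILED, EXITED = "RUNNING", "FAILED", "EXITED"

def shutdown_hook(executors, reports):
    for exec_id, state in executors.items():
        if state == RUNNING:
            # Never re-report RUNNING during shutdown; mark it FAILED first.
            executors[exec_id] = FAILED
        reports.append((exec_id, executors[exec_id]))

execs = {4: RUNNING, 5: EXITED}
reports = []
shutdown_hook(execs, reports)
print(execs[4])  # FAILED
```

Executors that already reached a terminal state are reported unchanged; only the ones caught mid-run get the FAILED transition, which the master's state machine accepts.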