  1. Nov 10, 2015
    • [SPARK-10863][SPARKR] Method coltypes() (New version) · 47735cdc
      Oscar D. Lara Yejas authored
      This is a follow-up to PR #8984, whose branch was damaged.
      
      Author: Oscar D. Lara Yejas <olarayej@mail.usf.edu>
      
      Closes #9579 from olarayej/SPARK-10863_NEW14.
    • [SPARK-9830][SQL] Remove AggregateExpression1 and Aggregate Operator used to evaluate AggregateExpression1s · e0701c75
      Yin Huai authored
      
      https://issues.apache.org/jira/browse/SPARK-9830
      
      This PR contains the following main changes.
      * Removing `AggregateExpression1`.
      * Removing the `Aggregate` operator, which was used to evaluate `AggregateExpression1`.
      * Removing the planner rule used to plan `Aggregate`.
      * Linking `MultipleDistinctRewriter` to the analyzer.
      * Renaming `AggregateExpression2` to `AggregateExpression` and `AggregateFunction2` to `AggregateFunction`.
      * Updating the places where we create aggregate expressions. Aggregate expressions are now created with `AggregateExpression(aggregateFunction, mode, isDistinct)`.
      * Changing `val`s in `DeclarativeAggregate`s that touch the children of the function to `lazy val`s, because when an aggregate expression is created through the DataFrame API, the children of an aggregate function can still be unresolved.
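
The reasoning behind the last bullet can be illustrated outside Scala. The following is only a Python analogue, not code from the patch: `functools.cached_property` stands in for `lazy val`, a plain attribute assigned in the constructor stands in for `val`, and the classes `Unresolved`, `Resolved`, `EagerSum`, and `LazySum` are hypothetical names invented for this sketch.

```python
from functools import cached_property


class Unresolved:
    """Hypothetical stand-in for a child expression that is not resolved yet."""

    @property
    def data_type(self):
        raise RuntimeError("unresolved expression has no data type")


class Resolved:
    """Hypothetical stand-in for the same child after analysis."""

    data_type = "double"


class EagerSum:
    """Reads the child's type in the constructor, like a plain `val`."""

    def __init__(self, child):
        self.child = child
        self.result_type = child.data_type  # raises if the child is unresolved


class LazySum:
    """Defers the read until first access, like a `lazy val`."""

    def __init__(self, child):
        self.child = child

    @cached_property
    def result_type(self):
        # Evaluated once, on first access, then cached on the instance.
        return self.child.data_type


def can_construct(cls, child):
    """Return True if cls(child) can be built without raising."""
    try:
        cls(child)
        return True
    except RuntimeError:
        return False
```

With an unresolved child, `EagerSum` cannot even be constructed, while `LazySum` can be built first and queried later, once the child has been replaced by a resolved one.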
      
      Author: Yin Huai <yhuai@databricks.com>
      
      Closes #9556 from yhuai/removeAgg1.
    • [SPARK-11252][NETWORK] ShuffleClient should release connection after fetching blocks had been completed for external shuffle · 6e5fc378
      Lianhui Wang authored
      
      With YARN's external shuffle service, each executor's ExternalShuffleClient keeps its connection to YARN's NodeManager open until the application has completed, so the NodeManager and the executors end up holding many socket connections.
      To reduce network pressure on the NodeManager's shuffle service, the connection to it should be closed once registerWithShuffleServer or fetchBlocks has completed in ExternalShuffleClient.

      cc andrewor14 rxin vanzin
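
The idea of releasing the connection as soon as the work is done can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the patch itself: `FakeConnection`, `ShuffleClientSketch`, and the `connect` factory are hypothetical names, and the real client is Java code built on Spark's network layer.

```python
class FakeConnection:
    """Hypothetical in-memory stand-in for a socket to the shuffle service."""

    def __init__(self):
        self.is_open = True

    def fetch(self, block_id):
        if not self.is_open:
            raise RuntimeError("connection already closed")
        return f"data-for-{block_id}".encode()

    def close(self):
        self.is_open = False


class ShuffleClientSketch:
    """Opens a connection per request and releases it when the request ends."""

    def __init__(self, connect):
        # `connect` is an assumed factory that returns a fresh connection.
        self._connect = connect

    def fetch_blocks(self, block_ids):
        conn = self._connect()
        try:
            return {block_id: conn.fetch(block_id) for block_id in block_ids}
        finally:
            # Close even if a fetch fails, so no socket is held until the
            # application finishes.
            conn.close()
```

The trade-off is a reconnect cost on the next fetch in exchange for fewer long-lived sockets on the NodeManager.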
      
      Author: Lianhui Wang <lianhuiwang09@gmail.com>
      
      Closes #9227 from lianhuiwang/spark-11252.
    • [SPARK-7841][BUILD] Stop using retrieveManaged to retrieve dependencies in SBT · 689386b1
      Josh Rosen authored
      This patch modifies Spark's SBT build so that it no longer uses `retrieveManaged` / `lib_managed` to store its dependencies. The motivations for this change are nicely described on the JIRA ticket ([SPARK-7841](https://issues.apache.org/jira/browse/SPARK-7841)); my personal interest in doing this stems from the fact that `lib_managed` has caused me some pain while debugging dependency issues in another PR of mine.
      
      Removing our use of `lib_managed` would be trivial except for one snag: the Datanucleus JARs, required by Spark SQL's Hive integration, cannot be included in assembly JARs due to problems with merging OSGI `plugin.xml` files. As a result, several places in the packaging and deployment pipeline assume that these Datanucleus JARs are copied to `lib_managed/jars`. In the interest of maintaining compatibility, I have chosen to retain the `lib_managed/jars` directory _only_ for these Datanucleus JARs and have added custom code to `SparkBuild.scala` to automatically copy those JARs to that folder as part of the `assembly` task.
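
The copy step described above amounts to filtering a classpath for the Datanucleus JARs and placing them in one directory. The sketch below shows that logic in Python purely for illustration; the actual implementation lives in `SparkBuild.scala` and is not reproduced here. The function name `copy_datanucleus_jars` and the `datanucleus-` file name prefix used for matching are assumptions of this sketch.

```python
import shutil
from pathlib import Path


def copy_datanucleus_jars(classpath, dest_dir):
    """Copy every datanucleus-*.jar found on `classpath` into `dest_dir`.

    Returns the sorted names of the copied files. The prefix match is an
    assumption of this sketch, not taken from the build definition.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = []
    for entry in classpath:
        jar = Path(entry)
        if jar.name.startswith("datanucleus-") and jar.suffix == ".jar":
            shutil.copy2(jar, dest / jar.name)
            copied.append(jar.name)
    return sorted(copied)
```

Called with the resolved classpath and `lib_managed/jars` as the destination, this leaves only the Datanucleus JARs in that folder, which is what the downstream packaging scripts expect.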
      
      `dev/mima` also depended on `lib_managed` in a hacky way in order to set classpaths when generating MiMa excludes; I've updated this to obtain the classpaths directly from SBT instead.
      
      /cc dragos marmbrus pwendell srowen
      
      Author: Josh Rosen <joshrosen@databricks.com>
      
      Closes #9575 from JoshRosen/SPARK-7841.
    • [SPARK-11382] Replace example code in mllib-decision-tree.md using include_example · a81f47ff
      Xusen Yin authored
      https://issues.apache.org/jira/browse/SPARK-11382
      
      This also fixes an error in naive_bayes_example.py.
      
      Author: Xusen Yin <yinxusen@gmail.com>
      
      Closes #9596 from yinxusen/SPARK-11382.
    • Fix typo in driver page · 5507a9d0
      Paul Chandler authored
      "Comamnd property" => "Command property"
      
      Author: Paul Chandler <pestilence669@users.noreply.github.com>
      
      Closes #9578 from pestilence669/fix_spelling.
    • [SPARK-11598] [SQL] enable tests for ShuffledHashOuterJoin · 521b3cae
      Davies Liu authored
      Author: Davies Liu <davies@databricks.com>
      
      Closes #9573 from davies/join_condition.
    • [SPARK-11599] [SQL] fix NPE when resolve Hive UDF in SQLParser · d6cd3a18
      Davies Liu authored
      The DataFrame APIs that take a SQL expression always use SQLParser, so HiveFunctionRegistry is called outside of the Hive state. This causes an NPE if there is no active SessionState for the current thread (as happens in PySpark).
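
The failure mode is typical of per-thread state: a value set in one thread is simply absent in another. The Python sketch below reproduces that pattern with `threading.local` and shows one possible guard. It is an illustration only; the source does not say how the patch fixes the problem, and every name here (`start_session`, `lookup_function_unsafe`, `lookup_function_safe`, `run_in_new_thread`) is hypothetical.

```python
import threading

# Per-thread state, analogous to a session state held in a thread-local.
_session_state = threading.local()


def start_session(name):
    """Attach a session to the calling thread only."""
    _session_state.current = name


def lookup_function_unsafe(func_name):
    # Assumes a session exists; raises AttributeError in a thread without
    # one, which plays the role of the NPE in this sketch.
    return f"{_session_state.current}:{func_name}"


def lookup_function_safe(func_name, default_session="default"):
    # One possible guard (an assumption of this sketch): fall back to a
    # default session when the current thread has none.
    session = getattr(_session_state, "current", None)
    if session is None:
        session = default_session
    return f"{session}:{func_name}"


def run_in_new_thread(fn, *args):
    """Run fn(*args) in a fresh thread and report its value or error type."""
    result = {}

    def target():
        try:
            result["value"] = fn(*args)
        except AttributeError as exc:
            result["error"] = type(exc).__name__

    worker = threading.Thread(target=target)
    worker.start()
    worker.join()
    return result
```

The lookup that works in the thread that started the session fails in a fresh thread, while the guarded version returns a result in both.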
      
      cc rxin yhuai
      
      Author: Davies Liu <davies@databricks.com>
      
      Closes #9576 from davies/hive_udf.
  2. Nov 09, 2015
  3. Nov 08, 2015