[SPARK-12756][SQL] use hash expression in Exchange
This PR makes bucketing and exchange share one common hash algorithm, so that the data distribution is guaranteed to be the same between shuffled data and a bucketed data source. This lets us shuffle only one side when joining a bucketed table with a normal one. This PR also fixes the tests that were broken by the new hash behaviour in shuffle.
Author: Wenchen Fan <wenchen@databricks.com>
Closes #10703 from cloud-fan/use-hash-expr-in-shuffle.
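A minimal sketch of why a shared hash helps, in plain Python rather than Spark code. Every name here (`partition_id`, `write_buckets`, `shuffle`, `join_one_side_shuffle`) is hypothetical, and CRC32 is only a stand-in for whatever hash expression Spark actually uses. The point illustrated is that when the bucketed writer and the shuffle assign rows with the same function, bucket i and shuffle partition i hold the same keys, so only the non-bucketed side needs to move.

```python
import zlib


def partition_id(key: str, num_partitions: int) -> int:
    # Stand-in hash (CRC32), NOT Spark's real hash expression.
    # What matters is that writer and shuffle call the same function.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def write_buckets(rows, num_buckets):
    # Simulates writing a bucketed table: each row goes to a bucket by key hash.
    buckets = [[] for _ in range(num_buckets)]
    for key, value in rows:
        buckets[partition_id(key, num_buckets)].append((key, value))
    return buckets


def shuffle(rows, num_partitions):
    # Simulates an exchange: rows are redistributed with the same hash.
    parts = [[] for _ in range(num_partitions)]
    for key, value in rows:
        parts[partition_id(key, num_partitions)].append((key, value))
    return parts


def join_one_side_shuffle(buckets, other_rows):
    # Only the non-bucketed side is shuffled; the buckets are read in place.
    shuffled = shuffle(other_rows, len(buckets))
    joined = []
    for bucket, part in zip(buckets, shuffled):
        lookup = {}
        for key, value in part:
            lookup.setdefault(key, []).append(value)
        for key, left in bucket:
            for right in lookup.get(key, []):
                joined.append((key, left, right))
    return joined
```

If the two sides used different hash functions, matching keys could land in different partition indexes and the partition-by-partition join above would silently drop rows, which is why both sides would otherwise have to be shuffled.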
Showing 12 changed files:
- R/pkg/inst/tests/testthat/test_sparkSQL.R (1 addition, 1 deletion)
- python/pyspark/sql/dataframe.py (13 additions, 13 deletions)
- python/pyspark/sql/group.py (3 additions, 3 deletions)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/physical/partitioning.scala (6 additions, 1 deletion)
- sql/core/src/main/scala/org/apache/spark/sql/execution/Exchange.scala (10 additions, 2 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/WriterContainer.scala (9 additions, 11 deletions)
- sql/core/src/test/java/test/org/apache/spark/sql/JavaDataFrameSuite.java (2 additions, 2 deletions)
- sql/core/src/test/java/test/org/apache/spark/sql/JavaDatasetSuite.java (19 additions, 14 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/DataFrameSuite.scala (12 additions, 9 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/DatasetSuite.scala (2 additions, 2 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala (1 addition, 1 deletion)
- sql/hive/src/test/scala/org/apache/spark/sql/sources/BucketedWriteSuite.scala (6 additions, 5 deletions)