[SPARK-12649][SQL] support reading bucketed table
This PR adds support for reading bucketed tables and correctly populates `outputPartitioning`, so that we can avoid a shuffle in some cases.

TODO (follow-up PRs):

* bucket pruning
* avoid the shuffle for a bucketed table join when the join uses any superset of the bucketing key (we should revisit this after https://issues.apache.org/jira/browse/SPARK-12704 is fixed)
* recognize Hive bucketed tables

Author: Wenchen Fan <wenchen@databricks.com>

Closes #10604 from cloud-fan/bucket-read.
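As a rough illustration of why a known bucketing layout lets a join skip the shuffle, here is a minimal sketch in plain Scala collections. It is not the code in this PR and does not use Spark's planner or hash function: `BucketedJoinSketch`, `bucketId`, `bucketize`, and `bucketedJoin` are hypothetical names, and the bucket assignment (non-negative modulo of `hashCode`) is an assumption made only for the example.

```scala
// Conceptual sketch only: plain Scala collections, not Spark internals.
object BucketedJoinSketch {
  // Hypothetical bucket assignment: non-negative modulo of the key's hashCode.
  def bucketId(key: Int, numBuckets: Int): Int =
    ((key.hashCode % numBuckets) + numBuckets) % numBuckets

  // "Write" a table as numBuckets buckets, grouped by the bucketing key.
  def bucketize[V](rows: Seq[(Int, V)], numBuckets: Int): Vector[Seq[(Int, V)]] = {
    val grouped = rows.groupBy { case (k, _) => bucketId(k, numBuckets) }
    Vector.tabulate(numBuckets)(i => grouped.getOrElse(i, Seq.empty))
  }

  // Join two tables bucketed the same way: bucket i on the left only needs
  // bucket i on the right, so no rows have to move between buckets.
  def bucketedJoin[A, B](
      left: Vector[Seq[(Int, A)]],
      right: Vector[Seq[(Int, B)]]): Seq[(Int, A, B)] = {
    require(left.length == right.length, "both sides need the same number of buckets")
    left.zip(right).flatMap { case (l, r) =>
      val rightByKey = r.groupBy(_._1)
      for {
        (k, a) <- l
        (_, b) <- rightByKey.getOrElse(k, Seq.empty)
      } yield (k, a, b)
    }
  }

  def main(args: Array[String]): Unit = {
    val users  = bucketize(Seq(1 -> "ann", 2 -> "bob", 3 -> "cy"), numBuckets = 4)
    val orders = bucketize(Seq(1 -> 9.5, 3 -> 2.0, 3 -> 7.25), numBuckets = 4)
    bucketedJoin(users, orders).foreach(println)
  }
}
```

The point of the sketch is that once both sides advertise the same partitioning on the join key, matching rows are already co-located bucket by bucket. That is the property a correctly populated `outputPartitioning` lets the planner rely on.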
Changed files:
- sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala: 1 addition, 0 deletions
- sql/core/src/main/scala/org/apache/spark/sql/SQLConf.scala: 6 additions, 0 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/ExistingRDD.scala: 24 additions, 4 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoHadoopFsRelation.scala: 1 addition, 1 deletion
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ResolvedDataSource.scala: 3 additions, 1 deletion
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/WriterContainer.scala: 1 addition, 1 deletion
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/bucket.scala: 17 additions, 4 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/ddl.scala: 1 addition, 1 deletion
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JSONRelation.scala: 2 additions, 2 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetRelation.scala: 1 addition, 1 deletion
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala: 1 addition, 0 deletions
- sql/core/src/main/scala/org/apache/spark/sql/sources/interfaces.scala: 51 additions, 4 deletions
- sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala: 2 additions, 0 deletions
- sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala: 14 additions, 12 deletions
- sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/commands.scala: 6 additions, 1 deletion
- sql/hive/src/main/scala/org/apache/spark/sql/hive/orc/OrcRelation.scala: 1 addition, 1 deletion
- sql/hive/src/test/scala/org/apache/spark/sql/sources/BucketedReadSuite.scala: 178 additions, 0 deletions
- sql/hive/src/test/scala/org/apache/spark/sql/sources/BucketedWriteSuite.scala: 4 additions, 12 deletions