[SPARK-17072][SQL] support table-level statistics generation and storing into/loading from metastore

## What changes were proposed in this pull request?

1. Support generating table-level statistics for:
   - Hive tables in HiveExternalCatalog
   - data source tables in HiveExternalCatalog
   - data source tables in InMemoryCatalog
2. Add a property "catalogStats" to CatalogTable to hold statistics on the Spark side.
3. Put the logic for converting statistics between Spark and Hive in HiveClientImpl.
4. Extend the Statistics class by adding rowCount (estimatedSize will be added once column stats are available).

## How was this patch tested?

Added unit tests.

Author: wangzhenhua <wangzhenhua@huawei.com>
Author: Zhenhua Wang <wangzhenhua@huawei.com>

Closes #14712 from wzhfy/tableStats.
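
To make items 2 to 4 of the description concrete, here is a minimal, self-contained sketch of the idea. It is not the code from this patch. `TableStats`, `TableMeta`, `StatsConversion`, `toProps`, and `fromProps` are hypothetical stand-ins for Statistics, CatalogTable, and the conversion logic the description places in HiveClientImpl. The sketch assumes the statistics are stored as table properties under the keys `totalSize` and `numRows`, which are the names Hive's metastore conventionally uses.

```scala
// Simplified stand-ins for illustration only; these are NOT the Spark classes.
// TableStats mirrors the idea of Statistics gaining an optional rowCount.
final case class TableStats(sizeInBytes: BigInt, rowCount: Option[BigInt] = None)

// TableMeta mirrors the idea of CatalogTable holding Spark-side statistics
// in a "catalogStats" property (field name taken from the PR description).
final case class TableMeta(
    name: String,
    properties: Map[String, String] = Map.empty,
    catalogStats: Option[TableStats] = None)

object StatsConversion {
  // Assumed property keys, following Hive's conventional names.
  val TotalSize = "totalSize"
  val NumRows = "numRows"

  // Spark-side statistics -> metastore table properties.
  // rowCount is written only when it is known.
  def toProps(stats: TableStats): Map[String, String] =
    Map(TotalSize -> stats.sizeInBytes.toString) ++
      stats.rowCount.map(n => NumRows -> n.toString).toList

  // Metastore table properties -> Spark-side statistics.
  // Returns None when no size has been recorded.
  def fromProps(props: Map[String, String]): Option[TableStats] =
    props.get(TotalSize).map { size =>
      TableStats(BigInt(size), props.get(NumRows).map(n => BigInt(n)))
    }
}
```

A round trip through `toProps` and `fromProps` should give back the original statistics, which is the property the "storing into/loading from metastore" part of the title relies on. In Spark SQL, statistics of this kind are normally gathered with an `ANALYZE TABLE ... COMPUTE STATISTICS` statement.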
Showing 16 changed files:

- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala (3 additions, 1 deletion)
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/Statistics.scala (14 additions, 1 deletion)
- sql/core/src/main/scala/org/apache/spark/sql/catalyst/SQLBuilder.scala (6 additions, 2 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/SparkSqlParser.scala (1 addition, 3 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzeTableCommand.scala (42 additions, 22 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala (5 additions, 3 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/FileSourceStrategy.scala (1 addition, 1 deletion)
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/LogicalRelation.scala (7 additions, 6 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala (4 additions, 4 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/internal/SessionState.scala (2 additions, 2 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/StatisticsSuite.scala (26 additions, 0 deletions)
- sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala (47 additions, 10 deletions)
- sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala (3 additions, 7 deletions)
- sql/hive/src/main/scala/org/apache/spark/sql/hive/MetastoreRelation.scala (35 additions, 33 deletions)
- sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala (152 additions, 1 deletion)
- sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala (15 additions, 12 deletions)