[SPARK-19765][SPARK-18549][SPARK-19093][SPARK-19736][BACKPORT-2.1][SQL] Backport Three Cache-related PRs to Spark 2.1

### What changes were proposed in this pull request?

Backport a few cache-related PRs:

---

[[SPARK-19093][SQL] Cached tables are not used in SubqueryExpression](https://github.com/apache/spark/pull/16493)

Consider the plans inside subquery expressions when looking up the cache manager, so that cached data is used. Currently `CacheManager.useCachedData` does not consider the subquery expressions in the plan.

---

[[SPARK-19736][SQL] refreshByPath should clear all cached plans with the specified path](https://github.com/apache/spark/pull/17064)

`Catalog.refreshByPath` can refresh the cache entry and the associated metadata for all DataFrames (if any) that contain the given data source path. However, `CacheManager.invalidateCachedPath` doesn't clear all cached plans with the specified path. This causes some strange behaviors reported in SPARK-15678.

---

[[SPARK-19765][SPARK-18549][SQL] UNCACHE TABLE should un-cache all cached plans that refer to this table](https://github.com/apache/spark/pull/17097)

When un-caching a table, we should not only remove the cache entry for this table, but also un-cache any other cached plans that refer to this table. The following commands trigger the table uncache: `DropTableCommand`, `TruncateTableCommand`, `AlterTableRenameCommand`, `UncacheTableCommand`, `RefreshTable` and `InsertIntoHiveTable`.

This PR also includes some refactors:
- use `java.util.LinkedList` to store the cache entries, so that it's safer to remove elements while iterating
- rename `invalidateCache` to `recacheByPlan`, which makes it more obvious what it does.

### How was this patch tested?

N/A

Author: Xiao Li <gatorsmile@gmail.com>

Closes #17319 from gatorsmile/backport-17097.
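The cascading un-cache and the `java.util.LinkedList` refactor described above can be illustrated with a small standalone sketch. It does not use Spark: `Plan`, `CachedEntry`, `ToyCacheManager`, and `uncacheTable` are hypothetical stand-ins invented for this example, and the real `CacheManager` matches logical plans rather than table-name sets. The point is only that removing through the list's iterator is a safe way to drop every entry that refers to a table while walking the cache.

```scala
import java.util.LinkedList

// Hypothetical stand-ins for illustration; these are NOT Spark's classes.
final case class Plan(name: String, referencedTables: Set[String])
final case class CachedEntry(plan: Plan)

class ToyCacheManager {
  // A LinkedList lets us remove entries through its iterator while iterating.
  private val cachedData = new LinkedList[CachedEntry]()

  def cache(plan: Plan): Unit = cachedData.add(CachedEntry(plan))

  // Remove every cached plan that refers to `table`, not just the entry
  // for the table itself. Returns the number of entries removed.
  def uncacheTable(table: String): Int = {
    val it = cachedData.iterator()
    var removed = 0
    while (it.hasNext) {
      val entry = it.next()
      if (entry.plan.referencedTables.contains(table)) {
        it.remove() // safe: removal goes through the iterator
        removed += 1
      }
    }
    removed
  }

  // Names of the plans still cached, in insertion order.
  def cachedNames: List[String] = {
    val names = List.newBuilder[String]
    val it = cachedData.iterator()
    while (it.hasNext) names += it.next().plan.name
    names.result()
  }
}

object ToyCacheDemo {
  def main(args: Array[String]): Unit = {
    val mgr = new ToyCacheManager
    mgr.cache(Plan("t1", Set("t1")))
    mgr.cache(Plan("t1_filtered", Set("t1"))) // derived plan that reads t1
    mgr.cache(Plan("t2", Set("t2")))
    val removed = mgr.uncacheTable("t1")
    println(s"removed=$removed remaining=${mgr.cachedNames}")
    // prints: removed=2 remaining=List(t2)
  }
}
```

In this toy model, un-caching `t1` also drops `t1_filtered` because it refers to `t1`, while `t2` stays cached.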
Changed files:
- sql/core/src/main/scala/org/apache/spark/sql/execution/CacheManager.scala (70 additions, 50 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/columnar/InMemoryRelation.scala (0 additions, 6 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala (1 addition, 2 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/InsertIntoDataSourceCommand.scala (3 additions, 2 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala (9 additions, 14 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/CachedTableSuite.scala (113 additions, 8 deletions)
- sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala (2 additions, 2 deletions)
- sql/hive/src/test/scala/org/apache/spark/sql/hive/CachedTableSuite.scala (1 addition, 3 deletions)
- sql/hive/src/test/scala/org/apache/spark/sql/hive/parquetSuites.scala (1 addition, 1 deletion)