-
- Downloads
[SPARK-20908][SQL] Cache Manager: Hint should be ignored in plan matching
### What changes were proposed in this pull request? In Cache manager, the plan matching should ignore Hint. ```Scala val df1 = spark.range(10).join(broadcast(spark.range(10))) df1.cache() spark.range(10).join(spark.range(10)).explain() ``` The output plan of the above query shows that the second query is not using the cached data of the first query. ``` BroadcastNestedLoopJoin BuildRight, Inner :- *Range (0, 10, step=1, splits=2) +- BroadcastExchange IdentityBroadcastMode +- *Range (0, 10, step=1, splits=2) ``` After the fix, the plan becomes ``` InMemoryTableScan [id#20L, id#23L] +- InMemoryRelation [id#20L, id#23L], true, 10000, StorageLevel(disk, memory, deserialized, 1 replicas) +- BroadcastNestedLoopJoin BuildRight, Inner :- *Range (0, 10, step=1, splits=2) +- BroadcastExchange IdentityBroadcastMode +- *Range (0, 10, step=1, splits=2) ``` ### How was this patch tested? Added a test. Author: Xiao Li <gatorsmile@gmail.com> Closes #18131 from gatorsmile/HintCache.
Showing
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/hints.scala 2 additions, 0 deletions...a/org/apache/spark/sql/catalyst/plans/logical/hints.scala
- sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/plans/SameResultSuite.scala 7 additions, 1 deletion...org/apache/spark/sql/catalyst/plans/SameResultSuite.scala
Please register or sign in to comment