Commit 2deac748 authored by luogankun, committed by Michael Armbrust

[SPARK-4930][SQL][DOCS]Update SQL programming guide, CACHE TABLE is eager

`CACHE TABLE tbl` is now __eager__ by default not __lazy__

Author: luogankun <luogankun@gmail.com>

Closes #3773 from luogankun/SPARK-4930 and squashes the following commits:

cc17b7d [luogankun] [SPARK-4930][SQL][DOCS]Update SQL programming guide, add CACHE [LAZY] TABLE [AS SELECT] ...
bffe0e8 [luogankun] [SPARK-4930][SQL][DOCS]Update SQL programming guide, CACHE TABLE tbl is eager
parent f7a41a0e
@@ -1007,12 +1007,11 @@ let user control table caching explicitly:
CACHE TABLE logs_last_month;
UNCACHE TABLE logs_last_month;
-**NOTE:** `CACHE TABLE tbl` is lazy, similar to `.cache` on an RDD. This command only marks `tbl` to ensure that
-partitions are cached when calculated but doesn't actually cache it until a query that touches `tbl` is executed.
-To force the table to be cached, you may simply count the table immediately after executing `CACHE TABLE`:
+**NOTE:** `CACHE TABLE tbl` is now __eager__ by default not __lazy__. Don’t need to trigger cache materialization manually anymore.
-CACHE TABLE logs_last_month;
-SELECT COUNT(1) FROM logs_last_month;
+Spark SQL newly introduced a statement to let user control table caching whether or not lazy since Spark 1.2.0:
+CACHE [LAZY] TABLE [AS SELECT] ...
Several caching related features are not supported yet:
......
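For readers skimming this change, the following is a minimal sketch of how the documented statements might be used together. It assumes a table named `logs_last_month` is already registered (the name comes from the guide's own example). The table name `recent_errors` and the column `level` are hypothetical and appear only for illustration.

```sql
-- Eager caching (the default behavior described in the updated note):
-- the table is materialized in the cache as part of this statement.
CACHE TABLE logs_last_month;
UNCACHE TABLE logs_last_month;

-- Lazy caching: the table is only marked for caching here...
CACHE LAZY TABLE logs_last_month;
-- ...and is materialized by the first query that scans it.
SELECT COUNT(1) FROM logs_last_month;
UNCACHE TABLE logs_last_month;

-- Caching the result of a query under a new name.
-- `recent_errors` and `level` are hypothetical names, not from the guide.
CACHE TABLE recent_errors AS
SELECT * FROM logs_last_month WHERE level = 'ERROR';
UNCACHE TABLE recent_errors;
```

The `LAZY` variant may be preferable when a table is expensive to scan and might not be queried at all. The eager default removes the need for the `SELECT COUNT(1)` workaround that the old note recommended.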