From ecc16220d2d9eace81de44c4b0aff1c364a35e3f Mon Sep 17 00:00:00 2001 From: Dongjoon Hyun <dongjoon@apache.org> Date: Sat, 7 Jan 2017 18:55:01 -0800 Subject: [PATCH] [SPARK-18941][SQL][DOC] Add a new behavior document on `CREATE/DROP TABLE` with `LOCATION` ## What changes were proposed in this pull request? This PR adds a new behavior change description on `CREATE TABLE ... LOCATION` at `sql-programming-guide.md` clearly under `Upgrading From Spark SQL 1.6 to 2.0`. This change is introduced at Apache Spark 2.0.0 as [SPARK-15276](https://issues.apache.org/jira/browse/SPARK-15276). ## How was this patch tested? ``` SKIP_API=1 jekyll build ``` **Newly Added Description** <img width="913" alt="new" src="https://cloud.githubusercontent.com/assets/9700541/21743606/7efe2b12-d4ba-11e6-8a0d-551222718ea2.png"> Author: Dongjoon Hyun <dongjoon@apache.org> Closes #16400 from dongjoon-hyun/SPARK-18941. (cherry picked from commit 923e594844a7ad406195b91877f0fb374d5a454b) Signed-off-by: gatorsmile <gatorsmile@gmail.com> --- docs/sql-programming-guide.md | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/docs/sql-programming-guide.md b/docs/sql-programming-guide.md index 58de0e1318..fb3c6a7d43 100644 --- a/docs/sql-programming-guide.md +++ b/docs/sql-programming-guide.md @@ -1358,6 +1358,14 @@ options. - Dataset and DataFrame API `explode` has been deprecated, alternatively, use `functions.explode()` with `select` or `flatMap` - Dataset and DataFrame API `registerTempTable` has been deprecated and replaced by `createOrReplaceTempView` + - Changes to `CREATE TABLE ... LOCATION` behavior for Hive tables. + - From Spark 2.0, `CREATE TABLE ... LOCATION` is equivalent to `CREATE EXTERNAL TABLE ... LOCATION` + in order to prevent accidental dropping the existing data in the user-provided locations. + That means, a Hive table created in Spark SQL with the user-specified location is always a Hive external table. + Dropping external tables will not remove the data. Users are not allowed to specify the location for Hive managed tables. + Note that this is different from the Hive behavior. + - As a result, `DROP TABLE` statements on those tables will not remove the data. + ## Upgrading From Spark SQL 1.5 to 1.6 - From Spark 1.6, by default the Thrift server runs in multi-session mode. Which means each JDBC/ODBC -- GitLab