[SPARK-2060][SQL] Querying JSON Datasets with SQL and DSL in Spark SQL
JIRA: https://issues.apache.org/jira/browse/SPARK-2060

Programming guide: http://yhuai.github.io/site/sql-programming-guide.html

Scala doc of SQLContext: http://yhuai.github.io/site/api/scala/index.html#org.apache.spark.sql.SQLContext

Author: Yin Huai <huai@cse.ohio-state.edu>

Closes #999 from yhuai/newJson and squashes the following commits:

227e89e [Yin Huai] Merge remote-tracking branch 'upstream/master' into newJson
ce8eedd [Yin Huai] rxin's comments.
bc9ac51 [Yin Huai] Merge remote-tracking branch 'upstream/master' into newJson
94ffdaa [Yin Huai] Remove "get" from method names.
ce31c81 [Yin Huai] Merge remote-tracking branch 'upstream/master' into newJson
e2773a6 [Yin Huai] Merge remote-tracking branch 'upstream/master' into newJson
79ea9ba [Yin Huai] Fix typos.
5428451 [Yin Huai] Newline
1f908ce [Yin Huai] Remove extra line.
d7a005c [Yin Huai] Merge remote-tracking branch 'upstream/master' into newJson
7ea750e [Yin Huai] marmbrus's comments.
6a5f5ef [Yin Huai] Merge remote-tracking branch 'upstream/master' into newJson
83013fb [Yin Huai] Update Java Example.
e7a6c19 [Yin Huai] SchemaRDD.javaToPython should convert a field with the StructType to a Map.
6d20b85 [Yin Huai] Merge remote-tracking branch 'upstream/master' into newJson
4fbddf0 [Yin Huai] Programming guide.
9df8c5a [Yin Huai] Python API.
7027634 [Yin Huai] Java API.
cff84cc [Yin Huai] Use a SchemaRDD for a JSON dataset.
d0bd412 [Yin Huai] Merge remote-tracking branch 'upstream/master' into newJson
ab810b0 [Yin Huai] Make JsonRDD private.
6df0891 [Yin Huai] Apache header.
8347f2e [Yin Huai] Merge remote-tracking branch 'upstream/master' into newJson
66f9e76 [Yin Huai] Update docs and use the entire dataset to infer the schema.
8ffed79 [Yin Huai] Update the example.
a5a4b52 [Yin Huai] Merge remote-tracking branch 'upstream/master' into newJson
4325475 [Yin Huai] If a sampled dataset is used for schema inferring, update the schema of the JsonTable after first execution.
65b87f0 [Yin Huai] Fix sampling...
8846af5 [Yin Huai] API doc.
52a2275 [Yin Huai] Merge remote-tracking branch 'upstream/master' into newJson
0387523 [Yin Huai] Address PR comments.
666b957 [Yin Huai] Merge remote-tracking branch 'upstream/master' into newJson
a2313a6 [Yin Huai] Address PR comments.
f3ce176 [Yin Huai] After type conflict resolution, if a NullType is found, StringType is used.
0576406 [Yin Huai] Add Apache license header.
af91b23 [Yin Huai] Merge remote-tracking branch 'upstream/master' into newJson
f45583b [Yin Huai] Infer the schema of a JSON dataset (a text file with one JSON object per line or a RDD[String] with one JSON object per string) and returns a SchemaRDD.
f31065f [Yin Huai] A query plan or a SchemaRDD can print out its schema.
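
The commit message describes inferring a schema from one JSON object per line, resolving type conflicts between records, and falling back to StringType when only NullType was seen. The sketch below illustrates that idea in plain Python using only the standard library. It is not the code in JsonRDD.scala: the type names, the `merge` rules, and the helper names (`type_of`, `merge`, `infer_schema`) are assumptions made for illustration, and nested objects and arrays are deliberately ignored.

```python
import json

# Hypothetical, simplified type names; Spark SQL's real type system is richer.
NULL, BOOLEAN, INTEGER, DOUBLE, STRING = "null", "boolean", "integer", "double", "string"


def type_of(value):
    """Map a parsed JSON value to one of the simplified type names."""
    if value is None:
        return NULL
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return BOOLEAN
    if isinstance(value, int):
        return INTEGER
    if isinstance(value, float):
        return DOUBLE
    # Anything else (strings, and in this sketch also nested objects and
    # arrays) is treated as a string.
    return STRING


def merge(a, b):
    """Resolve two types observed for the same field (assumed rules)."""
    if a == b:
        return a
    if a == NULL:
        return b
    if b == NULL:
        return a
    if {a, b} == {INTEGER, DOUBLE}:
        return DOUBLE  # widen integer to double
    return STRING      # any other conflict falls back to string


def infer_schema(lines):
    """Infer {field: type} from an iterable of strings, one JSON object each."""
    schema = {}
    for line in lines:
        for field, value in json.loads(line).items():
            observed = type_of(value)
            schema[field] = merge(schema[field], observed) if field in schema else observed
    # After conflict resolution, a field that was only ever null becomes string.
    return {field: STRING if t == NULL else t for field, t in schema.items()}


# Made-up sample records, not taken from the commit.
records = [
    '{"id": 1, "note": null}',
    '{"id": 2.5, "note": null, "label": "x"}',
]
print(infer_schema(records))
```

Here `id` is seen as an integer and then a double, so it widens to `double`, while `note` is null in every record and therefore ends up as `string`.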
Showing
- .rat-excludes 1 addition, 0 deletions
- docs/sql-programming-guide.md 222 additions, 68 deletions
- examples/src/main/java/org/apache/spark/examples/sql/JavaSparkSQL.java 76 additions, 2 deletions
- examples/src/main/resources/people.json 3 additions, 0 deletions
- project/SparkBuild.scala 18 additions, 4 deletions
- python/pyspark/sql.py 62 additions, 2 deletions
- sql/catalyst/pom.xml 28 additions, 0 deletions
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/HiveTypeCoercion.scala 17 additions, 8 deletions
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala 51 additions, 0 deletions
- sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/CombiningLimitsSuite.scala 2 additions, 1 deletion
- sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/ConstantFoldingSuite.scala 2 additions, 1 deletion
- sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/FilterPushdownSuite.scala 2 additions, 3 deletions
- sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/optimizer/SimplifyCaseConversionExpressionsSuite.scala 2 additions, 1 deletion
- sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/plans/PlanTest.scala 4 additions, 5 deletions
- sql/core/pom.xml 12 additions, 0 deletions
- sql/core/src/main/scala/org/apache/spark/sql/SQLContext.scala 38 additions, 7 deletions
- sql/core/src/main/scala/org/apache/spark/sql/SchemaRDD.scala 32 additions, 6 deletions
- sql/core/src/main/scala/org/apache/spark/sql/SchemaRDDLike.scala 6 additions, 0 deletions
- sql/core/src/main/scala/org/apache/spark/sql/api/java/JavaSQLContext.scala 20 additions, 0 deletions
- sql/core/src/main/scala/org/apache/spark/sql/json/JsonRDD.scala 397 additions, 0 deletions
examples/src/main/resources/people.json
0 → 100644