-
- Downloads
[SPARK-10947] [SQL] With schema inference from JSON into a Dataframe, add...
[SPARK-10947] [SQL] With schema inference from JSON into a Dataframe, add option to infer all primitive object types as strings Currently, when a schema is inferred from a JSON file using sqlContext.read.json, the primitive object types are inferred as string, long, boolean, etc. However, if the inferred type is too specific (JSON obviously does not enforce types itself), this can cause issues with merging dataframe schemas. This pull request adds the option "primitivesAsString" to the JSON DataFrameReader which when true (defaults to false if not set) will infer all primitives as strings. Below is an example usage of this new functionality. ``` val jsonDf = sqlContext.read.option("primitivesAsString", "true").json(sampleJsonFile) scala> jsonDf.printSchema() root |-- bigInteger: string (nullable = true) |-- boolean: string (nullable = true) |-- double: string (nullable = true) |-- integer: string (nullable = true) |-- long: string (nullable = true) |-- null: string (nullable = true) |-- string: string (nullable = true) ``` Author: Stephen De Gennaro <stepheng@realitymine.com> Closes #9249 from stephend-realitymine/stephend-primitives.
Showing
- sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala 9 additions, 1 deletion...src/main/scala/org/apache/spark/sql/DataFrameReader.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/InferSchema.scala 14 additions, 6 deletions...he/spark/sql/execution/datasources/json/InferSchema.scala
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/json/JSONRelation.scala 12 additions, 2 deletions...e/spark/sql/execution/datasources/json/JSONRelation.scala
- sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/json/JsonSuite.scala 136 additions, 2 deletions...ache/spark/sql/execution/datasources/json/JsonSuite.scala
Loading
Please register or sign in to comment