[SPARK-16101][SQL] Refactoring CSV read path to be consistent with JSON data source
## What changes were proposed in this pull request?

This PR refactors the CSV read path to be consistent with the JSON data source. It makes the methods in the CSV classes take arguments consistent with their JSON counterparts.

`UnivocityParser` and `JacksonParser`

``` scala
private[csv] class UnivocityParser(
    schema: StructType,
    requiredSchema: StructType,
    options: CSVOptions) extends Logging {
  ...
  def parse(input: String): Seq[InternalRow] = {
  ...
```

``` scala
class JacksonParser(
    schema: StructType,
    columnNameOfCorruptRecord: String,
    options: JSONOptions) extends Logging {
  ...
  def parse(input: String): Option[InternalRow] = {
  ...
```

These allow parsing an iterator (`String` to `InternalRow`) in the same way for both JSON and CSV:

```scala
iter.flatMap(parser.parse)
```

## How was this patch tested?

Existing tests should cover this.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #16669 from HyukjinKwon/SPARK-16101-read.
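To illustrate why the two `parse` signatures compose with the same call, here is a minimal, self-contained sketch. It does not use the Spark classes: `Row`, `ToyCsvParser`, `ToyJsonLikeParser`, and `FlatMapDemo` are hypothetical stand-ins that only mirror the return shapes (`Seq[...]` and `Option[...]`) described above, and the rule that blank or malformed lines are skipped is an assumption made for the example, not a statement about Spark's behavior.

```scala
// Hypothetical stand-in for InternalRow; not a Spark type.
final case class Row(values: Seq[String])

// Mirrors the shape of UnivocityParser.parse: String => Seq[Row].
// Assumption for this sketch: a blank line yields no rows.
class ToyCsvParser(delimiter: Char) {
  def parse(input: String): Seq[Row] =
    if (input.trim.isEmpty) Seq.empty
    else Seq(Row(input.split(delimiter).map(_.trim).toSeq))
}

// Mirrors the shape of JacksonParser.parse: String => Option[Row].
// Assumption for this sketch: anything not wrapped in braces is dropped.
class ToyJsonLikeParser {
  def parse(input: String): Option[Row] = {
    val trimmed = input.trim
    if (trimmed.startsWith("{") && trimmed.endsWith("}")) Some(Row(Seq(trimmed)))
    else None
  }
}

object FlatMapDemo {
  // Both helpers use the identical expression `lines.flatMap(parser.parse)`;
  // flatMap flattens a Seq result and an Option result alike.
  def csvRows(lines: Iterator[String]): List[Row] = {
    val parser = new ToyCsvParser(',')
    lines.flatMap(parser.parse).toList
  }

  def jsonRows(lines: Iterator[String]): List[Row] = {
    val parser = new ToyJsonLikeParser
    lines.flatMap(parser.parse).toList
  }
}
```

Because both return types can be flattened by `flatMap`, the calling code does not need to know whether a parser emits zero-or-one rows or zero-or-many rows per input string.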
Showing 7 changed files

- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVFileFormat.scala: 8 additions, 16 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVInferSchema.scala: 0 additions, 118 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVOptions.scala: 19 additions, 1 deletion
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVParser.scala: 0 additions, 60 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVRelation.scala: 3 additions, 95 deletions
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala: 235 additions, 0 deletions
- sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParserSuite.scala: 28 additions, 26 deletions