[SPARK-19919][SQL] Defer throwing the exception for empty paths in CSV datasource into `DataSource`
## What changes were proposed in this pull request?

This PR proposes to defer throwing the exception into `DataSource`. Currently, if the other datasources fail to infer the schema, they return `None`, and this is validated in `DataSource` as below:

```
scala> spark.read.json("emptydir")
org.apache.spark.sql.AnalysisException: Unable to infer schema for JSON. It must be specified manually.;
```

```
scala> spark.read.orc("emptydir")
org.apache.spark.sql.AnalysisException: Unable to infer schema for ORC. It must be specified manually.;
```

```
scala> spark.read.parquet("emptydir")
org.apache.spark.sql.AnalysisException: Unable to infer schema for Parquet. It must be specified manually.;
```

The CSV datasource, however, performs this check within its own implementation and throws a different exception message:

```
scala> spark.read.csv("emptydir")
java.lang.IllegalArgumentException: requirement failed: Cannot infer schema from an empty set of files
```

We can remove this duplicated check and validate it in one place, in the same way and with the same message.

## How was this patch tested?

Unit test in `CSVSuite` and manual test.

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #17256 from HyukjinKwon/SPARK-19919.
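For illustration, here is a minimal, self-contained sketch of the pattern this change moves CSV to: schema inference returns `None` instead of throwing, and a single validation point produces the uniform `AnalysisException` message. This is plain Scala with no Spark dependency; `FileFormat`, `CsvLikeFormat`, `DataSourceSketch`, and the stand-in `StructType` and `AnalysisException` types are simplified assumptions for the sketch, not Spark's actual APIs.

```scala
// Stand-ins for Spark's types so the sketch compiles on its own.
case class StructType(fieldNames: Seq[String])
class AnalysisException(message: String) extends Exception(message)

trait FileFormat {
  def name: String
  // Returns None when no schema can be inferred (e.g. an empty set of files).
  def inferSchema(files: Seq[String]): Option[StructType]
}

object CsvLikeFormat extends FileFormat {
  val name = "CSV"
  def inferSchema(files: Seq[String]): Option[StructType] =
    if (files.isEmpty) None            // deferred: no exception thrown here
    else Some(StructType(Seq("_c0")))  // actual inference elided
}

object DataSourceSketch {
  // Single validation point, analogous to DataSource: every format that
  // returns None yields the same AnalysisException message.
  def resolveSchema(format: FileFormat, files: Seq[String]): StructType =
    format.inferSchema(files).getOrElse {
      throw new AnalysisException(
        s"Unable to infer schema for ${format.name}. It must be specified manually.")
    }
}
```

With this shape, `DataSourceSketch.resolveSchema(CsvLikeFormat, Seq.empty)` throws "Unable to infer schema for CSV. It must be specified manually.", matching the message the JSON, ORC, and Parquet paths already produce.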
Showing 3 changed files:
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVDataSource.scala (18 additions, 7 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/CSVFileFormat.scala (1 addition, 3 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/test/DataFrameReaderWriterSuite.scala (4 additions, 2 deletions)