-
- Downloads
[SPARK-21263][SQL] Do not allow partially parsing double and floats via NumberFormat in CSV
## What changes were proposed in this pull request? This PR proposes to remove `NumberFormat.parse` use to disallow a case of partially parsed data. For example, ``` scala> spark.read.schema("a DOUBLE").option("mode", "FAILFAST").csv(Seq("10u12").toDS).show() +----+ | a| +----+ |10.0| +----+ ``` ## How was this patch tested? Unit tests added in `UnivocityParserSuite` and `CSVSuite`. Author: hyukjinkwon <gurwls223@gmail.com> Closes #18532 from HyukjinKwon/SPARK-21263.
Showing
- sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParser.scala 2 additions, 6 deletions...spark/sql/execution/datasources/csv/UnivocityParser.scala
- sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala 21 additions, 0 deletions...apache/spark/sql/execution/datasources/csv/CSVSuite.scala
- sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/UnivocityParserSuite.scala 11 additions, 10 deletions.../sql/execution/datasources/csv/UnivocityParserSuite.scala
Loading
Please register or sign in to comment