-
- Downloads
[SPARK-20978][SQL] Bump up Univocity version to 2.5.4
## What changes were proposed in this pull request? There was a bug in Univocity Parser that causes the issue in SPARK-20978. This was fixed as below: ```scala val df = spark.read.schema("a string, b string, unparsed string").option("columnNameOfCorruptRecord", "unparsed").csv(Seq("a").toDS()) df.show() ``` **Before** ``` java.lang.NullPointerException at scala.collection.immutable.StringLike$class.stripLineEnd(StringLike.scala:89) at scala.collection.immutable.StringOps.stripLineEnd(StringOps.scala:29) at org.apache.spark.sql.execution.datasources.csv.UnivocityParser.org$apache$spark$sql$execution$datasources$csv$UnivocityParser$$getCurrentInput(UnivocityParser.scala:56) at org.apache.spark.sql.execution.datasources.csv.UnivocityParser$$anonfun$org$apache$spark$sql$execution$datasources$csv$UnivocityParser$$convert$1.apply(UnivocityParser.scala:207) at org.apache.spark.sql.execution.datasources.csv.UnivocityParser$$anonfun$org$apache$spark$sql$execution$datasources$csv$UnivocityParser$$convert$1.apply(UnivocityParser.scala:207) ... ``` **After** ``` +---+----+--------+ | a| b|unparsed| +---+----+--------+ | a|null| a| +---+----+--------+ ``` It was fixed in 2.5.0 and 2.5.4 was released. I guess it'd be safe to upgrade this. ## How was this patch tested? Unit test added in `CSVSuite.scala`. Author: hyukjinkwon <gurwls223@gmail.com> Closes #19113 from HyukjinKwon/bump-up-univocity.
Showing
- dev/deps/spark-deps-hadoop-2.6 1 addition, 1 deletiondev/deps/spark-deps-hadoop-2.6
- dev/deps/spark-deps-hadoop-2.7 1 addition, 1 deletiondev/deps/spark-deps-hadoop-2.7
- sql/core/pom.xml 1 addition, 1 deletionsql/core/pom.xml
- sql/core/src/test/scala/org/apache/spark/sql/execution/datasources/csv/CSVSuite.scala 8 additions, 0 deletions...apache/spark/sql/execution/datasources/csv/CSVSuite.scala
Loading
Please register or sign in to comment