[SPARK-16792][SQL] Dataset containing a Case Class with a List type causes a CompileException (converting sequence to list)

## What changes were proposed in this pull request?

Added a `to` call at the end of the code generated by `ScalaReflection.deserializerFor` if the requested type is not a supertype of `WrappedArray[_]` that uses `CanBuildFrom[_, _, _]` to convert result into an arbitrary subtype of `Seq[_]`.

Care was taken to preserve the original deserialization where it is possible to avoid the overhead of conversion in cases where it is not needed.

`ScalaReflection.serializerFor` could already be used to serialize any `Seq[_]` so it was not altered.

`SQLImplicits` had to be altered and new implicit encoders added to permit serialization of other sequence types.

Also fixes [SPARK-16815] Dataset[List[T]] leads to ArrayStoreException.

## How was this patch tested?

```bash
./build/mvn -DskipTests clean package && ./dev/run-tests
```

Also manual execution of the following sets of commands in the Spark shell:

```scala
case class TestCC(key: Int, letters: List[String])

val ds1 = sc.makeRDD(Seq(
  (List("D")),
  (List("S","H")),
  (List("F","H")),
  (List("D","L","L"))
)).map(x=>(x.length,x)).toDF("key","letters").as[TestCC]

val test1=ds1.map{_.key}
test1.show
```

```scala
case class X(l: List[String])
spark.createDataset(Seq(List("A"))).map(X).show
```

```scala
spark.sqlContext.createDataset(sc.parallelize(List(1) :: Nil)).collect
```

After adding arbitrary sequence support also tested with the following commands:

```scala
case class QueueClass(q: scala.collection.immutable.Queue[Int])

spark.createDataset(Seq(List(1,2,3))).map(x => QueueClass(scala.collection.immutable.Queue(x: _*))).map(_.q.dequeue).collect
```

Author: Michal Senkyr <mike.senkyr@gmail.com>

Closes #16240 from michalsenkyr/sql-caseclass-list-fix.
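For readers unfamiliar with the conversion described above, here is a minimal standalone sketch of what a `to` call backed by `CanBuildFrom` does to a `WrappedArray`. It is not the code Spark generates. It assumes Scala 2.11/2.12 (`CanBuildFrom` was removed in Scala 2.13), and the object name `SeqConversionSketch` and its fields are hypothetical names chosen for illustration.

```scala
// Assumption: Scala 2.11/2.12, where `to[Col]` resolves an implicit CanBuildFrom.
import scala.collection.immutable.Queue
import scala.collection.mutable.WrappedArray

// Hypothetical illustration object; not part of Spark.
object SeqConversionSketch {
  // Stand-in for the array-backed sequence a deserializer might produce.
  val raw: WrappedArray[String] = Array("S", "H")

  // A supertype of WrappedArray needs no conversion: plain upcast, no copy.
  val asSeq: Seq[String] = raw

  // Other Seq subtypes are built through the implicit CanBuildFrom
  // found in the target collection's companion object.
  val asList: List[String] = raw.to[List]
  val asQueue: Queue[String] = raw.to[Queue]
}
```

The upcast to `Seq[String]` mirrors the case where the original deserialization is preserved, while the `List` and `Queue` lines mirror the case where the extra conversion step is required.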
Changed files:
- sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/ScalaReflection.scala (39 additions, 1 deletion)
- sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/ScalaReflectionSuite.scala (31 additions, 0 deletions)
- sql/core/src/main/scala/org/apache/spark/sql/SQLImplicits.scala (94 additions, 21 deletions)
- sql/core/src/test/scala/org/apache/spark/sql/DatasetPrimitiveSuite.scala (67 additions, 0 deletions)