[SPARK-16575][CORE] partition calculation mismatch with sc.binaryFiles
## What changes were proposed in this pull request?

This pull request fixes the critical bug SPARK-16575. It corrects the partition calculation in BinaryFileRDD: an RDD created with sc.binaryFiles always consisted of only two partitions.

## How was this patch tested?

The original issue (getNumPartitions on a binary-files RDD always returning two partitions) was first reproduced and then re-tested with the changes applied. The unit tests were also run and passed.

This contribution is my original work and I license the work to the project under the project's open source license.

srowen hvanhovell rxin vanzin skyluc kmader zsxwing datafarmer: please have a look.

Author: fidato <fidato.july13@gmail.com>

Closes #15327 from fidato13/SPARK-16575.
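The symptom described in this pull request can be checked with a short standalone program. The following is a minimal sketch, not part of the patch: the object name `BinaryFilesPartitionCheck` and the default input path are hypothetical, and it assumes Spark is on the classpath and that the directory contains several binary files. It only uses `sc.binaryFiles` and `getNumPartitions`, the two calls named in the description.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical helper for reproducing SPARK-16575; not part of the patch itself.
object BinaryFilesPartitionCheck {
  def main(args: Array[String]): Unit = {
    // Assumed input directory; pass a real path as the first argument.
    val inputPath = args.headOption.getOrElse("/tmp/binary-input")

    val conf = new SparkConf()
      .setAppName("BinaryFilesPartitionCheck")
      // Fall back to a local master with 4 threads when none is supplied.
      .setIfMissing("spark.master", "local[4]")

    val sc = new SparkContext(conf)
    try {
      // Build the RDD the same way the bug report does.
      val rdd = sc.binaryFiles(inputPath)
      // Before the fix, the report states this was always 2, regardless of input size.
      println(s"binaryFiles partitions = ${rdd.getNumPartitions}")
    } finally {
      sc.stop()
    }
  }
}
```

Running this against the same directory before and after applying the patch gives a direct comparison of the partition counts.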
Changed files:
- core/src/main/scala/org/apache/spark/input/PortableDataStream.scala (11 additions, 3 deletions)
- core/src/main/scala/org/apache/spark/internal/config/package.scala (13 additions, 0 deletions)
- core/src/main/scala/org/apache/spark/rdd/BinaryFileRDD.scala (2 additions, 2 deletions)
- docs/configuration.md (16 additions, 0 deletions)