[SPARK-3739] [SQL] Update the split num based on block size for table scanning
In local mode, Hadoop/Hive ignores "mapred.map.tasks", so a small table file always becomes a single input split. Spark SQL, however, does not honor this during table scanning, which yields different results in the Hive compatibility tests. This PR fixes that.

Author: Cheng Hao <hao.cheng@intel.com>

Closes #2589 from chenghao-intel/source_split and squashes the following commits:

dff38e7 [Cheng Hao] Remove the extra blank line
160a2b6 [Cheng Hao] fix the compiling bug
04d67f7 [Cheng Hao] Keep 1 split for small file in table scanning
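For context, the gist of the change is to let the Hadoop InputFormat split purely by block size in local mode, instead of forcing a minimum split count derived from "mapred.map.tasks". The Scala sketch below illustrates that idea under stated assumptions; the helper name `minSplitsForTableScan` and its parameters are hypothetical, not the actual code in TableReader.scala:

```scala
import org.apache.hadoop.mapred.JobConf

// Illustrative helper (hypothetical name): choose the minimum split count
// used when building the HadoopRDD for a Hive table scan.
def minSplitsForTableScan(conf: JobConf,
                          isLocal: Boolean,
                          defaultMinPartitions: Int): Int = {
  if (isLocal) {
    // 0 lets the InputFormat split by block size alone, so a small
    // table file stays a single split, matching Hive's local-mode
    // behavior.
    0
  } else {
    // In cluster mode, honor "mapred.map.tasks" but never go below
    // Spark's default minimum number of partitions.
    math.max(conf.getInt("mapred.map.tasks", 1), defaultMinPartitions)
  }
}
```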
Showing 3 changed files with 517 additions and 5 deletions
- sql/hive/src/main/scala/org/apache/spark/sql/hive/TableReader.scala (9 additions, 4 deletions)
- sql/hive/src/test/resources/golden/file_split_for_small_table-0-7a45831bf96814d9a7fc3d78fb7bd8dc (500 additions, 0 deletions)
- sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveTableScanSuite.scala (8 additions, 1 deletion)