[SPARK-21137][CORE] Spark reads many small files slowly
## What changes were proposed in this pull request?

Parallelize FileInputFormat.listStatus in Hadoop API via LIST_STATUS_NUM_THREADS to speed up examination of file sizes for wholeTextFiles et al.

## How was this patch tested?

Existing tests, which will exercise the key path here: using a local file system.

Author: Sean Owen <sowen@cloudera.com>

Closes #18441 from srowen/SPARK-21137.
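
For readers unfamiliar with the idea, here is a minimal sketch of why listing with several threads helps when many directories and small files must be examined. It is not the Hadoop or Spark code touched by this patch: it uses only `java.nio.file` against a local file system, and the names `ParallelListing`, `fileSizes`, `totalSize` and `numThreads` are hypothetical. In Hadoop, the number of listing threads is read from the configuration key behind `FileInputFormat.LIST_STATUS_NUM_THREADS` (`mapreduce.input.fileinputformat.list-status.num-threads`); here it is just a method argument.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Stream;

// Hypothetical illustration only; not the Hadoop FileInputFormat implementation.
final class ParallelListing {

    /** Sizes of the regular files directly under one directory (non-recursive). */
    static List<Long> fileSizes(Path dir) {
        try (Stream<Path> entries = Files.list(dir)) {
            List<Long> sizes = new ArrayList<>();
            for (Path p : (Iterable<Path>) entries::iterator) {
                if (Files.isRegularFile(p)) {
                    sizes.add(Files.size(p));
                }
            }
            return sizes;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Lists each directory as a separate task on a fixed pool and sums all file sizes. */
    static long totalSize(List<Path> dirs, int numThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(numThreads);
        try {
            // Submit one listing task per directory so slow listings overlap.
            List<Future<List<Long>>> pending = new ArrayList<>();
            for (Path dir : dirs) {
                pending.add(pool.submit(() -> fileSizes(dir)));
            }
            // Collect results in submission order.
            long total = 0L;
            for (Future<List<Long>> f : pending) {
                for (long size : f.get()) {
                    total += size;
                }
            }
            return total;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

With `numThreads` set to 1 the listings run one after another; a larger value lets them overlap, which matters most when each listing call has noticeable latency.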