-
- Downloads
SPARK-4687. Add a recursive option to the addFile API
This adds a recursive option to the addFile API to satisfy Hive's needs. It only allows specifying HDFS dirs that will be copied down on every executor. There are a couple outstanding questions. * Should we allow specifying local dirs as well? The best way to do this would probably be to archive them. The drawback is that it would require a fair bit of code that I don't know of any current use cases for. * The addFiles implementation has a caching component that I don't entirely understand. What events are we caching between? AFAICT it's users calling addFile on the same file in the same app at different times? Do we want/need to add something similar for addDirectory. * The addFiles implementation will check to see if an added file already exists and has the same contents. I imagine we want the same behavior, so planning to add this unless people think otherwise. I plan to add some tests if people are OK with the approach. Author: Sandy Ryza <sandy@cloudera.com> Closes #3670 from sryza/sandy-spark-4687 and squashes the following commits: f9fc77f [Sandy Ryza] Josh's comments 70cd24d [Sandy Ryza] Add another test 13da824 [Sandy Ryza] Revert executor changes 38bf94d [Sandy Ryza] Marcelo's comments ca83849 [Sandy Ryza] Add addFile test 1941be3 [Sandy Ryza] Fix test and avoid HTTP server in local mode 31f15a9 [Sandy Ryza] Use cache recursively and fix some compile errors 0239c3d [Sandy Ryza] Change addDirectory to addFile with recursive 46fe70a [Sandy Ryza] SPARK-4687. Add a addDirectory API
Showing
- core/src/main/scala/org/apache/spark/SparkContext.scala 56 additions, 12 deletionscore/src/main/scala/org/apache/spark/SparkContext.scala
- core/src/main/scala/org/apache/spark/util/Utils.scala 73 additions, 15 deletionscore/src/main/scala/org/apache/spark/util/Utils.scala
- core/src/test/scala/org/apache/spark/SparkContextSuite.scala 77 additions, 0 deletionscore/src/test/scala/org/apache/spark/SparkContextSuite.scala
- core/src/test/scala/org/apache/spark/util/UtilsSuite.scala 31 additions, 0 deletionscore/src/test/scala/org/apache/spark/util/UtilsSuite.scala
Loading
Please register or sign in to comment