Commit 28fafa3e authored by Burak Yavuz, committed by Wenchen Fan

[SPARK-17599] Prevent ListingFileCatalog from failing if path doesn't exist

## What changes were proposed in this pull request?

The `ListingFileCatalog` lists files given a set of resolved paths. If a folder is deleted at any time between when the paths are resolved and when the file catalog lists the folder, the Spark job fails. This can abruptly stop long-running jobs, such as Structured Streaming jobs.

Folders may be deleted by users or automatically by retention policies. These cases should not prevent jobs from successfully completing.
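
The idea can be illustrated outside Spark with a minimal sketch. It is not the patch itself: it assumes plain `java.nio.file` in place of Hadoop's `FileSystem`, so the exception caught is `NoSuchFileException` rather than the `FileNotFoundException` handled in the patch, and the names `TolerantListing` and `listChildren` are hypothetical.

```scala
import java.nio.file.{Files, NoSuchFileException, Path}
import scala.collection.mutable.ArrayBuffer

// Hypothetical helper, not part of Spark: illustrates "treat a vanished
// directory as empty" using java.nio instead of the Hadoop FileSystem API.
object TolerantListing {

  /** Lists the direct children of `dir`; returns an empty list if `dir` does not exist. */
  def listChildren(dir: Path): List[Path] = {
    try {
      val stream = Files.newDirectoryStream(dir)
      try {
        val children = ArrayBuffer.empty[Path]
        val it = stream.iterator()
        while (it.hasNext) children += it.next()
        children.toList
      } finally {
        stream.close()
      }
    } catch {
      case _: NoSuchFileException =>
        // The directory disappeared (or never existed): warn and carry on
        // with an empty listing instead of failing the caller.
        Console.err.println(s"The directory $dir was not found. Was it deleted very recently?")
        List.empty
    }
  }
}
```

Calling `TolerantListing.listChildren` on a path that was never created returns an empty list rather than throwing, which mirrors what the unit test in this commit checks for `ListingFileCatalog`.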

## How was this patch tested?

Unit test in `FileCatalogSuite`

Author: Burak Yavuz <brkyvz@gmail.com>

Closes #15153 from brkyvz/SPARK-17599.
parent 3977223a
@@ -17,6 +17,8 @@
 package org.apache.spark.sql.execution.datasources
+import java.io.FileNotFoundException
 import scala.collection.mutable
 import org.apache.hadoop.fs.{FileStatus, LocatedFileStatus, Path}
@@ -97,8 +99,14 @@ class ListingFileCatalog(
         logTrace(s"Listing $path on driver")
         val childStatuses = {
-          val stats = fs.listStatus(path)
-          if (pathFilter != null) stats.filter(f => pathFilter.accept(f.getPath)) else stats
+          try {
+            val stats = fs.listStatus(path)
+            if (pathFilter != null) stats.filter(f => pathFilter.accept(f.getPath)) else stats
+          } catch {
+            case _: FileNotFoundException =>
+              logWarning(s"The directory $path was not found. Was it deleted very recently?")
+              Array.empty[FileStatus]
+          }
         }
         childStatuses.map {
@@ -67,4 +67,15 @@ class FileCatalogSuite extends SharedSQLContext {
     }
   }
+  test("ListingFileCatalog: folders that don't exist don't throw exceptions") {
+    withTempDir { dir =>
+      val deletedFolder = new File(dir, "deleted")
+      assert(!deletedFolder.exists())
+      val catalog1 = new ListingFileCatalog(
+        spark, Seq(new Path(deletedFolder.getCanonicalPath)), Map.empty, None)
+      // doesn't throw an exception
+      assert(catalog1.listLeafFiles(catalog1.paths).isEmpty)
+    }
+  }
 }