Commit 1fe08d62 authored by 曾林西, committed by Sean Owen

[SPARK-21223] Change fileToAppInfo in FsHistoryProvider to fix concurrent issue.

# What issue does this PR address?
Jira: https://issues.apache.org/jira/browse/SPARK-21223
Fix the thread-safety issue in FsHistoryProvider.
Currently, the Spark HistoryServer uses a HashMap named fileToAppInfo in the FsHistoryProvider class to map each event log path to its attempt info.
When a thread pool replays the log files in the list and merges the list of old applications with new ones, multiple threads may update fileToAppInfo at the same time, which can cause thread-safety issues, such as falling into an infinite loop inside the hash table's resize function.
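As a sketch of why the change helps (the class and object names below are hypothetical stand-ins, not Spark's actual code): `ConcurrentHashMap` tolerates concurrent `put` calls without corrupting its internal table, and its `get` returns `null` rather than a Scala `Option`, which is why the patch replaces the `.map{...}.getOrElse(0L)` chain with an explicit null check.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical stand-in for FsApplicationAttemptInfo; only fileSize matters here.
final case class AttemptInfo(fileSize: Long)

object ConcurrentMapSketch {
  def run(): (Long, Int) = {
    val fileToAppInfo = new ConcurrentHashMap[String, AttemptInfo]()

    // Concurrent puts are safe here: ConcurrentHashMap never corrupts its
    // internal table, whereas an unsynchronized mutable.HashMap resize can
    // leave the bucket chains cyclic and make later lookups spin forever.
    val threads = (1 to 4).map { t =>
      new Thread(() => (1 to 1000).foreach { i =>
        fileToAppInfo.put(s"log-$t-$i", AttemptInfo(i.toLong))
      })
    }
    threads.foreach(_.start())
    threads.foreach(_.join())

    // Unlike Scala's mutable.Map#get (which returns an Option),
    // java.util.Map#get returns null for a missing key, hence the
    // explicit null check mirrored from the patch.
    val fileInfo = fileToAppInfo.get("log-absent")
    val prevFileSize = if (fileInfo != null) fileInfo.fileSize else 0L
    (prevFileSize, fileToAppInfo.size())
  }

  def main(args: Array[String]): Unit = println(run())
}
```

The trade-off is that `ConcurrentHashMap` does its locking per bin, so concurrent writers pay far less than they would under a single lock wrapped around a `mutable.HashMap`.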

Author: 曾林西 <zenglinxi@meituan.com>

Closes #18430 from zenglinxi0615/master.
parent 528c9281
@@ -19,7 +19,7 @@ package org.apache.spark.deploy.history
 import java.io.{FileNotFoundException, IOException, OutputStream}
 import java.util.UUID
-import java.util.concurrent.{Executors, ExecutorService, Future, TimeUnit}
+import java.util.concurrent.{ConcurrentHashMap, Executors, ExecutorService, Future, TimeUnit}
 import java.util.zip.{ZipEntry, ZipOutputStream}
 
 import scala.collection.mutable
@@ -122,7 +122,7 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
   @volatile private var applications: mutable.LinkedHashMap[String, FsApplicationHistoryInfo]
     = new mutable.LinkedHashMap()
 
-  val fileToAppInfo = new mutable.HashMap[Path, FsApplicationAttemptInfo]()
+  val fileToAppInfo = new ConcurrentHashMap[Path, FsApplicationAttemptInfo]()
 
   // List of application logs to be deleted by event log cleaner.
   private var attemptsToClean = new mutable.ListBuffer[FsApplicationAttemptInfo]
@@ -321,7 +321,8 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
       // scan for modified applications, replay and merge them
       val logInfos: Seq[FileStatus] = statusList
         .filter { entry =>
-          val prevFileSize = fileToAppInfo.get(entry.getPath()).map{_.fileSize}.getOrElse(0L)
+          val fileInfo = fileToAppInfo.get(entry.getPath())
+          val prevFileSize = if (fileInfo != null) fileInfo.fileSize else 0L
           !entry.isDirectory() &&
             // FsHistoryProvider generates a hidden file which can't be read. Accidentally
             // reading a garbage file is safe, but we would log an error which can be scary to
@@ -475,7 +476,7 @@ private[history] class FsHistoryProvider(conf: SparkConf, clock: Clock)
         fileStatus.getLen(),
         appListener.appSparkVersion.getOrElse("")
       )
-      fileToAppInfo(logPath) = attemptInfo
+      fileToAppInfo.put(logPath, attemptInfo)
       logDebug(s"Application log ${attemptInfo.logPath} loaded successfully: $attemptInfo")
       Some(attemptInfo)
     } else {