Skip to content
Snippets Groups Projects
Commit 1765c8d0 authored by Colin McCabe's avatar Colin McCabe Committed by Patrick Wendell
Browse files

SPARK-1518: FileLogger: Fix compile against Hadoop trunk

In Hadoop trunk (currently Hadoop 3.0.0), the deprecated
FSDataOutputStream#sync() method has been removed.  Instead, we should
call FSDataOutputStream#hflush, which does the same thing as the
deprecated method used to do.

Author: Colin McCabe <cmccabe@cloudera.com>

Closes #898 from cmccabe/SPARK-1518 and squashes the following commits:

752b9d7 [Colin McCabe] FileLogger: Fix compile against Hadoop trunk
parent 189df165
No related branches found
No related tags found
No related merge requests found
......@@ -61,6 +61,14 @@ private[spark] class FileLogger(
// Only defined if the file system scheme is not local
private var hadoopDataStream: Option[FSDataOutputStream] = None
// The Hadoop APIs have changed over time, so we use reflection to figure out
// the correct method to use to flush a hadoop data stream. See SPARK-1518
// for details.
private val hadoopFlushMethod = {
val cls = classOf[FSDataOutputStream]
scala.util.Try(cls.getMethod("hflush")).getOrElse(cls.getMethod("sync"))
}
private var writer: Option[PrintWriter] = None
/**
......@@ -149,13 +157,13 @@ private[spark] class FileLogger(
/**
* Flush the writer to disk manually.
*
* If the Hadoop FileSystem is used, the underlying FSDataOutputStream (r1.0.4) must be
* sync()'ed manually as it does not support flush(), which is invoked by when higher
* level streams are flushed.
* When using a Hadoop filesystem, we need to invoke the hflush or sync
* method. In HDFS, hflush guarantees that the data gets to all the
* DataNodes.
*/
def flush() {
writer.foreach(_.flush())
hadoopDataStream.foreach(_.sync())
hadoopDataStream.foreach(hadoopFlushMethod.invoke(_))
}
/**
......
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment