Skip to content
Snippets Groups Projects
Commit d06d5ee9 authored by Josh Rosen's avatar Josh Rosen Committed by Patrick Wendell
Browse files

[SPARK-5227] [SPARK-5679] Disable FileSystem cache in WholeTextFileRecordReaderSuite

This patch fixes two difficult-to-reproduce Jenkins test failures in InputOutputMetricsSuite (SPARK-5227 and SPARK-5679).  The problem was that WholeTextFileRecordReaderSuite modifies the `fs.local.block.size` Hadoop configuration and this change was affecting subsequent test suites due to Hadoop's caching of FileSystem instances (see HADOOP-8490 for more details).

The fix implemented here is to disable FileSystem caching in WholeTextFileRecordReaderSuite.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #4599 from JoshRosen/inputoutputsuite-fix and squashes the following commits:

47dc447 [Josh Rosen] [SPARK-5227] [SPARK-5679] Disable FileSystem cache in WholeTextFileRecordReaderSuite
parent 4f4c6d5a
No related branches found
No related tags found
No related merge requests found
......@@ -28,7 +28,7 @@ import org.scalatest.FunSuite
import org.apache.hadoop.io.Text
import org.apache.spark.SparkContext
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.util.Utils
import org.apache.hadoop.io.compress.{DefaultCodec, CompressionCodecFactory, GzipCodec}
......@@ -42,7 +42,15 @@ class WholeTextFileRecordReaderSuite extends FunSuite with BeforeAndAfterAll {
private var factory: CompressionCodecFactory = _
override def beforeAll() {
sc = new SparkContext("local", "test")
// Hadoop's FileSystem caching does not use the Configuration as part of its cache key, which
// can cause Filesystem.get(Configuration) to return a cached instance created with a different
// configuration than the one passed to get() (see HADOOP-8490 for more details). This caused
// hard-to-reproduce test failures, since any suites that were run after this one would inherit
// the new value of "fs.local.block.size" (see SPARK-5227 and SPARK-5679). To work around this,
// we disable FileSystem caching in this suite.
val conf = new SparkConf().set("spark.hadoop.fs.file.impl.disable.cache", "true")
sc = new SparkContext("local", "test", conf)
// Set the block size of local file system to test whether files are split right or not.
sc.hadoopConfiguration.setLong("fs.local.block.size", 32)
......
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment