Commit 6717981e authored by hyukjinkwon, committed by Sean Owen

[SPARK-18422][CORE] Fix wholeTextFiles test to pass on Windows in JavaAPISuite

## What changes were proposed in this pull request?

This PR fixes the test `wholeTextFiles` in `JavaAPISuite.java`, which fails on Windows because of the different path format.

For example, the path in `container` was

```
C:\projects\spark\target\tmp\1478967560189-0/part-00000
```

whereas `new URI(res._1()).getPath()` returned the path below:

```
/C:/projects/spark/target/tmp/1478967560189-0/part-00000
```
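
As a rough illustration (not part of the patch), the sketch below contrasts the two forms and shows how routing both sides through `org.apache.hadoop.fs.Path` makes them comparable. The class name is made up, the sample paths are taken from the log above, and the behaviour described in the comments assumes a Windows machine with the Hadoop client on the classpath.

```java
import java.net.URI;
import org.apache.hadoop.fs.Path;

public class WindowsPathMismatchDemo {
  public static void main(String[] args) throws Exception {
    // Windows-style absolute path, as returned by File#getAbsolutePath().
    String tempDirName = "C:\\projects\\spark\\target\\tmp\\1478967560189-0";

    // Old key construction: plain string concatenation keeps the backslashes.
    String oldKey = tempDirName + "/part-00000";
    // -> C:\projects\spark\target\tmp\1478967560189-0/part-00000

    // wholeTextFiles returns file: URIs on Windows, so URI#getPath() yields a
    // leading-slash, forward-slash form that never matches the concatenated key.
    String lookedUp =
        new URI("file:/C:/projects/spark/target/tmp/1478967560189-0/part-00000").getPath();
    // -> /C:/projects/spark/target/tmp/1478967560189-0/part-00000

    // Fix: build both the map key and the lookup key via Hadoop's Path,
    // so they share one canonical form.
    String newKey = new Path(tempDirName, "part-00000").toUri().getPath();
    String newLookup =
        new Path("file:/C:/projects/spark/target/tmp/1478967560189-0/part-00000")
            .toUri().getPath();

    System.out.println(oldKey.equals(lookedUp));   // false
    System.out.println(newKey.equals(newLookup));  // true on Windows: both are /C:/...
  }
}
```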

## How was this patch tested?

Tests in `JavaAPISuite.java`.

Tested via AppVeyor.

**Before**
Build: https://ci.appveyor.com/project/spark-test/spark/build/63-JavaAPISuite-1
Diff: https://github.com/apache/spark/compare/master...spark-test:JavaAPISuite-1

```
[info] Test org.apache.spark.JavaAPISuite.wholeTextFiles started
[error] Test org.apache.spark.JavaAPISuite.wholeTextFiles failed: java.lang.AssertionError: expected:<spark is easy to use.
[error] > but was:<null>, took 0.578 sec
[error]     at org.apache.spark.JavaAPISuite.wholeTextFiles(JavaAPISuite.java:1089)
...
```

**After**
Build started: [CORE] `org.apache.spark.JavaAPISuite` [![PR-15866](https://ci.appveyor.com/api/projects/status/github/spark-test/spark?branch=198DDA52-F201-4D2B-BE2F-244E0C1725B2&svg=true)](https://ci.appveyor.com/project/spark-test/spark/branch/198DDA52-F201-4D2B-BE2F-244E0C1725B2)
Diff: https://github.com/apache/spark/compare/master...spark-test:198DDA52-F201-4D2B-BE2F-244E0C1725B2



```
[info] Test org.apache.spark.JavaAPISuite.wholeTextFiles started
...
```

Author: hyukjinkwon <gurwls223@gmail.com>

Closes #15866 from HyukjinKwon/SPARK-18422.

(cherry picked from commit 40d59ff5)
Signed-off-by: Sean Owen <sowen@cloudera.com>
parent ec622eb7
```
@@ -20,7 +20,6 @@ package org.apache.spark;
 import java.io.*;
 import java.nio.channels.FileChannel;
 import java.nio.ByteBuffer;
-import java.net.URI;
 import java.nio.charset.StandardCharsets;
 import java.util.ArrayList;
 import java.util.Arrays;
@@ -46,6 +45,7 @@ import com.google.common.collect.Iterators;
 import com.google.common.collect.Lists;
 import com.google.common.base.Throwables;
 import com.google.common.io.Files;
+import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.io.IntWritable;
 import org.apache.hadoop.io.Text;
 import org.apache.hadoop.io.compress.DefaultCodec;
@@ -1075,18 +1075,23 @@ public class JavaAPISuite implements Serializable {
     byte[] content2 = "spark is also easy to use.\n".getBytes(StandardCharsets.UTF_8);
     String tempDirName = tempDir.getAbsolutePath();
-    Files.write(content1, new File(tempDirName + "/part-00000"));
-    Files.write(content2, new File(tempDirName + "/part-00001"));
+    String path1 = new Path(tempDirName, "part-00000").toUri().getPath();
+    String path2 = new Path(tempDirName, "part-00001").toUri().getPath();
+    Files.write(content1, new File(path1));
+    Files.write(content2, new File(path2));
     Map<String, String> container = new HashMap<>();
-    container.put(tempDirName+"/part-00000", new Text(content1).toString());
-    container.put(tempDirName+"/part-00001", new Text(content2).toString());
+    container.put(path1, new Text(content1).toString());
+    container.put(path2, new Text(content2).toString());
     JavaPairRDD<String, String> readRDD = sc.wholeTextFiles(tempDirName, 3);
     List<Tuple2<String, String>> result = readRDD.collect();
     for (Tuple2<String, String> res : result) {
-      assertEquals(res._2(), container.get(new URI(res._1()).getPath()));
+      // Note that the paths from `wholeTextFiles` are in URI format on Windows,
+      // for example, file:/C:/a/b/c.
+      assertEquals(res._2(), container.get(new Path(res._1()).toUri().getPath()));
     }
   }
```