Commit f9ae99fe authored by Junyang, committed by Sean Owen

[SPARK-13074][CORE] Add JavaSparkContext.getPersistentRDDs method

The "getPersistentRDDs()" is a useful API of SparkContext to get cached RDDs. However, the JavaSparkContext does not have this API.

Add a simple getPersistentRDDs() that returns a java.util.Map&lt;Integer, JavaRDD&lt;?&gt;&gt; for Java users.

Author: Junyang <fly.shenjy@gmail.com>

Closes #10978 from flyjy/master.
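
For reference, a minimal sketch of how a Java user might call the new API (the class name PersistentRDDsExample and the RDD name "words" are illustrative, not part of the patch):

import java.util.Arrays;
import java.util.Map;

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class PersistentRDDsExample {
  public static void main(String[] args) {
    SparkConf conf = new SparkConf().setAppName("PersistentRDDsExample").setMaster("local[2]");
    JavaSparkContext sc = new JavaSparkContext(conf);

    // cache() marks the RDD as persistent; it then appears in
    // getPersistentRDDs(), keyed by its RDD id.
    JavaRDD<String> words = sc.parallelize(Arrays.asList("a", "b")).setName("words").cache();

    Map<Integer, JavaRDD<?>> cached = sc.getPersistentRDDs();
    for (Map.Entry<Integer, JavaRDD<?>> entry : cached.entrySet()) {
      System.out.println(entry.getKey() + " -> " + entry.getValue().name());
    }

    sc.stop();
  }
}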
core/src/main/scala/org/apache/spark/api/java/JavaSparkContext.scala
@@ -774,6 +774,16 @@ class JavaSparkContext(val sc: SparkContext)
  /** Cancel all jobs that have been scheduled or are running. */
  def cancelAllJobs(): Unit = sc.cancelAllJobs()

  /**
   * Returns a Java map of JavaRDDs that have marked themselves as persistent via a cache() call.
   * Note that this does not necessarily mean the caching or computation was successful.
   */
  def getPersistentRDDs: JMap[java.lang.Integer, JavaRDD[_]] = {
    // The asInstanceOf cast re-types the map's Scala Int keys as java.lang.Integer
    // so the result matches the Java-facing signature.
    sc.getPersistentRDDs.mapValues(s => JavaRDD.fromRDD(s))
      .asJava.asInstanceOf[JMap[java.lang.Integer, JavaRDD[_]]]
  }
}

object JavaSparkContext {
core/src/test/java/org/apache/spark/JavaAPISuite.java
@@ -1811,4 +1811,16 @@ public class JavaAPISuite implements Serializable {
conf.get("spark.kryo.classesToRegister"));
}
@Test
public void testGetPersistentRDDs() {
java.util.Map<Integer, JavaRDD<?>> cachedRddsMap = sc.getPersistentRDDs();
Assert.assertTrue(cachedRddsMap.isEmpty());
JavaRDD<String> rdd1 = sc.parallelize(Arrays.asList("a", "b")).setName("RDD1").cache();
JavaRDD<String> rdd2 = sc.parallelize(Arrays.asList("c", "d")).setName("RDD2").cache();
cachedRddsMap = sc.getPersistentRDDs();
Assert.assertEquals(2, cachedRddsMap.size());
Assert.assertEquals("RDD1", cachedRddsMap.get(0).name());
Assert.assertEquals("RDD2", cachedRddsMap.get(1).name());
}
}
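
Note that the test relies on RDD ids being assigned sequentially from 0 in the fresh JavaSparkContext that JavaAPISuite creates for each test, which is why the two cached RDDs appear under keys 0 and 1.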