Commit d305e686 authored by Olivier Girardot, committed by Reynold Xin

SPARK-6988 : Fix documentation regarding DataFrames using the Java API


This patch includes:
 * showing how to call map on the result of a SQL query by going through javaRDD()
 * fixing the first few Java examples, which were written in Scala

Thank you for your time,

Olivier.

Author: Olivier Girardot <o.girardot@lateral-thoughts.com>

Closes #5564 from ogirardot/branch-1.3 and squashes the following commits:

9f8d60e [Olivier Girardot] SPARK-6988 : Fix documentation regarding DataFrames using the Java API

(cherry picked from commit 6b528dc139da594ef2e651d84bd91fe0f738a39d)
Signed-off-by: Reynold Xin <rxin@databricks.com>
parent 59e206de
@@ -193,8 +193,8 @@ df.groupBy("age").count().show()
 <div data-lang="java" markdown="1">
 {% highlight java %}
-val sc: JavaSparkContext // An existing SparkContext.
-val sqlContext = new org.apache.spark.sql.SQLContext(sc)
+JavaSparkContext sc // An existing SparkContext.
+SQLContext sqlContext = new org.apache.spark.sql.SQLContext(sc)
 // Create the DataFrame
 DataFrame df = sqlContext.jsonFile("examples/src/main/resources/people.json");
@@ -308,8 +308,8 @@ val df = sqlContext.sql("SELECT * FROM table")
 <div data-lang="java" markdown="1">
 {% highlight java %}
-val sqlContext = ... // An existing SQLContext
-val df = sqlContext.sql("SELECT * FROM table")
+SQLContext sqlContext = ... // An existing SQLContext
+DataFrame df = sqlContext.sql("SELECT * FROM table")
 {% endhighlight %}
 </div>
@@ -435,7 +435,7 @@ DataFrame teenagers = sqlContext.sql("SELECT name FROM people WHERE age >= 13 AN
 // The results of SQL queries are DataFrames and support all the normal RDD operations.
 // The columns of a row in the result can be accessed by ordinal.
-List<String> teenagerNames = teenagers.map(new Function<Row, String>() {
+List<String> teenagerNames = teenagers.javaRDD().map(new Function<Row, String>() {
   public String call(Row row) {
     return "Name: " + row.getString(0);
   }
@@ -599,7 +599,7 @@ DataFrame results = sqlContext.sql("SELECT name FROM people");
 // The results of SQL queries are DataFrames and support all the normal RDD operations.
 // The columns of a row in the result can be accessed by ordinal.
-List<String> names = results.map(new Function<Row, String>() {
+List<String> names = results.javaRDD().map(new Function<Row, String>() {
   public String call(Row row) {
     return "Name: " + row.getString(0);
   }
@@ -860,7 +860,7 @@ DataFrame parquetFile = sqlContext.parquetFile("people.parquet");
 //Parquet files can also be registered as tables and then used in SQL statements.
 parquetFile.registerTempTable("parquetFile");
 DataFrame teenagers = sqlContext.sql("SELECT name FROM parquetFile WHERE age >= 13 AND age <= 19");
-List<String> teenagerNames = teenagers.map(new Function<Row, String>() {
+List<String> teenagerNames = teenagers.javaRDD().map(new Function<Row, String>() {
   public String call(Row row) {
     return "Name: " + row.getString(0);
   }