Commit b3c22912 authored by Gaetan Semet, committed by Sean Owen

[SPARK-16992][PYSPARK] use map comprehension in doc

Code is equivalent, but a list comprehension is usually faster than a call to map with a lambda.
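As a rough illustration of the equivalence claimed above, the sketch below builds the same records both ways and compares them. It assumes Python 3, where `map` returns a lazy iterator and has to be wrapped in `list` before comparing. The `Record` namedtuple is a hypothetical stand-in for `Row("key", "value")` so the snippet runs without a Spark session; it is not the code from the example file.

```python
from collections import namedtuple

# Hypothetical stand-in for pyspark.sql.Row("key", "value"),
# used only so this sketch runs without Spark installed.
Record = namedtuple("Record", ["key", "value"])

# Old style: map with a lambda (lazy in Python 3, so materialize it).
with_map = list(map(lambda i: Record(i, "val_" + str(i)), range(1, 101)))

# New style: list comprehension producing the same elements in the same order.
with_comprehension = [Record(i, "val_" + str(i)) for i in range(1, 101)]

# Both approaches yield identical lists of 100 records.
print(with_map == with_comprehension)
print(len(with_comprehension))
```

The comprehension avoids one lambda call per element, which is the usual reason given for it being faster; the actual difference depends on the interpreter and workload and is not measured here.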

Author: Gaetan Semet <gaetan@xeberon.net>

Closes #14863 from Stibbons/map_comprehension.
parent 4efcdb7f
@@ -29,7 +29,7 @@ if __name__ == "__main__":
.getOrCreate()
# $example on$
-data = [(0, 18.0,), (1, 19.0,), (2, 8.0,), (3, 5.0,), (4, 2.2,)]
+data = [(0, 18.0), (1, 19.0), (2, 8.0), (3, 5.0), (4, 2.2)]
df = spark.createDataFrame(data, ["id", "hour"])
# $example off$
@@ -32,8 +32,8 @@ if __name__ == "__main__":
# $example on$
df = spark.createDataFrame([
-Row(userFeatures=Vectors.sparse(3, {0: -2.0, 1: 2.3}),),
-Row(userFeatures=Vectors.dense([-2.0, 2.3, 0.0]),)])
+Row(userFeatures=Vectors.sparse(3, {0: -2.0, 1: 2.3})),
+Row(userFeatures=Vectors.dense([-2.0, 2.3, 0.0]))])
slicer = VectorSlicer(inputCol="userFeatures", outputCol="features", indices=[1])
@@ -79,7 +79,7 @@ if __name__ == "__main__":
# You can also use DataFrames to create temporary views within a SparkSession.
Record = Row("key", "value")
-recordsDF = spark.createDataFrame(map(lambda i: Record(i, "val_" + str(i)), range(1, 101)))
+recordsDF = spark.createDataFrame([Record(i, "val_" + str(i)) for i in range(1, 101)])
recordsDF.createOrReplaceTempView("records")
# Queries can then join DataFrame data with data stored in Hive.