Commit 6f0988b1 authored by linbojin, committed by Sean Owen

[MINOR][DOC] Correct code snippet results in quick start documentation

## What changes were proposed in this pull request?

The README.md file is updated over time, so some code snippet outputs in the quick start documentation no longer match what the current README.md produces. For example:
```
scala> textFile.count()
res0: Long = 126
```
should be
```
scala> textFile.count()
res0: Long = 99
```
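For reference, the `textFile` RDD earlier in the quick start is built from the repository's README.md, so its count tracks that file. Below is a minimal sketch of the surrounding session (assuming `spark-shell` is launched from the Spark source directory so the relative path resolves; the variable name `textFile` comes from the docs, the rest is illustrative):
```scala
// Load README.md as an RDD of lines; the path is relative to where
// spark-shell was started (assumed: the Spark source directory).
scala> val textFile = sc.textFile("README.md")

// Counts the lines of the *current* README.md, so the number drifts
// as the file is edited over time.
scala> textFile.count()
res0: Long = 99
```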
This PR adds comments to point out this problem so that new Spark learners have a correct reference.
It also fixes a small bug: in the current documentation, the outputs of `linesWithSpark.count()` without and with cache are different (one is 15 and the other is 19):
```
scala> val linesWithSpark = textFile.filter(line => line.contains("Spark"))
linesWithSpark: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[2] at filter at <console>:27

scala> textFile.filter(line => line.contains("Spark")).count() // How many lines contain "Spark"?
res3: Long = 15

...

scala> linesWithSpark.cache()
res7: linesWithSpark.type = MapPartitionsRDD[2] at filter at <console>:27

scala> linesWithSpark.count()
res8: Long = 19
```
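For context, here is a sketch (not part of the patch) of why the two counts must agree: `cache()` only marks an RDD for in-memory storage and returns the same RDD, so counting before and after caching sees identical data. The names are illustrative:
```scala
val linesWithSpark = textFile.filter(line => line.contains("Spark"))

val before = linesWithSpark.count() // full computation, nothing cached yet
linesWithSpark.cache()              // lazy: only marks the RDD for in-memory storage
val after = linesWithSpark.count()  // recomputes once, caching partitions as a side effect

assert(before == after)             // caching changes performance, never the answer
```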

## How was this patch tested?

Manual test: run `$ SKIP_API=1 jekyll serve --watch`

Author: linbojin <linbojin203@gmail.com>

Closes #14645 from linbojin/quick-start-documentation.
parent 8fdc6ce4
```diff
@@ -40,7 +40,7 @@ RDDs have _[actions](programming-guide.html#actions)_, which return values, and
 {% highlight scala %}
 scala> textFile.count() // Number of items in this RDD
-res0: Long = 126
+res0: Long = 126 // May be different from yours as README.md will change over time, similar to other outputs
 scala> textFile.first() // First item in this RDD
 res1: String = # Apache Spark
@@ -184,10 +184,10 @@ scala> linesWithSpark.cache()
 res7: linesWithSpark.type = MapPartitionsRDD[2] at filter at <console>:27
 scala> linesWithSpark.count()
-res8: Long = 19
+res8: Long = 15
 scala> linesWithSpark.count()
-res9: Long = 19
+res9: Long = 15
 {% endhighlight %}
 It may seem silly to use Spark to explore and cache a 100-line text file. The interesting part is
@@ -202,10 +202,10 @@ a cluster, as described in the [programming guide](programming-guide.html#initia
 >>> linesWithSpark.cache()
 >>> linesWithSpark.count()
-19
+15
 >>> linesWithSpark.count()
-19
+15
 {% endhighlight %}
 It may seem silly to use Spark to explore and cache a 100-line text file. The interesting part is
```