Commit dd77e278 authored by Nick Evans, committed by Tathagata Das

[SPARK-11335][STREAMING] update kafka direct python docs on how to get the offset ranges for a KafkaRDD

tdas koeninger

This updates the Spark Streaming + Kafka Integration Guide doc with a working method to access the offsets of a `KafkaRDD` through Python.

Author: Nick Evans <me@nicolasevans.org>

Closes #9289 from manygrams/update_kafka_direct_python_docs.
parent a9a6b80c
@@ -181,7 +181,20 @@ Next, we discuss how to use this approach in your streaming application.
     );
 </div>
 <div data-lang="python" markdown="1">
-    Not supported yet
+    offsetRanges = []
+
+    def storeOffsetRanges(rdd):
+        global offsetRanges
+        offsetRanges = rdd.offsetRanges()
+        return rdd
+
+    def printOffsetRanges(rdd):
+        for o in offsetRanges:
+            print "%s %s %s %s" % (o.topic, o.partition, o.fromOffset, o.untilOffset)
+
+    directKafkaStream\
+        .transform(storeOffsetRanges)\
+        .foreachRDD(printOffsetRanges)
 </div>
 </div>
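
For anyone who wants to try the pattern from this change without a Kafka cluster, the following is a minimal sketch in Python 3. It is not part of the commit. `OffsetRange`, `FakeKafkaRDD`, `FakeDirectStream`, the `seen` list, and the topic name `events` are hypothetical stand-ins invented for illustration; only `offsetRanges()`, `transform`, `foreachRDD`, and the four `OffsetRange` fields come from the diff above. The stand-ins assume that each batch's RDD passes through the `transform` function before reaching the `foreachRDD` function, which is the ordering the documented snippet relies on.

```python
from collections import namedtuple

# Hypothetical stand-in for the offset-range objects returned by
# rdd.offsetRanges(); it carries only the four fields the diff prints.
OffsetRange = namedtuple("OffsetRange", "topic partition fromOffset untilOffset")


class FakeKafkaRDD:
    """Hypothetical stand-in for a KafkaRDD: one batch plus its offset ranges."""

    def __init__(self, ranges):
        self._ranges = list(ranges)

    def offsetRanges(self):
        return self._ranges


class FakeDirectStream:
    """Hypothetical stand-in for the direct stream (assumed ordering:
    every transform runs on a batch before the output function)."""

    def __init__(self, batches, transforms=()):
        self._batches = batches
        self._transforms = list(transforms)

    def transform(self, func):
        # Return a new stream that remembers one more per-batch transform.
        return FakeDirectStream(self._batches, self._transforms + [func])

    def foreachRDD(self, func):
        for rdd in self._batches:
            for t in self._transforms:
                rdd = t(rdd)
            func(rdd)


offsetRanges = []
seen = []  # collected output lines, kept only so the result can be inspected


def storeOffsetRanges(rdd):
    # Capture the current batch's offsets, then pass the RDD through unchanged.
    global offsetRanges
    offsetRanges = rdd.offsetRanges()
    return rdd


def printOffsetRanges(rdd):
    # Report the offsets captured for this batch by storeOffsetRanges.
    for o in offsetRanges:
        line = "%s %s %s %s" % (o.topic, o.partition, o.fromOffset, o.untilOffset)
        seen.append(line)
        print(line)


batches = [
    FakeKafkaRDD([OffsetRange("events", 0, 0, 10), OffsetRange("events", 1, 5, 8)]),
    FakeKafkaRDD([OffsetRange("events", 0, 10, 25)]),
]

FakeDirectStream(batches) \
    .transform(storeOffsetRanges) \
    .foreachRDD(printOffsetRanges)
```

Against a real direct stream, the two functions would be passed to `directKafkaStream` exactly as in the diff; the only change here is the Python 3 `print(...)` call in place of the Python 2 `print` statement.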