Skip to content
Snippets Groups Projects
Commit 439e3610 authored by François Garillot's avatar François Garillot Committed by Shixiong Zhu
Browse files

[SPARK-9819][STREAMING][DOCUMENTATION] Clarify doc for invReduceFunc in...

[SPARK-9819][STREAMING][DOCUMENTATION] Clarify doc for invReduceFunc in incremental versions of reduceByWindow

- that reduceFunc and invReduceFunc should be associative
- that the intermediate result in iterated applications of inverseReduceFunc
  is its first argument

Author: François Garillot <francois@garillot.net>

Closes #8103 from huitseeker/issue/invReduceFuncDoc.
parent ca813330
No related branches found
No related tags found
No related merge requests found
...@@ -454,7 +454,9 @@ class DStream(object): ...@@ -454,7 +454,9 @@ class DStream(object):
This is more efficient than `invReduceFunc` is None. This is more efficient than `invReduceFunc` is None.
@param reduceFunc: associative and commutative reduce function @param reduceFunc: associative and commutative reduce function
@param invReduceFunc: inverse reduce function of `reduceFunc` @param invReduceFunc: inverse reduce function of `reduceFunc`; such that for all y,
and invertible x:
`invReduceFunc(reduceFunc(x, y), x) = y`
@param windowDuration: width of the window; must be a multiple of this DStream's @param windowDuration: width of the window; must be a multiple of this DStream's
batching interval batching interval
@param slideDuration: sliding interval of the window (i.e., the interval after which @param slideDuration: sliding interval of the window (i.e., the interval after which
......
...@@ -240,7 +240,8 @@ trait JavaDStreamLike[T, This <: JavaDStreamLike[T, This, R], R <: JavaRDDLike[T ...@@ -240,7 +240,8 @@ trait JavaDStreamLike[T, This <: JavaDStreamLike[T, This, R], R <: JavaRDDLike[T
* This is more efficient than reduceByWindow without "inverse reduce" function. * This is more efficient than reduceByWindow without "inverse reduce" function.
* However, it is applicable to only "invertible reduce functions". * However, it is applicable to only "invertible reduce functions".
* @param reduceFunc associative and commutative reduce function * @param reduceFunc associative and commutative reduce function
* @param invReduceFunc inverse reduce function * @param invReduceFunc inverse reduce function; such that for all y, invertible x:
* `invReduceFunc(reduceFunc(x, y), x) = y`
* @param windowDuration width of the window; must be a multiple of this DStream's * @param windowDuration width of the window; must be a multiple of this DStream's
* batching interval * batching interval
* @param slideDuration sliding interval of the window (i.e., the interval after which * @param slideDuration sliding interval of the window (i.e., the interval after which
......
...@@ -336,7 +336,8 @@ class JavaPairDStream[K, V](val dstream: DStream[(K, V)])( ...@@ -336,7 +336,8 @@ class JavaPairDStream[K, V](val dstream: DStream[(K, V)])(
* However, it is applicable to only "invertible reduce functions". * However, it is applicable to only "invertible reduce functions".
* Hash partitioning is used to generate the RDDs with Spark's default number of partitions. * Hash partitioning is used to generate the RDDs with Spark's default number of partitions.
* @param reduceFunc associative and commutative reduce function * @param reduceFunc associative and commutative reduce function
* @param invReduceFunc inverse function * @param invReduceFunc inverse function; such that for all y, invertible x:
* `invReduceFunc(reduceFunc(x, y), x) = y`
* @param windowDuration width of the window; must be a multiple of this DStream's * @param windowDuration width of the window; must be a multiple of this DStream's
* batching interval * batching interval
* @param slideDuration sliding interval of the window (i.e., the interval after which * @param slideDuration sliding interval of the window (i.e., the interval after which
......
...@@ -793,7 +793,8 @@ abstract class DStream[T: ClassTag] ( ...@@ -793,7 +793,8 @@ abstract class DStream[T: ClassTag] (
* This is more efficient than reduceByWindow without "inverse reduce" function. * This is more efficient than reduceByWindow without "inverse reduce" function.
* However, it is applicable to only "invertible reduce functions". * However, it is applicable to only "invertible reduce functions".
* @param reduceFunc associative and commutative reduce function * @param reduceFunc associative and commutative reduce function
* @param invReduceFunc inverse reduce function * @param invReduceFunc inverse reduce function; such that for all y, invertible x:
* `invReduceFunc(reduceFunc(x, y), x) = y`
* @param windowDuration width of the window; must be a multiple of this DStream's * @param windowDuration width of the window; must be a multiple of this DStream's
* batching interval * batching interval
* @param slideDuration sliding interval of the window (i.e., the interval after which * @param slideDuration sliding interval of the window (i.e., the interval after which
......
...@@ -290,7 +290,8 @@ class PairDStreamFunctions[K, V](self: DStream[(K, V)]) ...@@ -290,7 +290,8 @@ class PairDStreamFunctions[K, V](self: DStream[(K, V)])
* However, it is applicable to only "invertible reduce functions". * However, it is applicable to only "invertible reduce functions".
* Hash partitioning is used to generate the RDDs with Spark's default number of partitions. * Hash partitioning is used to generate the RDDs with Spark's default number of partitions.
* @param reduceFunc associative and commutative reduce function * @param reduceFunc associative and commutative reduce function
* @param invReduceFunc inverse reduce function * @param invReduceFunc inverse reduce function; such that for all y, invertible x:
* `invReduceFunc(reduceFunc(x, y), x) = y`
* @param windowDuration width of the window; must be a multiple of this DStream's * @param windowDuration width of the window; must be a multiple of this DStream's
* batching interval * batching interval
* @param slideDuration sliding interval of the window (i.e., the interval after which * @param slideDuration sliding interval of the window (i.e., the interval after which
......
0% Loading or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment