Skip to content
Snippets Groups Projects
  • Holden Karau's avatar
    ce8ec545
    Spark 1271: Co-Group and Group-By should pass Iterable[X] · ce8ec545
    Holden Karau authored
    Author: Holden Karau <holden@pigscanfly.ca>
    
    Closes #242 from holdenk/spark-1320-cogroupandgroupshouldpassiterator and squashes the following commits:
    
    f289536 [Holden Karau] Fix bad merge, should have been Iterable rather than Iterator
    77048f8 [Holden Karau] Fix merge up to master
    d3fe909 [Holden Karau] use toSeq instead
    7a092a3 [Holden Karau] switch resultitr to resultiterable
    eb06216 [Holden Karau] maybe I should have had a coffee first. use correct import for guava iterables
    c5075aa [Holden Karau] If guava 14 had iterables
    2d06e10 [Holden Karau] Fix Java 8 cogroup tests for the new API
    11e730c [Holden Karau] Fix streaming tests
    66b583d [Holden Karau] Fix the core test suite to compile
    4ed579b [Holden Karau] Refactor from iterator to iterable
    d052c07 [Holden Karau] Python tests now pass with iterator pandas
    3bcd81d [Holden Karau] Revert "Try and make pickling list iterators work"
    cd1e81c [Holden Karau] Try and make pickling list iterators work
    c60233a [Holden Karau] Start investigating moving to iterators for python API like the Java/Scala one. tl;dr: We will have to write our own iterator since the default one doesn't pickle well
    88a5cef [Holden Karau] Fix cogroup test in JavaAPISuite for streaming
    a5ee714 [Holden Karau] oops, was checking wrong iterator
    e687f21 [Holden Karau] Fix groupbykey test in JavaAPISuite of streaming
    ec8cc3e [Holden Karau] Fix test issues\!
    4b0eeb9 [Holden Karau] Switch cast in PairDStreamFunctions
    fa395c9 [Holden Karau] Revert "Add a join based on the problem in SVD"
    ec99e32 [Holden Karau] Revert "Revert this but for now put things in list pandas"
    b692868 [Holden Karau] Revert
    7e533f7 [Holden Karau] Fix the bug
    8a5153a [Holden Karau] Revert me, but we have some stuff to debug
    b4e86a9 [Holden Karau] Add a join based on the problem in SVD
    c4510e2 [Holden Karau] Revert this but for now put things in list pandas
    b4e0b1d [Holden Karau] Fix style issues
    71e8b9f [Holden Karau] I really need to stop calling size on iterators, it is the path of sadness.
    b1ae51a [Holden Karau] Fix some of the types in the streaming JavaAPI suite. Probably still needs more work
    37888ec [Holden Karau] core/tests now pass
    249abde [Holden Karau] org.apache.spark.rdd.PairRDDFunctionsSuite passes
    6698186 [Holden Karau] Revert "I think this might be a bad rabbit hole. Started work to make CoGroupedRDD use iterator and then went crazy"
    fe992fe [Holden Karau] hmmm try and fix up basic operation suite
    172705c [Holden Karau] Fix Java API suite
    caafa63 [Holden Karau] I think this might be a bad rabbit hole. Started work to make CoGroupedRDD use iterator and then went crazy
    88b3329 [Holden Karau] Fix groupbykey to actually give back an iterator
    4991af6 [Holden Karau] Fix some tests
    be50246 [Holden Karau] Calling size on an iterator is not so good if we want to use it after
    687ffbc [Holden Karau] This is the it compiles point of replacing Seq with Iterator and JList with JIterator in the groupby and cogroup signatures
    ce8ec545
    History
    Spark 1271: Co-Group and Group-By should pass Iterable[X]
    Holden Karau authored
    Author: Holden Karau <holden@pigscanfly.ca>
    
    Closes #242 from holdenk/spark-1320-cogroupandgroupshouldpassiterator and squashes the following commits:
    
    f289536 [Holden Karau] Fix bad merge, should have been Iterable rather than Iterator
    77048f8 [Holden Karau] Fix merge up to master
    d3fe909 [Holden Karau] use toSeq instead
    7a092a3 [Holden Karau] switch resultitr to resultiterable
    eb06216 [Holden Karau] maybe I should have had a coffee first. use correct import for guava iterables
    c5075aa [Holden Karau] If guava 14 had iterables
    2d06e10 [Holden Karau] Fix Java 8 cogroup tests for the new API
    11e730c [Holden Karau] Fix streaming tests
    66b583d [Holden Karau] Fix the core test suite to compile
    4ed579b [Holden Karau] Refactor from iterator to iterable
    d052c07 [Holden Karau] Python tests now pass with iterator pandas
    3bcd81d [Holden Karau] Revert "Try and make pickling list iterators work"
    cd1e81c [Holden Karau] Try and make pickling list iterators work
    c60233a [Holden Karau] Start investigating moving to iterators for python API like the Java/Scala one. tl;dr: We will have to write our own iterator since the default one doesn't pickle well
    88a5cef [Holden Karau] Fix cogroup test in JavaAPISuite for streaming
    a5ee714 [Holden Karau] oops, was checking wrong iterator
    e687f21 [Holden Karau] Fix groupbykey test in JavaAPISuite of streaming
    ec8cc3e [Holden Karau] Fix test issues\!
    4b0eeb9 [Holden Karau] Switch cast in PairDStreamFunctions
    fa395c9 [Holden Karau] Revert "Add a join based on the problem in SVD"
    ec99e32 [Holden Karau] Revert "Revert this but for now put things in list pandas"
    b692868 [Holden Karau] Revert
    7e533f7 [Holden Karau] Fix the bug
    8a5153a [Holden Karau] Revert me, but we have some stuff to debug
    b4e86a9 [Holden Karau] Add a join based on the problem in SVD
    c4510e2 [Holden Karau] Revert this but for now put things in list pandas
    b4e0b1d [Holden Karau] Fix style issues
    71e8b9f [Holden Karau] I really need to stop calling size on iterators, it is the path of sadness.
    b1ae51a [Holden Karau] Fix some of the types in the streaming JavaAPI suite. Probably still needs more work
    37888ec [Holden Karau] core/tests now pass
    249abde [Holden Karau] org.apache.spark.rdd.PairRDDFunctionsSuite passes
    6698186 [Holden Karau] Revert "I think this might be a bad rabbit hole. Started work to make CoGroupedRDD use iterator and then went crazy"
    fe992fe [Holden Karau] hmmm try and fix up basic operation suite
    172705c [Holden Karau] Fix Java API suite
    caafa63 [Holden Karau] I think this might be a bad rabbit hole. Started work to make CoGroupedRDD use iterator and then went crazy
    88b3329 [Holden Karau] Fix groupbykey to actually give back an iterator
    4991af6 [Holden Karau] Fix some tests
    be50246 [Holden Karau] Calling size on an iterator is not so good if we want to use it after
    687ffbc [Holden Karau] This is the it compiles point of replacing Seq with Iterator and JList with JIterator in the groupby and cogroup signatures
join.py 3.37 KiB