    [SPARK-2871] [PySpark] add RDD.lookup(key) · 4fa2fda8
    Davies Liu authored
    RDD.lookup(key)
    
            Return the list of values in the RDD for key `key`. This operation
            is done efficiently if the RDD has a known partitioner by only
            searching the partition that the key maps to.
    
            >>> l = range(1000)
            >>> rdd = sc.parallelize(zip(l, l), 10)
            >>> rdd.lookup(42)  # slow
            [42]
            >>> sorted = rdd.sortByKey()
            >>> sorted.lookup(42)  # fast
            [42]
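
The "slow" versus "fast" distinction above can be sketched without a Spark cluster. The block below is a minimal plain-Python illustration, not PySpark's implementation: `ToyPairRDD`, `partition_by`, and the `scanned` counter are hypothetical names invented for this example. It assumes a hash-style partitioner. When the partition function is known, `lookup` searches only the partition the key maps to; otherwise it has to scan every partition.

```python
# Illustrative sketch only: a plain-Python stand-in for a partitioned
# key-value dataset. ToyPairRDD, partition_by and scanned are hypothetical
# names for this example and are not part of the PySpark API.

class ToyPairRDD:
    def __init__(self, partitions, partition_func=None):
        self.partitions = partitions          # list of lists of (key, value)
        self.partition_func = partition_func  # None means "partitioner unknown"
        self.scanned = 0                      # partitions searched by the last lookup

    def partition_by(self, num_partitions, partition_func=hash):
        """Redistribute pairs so each key lands in partition_func(key) % n."""
        buckets = [[] for _ in range(num_partitions)]
        for part in self.partitions:
            for k, v in part:
                buckets[partition_func(k) % num_partitions].append((k, v))
        return ToyPairRDD(buckets, partition_func)

    def lookup(self, key):
        """Return all values stored under key."""
        if self.partition_func is not None:
            # Known partitioner: only one partition can contain the key.
            idx = self.partition_func(key) % len(self.partitions)
            candidates = [self.partitions[idx]]
        else:
            # Unknown partitioner: every partition must be searched.
            candidates = self.partitions
        self.scanned = len(candidates)
        return [v for part in candidates for k, v in part if k == key]


pairs = [(i, i) for i in range(1000)]
unpartitioned = ToyPairRDD([pairs[i::10] for i in range(10)])
partitioned = unpartitioned.partition_by(10)

slow = unpartitioned.lookup(42)  # scans all 10 partitions
fast = partitioned.lookup(42)    # scans 1 partition
```

Both calls return the same values; only the number of partitions searched differs, which is the saving the docstring describes.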
    
    It also cleans up the code in RDD.py and fixes several bugs (related to preservesPartitioning).
    
    Author: Davies Liu <davies.liu@gmail.com>
    
    Closes #2093 from davies/lookup and squashes the following commits:
    
    1789cd4 [Davies Liu] `f` in foreach could be generator or not.
    2871b80 [Davies Liu] Merge branch 'master' into lookup
    c6390ea [Davies Liu] address all comments
    0f1bce8 [Davies Liu] add test case for lookup()
    be0e8ba [Davies Liu] fix preservesPartitioning
    eb1305d [Davies Liu] add RDD.lookup(key)