    [SPARK-4398][PySpark] specialize sc.parallelize(xrange) · abd58175
    Xiangrui Meng authored
    `sc.parallelize(range(1 << 20), 1).count()` may take 15 seconds to finish, and the RDD object stores the entire list, which makes the task size very large. This PR adds a specialized version for xrange.
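
The idea of specializing on a lazy range can be sketched outside Spark. This is a minimal illustration, not the code from this PR: `split_range` is a hypothetical helper, and it uses Python 3's `range` (the counterpart of Python 2's `xrange`) to stand in for the input. It splits the range into per-partition sub-ranges by index arithmetic, so only the bounds and step of each slice need to be stored rather than every element.

```python
def split_range(r, num_slices):
    """Split a lazy range into num_slices contiguous sub-ranges.

    Hypothetical helper for illustration; not part of PySpark.
    """
    if num_slices < 1:
        raise ValueError("num_slices must be at least 1")
    size = len(r)
    # Slicing a range returns another range, so nothing is materialized:
    # each partition is described by (start, stop, step) only.
    return [
        r[i * size // num_slices:(i + 1) * size // num_slices]
        for i in range(num_slices)
    ]


# Each partition stays a small range object regardless of its length.
parts = split_range(range(1 << 20), 4)
print([len(p) for p in parts])
```

A partition can then generate its elements on the worker from its own sub-range, which is what keeps the serialized task small under this approach.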
    
    JoshRosen davies
    
    Author: Xiangrui Meng <meng@databricks.com>
    
    Closes #3264 from mengxr/SPARK-4398 and squashes the following commits:
    
    8953c41 [Xiangrui Meng] follow davies' suggestion
    cbd58e3 [Xiangrui Meng] specialize sc.parallelize(xrange)