    [SPARK-2652] [PySpark] Turn on some default configs for PySpark · 75663b57
    Davies Liu authored
    Add several default configs for PySpark related to serialization in the JVM:
    
    spark.serializer = org.apache.spark.serializer.KryoSerializer
    spark.serializer.objectStreamReset = 100
    spark.rdd.compress = True
    
    These defaults help reduce memory usage during RDD.partitionBy().
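The idea of shipping defaults like the three above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: `SketchConf`, `set_if_missing`, `DEFAULT_CONFIGS`, and `apply_defaults` are hypothetical names, and the sketch assumes a default should never override a value the user set explicitly, which the commit message does not state.

```python
# Hypothetical stand-in for a Spark configuration object, so the sketch runs
# without a Spark installation. Real PySpark code would use pyspark.SparkConf.
class SketchConf:
    def __init__(self, settings=None):
        self._settings = dict(settings or {})

    def set_if_missing(self, key, value):
        # Assumption: keep whatever the user already chose; only fill gaps.
        # Values are stored as strings, as Spark properties usually are.
        self._settings.setdefault(key, str(value))
        return self

    def get(self, key):
        return self._settings.get(key)


# The three defaults named in the commit message.
DEFAULT_CONFIGS = {
    "spark.serializer": "org.apache.spark.serializer.KryoSerializer",
    "spark.serializer.objectStreamReset": 100,
    "spark.rdd.compress": True,
}


def apply_defaults(conf, defaults=DEFAULT_CONFIGS):
    """Fill in every default that the configuration does not already define."""
    for key, value in defaults.items():
        conf.set_if_missing(key, value)
    return conf


# A user who explicitly disabled RDD compression keeps that choice,
# while the other two settings fall back to the defaults.
conf = apply_defaults(SketchConf({"spark.rdd.compress": "false"}))
```

With a real `pyspark.SparkConf`, the same values could presumably be set by hand with `SparkConf().set(key, value)` before creating the `SparkContext`.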
    
    Author: Davies Liu <davies.liu@gmail.com>
    
    Closes #1568 from davies/conf and squashes the following commits:
    
    cd316f1 [Davies Liu] remove duplicated line
    f71a355 [Davies Liu] rebase to master, add spark.rdd.compress = True
    8f63f45 [Davies Liu] Merge branch 'master' into conf
    8bc9f08 [Davies Liu] fix unittest
    c04a83d [Davies Liu] some default configs for PySpark