    [SQL] More aggressive defaults · 25bef7e6
    Michael Armbrust authored
     - Turns on compression for in-memory cached data by default.
     - Changes the default Parquet compression codec back to gzip (we have seen more OOMs with production workloads due to the way Snappy allocates memory).
     - Raises the batch size to 10,000 rows.
     - Increases the broadcast join threshold to 10 MB.
     - Uses our Parquet implementation instead of the Hive one by default.
     - Caches Parquet metadata by default.
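
    As a rough illustration, the new defaults listed above could be written out as explicit settings. The commit message does not name the configuration keys, so the key names below are assumptions based on the Spark SQL options of that era, and `as_set_statements` is a hypothetical helper that needs no Spark installation. Treat this as a sketch, not the contents of the patch.

    ```python
    # Assumed Spark SQL configuration keys; the commit message does not name them.
    AGGRESSIVE_DEFAULTS = {
        # Compress in-memory cached (columnar) data.
        "spark.sql.inMemoryColumnarStorage.compressed": "true",
        # Parquet compression codec switched back to gzip.
        "spark.sql.parquet.compression.codec": "gzip",
        # Rows per batch for the in-memory columnar cache.
        "spark.sql.inMemoryColumnarStorage.batchSize": "10000",
        # Broadcast threshold in bytes, taking 10 MB as 10 * 1024 * 1024.
        "spark.sql.autoBroadcastJoinThreshold": str(10 * 1024 * 1024),
        # Prefer Spark SQL's Parquet implementation over the Hive one.
        "spark.sql.hive.convertMetastoreParquet": "true",
        # Cache Parquet metadata.
        "spark.sql.parquet.cacheMetadata": "true",
    }


    def as_set_statements(conf):
        """Render each setting as a `SET key=value` statement string."""
        return ["SET {}={}".format(key, value) for key, value in conf.items()]


    if __name__ == "__main__":
        for statement in as_set_statements(AGGRESSIVE_DEFAULTS):
            print(statement)
    ```

    Pinning a value explicitly this way is also how a deployment could opt back out of any one of these defaults, for example by setting the codec to snappy again.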
    
    Author: Michael Armbrust <michael@databricks.com>
    
    Closes #3064 from marmbrus/fasterDefaults and squashes the following commits:
    
    97ee9f8 [Michael Armbrust] parquet codec docs
    e641694 [Michael Armbrust] Remote also
    a12866a [Michael Armbrust] Cache metadata.
    2d73acc [Michael Armbrust] Update docs defaults.
    d63d2d5 [Michael Armbrust] document parquet option
    da373f9 [Michael Armbrust] More aggressive defaults