    SPARK-3526 Add section about data locality to the tuning guide · 652b781a
    Andrew Ash authored
    cc kayousterhout
    
    I have a few outstanding questions from compiling this documentation:
    - What's the difference between NO_PREF and ANY? I understand the implications of the ordering, but I don't know what an example of each would be.
    - Why is NO_PREF ahead of RACK_LOCAL? I would think it'd be better to schedule rack-local tasks ahead of no-preference tasks if you could only do one or the other. Is the idea to wait longer and hope that the rack-local tasks become node-local or better?
    - Will there be a datacenter-local locality level in the future? Apache Cassandra, for example, has this level.
    
    Author: Andrew Ash <andrew@andrewash.com>
    
    Closes #2519 from ash211/SPARK-3526 and squashes the following commits:
    
    44cff28 [Andrew Ash] Link to spark.locality parameters rather than copying the list
    6d5d966 [Andrew Ash] Stay focused on Spark, no astronaut architecture mumbo-jumbo
    20e0e31 [Andrew Ash] SPARK-3526 Add section about data locality to the tuning guide