  • 2c43ea38
    [SPARK-6492][CORE] SparkContext.stop() can deadlock when DAGSchedulerEventProcessLoop dies · 2c43ea38
    Ilya Ganelin authored
    I've added a timeout and retry loop around the SparkContext shutdown code, which should fix this deadlock. If a SparkContext shutdown is already in progress when another thread requests one, that thread waits up to 10 seconds for the lock and then falls through, at which point the outer loop resubmits the request.
    
    Author: Ilya Ganelin <ilya.ganelin@capitalone.com>
    
    Closes #5277 from ilganeli/SPARK-6492 and squashes the following commits:
    
    8617a7e [Ilya Ganelin] Resolved merge conflict
    2fbab66 [Ilya Ganelin] Added MIMA Exclude
    a0e2c70 [Ilya Ganelin] Deleted stale imports
    fa28ce7 [Ilya Ganelin] reverted to just having a single stopped
    76fc825 [Ilya Ganelin] Updated to use atomic booleans instead of the synchronized vars
    6e8a7f7 [Ilya Ganelin] Removing unnecessary null check for now since I'm not fixing stop ordering yet
    cdf7073 [Ilya Ganelin] [SPARK-6492] Moved stopped=true back to the start of the shutdown sequence so this can be addressed in a separate PR
    7fb795b [Ilya Ganelin] Spacing
    b7a0c5c [Ilya Ganelin] Import ordering
    df8224f [Ilya Ganelin] Added comment for added lock
    343cb94 [Ilya Ganelin] [SPARK-6492] Added timeout/retry logic to fix a deadlock in SparkContext shutdown
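
    The timeout-and-retry idea described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from the patch: the class name `ShutdownGuard`, its fields, and the configurable timeout are all hypothetical, and the sketch is written in Java against `java.util.concurrent` rather than in Spark's Scala. It assumes a single atomic `stopped` flag set at the start of the shutdown sequence, as the squashed commits suggest.

    ```java
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.AtomicBoolean;
    import java.util.concurrent.atomic.AtomicInteger;
    import java.util.concurrent.locks.ReentrantLock;

    /** Hypothetical stand-in for a context whose stop() must never block forever. */
    class ShutdownGuard {
        private final ReentrantLock stopLock = new ReentrantLock();
        private final AtomicBoolean stopped = new AtomicBoolean(false);
        private final AtomicInteger shutdownRuns = new AtomicInteger(0);
        private final long timeoutMillis; // the commit message uses 10 seconds

        ShutdownGuard(long timeoutMillis) {
            this.timeoutMillis = timeoutMillis;
        }

        void stop() {
            // Outer loop: resubmit the request until the stopped flag is set.
            while (!stopped.get()) {
                boolean locked;
                try {
                    // Bounded wait instead of an unconditional lock().
                    locked = stopLock.tryLock(timeoutMillis, TimeUnit.MILLISECONDS);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    return;
                }
                if (locked) {
                    try {
                        // Only the thread that flips the flag runs the shutdown body.
                        if (stopped.compareAndSet(false, true)) {
                            shutdownRuns.incrementAndGet(); // placeholder for real teardown
                        }
                    } finally {
                        stopLock.unlock();
                    }
                }
                // On timeout we fall through and the loop re-checks the flag.
            }
        }

        boolean isStopped() {
            return stopped.get();
        }

        int shutdownRuns() {
            return shutdownRuns.get();
        }
    }
    ```

    In this sketch a second call to `stop()` returns immediately because the flag is already set, and a caller that times out on the lock re-checks the flag rather than waiting indefinitely on a thread that may never release it.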