    7215aa74
    [SPARK-6209] Clean up connections in ExecutorClassLoader after failing to load classes (master branch PR)
    
    ExecutorClassLoader does not ensure proper cleanup of network connections that it opens. If it fails to load a class, it may leak partially-consumed InputStreams that are connected to the REPL's HTTP class server, causing that server to exhaust its thread pool, which can cause the entire job to hang.  See [SPARK-6209](https://issues.apache.org/jira/browse/SPARK-6209) for more details, including a bug reproduction.
    
    This patch fixes this issue by ensuring proper cleanup of these resources.  It also adds logging for unexpected error cases.
    
    This PR is an extended version of #4935 and adds a regression test.
    
    Author: Josh Rosen <joshrosen@databricks.com>
    
    Closes #4944 from JoshRosen/executorclassloader-leak-master-branch and squashes the following commits:
    
    e0e3c25 [Josh Rosen] Wrap try block around getResponseCode; re-enable keep-alive by closing error stream
    961c284 [Josh Rosen] Roll back changes that were added to get the regression test to fail
    7ee2261 [Josh Rosen] Add a failing regression test
    e2d70a3 [Josh Rosen] Properly clean up after errors in ExecutorClassLoader
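
    The cleanup pattern described above can be sketched roughly as follows. This is an illustration only, not the code from the patch: the class `ClassFetchSketch` and the helpers `readAndClose` and `fetch` are hypothetical names, and the sketch assumes the class server is reached through a plain `java.net.HttpURLConnection`. The idea is that every stream obtained from the connection is closed on every path, including the error path, so the server-side thread can be released.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical sketch; names are illustrative and not taken from the patch.
public class ClassFetchSketch {

    /** Reads the stream to the end and closes it, even if a read fails. */
    static byte[] readAndClose(InputStream in) throws IOException {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } finally {
            // Runs on success and on failure, so no partially consumed
            // stream is left holding the connection open.
            in.close();
        }
    }

    /** Returns the response body, or null if the server did not answer 200. */
    static byte[] fetch(URL url) throws IOException {
        // Assumes an http:// URL; other schemes would fail this cast.
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        try {
            if (conn.getResponseCode() != HttpURLConnection.HTTP_OK) {
                // Drain and close the error stream so the underlying
                // connection can be reused (keep-alive) instead of leaking.
                InputStream err = conn.getErrorStream(); // may be null
                if (err != null) {
                    readAndClose(err);
                }
                return null;
            }
            return readAndClose(conn.getInputStream());
        } catch (IOException e) {
            // Unexpected failure: give up on the connection entirely.
            conn.disconnect();
            throw e;
        }
    }
}
```

    Keeping the close in a `finally` block means a failed read cannot skip it, and draining the error stream before closing it is what allows the JDK to return the connection to its keep-alive pool.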