    [SPARK-5161] Parallelize Python test execution · 7bbbe380
    Josh Rosen authored
    This commit parallelizes Python unit test execution, significantly reducing Jenkins build times. Parallelism is configurable by passing the `-p` or `--parallelism` flag to either `dev/run-tests` or `python/run-tests` (the default parallelism is 4, but I've successfully tested with higher values).
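
As a rough illustration of the idea (not the code from this commit), a test runner with a `-p`/`--parallelism` flag defaulting to 4 could be sketched as below. The function names `parse_args`, `run_one`, and `run_all` are hypothetical, and the thread pool from `concurrent.futures` is an assumption chosen for brevity; the actual script may schedule its workers differently.

```python
import argparse
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def parse_args(argv):
    """Parse the (hypothetical) runner's command line."""
    parser = argparse.ArgumentParser(description="Run test commands in parallel.")
    parser.add_argument("-p", "--parallelism", type=int, default=4,
                        help="number of tests to run at once (default: 4)")
    args = parser.parse_args(argv)
    if args.parallelism < 1:
        parser.error("parallelism must be a positive integer")
    return args


def run_one(command):
    """Run one test command and return (command, exit code, combined output)."""
    proc = subprocess.run(command,
                          stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT,
                          universal_newlines=True)  # decode output as text
    return command, proc.returncode, proc.stdout


def run_all(commands, parallelism):
    """Run commands on a pool of worker threads; return the failed commands."""
    failures = []
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        # pool.map yields results in the same order as the input commands.
        for command, code, output in pool.map(run_one, commands):
            if code != 0:
                # Print output only for failed tests to keep the log readable.
                print("FAILED: %s\n%s" % (" ".join(command), output))
                failures.append(command)
    return failures
```

Calling `run_all(commands, parse_args(sys.argv[1:]).parallelism)` with a list of argument lists would run up to that many commands at once. Threads are sufficient here because each worker spends its time waiting on a child process rather than executing Python code.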
    
    To avoid flakiness, I've disabled the Spark Web UI for the Python tests, similar to what we've done for the JVM tests.
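
For reference, Spark's web UI is controlled by the `spark.ui.enabled` configuration property. How this commit wires that setting into the test run is not shown on this page; the sketch below only illustrates one way a runner could add it to a `spark-submit`-style argument list. The helper name `with_ui_disabled` and the example file name are hypothetical.

```python
def with_ui_disabled(submit_args):
    """Return a copy of submit_args with the Spark web UI turned off.

    Assumes the arguments are passed to a launcher that accepts
    `--conf key=value` pairs, as spark-submit does.
    """
    return ["--conf", "spark.ui.enabled=false"] + list(submit_args)


# Example: arguments for a hypothetical test file.
args = with_ui_disabled(["some_test.py"])
```

With the UI off, concurrently running test processes do not each start their own UI server.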
    
    Author: Josh Rosen <joshrosen@databricks.com>
    
    Closes #7031 from JoshRosen/parallelize-python-tests and squashes the following commits:
    
    feb3763 [Josh Rosen] Re-enable other tests
    f87ea81 [Josh Rosen] Only log output from failed tests
    d4ded73 [Josh Rosen] Logging improvements
    a2717e1 [Josh Rosen] Make parallelism configurable via dev/run-tests
    1bacf1b [Josh Rosen] Merge remote-tracking branch 'origin/master' into parallelize-python-tests
    110cd9d [Josh Rosen] Fix universal_newlines for Python 3
    cd13db8 [Josh Rosen] Also log python_implementation
    9e31127 [Josh Rosen] Log Python --version output for each executable.
    a2b9094 [Josh Rosen] Bump up parallelism.
    5552380 [Josh Rosen] Python 3 fix
    866b5b9 [Josh Rosen] Fix lazy logging warnings in Prospector checks
    87cb988 [Josh Rosen] Skip MLLib tests for PyPy
    8309bfe [Josh Rosen] Temporarily disable parallelism to debug a failure
    9129027 [Josh Rosen] Disable Spark UI in Python tests
    037b686 [Josh Rosen] Temporarily disable JVM tests so we can test Python speedup in Jenkins.
    af4cef4 [Josh Rosen] Initial attempt at parallelizing Python test execution
run-tests 895 B