    [SPARK-8583] [SPARK-5482] [BUILD] Refactor python/run-tests to integrate with... · 40648c56
    Josh Rosen authored
    [SPARK-8583] [SPARK-5482] [BUILD] Refactor python/run-tests to integrate with dev/run-tests module system
    
    This patch refactors the `python/run-tests` script:
    
    - It's now written in Python instead of Bash.
    - The descriptions of the tests to run are now stored in `dev/run-tests`'s modules. This allows the pull request builder to skip Python test suites that were not affected by the pull request's changes. For example, we can now skip the PySpark Streaming test cases when only SQL files are changed.
    - `python/run-tests` now supports command-line flags to make it easier to run individual test suites (this addresses SPARK-5482):
    
      ```
    Usage: run-tests [options]
    
    Options:
      -h, --help            show this help message and exit
      --python-executables=PYTHON_EXECUTABLES
                            A comma-separated list of Python executables to test
                            against (default: python2.6,python3.4,pypy)
      --modules=MODULES     A comma-separated list of Python modules to test
                            (default: pyspark-core,pyspark-ml,pyspark-mllib
                            ,pyspark-sql,pyspark-streaming)
      ```
    - `dev/run-tests` has been split into multiple files: the module definitions and test utility functions now live in a `dev/sparktestsupport` Python module, so they can be reused by the Python test runner script.
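
The two ideas in the list above (selecting test modules from the changed files, and exposing `--python-executables` / `--modules` flags) can be illustrated with a minimal sketch. This is not the code from the patch: the `Module` class, its fields, the path prefixes, the test goal names, and the helpers `modules_for_files` and `parse_opts` are all hypothetical names chosen for illustration. Only the flag names and the `--python-executables` default are taken from the usage text above.

```python
from optparse import OptionParser


class Module(object):
    """Hypothetical test module: a name, the source prefixes it owns, its test goals."""

    def __init__(self, name, source_prefixes, python_test_goals):
        self.name = name
        self.source_prefixes = source_prefixes
        self.python_test_goals = python_test_goals

    def contains_file(self, filename):
        # A file belongs to the module if it lives under one of its prefixes.
        return any(filename.startswith(p) for p in self.source_prefixes)


# Illustrative definitions only; the paths and test goals are assumptions.
PYSPARK_SQL = Module(
    "pyspark-sql", ["python/pyspark/sql"], ["pyspark.sql.tests"])
PYSPARK_STREAMING = Module(
    "pyspark-streaming", ["python/pyspark/streaming"], ["pyspark.streaming.tests"])
ALL_MODULES = [PYSPARK_SQL, PYSPARK_STREAMING]


def modules_for_files(changed_files, modules=ALL_MODULES):
    """Return only the modules that own at least one changed file."""
    return [m for m in modules if any(m.contains_file(f) for f in changed_files)]


def parse_opts(argv):
    """Parse the two flags shown in the usage text (argv excludes the program name)."""
    parser = OptionParser(prog="run-tests")
    parser.add_option(
        "--python-executables", type="string",
        default="python2.6,python3.4,pypy",
        help="A comma-separated list of Python executables to test against "
             "(default: %default)")
    parser.add_option(
        "--modules", type="string",
        default=",".join(m.name for m in ALL_MODULES),
        help="A comma-separated list of Python modules to test (default: %default)")
    opts, args = parser.parse_args(argv)
    if args:
        parser.error("Unsupported arguments: %s" % " ".join(args))
    return opts
```

With definitions like these, a change that touches only `python/pyspark/sql/...` selects `pyspark-sql` and leaves the streaming suite out, which is the skipping behaviour the description mentions.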
    
    Author: Josh Rosen <joshrosen@databricks.com>
    
    Closes #6967 from JoshRosen/run-tests-python-modules and squashes the following commits:
    
    f578d6d [Josh Rosen] Fix print for Python 2.x
    8233d61 [Josh Rosen] Add python/run-tests.py to Python lint checks
    34c98d2 [Josh Rosen] Fix universal_newlines for Python 3
    8f65ed0 [Josh Rosen] Fix handling of  module in python/run-tests
    37aff00 [Josh Rosen] Python 3 fix
    27a389f [Josh Rosen] Skip MLLib tests for PyPy
    c364ccf [Josh Rosen] Use which() to convert PYSPARK_PYTHON to an absolute path before shelling out to run tests
    568a3fd [Josh Rosen] Fix hashbang
    3b852ae [Josh Rosen] Fall back to PYSPARK_PYTHON when sys.executable is None (fixes a test)
    f53db55 [Josh Rosen] Remove python2 flag, since the test runner script also works fine under Python 3
    9c80469 [Josh Rosen] Fix passing of PYSPARK_PYTHON
    d33e525 [Josh Rosen] Merge remote-tracking branch 'origin/master' into run-tests-python-modules
    4f8902c [Josh Rosen] Python lint fixes.
    8f3244c [Josh Rosen] Use universal_newlines to fix dev/run-tests doctest failures on Python 3.
    f542ac5 [Josh Rosen] Fix lint check for Python 3
    fff4d09 [Josh Rosen] Add dev/sparktestsupport to pep8 checks
    2efd594 [Josh Rosen] Update dev/run-tests to use new Python test runner flags
    b2ab027 [Josh Rosen] Add command-line options for running individual suites in python/run-tests
    caeb040 [Josh Rosen] Fixes to PySpark test module definitions
    d6a77d3 [Josh Rosen] Fix the tests of dev/run-tests
    def2d8a [Josh Rosen] Two minor fixes
    aec0b8f [Josh Rosen] Actually get the Kafka stuff to run properly
    04015b9 [Josh Rosen] First attempt at getting PySpark Kafka test to work in new runner script
    4c97136 [Josh Rosen] PYTHONPATH fixes
    dcc9c09 [Josh Rosen] Fix time division
    32660fc [Josh Rosen] Initial cut at Python test runner refactoring
    311c6a9 [Josh Rosen] Move shell utility functions to own module.
    1bdeb87 [Josh Rosen] Move module definitions to separate file.
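
Two of the squashed fixes above mention resolving the Python executable with `which()` and passing `universal_newlines` when shelling out. The sketch below shows one way those pieces could look; it is an assumption-laden illustration rather than the code from the patch. The bodies of `which` and `run_and_capture` are hypothetical, and only `subprocess.check_output(..., universal_newlines=True)` and the `os` calls are standard-library behaviour.

```python
import os
import subprocess
import sys


def which(program):
    """Return the absolute path of `program` if it is executable, else None.

    Hypothetical helper: names containing a directory are checked directly,
    bare names are searched for on PATH.
    """
    def is_exe(path):
        return os.path.isfile(path) and os.access(path, os.X_OK)

    if os.path.dirname(program):
        return os.path.abspath(program) if is_exe(program) else None
    for directory in os.environ.get("PATH", "").split(os.pathsep):
        candidate = os.path.join(directory.strip('"'), program)
        if is_exe(candidate):
            return os.path.abspath(candidate)
    return None


def run_and_capture(cmd):
    """Run `cmd` and return its stdout as text.

    universal_newlines=True makes check_output return str rather than bytes
    on Python 3, so callers can compare the output against string literals.
    """
    return subprocess.check_output(cmd, universal_newlines=True)
```

A caller would typically resolve the interpreter once, for example `which(os.environ.get("PYSPARK_PYTHON", "python"))`, and pass the resulting absolute path to `run_and_capture`, so that the child process does not depend on how `PATH` is set up when the tests run.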