    97cf19f6
    Fix for sampling error in NumPy v1.9 [SPARK-3995][PYSPARK] · 97cf19f6
    freeman authored
    Change the maximum value of the default seed used during RDD sampling so that it is strictly less than 2 ** 32. This avoids an error with NumPy v1.9, which does not accept random seeds at or above this bound.
    
    Adds an extra test that uses the default seed (instead of setting it manually, as in the docstrings).
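
A minimal sketch of the idea described above, not the actual diff from this commit: a default seed is drawn so that it always falls in the range 0 <= seed < 2 ** 32. The names `SEED_BOUND` and `default_seed` are hypothetical and chosen only for illustration.

```python
import random

# Exclusive upper bound on the seed, per the commit message: the default
# seed must be strictly less than 2 ** 32.
SEED_BOUND = 2 ** 32


def default_seed(rng=random):
    """Hypothetical helper: draw a default seed in [0, 2 ** 32 - 1].

    random.randint includes both endpoints, so the upper argument is
    SEED_BOUND - 1 rather than SEED_BOUND.
    """
    return rng.randint(0, SEED_BOUND - 1)


def resolve_seed(seed=None):
    """Return the caller's seed if one was given, else a bounded default."""
    return seed if seed is not None else default_seed()


if __name__ == "__main__":
    seed = resolve_seed()
    # The value printed differs from run to run, but always satisfies the bound.
    print(0 <= seed < SEED_BOUND, seed)
```

A value produced this way can be passed to `numpy.random.RandomState`, which expects seeds between 0 and 2 ** 32 - 1. A seed supplied explicitly by the caller is returned unchanged, so only the default path is affected.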
    
    mengxr
    
    Author: freeman <the.freeman.lab@gmail.com>
    
    Closes #2889 from freeman-lab/pyspark-sampling and squashes the following commits:
    
    dc385ef [freeman] Change maximum value for default seed