[SPARK-14238][ML][MLLIB][PYSPARK] Add binary toggle Param to PySpark HashingTF in ML & MLlib
## What changes were proposed in this pull request?

This fix tries to add binary toggle Param to PySpark HashingTF in ML & MLlib. If this toggle is set, then all non-zero counts will be set to 1.

Note: This fix (SPARK-14238) is extended from SPARK-13963 where Scala implementation was done.

## How was this patch tested?

This fix adds two tests to cover the code changes. One for HashingTF in PySpark's ML and one for HashingTF in PySpark's MLLib.

Author: Yong Tang <yong.tang.github@outlook.com>

Closes #12079 from yongtang/SPARK-14238.
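
To illustrate what "all non-zero counts will be set to 1" means, here is a minimal pure-Python sketch of a hashing term-frequency transform with a binary toggle. It is not the code from this patch and does not call Spark: the function name `hashing_tf` and its parameters are hypothetical, and `zlib.crc32` is swapped in as a simple deterministic hash in place of whatever hash function Spark uses.

```python
import zlib
from collections import Counter


def hashing_tf(terms, num_features=1 << 20, binary=False):
    """Hypothetical stand-in for a HashingTF transform on one document.

    Returns a sparse vector as a dict mapping feature index -> value.
    """
    counts = Counter()
    for term in terms:
        # Assumption: crc32 is used only for illustration, not Spark's hash.
        index = zlib.crc32(term.encode("utf-8")) % num_features
        counts[index] += 1
    if binary:
        # Binary toggle: every non-zero count becomes 1.0.
        return {index: 1.0 for index in counts}
    return {index: float(count) for index, count in counts.items()}


doc = ["a", "a", "b", "b", "c"]
plain = hashing_tf(doc)                 # term frequencies
toggled = hashing_tf(doc, binary=True)  # presence/absence only
```

With the toggle off, the values sum to the number of terms in the document; with it on, the same indices are populated but every value is 1.0, which suits models that expect binary rather than integer counts.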
Showing 4 changed files:
- python/pyspark/ml/feature.py: 22 additions, 2 deletions
- python/pyspark/ml/tests.py: 19 additions, 0 deletions
- python/pyspark/mllib/feature.py: 12 additions, 1 deletion
- python/pyspark/mllib/tests.py: 16 additions, 0 deletions