    [SPARK-10574][ML][MLLIB] HashingTF supports MurmurHash3 · 425f6916
    Yanbo Liang authored
    ## What changes were proposed in this pull request?
    As discussed in [SPARK-10574](https://issues.apache.org/jira/browse/SPARK-10574), ```HashingTF``` should support MurmurHash3 and use it as the default hash algorithm. We should also expose set/get APIs for ```hashAlgorithm``` so that users can choose the hash method.
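
For illustration only, the idea of a hashing term-frequency transformer with a selectable hash algorithm can be sketched in plain Python. This is not the code from this patch: `SimpleHashingTF`, `murmur3_32`, `java_string_hash`, the method names, and the seed value are all hypothetical stand-ins, and the resulting indices are not guaranteed to match what Spark produces.

```python
from collections import Counter

MASK32 = 0xFFFFFFFF


def murmur3_32(data: bytes, seed: int = 0) -> int:
    """MurmurHash3 (x86, 32-bit variant); returns an unsigned 32-bit int."""
    c1, c2 = 0xCC9E2D51, 0x1B873593

    def rotl(x: int, r: int) -> int:
        return ((x << r) | (x >> (32 - r))) & MASK32

    h = seed & MASK32
    n_blocks = len(data) // 4
    # Body: mix each little-endian 4-byte block into the state.
    for i in range(n_blocks):
        k = int.from_bytes(data[4 * i:4 * i + 4], "little")
        k = rotl((k * c1) & MASK32, 15)
        k = (k * c2) & MASK32
        h = rotl(h ^ k, 13)
        h = (h * 5 + 0xE6546B64) & MASK32
    # Tail: mix the remaining 0-3 bytes, if any.
    tail = data[4 * n_blocks:]
    if tail:
        k = int.from_bytes(tail, "little")
        k = rotl((k * c1) & MASK32, 15)
        k = (k * c2) & MASK32
        h ^= k
    # Finalization: fold in the length, then avalanche the bits.
    h ^= len(data)
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def java_string_hash(term: str) -> int:
    """Stand-in for a 'native' hash: the 31-based polynomial string hash."""
    h = 0
    for ch in term:
        h = (31 * h + ord(ch)) & MASK32
    return h


class SimpleHashingTF:
    """Hypothetical hashing-trick TF with a selectable hash algorithm."""

    _HASHERS = {
        # Seed 42 is an arbitrary choice for this sketch.
        "murmur3": lambda term: murmur3_32(term.encode("utf-8"), seed=42),
        "native": java_string_hash,
    }

    def __init__(self, num_features: int = 1 << 20, hash_algorithm: str = "murmur3"):
        self.num_features = num_features
        self.set_hash_algorithm(hash_algorithm)

    def set_hash_algorithm(self, name: str) -> "SimpleHashingTF":
        if name not in self._HASHERS:
            raise ValueError(f"unsupported hash algorithm: {name}")
        self._hash_algorithm = name
        return self

    def get_hash_algorithm(self) -> str:
        return self._hash_algorithm

    def index_of(self, term: str) -> int:
        # Map the term's hash to a bucket in [0, num_features).
        return self._HASHERS[self._hash_algorithm](term) % self.num_features

    def transform(self, document) -> dict:
        # Sparse term-frequency vector: bucket index -> count.
        return dict(Counter(self.index_of(t) for t in document))
```

Switching algorithms with `SimpleHashingTF(num_features=16).set_hash_algorithm("native")` changes which bucket each term lands in, so feature vectors built with different algorithms are not comparable.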
    
    Note: The problem that ```mllib.feature.HashingTF``` behaves differently between Scala/Java and Python will be resolved in follow-up work.
    
    ## How was this patch tested?
    Unit tests.
    
    cc jkbradley MLnick
    
    Author: Yanbo Liang <ybliang8@gmail.com>
    Author: Joseph K. Bradley <joseph@databricks.com>
    
    Closes #12498 from yanboliang/spark-10574.