[SPARK-17548][MLLIB] Word2VecModel.findSynonyms no longer spuriously rejects the best match when invoked with a vector

## What changes were proposed in this pull request?

This pull request changes the behavior of `Word2VecModel.findSynonyms` so that it will not spuriously reject the best match when invoked with a vector that does not correspond to a word in the model's vocabulary. Instead of blindly discarding the best match, the changed implementation discards a match that corresponds to the query word (in cases where `findSynonyms` is invoked with a word) or that has an identical angle to the query vector.

## How was this patch tested?

I added a test to `Word2VecSuite` to ensure that the word with the most similar vector from a supplied vector would not be spuriously rejected.

Author: William Benton <willb@redhat.com>

Closes #15105 from willb/fix/findSynonyms.
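The filtering rule described under "What changes were proposed" can be illustrated with a small, framework-free sketch. This is not the Spark implementation: the names `cosine`, `find_synonyms`, `toy_vocab`, and the `eps` tolerance are hypothetical, and the sketch assumes one reading of the description, namely that a match is dropped by name when the query is a word, and dropped only when its cosine similarity to the query is effectively 1 when the query is a bare vector.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_synonyms(
    vocab: Dict[str, Sequence[float]],
    query_vector: Sequence[float],
    num: int,
    query_word: Optional[str] = None,
    eps: float = 1e-9,  # hypothetical tolerance for "identical angle"
) -> List[Tuple[str, float]]:
    """Return up to `num` (word, similarity) pairs, most similar first."""
    # Score every vocabulary word against the query, best match first.
    scored = sorted(
        ((word, cosine(vec, query_vector)) for word, vec in vocab.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )
    kept = []
    for word, sim in scored:
        if query_word is not None:
            # Invoked with a word: drop only the query word itself.
            if word == query_word:
                continue
        elif sim >= 1.0 - eps:
            # Invoked with a vector: drop only matches at an identical angle.
            continue
        kept.append((word, sim))
    return kept[:num]


# Toy data for illustration only; not taken from the patch or its tests.
toy_vocab = {
    "king": [1.0, 0.0],
    "queen": [0.9, 0.1],
    "apple": [0.0, 1.0],
}

# A query vector that matches no vocabulary word keeps its best match.
best_for_new_vector = find_synonyms(toy_vocab, [0.8, 0.2], 1)

# A query by word excludes only that word from the results.
synonyms_for_king = find_synonyms(toy_vocab, toy_vocab["king"], 2, query_word="king")
```

The point of the sketch is the contrast with unconditionally dropping the first result: for `[0.8, 0.2]`, which is not the vector of any word in `toy_vocab`, the top-ranked word is returned rather than discarded.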
Changed files:
- mllib/src/main/scala/org/apache/spark/ml/feature/Word2Vec.scala: 11 additions, 9 deletions
- mllib/src/main/scala/org/apache/spark/mllib/api/python/Word2VecModelWrapper.scala: 19 additions, 3 deletions
- mllib/src/main/scala/org/apache/spark/mllib/feature/Word2Vec.scala: 28 additions, 9 deletions
- mllib/src/test/scala/org/apache/spark/mllib/feature/Word2VecSuite.scala: 16 additions, 0 deletions
- python/pyspark/mllib/feature.py: 9 additions, 3 deletions