[SPARK-8536] [MLLIB] Generalize OnlineLDAOptimizer to asymmetric document-topic Dirichlet priors
Modify `LDA` to take asymmetric document-topic prior distributions and `OnlineLDAOptimizer` to use the asymmetric prior during variational inference.

This PR only generalizes `OnlineLDAOptimizer` and the associated `LocalLDAModel`; `EMLDAOptimizer` and `DistributedLDAModel` still only support symmetric `alpha` (checked during `EMLDAOptimizer.initialize`).

Author: Feynman Liang <fliang@databricks.com>

Closes #7575 from feynmanliang/SPARK-8536-LDA-asymmetric-priors and squashes the following commits:

af8fbb7 [Feynman Liang] Fix merge errors
ef5821d [Feynman Liang] Merge remote-tracking branch 'apache/master' into SPARK-8536-LDA-asymmetric-priors
58f1d7b [Feynman Liang] Fix from review feedback
a6dcf70 [Feynman Liang] Change docConcentration interface and move LDAOptimizer validation to initialize, add sad path tests
72038ff [Feynman Liang] Add tests referenced against gensim
d4284fa [Feynman Liang] Generalize OnlineLDA to asymmetric priors, no tests
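
For illustration, a minimal sketch of how the generalized interface might be used, assuming a Spark MLlib build that includes this change. The toy corpus, the prior values, the object name `AsymmetricPriorSketch`, and the local master setting are all made up for the example and are not taken from the PR or its tests.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.mllib.clustering.{LDA, OnlineLDAOptimizer}
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.rdd.RDD

// Hypothetical driver object; the name is illustrative only.
object AsymmetricPriorSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("asymmetric-lda-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Made-up corpus: (document id, term-count vector over a 4-term vocabulary).
      val corpus: RDD[(Long, Vector)] = sc.parallelize(Seq(
        (0L, Vectors.dense(2.0, 1.0, 0.0, 0.0)),
        (1L, Vectors.dense(0.0, 0.0, 3.0, 1.0)),
        (2L, Vectors.dense(1.0, 2.0, 0.0, 1.0))
      ))

      val k = 2
      // Asymmetric document-topic prior: one entry per topic, so its length must equal k.
      // The values are arbitrary and chosen only to be unequal.
      val alpha: Vector = Vectors.dense(0.1, 0.9)

      val lda = new LDA()
        .setK(k)
        .setDocConcentration(alpha)
        // Per the commit message, only the online optimizer accepts an asymmetric prior;
        // the EM optimizer still validates that alpha is symmetric during initialize.
        .setOptimizer(new OnlineLDAOptimizer().setMiniBatchFraction(1.0))
        .setMaxIterations(10)

      val model = lda.run(corpus)
      // Print the prior carried by the fitted model.
      println(model.docConcentration)
    } finally {
      sc.stop()
    }
  }
}
```

The mini-batch fraction is set to 1.0 only because the toy corpus is tiny; a real job would normally sample a smaller fraction per iteration.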
Changed files:
- mllib/src/main/scala/org/apache/spark/mllib/clustering/LDA.scala (31 additions, 18 deletions)
- mllib/src/main/scala/org/apache/spark/mllib/clustering/LDAOptimizer.scala (21 additions, 6 deletions)
- mllib/src/test/scala/org/apache/spark/mllib/clustering/LDASuite.scala (74 additions, 8 deletions)