[SPARK-19066][SPARKR] SparkR LDA doesn't set optimizer correctly
## What changes were proposed in this pull request?

`spark.lda` passes the optimizer (`"em"` or `"online"`) to the backend as a string. However, `LDAWrapper` does not set the optimizer from the value passed by R. As a result, with optimizer `"em"` the `isDistributed` field is FALSE, whereas it should be TRUE according to the Scala code. In addition, the `summary` method should return the results related to `DistributedLDAModel`.

## How was this patch tested?

Manual tests comparing against the Scala example. Modified the existing unit tests: fixed the incorrect unit test and added the necessary tests for the `summary` method.

Author: wm624@hotmail.com <wm624@hotmail.com>

Closes #16464 from wangmiao1981/new.
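
For reference, the Scala-side behaviour that the R wrapper is expected to match can be sketched as follows. This is a minimal illustration, not the code from this patch: the object name `LdaOptimizerSketch`, the parameter values, and the data path (Spark's bundled sample file, relative to a Spark distribution) are assumptions. It only uses the public `org.apache.spark.ml.clustering.LDA` API.

```scala
import org.apache.spark.ml.clustering.{DistributedLDAModel, LDA}
import org.apache.spark.sql.SparkSession

// Hypothetical driver object, for illustration only.
object LdaOptimizerSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("LdaOptimizerSketch")
      .getOrCreate()

    // Assumed path: the sample LDA data shipped with Spark.
    val dataset = spark.read.format("libsvm")
      .load("data/mllib/sample_lda_libsvm_data.txt")

    // The optimizer arrives as a plain string ("em" or "online"),
    // which is how spark.lda hands it to the backend.
    val optimizer = "em"

    val lda = new LDA()
      .setK(10)
      .setMaxIter(10)
      .setOptimizer(optimizer) // the step the wrapper was missing

    val model = lda.fit(dataset)

    // With "em" the fitted model is expected to be distributed.
    println(s"isDistributed = ${model.isDistributed}")

    // Extra statistics are only available on DistributedLDAModel.
    model match {
      case d: DistributedLDAModel =>
        println(s"trainingLogLikelihood = ${d.trainingLogLikelihood}")
        println(s"logPrior = ${d.logPrior}")
      case _ =>
        println("Local model: no training log-likelihood or log-prior.")
    }

    spark.stop()
  }
}
```

Switching `optimizer` to `"online"` should produce a local model instead, so the pattern match falls through to the second case.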
Showing 4 changed files:
- R/pkg/R/mllib_clustering.R (19 additions, 1 deletion)
- R/pkg/inst/tests/testthat/test_mllib_clustering.R (14 additions, 2 deletions)
- R/pkg/inst/tests/testthat/test_mllib_tree.R (0 additions, 1 deletion)
- mllib/src/main/scala/org/apache/spark/ml/r/LDAWrapper.scala (9 additions, 1 deletion)