[SPARK-5886][ML] Add StringIndexer as a feature transformer
This PR adds string indexer, which takes a column of string labels and outputs a double column with labels indexed by their frequency.

TODOs:
- [x] store feature to index map in output metadata

Author: Xiangrui Meng <meng@databricks.com>

Closes #4735 from mengxr/SPARK-5886 and squashes the following commits:

d82575f [Xiangrui Meng] fix test
700e70f [Xiangrui Meng] rename LabelIndexer to StringIndexer
16a6f8c [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into SPARK-5886
457166e [Xiangrui Meng] Merge remote-tracking branch 'apache/master' into SPARK-5886
f8b30f4 [Xiangrui Meng] update label indexer to output metadata
e81ec28 [Xiangrui Meng] Merge branch 'openhashmap-contains' into SPARK-5886-2
d6e6f1f [Xiangrui Meng] add contains to primitivekeyopenhashmap
748a69b [Xiangrui Meng] add contains to OpenHashMap
def3c5c [Xiangrui Meng] add LabelIndexer
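
For readers unfamiliar with the idea, frequency-based label indexing can be sketched in plain Scala, with no Spark dependency. This is an illustration only, not the code in this PR: the object name `StringIndexSketch` and the helpers `fitIndex` and `transform` are hypothetical, and the alphabetical tie-break is an assumption added to keep the output deterministic. The most frequent label is given index 0.0, which matches the documented behavior of Spark's StringIndexer.

```scala
object StringIndexSketch {

  // Build a label -> index map where the most frequent label gets 0.0.
  // Assumption: ties between equally frequent labels are broken
  // alphabetically so that the result is deterministic.
  def fitIndex(labels: Seq[String]): Map[String, Double] = {
    // Count how many times each distinct label occurs.
    val counts: Map[String, Int] =
      labels.groupBy(identity).map { case (label, occurrences) =>
        (label, occurrences.size)
      }

    counts.toSeq
      .sortBy { case (label, count) => (-count, label) } // descending count
      .map(_._1)
      .zipWithIndex
      .map { case (label, idx) => (label, idx.toDouble) }
      .toMap
  }

  // Replace each label by its index. Labels that were not seen by
  // fitIndex cause a NoSuchElementException in this sketch.
  def transform(labels: Seq[String], index: Map[String, Double]): Seq[Double] =
    labels.map(index)

  def main(args: Array[String]): Unit = {
    val labels = Seq("a", "b", "c", "a", "a", "c")
    val index = fitIndex(labels)
    println(transform(labels, index))
  }
}
```

With the sample input above, "a" occurs three times, "c" twice, and "b" once, so the sketch maps them to 0.0, 1.0, and 2.0 respectively. The real transformer additionally records the label-to-index mapping in the output column's metadata, which this sketch does not attempt to model.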
Showing 2 changed files:
- mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala (126 additions, 0 deletions)
- mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala (52 additions, 0 deletions)