[SPARK-8764] [ML] string indexer should take option to handle unseen values
As a precursor to adding a public constructor, add an option to handle unseen values by skipping them rather than throwing an exception (the default remains throwing an exception).

Author: Holden Karau <holden@pigscanfly.ca>

Closes #7266 from holdenk/SPARK-8764-string-indexer-should-take-option-to-handle-unseen-values and squashes the following commits:

38a4de9 [Holden Karau] fix long line
045bf22 [Holden Karau] Add a second b entry so b gets 0 for sure
81dd312 [Holden Karau] Update the docs for handleInvalid param to be more descriptive
7f37f6e [Holden Karau] remove extra space (scala style)
414e249 [Holden Karau] And switch to using handleInvalid instead of skipInvalid
1e53f9b [Holden Karau] update the param (codegen side)
7a22215 [Holden Karau] fix typo
100a39b [Holden Karau] Merge in master
aa5b093 [Holden Karau] Since we filter we should never go down this code path if getSkipInvalid is true
75ffa69 [Holden Karau] Remove extra newline
d69ef5e [Holden Karau] Add a test
b5734be [Holden Karau] Add support for unseen labels
afecd4e [Holden Karau] Add a param to skip invalid entries.
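The behaviour described in the commit message can be illustrated with a small, dependency-free sketch. It is a simplified model and not the code from this change: the object `HandleInvalidSketch`, its `fit` and `transform` helpers, the alphabetical tie-break, and the exception types are assumptions made for illustration. Only the parameter name `handleInvalid`, the skip-versus-throw choice, and the fact that the most frequent label gets index 0 come from the commit message itself.

```scala
// Simplified model of the handleInvalid option; NOT Spark's implementation.
// All names below except "handleInvalid" are hypothetical.
object HandleInvalidSketch {

  // "Fit": assign indices by descending label frequency, so the most
  // frequent label gets 0.0. Ties are broken alphabetically (an assumption).
  def fit(labels: Seq[String]): Map[String, Double] =
    labels.groupBy(identity).toSeq
      .sortBy { case (label, occurrences) => (-occurrences.size, label) }
      .map { case (label, _) => label }
      .zipWithIndex
      .map { case (label, idx) => label -> idx.toDouble }
      .toMap

  // "Transform": map each label to its index. Labels that were not seen
  // during fit are either dropped ("skip") or cause an exception ("error").
  def transform(
      index: Map[String, Double],
      labels: Seq[String],
      handleInvalid: String = "error"): Seq[(String, Double)] =
    handleInvalid match {
      case "skip" =>
        // Filter out rows whose label is unknown, keep the rest.
        labels.collect { case l if index.contains(l) => l -> index(l) }
      case "error" =>
        // Default behaviour: fail on the first unseen label.
        labels.map { l =>
          l -> index.getOrElse(l, throw new NoSuchElementException(s"Unseen label: $l"))
        }
      case other =>
        throw new IllegalArgumentException(s"Unsupported handleInvalid value: $other")
    }

  def main(args: Array[String]): Unit = {
    val index = fit(Seq("a", "b", "b"))                    // b is most frequent
    println(transform(index, Seq("a", "b", "c"), "skip"))  // row with "c" is dropped
  }
}
```

With `"skip"`, the row carrying the unseen label `c` is silently filtered out; with the default `"error"`, the same input raises an exception. Keeping `"error"` as the default means existing callers see no change in behaviour unless they opt in.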
Changed files:
- mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala: 22 additions, 4 deletions
- mllib/src/main/scala/org/apache/spark/ml/param/shared/SharedParamsCodeGen.scala: 4 additions, 0 deletions
- mllib/src/main/scala/org/apache/spark/ml/param/shared/sharedParams.scala: 15 additions, 0 deletions
- mllib/src/test/scala/org/apache/spark/ml/feature/StringIndexerSuite.scala: 32 additions, 0 deletions