Commit f4344582 authored 9 years ago by fwang1 Committed by Xiangrui Meng 9 years ago

[SPARK-14497][ML] Use top instead of sortBy() to get top N frequent words as dict in CountVectorizer

## What changes were proposed in this pull request?

Replace sortBy() with top() when computing the top N most frequent words used as the dictionary, so the full set of word counts no longer has to be sorted.
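The idea behind the change can be sketched outside Spark. The snippet below is a minimal illustration in plain Python, not the patch itself: `top_n_by_sort`, `top_n_by_heap`, and the sample `docs` are hypothetical names and data, and `heapq.nlargest` stands in for a bounded top-N selection such as `top()`.

```python
import heapq
from collections import Counter


def top_n_by_sort(counts, n):
    # Sort every (term, count) pair by count, then keep the first n.
    return sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:n]


def top_n_by_heap(counts, n):
    # Keep only the n largest pairs while scanning; no full sort is needed.
    return heapq.nlargest(n, counts.items(), key=lambda kv: kv[1])


# Hypothetical tokenized documents, used only for illustration.
docs = [["a", "b", "c"], ["a", "b"], ["a"]]
counts = Counter(word for doc in docs for word in doc)

vocab_sorted = [term for term, _ in top_n_by_sort(counts, 2)]
vocab_heap = [term for term, _ in top_n_by_heap(counts, 2)]
```

Both functions select the same two terms here; the difference is only in how much work is done when the vocabulary is much larger than N.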

## How was this patch tested?
Existing unit tests. Terms with the same TF are now sorted in descending order, so a test would fail if it hardcodes the order of same-TF terms in the dictionary, like "c", "d"...
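To see why a hardcoded order for tied terms is fragile, here is a small sketch in plain Python. It assumes ties are broken by comparing the terms themselves in descending order; the actual tie-breaking in Spark depends on the ordering passed to `top()`. `top_terms` and `term_counts` are hypothetical.

```python
import heapq

# Hypothetical (term, count) pairs; "c" and "d" share the same count.
term_counts = [("a", 3), ("c", 1), ("d", 1), ("b", 2)]


def top_terms(pairs, n):
    # Rank by (count, term) so that equal counts fall back to the term,
    # both in descending order. This tie-break rule is an assumption.
    ranked = heapq.nlargest(n, pairs, key=lambda kv: (kv[1], kv[0]))
    return [term for term, _ in ranked]


vocab = top_terms(term_counts, 4)

# An order-insensitive check on the tied tail stays valid either way.
tied_tail = set(vocab[2:])
```

Under this assumption the tied terms come out as "d" then "c", so an assertion that expects "c" before "d" fails, while a set comparison of the tied terms still passes.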

Author: fwang1 <desperado.wf@gmail.com>

Closes #12265 from lionelfeng/master.

parent 22014e6f


Showing with 8 additions and 13 deletions