Commit 7b841540 authored by Sean Owen

[SPARK-12494][MLLIB] Array out of bound Exception in KMeans Yarn Mode

## What changes were proposed in this pull request?

Give a better error message when k-means initialization can't get enough samples from the input (perhaps because it is empty).

## How was this patch tested?

Jenkins tests.

Author: Sean Owen <sowen@cloudera.com>

Closes #11979 from srowen/SPARK-12494.
parent aac13fb4
@@ -390,6 +390,8 @@ class KMeans private (
// Initialize each run's first center to a random point.
val seed = new XORShiftRandom(this.seed).nextInt()
val sample = data.takeSample(true, runs, seed).toSeq
// Could be empty if data is empty; fail with a better message early:
require(sample.size >= runs, s"Required $runs samples but got ${sample.size} from $data")
val newCenters = Array.tabulate(runs)(r => ArrayBuffer(sample(r).toDense))
/** Merges new centers to centers. */
......
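
The effect of the added `require` can be sketched without a Spark cluster. The snippet below is a minimal illustration, not the MLlib code: `InitRandomSketch`, `takeSampleWithReplacement`, and `initFirstCenters` are hypothetical names, a plain `IndexedSeq` stands in for the RDD, and the sampling helper only assumes what the comment in the hunk states, namely that sampling from empty data yields an empty result.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.Random

object InitRandomSketch {
  // Hypothetical stand-in for sampling with replacement from an RDD.
  // Assumption: empty input yields an empty sample, as the hunk's comment notes.
  def takeSampleWithReplacement[T](data: IndexedSeq[T], num: Int, seed: Long): Seq[T] = {
    if (data.isEmpty) {
      Seq.empty
    } else {
      val rng = new Random(seed)
      Seq.fill(num)(data(rng.nextInt(data.length)))
    }
  }

  // Mirrors the shape of the patched code: sample, validate, then index.
  def initFirstCenters(
      data: IndexedSeq[Array[Double]],
      runs: Int,
      seed: Long): Array[ArrayBuffer[Array[Double]]] = {
    val sample = takeSampleWithReplacement(data, runs, seed)
    // Without this check, sample(r) below would fail with an
    // out-of-bounds error that says nothing about the empty input.
    require(sample.size >= runs, s"Required $runs samples but got ${sample.size}")
    Array.tabulate(runs)(r => ArrayBuffer(sample(r)))
  }
}
```

With non-empty input, `initFirstCenters` returns one buffer per run. With empty input it fails up front with an `IllegalArgumentException` whose message names the required and actual sample counts, instead of an index error raised later by `sample(r)`.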