Commit 7ecb867c authored by Xiangrui Meng

[MLLIB] use Iterator.fill instead of Array.fill

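Iterator.fill uses less memory than Array.fill(...).toIterator because it generates
values lazily instead of materializing the whole partition in an array first.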

Author: Xiangrui Meng <meng@databricks.com>

Closes #1930 from mengxr/rand-gen-iter and squashes the following commits:

24178ca [Xiangrui Meng] use Iterator.fill instead of Array.fill
parent 434bea1c
@@ -105,16 +105,16 @@ private[mllib] object RandomRDD {
   def getPointIterator[T: ClassTag](partition: RandomRDDPartition[T]): Iterator[T] = {
     val generator = partition.generator.copy()
     generator.setSeed(partition.seed)
-    Array.fill(partition.size)(generator.nextValue()).toIterator
+    Iterator.fill(partition.size)(generator.nextValue())
   }
   // The RNG has to be reset every time the iterator is requested to guarantee same data
   // every time the content of the RDD is examined.
-  def getVectorIterator(partition: RandomRDDPartition[Double],
-      vectorSize: Int): Iterator[Vector] = {
+  def getVectorIterator(
+      partition: RandomRDDPartition[Double],
+      vectorSize: Int): Iterator[Vector] = {
     val generator = partition.generator.copy()
     generator.setSeed(partition.seed)
-    Array.fill(partition.size)(new DenseVector(
-      (0 until vectorSize).map { _ => generator.nextValue() }.toArray)).toIterator
+    Iterator.fill(partition.size)(new DenseVector(Array.fill(vectorSize)(generator.nextValue())))
   }
 }
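
The memory difference behind this change can be illustrated with a small standalone sketch. It is not part of the commit: `IteratorFillDemo`, `CountingGenerator`, `eagerIterator`, and `lazyIterator` are hypothetical names introduced here, and `CountingGenerator` is only a stand-in for the generator type used by `RandomRDD`. The sketch counts how many values have been drawn by the time each iterator is handed back to the caller.

```scala
import scala.util.Random

object IteratorFillDemo {
  // Hypothetical stand-in for the RandomRDD generator (an assumption, not Spark's class).
  // It records how many values have been drawn so far.
  final class CountingGenerator(seed: Long) {
    private val rng = new Random(seed)
    var calls: Int = 0
    def nextValue(): Double = { calls += 1; rng.nextDouble() }
  }

  // Old approach: all `size` values are generated and held in an array
  // before the iterator is returned.
  def eagerIterator(size: Int, gen: CountingGenerator): Iterator[Double] =
    Array.fill(size)(gen.nextValue()).iterator

  // New approach: each value is generated only when the consumer calls next().
  def lazyIterator(size: Int, gen: CountingGenerator): Iterator[Double] =
    Iterator.fill(size)(gen.nextValue())
}
```

With the same seed both iterators should yield the same sequence; the difference is that the eager version has already drawn and stored every value when it returns, while the lazy version has drawn none. The inner `Array.fill(vectorSize)` in `getVectorIterator` stays eager by design, since each `DenseVector` needs its backing array.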