Commit 7aeb20be authored by Yanbo Liang

[MINOR][ML] Avoid 2D array flatten in NB training.

## What changes were proposed in this pull request?
Avoid flattening a 2D array in ```NaiveBayes``` training, since the ```flatten``` method can be expensive (it allocates another array and copies the data into it).
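
To illustrate the cost described above, here is a minimal sketch in plain Scala (no Spark dependency). The object `FlattenSketch` and its methods `viaFlatten` and `direct` are hypothetical names introduced only for this illustration; they are not part of the patch. Both methods produce the same row-major array, but the first allocates a nested array and then copies it, while the second writes straight into one preallocated buffer.

```scala
object FlattenSketch {
  // Hypothetical helper: fill a nested 2D array, then flatten it.
  // `flatten` allocates a second array of rows * cols elements and copies every value.
  def viaFlatten(rows: Int, cols: Int)(f: (Int, Int) => Double): Array[Double] = {
    val nested = Array.ofDim[Double](rows, cols)
    var i = 0
    while (i < rows) {
      var j = 0
      while (j < cols) {
        nested(i)(j) = f(i, j)
        j += 1
      }
      i += 1
    }
    nested.flatten
  }

  // Hypothetical helper: write directly into a single flat, row-major array.
  // Element (i, j) lives at index i * cols + j, so no second allocation or copy is needed.
  def direct(rows: Int, cols: Int)(f: (Int, Int) => Double): Array[Double] = {
    val flat = new Array[Double](rows * cols)
    var i = 0
    while (i < rows) {
      var j = 0
      while (j < cols) {
        flat(i * cols + j) = f(i, j)
        j += 1
      }
      i += 1
    }
    flat
  }
}
```

The second form mirrors the indexing `i * numFeatures + j` used in the patch; how much time it saves in practice depends on the number of labels and features and is not measured here.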

## How was this patch tested?
Existing tests.

Author: Yanbo Liang <ybliang8@gmail.com>

Closes #15359 from yanboliang/nb-theta.
parent b678e465
@@ -176,8 +176,8 @@ class NaiveBayes @Since("1.5.0") (
     val numLabels = aggregated.length
     val numDocuments = aggregated.map(_._2._1).sum
-    val piArray = Array.fill[Double](numLabels)(0.0)
-    val thetaArrays = Array.fill[Double](numLabels, numFeatures)(0.0)
+    val piArray = new Array[Double](numLabels)
+    val thetaArray = new Array[Double](numLabels * numFeatures)
     val lambda = $(smoothing)
     val piLogDenom = math.log(numDocuments + numLabels * lambda)
@@ -193,14 +193,14 @@ class NaiveBayes @Since("1.5.0") (
       }
       var j = 0
       while (j < numFeatures) {
-        thetaArrays(i)(j) = math.log(sumTermFreqs(j) + lambda) - thetaLogDenom
+        thetaArray(i * numFeatures + j) = math.log(sumTermFreqs(j) + lambda) - thetaLogDenom
         j += 1
       }
       i += 1
     }
     val pi = Vectors.dense(piArray)
-    val theta = new DenseMatrix(numLabels, thetaArrays(0).length, thetaArrays.flatten, true)
+    val theta = new DenseMatrix(numLabels, numFeatures, thetaArray, true)
     new NaiveBayesModel(uid, pi, theta)
   }
......
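
The final `true` passed to `DenseMatrix` is its `isTransposed` flag, which marks the values array as row-major. That is why the index `i * numFeatures + j` lines up with entry `(i, j)` of `theta`. The sketch below mimics that lookup in plain Scala; `RowMajorSketch` and `at` are hypothetical names for illustration only, not Spark's implementation.

```scala
object RowMajorSketch {
  // Hypothetical stand-in for a dense matrix element lookup.
  // isTransposed = true  -> values are row-major:    (i, j) is at i * numCols + j
  // isTransposed = false -> values are column-major: (i, j) is at j * numRows + i
  def at(values: Array[Double], numRows: Int, numCols: Int, isTransposed: Boolean)
        (i: Int, j: Int): Double = {
    require(values.length == numRows * numCols, "values must hold numRows * numCols entries")
    if (isTransposed) values(i * numCols + j) else values(j * numRows + i)
  }
}
```

With a flat array filled row by row, as in the patched training loop, only the row-major reading returns the value that was written for label `i` and feature `j`.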