Commit da738cff authored by Niccolo Becchi, committed by Sean Owen

[MINOR] Renamed variables in SparkKMeans.scala, LocalKMeans.scala and kmeans.py to simplify readability

With the previous variable names, it could look as if reduceByKey summed the abscissas and ordinates of some 2D points separately. The new names should make the example easier to understand, especially for those who, like me, are just starting out with functional programming.

Author: Niccolo Becchi <niccolo.becchi@gmail.com>
Author: pippobaudos <niccolo.becchi@gmail.com>

Closes #5875 from pippobaudos/patch-1 and squashes the following commits:

3bb3a47 [pippobaudos] renamed variables in LocalKMeans.scala and kmeans.py to simplify readability
2c2a7a2 [Niccolo Becchi] Update SparkKMeans.scala
parent e9b16e67
@@ -68,14 +68,14 @@ if __name__ == "__main__":
         closest = data.map(
             lambda p: (closestPoint(p, kPoints), (p, 1)))
         pointStats = closest.reduceByKey(
-            lambda (x1, y1), (x2, y2): (x1 + x2, y1 + y2))
+            lambda (p1, c1), (p2, c2): (p1 + p2, c1 + c2))
         newPoints = pointStats.map(
-            lambda xy: (xy[0], xy[1][0] / xy[1][1])).collect()
+            lambda st: (st[0], st[1][0] / st[1][1])).collect()
-        tempDist = sum(np.sum((kPoints[x] - y) ** 2) for (x, y) in newPoints)
+        tempDist = sum(np.sum((kPoints[iK] - p) ** 2) for (iK, p) in newPoints)
-        for (x, y) in newPoints:
-            kPoints[x] = y
+        for (iK, p) in newPoints:
+            kPoints[iK] = p
     print("Final centers: " + str(kPoints))
@@ -99,7 +99,7 @@ object LocalKMeans {
       var pointStats = mappings.map { pair =>
         pair._2.reduceLeft [(Int, (Vector[Double], Int))] {
-          case ((id1, (x1, y1)), (id2, (x2, y2))) => (id1, (x1 + x2, y1 + y2))
+          case ((id1, (p1, c1)), (id2, (p2, c2))) => (id1, (p1 + p2, c1 + c2))
         }
       }
@@ -79,7 +79,7 @@ object SparkKMeans {
     while(tempDist > convergeDist) {
       val closest = data.map (p => (closestPoint(p, kPoints), (p, 1)))
-      val pointStats = closest.reduceByKey{case ((x1, y1), (x2, y2)) => (x1 + x2, y1 + y2)}
+      val pointStats = closest.reduceByKey{case ((p1, c1), (p2, c2)) => (p1 + p2, c1 + c2)}
       val newPoints = pointStats.map {pair =>
         (pair._1, pair._2._1 * (1.0 / pair._2._2))}.collectAsMap()
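
To illustrate the point made in the commit message: each value passed to reduceByKey is a (point, count) pair, so the reducer adds points to points and counts to counts, rather than summing x and y coordinates separately. The following is a minimal local sketch in plain Python, without Spark. `reduce_by_key` and `add_points` are hypothetical stand-ins written for this illustration (they are not Spark APIs), and the sample data is made up.

```python
from collections import defaultdict
from functools import reduce


def add_points(p1, p2):
    """Element-wise sum of two points given as equal-length tuples (hypothetical helper)."""
    return tuple(a + b for a, b in zip(p1, p2))


def reduce_by_key(pairs, func):
    """Local stand-in for reduceByKey (an assumption, not Spark code):
    fold all values that share a key with `func`."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: reduce(func, values) for key, values in groups.items()}


# (cluster index, (point, 1)) pairs, shaped like the output of the `closest` step.
closest = [
    (0, ((1.0, 2.0), 1)),
    (0, ((3.0, 4.0), 1)),
    (1, ((10.0, 10.0), 1)),
]

# Each value is (p, c): p is a whole point, c is a count.
# The reducer sums points with points and counts with counts.
point_stats = reduce_by_key(
    closest,
    lambda a, b: (add_points(a[0], b[0]), a[1] + b[1]),
)

# New center of each cluster = summed point divided by the number of points.
new_points = {
    k: tuple(coord / count for coord in total)
    for k, (total, count) in point_stats.items()
}

print(new_points)  # {0: (2.0, 3.0), 1: (10.0, 10.0)}
```

With the old names `(x1, y1), (x2, y2)`, the second component looks like a coordinate, when it is actually the running count used as the divisor in the last step.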