Scala: Adding elements to Set inside 'foreach' doesn't persist -
I create a mutable Set, iterate over a list using 'foreach', and populate the set inside the loop. When I print the set within the foreach, it prints the contents of the set correctly. However, the set is empty after the end of the 'foreach'. I am not able to figure out what I am missing.
    import org.apache.spark._
    import org.apache.spark.graphx._
    import org.apache.spark.SparkConf
    import org.apache.spark.rdd.RDD

    object SparkTest {
      def main(args: Array[String]) {
        val conf = new SparkConf().setAppName("Spark Test")
        val sc = new SparkContext(conf)

        val graph = GraphLoader.edgeListFile(sc, "followers.txt")
        val edgeList = graph.edges
        var mapperResults = iterateMapper(edgeList)
        sc.stop()
      }

      def iterateMapper(edges: EdgeRDD[Int, Int]) : scala.collection.mutable.Set[(VertexId, VertexId)] = {
        var mapperResults = scala.collection.mutable.Set[(VertexId, VertexId)]()
        val mappedValues = edges.mapValues(edge => (edge.srcId, edge.dstId)) ++ edges.mapValues(edge => (edge.dstId, edge.srcId))
        mappedValues.foreach {
          edge => {
            var src = edge.attr._1
            var dst = edge.attr._2
            mapperResults += ((src, dst))
          }
        }
        println(mapperResults)
        return mapperResults
      }
    }

This is the code I'm working with. It is modified from a Spark example.

The println(mapperResults) call prints out an empty set.
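It actually works, but only on the workers. The foreach function exists for its side effects, and those side effects happen on the workers: each task mutates its own deserialized copy of the set, so the driver never sees the updated set. The other issue is design: prefer immutable! You should not use a mutable collection there; there is no need for it. The following code should do what you meant to do: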
    val mapperResults = mappedValues.map(_.attr).distinct.collect

It is shorter and cleaner, and the map runs on the workers; collect then returns the results to the driver as an Array.
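For reference, here is a minimal sketch of how the whole program might look with that change applied. It is not the original poster's code: the object name SparkTestSketch is made up for illustration, the parameter type is widened to RDD[Edge[Int]] (an assumption, chosen so the sketch does not depend on how many type parameters EdgeRDD takes in a given Spark version), and a single flatMap stands in for the two mapValues calls joined with ++. It also assumes the job is launched through spark-submit, which supplies the master URL, and that the distinct pairs fit in the driver's memory, since collect brings all of them back.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, GraphLoader, VertexId}
import org.apache.spark.rdd.RDD

// Hypothetical name; a reworked version of the SparkTest object from the question.
object SparkTestSketch {

  // Emits both directions of every edge on the workers, de-duplicates them
  // there, and only then ships the result back to the driver.
  // No mutable state is captured by any closure.
  def iterateMapper(edges: RDD[Edge[Int]]): Set[(VertexId, VertexId)] =
    edges
      .flatMap(e => Seq((e.srcId, e.dstId), (e.dstId, e.srcId)))
      .distinct()
      .collect() // action: returns an Array on the driver
      .toSet     // immutable Set built locally on the driver

  def main(args: Array[String]): Unit = {
    // Master URL is assumed to come from spark-submit, as in the question.
    val conf = new SparkConf().setAppName("Spark Test")
    val sc = new SparkContext(conf)
    try {
      val graph = GraphLoader.edgeListFile(sc, "followers.txt")
      val mapperResults = iterateMapper(graph.edges)
      println(mapperResults)
    } finally {
      sc.stop() // release the context even if the job fails
    }
  }
}
```

The key difference from the question's version is that data flows back to the driver through an action (collect) rather than through a variable captured by the foreach closure.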