Monday, March 9, 2015

Serialize classes or models in Apache Spark

Normal serialization does work, but the deserialized objects cannot be mapped over an RDD, i.e. their functions cannot be applied to an RDD. A hack that gets around this is to write and read the object through Spark itself:
sc.parallelize(Seq(model), 1).saveAsObjectFile("path")
val sameModel = sc.objectFile[YourCLASS]("path").first()
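
For a fuller picture, here is a minimal sketch of the same round trip as a compiled application. The names `LinearModel`, `ModelRoundTrip`, and `roundTrip`, the local master setting, and the temporary output directory are assumptions made for illustration and are not part of the original hack; any class that is `Serializable` should be usable in place of `LinearModel`.

```scala
import java.nio.file.Files

import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical stand-in for a model. Case classes are Serializable by default.
case class LinearModel(weight: Double, bias: Double) {
  def predict(x: Double): Double = weight * x + bias
}

object ModelRoundTrip {
  // Save the model as a one-element, one-partition RDD, then read it back.
  // saveAsObjectFile fails if `path` already exists, so pass a fresh location.
  def roundTrip(sc: SparkContext, model: LinearModel, path: String): LinearModel = {
    sc.parallelize(Seq(model), 1).saveAsObjectFile(path)
    sc.objectFile[LinearModel](path).first()
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("model-round-trip").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // A path inside a new temp directory, so it does not exist yet.
      val path = Files.createTempDirectory("spark-model").resolve("model").toString
      val sameModel = roundTrip(sc, LinearModel(2.0, 1.0), path)

      // The restored model can be applied inside an RDD transformation.
      val predictions = sc.parallelize(Seq(1.0, 2.0, 3.0))
        .map(x => sameModel.predict(x))
        .collect()
      println(predictions.mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```

Writing a single partition keeps the output to one part file, and `first()` simply pulls the only element back to the driver.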
