RDDs in Spark are immutable. Whenever you need to change an RDD, you produce a new one. When you need to change all the data in an RDD frequently, e.g. when running an iterative algorithm, you might not want to spend time and memory on creating a new structure on every iteration. However, when you cache the RDD, you can get a reference to the data inside and modify it in place. You need to make sure that the RDD stays in memory the whole time. The following is a hack around RDD immutability and is not recommended. Also, fault tolerance is lost, though you can force checkpointing.
// class that allows modification of its variable with the inc function
class Counter extends Serializable { var i: Int = 0; def inc: Unit = i += 1 }

// Approach 1: create an RDD with 5 instances of this class
val rdd = sc.parallelize(1 to 5, 5).map(x => new Counter())
// trying to apply modification
rdd.foreach(x => x.inc)
// modification did not work: all values are still zeroes
rdd.collect.foreach(x => println(x.i))

// Approach 2: create a cached RDD with 5 instances of the Counter class
val cachedRdd = sc.parallelize(1 to 5, 5).map(x => new Counter()).cache
// trying to apply modification
cachedRdd.foreach(x => x.inc)
// modification worked: all values are ones
cachedRdd.collect.foreach(x => println(x.i))
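If you do rely on this hack, one way to get some durability back is to force a checkpoint, as mentioned above. The following is a minimal sketch, assuming an existing SparkContext sc and the Counter class defined earlier; the checkpoint directory path is hypothetical. Note that Spark recommends marking an RDD for checkpointing before the first action on it, and any mutations applied after the checkpoint has been written still live only in the cached copy.

// Minimal checkpointing sketch (assumes `sc` and the Counter class above; the directory path is hypothetical)
sc.setCheckpointDir("/tmp/spark-checkpoints")
val checkpointedRdd = sc.parallelize(1 to 5, 5).map(x => new Counter()).cache
checkpointedRdd.checkpoint()           // mark the RDD for checkpointing
checkpointedRdd.count()                // first action caches the data and writes the checkpoint
checkpointedRdd.foreach(x => x.inc)    // in-place mutation of the cached objects, as before
checkpointedRdd.collect.foreach(x => println(x.i))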