How can I estimate the number of Workers per cluster node that will give the best performance? I ran a few tests on a 6-node cluster of 3.3 GHz, 4-core, 16 GB machines, using the mnist8m classification task as the benchmark. It turns out that the number of Workers per node should be less than the number of cores. Moreover, one Worker per node took the same time as more than one, which suggests that a single Worker already takes advantage of multiple cores. Also, the number of data partitions should be >= the number of Workers; otherwise some Workers receive no data and the job is processed by fewer Workers.
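The partition rule can be illustrated with a small sketch. It is only a simplified model, assuming each partition is handled by one Worker at a time; the function names `total_workers` and `busy_workers` are hypothetical helpers, not part of any framework API.

```python
def total_workers(nodes: int, workers_per_node: int) -> int:
    """Total number of Workers in the cluster."""
    return nodes * workers_per_node


def busy_workers(num_partitions: int, nodes: int, workers_per_node: int) -> int:
    """Upper bound on Workers that can hold data at the same time.

    Assumption: a partition is never split across Workers, so with fewer
    partitions than Workers the surplus Workers stay idle.
    """
    return min(num_partitions, total_workers(nodes, workers_per_node))


if __name__ == "__main__":
    nodes = 6  # cluster size from the tests described above
    for workers_per_node in (1, 2, 3):
        workers = total_workers(nodes, workers_per_node)
        for partitions in (4, workers, 2 * workers):
            busy = busy_workers(partitions, nodes, workers_per_node)
            print(f"workers={workers} partitions={partitions} "
                  f"busy={busy} idle={workers - busy}")
```

Under this model, 4 partitions on 6 Workers leaves 2 Workers idle, while any partition count at or above the Worker count keeps all of them busy.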