Not the OP, but moving to sparse matrices will probably give you the most bang for your buck. I'd strongly suspect those huge dataframes could be stored far more efficiently in a sparse format.
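For context, a minimal sketch of what that can look like with scipy (the DataFrame here is hypothetical, just a mostly-zero feature matrix):

```python
import numpy as np
import pandas as pd
from scipy import sparse

# Hypothetical wide, mostly-zero feature matrix (e.g. one-hot encodings).
rng = np.random.default_rng(0)
dense = pd.DataFrame(rng.binomial(1, 0.01, size=(100_000, 500)).astype(np.float32))

# CSR stores only the non-zero entries plus index bookkeeping.
csr = sparse.csr_matrix(dense.values)

print(f"dense:  {dense.values.nbytes / 1e6:.1f} MB")
print(f"sparse: {(csr.data.nbytes + csr.indices.nbytes + csr.indptr.nbytes) / 1e6:.1f} MB")
```

At ~1% density that's roughly a 30x memory saving, and most scikit-learn estimators accept the CSR matrix directly.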
To be fair, that's one of the reasons the Spark ML stuff works quite well. Be warned though: estimating how long a Spark job will take, and how many resources it will need, is a dark, dark art.
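For illustration, a rough sketch of sparse vectors feeding a Spark ML estimator (the toy data and app name here are made up):

```python
from pyspark.sql import SparkSession
from pyspark.ml.linalg import Vectors
from pyspark.ml.classification import LogisticRegression

spark = SparkSession.builder.appName("sparse-demo").getOrCreate()

# Each row carries a SparseVector: (size, {index: value}) -- only non-zeros stored.
df = spark.createDataFrame(
    [
        (1.0, Vectors.sparse(1000, {3: 1.0, 512: 2.0})),
        (0.0, Vectors.sparse(1000, {17: 1.0})),
    ],
    ["label", "features"],
)

# Spark ML estimators consume sparse and dense vectors interchangeably.
model = LogisticRegression(maxIter=10).fit(df)
print(model.coefficients)
```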