could be used by new algos
haval256 is now 2x faster, but sha512 performance depends heavily on the CUDA version...