This may have been my fault, but benchmark mode was
always recomputing from nonce 0.
Also fix blake when -d 1 is used (one thread, but on the second GPU)
stats: use the GPU id as key instead of the thread id
The project was updated for VS2013 and CUDA SDK 6.5
Also add a --cputest option to dump CPU hash results
TODO: x15 is not fully functional yet, but the first loop seems OK
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>