Tanguy Pruvot
1b65cd05cc
heavy: add error checks, fix strict aliasing and linux
The core problem was the CUDA hefty threads-per-block value being set too high,
but it took me several hours to find that...
btw: +25% in heavy 12500 with 256 threads per block, vs 128 and 512,
if the max register count is set to 80.
2014-11-27 09:14:59 +01:00
Tanguy Pruvot
b128312efb
cuda: store device SM in a global var
sample usage made for blake and fugue (higher intensity for SM5.2)
add these to cuda_helper and clean unused code
2014-11-11 19:11:16 +01:00
Tanguy Pruvot
d9ea5f72ce
Remove duplicated defines present in cuda_helper.h
also add cudaDeviceReset() on Ctrl+C for nvprof
2014-08-19 03:29:11 +02:00
Christian Buchner
3b21069504
bump to revision V1.1 with Killer Groestl
2014-06-14 01:43:28 +02:00
Christian Buchner
6c8eff98c0
bump to revision v0.8
2014-05-03 21:01:50 +02:00