djm34
6e9fe540b6
allow to compile with cuda 7.5
2016-02-04 19:20:54 +01:00
Tanguy Pruvot
922c2a5cd7
algos: free allocated mem for algo switch
...
All can be freed propertly now, except script (reset) and lyra2 (leak)
2015-10-08 21:35:30 +02:00
Tanguy Pruvot
e6112e878d
cleanup: use unsigned throughput parameters
...
Yes, its a big commit, was waiting 1.6 to do that...
Sorry for your possible merge issues ;)
2015-02-28 14:05:09 +01:00
Tanguy Pruvot
cafd4477d7
Handle a maximum of 16 gpus (vs 8 before)
...
Some cards have 2 gpus on board...
2015-01-22 04:55:27 +01:00
Tanguy Pruvot
1b65cd05cc
heavy: add error checks, fix strict aliasing and linux
...
The core problem was the cuda hefty Thread per block set to high
but took me several hours to find that...
btw... +25% in heavy 12500 with 256 threads per block... vs 128 & 512
if max reg count is set to 80...
2014-11-27 09:14:59 +01:00
Tanguy Pruvot
b128312efb
cuda: store device SM in a global var
...
sample usage made for blake and fugue (higher intensity for SM5.2)
add these to cuda_helper and clean unused code
2014-11-11 19:11:16 +01:00
Tanguy Pruvot
d9ea5f72ce
Remove duplicated defines present in cuda_helper.h
...
also add cudaDeviceReset() on Ctrl+C for nvprof
2014-08-19 03:29:11 +02:00
Christian Buchner
3b21069504
bump to revision V1.1 with Killer Groestl
2014-06-14 01:43:28 +02:00
Christian Buchner
6c8eff98c0
bump to revision v0.8
2014-05-03 21:01:50 +02:00