import and keep my code for older archs, like skein 64
reduce the gap between our versions...
+150kH x11 GTX 960 / +30kH 750Ti
+900kH quark GTX 960 / +230kH 750Ti
Switching to a pool with a different algo will require a barrier
to free ressources, like what was made in the global benchmark.
add also the algo in pool structure...
Quark and S3 are now a bit faster (+1 %)
x11 get +0.6 % (+20kH/s on a 750ti, +30kH on a 960)
80 bytes implementation to do/test ... (skein/skein2)
but keep my previous version for older devices...
ccminer -n 2>NUL
GPU #0: SM 5.2 GeForce GTX 970
GPU #1: SM 5.0 Gigabyte GTX 750 Ti
GPU #2: SM 5.2 ASUS GTX 970
note: nvml destroy is made in proper_exit function
reduce "false" warnings, and ignore unrelated/small ones <= 1 MB
On windows the gpu memory can be allocated by other processes
+ some cleanup in algos... (free/gpulog)
Mostly to do compatibilty tests, SM 2.1 support is very limited
SM 3.0 code should run on SM 3.5 (only a few cards use this arch)
As i can't test SM 3.5, its best to let users do their own tests...
The full benchmark can now be launched with "ccminer --benchmark"
add a new helper function which log a warning with last cuda error
(not shown with the quiet option) : CUDA_LOG_ERROR();
it can be used where miner.h is included (.c/.cpp/.cu)
fix x14 (in ccminer.cpp), a break was missing in switch..case
when using multiple cpu threads per gpu, use the T prefix, ex:
[2015-10-11 09:52:49] GPU #0: app clocks set to P0 (3600/1228)
vs
[2015-10-11 09:52:51] GPU T0: MSI GTX 960, 5953.35 kH/s
Only thr_id is required, the function take care of the dev id
remains my original lyra2 implementation to fix... (cuda_lyra2.cu)
I guess some kind of memory overflow force the driver to allocate
memory... but was unable to free it without device reset.
Note: lyra2, lyra2v2 and script seems to have problems
to coexist with other algos... to run after some of them...
moved lyra2 first and skip scrypt/jane for the moment...
Only stored in memory for now.. to display a table after the bench
ccminer -a auto --benchmark
Results may be exported later to a json file...
make a difference between whirlpool and whirlcoin algos (stratum)
Look like the old SHA merkleroot method doesnt work on recent coins
Doesn't affect solo mining, only pools using stratum+tcp:// protocol
Reduce a bit the 750Ti speed but improve a lot the 9xx speed.
Keep compat for SM 3/3.5 in a second file..
Note: With this code and Cuda 7.5, the speed won is the reverse...
May be "reverted" soon