fix weird hashrates on some algos (like blake)
and reset the device between algos, for better accuracy
but this reset doesn't seem to be enough to bench all algos correctly...
to be tested on linux, could be a driver issue...
heavy: fix first alloc and indent with tabs...
made for linux, requires libpci-dev (optional)
if libpci is not installed, card vendor names are not handled...
Note: only a few vendor names were added, common GeForce vendors.
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Should give the same or better performance than the SP and klaus versions
Keep old code for older devices and skein2 compat
Linux perf: 750Ti 78MH/s and GTX 970 260MH/s
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Seems to be djm34's work, I recognize the code style ;)
Code was cleaned/indented and adapted to my fork...
Only usable on the test pool until 16 December 2014!
heavy: reduce default intensity by 256 threads, to -i 20 for all
cuda: put static thread init bools outside the code (made once)
api: fix nvml header to build without
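
The "static thread init bools" change can be pictured with the C++ sketch below. `MAX_GPUS`, `thread_init`, `alloc_count` and `scan_algo` are hypothetical names, and the GPU buffer allocation done by the real code is replaced by a plain counter so the example runs without CUDA.

```cpp
const int MAX_GPUS = 8;                        // hypothetical device limit

// File-scope flags, one per mining thread, set up once at program start
// instead of being declared inside the scan function body.
static bool thread_init[MAX_GPUS] = { false };
static int  alloc_count[MAX_GPUS] = { 0 };     // stands in for GPU allocations

// Called repeatedly by each thread; the expensive setup runs only the first time.
int scan_algo(int thr_id)
{
    if (!thread_init[thr_id]) {
        alloc_count[thr_id]++;                 // real code would allocate GPU buffers here
        thread_init[thr_id] = true;
    }
    return alloc_count[thr_id];
}
```

Calling `scan_algo(0)` twice leaves `alloc_count[0]` at 1, and other threads keep their own independent flag.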
The core problem was the cuda hefty threads per block value, set too high
but it took me several hours to find that...
btw... +25% in heavy 12500 with 256 threads per block... vs 128 & 512
if max reg count is set to 80...
Small echo rewrite. +10KH/s on the 650 (compute 3.0)
tpruvot: add Linux Makefile - Force to 80 registers (else -30KH/s)
Note: the hashrate seems more constant with this change
The previous echo commit only increased linux performance and reduced
windows perf compared to 1.4.9; this one seems to at least match
1.4.9 on windows, and gives the same on linux...
Shavite optimisation seems ok on both (now uses 64 registers)
launch_bounds now forces the number of registers, so remove the specific
Makefile rules on linux...
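
As a rough illustration of why launch bounds can replace a per-file register rule, the C++ sketch below computes the approximate per-thread register budget the compiler derives from `__launch_bounds__(maxThreadsPerBlock, minBlocksPerMultiprocessor)`. The function name and the figure of 65536 registers per multiprocessor are assumptions, and the bounds actually used in this commit are not stated in the log. The real compiler also applies allocation granularity and architecture-specific per-thread caps, which are ignored here.

```cpp
#include <cassert>

// Approximate register budget per thread so that min_blocks_per_sm blocks
// of max_threads_per_block threads can be resident on one multiprocessor.
int register_budget(int regs_per_sm, int max_threads_per_block, int min_blocks_per_sm)
{
    assert(max_threads_per_block > 0 && min_blocks_per_sm > 0);
    return regs_per_sm / (max_threads_per_block * min_blocks_per_sm);
}
```

Under these assumptions, `register_budget(65536, 256, 4)` and `register_budget(65536, 128, 8)` both give 64 registers per thread.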
manual "cherry pick" with fixed line endings and some adaptations
Behavior differed between linux and visual studio,
which made it hard to link functions correctly
This removes some ifdef / extern "C" requirements
note about x86 releases: the x86 nvml.dll is not installed on Windows x64!
Based on mwhite73 <marvin.white@gmail.com> implementation
Linked to the api system
Also fix the Makefile to support standard c++ files
This prevents using nvcc on files without device code
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>