also known as "Thor's Riddle"... yes sure ;)
Credits to ocminer who found and "implemented" it.
Note: tested "ok" on x64 and CUDA 6.5 x86, not on 7.5 and 8.0 x86
PS: Don't have the time for a more proper CUDA implementation of Streebog
made for linux and require libpci-dev (optional)
if libpci is not installed, card's vendor names are not handled...
Note: only a few vendor names were added, common GeForce vendors.
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Since linux driver 346.72, nvidia-smi allow to query gpu/mem clocks
Tested ok on the Asus Strix 970, but fails on the Gigabyte 750 Ti
system could require first persistence mode and app clock unlock :
nvidia-smi -pm 1
nvidia-smi -acp 0
supported values are displayed by
nvidia-smi -q -d SUPPORTED_CLOCKS
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
only allowed if --api-remote parameter or config key is set
and fix possible problem with urls containing user:password@
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
heavy: reduce by 256 threads default intensity to all -i 20
cuda: put static thread init bools outside the code (made once)
api: fix nvml header to build without
There was a different behavior on linux and visual studio
That was making it hard to link functions correctly
That remove some ifdef / extern "C" requirements
note about x86 releases, x86 nvml.dll is not installed on Windows x64!
Based on mwhite73 <marvin.white@gmail.com> implementation
Linked to the api system
Also fix Makefile to support standard c++ files
This prevent nvcc use without device code
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Unlike other hash algos, blake256 compute the hash
with blocks of 64 bytes.
We can do the first part on the cpu, only the 4 last int32
are computed on gpu (including the tested nonce)
Previous method was also using this kind of cache with a crc.
Blake Hash Speed: +5%
based on klaus commits, will increase a bit speed of most algos
PS: main increase is due to the register count tuning in Makefile
and for skein512 on linux, its the ROTL64
but almost no changes on X11 : 2648MH/s vs 2630 before