Original commit:
Removed shared memory and reduced calculations by precomputing values (ECHO hash).
750 Ti: +20 kH/s (x11)
tpruvot notes:
The real gain is closer to 10 kH/s on stock clocks (but it is real).
Launch bounds disabled: no perf increase with 64 registers.
A dev version was released on http://cryptomining-blog.com/
Please update: the previous one has some bugs when using multiple
GPUs, and the API format has changed!
nvml.dll does not exist for 32-bit binaries! Use nvapi to get the info;
it seems to have more/different features than NVML, such as pstate.
This is nvapi r343 : https://developer.nvidia.com/nvapi
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Behavior differed between Linux and Visual Studio,
which made it hard to link functions correctly.
This removes some ifdef / extern "C" requirements.
Note about x86 releases: the x86 nvml.dll is not installed on Windows x64!
Based on mwhite73 <marvin.white@gmail.com> implementation
Linked to the api system
Also fix the Makefile to support standard C++ files.
This prevents nvcc from being used on files without device code.
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Possible values:
5000 or :5000 to use port 5000 (local only)
0.0.0.0:5000 to allow connections from the network
127.0.0.1:4068 to only allow local connections (default)
Use -b 0 to disable the API system.
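
A minimal sketch of how the values listed above could be interpreted. This is not the actual parser from this commit: `struct api_bind` and `parse_api_bind` are hypothetical names, and rejecting an address given without a port is an assumption.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical result type, not taken from the miner's sources. */
struct api_bind {
    int  enabled;   /* 0 when the API is disabled with "-b 0" */
    char addr[64];  /* address to listen on */
    int  port;      /* TCP port */
};

/* Returns 0 on success, -1 on a malformed value. */
static int parse_api_bind(const char *arg, struct api_bind *out)
{
    assert(arg != NULL && out != NULL);

    /* Default quoted in the commit message: 127.0.0.1:4068 (local only). */
    out->enabled = 1;
    strcpy(out->addr, "127.0.0.1");
    out->port = 4068;

    /* "-b 0" disables the API system. */
    if (strcmp(arg, "0") == 0) {
        out->enabled = 0;
        return 0;
    }

    /* "5000" and ":5000" keep the local address and only change the port. */
    const char *colon = strrchr(arg, ':');
    const char *port_str = colon ? colon + 1 : arg;

    /* "0.0.0.0:5000" also replaces the address part. */
    if (colon && colon != arg) {
        size_t len = (size_t)(colon - arg);
        if (len >= sizeof(out->addr))
            return -1;
        memcpy(out->addr, arg, len);
        out->addr[len] = '\0';
    }

    char *end = NULL;
    long port = strtol(port_str, &end, 10);
    if (end == port_str || *end != '\0' || port < 1 || port > 65535)
        return -1;
    out->port = (int)port;
    return 0;
}
```

With this sketch, `5000` and `:5000` both yield 127.0.0.1:5000, while `0.0.0.0:5000` listens on all interfaces.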
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Note: Heavy and Mjollnir are broken on Linux (only)...
To be checked in the next version; I spent 4 hours trying to fix that without
success. The djm34 variant seems OK but also produces a lot of rejects.
Displayed data is the average of the last 50 scans within the last 5 minutes.
Also move common CUDA functions into a new file (cuda.cu).
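
The averaging rule above (at most 50 scans, none older than 5 minutes) could be sketched with a small ring buffer. The names `scan_log`, `scan_log_add` and `scan_log_average` are hypothetical and do not come from this commit. The log must be zero-initialized before use.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

#define MAX_SCANS   50   /* keep at most the last 50 scans */
#define MAX_AGE_SEC 300  /* ignore scans older than 5 minutes */

struct scan_sample { time_t when; double khs; };

struct scan_log {
    struct scan_sample ring[MAX_SCANS];
    size_t next;   /* slot that the next sample overwrites */
    size_t count;  /* number of valid slots, capped at MAX_SCANS */
};

/* Record one scan result, overwriting the oldest one when full. */
static void scan_log_add(struct scan_log *log, time_t when, double khs)
{
    assert(log != NULL);
    log->ring[log->next].when = when;
    log->ring[log->next].khs = khs;
    log->next = (log->next + 1) % MAX_SCANS;
    if (log->count < MAX_SCANS)
        log->count++;
}

/* Average of the stored scans that are recent enough; 0.0 if none are. */
static double scan_log_average(const struct scan_log *log, time_t now)
{
    assert(log != NULL);
    double sum = 0.0;
    size_t used = 0;
    for (size_t i = 0; i < log->count; i++) {
        if (difftime(now, log->ring[i].when) <= MAX_AGE_SEC) {
            sum += log->ring[i].khs;
            used++;
        }
    }
    return used ? sum / (double)used : 0.0;
}
```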
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Like cgminer, a value of n means 1 << n threads.
If 0, we keep the default value defined by the algo (19 for Xn algos).
19 = 524288 threads per GPU call.
The GTX 970 and 980 handle a higher number of threads than the 750 Ti.
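
The mapping from intensity to thread count described above can be sketched as follows. `threads_for_intensity` is a hypothetical helper, and the 1..31 range check is an assumption made so that the shift fits in 32 bits.

```c
#include <assert.h>
#include <stdint.h>

#define DEFAULT_INTENSITY_XN 19  /* default quoted above for Xn algos */

/* intensity == 0 means "keep the default defined by the algo". */
static uint32_t threads_for_intensity(int intensity, int algo_default)
{
    int n = (intensity == 0) ? algo_default : intensity;
    assert(n > 0 && n < 32);  /* assumed bounds, not from the commit */
    return (uint32_t)1 << n;
}
```

For example, `threads_for_intensity(0, DEFAULT_INTENSITY_XN)` gives 524288, matching the figure in the message, and an intensity of 20 doubles it to 1048576.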
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
echo : 40.056ms -> 39.241ms
cube : 14.490ms -> 13.511ms
The cube hash change looks useless (__device__ code is generally inlined),
but reality shows that the CUDA documentation is wrong...
tpruvot: fixed DOS line endings in echo,
and used my style for CUDA function attributes