mirror of https://github.com/GOSTSec/ccminer synced 2025-01-09 22:38:05 +00:00
Commit Graph

40 Commits

Author SHA1 Message Date
Tanguy Pruvot
8d4d4d65ce cuda: header for common kernel functions (quark/x11)
Was thinking about doing that since months ;) lets go
2015-10-25 06:54:17 +01:00
Tanguy Pruvot
d43dc9a021 use blake512 sp kernels on SM 5+ (80+64)
import and keep my code for older archs, like skein 64

reduce the gap between our versions...

+150kH x11   GTX 960 / +30kH  750Ti
+900kH quark GTX 960 / +230kH 750Ti
2015-10-24 13:43:22 +02:00
Tanguy Pruvot
355b835ae0 benchmark: enhance the mem leak detection
reduce "false" warnings, and ignore unrelated/small ones <= 1 MB

On windows the gpu memory can be allocated by other processes

+ some cleanup in algos... (free/gpulog)
2015-10-16 22:04:30 +02:00
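
The "ignore small ones <= 1 MB" rule from the commit above can be pictured with a minimal sketch. This is not the actual ccminer code: the function name `leak_worth_warning`, the constant `kIgnoreBytes`, and the idea of comparing free device memory before and after an algo run are assumptions made for illustration only.

```cpp
#include <cstddef>

// Hypothetical threshold: differences of 1 MB or less are treated as noise,
// e.g. memory taken by other processes sharing the GPU.
constexpr std::size_t kIgnoreBytes = 1024 * 1024;

// Hypothetical helper: returns true only when the free memory dropped
// by more than the threshold between the two measurements.
bool leak_worth_warning(std::size_t free_before, std::size_t free_after) {
    if (free_after >= free_before) {
        return false;  // nothing lost (or memory was released elsewhere)
    }
    return (free_before - free_after) > kIgnoreBytes;
}
```

With these assumptions, a loss of exactly 1 MB stays silent while a loss of 2 MB would be reported.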
Tanguy Pruvot
9dfa757dc7 warn on cuda errors + various small changes
The full benchmark can now be launched with "ccminer --benchmark"

add a new helper function which logs a warning with last cuda error
(not shown with the quiet option) : CUDA_LOG_ERROR();
it can be used where miner.h is included (.c/.cpp/.cu)

fix x14 (in ccminer.cpp), a break was missing in switch..case
2015-10-12 08:46:13 +02:00
Tanguy Pruvot
d195f2e8a2 intensity: do not reduce throughput before init
Else the memory allocated could be less than required later

btw, use the new "cuda" function to apply intensity/throughput
2015-10-11 05:01:41 +02:00
Tanguy Pruvot
c2214091ae benchmark: free last memory leaks on algo switch
remains my original lyra2 implementation to fix... (cuda_lyra2.cu)

I guess some kind of memory overflow force the driver to allocate
memory... but was unable to free it without device reset.
2015-10-10 02:15:32 +02:00
Tanguy Pruvot
922c2a5cd7 algos: free allocated mem for algo switch
All can be freed properly now, except script (reset) and lyra2 (leak)
2015-10-08 21:35:30 +02:00
Tanguy Pruvot
ee93927fac diff: use the new function in all algos 2015-10-07 20:10:15 +02:00
Tanguy Pruvot
e1c4b3042c algos: add functions to free allocated resources
Will be used later for algo switching

not really tested yet...
2015-09-25 07:51:57 +02:00
Tanguy Pruvot
5308898d1c start v1.7, apply new prototypes to all algos 2015-09-23 15:42:17 +02:00
Tanguy Pruvot
42bcb91ca0 x11: update sp luffa/cube to get closer x11 speeds..
i had to clean it... lot of unused defines...
2015-06-17 02:31:15 +02:00
Tanguy Pruvot
2113be6eec blake80: some changes and launch bounds, no perf changes 2015-04-24 14:12:21 +02:00
Tanguy Pruvot
3d3f2e2cb5 warnings: use the right device id (device_map[thr_id]) 2015-04-23 09:41:56 +02:00
KlausT
ae8e863591 remove uint32_t cast 2015-03-12 01:01:47 +01:00
Tanguy Pruvot
e6112e878d cleanup: use unsigned throughput parameters
Yes, its a big commit, was waiting 1.6 to do that...
Sorry for your possible merge issues ;)
2015-02-28 14:05:09 +01:00
Tanguy Pruvot
26b51a557b Allow different intensity per device
and clean the old variables, no more required
2015-01-24 11:17:29 +01:00
Tanguy Pruvot
45206e49c1 hamsi: TPB of 128 give better results (+10kh) 2015-01-24 07:17:12 +01:00
Tanguy Pruvot
2a5233f56e api: report throughput when default 2015-01-22 06:28:59 +01:00
Tanguy Pruvot
cafd4477d7 Handle a maximum of 16 gpus (vs 8 before)
Some cards have 2 gpus on board...
2015-01-22 04:55:27 +01:00
Tanguy Pruvot
c3bdb623e8 Check and submit multiple nonces in one loop
Added to most algos, checkhash function scans a big range
and can find multiple nonces at once if the difficulty is low.

Stop ignoring them, submit second one if found...

Clean the draft code for rc=2 implemented for blake and pentablake

btw... fix the reduced displayed hashrate when a nonce is found...

Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
2014-12-05 15:53:40 +00:00
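
The idea described in the commit above (one scan over a nonce range can yield more than one valid nonce when the difficulty is low, so the second one should be kept as well) can be sketched roughly as follows. This is a CPU-only illustration, not the ccminer checkhash code: `scan_nonces`, `toy_hash`, and the 64-bit target comparison are all hypothetical stand-ins.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical stand-in for a real hash: small values repeat every 1000 nonces.
uint64_t toy_hash(uint32_t nonce) {
    return nonce % 1000u;
}

// Scan [start, start + count) and collect up to max_found nonces whose
// hash value is <= target, instead of stopping at the first match.
std::vector<uint32_t> scan_nonces(uint32_t start, uint32_t count, uint64_t target,
                                  const std::function<uint64_t(uint32_t)>& hash,
                                  std::size_t max_found = 2) {
    assert(max_found > 0);
    std::vector<uint32_t> found;
    for (uint32_t i = 0; i < count; ++i) {
        uint32_t nonce = start + i;
        if (hash(nonce) <= target) {
            found.push_back(nonce);
            if (found.size() == max_found) {
                break;  // enough candidates for this pass
            }
        }
    }
    return found;
}
```

Calling `scan_nonces(2, 3000, 1, toy_hash)` under these assumptions returns the two nonces 1000 and 1001, both of which a miner would then submit.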
Tanguy Pruvot
118a6be361 checkhash: simplify the common function
use klaus trivial function, the old code has always been a bit weird..

split cuda_check_cpu_hash_64 in two functions, keep old for branched stuff
2014-12-01 00:20:40 +01:00
Tanguy Pruvot
8ad180cc70 various small changes
heavy: reduce by 256 threads default intensity to all -i 20
cuda: put static thread init bools outside the code (made once)
api: fix nvml header to build without
2014-11-28 20:57:35 +01:00
Tanguy Pruvot
6ae28162db various extern cleanup + api history uids and gpu SM
uids could be useful to create graphs from history data

Note: please do a clean build after this commit (changes in miner.h)
2014-11-26 11:55:42 +01:00
Tanguy Pruvot
9b1ff1280e Allow intermediate intensity (decimals)
Sample with -i 18.5
  Adding 131072 threads to intensity 18, 393216 cuda threads

And with -i 19.5
  Adding 262144 threads to intensity 19, 786432 cuda threads
2014-11-25 19:57:56 +01:00
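
The sample figures in the commit above are consistent with treating the integer part of the intensity as a power of two and the decimal part as a fraction of that base added on top. A minimal sketch under that assumption follows; the helper name `threads_for_intensity` is hypothetical, and the real ccminer code may round the extra threads differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper: turn a possibly fractional intensity into a thread count.
// Integer part i gives 2^i threads, the decimal part adds that fraction of 2^i.
uint32_t threads_for_intensity(double intensity) {
    assert(intensity >= 0.0 && intensity < 32.0);  // keep the shift within 32 bits
    uint32_t base = static_cast<uint32_t>(std::floor(intensity));
    double frac = intensity - static_cast<double>(base);
    uint32_t threads = 1u << base;
    uint32_t extra = static_cast<uint32_t>(frac * threads);
    return threads + extra;
}
```

With this sketch, `threads_for_intensity(18.5)` gives 262144 + 131072 = 393216 and `threads_for_intensity(19.5)` gives 524288 + 262144 = 786432, matching the two samples quoted in the commit message.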
Tanguy Pruvot
71f9003901 x13: use tsiv hamsi implementation (+70KH) 2014-11-24 23:01:41 +01:00
Tanguy Pruvot
c88750332c simd512: restore SM3/3.5 perfs
Simple change which affects all algos based on SIMD512

fresh, qubit, s3, x11 to x17...
2014-11-23 19:07:06 +01:00
sp-hash
f0d91ab8a6 Luffa and simd merged to one kernel.
Small echo rewrite. +10KHASH on the 650(compute 3.0)

tpruvot: add Linux Makefile - Force to 80 registers (else -30KH/s)

Note : the hashrate seems more constant with this change
2014-11-23 07:04:07 +01:00
Tanguy Pruvot
73f22b237a Prepare trap of hardware/mem failures 2014-11-20 18:44:25 +01:00
Tanguy Pruvot
fe4ad36b73 intensity: sign warnings fixes min(i,u) 2014-11-17 14:48:55 +01:00
Tanguy Pruvot
c859041993 quark/blake512 opt. pointed by sp without asm
indeed, the pragma unroll doesnt always make things faster

asm part... to check later
2014-11-17 00:01:32 +01:00
Tanguy Pruvot
438308b3a2 Rework benchmark mode and min/max range
Was maybe my fault, but the benchmark mode was
always recomputing from nonce 0.

Also fix blake if -d 1 is used (one thread but second gpu)

stats: do not use thread id as key, prefer gpu id...
2014-11-16 23:28:18 +01:00
Tanguy Pruvot
b128312efb cuda: store device SM in a global var
sample usage made for blake and fugue (higher intensity for SM5.2)

add these to cuda_helper and clean unused code
2014-11-11 19:11:16 +01:00
Tanguy Pruvot
11c5ec810d Handle intensity param in all algos
and add a check related to start/max nonce params
2014-11-09 22:27:32 +01:00
Tanguy Pruvot
7cc5222394 Move common check_cpu functions to root 2014-09-10 00:27:01 +02:00
Tanguy Pruvot
b4e690b486 sources: switch to UTF-8 2014-08-21 08:27:48 +02:00
Tanguy Pruvot
d9ea5f72ce Remove duplicated defines present in cuda_helper.h
also add cudaDeviceReset() on Ctrl+C for nvprof
2014-08-19 03:29:11 +02:00
Tanguy Pruvot
a9a3ad8afc cuda: check for errors on cuda mem alloc 2014-08-17 22:41:05 +02:00
Tanguy Pruvot
cf7351d138 x10 funcs cleanup, we dont need host constant tables 2014-08-15 03:40:13 +02:00
Tanguy Pruvot
06763c20b1 Implement x14 (cuda + cpu functions)
Project was updated for VS2013 and CUDA SDK 6.5

add also a --cputest function to dump cpu hash results

TODO: x15 is not fully functional, but first loop seems ok

Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
2014-08-12 14:47:03 +02:00
Christian Buchner
d99b91ea65 adding third party X13 and Diamond Groestl code contributions. 2014-06-15 14:31:20 +02:00