ccminer

mirror of https://github.com/GOSTSec/ccminer synced 2025-01-09 22:38:05 +00:00

Author	SHA1	Message	Date
Tanguy Pruvot	2113be6eec	blake80: some changes and launch bounds, no perf changes	2015-04-24 14:12:21 +02:00
Tanguy Pruvot	3d3f2e2cb5	warnings: use the right device id (device_map[thr_id])	2015-04-23 09:41:56 +02:00
Tanguy Pruvot	e7ae27137e	x11/qubit: remove some extra MyStreamSynchronize only one per loop is required to prevent 100% cpu usage	2015-04-15 05:30:22 +02:00
Tanguy Pruvot	d58d53f2b2	update README, small changes, prepare release 1.6.1 still need a SM 3.0 fix for skein...	2015-04-14 23:28:00 +02:00
Tanguy Pruvot	4f43abb402	bmw512: indent and restore SM 3.0 compat could be also the source of the problem seen with CUDA 7 restored the code before sp/klaus changes for SM 3.0 devices...	2015-03-28 12:01:50 +01:00
KlausT	ae8e863591	remove uint32_t cast	2015-03-12 01:01:47 +01:00
Tanguy Pruvot	35cc5908ee	windows: return to normal priority, fix json decref the jansson error seems only seen in windows debug mode	2015-03-10 19:14:15 +01:00
Tanguy Pruvot	ebd23bcc66	whirlpoolx: real fix for multi gpus Main problem was the arrays allocations which should be made per cpu Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>	2015-03-08 22:56:04 +01:00
Tanguy Pruvot	9c4158aadb	debug: x11 algo traces for cuda 7 problem	2015-03-02 16:29:46 +01:00
Tanguy Pruvot	e6112e878d	cleanup: use unsigned throughput parameters Yes, its a big commit, was waiting 1.6 to do that... Sorry for your possible merge issues ;)	2015-02-28 14:05:09 +01:00
Tanguy Pruvot	26b51a557b	Allow different intensity per device and clean the old variables, no more required	2015-01-24 11:17:29 +01:00
Tanguy Pruvot	2a5233f56e	api: report throughput when default	2015-01-22 06:28:59 +01:00
Tanguy Pruvot	cafd4477d7	Handle a maximum of 16 gpus (vs 8 before) Some cards have 2 gpus on board...	2015-01-22 04:55:27 +01:00
Tanguy Pruvot	b521acb480	groestl: use sp bitslice enhancement, prepare SM 2.x variant todo: simd512 SM 2.x variant (shfl op), and groestl/myriad functions	2015-01-19 00:42:14 +01:00
Tanguy Pruvot	90efbdcece	simd cleanup	2014-12-19 09:16:55 +01:00
Tanguy Pruvot	ec5a48f420	x11: small simd512 gpu_expand improvement	2014-12-19 09:16:55 +01:00
Tanguy Pruvot	6c7fce187b	x11: use KlausT optimisation (+20 KHs) But use a define in AES to use or not device initial memcpy I already tried to use everywhere direct device constants and its not faster for big arrays (difference is small) also change launch bounds to reduce spills (72 regs) to check on windows too, could improve the perf... or not	2014-12-06 04:14:36 +01:00
Tanguy Pruvot	c3bdb623e8	Check and submit multiple nonces in one loop Added to most algos, checkhash function scans a big range and can find multiple nonces at once if the difficulty is low. Stop ignoring them, submit second one if found... Clean the draft code for rc=2 implemented for blake and pentablake btw... fix the reduced displayed hashrate when a nonce is found... Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>	2014-12-05 15:53:40 +00:00
Tanguy Pruvot	f387898ead	Prepare multiple nonces support in one loop (if found) Tested on x11 which find sometimes 3 nonces in one call, actually they are ignored because only the biggest was kept... This commit doesnt fix that, but will allow to enhance shares rate later...	2014-12-05 10:16:06 +01:00
Tanguy Pruvot	118a6be361	checkhash: simplify the common function use klaus trivial function, the old code has always been a bit weird.. split cuda_check_cpu_hash_64 in two functions, keep old for branched stuff	2014-12-01 00:20:40 +01:00
Tanguy Pruvot	8ad180cc70	various small changes heavy: reduce by 256 threads default intensity to all -i 20 cuda: put static thread init bools outside the code (made once) api: fix nvml header to build without	2014-11-28 20:57:35 +01:00
Tanguy Pruvot	6ae28162db	various extern cleanup + api history uids and gpu SM uids could be useful to create graphs from history data Note: please do a clean build after this commit (changes in miner.h)	2014-11-26 11:55:42 +01:00
Tanguy Pruvot	9b1ff1280e	Allow intermediate intensity (decimals) Sample with -i 18.5 Adding 131072 threads to intensity 18, 393216 cuda threads And with -i 19.5 Adding 262144 threads to intensity 19, 786432 cuda threads	2014-11-25 19:57:56 +01:00
Tanguy Pruvot	d0316220dd	simd512: restore full maxwell power (typo)	2014-11-23 21:19:35 +01:00
Tanguy Pruvot	c88750332c	simd512: restore SM3/3.5 perfs Simple change which affect all algos based on SIMD512 fresh, qubit, s3, x11 to x17...	2014-11-23 19:07:06 +01:00
Tanguy Pruvot	94c9945fe6	cubeluffa: Fix indent and add some static prefixes use git "show -w <commithash>" to see changes Duplicated functions in merged Cube+Luffa could be cross linked without	2014-11-23 07:17:20 +01:00
sp-hash	f0d91ab8a6	Luffa and simd merged to one kernel. Small echo rewrite. +10KHASH on the 650(compute 3.0) tpruvot: add Linux Makefile - Force to 80 registers (else -30KH/s) Note : the hashrate seems more constant with this change	2014-11-23 07:04:07 +01:00
sp-hash	7d88e5cca1	Faster Simd On maxwell compress1 and compress2 can be run in one run instead of two.(750TI + 20KHASH)	2014-11-22 03:25:49 +01:00
Tanguy Pruvot	73f22b237a	Prepare trap of hardware/mem failures	2014-11-20 18:44:25 +01:00
Tanguy Pruvot	bdfce54c3b	x11: restore default intensity to 19 on windows	2014-11-17 14:48:55 +01:00
Tanguy Pruvot	fe4ad36b73	intensity: sign warnings fixes min(i,u)	2014-11-17 14:48:55 +01:00
Tanguy Pruvot	c859041993	quark/blake512 opt. pointed by sp without asm indeed, the pragma unroll doesnt always make things faster asm part... to check later	2014-11-17 00:01:32 +01:00
Tanguy Pruvot	438308b3a2	Rework benchmark mode and min/max range Was maybe my fault, but the benchmark mode was always recomputing from nonce 0. Also fix blake if -d 1 is used (one thread but second gpu) stats: do not use thread id as key, prefer gpu id...	2014-11-16 23:28:18 +01:00
Tanguy Pruvot	11dbbcc12d	checkhash: some work on a faster variant (wip) This should not be used for all algos... not enabled yet todo: multiple nonces or blake32 style checkup	2014-11-16 17:37:02 +01:00
Tanguy Pruvot	14a41959f8	x11: switch to intensity 20 for SM>=5.2 750+970	2014-11-16 17:34:50 +01:00
Tanguy Pruvot	fdd5d29071	x11: shavite and echo from sp (now ok on win32) Previous echo commit was only increasing linux performance, and reducing windows perf compared to the 1.4.9, this one seems to give at least the 1.4.9 on windows, and the same on linux... Shavite optimisation seems ok on both (use now 64 registers) the launch_bounds will force the number of registers, so remove specific Makefile rules on linux... manual "cherry pick" with fixed line endings and some adaptations	2014-11-16 17:34:50 +01:00
sp-hash	e18a54e8fc	sp echo optimisation + cleanup Original Commit : Removed sharedmem and reduced calculations with precalcing (ECHO hash). 750ti + 20KHASH(x11) tpruvot notes: Real change is more of 10 KH/s on stock clocks (but real) launch bounds disabled, no perf increase with 64 registers	2014-11-16 03:08:46 +01:00
Tanguy Pruvot	b128312efb	cuda: store device SM in a global var sample usage made for blake and fugue (higher intensity for SM5.2) add these to cuda_helper and clean unused code	2014-11-11 19:11:16 +01:00
Tanguy Pruvot	11c5ec810d	Handle intensity param in all algos and add a check related to start/max nonce params	2014-11-09 22:27:32 +01:00
sp-hash	5be6811dcf	x11: echo and cubehash optimization echo : 40.056ms -> 39.241ms cube : 14.490ms -> 13.511ms cube hash change look like useless (__device__ code in generally inlined) but the reality proves that cuda documentation is wrong... tpruvot: fixed dos lines ending in echo, and used my style for cuda function attributes	2014-11-06 15:17:26 +01:00
Tanguy Pruvot	b191d713a0	s3: reduce a bit the intensity on windows	2014-10-26 11:18:59 +01:00
Tanguy Pruvot	6169bf683b	Add S3 Algo (1Coin) Simple addition of the algo using existing X11 code	2014-10-26 09:10:58 +01:00
Tanguy Pruvot	93f4409dde	simd: then reindent the code no changes, only error checks (cuda safe call)	2014-10-25 23:03:20 +02:00
Tanguy Pruvot	b465fe6825	optimize x11 simd512 (+100KH/s) change picked from tsiv repo	2014-10-25 22:15:43 +02:00
Tanguy Pruvot	1b241df5c0	cubehash and luffa funnel shift (from klaus) No gain... but i like this define, more readable in luffa ;)	2014-10-20 19:06:27 +02:00
Tanguy Pruvot	d8a23fa970	Tune quark part of Xn funcs based on klaus commits, will increase a bit speed of most algos PS: main increase is due to the register count tuning in Makefile and for skein512 on linux, its the ROTL64 but almost no changes on X11 : 2648MH/s vs 2630 before	2014-10-20 03:15:17 +02:00
Tanguy Pruvot	7cc5222394	Move common check_cpu functions to root	2014-09-10 00:27:01 +02:00
Tanguy Pruvot	95ac1d0f19	x11: adapt some blake 256 opts to 512 one blake512: for the moment 6.2ms vs 7.12 before (+10%)	2014-09-09 17:55:07 +02:00
Tanguy Pruvot	b4e690b486	sources: switch to UTF-8	2014-08-21 08:27:48 +02:00
Tanguy Pruvot	912ef1215d	small reg tunes, rename whirlcoin to whirl	2014-08-21 02:57:10 +02:00

64 Commits
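
Note on commit 9b1ff1280e (intermediate intensity): the two samples in that message are consistent with a simple rule where the integer part of -i selects a power-of-two thread count and the decimal part adds that fraction of the same power of two. The sketch below only reproduces that arithmetic. The function name is hypothetical, and any rounding or clamping the real miner may apply is an assumption left out here.

```python
import math


def threads_for_intensity(intensity):
    """Return (added_threads, total_threads) for a decimal intensity.

    Hypothetical helper inferred from the two samples in the commit
    message; it is not taken from the ccminer sources.
    """
    base_exp = math.floor(intensity)       # integer part, e.g. 18
    base = 1 << base_exp                   # 2**18 = 262144 threads
    fraction = intensity - base_exp        # decimal part, e.g. 0.5
    added = int(fraction * base)           # extra threads for the decimal
    return added, base + added


if __name__ == "__main__":
    for i in (18.5, 19.5):
        added, total = threads_for_intensity(i)
        print(f"-i {i}: adding {added} threads to intensity "
              f"{math.floor(i)}, {total} cuda threads")
```

For -i 18.5 and -i 19.5 this gives the same added and total thread counts as the commit message (131072 / 393216 and 262144 / 786432).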