ccminer

mirror of https://github.com/GOSTSec/ccminer synced 2025-01-09 14:28:15 +00:00

Author	SHA1	Message	Date
Tanguy Pruvot	4709668995	jh512: rewrite and optimize with asm swap. 5% improvement from the vshl asm swap functions and mixed shl+add instructions. Also add an xchg(x, y) function and an XCHG(x, y) define in cuda_helper for later use. Other jh changes are mainly cosmetic. Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>	2015-06-16 08:20:48 +02:00
Tanguy Pruvot	52df82917a	cuda: fix uint2 subtract operator	2015-05-29 14:32:13 +02:00
Tanguy Pruvot	7bf256c81c	cuda_helper: define UINT32_MAX if not defined. It seems not to be defined on Slackware.	2015-05-12 18:05:09 +02:00
Tanguy Pruvot	2f541065fb	cuda_helper: rename hiword/loword functions correctly	2015-05-12 17:13:58 +02:00
Tanguy Pruvot	b35a6742fe	cuda_helper: properly ifdef for vstudio c++ compat	2015-05-12 05:33:57 +02:00
Tanguy Pruvot	7c7f40a634	neoscrypt: attempt to recode shift256R for SM 3.0	2015-05-08 23:42:24 +02:00
Tanguy Pruvot	1ad34dc13d	reset: take care of multi-threaded GPUs (-d 0,0). To be tested; could create problems when reset in a chain like x11.	2015-04-21 09:12:43 +02:00
Tanguy Pruvot	38e6672d70	Allow test of SM 2.1/3.0 binaries on newer cards. Implementation based on klausT's work, but a bit different. This code must be placed in a common .cu file: cuda.cpp is not compiled with nvcc and doesn't allow CUDA code.	2015-03-28 12:00:53 +01:00
Tanguy Pruvot	7939dce0aa	pluck: adaptation from djm repo. The CPU validation check remains to be done. Throughput for this algo is divided by 128 to keep the same kind of intensity values (default 18.0).	2015-03-08 15:16:11 +01:00
Tanguy Pruvot	3ed1c552bd	cuda: always disable asm for host code	2015-03-05 18:15:52 +01:00
Tanguy Pruvot	e6112e878d	cleanup: use unsigned throughput parameters. Yes, it's a big commit; I was waiting for 1.6 to do that. Sorry for your possible merge issues ;)	2015-02-28 14:05:09 +01:00
Tanguy Pruvot	768b5ccb76	import bmw512 uint2 changes from sp + some cleanup. 15KH/s gained (750Ti).	2015-01-24 08:02:41 +01:00
Tanguy Pruvot	9f2dd3ee60	Remove some useless conversions. This does not impact performance either.	2015-01-24 08:00:22 +01:00
Tanguy Pruvot	cafd4477d7	Handle a maximum of 16 GPUs (vs 8 before). Some cards have 2 GPUs on board.	2015-01-22 04:55:27 +01:00
Tanguy Pruvot	b3188669e2	lyra2: cleanup. Quickly tested with an SM 3.0 binary.	2014-12-20 13:10:33 +01:00
Tanguy Pruvot	da2e2528a7	uint2: fix SM 3.0 ROR and ROL. Not sure it's the fastest way, but it works for offsets 0-63 + 64. Also note that asm SM 3.5+ doesn't support ROR with offset 64.	2014-12-19 21:45:40 +01:00
Tanguy Pruvot	c5b349e079	Add Lyra2 algo, based on Vertcoin published code. Seems to be djm34's work, I recognize the code style ;) Code was cleaned/indented and adapted to my fork. Only usable on the test pool until 16 December 2014!	2014-12-06 11:28:26 +01:00
Tanguy Pruvot	c3bdb623e8	Check and submit multiple nonces in one loop. Added to most algos: the checkhash function scans a big range and can find multiple nonces at once if the difficulty is low. Stop ignoring them; submit the second one if found. Also clean the draft code for rc=2 implemented for blake and pentablake, and fix the reduced displayed hashrate when a nonce is found. Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>	2014-12-05 15:53:40 +00:00
Tanguy Pruvot	f387898ead	Prepare multiple nonces support in one loop (if found). Tested on x11, which sometimes finds 3 nonces in one call; currently they are ignored because only the biggest was kept. This commit doesn't fix that, but will allow enhancing the share rate later.	2014-12-05 10:16:06 +01:00
Tanguy Pruvot	118a6be361	checkhash: simplify the common function. Use klaus' trivial function; the old code has always been a bit weird. Split cuda_check_cpu_hash_64 into two functions, keeping the old one for branched stuff.	2014-12-01 00:20:40 +01:00
Tanguy Pruvot	6ae28162db	various extern cleanup + api history uids and gpu SM uids could be useful to create graphs from history data. Note: please do a clean build after this commit (changes in miner.h).	2014-11-26 11:55:42 +01:00
sp-hash	26b9fe3586	faster x15, +23KH or 4ms on whirlpool (30ms vs 34ms). tpruvot: I didn't pick the asm replace_hiword, slower on Linux.	2014-11-20 19:19:27 +01:00
Tanguy Pruvot	73f22b237a	Prepare trap of hardware/mem failures	2014-11-20 18:44:25 +01:00
Tanguy Pruvot	11dbbcc12d	checkhash: some work on a faster variant (wip). This should not be used for all algos; not enabled yet. Todo: multiple nonces or blake32 style checkup.	2014-11-16 17:37:02 +01:00
Tanguy Pruvot	b128312efb	cuda: store device SM in a global var. Sample usage made for blake and fugue (higher intensity for SM 5.2). Add these to cuda_helper and clean unused code.	2014-11-11 19:11:16 +01:00
Tanguy Pruvot	987edf63f3	vstudio: fix launch_bounds intellisense warnings in ide	2014-11-09 20:51:24 +01:00
Tanguy Pruvot	149143d5cd	Fix left value warning in SWAPDWORDS + groestl change	2014-11-09 13:23:31 +01:00
Tanguy Pruvot	a747e4ca0f	blake512: use a new SWAPDWORDS asm func (0.05ms). Small improvement; do it on pentablake and heavy variants too. Based on sp commit (but SWAP32 is already used for 32-bit ints).	2014-11-09 01:26:55 +01:00
Tanguy Pruvot	5bc969fa57	Some work on data alignment. linux: add -march=native (we build it ourselves) and some other flags + remove unused vars (seen with -Wall).	2014-11-03 16:40:13 +01:00
Tanguy Pruvot	2de9b1375b	prepare next version	2014-10-20 19:00:44 +02:00
Tanguy Pruvot	d8a23fa970	Tune quark part of Xn funcs. Based on klaus commits; will slightly increase the speed of most algos. PS: the main increase is due to the register count tuning in the Makefile, and for skein512 on Linux it's the ROTL64, but almost no change on X11: 2648MH/s vs 2630 before.	2014-10-20 03:15:17 +02:00
Tanguy Pruvot	ba33492592	blake: return to ptarget 6:7 compare. clz can be erroneous, e.g. 0xE0 vs 0xF0.	2014-09-19 05:01:16 +02:00
Tanguy Pruvot	91eea0d76b	blake: remove int cudaMemcpyToSymbol for MSVC. Use clz (leading zeros) asm func for a fast GPU compare of ptarget[6]:[7]. Also add missing Windows ctz/clz host functions. New NEOS speed: 227MH to 270MH (Gigabyte 750Ti Black Edition).	2014-09-13 17:31:01 +02:00
Tanguy Pruvot	c3eb66683a	Import djm34 qubit, deep and doom algos. Indent, and put commonly used function prototypes in cuda_helper.h. Also add them to the --cputest function, and change the color option to --nocolor; -C is no longer needed. Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com> (who is tired of removing these German copy/pasted comments)	2014-09-10 00:26:55 +02:00
Tanguy Pruvot	13bb9d267e	Remove debug rpc, already exists with -P	2014-09-09 21:59:03 +02:00
Tanguy Pruvot	64e8cd3f98	add x17 algo, cleaned djm34 commit. Todo: visual studio...	2014-08-23 22:44:17 +02:00
Tanguy Pruvot	3f6ebc10cc	whirlpool: x64 asm is very slow (30ms win32 vs 90)	2014-08-22 04:09:16 +02:00
Tanguy Pruvot	912ef1215d	small reg tunes, rename whirlcoin to whirl	2014-08-21 02:57:10 +02:00
Tanguy Pruvot	1fbcbbacc4	Add whirlcoin and optimize x11 luffa (maxrregcount)	2014-08-20 07:49:22 +02:00
Tanguy Pruvot	4bc23048b5	x15: use djm34 code with asm xor64 + my rot64. Some optimizations could be done later, after whirlcoin integration.	2014-08-20 05:54:47 +02:00
Tanguy Pruvot	d9ea5f72ce	Remove duplicated defines present in cuda_helper.h. Also add cudaDeviceReset() on Ctrl+C for nvprof.	2014-08-19 03:29:11 +02:00
Tanguy Pruvot	a9a3ad8afc	cuda: check for errors on cuda mem alloc	2014-08-17 22:41:05 +02:00
Christian Buchner	f22ae4ebde	forgot this file in previous commit	2014-05-03 21:09:43 +02:00
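
Note on commit ba33492592: its message says a clz-based compare "can be erroneous", citing 0xE0 vs 0xF0. The following is a minimal host-side sketch of why that can happen, not the code from the repository. The names clz32, meets_target_clz and meets_target_exact are hypothetical, and the assumption is that the fast check compared only the number of leading zero bits of a hash word against that of the target word.

```c
#include <assert.h>
#include <stdint.h>

/* Portable count-leading-zeros for a 32-bit word (hypothetical helper).
 * Returns 32 when x is zero. */
static int clz32(uint32_t x)
{
    int n = 0;
    if (x == 0)
        return 32;
    while (!(x & 0x80000000u)) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Assumed shape of a clz-only check: accept the hash word if it has at
 * least as many leading zero bits as the target word. */
static int meets_target_clz(uint32_t hash_word, uint32_t target_word)
{
    return clz32(hash_word) >= clz32(target_word);
}

/* Exact check: the hash word must not exceed the target word. */
static int meets_target_exact(uint32_t hash_word, uint32_t target_word)
{
    return hash_word <= target_word;
}

/* 0xE0 and 0xF0 share the same leading-zero count, so the clz-only
 * check accepts a hash word of 0xF0 against a target word of 0xE0,
 * while the exact compare rejects it. */
static void clz_compare_self_check(void)
{
    assert(clz32(0xE0u) == 24);
    assert(clz32(0xF0u) == 24);
    assert(meets_target_clz(0xF0u, 0xE0u) == 1);
    assert(meets_target_exact(0xF0u, 0xE0u) == 0);
}
```

Calling clz_compare_self_check() from a test harness exercises the 0xE0 vs 0xF0 case from the commit message. A leading-zero count only bounds the magnitude to within a factor of two, which is consistent with the commit going back to a direct compare of the target words.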

43 Commits
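
Note on commit da2e2528a7: its message mentions ROR and ROL working for offsets 0-63 plus 64. The sketch below shows, in plain host C on a uint64_t, one common way to keep a 64-bit rotate well defined at the edge offsets 0 and 64. It is an illustration only: the repository's version operates on CUDA uint2 values and is not reproduced here, and the names rotl64 and rotr64 are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left by n bits. Masking both shift counts to 0..63 avoids a
 * shift by 64, which is undefined behaviour in C for a 64-bit type. */
static uint64_t rotl64(uint64_t x, unsigned n)
{
    n &= 63u;                       /* offset 64 behaves like offset 0 */
    return (x << n) | (x >> ((64u - n) & 63u));
}

/* Rotate right by n bits, using the same masking. */
static uint64_t rotr64(uint64_t x, unsigned n)
{
    n &= 63u;
    return (x >> n) | (x << ((64u - n) & 63u));
}

/* Edge offsets 0 and 64 both return the input unchanged. */
static void rotate_self_check(void)
{
    const uint64_t v = 0x0123456789ABCDEFull;
    assert(rotl64(v, 0) == v);
    assert(rotl64(v, 64) == v);
    assert(rotr64(v, 64) == v);
    assert(rotl64(0x8000000000000000ull, 1) == 1ull);
    assert(rotr64(1ull, 1) == 0x8000000000000000ull);
}
```

The naive form `(x << n) | (x >> (64 - n))` shifts by 64 when n is 0, which is why the edge offsets need the masking shown here.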