Commit Graph

36 Commits

Author SHA1 Message Date
Tanguy Pruvot
38e6672d70 Allow test of SM 2.1/3.0 binaries on newer cards
Implementation based on klausT work.. a bit different

This code must be placed in a common .cu file,
cuda.cpp is not compiled with nvcc and doesn't allow cuda code...
2015-03-28 12:00:53 +01:00
Tanguy Pruvot
7939dce0aa pluck: adaptation from djm repo
the cpu validation check remains to be done...

throughput for this algo is divided by 128 to keep the same kind of intensity values (default 18.0)
2015-03-08 15:16:11 +01:00
Tanguy Pruvot
3ed1c552bd cuda: always disable asm for host code 2015-03-05 18:15:52 +01:00
Tanguy Pruvot
e6112e878d cleanup: use unsigned throughput parameters
Yes, it's a big commit, I was waiting for 1.6 to do that...
Sorry for your possible merge issues ;)
2015-02-28 14:05:09 +01:00
Tanguy Pruvot
768b5ccb76 import bmw512 uint2 changes from sp
+ some cleanup... 15KH/s won (750Ti)
2015-01-24 08:02:41 +01:00
Tanguy Pruvot
9f2dd3ee60 Remove some useless conversions
does not impact performance either...
2015-01-24 08:00:22 +01:00
Tanguy Pruvot
cafd4477d7 Handle a maximum of 16 gpus (vs 8 before)
Some cards have 2 gpus on board...
2015-01-22 04:55:27 +01:00
Tanguy Pruvot
b3188669e2 lyra2: cleanup
quickly tested with a SM 3.0 binary...
2014-12-20 13:10:33 +01:00
Tanguy Pruvot
da2e2528a7 uint2: fix SM 3.0 ROR and ROL
Not sure it's the fastest way, but it works for offsets 0-63 + 64

Also note that asm SM 3.5+ doesn't support ROR with offset 64
2014-12-19 21:45:40 +01:00
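To illustrate the rotation described in the commit above, here is a minimal host-side C++ sketch of a 64-bit rotate built from two 32-bit halves. It is not the code from the commit: the `U2` struct is a hypothetical stand-in for CUDA's `uint2`, and the names `ror64_halves`, `rol64_halves` and `to_u64` are invented for this example. It handles offsets 0-63 and treats 64 as a full turn.

```cpp
#include <cstdint>

// Hypothetical stand-in for CUDA's uint2: x is the low 32 bits, y the high 32 bits.
struct U2 { uint32_t x, y; };

// Pack the two halves into one 64-bit value (used only to check results).
uint64_t to_u64(U2 a) { return (static_cast<uint64_t>(a.y) << 32) | a.x; }

// Rotate the 64-bit value (y:x) right by n bits, for n in 0..64.
U2 ror64_halves(U2 a, unsigned n) {
    n &= 63;                          // an offset of 64 is a full turn, same as 0
    if (n >= 32) {                    // rotating by 32 simply swaps the halves
        uint32_t t = a.x; a.x = a.y; a.y = t;
        n -= 32;
    }
    if (n == 0) return a;             // avoid a 32-bit shift by 32 (undefined in C++)
    U2 r;
    r.x = (a.x >> n) | (a.y << (32 - n));
    r.y = (a.y >> n) | (a.x << (32 - n));
    return r;
}

// Rotate left by n bits, expressed as a right rotation by (64 - n).
U2 rol64_halves(U2 a, unsigned n) {
    return ror64_halves(a, (64 - (n & 63)) & 63);
}
```

The early return for a zero offset matters: without it, `32 - n` would be 32 and the shift would be undefined behaviour on the host.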
Tanguy Pruvot
c5b349e079 Add Lyra2 algo, based on Vertcoin published code
Seems to be djm34's work, I recognize the code style ;)

Code was cleaned/indented and adapted to my fork...

Only usable on the test pool until 16 December 2014!
2014-12-06 11:28:26 +01:00
Tanguy Pruvot
c3bdb623e8 Check and submit multiple nonces in one loop
Added to most algos, checkhash function scans a big range
and can find multiple nonces at once if the difficulty is low.

Stop ignoring them, submit second one if found...

Clean the draft code for rc=2 implemented for blake and pentablake

btw... fix the reduced displayed hashrate when a nonce is found...

Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
2014-12-05 15:53:40 +00:00
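The idea in the commit above (one scan of a nonce range can yield more than one valid nonce when the difficulty is low) can be sketched on the host as follows. This is only an illustration under stated assumptions: `toy_hash` is a deliberately trivial stand-in for a real hash, and `scan_range` and its parameters are hypothetical names, not the repository's checkhash function.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy stand-in for a real hash function, purely illustrative.
uint32_t toy_hash(uint32_t nonce) { return nonce % 1000u; }

// Scan [first, first + count) and collect every nonce whose hash meets the
// target, stopping once max_found nonces were collected.
std::vector<uint32_t> scan_range(uint32_t first, uint32_t count,
                                 uint32_t target, std::size_t max_found = 2) {
    std::vector<uint32_t> found;
    for (uint32_t i = 0; i < count && found.size() < max_found; ++i) {
        uint32_t nonce = first + i;
        if (toy_hash(nonce) <= target) {
            found.push_back(nonce);   // keep it instead of overwriting a single result
        }
    }
    return found;
}
```

With the toy hash, `scan_range(1, 3000, 0)` collects the nonces 1000 and 2000; a scan that kept a single result slot would have reported only one of them.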
Tanguy Pruvot
f387898ead Prepare multiple nonces support in one loop (if found)
Tested on x11, which sometimes finds 3 nonces in one call;
currently they are ignored because only the biggest was kept...

This commit doesn't fix that, but will allow improving the share rate later...
2014-12-05 10:16:06 +01:00
Tanguy Pruvot
118a6be361 checkhash: simplify the common function
use klaus trivial function, the old code has always been a bit weird..

split cuda_check_cpu_hash_64 in two functions, keep old for branched stuff
2014-12-01 00:20:40 +01:00
Tanguy Pruvot
6ae28162db various extern cleanup + api history uids and gpu SM
uids could be useful to create graphs from history data

Note: please do a clean build after this commit (changes in miner.h)
2014-11-26 11:55:42 +01:00
sp-hash
26b9fe3586 faster x15, +23KH or 4ms on whirlpool (30ms vs 34ms)
tpruvot: I didn't pick the asm replace_hiword, slower on linux
2014-11-20 19:19:27 +01:00
Tanguy Pruvot
73f22b237a Prepare trap of hardware/mem failures 2014-11-20 18:44:25 +01:00
Tanguy Pruvot
11dbbcc12d checkhash: some work on a faster variant (wip)
This should not be used for all algos... not enabled yet

todo: multiple nonces or blake32 style checkup
2014-11-16 17:37:02 +01:00
Tanguy Pruvot
b128312efb cuda: store device SM in a global var
sample usage made for blake and fugue (higher intensity for SM5.2)

add these to cuda_helper and clean unused code
2014-11-11 19:11:16 +01:00
Tanguy Pruvot
987edf63f3 vstudio: fix launch_bounds intellisense warnings in ide 2014-11-09 20:51:24 +01:00
Tanguy Pruvot
149143d5cd Fix left value warning in SWAPDWORDS + groestl change 2014-11-09 13:23:31 +01:00
Tanguy Pruvot
a747e4ca0f blake512: use a new SWAPDWORDS asm func (0.05ms)
small improvement, do it on pentablake and heavy variants too

based on sp commit (but SWAP32 is already used for 32bit ints)
2014-11-09 01:26:55 +01:00
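For readers unfamiliar with the operation, a plain C++ equivalent of what the name SWAPDWORDS suggests (exchanging the two 32-bit halves of a 64-bit word) might look like the sketch below. This is an assumption based on the name only; the commit uses an asm function whose body is not shown here, and `swap_dwords` is a hypothetical name.

```cpp
#include <cstdint>

// Exchange the high and low 32-bit halves of a 64-bit word.
// Equivalent to rotating the word by 32 bits in either direction.
uint64_t swap_dwords(uint64_t v) {
    return (v << 32) | (v >> 32);
}
```

Applying the function twice returns the original value.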
Tanguy Pruvot
5bc969fa57 Some work on data alignment
linux: add -march=native (we build it ourselves) and some other flags

+ remove unused vars (seen with -Wall)
2014-11-03 16:40:13 +01:00
Tanguy Pruvot
2de9b1375b prepare next version 2014-10-20 19:00:44 +02:00
Tanguy Pruvot
d8a23fa970 Tune quark part of Xn funcs
based on klaus' commits, will slightly increase the speed of most algos

PS: main increase is due to the register count tuning in Makefile

and for skein512 on linux, it's the ROTL64

but almost no changes on X11 : 2648MH/s vs 2630 before
2014-10-20 03:15:17 +02:00
Tanguy Pruvot
ba33492592 blake: return to ptarget 6:7 compare
clz can be erroneous, ex 0xE0 vs 0xF0
2014-09-19 05:01:16 +02:00
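The 0xE0 vs 0xF0 example in the commit above can be reproduced with a small host-side sketch. It assumes the clz-based check accepted a hash when it had at least as many leading zeros as the target; the exact comparison used in the repository is not shown here, and `clz32`, `below_target_clz` and `below_target_exact` are hypothetical names.

```cpp
#include <cstdint>

// Portable leading-zero count for a 32-bit value (returns 32 for 0).
int clz32(uint32_t v) {
    int n = 0;
    for (uint32_t bit = 0x80000000u; bit != 0 && (v & bit) == 0; bit >>= 1) {
        ++n;
    }
    return n;
}

// Approximate check (assumed form): compare leading-zero counts only.
bool below_target_clz(uint32_t hash, uint32_t target) {
    return clz32(hash) >= clz32(target);
}

// Exact check: compare the values themselves.
bool below_target_exact(uint32_t hash, uint32_t target) {
    return hash <= target;
}
```

Both 0xE0 and 0xF0 have 24 leading zeros, so with a target of 0xE0 the clz-based check accepts a hash of 0xF0 while the exact comparison rejects it.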
Tanguy Pruvot
91eea0d76b blake: remove int cudaMemcpyToSymbol for MSVC
use clz (leading zeros) asm func for a fast gpu compare of ptarget[6]:[7]

also add missing windows ctz/clz host functions

New NEOS speed: 227MH to 270MH (Gigabyte 750Ti Black Edition)
2014-09-13 17:31:01 +02:00
Tanguy Pruvot
c3eb66683a Import djm34 qubit, deep and doom algos
Indent, and put commonly used functions proto. in cuda_helper.h

And add them to --cputest function

Also change the color option to --nocolor, -C is no longer needed

Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
(Who is tired of removing these German copy/pasted comments)
2014-09-10 00:26:55 +02:00
Tanguy Pruvot
13bb9d267e Remove debug rpc, already exists with -P 2014-09-09 21:59:03 +02:00
Tanguy Pruvot
64e8cd3f98 add x17 algo, cleaned djm34 commit
todo: visual studio...
2014-08-23 22:44:17 +02:00
Tanguy Pruvot
3f6ebc10cc whirlpool: x64 asm is very slow (30ms win32 vs 90) 2014-08-22 04:09:16 +02:00
Tanguy Pruvot
912ef1215d small reg tunes, rename whirlcoin to whirl 2014-08-21 02:57:10 +02:00
Tanguy Pruvot
1fbcbbacc4 Add whirlcoin and optimize x11 luffa (maxrregcount) 2014-08-20 07:49:22 +02:00
Tanguy Pruvot
4bc23048b5 x15: use djm34 code with asm xor64 + my rot64
some optimizations could be done later, after whirlcoin integration
2014-08-20 05:54:47 +02:00
Tanguy Pruvot
d9ea5f72ce Remove duplicated defines present in cuda_helper.h
also add cudaDeviceReset() on Ctrl+C for nvprof
2014-08-19 03:29:11 +02:00
Tanguy Pruvot
a9a3ad8afc cuda: check for errors on cuda mem alloc 2014-08-17 22:41:05 +02:00
Christian Buchner
f22ae4ebde forgot this file in previous commit 2014-05-03 21:09:43 +02:00