Tanguy Pruvot
11dbbcc12d
checkhash: some work on a faster variant (wip)
...
This should not be used for all algos... not enabled yet
todo: multiple nonces or a blake32-style check
2014-11-16 17:37:02 +01:00
Tanguy Pruvot
b128312efb
cuda: store device SM in a global var
...
sample usage made for blake and fugue (higher intensity for SM5.2)
add these to cuda_helper and clean unused code
2014-11-11 19:11:16 +01:00
Tanguy Pruvot
987edf63f3
vstudio: fix launch_bounds intellisense warnings in ide
2014-11-09 20:51:24 +01:00
Tanguy Pruvot
149143d5cd
Fix lvalue warning in SWAPDWORDS + groestl change
2014-11-09 13:23:31 +01:00
Tanguy Pruvot
a747e4ca0f
blake512: use a new SWAPDWORDS asm func (0.05ms)
...
small improvement, do it on pentablake and heavy variants too
based on sp commit (but SWAP32 is already used for 32bit ints)
2014-11-09 01:26:55 +01:00
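For readers unfamiliar with the operation named in the commit above, here is a minimal host-side sketch. It assumes SWAPDWORDS exchanges the high and low 32-bit words of a 64-bit value; the commit only says it is an asm function, so the name `swapdwords` and the portable shift-based body are illustrative, not the actual device code.

```cpp
#include <cstdint>

// Hypothetical host-side equivalent (assumption, not the ccminer device code):
// exchange the upper and lower 32-bit words of a 64-bit value.
constexpr uint64_t swapdwords(uint64_t v)
{
    return (v << 32) | (v >> 32);
}

// Compile-time sanity check: the two halves trade places.
static_assert(swapdwords(0x0123456789ABCDEFULL) == 0x89ABCDEF01234567ULL,
              "high and low dwords should be exchanged");
```

This is the same as a 64-bit rotate by 32, and applying it twice returns the original value.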
Tanguy Pruvot
5bc969fa57
Some work on data alignment
...
linux: add -march=native (we build it ourselves) and some other flags
+ remove unused vars (seen with -Wall)
2014-11-03 16:40:13 +01:00
Tanguy Pruvot
2de9b1375b
prepare next version
2014-10-20 19:00:44 +02:00
Tanguy Pruvot
d8a23fa970
Tune quark part of Xn funcs
...
based on klaus commits, will slightly increase the speed of most algos
PS: main increase is due to the register count tuning in Makefile
and for skein512 on linux, it's the ROTL64
but almost no changes on X11 : 2648MH/s vs 2630 before
2014-10-20 03:15:17 +02:00
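The ROTL64 mentioned in the commit above is a 64-bit rotate left. A minimal portable sketch follows; the function name `rotl64` and the masking of the shift count are assumptions made for illustration, and the tuned device version in the repository may be implemented differently (for example with inline PTX).

```cpp
#include <cstdint>

// Illustrative 64-bit rotate left (assumption: not the tuned device code).
// The count is reduced modulo 64 so that n == 0 and n == 64 are well defined
// and never trigger an undefined 64-bit shift.
inline uint64_t rotl64(uint64_t x, unsigned n)
{
    n &= 63u;
    if (n == 0)
        return x;
    return (x << n) | (x >> (64u - n));
}
```

Compilers generally recognise this pattern and emit a native rotate where the target has one.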
Tanguy Pruvot
ba33492592
blake: return to ptarget 6:7 compare
...
clz can be erroneous, e.g. 0xE0 vs 0xF0
2014-09-19 05:01:16 +02:00
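The 0xE0 vs 0xF0 case from the commit above can be reproduced on the host. This is a sketch under assumptions: the helper names (`clz32`, `accept_by_clz`, `accept_exact`) are hypothetical, and the exact comparison used in the blake kernel is not shown in this log. It only illustrates why counting leading zeros cannot replace a full compare against the target word.

```cpp
#include <cstdint>

// Portable count of leading zero bits in a 32-bit word (returns 32 for 0).
inline int clz32(uint32_t x)
{
    if (x == 0)
        return 32;
    int n = 0;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        ++n;
    }
    return n;
}

// Hypothetical fast check: accept if the hash word has at least as many
// leading zeros as the target word.
inline bool accept_by_clz(uint32_t hash, uint32_t target)
{
    return clz32(hash) >= clz32(target);
}

// Exact check: accept only if the hash word does not exceed the target word.
inline bool accept_exact(uint32_t hash, uint32_t target)
{
    return hash <= target;
}
```

With a hash word of 0xF0 and a target word of 0xE0, both values have 24 leading zeros, so the clz-based check accepts while the exact compare rejects.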
Tanguy Pruvot
91eea0d76b
blake: remove int cudaMemcpyToSymbol for MSVC
...
use clz (leading zeros) asm func for a fast gpu compare of ptarget[6]:[7]
also add the missing windows ctz/clz host functions
New NEOS speed: 227MH to 270MH (Gigabyte 750Ti Black Edition)
2014-09-13 17:31:01 +02:00
Tanguy Pruvot
c3eb66683a
Import djm34 qubit, deep and doom algos
...
Indent, and put commonly used function prototypes in cuda_helper.h
And add them to the --cputest function
Also change the color option to --nocolor, -C is no longer needed
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
(Who is tired of removing these German copy/pasted comments)
2014-09-10 00:26:55 +02:00
Tanguy Pruvot
13bb9d267e
Remove debug rpc, already exists with -P
2014-09-09 21:59:03 +02:00
Tanguy Pruvot
64e8cd3f98
add x17 algo, cleaned djm34 commit
...
todo: visual studio...
2014-08-23 22:44:17 +02:00
Tanguy Pruvot
3f6ebc10cc
whirlpool: x64 asm is very slow (30ms win32 vs 90)
2014-08-22 04:09:16 +02:00
Tanguy Pruvot
912ef1215d
small reg tunes, rename whirlcoin to whirl
2014-08-21 02:57:10 +02:00
Tanguy Pruvot
1fbcbbacc4
Add whirlcoin and optimize x11 luffa (maxrregcount)
2014-08-20 07:49:22 +02:00
Tanguy Pruvot
4bc23048b5
x15: use djm34 code with asm xor64 + my rot64
...
some optimizations could be done later, after whirlcoin integration
2014-08-20 05:54:47 +02:00
Tanguy Pruvot
d9ea5f72ce
Remove duplicated defines present in cuda_helper.h
...
also add cudaDeviceReset() on Ctrl+C for nvprof
2014-08-19 03:29:11 +02:00
Tanguy Pruvot
a9a3ad8afc
cuda: check for errors on cuda mem alloc
2014-08-17 22:41:05 +02:00
Christian Buchner
f22ae4ebde
forgot this file in previous commit
2014-05-03 21:09:43 +02:00