Tanguy Pruvot
2d98d127f8
groestl: enhance sp andmask optimisation
...
profile of quark_groestl512_gpu_hash_64_quad()
before: 35.692ms
sp : 35.151ms
new: 35.061ms
2014-11-09 00:20:39 +01:00
Tanguy Pruvot
e7beac6b1c
x11: tiny sp_ opt on jh512 (0.05ms)
...
slightly modified (and removed the mixed DOS line endings, ^M)
also removed the max reg count, now determined with __launch_bounds__
2014-11-09 00:20:39 +01:00
Tanguy Pruvot
4c3964539f
Fix vc debug builds, missing symbols
2014-11-06 17:42:01 +01:00
sp-hash
5be6811dcf
x11: echo and cubehash optimization
...
echo : 40.056ms -> 39.241ms
cube : 14.490ms -> 13.511ms
the cubehash change looks useless (__device__ code is generally inlined)
but measurements show that the CUDA documentation is wrong...
tpruvot: fixed DOS line endings in echo,
and used my style for CUDA function attributes
2014-11-06 15:17:26 +01:00
Tanguy Pruvot
12fafd5687
Try to reconnect on pool duplicates
...
reduce log announcements and define uchar in miner.h
2014-11-04 15:14:24 +01:00
Tanguy Pruvot
5e8ff5226b
update curl prebuilt libs to a light 7.38.0
...
curl built from the tpruvot/curl-for-windows project with the HTTP_ONLY define
This project doesn't require SSH, LDAP and all the other internet protocols ;)
Removes 200KB from the final binaries
2014-11-04 14:47:28 +01:00
Tanguy Pruvot
187e293f71
blake: some fine tuning + cleanup
2014-11-03 20:55:03 +01:00
Tanguy Pruvot
5bc969fa57
Some work on data alignment
...
linux: add -march=native (we build it ourselves) and some other flags
+ remove unused vars (seen with -Wall)
2014-11-03 16:40:13 +01:00
Tanguy Pruvot
93bb428bdf
blake: rewrite the cache system
...
Unlike other hash algos, blake256 computes the hash
in blocks of 64 bytes.
The first part can be done on the cpu; only the last 4 int32
are computed on the gpu (including the tested nonce).
The previous method also used this kind of cache, with a crc.
Blake Hash Speed: +5%
2014-11-03 16:33:59 +01:00
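Note on 93bb428bdf: a minimal sketch of the caching idea described in that message, written in Python for brevity. toy_compress is a stand-in and NOT blake256; the 80-byte header, the zero padding and every function name here are assumptions made for illustration. The sketch only shows that the state after the first 64-byte block can be computed once and then reused for every tested nonce.

```python
import struct

BLOCK = 64          # bytes per compression block, as stated for blake256
IV = 0x811C9DC5     # arbitrary start value for the toy function (assumption)


def toy_compress(state: int, block: bytes) -> int:
    """Stand-in compression function (NOT blake256): mixes 32-bit words."""
    for (word,) in struct.iter_unpack("<I", block):
        state = ((state * 0x01000193) ^ word) & 0xFFFFFFFF
    return state


def pad(tail: bytes) -> bytes:
    """Hypothetical zero padding of the last partial block."""
    return tail.ljust(BLOCK, b"\x00")


def full_hash(header: bytes) -> int:
    """Reference path: process both blocks of an 80-byte header."""
    state = toy_compress(IV, header[:BLOCK])
    return toy_compress(state, pad(header[BLOCK:]))


def precompute_midstate(header: bytes) -> int:
    """'cpu' part: state after the first 64 bytes, independent of the nonce."""
    return toy_compress(IV, header[:BLOCK])


def hash_with_nonce(midstate: int, tail12: bytes, nonce: int) -> int:
    """'gpu' part: only the last 4 int32 (3 fixed words + the nonce)."""
    tail = tail12 + struct.pack("<I", nonce)
    return toy_compress(midstate, pad(tail))
```

Usage: compute precompute_midstate(header) once per job, then call hash_with_nonce(midstate, header[64:76], nonce) for each candidate nonce; the result matches full_hash() on the complete header.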
Tanguy Pruvot
b191d713a0
s3: reduce a bit the intensity on windows
2014-10-26 11:18:59 +01:00
Tanguy Pruvot
f7849d36a1
Update README for 1.4.6
2014-10-26 09:43:32 +01:00
Tanguy Pruvot
6169bf683b
Add S3 Algo (1Coin)
...
Simple addition of the algo using existing X11 code
2014-10-26 09:10:58 +01:00
Tanguy Pruvot
93f4409dde
simd: then reindent the code
...
no changes, only error checks (cuda safe call)
2014-10-25 23:03:20 +02:00
Tanguy Pruvot
b465fe6825
optimize x11 simd512 (+100KH/s)
...
change picked from tsiv repo
2014-10-25 22:15:43 +02:00
Tanguy Pruvot
1b241df5c0
cubehash and luffa funnel shift (from klaus)
...
No gain... but I like this define, more readable in luffa ;)
2014-10-20 19:06:27 +02:00
Tanguy Pruvot
2de9b1375b
prepare next version
2014-10-20 19:00:44 +02:00
Tanguy Pruvot
7bdebdb5ff
README fixes
2014-10-20 06:34:57 +02:00
Tanguy Pruvot
db8681c1db
update readme and fix SM 3.0 build
2014-10-20 06:27:02 +02:00
Tanguy Pruvot
f737f7f0cb
Fix usage and big strings on windows (colors rel.)
...
vsnprintf doesn't return the length on Windows on failure, so use _vscprintf
2014-10-20 05:39:48 +02:00
Tanguy Pruvot
1ee1462011
msvc: fix the LTCG warning
2014-10-20 05:39:44 +02:00
Tanguy Pruvot
d8a23fa970
Tune quark part of Xn funcs
...
based on klaus commits, slightly increases the speed of most algos
PS: the main increase is due to the register count tuning in the Makefile
and for skein512 on linux, it's the ROTL64
but almost no change on X11: 2648MH/s vs 2630 before
2014-10-20 03:15:17 +02:00
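Note on d8a23fa970: for readers unfamiliar with the name, ROTL64 is a 64-bit rotate left. The sketch below only shows what such a rotation computes, in Python; it says nothing about how the CUDA macro was tuned in that commit, and the function name is an assumption.

```python
MASK64 = (1 << 64) - 1


def rotl64(x: int, n: int) -> int:
    """Rotate the 64-bit value x left by n bits (n taken modulo 64)."""
    n &= 63
    x &= MASK64
    # Bits shifted out on the left re-enter on the right.
    return ((x << n) | (x >> (64 - n))) & MASK64
```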
Tanguy Pruvot
0720797f1b
Add proper keccak-256 (maxcoin)
...
Cleaned from djm34 repo, tuned for the 750 Ti
2014-10-17 06:46:20 +02:00
Tanguy Pruvot
cdc29336f7
stats: compute work difficulty from target
2014-09-30 10:03:12 +02:00
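Note on cdc29336f7: a minimal sketch of one common way to derive a difficulty from a 256-bit target, in Python. It assumes the Bitcoin-style "difficulty 1" target (0xFFFF shifted left by 208 bits) and a target stored as 8 little-endian 32-bit words; the miner may use a different base value per algo, and the function names are hypothetical.

```python
# Assumed "difficulty 1" target (Bitcoin convention), not taken from the commit.
DIFF1_TARGET = 0xFFFF << 208


def words_to_int(words):
    """Combine 8 uint32 words (index 0 = least significant) into one integer."""
    return sum((w & 0xFFFFFFFF) << (32 * i) for i, w in enumerate(words))


def target_to_difficulty(target: int) -> float:
    """A smaller target means a higher difficulty."""
    if target <= 0:
        raise ValueError("target must be positive")
    return DIFF1_TARGET / target
```

Usage: target_to_difficulty(words_to_int(ptarget)) for a target given as a list of 8 words.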
Tanguy Pruvot
9f3c6b0520
Include windows curl and openssl prebuilt libs
...
Curl 7.35 without SSH2
OpenSSL 1.0.1e
ZLib 1.2.8
built with https://github.com/peters/curl-for-windows
2014-09-30 06:25:38 +02:00
Tanguy Pruvot
4f326576d2
implement X-Mining-Hashrate header
...
remove midstate extension, seems only used in sha256/scrypt
and prepare noncerange, need a pool which supports that to finish...
2014-09-29 08:24:12 +02:00
Tanguy Pruvot
799b230af2
enhance solo mining, update http headers
...
and prepare next version...
2014-09-28 15:34:44 +02:00
Tanguy Pruvot
c0b5513316
Try some obscure cuda flags (kbomba)
...
http://devblogs.nvidia.com/parallelforall/separate-compilation-linking-cuda-device-code/
2014-09-27 13:58:29 +02:00
Tanguy Pruvot
a6fcc8fdb6
use cudart_static.lib, keep SM 5.0 by default
...
SM 5.2 works also on the 750 Ti but if we specify both at compile time,
hash speed will be reduced (the 750Ti will use 5.2 which is not optimal)
2014-09-27 12:53:19 +02:00
Tanguy Pruvot
5579b91cfb
build for both GM107 and GM204
...
For the GTX 750 and new 970/980
also fix -a luffa parameter for 1.4.4 release
2014-09-27 09:46:52 +02:00
Tanguy Pruvot
ba33492592
blake: return to ptarget 6:7 compare
...
clz can give wrong results, e.g. 0xE0 vs 0xF0
2014-09-19 05:01:16 +02:00
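Note on ba33492592: a small sketch of why a leading-zero count cannot replace a value compare, using the 0xE0 vs 0xF0 case from that message. The exact form of the clz test in the kernel is not shown in the log, so passes_by_clz is an assumed reconstruction, written in Python.

```python
def clz64(x: int) -> int:
    """Number of leading zero bits of a 64-bit value (64 for x == 0)."""
    return 64 - (x & 0xFFFFFFFFFFFFFFFF).bit_length()


def passes_by_clz(hash_hi: int, target_hi: int) -> bool:
    """Assumed fast test: the hash has at least as many leading zeros."""
    return clz64(hash_hi) >= clz64(target_hi)


def passes_by_value(hash_hi: int, target_hi: int) -> bool:
    """Exact test on the same 64 bits."""
    return hash_hi <= target_hi


# 0xF0... and 0xE0... both have zero leading zeros, but 0xF0... > 0xE0...
hash_hi = 0xF0 << 56
target_hi = 0xE0 << 56
```

With these values the clz test accepts the hash while the value compare rejects it, which is the kind of error the commit message refers to.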
Tanguy Pruvot
91eea0d76b
blake: remove int cudaMemcpyToSymbol for MSVC
...
use clz (leading zeros) asm func for a fast gpu compare of ptarget[6]:[7]
also add the missing windows ctz/clz host functions
New NEOS speed: 227MH to 270MH (Gigabyte 750Ti Black Edition)
2014-09-13 17:31:01 +02:00
Tanguy Pruvot
9efe0b965d
blake: only use high part of target on gpu
...
Add another few MH/s boost :)
2014-09-13 00:15:34 +02:00
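Note on 9efe0b965d: a sketch of why comparing only the high part of the target works as a filter, in Python. It assumes the candidate is checked again against the full 256-bit target afterwards; that second check and the function names are assumptions, not taken from the commit.

```python
def high64(value256: int) -> int:
    """Top 64 bits of a 256-bit integer."""
    return value256 >> 192


def gpu_candidate(hash256: int, target256: int) -> bool:
    """Cheap filter: compare the high 64 bits only."""
    return high64(hash256) <= high64(target256)


def full_check(hash256: int, target256: int) -> bool:
    """Exact test on all 256 bits (assumed to run on the host)."""
    return hash256 <= target256
```

Any hash that passes full_check also passes gpu_candidate, so the filter never drops a valid share; it can only let through borderline candidates whose high 64 bits equal those of the target.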
Tanguy Pruvot
cc296a0618
stratum: check if job was read
2014-09-13 00:15:25 +02:00
Tanguy Pruvot
8925a7551f
blake: final cleanup (225MH/s)
2014-09-11 20:16:16 +02:00
Tanguy Pruvot
347d4e4928
blake: +8MH/s on linux, weird optimisation
...
Like doom/luffa, using an int pos makes the proc faster
2014-09-11 02:33:34 +02:00
Tanguy Pruvot
23f0cee61f
Add cuda error checks on qubit algos
...
And rename doom to luffa, like djm34
2014-09-11 02:20:52 +02:00
Tanguy Pruvot
1aec4555cc
Tune reg. count for qubit (luffa) algos
2014-09-11 00:50:27 +02:00
Tanguy Pruvot
31f77b6524
Put block height extraction in a function
2014-09-10 16:50:17 +02:00
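Note on 31f77b6524: the log does not say how the height is extracted. The sketch below assumes the BIP34 convention, where the coinbase scriptSig starts with a push of the block height as a little-endian integer; the function name and the input layout are hypothetical, and heights encoded as small-number opcodes are not handled.

```python
def height_from_script_sig(script_sig: bytes) -> int:
    """Read a BIP34-style height: one length byte, then that many LE bytes."""
    if not script_sig:
        raise ValueError("empty scriptSig")
    n = script_sig[0]
    if not 1 <= n <= 8 or len(script_sig) < 1 + n:
        raise ValueError("unexpected height push")
    return int.from_bytes(script_sig[1:1 + n], "little")
```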
Tanguy Pruvot
edf756deb5
update readme
2014-09-10 10:49:41 +02:00
Tanguy Pruvot
80d6e09ca6
Merge branch 'qubit'
2014-09-10 00:31:07 +02:00
Tanguy Pruvot
402e70f636
Update VS Project
2014-09-10 00:27:01 +02:00
Tanguy Pruvot
7cc5222394
Move common check_cpu functions to root
2014-09-10 00:27:01 +02:00
Tanguy Pruvot
c3eb66683a
Import djm34 qubit, deep and doom algos
...
Indent, and put prototypes of commonly used functions in cuda_helper.h
And add them to the --cputest function
Also change the color option to --nocolor, -C is no longer needed
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
(Who is tired of removing these German copy/pasted comments)
2014-09-10 00:26:55 +02:00
Tanguy Pruvot
474ee97d6e
Merge branch 'blake-dev' into blake
2014-09-09 22:02:19 +02:00
Tanguy Pruvot
429266346c
Prepare version 1.4.2
2014-09-09 21:59:03 +02:00
Tanguy Pruvot
13bb9d267e
Remove debug rpc, already exists with -P
2014-09-09 21:59:03 +02:00
Tanguy Pruvot
9e5ec398b2
Purge anti-dup data on target change
2014-09-09 21:59:03 +02:00
Tanguy Pruvot
cec5baea95
enable colors by default, except for syslog
...
debug: show compared hash diffs in color
2014-09-09 21:59:03 +02:00
Tanguy Pruvot
3ed36f285b
try to prevent gpu pauses
2014-09-09 21:59:03 +02:00
Tanguy Pruvot
402e416853
Add pentablake algo (-a penta)
...
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
2014-09-09 21:58:58 +02:00