The core problem was the cuda hefty Thread per block set to high
but took me several hours to find that...
btw... +25% in heavy 12500 with 256 threads per block... vs 128 & 512
if max reg count is set to 80...
Was maybe my fault, but the benchmark mode was
always recomputing from nonce 0.
Also fix blake if -d 1 is used (one thread but second gpu)
stats: do not use thread id as key, prefer gpu id...
Unlike other hash algos, blake256 compute the hash
with blocks of 64 bytes.
We can do the first part on the cpu, only the 4 last int32
are computed on gpu (including the tested nonce)
Previous method was also using this kind of cache with a crc.
Blake Hash Speed: +5%
use clz (leading zeros) asm func for a fast gpu compare of ptarget[6]:[7]
add also missing windows ctz/clz host functions
New NEOS speed: 227MH to 270MH (Gigabyte 750Ti Black Edition)
Blake256: squashed commit...
Squashed commit of the following:
commit c370208bc92ef16557f66e5391faf2b1ad47726f
Author: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Date: Wed Sep 3 13:53:01 2014 +0200
hashlog: prepare store of scanned range
commit e2cf49a5e956f03deafd266d1a0dd087a2041c99
Author: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Date: Wed Sep 3 12:54:13 2014 +0200
stratum: store server time offset in context
commit 1a4391d7ff
Author: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Date: Tue Sep 2 12:40:52 2014 +0200
hashlog: prevent double computing on jobs already done
commit 049e577301
Author: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Date: Wed Sep 3 09:49:14 2014 +0200
tmp blake log
commit 43d3e93e1a
Author: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Date: Wed Sep 3 09:29:51 2014 +0200
blake: set a max throughput
commit 7e595a36ea
Author: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Date: Tue Sep 2 21:13:37 2014 +0200
blake: cleanup, remove d_hash buf, not in a chain
host: only bencode if gpu hash was found
commit de80c7e9d1
Author: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Date: Tue Sep 2 12:40:44 2014 +0200
blake: remove unused parameter and fix index in d_hash
that reduce the speed to 92MH/s but the next commit
give us 30 more
so, todo: merge the whole checkhash proc in gpu_hash
and remove this d_hash buffer...
commit 2d42ae6de5
Author: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Date: Tue Sep 2 05:09:31 2014 +0200
stratum: handle a small cache of submitted jobs
Prevent to send duplicated shares on some pools like hashharder..
This cache keeps submitted job/nounces of the last 15 minutes
so, remove exit on repeated duplicate shares,
the submitted cache now handles this problem.
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
commit 1b8c3c12fa
Author: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Date: Tue Sep 2 03:38:57 2014 +0200
debug: a new boolean to log or not json rpc data
commit 1f99aae0ff
Author: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Date: Mon Sep 1 18:49:23 2014 +0200
exit on repeated duplicate shares (to enhance)
create a new function proper_exit() to do common stuff on exit...
commit 530732458a
Author: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Date: Mon Sep 1 12:22:51 2014 +0200
blake: use a constant for threads, reduce mallocated d_hash size
and clean a bit more...
commit 0aeac878ef
Author: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Date: Mon Sep 1 06:12:55 2014 +0200
blake: tune up and cleanup, ~100 MH/s on a normal 750Ti
tested on linux and windows (x86 binary)...
but there is a high number of duplicated shares... weird
commit 4a52d0553b
Author: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Date: Mon Sep 1 10:22:32 2014 +0200
debug: show json methods, hide hash/target if ok
commit 1fb9becc1f
Author: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Date: Mon Sep 1 08:44:19 2014 +0200
cpu-miner: sort algos by name, show reject reason
commit bfe96c49b0
Author: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Date: Mon Aug 25 11:21:06 2014 +0200
release 1.4, update README...
commit c17d11e377
Author: Tanguy Pruvot <tanguy.pruvot@gmail.com>
Date: Sun Aug 31 08:57:48 2014 +0200
add "blake" 256, 14 rounds (for NEOS blake, not BlakeCoin)
also remove "missing" file, its old and not compatible with ubuntu 14.04
to test on windows
blake: clean and optimize
Release v1.4 with blake (NEOS)
that reduce the speed to 92MH/s but the next commit
give us 30 more
so, todo: merge the whole checkhash proc in gpu_hash
and remove this d_hash buffer...