Tanguy Pruvot
5bf1f98200
various fixes for SM 2.1 and the benchmark
...
X11+ algos and quark are not compatible with SM 2.1 for the moment,
but these ones are:
Benchmark results for Gigabyte GTX 460 (SM 2.1 / 1 GB):
blakecoin : 159090.5 kH/s, 1 MB, 1048576 thr.
blake : 70208.9 kH/s, 1 MB, 1048576 thr.
bmw : 122802.6 kH/s, 65 MB, 2097152 thr.
deep : 3533.6 kH/s, 33 MB, 524288 thr.
fugue256 : 43177.9 kH/s, 17 MB, 524288 thr.
heavy : 4118.2 kH/s, 147 MB, 524032 thr.
keccak : 18673.1 kH/s, 129 MB, 2097152 thr.
luffa : 28816.0 kH/s, 257 MB, 4194304 thr.
lyra2 : 213.7 kH/s, 570 MB, 65536 thr.
mjollnir : 3895.6 kH/s, 147 MB, 524032 thr.
nist5 : 1101.4 kH/s, 67 MB, 1048576 thr.
penta : 501.6 kH/s, 21 MB, 327680 thr.
skein : 5432.4 kH/s, 65 MB, 1048576 thr.
skein2 : 6788.9 kH/s, 33 MB, 524288 thr.
whirlpool : 688.5 kH/s, 33 MB, 524288 thr.
zr5 : 122.5 kH/s, 86 MB, 262144 thr.
9 years ago
Tanguy Pruvot
d195f2e8a2
intensity: do not reduce throughput before init
...
Else the memory allocated could be less than required later
btw, use the new "cuda" function to apply intensity/throughput
9 years ago
Tanguy Pruvot
c6dcc5e5cf
benchmark: show mem and default throughput in results
...
and prepare a new function to get the default intensity
also, take care of multiple threads per gpu...
9 years ago
Tanguy Pruvot
8db5a0bc9e
blake: change dynamic round system
...
blakecoin was conflicting with lyra2, set the rounds more properly
9 years ago
Tanguy Pruvot
5f12943de5
whirlpool: add algo free function + vstudio
9 years ago
Tanguy Pruvot
b641bfdf8b
diff: rename functions like cpuminer-multi
...
more proper, intuitive...
9 years ago
Tanguy Pruvot
e1c4b3042c
algos: add functions to free allocated resources
...
Will be used later for algo switching
not really tested yet...
9 years ago
Tanguy Pruvot
5308898d1c
start v1.7, apply new prototypes to all algos
9 years ago
Tanguy Pruvot
3d3f2e2cb5
warnings: use the right device id (device_map[thr_id])
10 years ago
Tanguy Pruvot
e6112e878d
cleanup: use unsigned throughput parameters
...
Yes, it's a big commit, I was waiting for 1.6 to do that...
Sorry for your possible merge issues ;)
10 years ago
Tanguy Pruvot
26b51a557b
Allow different intensity per device
...
and clean the old variables, no longer required
10 years ago
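A minimal sketch of what per-device intensity could look like, assuming throughput is 2^intensity (consistent with the power-of-two thread counts in the benchmark output above). The names `gpu_intensity` and `throughput_for` are hypothetical and not taken from the actual code:

```c
#include <stdint.h>

#define MAX_GPUS 16  /* assumption: matches the 16-gpu limit mentioned in this log */

/* Hypothetical per-device table; 0 means "not set by the user". */
static uint32_t gpu_intensity[MAX_GPUS];

/* Return the number of threads to launch for a device.
 * Falls back to the algo default when no intensity was set,
 * and uses an unsigned shift so that intensity 31 still fits. */
static uint32_t throughput_for(int dev_id, uint32_t default_intensity)
{
    uint32_t i = gpu_intensity[dev_id] ? gpu_intensity[dev_id] : default_intensity;
    if (i > 31)
        i = 31; /* 1U << 32 would be undefined on a 32-bit value */
    return 1U << i;
}
```

With an unsigned result, an intensity of 31 gives 2147483648 threads, which a signed 32-bit int could not hold.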
Tanguy Pruvot
2a5233f56e
api: report throughput when default
10 years ago
Tanguy Pruvot
cafd4477d7
Handle a maximum of 16 gpus (vs 8 before)
...
Some cards have 2 gpus on board...
10 years ago
Tanguy Pruvot
a66d78e692
reduce lyra2 blake and pentablake cpu load
10 years ago
Tanguy Pruvot
49a73971c4
Enhance stale work detection + throughput fixes
...
seems to resolve solo mining lock on share.
export also computed solo work diff in api (not perfect)
In high rate algos, throughput should be unsigned...
This fixes keccak, blake and doom problems
And change terminal color of debug lines, to be selectable in putty,
color code is not supported in windows but selection is ok there.
10 years ago
Tanguy Pruvot
c5b349e079
Add Lyra2 algo, based on Vertcoin published code
...
Seems to be djm34's work, I recognize the code style ;)
Code was cleaned/indented and adapted to my fork...
Only usable on the test pool until 16 december 2014!
10 years ago
Tanguy Pruvot
c3bdb623e8
Check and submit multiple nonces in one loop
...
Added to most algos, checkhash function scans a big range
and can find multiple nonces at once if the difficulty is low.
Stop ignoring them, submit second one if found...
Clean the draft code for rc=2 implemented for blake and pentablake
btw... fix the reduced displayed hashrate when a nonce is found...
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
10 years ago
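The idea of keeping a second nonce found in the same scan can be sketched as follows. This is an illustration only: `toy_hash` is a hypothetical stand-in for a real hash function, and `scan_range` is not the project's checkhash function.

```c
#include <stdint.h>

/* Hypothetical stand-in for a real hash: a cheap 32-bit integer mixer. */
static uint32_t toy_hash(uint32_t nonce)
{
    uint32_t x = nonce;
    x ^= x >> 16;
    x *= 0x7feb352dU;
    x ^= x >> 15;
    x *= 0x846ca68bU;
    x ^= x >> 16;
    return x;
}

/* Scan [first, last] and store up to two nonces whose hash is <= target.
 * Returns the number of nonces stored (0, 1 or 2). */
static int scan_range(uint32_t first, uint32_t last, uint32_t target,
                      uint32_t found[2])
{
    int count = 0;
    /* 64-bit counter so the loop ends even when last == UINT32_MAX */
    for (uint64_t n = first; n <= last; n++) {
        if (toy_hash((uint32_t)n) <= target) {
            found[count++] = (uint32_t)n;
            if (count == 2)
                break;
        }
    }
    return count;
}
```

A caller would submit `found[0]` when the return value is at least 1, and also `found[1]` when it is 2, instead of dropping the second result.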
Tanguy Pruvot
56ed0fed05
blake: remove hashharder workaround
10 years ago
Tanguy Pruvot
8ad180cc70
various small changes
...
heavy: reduce by 256 threads default intensity to all -i 20
cuda: put static thread init bools outside the code (made once)
api: fix nvml header to build without
10 years ago
Tanguy Pruvot
1b65cd05cc
heavy: add error checks, fix strict aliasing and linux
...
The core problem was the cuda hefty threads per block, set too high,
but it took me several hours to find that...
btw... +25% in heavy 12500 with 256 threads per block... vs 128 & 512
if max reg count is set to 80...
10 years ago
Tanguy Pruvot
6ae28162db
various extern cleanup + api history uids and gpu SM
...
uids could be useful to create graphs from history data
Note: please do a clean build after this commit (changes in miner.h)
10 years ago
Tanguy Pruvot
73f22b237a
Prepare trap of hardware/mem failures
10 years ago
Tanguy Pruvot
b4ef7b981f
scan range: add boundary check, can't be > UINT32_MAX
10 years ago
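One way such a boundary check could be written, as a sketch; `clamp_range_end` is a hypothetical helper, not a function from the code base:

```c
#include <stdint.h>

/* Return the last nonce of a scan starting at 'start' and covering
 * 'count' nonces, clamped so it never wraps past UINT32_MAX. */
static uint32_t clamp_range_end(uint32_t start, uint64_t count)
{
    /* Compare against the remaining room instead of adding first,
     * so the check itself cannot overflow. */
    if (count > (uint64_t)(UINT32_MAX - start))
        return UINT32_MAX;
    return start + (uint32_t)count;
}
```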
Tanguy Pruvot
438308b3a2
Rework benchmark mode and min/max range
...
Was maybe my fault, but the benchmark mode was
always recomputing from nonce 0.
Also fix blake if -d 1 is used (one thread but second gpu)
stats: do not use thread id as key, prefer gpu id...
10 years ago
Tanguy Pruvot
b128312efb
cuda: store device SM in a global var
...
sample usage made for blake and fugue (higher intensity for SM5.2)
add these to cuda_helper and clean unused code
10 years ago
Tanguy Pruvot
11c5ec810d
Handle intensity param in all algos
...
and add a check related to start/max nonce params
10 years ago
Tanguy Pruvot
4c3964539f
Fix vc debug builds, missing symbols
10 years ago
Tanguy Pruvot
12fafd5687
Try to reconnect on pool duplicates
...
reduce log announces and define uchar in miner.h
10 years ago
Tanguy Pruvot
187e293f71
blake: some fine tuning + cleanup
10 years ago
Tanguy Pruvot
5bc969fa57
Some work on data alignment
...
linux: add -march=native (we build it ourselves) and some other flags
+ remove unused vars (seen with -Wall)
10 years ago
Tanguy Pruvot
93bb428bdf
blake: rewrite the cache system
...
Unlike other hash algos, blake256 computes the hash
with blocks of 64 bytes.
We can do the first part on the cpu, only the 4 last int32
are computed on gpu (including the tested nonce)
Previous method was also using this kind of cache with a crc.
Blake Hash Speed: +5%
10 years ago
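The split described above can be pictured with the data layout alone. This sketch assumes an 80-byte header (20 words of 32 bits) with the nonce in the last word; the header size, the type and the function names are assumptions for illustration, and the Blake compression itself is left out:

```c
#include <stdint.h>
#include <string.h>

#define HEADER_WORDS 20  /* assumption: 80-byte block header */
#define BLOCK_WORDS  16  /* one 64-byte blake256 block */

typedef struct {
    uint32_t head[BLOCK_WORDS];                /* hashed once per work, on the cpu */
    uint32_t tail[HEADER_WORDS - BLOCK_WORDS]; /* 4 words left for the gpu */
} split_header_t;

/* Separate the constant first block from the 4 trailing words. */
static void split_header(const uint32_t header[HEADER_WORDS], split_header_t *out)
{
    memcpy(out->head, header, sizeof(out->head));
    memcpy(out->tail, header + BLOCK_WORDS, sizeof(out->tail));
}

/* Only the last tail word changes between two tested nonces. */
static void set_nonce(split_header_t *h, uint32_t nonce)
{
    h->tail[HEADER_WORDS - BLOCK_WORDS - 1] = nonce;
}
```

Since `head` does not depend on the nonce, its compressed state can be computed once and reused for every nonce of the same work.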
Tanguy Pruvot
ba33492592
blake: return to ptarget 6:7 compare
...
clz can be erroneous, ex 0xE0 vs 0xF0
10 years ago
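The 0xE0 vs 0xF0 case can be reproduced with a small sketch. Both values have the same number of leading zeros, so a compare based only on that count accepts a hash that is above the target. The helper names are hypothetical and `clz32` is a portable loop, not the gpu asm function mentioned in the next entry:

```c
#include <stdint.h>

/* Portable count of leading zero bits in a 32-bit word. */
static int clz32(uint32_t x)
{
    int n = 0;
    if (x == 0)
        return 32;
    while ((x & 0x80000000U) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Approximate test: only looks at the position of the highest set bit. */
static int passes_by_clz(uint32_t hash_word, uint32_t target_word)
{
    return clz32(hash_word) >= clz32(target_word);
}

/* Exact test: full value compare. */
static int passes_by_value(uint32_t hash_word, uint32_t target_word)
{
    return hash_word <= target_word;
}
```

With a target word of 0xE0 and a hash word of 0xF0, `passes_by_clz` accepts the hash while `passes_by_value` rejects it.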
Tanguy Pruvot
91eea0d76b
blake: remove int cudaMemcpyToSymbol for MSVC
...
use clz (leading zeros) asm func for a fast gpu compare of ptarget[6]:[7]
add also missing windows ctz/clz host functions
New NEOS speed: 227MH to 270MH (Gigabyte 750Ti Black Edition)
10 years ago
Tanguy Pruvot
9efe0b965d
blake: only use high part of target on gpu
...
Add another few MH/s boost :)
10 years ago
Tanguy Pruvot
8925a7551f
blake: final cleanup (225MH/s)
10 years ago
Tanguy Pruvot
347d4e4928
blake: +8MH/s on linux, weird optimisation
...
Like doom/luffa, using an int pos makes the proc faster
10 years ago
Tanguy Pruvot
cec5baea95
enable colors by default, except for syslog
...
debug: show compared hash diffs in color
10 years ago
Tanguy Pruvot
402e416853
Add pentablake algo (-a penta)
...
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
10 years ago
Tanguy Pruvot
42eafcbe85
Put CRC-32 function in a new unit
...
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
10 years ago
Tanguy Pruvot
9140e7f8ad
Release 1.4.1, with blake cache (220MH/s)
10 years ago
Tanguy Pruvot
5ccd166916
blake: introduce pdata head cache (speed x2)
10 years ago
Tanguy Pruvot
65909ec3b7
blake: handle case when 2 hashes are found in a call
10 years ago
Tanguy Pruvot
383b184549
Add support for blakecoin (-a blakecoin)
...
Blakecoin uses an old variant of Blake 256.
Speed : 190 MHash/s (vs 25 in cudaMiner)
Restore support of this algo (was in cudaminer before)
10 years ago
Tanguy Pruvot
52ec8830b1
blake: blakecoin variant now works
10 years ago
Tanguy Pruvot
ecc86af102
blake: sometimes faster, or not
10 years ago
Tanguy Pruvot
3356e6f8bf
blake: some more KH/s on linux
10 years ago
Tanguy Pruvot
12fefe5de0
blake: add a few more MH/s, prepare blakecoin
10 years ago
Tanguy Pruvot
5682b7d241
blake: add also blakecoin (8-rounds) variant
10 years ago
Tanguy Pruvot
e1159629b4
blake: typo for windows on last commit
10 years ago
Tanguy Pruvot
e0487aac46
blake: typo for windows on last commit
10 years ago