Commit Graph

46 Commits

Author SHA1 Message Date
Tanguy Pruvot
113e22de2e blake: prevent empty scan ranges with multiple gpus
in some cases, an empty scan range was possible in benchmark..
2015-11-01 22:14:17 +01:00
Tanguy Pruvot
61ff92b5b4 never interrupt global benchmark with found nonces
fix some weird algo hashrates (like blake)
and reset device between algos, for better accuracy

but this reset doesn't seem to be enough to bench all algos correctly...

to test on linux, could be a driver issue...

heavy: fix first alloc and indent with tabs...
2015-11-01 21:12:50 +01:00
Tanguy Pruvot
355b835ae0 benchmark: enhance the mem leak detection
reduce "false" warnings, and ignore unrelated/small ones <= 1 MB

On windows the gpu memory can be allocated by other processes

+ some cleanup in algos... (free/gpulog)
2015-10-16 22:04:30 +02:00
Tanguy Pruvot
4868c412b0 windows: add support for SM 2.1, drop SM 3.5 (x86)
Mostly to do compatibility tests, SM 2.1 support is very limited

SM 3.0 code should run on SM 3.5 (only a few cards use this arch)

As I can't test SM 3.5, it's best to let users do their own tests...
2015-10-15 23:02:35 +02:00
Tanguy Pruvot
a7d54cd7ef blake: no need to fail on init, no big alloc 2015-10-15 20:10:58 +02:00
Tanguy Pruvot
6a9280a045 lyra2v2: set a better TPB for intensity 20 (sm52)
use sp forced unroll in skein and do some cleanup...
2015-10-15 02:01:34 +02:00
Tanguy Pruvot
5bf1f98200 various fixes for SM 2.1 and the benchmark
X11+ algos and quark are not compatible for the moment

but these ones are:

Benchmark results for Gigabyte GTX 460 (SM 2.1 / 1 GB):

   blakecoin :     159090.5 kH/s,     1 MB,  1048576 thr.
       blake :      70208.9 kH/s,     1 MB,  1048576 thr.
         bmw :     122802.6 kH/s,    65 MB,  2097152 thr.
        deep :       3533.6 kH/s,    33 MB,   524288 thr.
    fugue256 :      43177.9 kH/s,    17 MB,   524288 thr.
       heavy :       4118.2 kH/s,   147 MB,   524032 thr.
      keccak :      18673.1 kH/s,   129 MB,  2097152 thr.
       luffa :      28816.0 kH/s,   257 MB,  4194304 thr.
       lyra2 :        213.7 kH/s,   570 MB,    65536 thr.
    mjollnir :       3895.6 kH/s,   147 MB,   524032 thr.
       nist5 :       1101.4 kH/s,    67 MB,  1048576 thr.
       penta :        501.6 kH/s,    21 MB,   327680 thr.
       skein :       5432.4 kH/s,    65 MB,  1048576 thr.
      skein2 :       6788.9 kH/s,    33 MB,   524288 thr.
   whirlpool :        688.5 kH/s,    33 MB,   524288 thr.
         zr5 :        122.5 kH/s,    86 MB,   262144 thr.
2015-10-14 02:59:54 +00:00
Tanguy Pruvot
fc84c719e9 lyra2: improve cuda implementation (part 1, SM5+)
based on the new djm34 method, 2x faster than first version

cleaned and tuned for the GTX 750/960 (linux / cuda 6.5)
2015-10-13 00:57:29 +02:00
Tanguy Pruvot
d195f2e8a2 intensity: do not reduce throughput before init
Else the memory allocated could be less than required later

btw, use the new "cuda" function to apply intensity/throughput
2015-10-11 05:01:41 +02:00
Tanguy Pruvot
c6dcc5e5cf benchmark: show mem and default throughput in results
and prepare a new function to get the default intensity

also, take care of multiple threads per gpu...
2015-10-11 04:38:28 +02:00
Tanguy Pruvot
8db5a0bc9e blake: change dynamic round system
blakecoin was conflicting with lyra2, set the rounds more properly
2015-10-11 03:46:30 +02:00
Tanguy Pruvot
c2214091ae benchmark: free last memory leaks on algo switch
remains my original lyra2 implementation to fix... (cuda_lyra2.cu)

I guess some kind of memory overflow forces the driver to allocate
memory... but I was unable to free it without a device reset.
2015-10-10 02:15:32 +02:00
Tanguy Pruvot
4e1e03b891 benchmark: store all algos results + cuda fixes
Note: lyra2, lyra2v2 and scrypt seem to have problems
to coexist with other algos... to run after some of them...

moved lyra2 first and skip scrypt/jane for the moment...

Only stored in memory for now.. to display a table after the bench

ccminer -a auto --benchmark

Results may be exported later to a json file...
2015-10-09 02:07:08 +02:00
Tanguy Pruvot
922c2a5cd7 algos: free allocated mem for algo switch
All can be freed properly now, except scrypt (reset) and lyra2 (leak)
2015-10-08 21:35:30 +02:00
Tanguy Pruvot
ee93927fac diff: use the new function in all algos 2015-10-07 20:10:15 +02:00
Tanguy Pruvot
5f12943de5 whirlpool: add algo free function + vstudio 2015-10-06 23:53:03 +02:00
Tanguy Pruvot
b641bfdf8b diff: rename functions like cpuminer-multi
more proper, intuitive...
2015-10-06 23:37:13 +02:00
Tanguy Pruvot
e1c4b3042c algos: add functions to free allocated resources
Will be used later for algo switching

not really tested yet...
2015-09-25 07:51:57 +02:00
Tanguy Pruvot
5308898d1c start v1.7, apply new prototypes to all algos 2015-09-23 15:42:17 +02:00
Tanguy Pruvot
8f98bde4fb lyra2v2: improve cubehash with uint2 2015-09-06 13:49:52 +02:00
Tanguy Pruvot
21f5435420 lyra2: improve skein256 component 2015-08-23 09:46:48 +02:00
Tanguy Pruvot
01f3183c31 bmw algo for MDT, with midstate
which could be extracted from json too

replace a satcoin by another one ;)

Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
2015-08-22 15:01:51 +02:00
Tanguy Pruvot
b256ca47a0 bmw256: reduce target array size 2015-08-22 12:30:07 +02:00
Tanguy Pruvot
53cd591956 lyra2v2, bmw256 and cubehash256 cleanup + diff fix 2015-08-18 11:10:58 +02:00
Tanguy Pruvot
d4e191610e Import and adapt lyra2v2
not tested on windows and with SM <= 5
2015-08-18 09:27:11 +02:00
Tanguy Pruvot
b02f79b58b lyra2: recover the kH/s lost in last commit 2015-06-06 00:25:04 +00:00
Tanguy Pruvot
2b43d57d42 lyra2: simplify skein code (no perf changes) 2015-06-05 23:32:43 +02:00
Tanguy Pruvot
e95712a2ea lyra2: reduce blake message len. 2015-06-05 22:40:29 +02:00
Tanguy Pruvot
2f541065fb cuda_helper: rename correctly hiword/loword functions 2015-05-12 17:13:58 +02:00
Tanguy Pruvot
03c3b7d341 Various algos cleanup + lyra2 sec nonce fix 2015-05-10 18:49:22 +02:00
Tanguy Pruvot
34fd408440 lyra2: get a second nonce per gpu scan 2015-05-10 03:20:13 +02:00
Tanguy Pruvot
3d3f2e2cb5 warnings: use the right device id (device_map[thr_id]) 2015-04-23 09:41:56 +02:00
Tanguy Pruvot
38e6672d70 Allow test of SM 2.1/3.0 binaries on newer cards
Implementation based on KlausT's work.. a bit different

This code must be placed in a common .cu file,
cuda.cpp is not compiled with nvcc and doesn't allow cuda code...
2015-03-28 12:00:53 +01:00
KlausT
ae8e863591 remove uint32_t cast 2015-03-12 01:01:47 +01:00
Tanguy Pruvot
77c737ff72 various small changes and update readme 2015-03-08 16:33:53 +01:00
Tanguy Pruvot
e6112e878d cleanup: use unsigned throughput parameters
Yes, it's a big commit, was waiting 1.6 to do that...
Sorry for your possible merge issues ;)
2015-02-28 14:05:09 +01:00
Tanguy Pruvot
26b51a557b Allow different intensity per device
and clean the old variables, no longer required
2015-01-24 11:17:29 +01:00
Tanguy Pruvot
9f2dd3ee60 Remove some useless conversions
does not impact perfs either...
2015-01-24 08:00:22 +01:00
Tanguy Pruvot
2a5233f56e api: report throughput when default 2015-01-22 06:28:59 +01:00
Tanguy Pruvot
cafd4477d7 Handle a maximum of 16 gpus (vs 8 before)
Some cards have 2 gpus on board...
2015-01-22 04:55:27 +01:00
Tanguy Pruvot
a66d78e692 reduce lyra2 blake and pentablake cpu load 2014-12-19 09:16:55 +01:00
Tanguy Pruvot
63e3387dbb lyra2: add sm30 device compat (skein256) 2014-12-16 14:19:07 +01:00
Tanguy Pruvot
fa7d744a6c lyra2: make_uint2 and set pool difficulty 2014-12-15 09:48:27 +01:00
Tanguy Pruvot
49a73971c4 Enhance stale work detection + throughput fixes
seems to resolve solo mining lock on share.
export also computed solo work diff in api (not perfect)

In high rate algos, throughput should be unsigned...
This fixes keccak, blake and doom problems

And change terminal color of debug lines, to be selectable in putty,
color code is not supported in windows but selection is ok there.
2014-12-07 12:58:41 +01:00
Tanguy Pruvot
ef8a73d6aa keccak: not compatible with second nonces (was broken)
Use djm34 new uint2 method to get a +40% boost (115 to 153MH/s)
2014-12-06 13:55:13 +01:00
Tanguy Pruvot
c5b349e079 Add Lyra2 algo, based on Vertcoin published code
Seems to be djm34's work, I recognize the code style ;)

Code was cleaned/indented and adapted to my fork...

Only usable on the test pool until 16 December 2014!
2014-12-06 11:28:26 +01:00