mirror of https://github.com/GOSTSec/ccminer synced 2025-01-12 07:48:33 +00:00
Commit Graph

81 Commits

Author SHA1 Message Date
Tanguy Pruvot
e388c11c02 blake2s fix and more missing cuda arch (for the benchmarks) 2017-03-08 13:13:52 +01:00
Tanguy Pruvot
2cdf2ddd43 Add missing real cuda arch checks 2017-03-08 09:19:10 +01:00
Tanguy Pruvot
c66e8622b3 api: report per thread cpu hash checks (ACC/REJ)
+ update all algos for that...
2017-02-07 06:26:02 +01:00
Tanguy Pruvot
6440a9bf41 windows: some default intensity adjustments 2017-01-30 02:31:44 +01:00
Tanguy Pruvot
b47d9acaf5 readme + small warnings detected by vstudio 2017-01-29 22:23:05 +01:00
Tanguy Pruvot
0ff75791e5 migrate 2nd nonce storage of most algos
This allows keeping pdata[19] as a cursor between scans and, later, sorting them.

remains... heavy, scrypt, sia...
2017-01-29 05:46:45 +01:00
Tanguy Pruvot
50534789bc Release 1.8.4 2016-12-21 20:35:09 +01:00
Tanguy Pruvot
44bd244fc4 blake2s improved
based on alexis's work, with the new work->nonces
2016-12-21 19:44:20 +01:00
Tanguy Pruvot
a43205a84f decred: multiple nonces code cleanup
The double loop is not useful, and prefer the __thread attribute
to enhance the code readability (remove the 2D host arrays).

squashed: return to host 2D array to allow the free
2016-09-27 22:50:52 +02:00
Tanguy Pruvot
9eead77027 diff: show by default, rework shares diff storage
This will allow later more gpu candidates.

Note: This is unfinished work; we keep the previous behavior for now.
To finish this, all algo solutions should be migrated and the submitted nonces' attributes stored.
This is required to handle the different share diff per nonce and to fix the possible solved count error (if 1/2 nonces is solved).
2016-09-27 09:03:24 +02:00
Tanguy Pruvot
2f57ee9157 bench: skip the disabled whirlpoolx
+ veltor free
+ some missed/extra log things...
2016-09-27 01:41:49 +02:00
Tanguy Pruvot
34e97bf3e6 Show intensity on init for all algos 2016-09-27 00:33:06 +02:00
Tanguy Pruvot
2ee8bc9791 nvapi: do not print that on normal -D 2016-06-24 10:14:58 +02:00
Tanguy Pruvot
eae4ede111 decred: return to previous implementation + second nonce
seems better on windows and a bit easier to read...
2016-06-23 03:54:33 +02:00
Tanguy Pruvot
c643b3b900 decred: and even faster implementation by Alexis
optimized for the 9xx and more recent, same results on the 750 Ti
+ restore second nonce support not present in nicehash published version

Better on linux at least...
2016-06-23 00:36:28 +02:00
Tanguy Pruvot
7e490693e0 decred: nicehash/alexis improvement 2016-06-22 22:32:23 +02:00
Tanguy Pruvot
0deb9a2aca win32: implement a nvapi.dll wrapper like nvml
Allows getting/setting missing info like the power limit on x86

squashed for a better min/max and device mapping

Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
2016-06-21 05:16:12 +02:00
Tanguy Pruvot
710d9292af fix duplicates on skein2 and blake2s (nonce endian) 2016-05-18 02:53:53 +02:00
Tanguy Pruvot
c0fca5c932 decred: magic improvement in one line
+ ifdef the 4WAY commented code...
2016-04-04 17:49:54 +02:00
pallas1
ebf885d482 ~10% speedup 2016-04-02 22:21:31 +02:00
alexis78
be1f64446a vanilla: sync with MrM4D, remove SSE2 midstate computation
was not useful and hard to read...
2016-03-23 11:39:34 +01:00
Tanguy Pruvot
5a69056ee5 blake2s cleanup 2016-03-13 19:36:01 +01:00
Tanguy Pruvot
7ffe65c262 blake2s algo
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
2016-03-13 16:50:32 +01:00
Tanguy Pruvot
d58490911a decred: remove some useless double flip 2016-02-28 18:19:32 +01:00
Tanguy Pruvot
a823cca7f9 decred: allow custom extranonce sizes
the extranonce is already placed after header in job.coinbase
2016-02-19 15:52:17 +01:00
Tanguy Pruvot
096f136c36 enhance vanilla second nonce check 2016-02-19 11:31:00 +01:00
Tanguy Pruvot
4944e1a098 mrM4D vnl, with some changes 2016-02-19 11:31:00 +01:00
Tanguy Pruvot
7c9ec8629f decred: handle a second nonce 2016-02-18 22:47:03 +01:00
Tanguy Pruvot
6e95407dcf decred algo for longpoll/getwork
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
2016-02-11 07:10:46 +01:00
Tanguy Pruvot
da64c50059 blake: some more tuning and cleanup 2016-01-31 17:07:11 +01:00
Tanguy Pruvot
7c1137f335 blake: small change for the second nonce 2016-01-28 03:05:25 +01:00
Tanguy Pruvot
934f0e5054 blake: reduce intensity (and fix older devices) 2016-01-27 20:04:19 +01:00
Tanguy Pruvot
4a7e239d7c blake: merge sp improvements, start 1.7.2 dev..
to be tested on old arch too...
2016-01-27 18:30:06 +01:00
Tanguy Pruvot
a237601747 1.7.1 release
set schedule flags to reduce linux cpu usage without MyStreamSynchronize()
2016-01-26 20:43:16 +01:00
Tanguy Pruvot
e50556b637 various changes, cleanup for the release
small fixes to handle better the multi thread per gpu

explicitly report that quark is not compatible with SM 2.1 (compact shuffle)
2015-11-04 14:59:59 +01:00
Tanguy Pruvot
113e22de2e blake: prevent empty scan ranges with multiple gpus
in some cases, an empty scan range was possible in benchmark..
2015-11-01 22:14:17 +01:00
Tanguy Pruvot
61ff92b5b4 never interrupt global benchmark with found nonces
fix some algo weird hashrates (like blake)
and reset device between algos, for better accuracy

but this reset doesn't seem enough to bench all algos correctly...

to test on linux, could be a driver issue...

heavy: fix first alloc and indent with tabs...
2015-11-01 21:12:50 +01:00
Tanguy Pruvot
355b835ae0 benchmark: enhance the mem leak detection
reduce "false" warnings, and ignore unrelated/small ones <= 1 MB

On windows the gpu memory can be allocated by other processes

+ some cleanup in algos... (free/gpulog)
2015-10-16 22:04:30 +02:00
Tanguy Pruvot
4868c412b0 windows: add support for SM 2.1, drop SM 3.5 (x86)
Mostly to do compatibility tests; SM 2.1 support is very limited

SM 3.0 code should run on SM 3.5 (only a few cards use this arch)

As I can't test SM 3.5, it's best to let users do their own tests...
2015-10-15 23:02:35 +02:00
Tanguy Pruvot
a7d54cd7ef blake: no need to fail on init, no big alloc 2015-10-15 20:10:58 +02:00
Tanguy Pruvot
6a9280a045 lyra2v2: set a better TPB for intensity 20 (sm52)
use sp forced unroll in skein and do some cleanup...
2015-10-15 02:01:34 +02:00
Tanguy Pruvot
5bf1f98200 various fixes for SM 2.1 and the benchmark
X11+ algos and quark are not compatible for the moment

but these ones are :

Benchmark results for Gigabyte GTX 460 (SM 2.1 / 1 GB):

   blakecoin :     159090.5 kH/s,     1 MB,  1048576 thr.
       blake :      70208.9 kH/s,     1 MB,  1048576 thr.
         bmw :     122802.6 kH/s,    65 MB,  2097152 thr.
        deep :       3533.6 kH/s,    33 MB,   524288 thr.
    fugue256 :      43177.9 kH/s,    17 MB,   524288 thr.
       heavy :       4118.2 kH/s,   147 MB,   524032 thr.
      keccak :      18673.1 kH/s,   129 MB,  2097152 thr.
       luffa :      28816.0 kH/s,   257 MB,  4194304 thr.
       lyra2 :        213.7 kH/s,   570 MB,    65536 thr.
    mjollnir :       3895.6 kH/s,   147 MB,   524032 thr.
       nist5 :       1101.4 kH/s,    67 MB,  1048576 thr.
       penta :        501.6 kH/s,    21 MB,   327680 thr.
       skein :       5432.4 kH/s,    65 MB,  1048576 thr.
      skein2 :       6788.9 kH/s,    33 MB,   524288 thr.
   whirlpool :        688.5 kH/s,    33 MB,   524288 thr.
         zr5 :        122.5 kH/s,    86 MB,   262144 thr.
2015-10-14 02:59:54 +00:00
Tanguy Pruvot
fc84c719e9 lyra2: improve cuda implementation (part 1, SM5+)
based on the new djm34 method, 2x faster than first version

cleaned and tuned for the GTX 750/960 (linux / cuda 6.5)
2015-10-13 00:57:29 +02:00
Tanguy Pruvot
d195f2e8a2 intensity: do not reduce throughput before init
Else the memory allocated could be less than required later

btw, use the new "cuda" function to apply intensity/throughput
2015-10-11 05:01:41 +02:00
Tanguy Pruvot
c6dcc5e5cf benchmark: show mem and default throughput in results
and prepare a new function to get the default intensity

also, take care of multiple threads per gpu...
2015-10-11 04:38:28 +02:00
Tanguy Pruvot
8db5a0bc9e blake: change dynamic round system
blakecoin was conflicting with lyra2, set the rounds more properly
2015-10-11 03:46:30 +02:00
Tanguy Pruvot
c2214091ae benchmark: free last memory leaks on algo switch
remains my original lyra2 implementation to fix... (cuda_lyra2.cu)

I guess some kind of memory overflow forces the driver to allocate
memory... but I was unable to free it without a device reset.
2015-10-10 02:15:32 +02:00
Tanguy Pruvot
4e1e03b891 benchmark: store all algos results + cuda fixes
Note: lyra2, lyra2v2 and scrypt seem to have problems
coexisting with other algos... when run after some of them...

moved lyra2 first and skip scrypt/jane for the moment...

Only stored in memory for now.. to display a table after the bench

ccminer -a auto --benchmark

Results may be exported later to a json file...
2015-10-09 02:07:08 +02:00
Tanguy Pruvot
922c2a5cd7 algos: free allocated mem for algo switch
All can be freed properly now, except scrypt (reset) and lyra2 (leak)
2015-10-08 21:35:30 +02:00
Tanguy Pruvot
ee93927fac diff: use the new function in all algos 2015-10-07 20:10:15 +02:00