Tanguy Pruvot
d195f2e8a2
intensity: do not reduce throughput before init
...
Else the memory allocated could be less than required later
btw, use the new "cuda" function to apply intensity/throughput
9 years ago
Tanguy Pruvot
4e1e03b891
benchmark: store all algos results + cuda fixes
...
Note: lyra2, lyra2v2 and script seems to have problems
to coexist with other algos... to run after some of them...
moved lyra2 first and skip scrypt/jane for the moment...
Only stored in memory for now.. to display a table after the bench
ccminer -a auto --benchmark
Results may be exported later to a json file...
9 years ago
Tanguy Pruvot
922c2a5cd7
algos: free allocated mem for algo switch
...
All can be freed propertly now, except script (reset) and lyra2 (leak)
9 years ago
Tanguy Pruvot
ee93927fac
diff: use the new function in all algos
9 years ago
Tanguy Pruvot
e1c4b3042c
algos: add functions to free allocated resources
...
Will be used later for algo switching
not really tested yet...
9 years ago
Tanguy Pruvot
5308898d1c
start v1.7, apply new prototypes to all algos
9 years ago
Tanguy Pruvot
e3548f46f3
drop animecoin support
...
no more really minable... just minable in french
9 years ago
Tanguy Pruvot
4709668995
jh512: rewrite and optimize with asm swap
...
5% improvement by the vshl asm swap functions, mixed shl+add inst.,
Add also xchg(x, y) func and XCHG(x, y) define in cuda_helper for later use...
other jh changes are mainly for the beauty of the code...
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
10 years ago
Tanguy Pruvot
a55b148ecc
windows: fix missing off_t include
10 years ago
Tanguy Pruvot
ed4927fcd0
quark/x11: set signed int hashPosition vars to off_t
...
groestl (and keccak?) seems faster with 64bit vars (off_t or int64_t)...
10 years ago
Tanguy Pruvot
ebe95aac2f
bmw512: cleanup after cuda 7 bug fix
10 years ago
Tanguy Pruvot
0224d4705e
skein: fix wrong hashes seen on x11 with cuda 7
...
Look like a stream synch problem, not related to cuda 7 headers or cudart
The threadfence() added doesnt changes performances, and could also
be related to the random cpu validation errors... so keep it for all.
Note: the 80-bytes variant used in skein2 doesn't seems affected.
10 years ago
Tanguy Pruvot
123fe287b6
x11: temporary workaround for cuda 7.0
10 years ago
Tanguy Pruvot
d9b0312897
x64: fix some size_t warnings
10 years ago
Tanguy Pruvot
051ba521be
skein2: minimal host changes
10 years ago
Tanguy Pruvot
2f541065fb
cuda_helper: rename correctly hiword/loword functions
10 years ago
Tanguy Pruvot
2113be6eec
blake80: some changes and launch bounds, no perf changes
10 years ago
Tanguy Pruvot
3d3f2e2cb5
warnings: use the right device id (device_map[thr_id])
10 years ago
Tanguy Pruvot
275a028935
skein: compute midstate first
...
"Real" optimization based on KlausT precalc
10 years ago
Tanguy Pruvot
e7ae27137e
x11/qubit: remove some extra MyStreamSynchronize
...
only one per loop is required to prevent 100% cpu usage
10 years ago
Tanguy Pruvot
163430daae
Skein/Skein2 SM 3.0 devices support
...
+ code cleanup
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
10 years ago
Tanguy Pruvot
d58d53f2b2
update README, small changes, prepare release 1.6.1
...
still need a SM 3.0 fix for skein...
10 years ago
Tanguy Pruvot
48515ad707
groestl: rename included cuda files
10 years ago
Tanguy Pruvot
37395eefe4
skein: restore previous x11 speed
10 years ago
Tanguy Pruvot
4f43abb402
bmw512: indent and restore SM 3.0 compat
...
could be also the source of the problem seen with CUDA 7
restored the code before sp/klaus changes for SM 3.0 devices...
10 years ago
Tanguy Pruvot
38e6672d70
Allow test of SM 2.1/3.0 binaries on newer cards
...
Implementation based on klausT work.. a bit different
This code must be placed in a common .cu file,
cuda.cpp is not compiled with nvcc and doesnt allow cuda code...
10 years ago
Tanguy Pruvot
f86784ee56
Add skein algo (Skeincoin, Myriad, Unat...)
...
SKEIN512 + SHA256
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
10 years ago
Tanguy Pruvot
a37e909db9
Add zr5 algo (for SM 3.5+)
...
uint4 copy + keccak cleanup, groestl: small uint4 opt
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
10 years ago
Tanguy Pruvot
9734186a37
jh512: import and improve klaus and sp changes
...
did not import the extra final function, which should stay compatible
with the common cuda_check_hash()
10 years ago
KlausT
ae8e863591
remove uint32_t cast
10 years ago
Tanguy Pruvot
e6112e878d
cleanup: use unsigned throughput parameters
...
Yes, its a big commit, was waiting 1.6 to do that...
Sorry for your possible merge issues ;)
10 years ago
Tanguy Pruvot
09c3ac6b4b
linux: fix missing dirname include
10 years ago
Tanguy Pruvot
2d5e8aaced
anime: fix uint2 error (bmw)
10 years ago
KlausT
a452c330dd
quark: remove unused variables
10 years ago
Tanguy Pruvot
26b51a557b
Allow different intensity per device
...
and clean the old variables, no more required
10 years ago
Tanguy Pruvot
768b5ccb76
import bmw512 uint2 changes from sp
...
+ some cleanup... 15KH/s won (750Ti)
10 years ago
Tanguy Pruvot
9f2dd3ee60
Remove some useless conversions
...
do not impact perfs neither...
10 years ago
Tanguy Pruvot
2a5233f56e
api: report throughput when default
10 years ago
Tanguy Pruvot
cafd4477d7
Handle a maximum of 16 gpus (vs 8 before)
...
Some cards have 2 gpus on board...
10 years ago
Tanguy Pruvot
b521acb480
groestl: use sp bitslice enhancement, prepare SM 2.x variant
...
todo: simd512 SM 2.x variant (shfl op), and groestl/myriad functions
10 years ago
Tanguy Pruvot
ec5a48f420
x11: small simd512 gpu_expand improvement
10 years ago
Tanguy Pruvot
1e24e4899c
skein: uint2 optimisation with SM 3.0 compat (+15KH)
...
Thanks to sp and djm34 for this fast uint64 storage alternative
10 years ago
Tanguy Pruvot
2585e10814
keccak uint2 optimisation for SM>3.0 (x11 +40KH/s)
...
based on djm34 keccak 256-bit changes, and keep SM3.0 compat
affect most other algos too (quark, nist5, x13...)
10 years ago
Tanguy Pruvot
c3bdb623e8
Check and submit multiple nonces in one loop
...
Added to most algos, checkhash function scans a big range
and can find multiple nonces at once if the difficulty is low.
Stop ignoring them, submit second one if found...
Clean the draft code for rc=2 implemented for blake and pentablake
btw... fix the reduced displayed hashrate when a nonce is found...
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
10 years ago
Tanguy Pruvot
118a6be361
checkhash: simplify the common function
...
use klaus trivial function, the old code has always been a bit weird..
split cuda_check_cpu_hash_64 in two functions, keep old for branched stuff
10 years ago
Tanguy Pruvot
c218c3f514
quark/anime: +100KH, bmw tpb was not correct
...
This small change also enhance a bit x11..17 algos
10 years ago
Tanguy Pruvot
8ad180cc70
various small changes
...
heavy: reduce by 256 threads default intensity to all -i 20
cuda: put static thread init bools outside the code (made once)
api: fix nvml header to build without
10 years ago
Tanguy Pruvot
6ae28162db
various extern cleanup + api history uids and gpu SM
...
uids could be useful to create graphes from history data
Note: please do a clean build after this commit (changes in miner.h)
10 years ago
Tanguy Pruvot
73f22b237a
Prepare trap of hardware/mem failures
10 years ago
Tanguy Pruvot
fe4ad36b73
intensity: sign warnings fixes min(i,u)
10 years ago