Tanguy Pruvot
b9da6c67f5
improve jh512 with vectors (nist5,quark,sib,x11+,zr5)
...
the main improvement is fewer asm calls to read global mem,
but a few more regs are used (68 minimum vs 64 on SM 5.2),
so reduce the forced launch bounds to allow 80 or 128 regs per thread
Note: cuda 6.5 seems unable to store with v4.u32... (7.5 is fine)
st.global.v4.u32 [%rd2], {%r3783, %r3824, %r3823, %r3822};
st.global.v2.u32 [%rd2+16], {%r3821, %r3820};
st.global.u32 [%rd2+24], %r3819;
st.global.u32 [%rd2+28], %r3818;
st.global.u32 [%rd2+44], %r3814;
st.global.u32 [%rd2+40], %r3815;
...
todo: check alexis variant... but wanted to keep this code in git first
8 years ago
Tanguy Pruvot
6440a9bf41
windows: some default intensity adjustments
8 years ago
Tanguy Pruvot
2152fd102d
lbry cleanup, and proper error on cuda 6.5
...
both merged and unmerged implementations are broken with CUDA 6.5
No perf changes...
8 years ago
Tanguy Pruvot
aaef92cab2
nvml: workaround for beta drivers 378.49 clocks
...
even nvidia-smi doesn't report the right pascal clocks
8 years ago
Tanguy Pruvot
b47d9acaf5
readme + small warnings detected by vstudio
8 years ago
Tanguy Pruvot
c8ff854456
sia was migrated too...
8 years ago
Tanguy Pruvot
0ff75791e5
migrate 2nd nonce storage of most algos
...
This allows keeping pdata[19] as a cursor between scans and, later, sorting them.
remaining: heavy, scrypt, sia...
8 years ago
Tanguy Pruvot
5a77d36635
groestl: explain code and improve perf on SM 2.x
...
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
8 years ago
Tanguy Pruvot
feb99d020f
skein: merge the double implementations in one
...
based on alexis skein kernels, tested ok on SM 2.1 and 3.0
code is a bit hard to read but... well... users don't care :p
8 years ago
Tanguy Pruvot
16ac9b688f
x13+: improve and clean a bit fugue512
...
reduce constant mem and load global data in bulk (vectors)
8 years ago
Tanguy Pruvot
013cda1cd2
ccminer: show first block diff even with -q
8 years ago
Tanguy Pruvot
496052e47d
xmr: vstudio warning fix about mpcount linkage
...
and move ptr type cast defines to common cuda helper
8 years ago
Tanguy Pruvot
dc816b4673
xmr: nicehash nonce prefix/hack support (v2)
8 years ago
Tanguy Pruvot
def9888bd5
xmr: prefer 32bit uint4 and smaller offsets in core
...
also prefer ulong2 shared load to be closer to the ptx
8 years ago
Tanguy Pruvot
214f392778
xmr: default settings with card attributes
8 years ago
Tanguy Pruvot
94aa6b8e91
ccminer: allow 192 chars for the username
8 years ago
Tanguy Pruvot
588c7ba361
xmr: don't use shared mem hack, windows doesn't like it
8 years ago
Tanguy Pruvot
bd030db5d1
xmr: vectors rewrite, now the phase2 is using only 40 regs
...
no more constant memory used for aes.
tested only on linux cuda 8 for now... wip
8 years ago
Tanguy Pruvot
23be7f308d
xmr: link the --bfactor setting (0-11)
8 years ago
Tanguy Pruvot
e231343060
xmr: make it smoother on windows with defaults
...
also improve a bit the 750 ti on linux...
8 years ago
Tanguy Pruvot
12ae185594
hwmonitor: efficiency unit and clean dead code
8 years ago
Tanguy Pruvot
0dd022779b
power monitoring thread + some api changes
...
based on alexis' monitoring thread idea, but using only one thread
note: other api changes will come soon, related to that
8 years ago
Tanguy Pruvot
242aa4144b
scanlog: remember sharediff for multiple nonces
...
rpc2: properly handle secondary nonce(s) + api ping time fix
be sure to fully recompile, structures are changed
8 years ago
Tanguy Pruvot
93adb56c8e
handle cryptonight light variant
...
Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
8 years ago
Tanguy Pruvot
39aad5a003
xmr: allow to set intensity on command line
8 years ago
Tanguy Pruvot
804b5b5f53
xmr: be safe with diff divide
8 years ago
Tanguy Pruvot
2479ffaaa2
xmr: fix decimal diff + aes cleanup
...
change default launch config to -l 32x16 to handle the 750 Ti better
not definitive, doing tests..
8 years ago
Tanguy Pruvot
c1f1ad9280
xmr: stabilize the final kernel
8 years ago
Tanguy Pruvot
066a569357
import xmr, to finish
...
todo: fix jh cuda and wrong decimal diff (0xffff problem ?)
8 years ago
Tanguy Pruvot
2bbccc5ff4
wildkeccak, basic stratum port of rpc 2.0
...
scratchpad delete fix and redownload, reduce rejects
(work in progress)
8 years ago
Tanguy Pruvot
099389f64f
ccminer: be more quiet with -q, skip header noise
8 years ago
Tanguy Pruvot
50534789bc
Release 1.8.4
8 years ago
Tanguy Pruvot
c11901260a
limit per gpu hashrate logs to 3 sec intervals
...
may be required for very fast algos, like blake2s
8 years ago
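Note: a minimal sketch of the throttling idea in the commit above (per-gpu hashrate logs at most once per 3 seconds). This is not the ccminer code; the `LogThrottle` type and its members are hypothetical names chosen for illustration, and timestamps are passed in as plain seconds so the logic can be checked without a clock.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: allows one log line per `interval_sec` seconds.
// One instance would be kept per GPU (an assumption of this sketch).
struct LogThrottle {
    int64_t interval_sec;
    int64_t last_sec;
    bool has_logged;

    explicit LogThrottle(int64_t interval)
        : interval_sec(interval), last_sec(0), has_logged(false) {
        assert(interval_sec > 0);
    }

    // Returns true if a log line may be emitted at time `now_sec`,
    // and records that time as the last emission.
    bool should_log(int64_t now_sec) {
        if (has_logged && now_sec - last_sec < interval_sec)
            return false;  // too soon since the previous line
        last_sec = now_sec;
        has_logged = true;
        return true;
    }
};
```

With a 3 second interval, a very fast algo that finishes many scans per second would still print one hashrate line per interval per gpu.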
Tanguy Pruvot
44bd244fc4
blake2s improved
...
based on alexis work, with the new work->nonces
8 years ago
Tanguy Pruvot
ce6a8da188
cuda: prevent ptxas crash with -n
8 years ago
Tanguy Pruvot
397472818d
prepare 1.8.4 release
8 years ago
Tanguy Pruvot
36aedbb48e
veltor update, 10x faster :p
...
From Alexis' work, sib hash rate 200% also..
8 years ago
Tanguy Pruvot
3eba451d4c
nvml: add Elsa vendor and workaround for Colorful pid
...
Colorful (and Inno3D) only set their vid, with an empty product id
8 years ago
Tanguy Pruvot
c27f3139aa
update startup credits
8 years ago
Tanguy Pruvot
056098dd86
update readme
8 years ago
Tanguy Pruvot
7b82915032
cuda 8
8 years ago
Tanguy Pruvot
225f25a6b9
uint2: remove the slower asm in operators funcs
8 years ago
Tanguy Pruvot
665de3a1f2
sia: use the new work share diff
8 years ago
Tanguy Pruvot
1a31d4d2d6
sia: move specific code in a new rpc unit
...
part 1: longpoll stuff (nanopool)
8 years ago
Tanguy Pruvot
f84c83afe5
nvml: force 64bits types for mem sizes
...
size_t can be a bit... unpredictable on x86
8 years ago
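Note: a small sketch of why fixed 64-bit types matter for memory sizes, as in the nvml commit above. On a 32-bit x86 build `size_t` is 32 bits wide, so a byte count above 4 GiB wraps. The helper names below are hypothetical and only simulate that narrowing; they are not taken from ccminer.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: the value a byte count takes when stored in a
// 32-bit size_t (wraps modulo 2^32).
inline uint32_t as_32bit_size(uint64_t bytes) {
    return static_cast<uint32_t>(bytes);
}

// Hypothetical helper: bytes to MiB using a fixed 64-bit type,
// which gives the same result on 32-bit and 64-bit builds.
inline uint64_t bytes_to_mib(uint64_t bytes) {
    return bytes >> 20;
}
```

For example, a 6 GiB card stored in a 32-bit `size_t` would read back as 2 GiB, while the `uint64_t` path reports 6144 MiB.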
Tanguy Pruvot
5a0b779434
api: use the new throughput2intensity func
8 years ago
Tanguy Pruvot
a43205a84f
decred: multiple nonces code cleanup
...
The double loop is not useful, and prefer the __thread attribute
to enhance the code readability (remove the 2D host arrays).
squashed: return to host 2D array to allow the free
8 years ago
Tanguy Pruvot
6f6cf966f8
lbry: new share diff and duplicate fix
...
when 2 nonces were found, the next scan was not at the right value
Doesn't really affect mining performance...
8 years ago
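Note: a hedged sketch of the cursor issue described in the lbry commit above. When a scan returns two nonces, the next scan should resume after the larger one, otherwise the same candidate can be found (and submitted) twice. The function name and the "no nonce" sentinel are assumptions made for this illustration, not the actual ccminer code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical sentinel meaning "no second nonce found" (assumption).
constexpr uint32_t NO_NONCE = UINT32_MAX;

// Returns the nonce at which the next scan should start.
// nonce0 is the first nonce found; nonce1 is the optional second one.
// Wrap-around at the end of the 32-bit range is not handled here.
inline uint32_t next_scan_cursor(uint32_t nonce0, uint32_t nonce1) {
    if (nonce1 == NO_NONCE)
        return nonce0 + 1;
    return std::max(nonce0, nonce1) + 1;
}
```

Resuming at `nonce0 + 1` when the second nonce is larger would rescan the range that already produced it, which matches the duplicate described in the commit message.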
Tanguy Pruvot
9eead77027
diff: show by default, rework shares diff storage
...
This will later allow more gpu candidates.
Note: This is unfinished work; we keep the previous behavior for now.
To finish this, all algo solutions should be migrated and the attributes of submitted nonces stored.
It's required to handle the different share diff per nonce and fix the possible solved count error (if 1 of 2 nonces is solved).
8 years ago
Tanguy Pruvot
2f57ee9157
bench: skip the disabled whirlpoolx
...
+ veltor free
+ some missed/extra log things...
8 years ago