mirror of
https://github.com/GOSTSec/ccminer
synced 2025-01-08 22:07:56 +00:00
b9da6c67f5
The main improvement is to reduce the asm calls used to read global memory, but a few more registers are used (68 minimum vs. 64 on SM 5.2), so reduce the forced launch bounds to allow 80 or 128 registers per thread. Note: CUDA 6.5 seems unable to store with v4.u32... (7.5 is fine) st.global.v4.u32 [%rd2], {%r3783, %r3824, %r3823, %r3822}; st.global.v2.u32 [%rd2+16], {%r3821, %r3820}; st.global.u32 [%rd2+24], %r3819; st.global.u32 [%rd2+28], %r3818; st.global.u32 [%rd2+44], %r3814; st.global.u32 [%rd2+40], %r3815; ... TODO: check the alexis variant, but I wanted to keep this code in git first. |
||
---|---|---|
cuda_bmw512_sm3.cuh | ||
cuda_bmw512.cu | ||
cuda_jh512.cu | ||
cuda_quark_blake512_sp.cuh | ||
cuda_quark_blake512.cu | ||
cuda_quark_compactionTest.cu | ||
cuda_quark_groestl512_sm2.cuh | ||
cuda_quark_groestl512.cu | ||
cuda_quark_keccak512.cu | ||
cuda_quark.h | ||
cuda_skein512.cu | ||
groestl_functions_quad.h | ||
groestl_simple.cuh | ||
groestl_transf_quad.h | ||
nist5.cu | ||
quarkcoin.cu |