17 Commits

Author SHA1 Message Date
Tanguy Pruvot
0a0fd33cac attempt to reduce shared mem errors 2016-08-06 12:56:02 +02:00
Tanguy Pruvot
0d9d3520ac simd: add support for SM 2.1 devices
Add support for x11..x17, s3, fresh and qubit

Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
2015-11-01 12:37:52 +01:00
Tanguy Pruvot
e21c75793a Revert "x11: improve aes (shavite/echo)"
causes a lot of cpu validation errors on windows,
to be double-checked in the next version...

This reverts commit 1187a6e7e3211f0216111554a55b685687003b11.
2015-06-23 09:27:40 +02:00
Tanguy Pruvot
1187a6e7e3 x11: improve aes (shavite/echo)
shavite is faster, echo doesn't really change due to the reg. overload

This change allows custom launch bounds without other code changes and improves
portability across different devices.

also set a minimum throughput of 1024 for these algos (shared mem req. size)
2015-06-19 05:23:06 +02:00
Tanguy Pruvot
e6112e878d cleanup: use unsigned throughput parameters
Yes, it's a big commit, I was waiting for 1.6 to do that...
Sorry for your possible merge issues ;)
2015-02-28 14:05:09 +01:00
Tanguy Pruvot
c88750332c simd512: restore SM3/3.5 perfs
Simple change which affects all algos based on SIMD512

fresh, qubit, s3, x11 to x17...
2014-11-23 19:07:06 +01:00
Tanguy Pruvot
94c9945fe6 cubeluffa: Fix indent and add some static prefixes
use "git show -w <commithash>" to see changes

Duplicated functions in merged Cube+Luffa could be cross linked without
2014-11-23 07:17:20 +01:00
Tanguy Pruvot
73f22b237a Prepare trap of hardware/mem failures 2014-11-20 18:44:25 +01:00
Tanguy Pruvot
fdd5d29071 x11: shavite and echo from sp (now ok on win32)
The previous echo commit only increased linux performance and reduced
windows perf compared to 1.4.9; this one seems to give at least
the 1.4.9 perf on windows, and the same on linux...

Shavite optimisation seems ok on both (now uses 64 registers)

the launch_bounds will force the number of registers, so remove specific
Makefile rules on linux...

manual "cherry pick" with fixed line endings and some adaptations
2014-11-16 17:34:50 +01:00
sp-hash
e18a54e8fc sp echo optimisation + cleanup
Original Commit :
Removed sharedmem and reduced calculations with precalcing (ECHO hash).
750ti + 20KHASH(x11)

tpruvot notes:
Real change is more like 10 KH/s on stock clocks (but real)
launch bounds disabled, no perf increase with 64 registers
2014-11-16 03:08:46 +01:00
sp-hash
5be6811dcf x11: echo and cubehash optimization
echo : 40.056ms -> 39.241ms
cube : 14.490ms -> 13.511ms

cube hash change looks useless (__device__ code is generally inlined)
but reality proves that the cuda documentation is wrong...

tpruvot: fixed dos line endings in echo,
and used my style for cuda function attributes
2014-11-06 15:17:26 +01:00
Tanguy Pruvot
b4e690b486 sources: switch to UTF-8 2014-08-21 08:27:48 +02:00
Tanguy Pruvot
d9ea5f72ce Remove duplicated defines present in cuda_helper.h
also add cudaDeviceReset() on Ctrl+C for nvprof
2014-08-19 03:29:11 +02:00
Tanguy Pruvot
06763c20b1 Implement x14 (cuda + cpu functions)
Project was updated for VS2013 and CUDA SDK 6.5

also add a --cputest function to dump cpu hash results

TODO: x15 is not fully functional, but first loop seems ok

Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
2014-08-12 14:47:03 +02:00
Christian Buchner
d99b91ea65 adding third party X13 and Diamond Groestl code contributions. 2014-06-15 14:31:20 +02:00
Christian Buchner
3b21069504 bump to revision V1.1 with Killer Groestl 2014-06-14 01:43:28 +02:00
Christian Buchner
af07302b4b v1.0 - Yo, I heard y'all like X11 2014-05-10 00:29:59 +02:00