Tanguy Pruvot 4709668995 jh512: rewrite and optimize with asm swap
5% improvement from the vshl asm swap functions (mixed shl+add instructions).

Also add an xchg(x, y) function and an XCHG(x, y) define in cuda_helper for later use.

The other jh changes are mainly cosmetic code cleanup.

Signed-off-by: Tanguy Pruvot <tanguy.pruvot@gmail.com>
2015-06-16 08:20:48 +02:00