Wladimir J. van der Laan 16240f43a5
Merge #10821: Add SSE4 optimized SHA256
6b8d872 Protect SSE4 code behind a compile-time flag (Pieter Wuille)
fa9be90 Add selftest for SHA256 transform (Pieter Wuille)
c1ccb15 Add SSE4 based SHA256 (Pieter Wuille)
2991c91 Add SHA256 dispatcher (Pieter Wuille)
4d50f38 Support multi-block SHA256 transforms (Pieter Wuille)

Pull request description:

  This adds an SSE4 assembly version of the SHA256 transform by Intel. It is used at run time if SSE4 instructions are available; otherwise a fallback C++ implementation is used. Nearly every x86_64 CPU supports SSE4. The feature is only enabled when compiled with `--enable-experimental-asm`.
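
  The run-time selection described above can be sketched roughly as follows. This is a minimal illustration, not the code from this pull request: the transform bodies are toy stand-ins (not SHA256), the macro `EXAMPLE_EXPERIMENTAL_ASM` and all function names are hypothetical, and CPU detection uses the GCC/Clang builtin `__builtin_cpu_supports` for brevity, which may differ from how the merged code detects SSE4.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

namespace {

// Signature of a multi-block transform: update `state` with `blocks`
// consecutive 64-byte chunks starting at `data`.
using TransformType = void (*)(uint32_t*, const unsigned char*, size_t);

// Toy stand-in for the portable C++ transform (NOT real SHA256).
void TransformGeneric(uint32_t* state, const unsigned char* data, size_t blocks)
{
    while (blocks--) {
        for (int i = 0; i < 64; ++i) state[i % 8] = state[i % 8] * 31 + data[i];
        data += 64;
    }
}

#if defined(EXAMPLE_EXPERIMENTAL_ASM) && defined(__GNUC__) && defined(__x86_64__)
// Placeholder for an assembly-backed transform; here it just forwards.
void TransformAccelerated(uint32_t* state, const unsigned char* data, size_t blocks)
{
    TransformGeneric(state, data, blocks);
}

// Assumption: detect SSE4.1 through the compiler builtin.
bool HaveSSE4() { return __builtin_cpu_supports("sse4.1"); }
#endif

// Check a candidate against the generic code: two blocks in one call
// must give the same state as two single-block generic calls.
bool SelfTest(TransformType candidate)
{
    unsigned char data[128];
    for (size_t i = 0; i < sizeof(data); ++i) data[i] = static_cast<unsigned char>(i * 7 + 1);
    uint32_t expected[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    uint32_t actual[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    TransformGeneric(expected, data, 1);
    TransformGeneric(expected, data + 64, 1);
    candidate(actual, data, 2);
    return std::equal(actual, actual + 8, expected);
}

TransformType g_transform = TransformGeneric;

} // namespace

// Pick an implementation once at startup and report which one was chosen.
std::string AutoDetect()
{
    std::string name = "standard";
#if defined(EXAMPLE_EXPERIMENTAL_ASM) && defined(__GNUC__) && defined(__x86_64__)
    if (HaveSSE4()) {
        g_transform = TransformAccelerated;
        name = "sse4";
    }
#endif
    assert(SelfTest(g_transform));
    return name;
}
```

  Callers would then go through `g_transform` without caring which implementation was selected; the self-test guards against a miscompiled or mistranslated fast path.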

  In order to avoid build dependencies and other complications, the original Intel YASM code was translated to GCC extended asm syntax.
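
  For readers unfamiliar with GCC extended asm, the sketch below shows the general shape of the syntax on a single instruction. It is an illustrative example only and is not taken from the translated Intel code; `RotateRight` is a hypothetical helper, chosen because 32-bit right rotations are a basic building block of SHA256. A portable fallback is used on non-x86 targets or other compilers.

```cpp
#include <cstdint>

// Rotate a 32-bit value right by n bits (n is taken modulo 32).
uint32_t RotateRight(uint32_t x, uint8_t n)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    // Extended asm: template : outputs : inputs : clobbers.
    // "+r"(x) is a read-write register operand, "c"(n) places n in
    // the CL register, and "cc" tells the compiler flags are modified.
    __asm__("rorl %%cl, %0" : "+r"(x) : "c"(n) : "cc");
    return x;
#else
    // Portable fallback; the masks avoid an undefined shift by 32.
    n &= 31;
    return (x >> n) | (x << ((32 - n) & 31));
#endif
}
```

  Because the operands are described through constraints rather than hard-coded registers, the compiler handles register allocation around the block, and no external assembler such as YASM is needed at build time.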

  This gives around a 50% speedup on the SHA256 benchmark for me.

  It is based on an earlier patch by @laanwj, but it includes only a single assembly version (for now) and removes the YASM dependency.

Tree-SHA512: d31c50695ceb45264291537b93c0d7497670be38edf021ca5402eaa7d4e1e0e1ae492326e28d4e93979d066168129e62d1825e0384b1b906d36f85d93dfcb43c
2017-07-20 20:28:35 +02:00