Optimized C++ Implementation

The optimized x86-64 implementation uses AVX2, AES-NI, and other ISA extensions.

We measured the performance using a single core of a workstation with an AMD Zen 3 Ryzen 9 5950X processor running at 3.4 GHz (with clock boosting disabled) and 128 GiB of memory. The system was otherwise idle (load average 0.01), so although Simultaneous Multi-Threading was enabled, it likely did not affect the results significantly. Each individual test can be run with memory usage below 19 MiB. The computer was running Linux 6.6.40, and the implementations were built with GCC 14.1.1.

FAEST                                          Runtimes   Sizes in Bytes
Variant          KeyGen            Sign          Verify   sk   pk    sig
              ms   Mcyc      ms    Mcyc      ms    Mcyc
128s       0.002  0.005   3.761  12.787   2.877   9.783   32   32   4506
128f       0.002  0.005   0.507   1.722   0.415   1.413   32   32   5924
192s       0.003  0.011  16.084  54.687  12.438  42.290   40   48  11260
192f       0.003  0.011   2.072   7.045   1.788   6.079   40   48  14948
256s       0.004  0.013  22.450  76.330  21.925  74.546   48   48  20696
256f       0.004  0.013   3.256  11.071   3.012  10.241   48   48  26548
EM-128s    0.002  0.005   2.766   9.403   2.176   7.398   32   32   3906
EM-128f    0.002  0.005   0.413   1.404   0.327   1.113   32   32   5060
EM-192s    0.003  0.009  11.553  39.282  10.659  36.239   48   48   9340
EM-192f    0.003  0.009   1.523   5.177   1.372   4.665   48   48  12380
EM-256s    0.004  0.013  18.372  62.465  17.570  59.738   64   64  17984
EM-256f    0.004  0.013   2.775   9.436   2.566   8.725   64   64  23476
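
The ms and Mcyc columns appear to be related by the fixed 3.4 GHz clock stated above: with boosting disabled, one millisecond corresponds to 3.4 million cycles. The following is a minimal sketch of that conversion, not part of the FAEST code base. The names `CLOCK_GHZ`, `ms_to_mcyc`, and `SIGN_ROWS` are hypothetical, and the assumption that the cycle counts follow from the stated clock rate is ours. Small differences in the last digit are expected, because the table rounds both columns to three decimal places.

```python
# Assumption: the clock is fixed at the stated 3.4 GHz (boosting disabled),
# so cycles = time * frequency.
CLOCK_GHZ = 3.4


def ms_to_mcyc(ms: float, clock_ghz: float = CLOCK_GHZ) -> float:
    """Convert milliseconds to millions of cycles at a fixed clock rate."""
    # 1 ms at 1 GHz is 1e-3 s * 1e9 cycles/s = 1e6 cycles = 1 Mcyc.
    return ms * clock_ghz


# (variant, Sign ms, Sign Mcyc) copied from the table above.
SIGN_ROWS = [
    ("128s", 3.761, 12.787),
    ("128f", 0.507, 1.722),
    ("EM-256s", 18.372, 62.465),
]

for variant, ms, mcyc in SIGN_ROWS:
    computed = ms_to_mcyc(ms)
    print(f"{variant}: {computed:.3f} Mcyc computed, {mcyc:.3f} Mcyc reported")
```

The same conversion applies to the KeyGen and Verify columns, although for KeyGen the rounding of the very small ms values dominates the result.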

Reference C Implementation

The reference implementation is slower than the optimized implementation above, but follows the algorithms given in the specification more closely.

Old Implementations

  • x86-64 C implementation with AVX2, AES-NI, and other ISA extensions for the NIST Round 1 submission. Superseded by the C++ version above.

  • Initial Rust implementation for our Crypto 2023 paper. Note that this is for an older version of our protocol, which uses different primitives and is incompatible with the specification.