Hacker News | new | past | comments | ask | show | jobs | submit | less_less's comments

Annoyingly, while that d = e^-1 usually isn't used in practice (except in cases where you care about side-channel / fault resistance more than the 4x speedup), the Carmichael totient itself still is used in practice. At least if you want to conform to FIPS 186-5 / SP800-56B, which says that the private key includes d = e^-1 mod the Carmichael totient LCM(p-1,q-1), even if you're going to use the CRT. And that means you have to compute LCM(p-1,q-1), which also has side-channel considerations.
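For concreteness, the FIPS-style derivation of d is just the following (toy primes for illustration; a real implementation needs constant-time bignum arithmetic, which is exactly where the side-channel considerations come in):

```python
from math import gcd

def rsa_private_exponent(p, q, e=65537):
    """Sketch of the FIPS 186-5 / SP800-56B requirement:
    d = e^-1 mod lcm(p-1, q-1), the Carmichael totient lambda(N).
    Python's gcd and pow are NOT constant time; this is only a model."""
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    return pow(e, -1, lam)                         # modular inverse (Python 3.8+)

# Toy example (insecure key size, illustration only):
p, q, e = 61, 53, 17
d = rsa_private_exponent(p, q, e)   # lam = lcm(60, 52) = 780, d = 413
m = 42
assert pow(pow(m, e, p * q), d, p * q) == m
```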


Do the standards require strong primes for RSA? I think FIPS doesn't ... it gives you that option, either for legacy reasons or to get a proof via Pocklington's theorem that (p,q) really are prime, but just choosing random (p,q) and running enough rounds of Miller-Rabin on them is considered acceptable, IIRC.


Yeah see https://en.wikipedia.org/wiki/Strong_prime#Factoring-based_c...

There is probably a newer standard superseding that, but it is there in the ANSI standards.


Internally, most signature algorithms use hash functions. RSA-PSS, EdDSA and ML-DSA use them to provide something like randomness, and the security analysis of those signature schemes includes arguments assuming (in some very particular, technical ways) that the hash function outputs "look random".

Classical DSA and ECDSA do not use hash functions this way, but in my opinion they aren't stronger for it: they're basically assuming instead that some other mathematical function "looks random", which seems riskier than assuming that about a hash function. I've heard that the reason for this is to get around Schnorr's patent on doing it with hash functions, which has since expired.

The SHA3 and SHAKE hash functions (underlying e.g. ML-DSA) are explicitly designed to "look random" as well.

There are some signature schemes that try not to make such strong assumptions: in particular SLH-DSA targets properties more like first- and second-preimage resistance, target-collision-resistance, and so on.


All the algorithms you mention are PKI. RSA uses two large prime numbers. I don't see what hash sequences have to do with this at all.

PKI isn't even really about randomness. RSA does use a kind of randomness to generate its large primes, but that is beneficial and not required. The primary consideration is the math to reverse guess a factor of two primes or the square root of a large number, or something else computers currently find cheap to compute in one way but extremely expensive to reverse.


The intro textbook descriptions of cryptographic systems omit a lot of very important details.

When using RSA to sign a message m, in practice you don't send m^d mod N. That would generally be insecure, depending on what kinds of messages your system sends and/or accepts. In practical systems, instead you hash m, and then adjust the hash through a (possibly randomized) process called "padding" to be a value in [0,N). There are different standards for padding, and the better designs use additional hashing.
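A toy sketch of that hash-then-pad flow (the "padding" here is made up purely for illustration; real systems use a standard like RSASSA-PSS or PKCS#1 v1.5, and this key size is laughably insecure):

```python
import hashlib

# Toy RSA key, illustration only: N = 61 * 53 = 3233, e = 17,
# d = e^-1 mod lcm(60, 52) = 413.
N, E, D = 3233, 17, 413

def toy_pad(message: bytes, n: int) -> int:
    """Toy 'hash then pad': expand a hash of the message to a value in [0, n).
    NOT a real padding standard; it just shows that you sign a padded hash,
    never the raw message."""
    nbytes = (n.bit_length() + 7) // 8
    digest = hashlib.shake_256(message).digest(nbytes)
    return int.from_bytes(digest, "big") % n

def toy_sign(message: bytes, d: int, n: int) -> int:
    return pow(toy_pad(message, n), d, n)

def toy_verify(message: bytes, sig: int, e: int, n: int) -> bool:
    return pow(sig, e, n) == toy_pad(message, n)

sig = toy_sign(b"hello world", D, N)
assert toy_verify(b"hello world", sig, E, N)
assert not toy_verify(b"hello world", (sig + 1) % N, E, N)
```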

The security of the system depends in part on the hashed-then-padded message "looking random", i.e. not having structure that can be exploited by an attacker. It turns out to be tricky to formalize what exact randomness property you need, so cryptosystems are often analyzed in the "random oracle model" (ROM) in which the hash function has impossibly strong randomness properties.

It seems that usually, if you use a strong hash function, a scheme that's proved secure in the ROM is secure in real life (or at least it's not the ROM part that breaks); the counterexamples are usually really contrived. This article is about a somewhat-less-contrived, but still not quite realistic, example where something that's secure in the ROM would break due to the ROM being an unrealistic model.


As I understand the paper, the point is that Fiat-Shamir does *not* give a correct proof of the program's output.

They gave a (maliciously constructed) program whose outputs are pairs (a,b) where certainly a != b (instead the program is constructed such that a = b+1 always). But you can get the corresponding Fiat-Shamir protocol to accept the statement "I know a secret x such that Program(x) = (0,0)", which is clearly a false statement.


Adding to some other comments in the thread: finding missing or extra numbers is closely related to error-correcting codes, especially binary linear codes. In an error-correcting code, you have a string of bits or symbols, with symbol x_i appearing at position i. You choose the code so that valid sequences have a certain mathematical property, and then if one or a few symbols are corrupted, then you can use that property to correct the errors. The property is typically that a certain linear function called the "syndrome" is zero, meaning that sum(x_i * G_i) = 0 where each G_i is some strategically chosen vector, particular to the code. The math for how to correct is particular to the chosen G_i, and it's a really interesting field of study.

In a typical error-correcting code usage, you have an encoder which takes your message and appends some extra symbols, calculated so that the syndrome is zero. The receiver then recomputes the syndrome, and if it's not zero, they know that at least one error has occurred. Using the code's decoding algorithm, they can figure out the smallest (and thus hopefully most likely) set of changes that would produce that syndrome, and use it to (hopefully) correct the transmission errors.

For the missing numbers problem, you can set x_i to "how many times does the number i appear?". Then since the syndrome is sum(x_i * G_i), you can compute the syndrome on an unordered list of the i's. You are expecting the syndrome to match that of the full set 1...n, so when it doesn't, you can figure out which few x_i's being off would produce the syndrome you observed. You have an advantage because you know how many numbers are missing, but it's only a slight one.

The author's solution is the Hamming code: you set G_i = i, and you do the additions by XORing. Error-correcting codes generalize to more missing numbers as well, still using XOR, but the math becomes more complicated: you would want a fancier code such as a BCH or Goppa code. These also use XOR, just in more complicated ways.
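For the single-missing-number case, the syndrome view reduces to the familiar XOR trick; a quick sketch:

```python
from functools import reduce
from operator import xor

def find_missing_one(n, seen):
    """One missing number from 1..n, via the 'syndrome' G_i = i with XOR as
    addition: XORing the expected multiset against the observed one cancels
    everything except the missing element."""
    expected = reduce(xor, range(1, n + 1), 0)
    observed = reduce(xor, seen, 0)
    return expected ^ observed

nums = [i for i in range(1, 11) if i != 7]
assert find_missing_one(10, nums) == 7
```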


If you imagine a polynomial L(z) that's zero at all the missing numbers, you can expand the coefficients out. For example, with 2 missing numbers (x,y), you have:

   L(z) = z^2 - (x+y)z + xy.
You already have x+y, but what's xy? You can compute it as ((x+y)^2 - (x^2 + y^2))/2. This generalizes to higher powers: Newton's identities give a recurrence that produces the coefficients of L from the sums of powers.

Then you solve for the roots of L, either using your finite field's variant of the quadratic formula, or e.g. just by trying everything in the field.
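Over the integers (or any characteristic-0 setting, where dividing by 2 is fine), the two-missing-numbers recipe works directly; a sketch:

```python
import math

def find_missing_two(n, seen):
    """Two missing numbers from 1..n via power sums: s1 = x+y, s2 = x^2+y^2,
    then xy = (s1^2 - s2)/2 and (x, y) are the roots of z^2 - s1*z + xy,
    solved here with the ordinary quadratic formula."""
    s1 = n * (n + 1) // 2 - sum(seen)
    s2 = n * (n + 1) * (2 * n + 1) // 6 - sum(v * v for v in seen)
    prod = (s1 * s1 - s2) // 2
    disc = math.isqrt(s1 * s1 - 4 * prod)
    return (s1 - disc) // 2, (s1 + disc) // 2

nums = [i for i in range(1, 11) if i not in (3, 8)]
assert find_missing_two(10, nums) == (3, 8)
```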

*But wait, this doesn't actually work!*

Over fields of small characteristic, such as F_2^m, you need to modify the approach and use different powers. For example, the formula above divides by 2, but over F_2^m you cannot divide by 2, since 2 = 0. In fact, you cannot solve for (x,y) at all with only x+y and x^2 + y^2, because

  (x+y)^2   =   x^2 + y^2 + 2xy   =   x^2 + y^2 + 0xy (since 2=0)   =   x^2 + y^2
So having that second polynomial gives you no new information. So you need to use other powers such as cubes (a BCH code), or some other technique (e.g. a Goppa code). My sibling comment to yours describes the BCH case.
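To make the char-2 version concrete, here's a toy sketch over GF(16) (the reduction polynomial z^4 + z + 1 is my arbitrary choice), recovering two missing elements from s1 = sum(x) and s3 = sum(x^3). Over char 2, xy = s1^2 + s3/s1, and (x, y) are the roots of L(z) = z^2 + s1*z + xy, found here by brute force:

```python
def gmul(a, b):
    """Carryless multiply in GF(16) with reduction by z^4 + z + 1 (0x13)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x10:
            a ^= 0x13
        b >>= 1
    return r

def gpow(a, k):
    r = 1
    for _ in range(k):
        r = gmul(r, a)
    return r

def ginv(a):
    return gpow(a, 14)  # a^(2^4 - 2) = a^-1 for nonzero a in GF(16)

def find_missing_two_gf16(universe, seen):
    """XOR-accumulate s1 = sum(x) and s3 = sum(x^3) over the symmetric
    difference, then locate the two missing nonzero elements, BCH-style."""
    s1 = s3 = 0
    for v in universe:
        s1 ^= v
        s3 ^= gpow(v, 3)
    for v in seen:
        s1 ^= v
        s3 ^= gpow(v, 3)
    prod = gpow(s1, 2) ^ gmul(s3, ginv(s1))          # xy = s1^2 + s3/s1
    # Roots of L(z) = z^2 + s1*z + xy, by trying everything in the field:
    return sorted(z for z in range(16)
                  if gmul(z, z) ^ gmul(s1, z) ^ prod == 0)

universe = list(range(1, 16))
seen = [v for v in universe if v not in (5, 9)]
assert find_missing_two_gf16(universe, seen) == [5, 9]
```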


This will depend on the field, and for F_2^m you want odd powers: sum(x), sum(x^3), sum(x^5) etc. Using sum(x^2) won't help because squaring over F_2^m is a field homomorphism, meaning that sum(x^2) = sum(x)^2.

This is also how BCH error-correction codes work (see https://en.wikipedia.org/wiki/BCH_code): a valid BCH codeword has sum(x^i where bit x is set in the codeword) = 0 for t odd powers i=1,3,5, ... Then if some bits get flipped, you will get a "syndrome" s_i := sum(x^i where bit x was flipped) for those odd powers. Solving from the syndrome to get the indices of the flipped bits is the same problem as here.

The general decoding algorithm is a bit involved, as you can see in the Wikipedia article, but it's not horribly difficult:

  • First, extend the syndrome: it gives sum(x^i) for odd i, but you can compute the even powers as s_{2i} = s_i^2.

  • The syndrome is a sequence of field values s_i, but we can imagine it as a "syndrome polynomial" S(z) := sum(s_i z^i).  This is only a conceptual step, not a computational one.

  • We will find a polynomial L(z) which is zero at all errors z=x and nowhere else.  This L is called a "locator" polynomial.  It turns out (can be checked with some algebra) that L(z) satisfies a "key equation" where certain terms of L(z) * S(z) are zero.  The key equation is (almost) linear: solve it with linear algebra (takes cubic time in the number of errors), or solve it faster with the Berlekamp-Massey algorithm (quadratic time instead, maybe subquadratic if you're fancy).

  • Find the roots of L(z).  There are tricks for this if its degree is low.  If the degree is high then you usually just iterate over the field.  This takes O(#errors * size of domain) time.  It can be sped up by a constant factor using Chien's search algorithm, or by a logarithmic factor using an FFT or AFFT.
You can of course use a different error-correcting code if you prefer (e.g. binary Goppa codes).

Edit: bullets are hard.

Further edit just to note: the "^" in the above text refers to powers over the finite field, not the xor operator.


In another comment in this thread yesterday, I linked to an implementation of this (with complexity quadratic in the number of errors) that I helped create.

> constant factor using Chien's search algorithm

Chien's search is only really reasonable for small field sizes... which I think doesn't really make sense in this application, where the list is long and the missing elements are relatively few.

Fortunately, in characteristic 2 it's quite straightforward and fast to just factor the polynomial using the Berlekamp trace algorithm.


Oh yeah, factoring the polynomial is also a good idea. For a long enough list that ought to be better than AFFT too.


Good catch, thank you!


The data-dependent prefetcher is a cool feature, though you do have to be careful with side-channel issues, which is why on some chips you can disable it with the Data-Independent Timing bit or similar.

At this point I'm kinda expecting CPU vendors to stop putting as many Spectre mitigations in the main core, and just have a small crypto core with full-fat arithmetic, less hardware for memory access, less speculation, and careful side-channel hardening. You still have to block Meltdown and other large vulnerabilities on the main cores, but if someone wants to protect elliptic curves from weird attacks? Try to set the DIT bit, trap into the OS, and get sent to the hardened core.


Neat, but if you're using this in cryptographic code (one of the main consumers of bignums), keep in mind that secret data reaching branches is usually a side-channel risk. Sure, it's only 1 time in 2^64 on random data, but if you're depending on that, then you have to consider whether an attacker can choose data that will make it happen more often.

If you can substitute a cmov without control flow then it's probably safer, e.g. c1 |= c0 & seq(s1,-1), so long as you can make sure the compiler won't turn it into a branch.

It does add a data dependency though ...
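A sketch of that masking idea, with Python ints standing in for 64-bit words (seq here is the hypothetical set-if-equal from above, spelled out as shift-and-mask arithmetic):

```python
MASK = (1 << 64) - 1

def ct_mask_eq(x, y):
    """All-ones mask if x == y, else zero, with no comparisons or branches.
    Plays the role of seq(x, y) expanded to a full-width mask."""
    t = (x ^ y) & MASK
    t = (t | ((0 - t) & MASK)) & MASK    # top bit set iff t != 0
    return (((t >> 63) & 1) - 1) & MASK  # 1 -> 0, 0 -> all ones

def add_ct(a, b):
    """Branchless multi-limb add (little-endian 64-bit limbs): the carry into
    the next limb comes from masks, never a data-dependent branch."""
    out, carry = [], 0
    for ai, bi in zip(a, b):
        raw = ai + bi
        s0 = raw & MASK
        c0 = raw >> 64                   # carry out of the raw limb add
        s1 = (s0 + carry) & MASK
        # Adding a 0/1 carry overflows exactly when s0 was all ones:
        c1 = c0 | (carry & ct_mask_eq(s0, MASK) & 1)
        out.append(s1)
        carry = c1
    return out, carry

limbs, carry = add_ct([MASK, MASK], [1, 0])   # (2^128 - 1) + 1
assert limbs == [0, 0] and carry == 1
```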


Yes, for cryptography you'd like to have constant time, but this has to be an awfully low bandwidth channel!

A `cmov` will have the same serialisation problem as `adc` but on machines without carry it might still leave you better off than the obvious `add s,a,b; sltu co,s,a; add s,s,ci; sltu t,s,ci; or co,co,t`.


If you want to use Bloom filters for compression, you might want to consider binary fuse filters, ribbon filters or similar which avoid the 1/ln(2) leading factor in space usage.
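For a sense of the sizes involved, a quick calculation of optimal Bloom filter space against the information-theoretic lower bound:

```python
import math

def bloom_bits_per_key(fpp):
    """Bits per key for an optimally-parameterized Bloom filter at false
    positive rate fpp, vs the information-theoretic bound. The ratio is the
    1/ln(2) ~ 1.44x overhead; binary fuse and ribbon filters get much closer
    to the bound."""
    bloom = -math.log(fpp) / (math.log(2) ** 2)
    bound = math.log2(1 / fpp)
    return bloom, bound

bloom, bound = bloom_bits_per_key(0.01)   # ~9.59 vs ~6.64 bits per key
assert abs(bloom / bound - 1 / math.log(2)) < 1e-9
```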

