
It would be more satisfying to learn why hash of nan is not guaranteed to be the same. It feels like a bug.



At the standards level, NaN payload propagation isn't guaranteed, regardless of any other issues.

> payload propagation isn't guaranteed

Yes and no:

"If an operation has a single NaN input and propagates it to the output, the result NaN's payload should be that of the input NaN (this is not always possible for binary formats when the signaling/quiet state is encoded). If there are multiple NaN inputs, the result NaN's payload should be from one of the input NaNs; the standard does not specify which."
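For anyone who wants to poke at payloads directly, here is a rough sketch using `struct` to build a quiet NaN with a chosen payload and read the payload back after an arithmetic operation. The helper names (`bits_of`, `nan_with_payload`, `payload_of`) are made up for this example, and whether the payload survives `x + 1.0` depends on the hardware, which is exactly the "should" in the quote above.

```python
import math
import struct

PAYLOAD_MASK = (1 << 51) - 1  # low 51 fraction bits of a binary64 quiet NaN


def bits_of(x: float) -> int:
    """Return the raw 64-bit pattern of a Python float."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def nan_with_payload(payload: int) -> float:
    """Build a quiet NaN (hypothetical helper) carrying `payload`."""
    assert 0 <= payload <= PAYLOAD_MASK
    # Exponent all ones, top fraction bit set (quiet), payload in the rest.
    bits = (0x7FF << 52) | (1 << 51) | payload
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def payload_of(x: float) -> int:
    """Extract the low 51 fraction bits of a float's bit pattern."""
    return bits_of(x) & PAYLOAD_MASK


x = nan_with_payload(0x1234)
y = x + 1.0  # single NaN input; propagation of the payload is platform-dependent

print(hex(payload_of(x)))
print(hex(payload_of(y)))
```

On common hardware the two printed values tend to match, but nothing here should be read as a guarantee.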


My guess is that no one ever bothered to define hash(nan), which should, IMHO, be nan.

nan isn't anything. It's an early attempt at None when no/few (common) languages had that concept.

That Python allows nan as an index is just so many kinds of buggy.


For binary operations, NaN values compare as unordered.

The IEEE 754 specification requires that >, <, and = all evaluate to False when either operand is NaN.

Saying that two incomparable objects become comparable, let alone gain equality, would break things.

We use specific exponents and significands to represent NaNs but they have no numerical meaning.

I am actually surprised Python got this correct; NaN behavior is often implemented incorrectly out of convenience, which causes lots of issues and side effects.
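A quick way to see the unordered behavior from Python, as a minimal sketch (the `results` dict is just a convenience for this example):

```python
import math

nan = float("nan")

# Every ordered comparison and equality involving NaN is False;
# only != is True.
results = {
    "nan == nan": nan == nan,
    "nan < nan": nan < nan,
    "nan > nan": nan > nan,
    "nan <= 1.0": nan <= 1.0,
    "nan >= 1.0": nan >= 1.0,
    "nan != nan": nan != nan,
}

for expr, value in results.items():
    print(f"{expr:12} -> {value}")

# Because == is useless here, math.isnan is the reliable test.
print(math.isnan(nan))
```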


Probably just due to encoding. NaN is all 1s for the exponent and a non-zero mantissa, so that's 2^23 - 1 possible mantissa values for f32 (twice that many bit patterns once you count the sign bit).
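To make the counting concrete, here is a small sketch that classifies a 32-bit pattern as NaN from its fields and works out how many such patterns exist. The names `is_nan_bits` and `f32_from_bits` are invented for illustration.

```python
import math
import struct

EXP_MASK = 0x7F800000   # 8 exponent bits of binary32
FRAC_MASK = 0x007FFFFF  # 23 fraction (mantissa) bits of binary32


def is_nan_bits(u: int) -> bool:
    """True if the 32-bit pattern encodes a NaN: exponent all ones, fraction non-zero."""
    return (u & EXP_MASK) == EXP_MASK and (u & FRAC_MASK) != 0


def f32_from_bits(u: int) -> float:
    """Reinterpret a 32-bit pattern as a binary32 value (widened to a Python float)."""
    return struct.unpack("<f", struct.pack("<I", u))[0]


nan_patterns_per_sign = 2**23 - 1          # every non-zero fraction
nan_patterns_total = 2 * nan_patterns_per_sign  # sign bit can be 0 or 1

print(nan_patterns_per_sign, nan_patterns_total)
print(is_nan_bits(0x7FC00000), is_nan_bits(0x7F800000))
```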

The hash is the same. But a hash set has to use == in case of equal hashes (to tell colliding keys apart).
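Here is a sketch of what that means for NaN in a CPython set. One detail worth knowing: the lookup checks object identity before falling back to ==, so the very same NaN object is found again even though it is not equal to itself.

```python
a = float("nan")
b = float("nan")

s = {a, b}

# a and b are distinct objects and a != b, so both are stored.
print(len(s))

# The same object is found via the identity check that precedes ==.
print(a in s)

# A fresh NaN is a different object and compares unequal to everything.
print(float("nan") in s)
```

So regardless of how the hashes come out, two separately created NaNs end up as two separate entries.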

It's not always the same:

  >>> hash(float('nan'))
  271103401
  >>> hash(float('nan'))
  271103657

Yes. The CPython hash algorithm for floats (https://github.com/python/cpython/blob/main/Python/pyhash.c#...) special-cases the non-finite values: floating-point infinities hash to special values modeled on the digits of pi (seriously! See https://github.com/python/cpython/blob/main/Include/cpython/...), and NaNs fall through ultimately to https://github.com/python/cpython/blob/main/Include/internal... which is based on object identity (the pointer to the object is used rather than its data).
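You can check the special cases from the interpreter without reading the C source. A minimal sketch, assuming CPython; the identity-based NaN hash is, as far as I know, the behavior from 3.10 onward, and older versions return a constant instead, so the NaN lines are printed rather than asserted.

```python
import sys

# The constant used for infinities is exposed on sys.hash_info.
print(sys.hash_info.inf)

pos_inf_hash = hash(float("inf"))
neg_inf_hash = hash(float("-inf"))
print(pos_inf_hash, neg_inf_hash)

# Hashing the same NaN object twice is stable...
x = float("nan")
same_object_stable = hash(x) == hash(x)
print(same_object_stable)

# ...but two separately created NaN objects may hash differently,
# since the hash is derived from the object rather than its bits.
a, b = float("nan"), float("nan")
print(hash(a), hash(b))
```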

Maybe it's that multiple bit patterns can be NaN and these are two different ones? In IEEE-754, a number with all the exponent bits set to 1 is +/-infinity if the fraction bits are all zero; otherwise it's NaN. So these could be values where the fractions differ. Can you see what the actual bits it's setting are?
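One way to look at the actual bits is to split the 64-bit pattern into its sign, exponent, and fraction fields. This is only a sketch; `decode` and `classify` are names invented here.

```python
import struct


def decode(x: float) -> tuple[int, int, int]:
    """Split a binary64 value into (sign, exponent, fraction) bit fields."""
    bits = struct.unpack("<Q", struct.pack("<d", x))[0]
    sign = bits >> 63
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    return sign, exponent, fraction


def classify(x: float) -> str:
    """Apply the rule above: exponent all ones means inf (fraction 0) or nan."""
    _, exponent, fraction = decode(x)
    if exponent != 0x7FF:
        return "finite"
    return "inf" if fraction == 0 else "nan"


for value in (1.5, float("inf"), float("-inf"), float("nan"), float("nan")):
    sign, exponent, fraction = decode(value)
    print(f"{classify(value):6} sign={sign} exp={exponent:#05x} frac={fraction:#015x}")
```

If the two `float('nan')` rows print identical fields on your machine, differing bit patterns are not what makes the hashes differ.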


