Hacker Newsnew | past | comments | ask | show | jobs | submitlogin
Popcount walks: next, previous, toward and nearest (marc-b-reynolds.github.io)
48 points by ingve on Nov 14, 2023 | hide | past | favorite | 20 comments


I like these kinds of bit twiddling puzzles.

One that I recently did was for iterating over possible errors of weight w in a quantum stabilizer code (for [1]). In a classical code this would just be the problem in the blog post, since there's only one type of error (bit flips). In a quantum code you have bit flips (X) but also phase flips (Z) and combination-bit-and-phase-flips (Y) all having weight 1. So the problem becomes: given ints (x, z) find the next (x2, z2) such that popcnt(x2 | z2) == popcnt(x | z).

The solution I came up with, which is likely very suboptimal, was to iterate over single-integer solutions for a given weight. Then, for each single-integer solution m, iterate over the pairs x,z where x|z == m. Here's python pseudocode for the second part:

    def pair_sat_increment(x: int, z: int, m: int) -> Tuple[int, int]:
        """Returns the next (x, z) such that x | z == m."""
        inc = x & z
        up = ~inc
        inc |= ~m
        inc += 1
        inc &= m
        up &= inc
        z &= inc | ~x
        z ^= x & up
        x ^= up
        return x, z
[1]: https://github.com/quantumlib/Stim/issues/397


A simple way to do the latter is

  for(u64 x = -m & m; ;x = (x - m) & m) {
    const u64 r = x ^ m;
    for(u64 t = -x & x; ; t = (t - x) & x) {
      const u64 z = r | t;
      // Use (x,z)
      if(t == 0) break;
    }
    if(x==0) break;
  }


Nice. You need to initialize x to 0 instead of -m&m though, or you miss the solution with x=0.


x=0 is the last solution to be hit the way I wrote it.


It's possible I misdiagnosed the issue. But I implemented the code and ran it and counted the solutions, and one was missing. I changed that line and it fixed it, and the solution that was missing is the one I described.

In any case, it's a nice succinct trick for iterating through the values compatible with a mask. Getting the boundary conditions right is less important than knowing the trick.


As a competitive programmer, I’ve seen similar ‘magic’ tricks (including this one) here: https://github.com/kth-competitive-programming/kactl/blob/ma... (page 23, 10.5.1)


> How does that work? I don't really know, I pulled this out of the void with a SAT solver. Basically magic.

That's pretty cool. How do you ask an SAT solver for "the nearest integer with the same population count", and how do you describe the available bitwise operations to it?


It would be more like an SAT-guided program synthesis---you can exhaustively generate all functions with at most n operations, and you can describe that particular operation as a boolean formula, so you can feed all of them to filter ones that are provably equivalent.

As for the actual boolean formula, I believe that he used his own creation, Haroldbot [1] which can evaluate bitwise expressions and enumerate all cases where the expression evaluates to true. Since simple formulas for previous and next integer with the same popcount, it should be doable. I've even constructed a concrete formula with the suggested formula:

    let n = (let t = x + (x & -x) in let u = x & ~t in t ^ ((u >>s tzcnt(u)) >>s 1)) in
    let p = (let v = x - (~x & (x + 1)) in let w = ~x & v in v ^ ((w >>s tzcnt(w)) >>s 1)) in
    let c = (let y = -x & (x + 1) in x ^ y ^ (y >> 1)) in
    (n == -1) || (p == 0) || (
        (p < x) && (popcnt(p) == popcnt(x)) &&
        (x < n) && (popcnt(x) == popcnt(n)) &&
        ((x - p > n - x) && (c == n)) || ((x - p <= n - x) && (c == p)))
    )
...where `n` is for the next such integer (or all 1s), `p` is for the previous such integer (or all 0s), `c` is for the closest such integer, and assuming that both `p` and `n` exists, `c` should be the closest of either `p` or `n`. This expression can be mechanically translated to a single boolean formula, which Haroldbot internally does for complicated formulas [2].

[1] http://haroldbot.nl/

[2] It apparently uses binary decision diagrams for simpler cases: http://haroldbot.nl/how.html


Interesting! I didn't initially click to the fact that Harold's snippet was solving a different problem than the one previously talked about, so was confused why it would "get stuck" flicking between two states after shrinking the rightmost run (of 1s or 0s), but that is indeed the desired behaviour if you want the nearest (not the next or previous) integer.

I actually found the author's complementing trick for getting the previous integer more interesting (because more applications):

>However since we already have a version that walks to the next we can simply apply a linear transform to map/unmap the input/result. The ones complement (~x) reverses the notion of unsigned integer ordering so the follow function walks to the previous


This could be useful for combinatorial tasks such as iterating over all the ways to draw 𝑘 items without replacement from a bag containing 𝑛 distinct items. You start out with 1<<𝑘 – 1 (which has its 𝑘 low-order bits set to 1) and keep going up to the next higher integer with the same popcount until bit 𝑛 is set to 1, at which point you stop.


What kind of UB the author is talikng about?


If x is n bits long, it's undefined behavior to evaluate x >> n. You can only shift by amounts smaller than the register width. IMO this is one of the more annoying instances of UB. Precisely because of situations like this, where you shift by "number of leading zeroes" which can end up being the width of the register.


This is one of the classic examples where C++ should clearly have picked Unspecified Behaviour rather than Undefined Behaviour but too late now. Nobody wants UB here, and the set of things CPUs actually might provide is enumerable.

Maybe I should suggest avoiding the same fate for Rust's unchecked_shl and unchecked_shr. (These aren't stabilized yet unlike the safe options and the rotates)


"C++ should clearly have picked Unspecified Behaviour rather than Undefined Behaviour but too late now"

It's never too late to replace Undefined Behaviour with Unspecified Behaviour.


> C++ should clearly have picked Unspecified Behaviour rather than Undefined Behaviour but too late now.

Why is it too late? Changing it in a later version of the standard wouldn't be a further constraint on any written code; any code which is currently correct will remain correct, and some code which could have done literally anything will now only be able to have one of a few possible behaviours.


The shift instruction is defined differently depending on the ISA. For example, on some CPUs "x >> 32 == 0" and on others "x >> 32 == x". Both shift designs are reasonable and C has to accommodate both. Defining this case would require putting a branch instruction in front of the shift instruction on many architectures. Obviously this would be a large performance regression.

Consider how long it took C++ to deprecate support for integers that are not two's complement, when every piece of silicon for many decades has been a two's complement design. In the case of shift instructions, it is defined differently across ISAs used today.


That is not a good pro argument.

When this operation is undefined, you must always put a branch instruction in front of it, on any architecture, regardless if it happens to do what you want.

Even when you know precisely the type of CPU on which the program will be run, you would need to use inline assembly to be sure of the behavior of the program.

When you can guarantee that the operand will never have the undesired value, then no branch instruction would be used on any architecture, regardless how the C standard would define the operation.

It is always better when the programming language standard defines completely all operations.

In the worst case, when two possible operation definitions are widespread in hardware implementations, two distinct operations should be defined in the language, allowing the programmer to choose the one that is more efficient on the intended target, but knowing that the program will remain correct regardless for which target it is compiled.


Why would having the standard set the behaviour as "unspecified" (or "implementation-defined") require a branch?

Surely under those circumstances, the compiler can document doing whatever the target ISA does as its behaviour, and just output the shift as-is?


I suppose it would break any currently compliant C++ compiler for which the target platform's CPU will throw an exception (or even less likely, return a random number) when shifting by a full word.

But I'd also suppose there aren't many such CPUs. Does anyone know of any?


> I suppose it would break any currently compliant C++ compiler

Maybe not?

I'm pretty sure that "trap" is an allowed behaviour for NaNs in some parts of the standard already, if properly documented. I don't see why unspecified/implementation-defined behaviour for shifts couldn't be allowed to do the same, on appropriate targets?

But also, the new behaviour should only be required of compilers in the to-be-written `--std=c++3X` (or whatever) mode. Any compilers conforming to an earlier version of the standard can already do whatever they're currently doing.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: