
You’re just doing brute force, but with extra steps. It turns out that partial collisions are more common than you think, and it’s not particularly hard to find some.

Here is a 186-bit partial collision, found in less than two minutes on my CPU by brute force:

sha256(cbfad45814d54d1d56d30de387d957ed3b50e06270ad6e4b897f4a0000000000) = 692207e28eb8dd3eb4f8fab938ea5103faa1060c3bbed204f564e10c65d06b33

sha256(cbfad45814d54d1d56d30de387d957ed3b50e06270ad6e4be8c33e0000000000) = 006347a21f7c9b3eb4fa52b75d0e5a03dbe556b579d6d2867d44c38c06546f6f

(In Python:

>>> import hashlib

>>> hashlib.sha256(bytes.fromhex("cbfad45814d54d1d56d30de387d957ed3b50e06270ad6e4b897f4a0000000000")).hexdigest()

'692207e28eb8dd3eb4f8fab938ea5103faa1060c3bbed204f564e10c65d06b33'

>>> hashlib.sha256(bytes.fromhex("cbfad45814d54d1d56d30de387d957ed3b50e06270ad6e4be8c33e0000000000")).hexdigest()

'006347a21f7c9b3eb4fa52b75d0e5a03dbe556b579d6d2867d44c38c06546f6f'

>>> a = hashlib.sha256(bytes.fromhex("cbfad45814d54d1d56d30de387d957ed3b50e06270ad6e4b897f4a0000000000")).digest()

>>> b = hashlib.sha256(bytes.fromhex("cbfad45814d54d1d56d30de387d957ed3b50e06270ad6e4be8c33e0000000000")).digest()

>>> sum((x^y^0xff).bit_count() for (x, y) in zip(a, b))

186

)

Intuition pump: for two random 256-bit digests, each bit position matches with probability 1/2, so the expected number of equal bits is 128.
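
The search itself is nothing clever. A minimal sketch of the idea (not literally the code I ran; this naive O(n²) pass only gets you weaker partial collisions in a few seconds, and reaching 186 bits in a couple of minutes takes a much tighter loop over far more candidates):

    import hashlib

    # Fix a common prefix (the same one as in the collision above), vary a
    # 3-byte counter, and keep the pair of inputs whose digests agree on the
    # most bit positions.
    prefix = bytes.fromhex("cbfad45814d54d1d56d30de387d957ed3b50e06270ad6e4b")

    seen = []                      # (message, digest as a 256-bit int) pairs
    best = (0, None, None)
    for counter in range(2_000):
        msg = prefix + counter.to_bytes(3, "big") + b"\x00" * 5
        d = int.from_bytes(hashlib.sha256(msg).digest(), "big")
        for other_msg, other_d in seen:
            equal_bits = 256 - (d ^ other_d).bit_count()   # bits where the digests agree
            if equal_bits > best[0]:
                best = (equal_bits, msg, other_msg)
        seen.append((msg, d))

    print(best[0], best[1].hex(), best[2].hex())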


Yes, it should, because errors are hopefully logged, reported, and acted upon. A silently missing name isn't.


If the error isn’t repairable by the user, blocking them from using the app entirely is mean. If the error screen has a message telling the user where to go to set their name, that’s fine but annoying. If the error screen tells the user they can’t use the app until someone checks a dashboard and sees a large enough increase in errors to investigate, that’s a bigger problem.


This reads like a dogmatic view of someone who hasn’t worked on a project that’s a million plus lines of code where something is always going wrong, and crashing the entire program when that’s the case is simply unacceptable.


> something is always going wrong

I hate this sentence with a passion, yet it is so so true. Especially in distributed systems, gotta live with it.


Why not both?


That's how you get this feature: https://wiki.php.net/rfc/deprecate-bareword-strings.

tldr: undefined constants were treated as a string (+ a warning), so `$x = FOO` was `$x = "FOO"` + a warning if `FOO` was not a defined constant. Thankfully this feature was removed in PHP 8.


It was goofy and fun-looking when the first blog did it.

Now that everyone and its dog does those "goofy" illustrations, I find them insufferable.


I haven't seen "everyone and its dog" doing anything of this sort - the vast majority of blogs nowadays seem to be indistinguishable from one another, just bland and barely styled text.

I am enjoying how bothered people are by it, though.


I think you should clearly spell out how the key is derived.

From the description, I believe it's a random string hard-coded in the executable + user-provided password => AES key?

Also… "full offline", but "my API is on a digital ocean droplet" ? What ?

(I guess the API is there to generate a .exe with its own random string? But again, it's very unclear what it is, what it's doing, and how.)


You’ve mostly got it.

The API on DigitalOcean just builds a fresh .exe and embeds a unique seed into the runtime build. After that, everything is offline.

The seed is generated server-side during creation as a SHA-256 over a timestamp + jitter source, and is written out as vault-seed.txt in the .zip.

Inside the .exe, the vault reads its own embedded seed. The KDF is thus:

key = SHA256(SHA256(password||seed))

Each encryption uses a fresh 16-byte IV and encodes ciphertexts as:

ivHex + "." + encryptedHex

So the text is encrypted and locked behind the AES key, with a fresh ivHex per click.

The result is simply: same password + different builds => different AES keys, because each .exe has a different seed baked into it.

That's what ensures two environments can never decrypt each other's ciphertexts, and new builds can be generated endlessly.
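
In Python terms the whole flow looks roughly like this (an illustrative sketch rather than the production code; the AES mode and padding shown, CBC with PKCS7, are placeholders for the example since only the key derivation and IV size are described above):

    import os, hashlib
    from cryptography.hazmat.primitives import padding
    from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

    def derive_key(password: str, seed: str) -> bytes:
        # key = SHA256(SHA256(password || seed))
        inner = hashlib.sha256((password + seed).encode()).digest()
        return hashlib.sha256(inner).digest()

    def encrypt(plaintext: str, key: bytes) -> str:
        iv = os.urandom(16)                            # fresh 16-byte IV per encryption
        padder = padding.PKCS7(128).padder()
        padded = padder.update(plaintext.encode()) + padder.finalize()
        enc = Cipher(algorithms.AES(key), modes.CBC(iv)).encryptor()
        ct = enc.update(padded) + enc.finalize()
        return iv.hex() + "." + ct.hex()               # ivHex + "." + encryptedHex

    # Same password, different embedded seeds => different keys, so two builds
    # can never decrypt each other's ciphertexts.
    k1 = derive_key("hunter2", "seed-from-build-A")
    k2 = derive_key("hunter2", "seed-from-build-B")
    print(k1 != k2, encrypt("hello", k1))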


Operational concerns trump raw performance most of the time. Stored procedures live in a different CI/CD environment, with a different testing framework (if there's even one), on a different deployment lifecycle, using a different language than my main code. They are also essentially an un-pinnable dependency. Too much pain for the gain.

Now, give me ephemeral, per-connection procedures (call them unstored procedures for fun) that I can write in the language I want but that run on provider side, sure I’ll happily use them.


> Stored procedures live in a different CI/CD environment

They don't have to. The procedural SQL commands can be source controlled along with the rest of the codebase. Transmitting the actual command text to the SQL server that does all the work is not the inefficient part.


Why do they live in separate CI?


That was my first thought: an aimless dialogue is going to drift toward content-free idle chat. Like humans talking about weather.


> Like humans talking about weather.

As someone who was always fascinated by weather, I dislike this characterization. You can learn so much about someone’s background and childhood by what they say about the weather.

I think the only people who think weather is boring are people who have never lived more than 20 miles away from their place of birth. And even then anyone with even a bit of outdoorsiness (hikes, running, gardening, construction, racing, farming, cycling, etc) will have an interest in weather and weather patterns.

Hell, the first thing you usually ask when traveling is “What’s the weather gonna be like?”. How else would you pack?


They failed hard with Claude 4 IMO. I just can't get any feedback other than "What a fascinating insight" followed by a reformulation (and, to be generous, an exploration) of what I said, even when Opus 3 has no trouble finding limitations.

By comparison o3 is brutally honest (I regularly flatly get answers starting with "No, that’s wrong") and it’s awesome.


Agreed that o3 can be brutally honest. If you ask it for direct feedback, even on personal topics, it will make observations that, if a person made them, would be borderline rude.


Isn't that what "direct feedback" means?

I firmly believe you should be able to hit your fingers with a hammer, and in the process learn whether that's a good idea or not :)


Yes. It's definitely a good thing.


o3 can be very honest.

But I also find it can get very fixated on the idea that some position it has adopted is right, and will then start hallucinating like crazy in defence of that fixation, and then get stuck in a loop of defending its hallucinations with even more hallucinations. By hallucinations I mean stuff like producing lengthy citation lists of invented articles; when you point out they don't exist, it claims stuff like “well, when I search PubMed they do”, and when you point out its DOIs are made up it apologises for the “mistake” and just makes up some more.


Thank god.


Thanks for this, I just tried the same "give me feedback on this text" prompt against both o3 and Claude 4 and o3 was indeed much more useful and much less sycophantic.


Do knowledge cutoff dates matter anymore? The cutoff for o3 was 12 months ago, while the cutoff for Claude 4 was five months ago. I use these models mostly for development (Swift, SwiftUI, and Flutter), and these frameworks are constantly evolving. But with the ability to pull in up-to-date docs and other context, is the knowledge cutoff date still any kind of relevant factor?


I understood from the ancestor comments that they are specifically talking about aspects of answer quality that are very unlikely to be related to the training cut-off date.

Unless you're talking about AI-generated training data, maybe.


Um, yeah... I made a faulty context switch there.


Mostly males. I’m French, and "Claude can be female" is almost a TIL thing (Wikipedia says ~5% of Claudes were women in 2022, and apparently that 5% includes Claudia).


Didn't know that, thanks!

(According to this source, it's more like ~12% female: https://www.capeutservir.com/prenoms/prenom.php?q=Claude)


> A priority was to make setup and trial easy for non-technical users

That’s a very strange priority. Why would non-technical users be interested in a file format and a Python library?


Yes. This is partly why this article is crap. k*G is never defined, and it is the core operation in ECC. (Also: the article insists on using an elliptic curve over R, but you need to do it over a finite field, because on a smooth curve over R you can just use a smooth interpolation to solve the logarithm; and obviously, once you move to a finite field, the curve no longer looks nice, it’s just a seemingly random cloud of points.)

Very roughly speaking, and sweeping the complications of the "point at infinity" under the rug, a characteristic feature of an EC is that a straight line passing through two points on the curve will pass through a third point on the curve (yes, unless you take a vertical line: point at infinity). So you can define an "addition of points on the curve": take two points A and B, draw a straight line passing through them, name the third intersection point between the line and the curve C, and declare A + B = C (actually there’s a symmetry around the x axis involved for the usual properties of addition to hold; another complication, let's sweep it under the rug too).

(For A = B, take the tangent to the curve at A; over R you can see that this works because you can take the limit as B gets arbitrarily close to A, which gives you the tangent; in a finite field that’s less obvious, but the algebraic proof is the same.)

So k*G is just G + G + ... + G, k times.
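
To make this concrete, here is a toy version over a small finite field (a deliberately tiny curve, y^2 = x^3 + 7 over F_17, useless for real cryptography; scalar multiplication is done by naive repeated addition to mirror the definition above, whereas real implementations use double-and-add):

    # Toy curve y^2 = x^3 + a*x + b over F_p; None is the point at infinity.
    p, a, b = 17, 0, 7

    def add(P, Q):
        """Chord-and-tangent addition of two points on the curve."""
        if P is None: return Q
        if Q is None: return P
        x1, y1 = P
        x2, y2 = Q
        if x1 == x2 and (y1 + y2) % p == 0:
            return None                                     # vertical line: point at infinity
        if P == Q:
            s = (3 * x1 * x1 + a) * pow(2 * y1, -1, p) % p  # tangent slope
        else:
            s = (y2 - y1) * pow(x2 - x1, -1, p) % p         # chord slope
        x3 = (s * s - x1 - x2) % p
        y3 = (s * (x1 - x3) - y1) % p                       # includes the x-axis reflection
        return (x3, y3)

    def mul(k, G):
        """k*G = G + G + ... + G, k times (naive on purpose)."""
        R = None
        for _ in range(k):
            R = add(R, G)
        return R

    G = (1, 5)   # on the curve: 5^2 = 25 = 8 = 1^3 + 7 (mod 17)
    print([mul(k, G) for k in range(1, 6)])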

If you want more details, your favorite reasoning LLM can do a better job explaining what I’ve swept under the rug.

