sgsjchs's comments

sgsjchs · 2026-01-24T19:46:43 1769284003

It's the other way around.

sgsjchs · 2025-11-30T13:09:35 1764508175

The trick is to provide dense rewards, i.e. not only once the full goal is reached, but also a little bit for every random flailing of the agent in approximately the right direction.
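A minimal sketch of the difference, assuming a toy one-dimensional task where the distance to the goal is known. The function names and the 0.1 weight are made up for illustration. As the replies point out, having such a distance signal at all is the hard part in practice.

```cpp
// Positions are integers on a line; `goal` is where the agent should end up.
constexpr int distance_to_goal(int position, int goal) {
    return position > goal ? position - goal : goal - position;
}

// Sparse: pays out only when the full goal is reached.
constexpr double sparse_reward(int next, int goal) {
    return next == goal ? 1.0 : 0.0;
}

// Dense: the sparse payout plus a small bonus for a step toward the goal
// and a small penalty for a step away from it.
// The 0.1 weight is an arbitrary illustrative choice.
constexpr double dense_reward(int previous, int next, int goal) {
    return sparse_reward(next, goal) +
           0.1 * (distance_to_goal(previous, goal) - distance_to_goal(next, goal));
}
```

With the sparse version, a step from 0 to 1 toward a goal at 5 earns nothing; with the dense version it earns a small positive amount, and the reverse step earns a small negative one.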

thegeomaster · 2025-11-30T13:39:28 1764509968

Article talks about all of this and references DeepSeek R1 paper[0], section 4.2 (first bullet point on PRM) on why this is much trickier to do than it appears.

[0]: https://arxiv.org/abs/2501.12948

Jaxan · 2025-11-30T14:04:24 1764511464

How do you know the correct direction? Isn’t the point of learning that the right path is unknown to start with?

jsnell · 2025-11-30T14:52:32 1764514352

The correct solutions and the viable paths probably are known to the trainers, just not to the trainee. Training only on problems where the solution is unknown but verifiable sounds like the ultimate hard mode, and pretty hard to justify unless you have a model that's already saturated the space of problems with known solutions.

(Actually, "pretty hard to justify" might be understating it. How can we confidently extract any signal from a failure to solve a problem if we don't even know if the problem is solvable?)

robotresearcher · 2025-11-30T18:41:33 1764528093

Your hard mode is exactly the situation in which RL is used, because it requires neither a corpus of correct examples, nor insight into the structure of a good policy.

> How can we confidently extract any signal from a failure to solve a problem if we don't even know if the problem is solvable?

You rule out all the stuff that doesn’t work.

Yes this is difficult and usually very costly. Credit assignment is a deep problem. But if you didn’t find yourself in a hard mode situation, you wouldn’t be using RL.

sgsjchs · 2025-10-10T09:21:48 1760088108

I, too, enjoy the craftsmanship, but at the end of the day what matters is that the software works as required, how you arrive at that point doesn't matter.

AdieuToLogic · 2025-10-11T01:01:17 1760144477

For me, it is not a matter of craftsmanship so much as a repeatable approach for growing the minds of junior engineers such that they have the best chance to succeed.

sgsjchs · 2025-10-09T23:48:27 1760053707

> I still don't understand this decision.

Variable declaration `T v;` means "declare `v` such that the expression `v` has type `T`". Variable declaration `T *p;` means "declare `p` such that the expression `*p` has type `T`", etc.
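A small sketch that checks this reading at compile time. It is written as C++17 so the compiler can verify it; `value_type_of` is a helper name invented here, needed only because `decltype` reports lvalue expressions as references.

```cpp
#include <type_traits>

// Each declarator mirrors the expression that yields the declared type.
int v = 0;               // the expression `v` has type int
int *p = &v;             // the expression `*p` has type int
int a[3] = {1, 2, 3};    // the expression `a[0]` has type int
int (*f)(int) = nullptr; // the expression `(*f)(0)` has type int

// Strip the reference and cv-qualifiers that decltype adds for lvalues.
template <class E>
using value_type_of = std::remove_cv_t<std::remove_reference_t<E>>;

// decltype does not evaluate its operand, so the null `f` is never called.
static_assert(std::is_same_v<value_type_of<decltype(v)>, int>);
static_assert(std::is_same_v<value_type_of<decltype(*p)>, int>);
static_assert(std::is_same_v<value_type_of<decltype(a[0])>, int>);
static_assert(std::is_same_v<value_type_of<decltype((*f)(0))>, int>);
```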

kopirgan · 2025-10-10T03:14:14 1760066054

Nice explanation!

sgsjchs · 2025-10-06T16:42:34 1759768954

But in C that's just syntax sugar for pointer math.
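To illustrate (compiled here as C++, where the built-in subscript operator follows the same rule; the helper names are invented for the example):

```cpp
#include <cstddef>

// For built-in arrays and pointers, a[i] is defined as *(a + i).
constexpr int by_index(const int *a, std::size_t i) { return a[i]; }
constexpr int by_pointer_math(const int *a, std::size_t i) { return *(a + i); }

// Because the addition commutes, i[a] is accepted as well for built-in types.
constexpr int by_swapped_index(const int *a, std::size_t i) { return i[a]; }
```

All three functions read the same element; only the spelling differs.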

uecker · 2025-10-07T06:01:53 1759816913

It still makes it possible to have bounds checking. (And it is also not true anymore for C2Y.)

pjmlp · 2025-10-06T16:46:57 1759769217

Except it is more obvious what is the intention, it is about clarity to the reader.

1718627440 · 2025-10-06T21:50:08 1759787408

My point was indeed, that if you don't use pointer arithmetic in C, that means that you don't use arrays. I mean when you declare arrays of a fixed size, you can also declare an equivalent number of primitive variables instead, but I would find that inconvenient. Hence the question.

smj-edison · 2025-10-06T22:38:10 1759790290

If I remember correctly, he meant that only array accesses are used, because their length can be checked (as all arrays have a static length due to no dynamic memory).

uecker · 2025-10-07T06:05:28 1759817128

Indeed, this is what many people do. But even if you use dynamic memory, if you replace pointer arithmetic by array indexing, you get bounds checking. And in C this also works for arrays of run-time length.

1718627440 · 2025-10-07T22:16:30 1759875390

But can't I put any pointer arithmetic in array brackets, so it wouldn't limit anything?

uecker · 2025-10-09T16:02:33 1760025753

Whatever index you compute can be checked against a bound.

1718627440 · 2025-10-10T17:41:41 1760118101

2[a*b] What bound?

uecker · 2025-10-11T09:42:44 1760175764

This does not even compile. For array indexing,

array[expression]

if "array" has a bound, whatever the expression evaluates to can be checked against that bound. If "array" is not a bounded array but a pointer or an unbounded array, then this does not work, but my point is that it is easy to avoid such code.
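A rough sketch of the idea, written as C++ with an explicit check where a C compiler or sanitizer would insert one. `checked_at` is a hypothetical helper, not a standard function. The bound comes from the array's type, so the index can be any expression.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Index an array whose bound N is part of its type.
template <class T, std::size_t N>
T &checked_at(T (&array)[N], std::size_t index) {
    if (index >= N) {
        throw std::out_of_range("index exceeds the array bound");
    }
    return array[index];
}

inline void checked_at_example() {
    int values[3] = {10, 20, 30};
    assert(checked_at(values, 2) == 30); // in range
    bool rejected = false;
    try {
        checked_at(values, 3);           // one past the end
    } catch (const std::out_of_range &) {
        rejected = true;
    }
    assert(rejected);
    // int *p = values; checked_at(p, 0); // would not compile: a bare pointer carries no bound
}
```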

sgsjchs · 2025-10-06T16:32:59 1759768379

You very rarely would actually want scalar types which don't map directly to hardware supported ones anyway.

sgsjchs · 2025-09-14T01:13:56 1757812436

Why would you want to store arbitrary individual passwords instead of deriving them on demand from the service name/domain and a common secret?

snailmailman · 2025-09-14T01:16:11 1757812571

If you are doing that,

- what if some site has weird password requirements and the derived password doesn’t work

- what if a site gets hacked and you need to rotate one password.

If you have to store data per-site anyway because of those cases, may as well just store passwords. You can (and should) still generate extremely high entropy passwords.

merlincorey · 2025-09-14T01:18:20 1757812700

Additionally, you can store other data. For example, one could keep scans of important documents in Pass, which means they are GPG encrypted and backed by a git repository, so they are versioned and shared across multiple machines.

lucb1e · 2025-09-14T01:57:25 1757815045

indeed. Additionally:

- if your secret leaks and you don't know it (or you do know, but you need some time to change it), the attacker not only gets the snapshot of your password manager but also can derive all future passwords you'll generate, or past ones you long forgot about

- there's no way to know what you've entered before, since it's stateless. With data stored in a manager, I know what username I used and can associate other data. If your uniqueifying input is the domain, and let's say HN would become hn.yc or whatever and you visit it again in ten years, you'd have to remember that hn.yc accepts the password of what you entered as news.ycombinator.com

I have to admit though, hash(name+secret)=password is so simple and beautiful that it draws IT people like a fine artwork draws visitors. But for me, that doesn't outweigh the practical issues
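For concreteness, a toy sketch of such a scheme. Everything here is an assumption made for illustration: the function names, the 10-character output, and above all `toy_prf`, which is plain FNV-1a and NOT cryptographically secure. A real scheme would need a keyed, non-reversible function such as an HMAC or a KDF.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Placeholder mixing function (FNV-1a). Not secure; stands in for a real KDF.
inline std::uint64_t toy_prf(const std::string &input) {
    std::uint64_t h = 14695981039346656037ull;
    for (unsigned char c : input) {
        h ^= c;
        h *= 1099511628211ull;
    }
    return h;
}

// Hypothetical stateless derivation: password = f(secret, site, counter).
inline std::string derive_password(const std::string &secret,
                                   const std::string &site,
                                   unsigned counter = 0) {
    // Separators keep ("ab", "c") and ("a", "bc") from producing the same input.
    const std::string material =
        secret + '\n' + site + '\n' + std::to_string(counter);
    static const char alphabet[] =
        "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789";
    std::uint64_t h = toy_prf(material);
    std::string out;
    for (int i = 0; i < 10; ++i) { // 62^10 < 2^64, so ten symbols fit in one hash
        out += alphabet[h % 62];
        h /= 62;
    }
    return out;
}
```

The practical issues show up right in the signature: rotating one password means bumping `counter`, which then has to be remembered per site, and a renamed domain silently yields a different password.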

akerl_ · 2025-09-14T01:17:29 1757812649

Because the former works with any site and circumstance and the latter does not.

gmuslera · 2025-09-14T01:53:40 1757814820

Not all sites are safe, either by design or by people running them. Having a common secret+service name as password AND having at least one of those sites leaking your plaintext password could mean that your derivation may go public and all your other passwords and services fall because of that.

listeria · 2025-09-14T02:57:06 1757818626

presumably the derivation would involve a cryptographically secure, non-reversible function so as to not compromise the secret should one of them be leaked.

jibal · 2025-09-14T05:43:17 1757828597

"deriving them" != op<+>

sgsjchs · 2025-09-06T20:34:19 1757190859

A socket.

spacechild1 · 2025-09-07T00:30:13 1757205013

How so? Doesn't your socket class have a default constructor and a notion of open and closed?

sgsjchs · 2025-09-07T01:05:03 1757207103

If the moves were destructive, I'd design it to have the default constructor call `::socket` and destructor call `::close`. And there wouldn't be any kind of "closed" state. Why would I want it?

spacechild1 · 2025-09-07T01:24:14 1757208254

Your socket class would have no default constructor? And you would never want to close the socket before the object's lifetime ends? Really?

sgsjchs · 2025-09-07T09:33:41 1757237621

In this case, I would want the address family and protocol to be statically known, so it would have a default constructor. But for example, a file might not have one, sure. As for closing before lifetime ends, why? I can just end lifetime. Wrap it in an optional if the type system can't figure it out like with a struct member.

spacechild1 · 2025-09-07T10:08:03 1757239683

> so it would have a default constructor.

And what's the underlying value of such a default constructed socket? I assume it would be -1 resp. INVALID_SOCKET, in which case the destructor would have to deal with it.

> Wrap it in an optional if the type system can't figure it out like with a struct member.

So you essentially must wrap it in an optional if you want to use it as a member variable. I find this rather pointless as sockets already have a well-defined value for empty state (-1 resp. INVALID_SOCKET). By wrapping it in an optional you are just wasting up to 8 bytes.

Sure, you can implement a socket class like that, but it's neither necessary nor idiomatic C++.

sgsjchs · 2025-09-07T10:52:44 1757242364

> And what's the underlying value of such a default constructed socket? I assume it would be -1 resp. INVALID_SOCKET

No, as explained, the default value would be the result of `::socket` call, i.e. a fresh OS-level socket.

> So you essentially must wrap it in an optional if you want to use it as a member variable.

No, you only must wrap it if you really want this closed state to exist.

> Sure, you can implement a socket class like that, but it's neither necessary nor idiomatic C++.

Obviously. Because the moves are not destructive. If they were, this design would be superior. And the wasted space for optional is solvable, just like for non-nullable pointers.

spacechild1 · 2025-09-07T20:33:06 1757277186

> If they were, this design would be superior.

I see how destructive moves would slightly simplify the implementation, but what difference would it make apart from that? (Don't get me wrong, I totally think that destructive moves are a good idea in general, I just don't see the qualitative difference in this particular case.)

> And the wasted space for optional is solvable, just like for non-nullable pointers.

In the case of non-nullable pointers the library author knows that they can use NULL as a sentinel value and write a corresponding specialization. But what could you possibly do with an arbitrary user-defined class?

sgsjchs · 2025-09-07T22:52:50 1757285570

> what difference would it make

The same difference as making pointers always non-nullable and reintroducing nullability via an optional wrapper only when semantically appropriate.

> what could you possibly do with an arbitrary user-defined class

Just add some customization points to std::optional so that users can define which value of the class to treat as nullopt internally.

spacechild1 · 2025-09-08T07:52:29 1757317949

> The same difference as making pointers always non-nullable and reintroducing nullability via an optional wrapper only when semantically appropriate.

Again, I don't see what this has to do with destructive moves. If you want a socket class that always refers to an open socket, you can already do that. Same for non-nullable pointer wrappers. Conversely, destructive moves don't prevent you from implementing a socket class with a close() method. These concepts are really orthogonal.

> Just add some customization points to std::optional so that users can define which value of the class to treat as nullopt internally.

How is this supposed to work? The very point of your socket class is that it always contains a valid socket handle. Once you introduce a sentinel value, you are back to square one. If the optional class is able to construct a socket with the sentinel value, so is the user.

sgsjchs · 2025-09-08T11:11:10 1757329870

> Again, I don't see what this has to do with destructive moves. If you want a socket class that always refers to an open socket, you can already do that.

Technically you can, but it's unreasonable to create an os-level socket just to put into the moved-out object where it will be immediately destroyed again. This is not an issue when the moves are destructive.

> How is this supposed to work? The very point of your socket class is that it always contains a valid socket handle. Once you introduce a sentinel value, you are back to square one. If the optional class is able to construct a socket with the sentinel value, so is the user.

That's not true. The sentinel value need not be exposed in the public interface of the class, it can only be accessible via the customization point of the optional.

spacechild1 · 2025-09-08T12:59:44 1757336384

> Technically you can, but it's unreasonable to create an os-level socket just to put into the moved-out object where it will be immediately destroyed again. This is not an issue when the moves are destructive.

No, the class can use a sentinel value internally only to mark moved-from objects. That's exactly where we actually started the conversation. That's why I said that destructive moves would only somewhat simplify the move operations, but not make a qualitative difference (in this area).

> The sentinel value need not be exposed in the public interface of the class, it can only be accessible via the customization point of the optional.

Since the optional would need to construct an instance with the sentinel value, I thought that the "sentinel" constructor must be public. However, you might be right that one could write a template specialization that contains the template argument as a friend class. In this case you could use a private constructor. Note that the destructor still has to handle the sentinel value... But I guess this is just something you have to accept.
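Something along these lines, as a rough sketch. `compact_optional` and `Handle` are invented names, std::optional has no such customization point today, and a plain int stands in for the socket handle.

```cpp
#include <cassert>
#include <stdexcept>
#include <utility>

// Hypothetical optional that stores no separate "engaged" flag and instead
// asks T for a sentinel value through T's private static members.
template <class T>
class compact_optional {
public:
    compact_optional() : value_(T::make_sentinel()) {}
    compact_optional(T value) : value_(std::move(value)) {}
    bool has_value() const { return !T::is_sentinel(value_); }
    T &operator*() {
        assert(has_value());
        return value_;
    }

private:
    T value_;
};

class Handle {
public:
    // The public constructor cannot produce the sentinel.
    explicit Handle(int id) : id_(id) {
        if (id < 0) throw std::invalid_argument("handle id must be non-negative");
    }
    int id() const { return id_; }

private:
    friend class compact_optional<Handle>; // only the optional sees the sentinel
    struct sentinel_tag {};
    explicit Handle(sentinel_tag) : id_(-1) {}
    static Handle make_sentinel() { return Handle(sentinel_tag{}); }
    static bool is_sentinel(const Handle &h) { return h.id_ < 0; }
    int id_;
};

// The optional costs nothing beyond the wrapped object.
static_assert(sizeof(compact_optional<Handle>) == sizeof(Handle));
```

User code can only obtain a `Handle` with a non-negative id, while `compact_optional<Handle>` can still represent "empty" without extra storage.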

sgsjchs · 2025-09-08T17:22:55 1757352175

> No, the class can use a sentinel value internally only to mark moved-from objects. That's exactly where we actually started the conversation.

The issue is that the "moved-from" state is exposed to the user when the moves are not destructive. The author of the class has to consider behavior for every method in sentinel state, even when it's just to assert that the state isn't sentinel or "lol it's UB". And the user has to be careful not to accidentally misuse an object in sentinel state. Just like how every time you touch a nullable pointer you have to consider if it can be null and what to do in that case. As long as the sentinel state is exposed at all (via non-destructive move), there is little gain in not providing full support for it. However, with destructive moves the sentinel value either doesn't exist at all or only exists completely internally as an optimization, and all this mental overhead disappears.
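A sketch of what that looks like with today's non-destructive moves. `fake_socket` and `fake_close` are stand-ins for `::socket` and `::close` so the example does not touch the OS, and the class layout is only one possible design.

```cpp
#include <cassert>
#include <utility>

// Stand-ins for the OS calls; the counter tracks how many handles are open.
inline int g_open_handles = 0;
inline int fake_socket() { ++g_open_handles; return 100 + g_open_handles; }
inline void fake_close(int) { --g_open_handles; }

class Socket {
public:
    Socket() : fd_(fake_socket()) {} // always opens; no public "closed" state
    Socket(Socket &&other) noexcept
        : fd_(std::exchange(other.fd_, kMovedFrom)) {}
    Socket &operator=(Socket &&other) noexcept {
        if (this != &other) {
            release();
            fd_ = std::exchange(other.fd_, kMovedFrom);
        }
        return *this;
    }
    Socket(const Socket &) = delete;
    Socket &operator=(const Socket &) = delete;
    ~Socket() { release(); }

    // Every member still has to decide what to do in the moved-from state,
    // because the moved-from object stays reachable by the caller.
    int native_handle() const {
        assert(fd_ != kMovedFrom && "use of moved-from Socket");
        return fd_;
    }

private:
    static constexpr int kMovedFrom = -1;
    void release() {
        if (fd_ != kMovedFrom) fake_close(fd_);
    }
    int fd_;
};
```

After `Socket b = std::move(a);` the name `a` is still in scope and `a.native_handle()` still compiles. With destructive moves that call would be rejected at compile time and the `kMovedFrom` checks could go away.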

spacechild1 · 2025-09-08T21:34:28 1757367268

I see your point. Just a few things:

1. This is only relevant when using such class as a local variable. Member variables are typically not moved-from.

2. In my understanding the user has the freedom to specify what constitutes a "valid but unspecified state" and it would be perfectly ok to mandate that the only thing you can do with a moved-from object is to either destroy or reassign it.

3. The problems with the state of moved-from objects from the perspective of a library author could have been prevented simply by imposing stricter requirements in the standard (e.g. every usage except destruction, and possible reassignment, shall be UB).

4. With all the issues you've pointed out, it is still perfectly possible and reasonable to design a socket class your way (= no closed socket state) in C++, yet somehow most people seem to prefer open() and close() methods instead of modelling the state with an optional. Even in the presence of destructive moves, I don't think that one way is necessarily better than the other and it is mostly a matter of culture and personal preference.

All that being said, I definitely agree that destructive moves are a good thing, in particular if the compiler prevents you from accidentally accessing moved-from objects (which is a mistake that is very easy to make in C++).

sgsjchs · 2025-09-08T22:47:10 1757371630

Indeed, the "valid but unspecified state" refers only to some types defined in the standard library. It essentially means that you can only call methods which have no preconditions and don't depend on what that state is, e.g. assignment or destruction, or something like string::clear or string::assign if you want defined outcomes. In general each type is free to guarantee whatever the author wants about the moved-from state, e.g. a moved-from std::unique_ptr is always null.
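A short illustration using only standard library types (the function name is arbitrary):

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

inline void moved_from_examples() {
    // std::unique_ptr guarantees that the moved-from pointer is null.
    auto p = std::make_unique<int>(42);
    auto q = std::move(p);
    assert(p == nullptr);
    assert(*q == 42);

    // A moved-from std::string is valid but unspecified: do not rely on its
    // contents, but operations without preconditions are fine.
    std::string s = "hello";
    std::string t = std::move(s);
    s.assign("again"); // gives s a defined value again; s.clear() would too
    assert(s == "again");
    assert(t == "hello");
}
```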

7jjjjjjj · 2025-09-07T05:01:32 1757221292

With destructive moves, you can end an object's lifetime whenever you want.

spacechild1 · 2025-09-07T07:48:56 1757231336

How would I use such a socket class as a member variable? How do I reopen the socket?

sgsjchs · 2025-09-07T09:34:25 1757237665

Reopen by constructing and assigning a new socket.

spacechild1 · 2025-09-07T18:08:46 1757268526

So I essentially have to wrap it in something like std::optional. Well, that's certainly one way to write a socket class, but I'd say it's not idiomatic C++. (I have never seen a socket class being implemented like that.)

sgsjchs · 2025-09-07T22:40:23 1757284823

You don't need optional in this case, the assignment would just destroy the old socket and immediately move the new one in its place.

spacechild1 · 2025-09-08T07:34:05 1757316845

Well, reopening a socket implies that I have manually closed the socket, which does require an optional with your implementation.