An AI class that I took decades ago had just a one-day session on "AI ethics". Somehow, despite being short, it was memorable (or maybe because it was short...).

They said ethics demand that any AI that is going to pass judgment on humans must be able to explain its reasoning. An explanation like "this if-then rule fired," or even "a statistical correlation between A and B indicates this," would be fine. Fundamental fairness requires that if an automated system denies you a loan, a house, or a job, it be able to explain something you can challenge, fix, or at least understand.

LLMs may be able to provide that, but it would have to be carefully built into the system.
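
To make "explain its reasoning" concrete, here is a toy rule-based decision of the kind that class would have had in mind; the field names and the 0.40 threshold are invented purely for illustration:

    # Hypothetical loan rule: the rule that fires *is* the explanation,
    # so the applicant gets something concrete to challenge or fix.
    def decide_loan(income: float, debt: float) -> tuple[bool, str]:
        ratio = debt / income if income > 0 else float("inf")
        if ratio > 0.40:
            return False, f"Denied: debt-to-income ratio {ratio:.2f} exceeds 0.40"
        return True, f"Approved: debt-to-income ratio {ratio:.2f} within policy"

    print(decide_loan(income=50_000, debt=30_000))
    # (False, 'Denied: debt-to-income ratio 0.60 exceeds 0.40')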





I'm sure you could get an LLM to create a plausible-sounding justification for every decision. It might not be related to the real reason, but coming up with text isn't the hard part there, surely.

> I'm sure you could get an LLM to create a plausible-sounding justification for every decision.

That's a great point: funny, sad, and true.

My AI class predated LLMs. The implicit assumption was that the explanation had to be correct and verifiable, which may not be achievable with LLMs.


It seems solvable if you treat it as an architecture problem. I've been using LangGraph to force the model to extract and cite evidence before it runs any scoring logic. That creates an audit trail based on the flow rather than just opaque model outputs.
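
For what it's worth, a minimal sketch of that shape using LangGraph's StateGraph API; the state fields and the stubbed extract/score nodes are assumptions for illustration, not the actual pipeline:

    from typing import List, TypedDict
    from langgraph.graph import END, StateGraph

    class ReviewState(TypedDict):
        application: str     # raw applicant material
        evidence: List[str]  # extracted, citable evidence snippets
        score: float         # computed only from the cited evidence

    def extract_evidence(state: ReviewState) -> dict:
        # Stand-in for an LLM call that must return verbatim quotes with
        # source locations; here it just keeps the non-empty lines.
        quotes = [ln for ln in state["application"].splitlines() if ln.strip()]
        return {"evidence": quotes}

    def score_from_evidence(state: ReviewState) -> dict:
        # Scoring only ever sees state["evidence"], so the audit trail is
        # exactly what was judged, not an opaque pass over the raw input.
        return {"score": float(len(state["evidence"]))}

    graph = StateGraph(ReviewState)
    graph.add_node("extract", extract_evidence)
    graph.add_node("score", score_from_evidence)
    graph.set_entry_point("extract")
    graph.add_edge("extract", "score")
    graph.add_edge("score", END)
    app = graph.compile()

    result = app.invoke({"application": "5 years experience\nmissed two payments in 2019",
                         "evidence": [], "score": 0.0})
    print(result["evidence"], result["score"])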

It's not. If you actually look at any chain-of-thought stuff long enough, you'll see instances where what it delivers directly contradicts the "thoughts."

If your AI is *ist in effect but told not to be, the bias will just manifest as the model highlighting negative things more often for the people it has bad vibes for. Just like people do.


Yes, they will; they'll rationalize whatever. This is most obvious with transcript editing, where you make the LLM 'say' things it wouldn't say and then ask it why.

It sounds like you're saying we should generate more bullshit to justify bullshit.

They said "could", not "should".

I believe the point is that it's much easier to create a plausible justification than an accurate justification. So simply requiring that the system produce some kind of explanation doesn't help, unless there are rigorous controls to make sure it's accurate.


> Fundamental fairness requires that if an automated system denies you a loan, a house, or a job, it be able to explain something you can challenge, fix, or at least understand.

That could get interesting, as most companies will not provide feedback if you are denied employment.


Fair point. Maybe the requirement should be that the automated system provide an explanation that some human could review for fairness and correctness. Who receives the explanation may be a separate question; the drawback of LLMs judging people is that such an explanation may not even exist.

This is the law in the EU, I think.

The way I understand it, the law says decisions must be reviewed by a human (and, I am guessing, should also be overridable), but this still leaves the question of how the review is done and what information the human has to make the review.

I hate this. An explanation is only meaningful if it comes with accountability; knowing why I was denied does me no good if I have no avenue for effective recourse outside of a lawsuit.


