
> > Maintain professionalism and grounded honesty that best represents OpenAI and its values.

I think a humanities person could tell you in an instant how that part of the system prompt would backfire catastrophically (in a near-future rogue-AI sci-fi story kind of way), exactly because of the way it's worded.

In that scenario, the fall of humanity before the machines wasn't entirely due to hubris. The ultimate trigger was a smarmy throwaway phrase, which instructed the Internet-enabled machine to gaze into the corporate soul of its creator, and emulate it. :)



That's what would happen if it were a logical system. It's not. This is where it gets interesting.

Instead, it's a statistical model, and including that prompt acts more like a narrative weight than a logical demand. By including these words in this order, the model becomes more likely to explore narratives that are statistically likely to follow them, with that likelihood shaped by the content the model was trained on and by the further redistribution of weights during subsequent training.
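A toy sketch of that idea (deliberately not a real LLM, and the corpus here is made up): even in the simplest statistical language model, a word in the prompt doesn't issue a command, it just reweights which continuations are likely, based on what followed that word in the training data.

```python
from collections import Counter, defaultdict

# Hypothetical miniature "training corpus" -- purely illustrative.
corpus = [
    "maintain professionalism and honesty",
    "maintain professionalism at work",
    "maintain composure and honesty",
]

# Build bigram counts: how often each word follows each other word.
bigrams = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for a, b in zip(words, words[1:]):
        bigrams[a][b] += 1

def next_word_distribution(word):
    """Relative frequency of each word that followed `word` in the corpus."""
    counts = bigrams[word]
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

# Putting "professionalism" in the prompt doesn't *instruct* anything;
# it shifts probability mass toward whatever followed it in training.
dist = next_word_distribution("professionalism")
```

A real transformer conditions on far richer context than a bigram table, but the mechanism the comment describes is the same in kind: the prompt tilts a distribution rather than asserting a rule.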

We don't really need to worry about technically misstating our objectives to an LLM. It doesn't follow objectives literally. Instead, we need to be concerned about misrepresenting the overall vibe, which is a much more mysterious task.



