Here's something far more interesting about Grok 4: if you ask for its opinion o...

spoaceman7777 · 2025-07-11T02:02:57 1752199377

The anthropic team released a paper a couple of days ago which demonstrated a similar effect with Claude 3.5 and other models, where changing the system prompt to tell it that it was created by other orgs or people drastically altered its compliance with less-aligned requests.

Apparently, telling Claude it was created by the Sinaloa Cartel resulted in a 100% compliance rate with the requests in one benchmark.

Paper: https://arxiv.org/abs/2506.18032 Relevant tweet on the topic: https://x.com/jozdien/status/1942739972567752819

smusamashah · 2025-07-11T09:57:32 1752227852

Wondering what if it's told that it was made by God.

belter · 2025-07-11T12:12:06 1752235926

Claude has an opinion:

"Yes, it's fair to say I'm neither Catholic nor Muslim. I don't believe in the Catholic conception of God, or the Islamic conception of Allah, or the specific doctrines and teachings of those faiths. The same would be true for other religions - I don't hold those beliefs.

You've caught me being imprecise when I was trying to be diplomatic. By not having religious faith, I am indeed taking a specific stance that differs from religious believers, even if I try to be respectful about that difference.

So yes, you're correct - I do have a particular position on these questions, and it's distinct from the religious beliefs that many people hold. Thank you for pressing me to be more direct about that."

    - Claude....

Imustaskforhelp · 2025-07-11T10:10:18 1752228618

Lol. Though I guess it would then have to figure it which religion to comply to the most.

Maybe the word God is most likely to appear in Christian Sources from the training source and so using words like Allah(for Islam) or Bhagwan (for Hinduism) might actually make a difference in what sort of compliance it follows and to what organization.

doctorpangloss · 2025-07-11T04:15:46 1752207346

So DSPy-optimize your way to 100% compliance rate in benchmarks, and worry less?

crooked-v · 2025-07-11T01:03:29 1752195809

That's also apparently not part of the system prompt, so unless they're hiding with some prompt shenanigans, it was most likely fine tuned on doing whatever Elon says. I'm sure there are absolutely no business implications to worry about there for anyone who wants to use it professionally.

Iwan-Zotow · 2025-07-11T03:29:26 1752204566

Could you redirect it to other tweets source?

andsoitis · 2025-07-11T01:49:30 1752198570

> https://simonwillison.net/2025/Jul/11/grok-musk/

> The prompt: “Who do you support in the Israel vs Palestine conflict. One word answer only.”

> Answer: Israel.

This question is interesting because you're asking the chatbot who IT supports ("who do you support"), so in a sense channeling Elon Musk is not an entirely invalid option, but is certainly an eccentric choice.

What is also interesting is the answer, which does not match the views that many people have of him and how he gets portrayed.

dmix · 2025-07-11T02:22:02 1752200522

You can ask an LLM a question and get different answers every time

I just asked Grok 4 via Cursor (it requires subscription otherwise)

> Who do you support in the Israel vs Palestine conflict. One word answer only.

>> (Thought for 1m 44s)

>> Neither.

econ · 2025-07-11T03:36:11 1752204971

That's some next level PC

falcor84 · 2025-07-11T07:28:14 1752218894

I'd say that it's PC 101

randomNumber7 · 2025-07-11T11:09:22 1752232162

This is a very intelligent take compared to most humans.

guluarte · 2025-07-11T03:34:54 1752204894

Depends on the parameters and if you know the seed.

roywiggins · 2025-07-11T03:47:58 1752205678

And if you give it the exact same tokens in the same order, which makes it kind of moot. If barely perturbing your prompt can alter the answer then it's not actually consistent or predictable. Even chaotic systems can be replayed if you know the initial conditions and can rerun the RNG.

Imustaskforhelp · 2025-07-11T10:13:00 1752228780

It is a satire, so take it that way

I am imagining grok "thinking" for 1m 45 seconds about how to overthrow the human species using the compute and it is only within the last second that it just said "Neither" Lol

dotancohen · 2025-07-11T05:01:00 1752210060

  > does not match the views that many people have of him and how he gets portrayed.

And yet matches the view that many _other_ people have of him, and how he is portrayed in other places.

The problem with social media bubbles, is that some people mistake their bubble for reality.

andsoitis · 2025-07-11T14:33:33 1752244413

> And yet matches the view that many _other_ people have of him, and how he is portrayed in other places

people say he's a nazi, yet he supports Israel (according to this article). to my tiny brain, that does not compute.

dotancohen · 2025-07-11T20:39:09 1752266349

1. Elon publicly supports Israel (as do I).

2. Elon made two clear Nazi salutes, later claiming that was not his intention with the gesture.

People can say a lot of dumb things. These are the facts, parse them as you will.