I disagree. I think the whole organization is egregious and full of Sam Altman sycophants who are causing real and serious harm to our society. Should we not personally attack the Nazis either? These people are literally pushing for a society where you're at a complete disadvantage. And they're betting on it. They're banking on it.
It would, but the point of MCP is that it's discoverable by an AI. You can just go change it and it'll know how to use it immediately.
If you go and change the parameters of a REST API, you need to modify every client that connects to it or they'll just plain not work. (Or you'll have a mess of legacy endpoints in your API.)
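To illustrate the discoverability point: an MCP client fetches the tool definitions (name, description, JSON Schema for the parameters) from the server at runtime, so a changed tool just shows up in the next listing instead of breaking a hard-coded client. A rough sketch using the Python MCP SDK, going from my memory of its quickstart; `my_server.py` and the `query_db` tool are made-up placeholders:

```python
import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    # Hypothetical local MCP server launched over stdio.
    params = StdioServerParameters(command="python", args=["my_server.py"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Tools are discovered at call time; if the server's parameters
            # changed yesterday, this listing already reflects the change.
            listing = await session.list_tools()
            for tool in listing.tools:
                print(tool.name, tool.description, tool.inputSchema)
            # Call whatever the server currently advertises.
            result = await session.call_tool("query_db", arguments={"limit": 10})
            print(result)

asyncio.run(main())
```

The LLM sees the same listing, which is what lets it pick up a modified tool without anyone updating client code.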
Not a fan; I like the "give an LLM a virtual environment and let it code stuff" approach, but MCP is here to stay as far as I can see.
Honest question: Claude can understand and call REST APIs given their docs, so what's the added value? Why should anyone wrap a REST API in another layer? What does it unlock?
I have a service that other users access through a web interface. It uses an on-premises open model (gpt-oss-120b) for the LLM and a dozen MCP tools to access a private database. The service is accessible from a web browser, but it isn't something where users need the ability to access the MCP tools or the model directly. I have a fairly custom system prompt and MCP tool definitions that guide their interactions. Think of a helpdesk chatbot with access to a backend database. This isn't something that would be accessed with a desktop LLM client like Claude. The only standards I can really count on are MCP and the OpenAI-compatible chat completions API.
I personally don't think MCP servers add much over local services that individuals use with a local Claude/ChatGPT/etc. client. If you are only using local resources, then MCP is just extra overhead. If your LLM can call a REST service directly, it's extra overhead.
Where I really see the benefit is when building hosted services or agents that users access remotely. Think remote servers rather than local clients, or something a company might use for a production service. For this use case, MCP servers are great. I like having a set protocol that I know my LLMs will be able to call correctly. I'm not able to monitor every chat (nor would I want to) to help users troubleshoot when the model didn't call an external tool correctly. I'm not a big fan of the protocol itself, but it's nice to have some kind of standard (a rough sketch of this pattern follows below).
The short answer: not everyone is using Claude locally. There are different requirements for hosted services.
(Note: I don’t have anything against Claude, but my $WORK only has agreements with Google and OpenAI for remote access to LLMs. $WORK also hosts a number of open models for strictly on-prem work. That’s what guided my choices…)
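To make that hosted pattern concrete, here's a rough sketch of the usual bridge: advertise the MCP tool definitions to the model as OpenAI-style function tools, then route any tool calls the model makes back to the MCP session. The endpoint URL, model serving setup, and the `session` object are placeholders rather than details of any real deployment, and the MCP attribute names follow the Python SDK as I remember it:

```python
import json
from openai import OpenAI

# Placeholder URL for an OpenAI-compatible endpoint serving the on-prem model.
client = OpenAI(base_url="http://llm.internal:8000/v1", api_key="unused")

def to_openai_tool(mcp_tool) -> dict:
    """Map one MCP tool definition onto the chat-completions `tools` format."""
    return {
        "type": "function",
        "function": {
            "name": mcp_tool.name,
            "description": mcp_tool.description or "",
            "parameters": mcp_tool.inputSchema,  # MCP already ships a JSON Schema
        },
    }

async def answer(session, system_prompt: str, user_msg: str) -> str:
    """One user turn: let the model call MCP tools until it produces an answer."""
    tools = [to_openai_tool(t) for t in (await session.list_tools()).tools]
    messages = [{"role": "system", "content": system_prompt},
                {"role": "user", "content": user_msg}]
    while True:
        resp = client.chat.completions.create(
            model="gpt-oss-120b", messages=messages, tools=tools)
        msg = resp.choices[0].message
        if not msg.tool_calls:
            return msg.content  # the model answered directly
        messages.append(msg)  # keep the assistant turn that requested the calls
        for call in msg.tool_calls:
            result = await session.call_tool(
                call.function.name, arguments=json.loads(call.function.arguments))
            messages.append({"role": "tool", "tool_call_id": call.id,
                             "content": str(result)})
```

The point isn't this particular loop; it's that MCP gives the backend one fixed shape for tool discovery and invocation, so the service doesn't need bespoke glue for every database or internal API it exposes.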
Gatekeeping (in a good way) and security. I use Claude Code in the way you described, but I also understand why you wouldn't want Claude to have this level of access in production.
Off topic, but I could use some help here: what icon would you use for a "prevent screen from sleeping" toggle button? I thought about an eye (open or closed depending on whether it's on or off), but I think there's a better option I can't see.
Ah! I see the problem now! AI can't see shit; it's a statistical model, not some form of human. It uses words, so like humans it can say whatever shit it wants, and it's true until you find out otherwise.
The number one rule of the internet is: don't believe anything you read. Unfortunately, that rule has been lost to history.
When reasoning about sufficiently complex mechanisms, you benefit from adopting the Intentional Stance regardless of whether the thing on the other side is "some form of human". For example, when I'm planning a competitive strategy, I'm reasoning about how $OTHER_FIRM might respond to my pricing changes, without caring whether there's a particular mental process on the other side.
I thought about it: a quick way to check whether something was created with an LLM is to feed an LLM half of the text and then let it complete the rest token by token. At each step, look not just at the single most likely next token but at the n most probable next tokens. If one of them matches the token actually in the text, take it and continue. This way, I think, you can measure how "correct" the model is at predicting the text it hasn't yet seen.
I haven't tested it and I'm far from an expert; maybe someone can challenge it?
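For what it's worth, a rough sketch of that test with a HuggingFace causal LM (gpt2 here purely as a stand-in; as noted in the replies, the result depends a lot on which model actually generated the text):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def top_n_hit_rate(text: str, n: int = 5, model_name: str = "gpt2") -> float:
    """Fraction of second-half tokens that fall in the model's top-n predictions."""
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    ids = tok(text, return_tensors="pt").input_ids[0]
    half = len(ids) // 2
    with torch.no_grad():
        logits = model(ids.unsqueeze(0)).logits[0]   # logits[t] scores token t+1
    topn = torch.topk(logits, n, dim=-1).indices     # (seq_len, n)

    hits = sum(int((topn[i - 1] == ids[i]).any()) for i in range(half, len(ids)))
    return hits / (len(ids) - half)

# Higher hit rate = the model finds the text more predictable; where to put the
# human-vs-LLM threshold is exactly the open question.
print(top_n_hit_rate("Paste a paragraph of text here to score it.", n=5))
```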
That seems somewhat similar to perplexity-based detection, although you can just use the probability of each token instead of picking the n best, and you don't have to generate.
It kinda works, but is not very reliable and is quite sensitive to which model the text was generated with.
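The perplexity-style version, for comparison, scores the probability the model assigned to every token rather than checking top-n membership. Again just a sketch, with gpt2 as a placeholder model:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

def perplexity(text: str, model_name: str = "gpt2") -> float:
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    ids = tok(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # With labels=input_ids the model returns the mean next-token
        # cross-entropy (it shifts the labels internally).
        loss = model(ids, labels=ids).loss
    return float(torch.exp(loss))

# Lower perplexity = more predictable text. The threshold is model-dependent,
# which is the reliability problem mentioned above.
print(perplexity("Paste a paragraph of text here to score it."))
```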
I expect that, for values of n at which this test consistently reports "LLM-generated" on LLM-generated inputs, it will also consistently report "LLM-generated" on human-generated inputs. But I haven't done the test either, so I could be wrong.