I have a service that other users access through a web interface. It uses an on-premises open model (gpt-oss-120b) as the LLM and a dozen MCP tools to access a private database. Users reach it from a web browser, but they never need to touch the MCP tools or the model directly; a fairly custom system prompt and set of MCP tool definitions guide their interactions. Think of a helpdesk chatbot with access to a backend database. This isn’t something that would be used through a desktop LLM client like Claude. The only standards I can really count on are MCP and the OpenAI-compatible chat completions API.
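
To make that concrete, here's a minimal sketch of the request loop (the endpoint URL, tool name, schema, and prompt are placeholders, not my actual setup; any OpenAI-compatible server fronting gpt-oss-120b would look about the same):

    import requests

    LLM_URL = "http://llm.internal:8000/v1/chat/completions"  # hypothetical on-prem endpoint
    SYSTEM_PROMPT = "You are a helpdesk assistant. Use the tools to look up tickets."

    # One tool definition, translated from an MCP server's tools/list output
    # into the chat-completions "tools" format. Name and schema are made up.
    TOOLS = [{
        "type": "function",
        "function": {
            "name": "lookup_ticket",
            "description": "Fetch a helpdesk ticket from the private database",
            "parameters": {
                "type": "object",
                "properties": {"ticket_id": {"type": "string"}},
                "required": ["ticket_id"],
            },
        },
    }]

    def ask(user_message: str) -> dict:
        resp = requests.post(LLM_URL, json={
            "model": "gpt-oss-120b",
            "messages": [
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": user_message},
            ],
            "tools": TOOLS,
        })
        resp.raise_for_status()
        return resp.json()["choices"][0]["message"]

    # If the returned message contains "tool_calls", the service routes each
    # one to the matching MCP server, appends the result as a "tool" message,
    # and calls the endpoint again until the model returns a plain text answer.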

I personally don’t think MCP servers add much over local services that individuals wire into a local Claude/ChatGPT/etc. client. If you’re only using local resources, MCP is just extra overhead; if your LLM can call a REST service directly, the protocol layer buys you nothing.

Where I really see the benefit is in building hosted services or agents that users access remotely. Think remote servers rather than local clients, or something a company might run as a production service. For this use case, MCP servers are great. I like having a set protocol that I know my LLMs will be able to call correctly. I’m not able to monitor every chat (nor would I want to) to help users troubleshoot when the model botches an external tool call. I’m not a big fan of the protocol itself, but it’s nice to have some kind of standard.
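
The part of the standard I actually rely on is that every conforming MCP server answers the same JSON-RPC methods, so the server-side dispatch code never changes per tool. A rough sketch of that dispatch (hypothetical internal URL, and glossing over the initialization handshake and session headers the streamable HTTP transport actually requires):

    import requests

    MCP_URL = "http://mcp.internal:9000/mcp"  # hypothetical MCP server endpoint

    def call_mcp_tool(name: str, arguments: dict, request_id: int = 1) -> dict:
        """Issue a JSON-RPC 2.0 tools/call request, per the MCP spec."""
        payload = {
            "jsonrpc": "2.0",
            "id": request_id,
            "method": "tools/call",
            "params": {"name": name, "arguments": arguments},
        }
        resp = requests.post(MCP_URL, json=payload)
        resp.raise_for_status()
        return resp.json()

    # The same dispatcher works for any tool on any conforming server:
    result = call_mcp_tool("lookup_ticket", {"ticket_id": "T-1234"})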

The short answer: not everyone is using Claude locally. There are different requirements for hosted services.

(Note: I don’t have anything against Claude, but my $WORK only has agreements with Google and OpenAI for remote access to LLMs. $WORK also hosts a number of open models for strictly on-prem work. That’s what guided my choices…)
