Even if it's 10x cheaper and 2x worse, it's going to eat up even more tokens spinning its wheels trying to implement things or squash bugs, and you may end up spending more because of that. Or at least spending way more of your time.
"Chat UI" can "feel" a bit thin from an eng/product when you initially think about, and that's something we've had to grapple with over time. As we've dug deeper, my worry about that has gone down over time.
For most people, the chat is the entrypoint to LLMs, and people are growing to expect more and more. So now it might be basic chat, web search, internal RAG, deep research, etc. Very soon, it will be more complex flows kicked off via this interface (e.g. cleaning up a Linear project). The same "chat UI" that is used for basic chat must (imo) support these flows to stay competitive.
On the engineering side, things like Deep Research are quite complex/open-ended, and there can be huge differences in quality between implementations (e.g. ChatGPT's vs Claude's). Code interpreter is also quite tricky to do securely.
My understanding of YC is that they place more emphasis on the founders than the initial idea, and teams often pivot.
That being said, I think there is an opportunity for them to discover and serve an important enterprise use case as AI in enterprise hits exponential growth.
There are many markets (e.g. Europe) and highly regulated industries with air-gapped deployments where the typical players in the field (ChatGPT, MS Copilot) are having a hard time.
On another axis, if you are able to offer BYOK deployments and the customers have huge staffs with low usage, it's pretty easy to compete with the big players due to their high per-seat pricing.
There are also many teams we work with that want to (1) retain model flexibility and (2) give everyone at the company the best model for the job. Practically every week, a model from a different provider comes out that is better at some tasks than anything else. It's not great to be locked out from using that model because you're a "ChatGPT" company.
Agreed there are a lot of other projects out there, but why do you say the Vercel option is more advanced/mature?
The common trend we've seen is that most of these other projects are okay for a true "just send messages to an AI and get responses" use case, but for most things beyond that they fall short / there are a lot of paper cuts.
For an individual, this might show up when they try more complex tasks that require multiple tool calls in sequence or when they have a research task to accomplish. For an org, this might show up when trying to manage access to assistants / tools / connected sources.
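To make "multiple tool calls in sequence" concrete, here's a simplified sketch of the kind of loop involved. The helper names and the stubbed model are made up for illustration; this isn't our (or anyone's) actual API:

```typescript
// A research task often needs several tool calls in sequence
// (search -> fetch -> synthesize), each feeding the next turn's context.
// Everything here is a stand-in for illustration only.

type ToolCall = { name: string; args: Record<string, string> };
type ModelTurn = { text?: string; toolCall?: ToolCall };

// Stubbed "model": asks for a search, then a fetch, then answers.
async function callModel(history: string[]): Promise<ModelTurn> {
  if (history.length === 1) return { toolCall: { name: "search", args: { q: history[0] } } };
  if (history.length === 2) return { toolCall: { name: "fetch", args: { url: "https://example.com" } } };
  return { text: `answer synthesized from ${history.length - 1} context entries` };
}

// Stubbed tool runner: a real one would hit a search index, the web, etc.
async function runTool(call: ToolCall): Promise<string> {
  return `${call.name}(${JSON.stringify(call.args)}) -> ...result...`;
}

async function research(task: string): Promise<string> {
  const history = [task];
  for (let step = 0; step < 10; step++) {        // cap steps so a confused model can't loop forever
    const turn = await callModel(history);
    if (!turn.toolCall) return turn.text ?? "";  // no tool requested: we have a final answer
    history.push(await runTool(turn.toolCall));  // tool output becomes context for the next turn
  }
  return "gave up: too many steps";
}

research("What changed in the latest release?").then(console.log);
```

The paper cuts tend to show up in everything around that loop: retries, partial failures, streaming intermediate progress, and per-user permissions on which tools the loop is allowed to call.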
Our goal is to make sure Onyx is the most advanced and mature option out there. I think we've accomplished that, so if there's anything missing I'd love to hear about it.
Alright, let's say I'm tasked with building a fancy AI-powered research assistant and need to choose between Onyx and Vercel's ai-chatbot SDK. Why would I reach for Onyx?
I have used Vercel for several projects and I'm not tied to it, but I would like to understand how Onyx compares.
For my use cases, the benefits of using Vercel have been ease of installation, streaming support, model agnosticism, chat persistence, and blob support. I definitely don't like the vendor lock-in, though.
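For reference, the model-agnosticism part of the AI SDK looks roughly like this (exact API details vary between SDK versions, so treat this as a sketch):

```typescript
import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";
// import { anthropic } from "@ai-sdk/anthropic"; // swapping providers is a one-line change

const result = streamText({
  model: openai("gpt-4o"), // or e.g. anthropic("claude-3-5-sonnet-latest")
  prompt: "Summarize the tradeoffs of vendor lock-in.",
});

// Stream tokens to stdout as they arrive.
for await (const chunk of result.textStream) {
  process.stdout.write(chunk);
}
```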
Not wanting to use Vercel is honestly a good enough reason. If you’re a heavy Vercel user you probably aren’t their target market since they’re aiming at enterprise types from what it looks like.
I wasn't trying to be a hater; I think it's great they got funded for this. It just felt like there are so many free options and alternatives out there addressing basically the same things (and looking almost exactly the same) that it genuinely surprised me.
Alexa skills are 3rd-party add-ons/plugins. Want to control your Hue lights? Add the Philips Hue skill. I think Claude skills in an Alexa world would be like having to seed Alexa with a bunch of context for it to remember how to turn my lights on and off, or it will randomly attempt a bunch of incorrect ways of doing it until it gets lucky.
IMHO, don't. Don't keep up. Just like "best practices in prompt engineering", these are just temporary workarounds for current limitations, and they're bound to disappear quickly. Unless you really need the extra performance right now, just wait until models give you this performance out of the box instead of investing in learning something that'll be obsolete in months.
I agree with your conclusion not to sweat all these features too much, but only because they're not hard at all to understand on demand once you realize that they all boil down to a small handful of ways to manipulate model context.
But context engineering is very much not going anywhere as a discipline. Bigger and better models will by no means make it obsolete. In fact, raw model capability is pretty clearly leveling off into the top of an S-curve, and most real-world performance gains over the last year have come precisely from innovations in how to better leverage context.
My point is that there'll be some layer doing that for you. We already have LLMs writing plans for another LLM to execute, and many other such orchestrations, to reduce the constraints on the actual human input. Those implementing this layer need to develop this context engineering; those simply using LLM-based products do not, as it'll be done for them somewhat transparently, eventually. Similar to how not every software engineer needs to be a compiler expert to run a program.
I agree with this take. Models and the tooling around them are both in flux. I'd rather not spend time learning something in detail only for these companies to pull the plug chasing the next big thing.
Well, have some understanding: the good folks need to produce something, since their main product is not delivering the much-yearned-for era of joblessness yet. It's not for you, it's signalling to their investors: see, we're not burning your cash paying a bunch of PhDs to tweak the model weights without visible results. We are actually building products. With a huge and willing A/B testing base.
Agree — it's a big downside as a user to have more and more of these provider-specific features. More to learn, more to configure, more to get locked into.
Of course this is why the model providers keep shipping new ones; without them their product is a commodity.
If I were to say "Claude Skills can be seen as a particular productization of a system prompt" would I be wrong?
From a technical perspective, it seems like unnecessary complexity in a way. Of course I recognize there are a lot of product decisions that seem to layer on 'unnecessary' abstractions but still have utility.
In terms of connecting with customers, it seems sensible, under the assumption that Anthropic is triaging customer feedback well and leading them to where they want to go (even if they don't know it yet).
Update: a sibling comment just wrote something quite similar: "All these things are designed to create lock in for companies. They don’t really fundamentally add to the functionality of LLMs." I think I agree.
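To make that framing concrete, here's a toy sketch of how I picture it; the shapes and the naive relevance check are mine, not Anthropic's actual implementation:

```typescript
// Toy model of "skills as productized system prompts".
// The Skill shape and the matching logic are illustrative only.
type Skill = { name: string; description: string; instructions: string };

const skills: Skill[] = [
  {
    name: "lights",
    description: "Control Philips Hue lights",
    instructions: "To change a light, call the Hue bridge API with the room name...",
  },
];

// "Loading a skill" amounts to splicing its instructions into the system
// prompt when the task looks relevant; nothing about the model itself changes.
function buildSystemPrompt(base: string, task: string): string {
  const relevant = skills.filter((s) => task.toLowerCase().includes(s.name));
  const sections = relevant.map((s) => `## Skill: ${s.name}\n${s.description}\n${s.instructions}`);
  return [base, ...sections].join("\n\n");
}

console.log(buildSystemPrompt("You are a helpful assistant.", "turn the lights off"));
```

Seen that way, the feature is mostly packaging and distribution of prompt text, which is why it doesn't feel like a fundamental capability change.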
All these things are designed to create lock in for companies. They don’t really fundamentally add to the functionality of LLMs. Devs should focus on working directly with the models' generate APIs and not on all the decoration.
Me? I love some lock in. Give me the coolest stuff and I'll be your customer forever. I do not care about trying to be my own AI company. I'd feel the same about OpenAI if they got me first... but they didn't. I am team Anthropic.
Joking aside, I ask Claude how to use Claude... all the time! Sometimes I ask ChatGPT about Claude. It actually doesn't work well because they don't imbue these AI tools with any special knowledge about how they work; they seem to rely on public documentation, which usually lags behind the breakneck pace of these feature releases.
"Recursion" is a word that shows up a lot in the rants of people in AI psychosis (believe they turned the chatbot into god, or believe the chatbot revealed themselves to be god.)
That's the start of the singularity. The changes will keep accelerating, and fewer and fewer people will be able to keep up, until only the AIs themselves know how to use the tools.
I don’t think these are things to keep up with. Those would be actual fundamental advances in the transformer architecture and core elements around it.
This stuff is like front end devs building fad add-ons which call into those core elements and falsely market themselves as fundamental advancements.
Yes, and as we rely on AI to help us choose our tools... the phenomenon feels very different, don't you think? Human thinking, writing, talking, etc. is becoming less important in this feedback loop, it seems to me.