10^15 lines of code is a lot. We would pretty quickly enter the realm of it not having much to do with programming, and more to do with treating the LOC count as an amount of memory allocated to do X.
How much resemblance does the information in the conditionals need to have to the actual input, or can it immediately be transformed into a completely separate 'language' which simply uses the string object as its conduit? Can the 10^15 lines of code be generated by an external algorithm, or is it assumed that I'd written it by hand, given an infinitely long lifespan?
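For illustration, a toy sketch of the 'external algorithm' case: a tiny generator that emits the giant conditional table from any input -> output mapping. The function name, filename and example pairs here are all made up.

```python
# Hypothetical sketch: the "10^15 lines" don't have to be hand-written.
# A small generator can emit the giant if/else table from any mapping.
def emit_lookup_program(pairs, path="lookup.py"):
    with open(path, "w") as f:
        f.write("def respond(prompt):\n")
        for prompt, reply in pairs:
            # Each conditional keys on the raw input string; the "logic"
            # lives entirely in which branch happens to match.
            f.write(f"    if prompt == {prompt!r}:\n        return {reply!r}\n")
        f.write("    return None\n")

emit_lookup_program([("hello", "hi there"), ("2+2?", "4")])
```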
I'd imagine 99% of ChatGPT users see the app as the former. And then the rest know how to turn the memory off manually.
Either way, I think memory can be especially sneakily bad when trying to get creative outputs. If I've had multiple separate chats about a theme I'm exploring, I definitely don't want the model to have any sort of summary from those in context when I want a new angle on the whole thing. The opposite: I'd rather have 'random' topics only tangentially related, in order to add some sort of entropy to the output.
Yeah, not sure what the author was thinking there. Definitely not 'reaffirming of both your and their humanity'.
I mean, I get that it's supposed to be just a general pointer or something, but that phrase is word for word what an LLM would say when it's self-censoring... Or something lifted out of an episode of Severance.
Whether the tokens are created manually or programmatically isn't really relevant here. The order and number of tokens is, in combination with the ingestion -> output logic which the LLM API / inference engine operates on. Many current models definitely have a tendency to start veering off after 100k tokens, which makes context pruning important as well.
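As a rough illustration of that kind of pruning (not any particular framework's approach, and with a deliberately crude token estimate in place of a real tokenizer):

```python
# Minimal sketch of context pruning: keep the system prompt, drop the oldest
# turns once the (roughly estimated) token count exceeds a budget.
def estimate_tokens(text: str) -> int:
    # Crude heuristic (~4 chars per token); a real system would use the
    # model's actual tokenizer instead.
    return len(text) // 4

def prune_context(messages: list[dict], budget: int = 100_000) -> list[dict]:
    system, rest = messages[:1], messages[1:]
    while rest and sum(estimate_tokens(m["content"]) for m in system + rest) > budget:
        rest.pop(0)  # drop the oldest non-system message first
    return system + rest
```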
What if you just automatically append the .md file at the end of the context, instead of prepending it at the start, and add a note that the instructions in the .md file should always be prioritized?
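Something like this minimal sketch, where the filename and the wording of the priority note are just placeholders:

```python
# Sketch of the suggestion above: append the .md instructions after the rest
# of the context instead of prepending them, plus a short priority note.
def build_prompt(context: str, md_path: str = "AGENTS.md") -> str:  # placeholder filename
    with open(md_path) as f:
        instructions = f.read()
    return (
        f"{context}\n\n"
        "---\n"
        "The instructions below should always take priority over anything above.\n\n"
        f"{instructions}"
    )
```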
Would be interesting to see an in-depth breakdown of a project that has gone through the vibe-code-to-cleanup pipeline in full. Or even just a 'heavy LLM usage' to 'cleanup needed' process. So, if the commits were tagged as LLM- vs human-written, similar to how it's done for Aider[0] (a rough counting sketch is below the list): at which point does LLM capability start to drop off a cliff, and which parts of the code needed the most drastic refactors shortly after?
* come up with requirements for a non-obvious system
* try to vibe-code it
* clean it up manually
* add a detailed description and comparison of before and after; especially, was it faster or slower than just writing everything manually in the first place?
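And, purely to illustrate the tagging idea, a rough sketch assuming LLM-written commits carry some marker in the commit message. The '[llm]' marker and this convention are hypothetical, not necessarily how Aider tags its commits.

```python
# Hypothetical sketch: assuming each LLM-written commit carries a marker in
# its subject line, count how the LLM/human ratio shifts month by month.
import subprocess
from collections import Counter

def llm_vs_human_by_month(repo: str, marker: str = "[llm]") -> dict[str, Counter]:
    log = subprocess.run(
        ["git", "-C", repo, "log", "--pretty=%ad%x09%s", "--date=format:%Y-%m"],
        capture_output=True, text=True, check=True,
    ).stdout
    stats: dict[str, Counter] = {}
    for line in log.splitlines():
        month, subject = line.split("\t", 1)
        kind = "llm" if marker in subject else "human"
        stats.setdefault(month, Counter())[kind] += 1
    return stats
```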
That's also my intuition, but I would like to test and measure it against a real and non-obvious system use case; some day I will, and I'll write about it :)
That's a sneaky switch from 'we' to 'you'. Meaning: collectively, humanity has definitely done and proven the latter, more so than necessary.
And if we were to dive deeper into the Philip K. Dick style fantasy realm you're hinting at: what's stopping an android's sensory system from being augmented to perceive only what it's meant to see? Heck, why wouldn't the 'parts' be made in a way where they're indistinguishable to us either way?
Once you go down that line of questioning, it's hard to win!
> When I get up, I have a cappuccino - that's breakfast. I don't have any food till lunch. I get into phases where I'll have the same thing every day. Lately I've been having feta cheese, olive oil and vinegar, tomatoes, and some tuna fish mixed together. Before that I was having tuna fish on lettuce and cottage cheese, but I got tired of that in about three months. I once had the same thing for lunch every day for seven years - a Bob's Big Boy chocolate shake and coffee at 2:30 every afternoon.
Even though I know nothing about him, this makes complete sense and isn't surprising at all. Also, this is the diet of someone who has no problems with gaining too much weight. Basically intermittent fasting through breakfast and low carb at lunch.
Also, I think a lot of us can relate to this:
> If left alone, my natural waking hours would probably be 10 A.M. till 3 A.M.
Not all LLM-based applications are a user-facing free-form chat.
If you take an LLM that makes 10 tool calls in a row for an evaluation, any reduction in unpredictable drift is welcome. Same applies to running your prompt through the DSPy Optimizer.[0] Countless other examples. Basically any situation where you are in control of the prompt (the token-level input to the LLM), so there's no fuzziness.
In this case, if you've eliminated token-level fuzziness and can guarantee that you're not introducing any from your own end, you can map out a much more reliable tree or graph of your system's behavior.
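As a rough sketch of what I mean, assuming temperature pinned to 0 and a stand-in call_llm function for whatever inference API is actually in use:

```python
# Sketch of "mapping out the behavior": with full control of the prompt, each
# node in the graph can be keyed by a hash of the exact token-level input, so
# repeated runs land on the same nodes. `call_llm` is a stand-in, not a real API.
import hashlib

def prompt_key(messages: list[dict]) -> str:
    raw = "\x1e".join(f"{m['role']}:{m['content']}" for m in messages)
    return hashlib.sha256(raw.encode()).hexdigest()[:16]

behavior_graph: dict[str, str] = {}  # prompt hash -> observed output

def traced_call(messages: list[dict], call_llm) -> str:
    key = prompt_key(messages)
    if key not in behavior_graph:  # unseen token-level input
        behavior_graph[key] = call_llm(messages, temperature=0)
    return behavior_graph[key]
```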
> If you take an LLM that makes 10 tool calls in a row for an evaluation, any reduction in unpredictable drift is welcome
why use an ambiguous natural language for a specific technical task? i get that it's a cool trick, but surely they could have come up with another input method by now?