Hype aside, if you can get an answer to a computing problem with error bars in significantly less time, where precision just isn’t that important (such as LLMs) this could be a game changer.
Precision actually matters a decent amount in LLMs. Quantization is used strategically in places that’ll minimize performance degradation, and models are smart enough so some loss in performance still gives a good model. I’m skeptical how well this would turn out, but it’s probably always possible to remedy precision loss with a sufficiently larger model though.
No that isn’t throwing out data. Activation functions perform a nonlinear transformation to increase the expressivity of a function. If you did two matrix multiplications without ReLU in between, your function contains less information than with a ReLU in between.
Two linear transformations compose into a single linear transformation. If you have y = W2(W1*x) = (W2*W1)*x = W*x where W = W2*W1, you've just done one matrix multiply instead of two. The composition of linear functions is linear.
The ReLU breaks this because it's nonlinear: ReLU(W1*x) can't be rewritten as some W*x, so W2(ReLU(W1*x)) can't collapse either.
Without nonlinearities like ReLU, many layers of a neural network could be collapsed into a single matrix multiplication. This inherently limits the function approximation that it can do, because linear functions are not very good at approximating nonlinear functions. And there are many nonlinearities involved in modeling speech, video, etc.
Maybe other folks’ vibe coding experiences are a lot richer than mine have been, but I read the article and reached the opposite conclusion of the author.
I was actually pretty impressed that it did as well as it did in a largely forgotten language and outdated platform. Looks like a vibe coding win to me.
I have a web site that is sort of a cms. I wanted users to be able to add a list of external links to their items. When a user adds a link to an entry, the web site should go out and fetch a cached copy of the site. If there are errors, it should retry a few times. It should also capture an mhtml single file as well as a full page screenshot. The user should be able to refresh the cache, and the site should keep all past versions. The cached copy should be viewable in a modal. The task also involves creating database entities, DTOs, CQRS handlers, etc.
I asked Claude to implement the feature, went and took a shower, and when I came out it was done.
So CC has a planning mode. Shift-Tab twice to enter planning mode. I wrote out about a paragraph of text for this and it gave me back a todo list. I said "make it so" and it went and did it.
I've seen a common problem for auto-didacts is that, since the advanced and modern concepts outnumber the fundamentals, they often find themselves learning advanced concepts before the basics.
This is especially common in programming with stackoverflow or AIs where the devs look for the quickest and easiest to use solution, pushing the code and complexity beneath the rug under the dependency layer, so that their implementing code looks nice and clean.
It's hard to figure out as a begginer that the simplest and most basic solution are 10 lines of POSIX function calls, instead of three lines of "import solution" "setup solution" "use solution".
I had played around with D some time ago, and wrote some small programs in it for fun and learning. I both liked and disliked things about the language.
there was some Russian dev running a systems tech company, I forget his name, living in Thailand, like in koh samui or similar place. he used D for his work, which was software products. came across him on the net. I saw a couple of his posts about D.
one was titled, why D, and the other was, D as a scripting language.
It’s a little like go in that it compiles quickly enough to replace scripts while still yielding good enough performance for a lot of systems tasks. It predates go and I wish Google had just supported D, it’s a much nicer language IMO
Move to EKS and you still need a k8s engineer, but one who also knows AWS, and you also pay the AWS premium for the hosting, egress, etc. It might make sense for your use case but I definitely wouldn’t consider it a cost-saving measure.