I put "know" in quotations because I know that's completely the wrong term here.
I've watched quite a few videos on LLMs, for example:
1. https://www.youtube.com/watch?v=lnA9DMvHtfI
2. https://www.youtube.com/watch?v=YDiSFS-yHwk
I get the essential concepts: how a corpus of text becomes the model, parameter counts, fine-tuning, all of the popularly discussed basic jargon.
What still has not been explained to me is... well, how does the LLM actually become interactive, in the sense that you can then prompt it and it spits back an answer. In other words, how does the LLM actually "know" it's supposed to spit back an answer in the first place?
I just can't grok what's happening here. If we were programming a database, the idea of structuring a database, writing to it, and reading from it just makes sense, because it's really just a series of imperative procedural steps that you're controlling.
But when you have an LLM that's just sitting there with all of its black-box weights in a neural network... how does one then "tap into it" to force it to produce a response? What is that process called, and what are the core concepts behind it? For whatever reason, this question is somehow impossible to Google.
Instruction tuning, as far as I can tell. But note that the model never "knows" it's supposed to answer: even a base model is just a next-token predictor, and the "interactivity" is an ordinary program loop that feeds in your prompt and repeatedly samples the next token until a stop token is produced. The original "Alpaca" model took the LLaMA base and fine-tuned it on question-answer pairs; that shifted the sampled continuations from prose-leaning completions to Q&A-style responses.
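To make the "loop" point concrete, here's a toy sketch of autoregressive generation. Everything here is hypothetical: the hardcoded bigram table stands in for the trained neural network, and a real LLM would replace the table lookup with a forward pass over the whole context plus a sampling step. The point is only that prompting is plain imperative code wrapped around the model.

```python
# Toy sketch: "interactivity" is just a loop that repeatedly asks the
# model for the next token. NEXT_TOKEN is a hypothetical stand-in for
# the trained network's next-token prediction.
NEXT_TOKEN = {
    "<prompt>": "Paris",
    "Paris": "is",
    "is": "the",
    "the": "capital",
    "capital": "of",
    "of": "France",
    "France": "<eos>",   # the model emits a stop token when it's "done"
}

def generate(prompt_token: str, max_tokens: int = 20) -> list[str]:
    """Feed the prompt in, then keep sampling next tokens until <eos>."""
    context = prompt_token
    output = []
    for _ in range(max_tokens):
        token = NEXT_TOKEN[context]  # real LLM: forward pass + sampling
        if token == "<eos>":
            break
        output.append(token)
        context = token  # real models condition on the entire sequence so far
    return output

print(" ".join(generate("<prompt>")))  # Paris is the capital of France
```

Instruction tuning doesn't change this loop at all; it only changes which continuations the (real) model assigns high probability to, so the same loop starts producing answers instead of prose that merely resembles the prompt.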