What are these things called parameters?
Are they sentences? Are they words? Are they something else?
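To make the question concrete: a parameter isn't a sentence or a word, it's just a learned number. Here's a minimal sketch in plain Python (no frameworks, and the numbers are made up) of a single "neuron" and its parameter count:

```python
# A "parameter" is just a learned number. A model file is mostly a big
# array of such numbers plus some metadata describing their layout.
# One linear "neuron" with 3 weights and 1 bias has 4 parameters.
weights = [0.1, -0.4, 0.7]   # learned numbers, not words
bias = 0.2                   # also a learned number

def neuron(inputs):
    # Weighted sum of the inputs, plus the bias.
    return sum(w * x for w, x in zip(weights, inputs)) + bias

n_parameters = len(weights) + 1
print(n_parameters)  # a "7B" model has roughly 7 billion of these
```

A model described as "7B" is this same idea scaled up: billions of these numbers, organised into layers.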
What actually is a language model file?
I was under the impression that LLMs were single files, but I realised today that I hadn’t actually double checked this.
Then I wondered: is there a really simple example of how to build your own tiny, tiny LLM? Even if it wasn't predicting words correctly yet, it would be good to know how that works.
So, here are my thoughts on this:
- I am assuming some sort of tool (maybe PyTorch?) is used for building this?
- How would I prepare the original data set of files?
(There’s the Ollama package, which lets you run a wide range of LLMs locally.)
- Are they singular files?
- Are there different types or formats of LLM files?
- Are these LLMs stored in RAM and/or GPU memory?
- What resources (time, energy) does it take to train LLMs?
- Who is training LLMs at the moment?
- What does customising a model mean?
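Some of the questions above can be made concrete with a toy example. This is a character-level bigram model in plain Python (real work would use something like PyTorch, and the filename here is made up): it shows that "training" produces a table of numbers, and that the whole trained model can indeed be saved as a single file.

```python
# A toy character-level "language model": count which character follows
# which, normalise the counts into probabilities, and you have a table
# of parameters. Real LLMs are vastly bigger neural networks, but the
# shape of the idea is the same: train on text, get numbers, save them.
import json
from collections import defaultdict

def train(text):
    # Count how often each character follows each other character.
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    # Normalise the counts into probabilities: these are the parameters.
    return {a: {b: n / sum(nxt.values()) for b, n in nxt.items()}
            for a, nxt in counts.items()}

def predict(model, ch):
    # Predict the most likely next character after `ch`.
    return max(model[ch], key=model[ch].get)

model = train("hello hello hello")

# The whole model is just numbers, so it serialises to one file.
# ("tiny_model.json" is a made-up name for this sketch.)
with open("tiny_model.json", "w") as f:
    json.dump(model, f)

print(predict(model, "h"))  # prints "e"
```

Even this toy version hints at the answers: the "model file" is one file of numbers, training cost scales with how much text you count over, and "customising" a model means changing those stored numbers by training further on your own data.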
Moondream 2
On my travels today I discovered Moondream 2 … it was on the list of Ollama models that I was reading through. It’s a micro LLM for vision. Will look into this another time.